Tuesday, September 12, 2006

JRun Threadpool settings

The other day I was looking at JRun's thread pool implementation, and it really took me some time to understand that piece of code. It is one of those pieces of code that are not meant to be understood by others :D. It got me really confused about 'active handler threads', 'min handler threads' and 'max handler threads'. It was very different from what I had assumed: too much creating of runnables, swapping of runnables, destroying of runnables... phew. I think this is exactly why Doug Lea and team had to introduce a standard thread pool implementation in the 'Tiger' release. It is so simple, neat and elegant that I wonder why it wasn't introduced earlier in the JDK.
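As an aside, the java.util.concurrent pool that Tiger introduced really is that simple to use. A minimal sketch (the class name and pool sizes here are my own illustration values, nothing from JRun):

```java
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSketch {
    public static void main(String[] args) throws Exception {
        // Core and max pool sizes play roughly the same role as
        // JRun's min and max handler thread counts.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                    // corePoolSize: threads kept alive even when idle
                8,                    // maximumPoolSize
                60, TimeUnit.SECONDS, // idle timeout for threads above the core size
                new LinkedBlockingQueue<Runnable>());
        Future<Integer> result = pool.submit(() -> 21 * 2);
        System.out.println(result.get()); // prints 42
        pool.shutdown();
    }
}
```

No hand-rolled runnable swapping; the pool grows, shrinks and queues work on its own.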

Anyway, enough cribbing. After my enlightenment about JRun's (or CFMX's) thread pool, I thought it would be nice to share it with you all. So what are these thread counts? (I am sure most of you have already figured them out. Nevertheless... :) )

Min handler threads - This is the number of web threads that will be spawned initially and will wait for HTTP requests. Effectively, it is the number of threads that will be waiting on ServerSocket.accept(), so it controls the number of requests that can be accepted concurrently. Ideally it is the minimum number of concurrent users that you expect on the server. As soon as a thread gets a client request (i.e. comes out of ServerSocket.accept()), it enters a throttle before processing the request. This is where active handler threads come into the picture. Before the thread starts processing the request, it spawns another thread, if required, which can listen for incoming requests.

Active handler threads - This decides how many requests will be processed concurrently. The throttle mentioned above allows at most 'active handler threads' threads to continue; the rest wait until another thread exits the throttle. Together with 'min handler threads', this controls the throughput of the server. The value of active handler threads must be between the min handler count and the max handler count.

Max handler threads - This is the maximum number of threads that can be created in the pool. It includes the threads queued in the throttle, plus the threads processing requests, plus the threads waiting on the server socket. Once the server reaches the max handler thread count, it will start denying requests with a "Server Busy" error. You can also see the "Server Busy" error without the server reaching the 'max handler' limit, if the threads in the throttle queue time out. So if you see this error, don't just start increasing the max handler thread count; you might need to tune all three counts.
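To make the interplay of these counts concrete, here is a small sketch of the throttle idea using a java.util.concurrent.Semaphore. This is my own model, not JRun's actual implementation, and the counts and timings are made-up illustration values:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class ThrottleSketch {
    // Made-up illustration values, not JRun's defaults.
    static final int ACTIVE_HANDLER_THREADS = 2;
    static final long THREAD_WAIT_TIMEOUT_MS = 50;

    // The throttle: only ACTIVE_HANDLER_THREADS requests run at once.
    static final Semaphore throttle = new Semaphore(ACTIVE_HANDLER_THREADS);

    static String handleRequest() throws InterruptedException {
        // A thread that cannot enter the throttle in time is the
        // "Server Busy" case described above.
        if (!throttle.tryAcquire(THREAD_WAIT_TIMEOUT_MS, TimeUnit.MILLISECONDS)) {
            return "Server Busy";
        }
        try {
            Thread.sleep(300); // simulate processing the request
            return "OK";
        } finally {
            throttle.release(); // let a queued thread proceed
        }
    }

    public static void main(String[] args) throws Exception {
        int requests = 4;
        String[] results = new String[requests];
        Thread[] workers = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    results[id] = handleRequest();
                } catch (InterruptedException e) {
                    results[id] = "interrupted";
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        int busy = 0;
        for (String r : results) {
            if ("Server Busy".equals(r)) busy++;
        }
        // 4 concurrent requests, 2 permits: the other 2 time out.
        System.out.println("busy=" + busy);
    }
}
```

With four concurrent requests against two permits, and a wait timeout shorter than the processing time, the two queued requests time out - which mirrors the "Server Busy" behaviour described above.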

Having both the min handler and active handler counts helps JRun address any sudden spike in load. Let's say your min handler count is 20 and your active handler count is 40, and suddenly you get 40 concurrent requests: all of them will be served without any queuing or delay. When the load on the server eases, it will let the extra threads die and bring the thread count back down to the min handler count, i.e. 20.

The next question that naturally comes to mind is: what are the appropriate values of these settings for my server? Setting them too low means you are not utilizing the server's potential, and requests start queuing up even though the server could handle them. Setting them too high means too many context switches, and server performance deteriorates. (Too many context switches means the CPU is busy scheduling threads rather than executing them, which hurts performance.) So what are the appropriate values? Well, there cannot be one answer or a formula to compute them. It depends on your application, the traffic your application expects, the memory and processors of the machine you are going to run it on, and so on.

By default, the values in ColdFusion standalone are:

- Min handler threads - 1
- Active handler threads - 8
- Max handler threads - 1000

The min and active counts here are fine for a development machine but definitely not for a production machine. And in my opinion, the 'max handler' value is a bit high even for a production machine. Creating a large number of threads does not necessarily increase the throughput of your server. It can actually lower it because of the high number of context switches the VM will have to make. Moreover, it might not even be possible to create thousands of threads because of OS limitations. On many OSes, you will get an OutOfMemoryError because the VM will not be able to spawn that many native threads for you. I think a max handler count in the range of 300-400 should be good enough.
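For reference, these counts live in jrun.xml as attributes of the web services (jrun.servlet.http.WebService for the internal web server, jrun.servlet.jrpp.JRunProxyService for an external connector). A fragment along these lines, using the default values listed above - the exact surrounding markup in your jrun.xml may differ:

```xml
<service class="jrun.servlet.jrpp.JRunProxyService" name="ProxyService">
  <attribute name="minHandlerThreads">1</attribute>
  <attribute name="activeHandlerThreads">8</attribute>
  <attribute name="maxHandlerThreads">1000</attribute>
</service>
```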

As for tuning these counts, there are a huge number of articles around that will tell you how to go about it. Since the notion of these counts exists on all application/web servers, the articles need not be CF- or JRun-specific. In brief, you would need to run some kind of load test with different values of the min handler and active handler counts, note the throughput, and plot a graph. This graph should help you arrive at the appropriate values for these settings.


Hemant Khandelwal said...

Nice post Rupesh

Chris said...

Great post Rupesh. I was wondering what the difference is between "threadWaitTimeOut" and "timeout"? You explained the other parts really well in this post; I just wanted a better understanding of these settings. Thanks, Chris

Steven Erat, ColdFusion QA said...

Thank you for your explanation. From years of supporting ColdFusion customers, I'm supplementing this with the links below to my related blog entries, which elaborate on these topics. I arrived at my explanations from the top down rather than the bottom up, though.

activeHandlerThreads or Simultaneous Requests: Less is More

Unable to create new native thread

Timed out waiting for an available thread to run

JRun Closed Connection

Rupesh Kumar said...

Wow! Great posts, Steve. A must-read for everyone, especially CF administrators.

Rupesh Kumar said...

Hi Chris,
'threadWaitTimeOut' is the time for which threads will wait in the throttle I mentioned in the post. After this time, the thread is assumed to have timed out and the "server busy" error is thrown. So let's say your active handler count is 5 and all five threads are busy processing requests. At this point, if a sixth request comes in, it will wait for this long before timing out.
'timeout' is the socket timeout value (SO_TIMEOUT in sockets), which means that a socket read at the server end will block for this duration, and if it does not read anything within this time, it will throw a socket timeout.
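A minimal sketch of that SO_TIMEOUT behaviour (the class name and the 100 ms value are arbitrary illustration choices):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutSketch {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0)) { // any free port
            // A client that connects but never sends any data.
            Socket client = new Socket("localhost", server.getLocalPort());
            Socket accepted = server.accept();
            accepted.setSoTimeout(100); // SO_TIMEOUT: a read blocks for at most 100 ms
            try {
                accepted.getInputStream().read(); // no data ever arrives
                System.out.println("read returned");
            } catch (SocketTimeoutException e) {
                System.out.println("socket read timed out");
            }
            accepted.close();
            client.close();
        }
    }
}
```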

Chris said...

Thanks for all help. Chris

Anonymous said...

Knowledgeable post Rupesh !!!
The problem I am facing is that on a single request jrun.exe consumes maximum CPU (sometimes the application is unresponsive).
I think you have great knowledge of ColdFusion administration.
I would appreciate any suggestions.
my email id :singhvijay2468@gmail.com
Thanks in advance.
vijay k. singh

Charlie Arehart said...

Thanks for this Rupesh. The observation about the default maxhandlerthreads being possibly too high and the cause of "unable to create new native threads" is very interesting.

One thing you've not clarified, though, is which set of handlerthread entries you were referring to. In the jrun.xml, there are 3 such sets:

- jrunx.scheduler.SchedulerService
- jrun.servlet.http.WebService
- jrun.servlet.jrpp.JRunProxyService

Since you refer to the activeHandlerThreads being connected to the Admin setting for simultaneous requests, and I find that that gets set for the latter two above, I'll assume you mean them. And the difference for them is that the WebService is for requests from the internal web server, while the JRunProxyService is for those from an external web server (such as IIS or Apache). Can you confirm?

And what about the SchedulerService? Where does it come into play for this discussion? You're right that so much of this stuff has always been very much a black art with confusing recommendations, within and without the CF community.

Finally, it seems worth clarifying for some readers where this jrun.xml is found. In a standalone deployment (on Windows, for CF8), it would be in C:\ColdFusion8\runtime\servers\coldfusion\SERVER-INF\, and in the multiserver mode, it would be C:\JRun4\servers\[instancename]\SERVER-INF.

Charlie Arehart said...

I'd like to ask another point of clarification: you made the comment that the default setting of 1000 for maxHandlerThreads may be too large, because when a server's traffic finally approaches using that many threads, the OS may simply not be able to provide them. Do I have that right?

In that case, how do we best monitor that, and how do we mitigate it?

In the case of monitoring, if on Windows, are we talking about monitoring the threadcount counter in the process object for the jrun instance? And would this be the same value we'd see if we enabled Jrun metrics to watch jrpp.totalTh?

And since you said the total is those threads running + those queued + those "waiting on the server socket", how do we monitor each? I'll assume jrpp.totalth is the total currently used, and jrpp.busyth is those active. Is jrpp.delayTh those queued? What then are those you describe as "waiting on the server socket"? Is that jrpp.delayTh or jrpp.delayRq?

Also, can we monitor these through perfmon?

Finally, as for how to mitigate the problem, one may wonder what they can do when they do hit the error "unable to create new native threads". If we lower the max, that will increase the chance of people getting "server unavailable" errors, right?

But then maybe one could handle more total threads by using the -Xss jvm argument to lower the amount of stack space used per thread, as discussed in the first comment at http://www.talkingtree.com/blog/index.cfm/2005/3/11/NewNativeThread. What do you think of that?

Finally, what if one is getting the "Unable to create new native thread" error, but analysis shows that there's not a problem at the time with there being too many running or queued requests? How then might we still get this error? It would seem surprising for the number of requests "waiting on the server socket" to be that high.

But maybe this points to something else not said in your entry (and hinted at in my last comment). What if the inability to create a new thread is due to the total thread count being more than just the threads from the jrun.servlet.jrpp.JRunProxyService?

Maybe it's also requests coming in on the jrun.servlet.http.WebService, if the internal server is enabled. And maybe it's also requests/threads running per the jrunx.scheduler.SchedulerService. I see that the Jrun metrics let us monitor them, too, but what sort of things happen on those threads?

And what about CFTHREADs? and CFREPORT threads? And though only Enterprise lets you configure thread counts for them, what about flash remoting, web service, and http-based CFC method calls?

Finally, for others facing this problem, what if additional threads are being grabbed by things other than JRun in the JVM. Consider if you're running something like FusionReactor or SeeFusion, since those have their own web server that they deploy within the CF instance.

If we could accurately measure what CF thinks it's using (across all 3 thread types), and what the jrun process reports is being used to the OS, we might then be able to infer the difference to be associated with them. Or am I off base?

Of course, the CF8 Server Monitor also uses threads, but it's using Flash Remoting I think.

I realize this is a long couple of comments, but this is all very interesting stuff that's often been hard to understand.

Charlie Arehart said...

Sorry for two too many "finally"s in my last note there. That's what I get for writing a long comment in a tiny box (and not using preview). :-)

Rupesh Kumar said...

That's a lot of questions :-) I will try to answer some of them here and take up the rest in a day or two.

Yes, you are right about the handler threads. Though the definition of this set applies to all the thread pools in JRun, I was mainly talking about the web threads - which means the WebService in the case of the internal web server, or the JRunProxyService in the case of the connector.

As far as I know, the SchedulerService is used by JRun's metrics service. That is the service which collects the various metric information that you can see using cfstat.

I'd like to ask another point of clarification: you made the comment that the default setting of 1000 for maxHandlerThreads may be too large, because when a server's traffic finally approaches using that many threads, the OS may simply not be able to provide them. Do I have that right?

Yes. There are two factors here: 1) The OS will not be able to give you that many threads - it is too high a number, and these are just a part of the total number of threads in the JVM. 2) Even if it does, the VM might not scale up to that; there will be too much thread contention and scheduling, and the processor will be busier with that than with doing any meaningful processing.

Now how do you find the optimum number? You have to run load tests, keep increasing the load and the counts, and arrive at some optimum.

And since you said the total is those threads running + those queued + those "waiting on the server socket", how do we monitor each? I'll assume jrpp.totalth is the total currently used, and jrpp.busyth is those active. Is jrpp.delayTh those queued? What then are those you describe as "waiting on the server socket"? Is that jrpp.delayTh or jrpp.delayRq?

You can refer to this document for the jrpp metrics - http://livedocs.adobe.com/jrun/4/JRun_Administrators_Guide/netmon2.htm. It should answer most of your questions. jrpp.delayTh gives you the count of threads which have accepted a request but are queued because of high load. This is different from threads waiting on the server socket. In the description of min handler threads in my post above, I mentioned that when a thread accepts a request, i.e. comes out of ServerSocket.accept(), it goes through a throttle, and if the total number of threads processing requests has reached the active handler count, it will be queued.

Charlie Arehart said...

Thanks, Rupesh. Your responses to this will contribute greatly to the understanding of many. I look forward to your continued replies to other aspects of the questions I'd raised.

In the meantime, about the last comment you made, I still wonder if I'm missing how we can tell how many requests are in that state of "waiting on the server socket", in other words, that have not yet "come out of serversocket.accept()". It's those concepts you've raised which I'm most interested in.

I have a sense that we don't often measure that like we might, and it could be important. Or have I missed something you said that clarified that?

Charlie Arehart said...

Hey Rupesh, back in July I asked several questions and you answered a couple, hoping to get back to the rest another day. If you could take some time to revisit the questions, I'm sure others would welcome and benefit from the insight you're in a unique position to offer. Thanks for what you've shared.

Charlie Arehart said...

Rupesh, you had left a couple comments from July unanswered and said you'd try to get back to them. The questions still remain. Any thoughts?