Extreme Threadpoolin’

Recently a colleague approached me with a couple of questions about ThreadPool. First, what is the maximum number of concurrent threads when using ThreadPool? Second, he was seeing an undesirably uneven processor load on a multi-core system; what could be the reason? All in all, some interesting questions.

The Windows operating system has no hard-coded upper thread limit, but the number of threads is ultimately limited by available memory and other basic resources. Generally speaking, on a 64-bit system the default stack reserve for a single thread is 256K and typically 24K is committed. For a more in-depth discussion of memory limits, I suggest reading Pushing the Limits of Windows: Processes and Threads by Mark Russinovich.

But according to Microsoft’s documentation, the ThreadPool itself has a built-in default upper limit of 32,767 threads on 64-bit platforms as of .NET version 4.0 (but only 1,023 threads on 32-bit platforms). OK, that was pretty easy. Unfortunately, here in the year 2013 you can only exhaust the ThreadPool and still have a responsive system if your threads aren’t doing any real work. In real-life applications you will hit other constraints before you manage to exhaust the ThreadPool.
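If you want to see what the limits are on your own machine rather than take the documented defaults on trust, a minimal sketch like the following prints them. The class name PoolLimits is just a placeholder, and the numbers you get depend on the runtime version, the process bitness, and any earlier call to ThreadPool.SetMaxThreads.

```csharp
using System;
using System.Threading;

class PoolLimits
{
    static void Main()
    {
        int maxWorker, maxIo, minWorker, minIo, freeWorker, freeIo;

        // Upper and lower bounds the pool is currently configured with.
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetMinThreads(out minWorker, out minIo);

        // Difference between the maximum and the threads currently in use.
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIo);

        Console.WriteLine("64-bit process:    {0}", Environment.Is64BitProcess);
        Console.WriteLine("Max worker / I/O:  {0} / {1}", maxWorker, maxIo);
        Console.WriteLine("Min worker / I/O:  {0} / {1}", minWorker, minIo);
        Console.WriteLine("Free worker / I/O: {0} / {1}", freeWorker, freeIo);
    }
}
```

Run it once as a 32-bit and once as a 64-bit build to compare the two sets of limits.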

Besides memory, the other obvious limitation is the number of hardware threads (logical processors) the CPU provides. They have to serve the ThreadPool as well as the operating system, so needless to say CPU time slices can quickly become a scarce resource. The best way to economize is to avoid blocking ThreadPool threads whenever possible, especially by using asynchronous methods so that a thread goes back to the pool while an operation is pending. Following this simple design rule will give you better thread mileage even if you are not performing CPU-intensive tasks.
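To illustrate the asynchronous rule, here is a small sketch that starts 1000 simulated waits without parking a pool thread on each of them. It assumes .NET 4.5 or later for async/await; FetchAsync is a hypothetical stand-in for a real asynchronous I/O call, and Task.Delay plays the role of the pending operation.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class AsyncWaits
{
    // Hypothetical stand-in for an asynchronous I/O call.
    static async Task<int> FetchAsync(int id)
    {
        // While the delay is pending, no pool thread is blocked;
        // the method resumes on a pool thread once the timer fires.
        await Task.Delay(500);
        return id;
    }

    static void Main()
    {
        // Start 1000 waits at once; none of them occupies a thread while waiting.
        Task<int>[] tasks = Enumerable.Range(0, 1000)
                                      .Select(i => FetchAsync(i))
                                      .ToArray();

        // Blocking here is acceptable in a console Main; it only holds the main thread.
        Task.WaitAll(tasks);
        Console.WriteLine("Completed {0} simulated waits", tasks.Length);
    }
}
```

Replacing the await with Thread.Sleep(500) inside a queued work item would instead hold one pool thread per wait, which is exactly the situation the rule is meant to avoid.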

Another major constraint arises when your application relies on third-party services. This is very difficult to diagnose because it is not always obvious when you have reached a limit, and the lack of immediate performance indicators can turn the diagnostics into an exercise in guesswork. I have found that implementing “thread wait reporting” is very helpful: when a thread exceeds a certain wait threshold it reports the offending task, e.g. a WebRequest, and from then on it is a simple matter of addition to identify the bottleneck. Remember that your multimega-threaded high-performance application is only as high-performance as its bottleneck.
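One possible shape for such wait reporting is sketched below. This is not the implementation I used in production, just an illustration of the idea: WaitReporter and Measure are made-up names, the threshold is arbitrary, and the report is written after the slow call has returned rather than while it is still waiting.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class WaitReporter
{
    // Runs the operation and reports it if it took longer than the threshold.
    public static T Measure<T>(string taskName, TimeSpan threshold, Func<T> operation)
    {
        Stopwatch watch = Stopwatch.StartNew();
        try
        {
            return operation();
        }
        finally
        {
            watch.Stop();
            if (watch.Elapsed > threshold)
            {
                Console.Error.WriteLine("SLOW {0}: {1} ms (threshold {2} ms)",
                    taskName, watch.ElapsedMilliseconds, threshold.TotalMilliseconds);
            }
        }
    }
}

class Program
{
    static void Main()
    {
        // Thread.Sleep simulates a slow third-party call such as a WebRequest.
        int result = WaitReporter.Measure(
            "simulated WebRequest",
            TimeSpan.FromMilliseconds(100),
            () => { Thread.Sleep(250); return 42; });

        Console.WriteLine("Result: {0}", result);
    }
}
```

Counting the reports per task name is then the “simple matter of addition” that points at the bottleneck.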

So my answer to the question about the maximum number of concurrent threads is “it depends”, but theoretically it is 32,767 concurrent threads.

As for the undesirable uneven processor load: the ThreadPool was specifically designed to remove the headaches of managing threads manually. It has been developed over many years and does a very good job of scheduling tasks for execution. I must admit I was really surprised to hear about such behavior, so I made a simple application that fires up 4000 threads, each of which calculates a factorial recursively. As the Task Manager screenshot below shows, ThreadPool does a really awesome job of managing the threads and spreading the load evenly.
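A test along those lines could look roughly like the sketch below. It is a reconstruction under assumptions, not the exact application from the screenshot: the factorial argument (2000) and the use of BigInteger are my choices here, and on .NET Framework the project needs a reference to System.Numerics. Strictly speaking it queues 4000 work items, and the pool decides how many threads actually run them.

```csharp
using System;
using System.Numerics;
using System.Threading;

class FactorialLoad
{
    // Deliberately recursive so that each work item burns some CPU time.
    static BigInteger Factorial(int n)
    {
        return n <= 1 ? BigInteger.One : n * Factorial(n - 1);
    }

    static void Main()
    {
        const int workItems = 4000;

        using (CountdownEvent done = new CountdownEvent(workItems))
        {
            for (int i = 0; i < workItems; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Factorial(2000);   // result is discarded; only the load matters
                    done.Signal();     // mark this work item as finished
                });
            }

            // Block the main thread until every work item has signalled.
            done.Wait();
        }

        Console.WriteLine("All {0} work items finished", workItems);
    }
}
```

Watch the per-processor graphs in Task Manager while it runs; increase the factorial argument if the run is too short to observe.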


So the only explanation I can come up with is that something in the application has altered the process’s ProcessorAffinity, which forces its threads to run on a specific subset of processors. As the Task Manager screenshot below shows, I have set my application to run on only 2 processors, so I think this is the most plausible explanation.
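To check whether this is what is happening, you can print the affinity mask from inside the process. The sketch below also reproduces the restriction by limiting the process to the first two logical processors; the mask value 0x3 is only an example and assumes the machine has at least two of them.

```csharp
using System;
using System.Diagnostics;

class AffinityCheck
{
    static void Main()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            // Each set bit in the mask is a logical processor the process may use.
            long mask = current.ProcessorAffinity.ToInt64();
            Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
            Console.WriteLine("Affinity mask:      {0}", Convert.ToString(mask, 2));

            // Example only: bits 0 and 1 set, so processors 0 and 1 are allowed.
            current.ProcessorAffinity = (IntPtr)0x3;

            mask = current.ProcessorAffinity.ToInt64();
            Console.WriteLine("New affinity mask:  {0}", Convert.ToString(mask, 2));
        }
    }
}
```

If the first mask printed by the affected application has fewer bits set than there are logical processors, something has narrowed the affinity, either code in the process or a setting applied from outside, such as Task Manager.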


A final note: ThreadPool may not be suitable for all purposes, but it’s my favorite choice when it comes to quickly firing up several thousand identical threads without wasting time on thread management considerations.
