Concurrent programming aims to execute potentially different pieces of work at the same time, while parallel computing aims to split one piece of work into multiple pieces that execute concurrently. Parallel computing has been around for decades, but it remains a difficult problem. It targets multi-core, multi-CPU and distributed systems. Continued work on these paradigms is welcome, because keeping user interfaces responsive and handling operations quickly has always been a challenge. In recent years, consumption of asynchronous services has exploded, and the use of parallel operations has grown to a lesser extent. As devices continue to gain cores, expect parallel and asynchronous functionality to become more and more common.

With the release of .NET 4, Microsoft added a new namespace under the System.Threading namespace called System.Threading.Tasks. All of the previous threading abilities are still available, but the new additions provide a different way to work with multi-threaded constructs.

With the evolution of multi-threaded capabilities within the .NET framework, things can get a little confusing. Here is a brief history with some notes.

What are Threads?

A running application or program executes as a process. Multiple processes can execute concurrently, as when an email client and a web browser are used at the same time. Inside a process are threads. A thread is to a process much as a process is to the operating system. The major difference is that processes do not share memory with each other, while threads within a process can share memory in a restricted fashion. Synchronizing access to mutable objects shared between threads is often a pitfall for developers.

Threads in .NET 1.0+

In the System.Threading namespace, the Thread class exists along with other classes that provide fine-grained multi-threading capabilities for the .NET framework. These include support for thread synchronization and shared data access through classes like the following.

  • Mutex – Grants exclusive access to a shared resource to one thread at a time; a named Mutex can also synchronize across processes.
  • Monitor – Provides a mechanism that synchronizes access to objects.
  • Interlocked – Provides atomic operations for variables that are shared by multiple threads.
  • AutoResetEvent – Notifies a waiting thread that an event has occurred.
  • Semaphore – Limits the number of threads that can access a resource or pool of resources concurrently. (Added with .NET 2.0)
  • and many more.
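As a rough illustration of two of the classes listed above, the following sketch increments two shared counters from several threads: one guarded by Monitor (through the C# lock statement) and one updated with Interlocked. The class and method names (CounterDemo, Run) are made up for this example.

```csharp
using System;
using System.Threading;

public static class CounterDemo
{
    private static readonly object Gate = new object();

    // Returns { lockedCounter, interlockedCounter } after all threads finish.
    public static int[] Run(int threadCount, int incrementsPerThread)
    {
        int lockedCounter = 0;
        int interlockedCounter = 0;
        var threads = new Thread[threadCount];

        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < incrementsPerThread; i++)
                {
                    // lock compiles down to Monitor.Enter / Monitor.Exit.
                    lock (Gate) { lockedCounter++; }

                    // Interlocked performs the increment atomically without a lock.
                    Interlocked.Increment(ref interlockedCounter);
                }
            });
            threads[t].Start();
        }

        // Join blocks until each thread has finished.
        foreach (var thread in threads) thread.Join();
        return new[] { lockedCounter, interlockedCounter };
    }
}
```

Without either form of synchronization, the plain ++ operator could lose updates when two threads read the same value at once.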

The ThreadPool in .NET 1.0+

The System.Threading.ThreadPool class provides a group of system-managed threads to the application and is often a more efficient way to handle multi-threaded programming. This is because it helps the developer avoid having threads spend the majority of their time waiting on another thread or sitting in a sleep state. To execute a method on the ThreadPool, call QueueUserWorkItem, specifying the method to execute and an optional parameter for any data that is needed. An asynchronous delegate with the BeginInvoke and EndInvoke methods can also be used. The specified method begins executing when a thread in the ThreadPool becomes available.

Each process is limited to one system-level thread pool. The ThreadPool manages background threads, so once all foreground threads exit, the ThreadPool will not keep the application alive. When that happens, the pool threads are terminated abruptly and their finally and using blocks may not run, so wait for outstanding work to complete (for example with Join, a wait handle or a wait with a timeout) before the application exits.
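A minimal sketch of queuing work with QueueUserWorkItem and waiting for it to finish before moving on might look like the following. The ThreadPoolDemo class and SquareOnPool method are hypothetical names, and a ManualResetEvent is just one of several ways to wait.

```csharp
using System;
using System.Threading;

public static class ThreadPoolDemo
{
    // Queues one work item on the ThreadPool and blocks until it signals completion.
    public static int SquareOnPool(int input)
    {
        int result = 0;
        using (var done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(state =>
            {
                int value = (int)state;   // the optional data parameter
                result = value * value;
                done.Set();               // signal the waiting caller
            }, input);

            // Without this wait, the caller could continue (or the process
            // could exit) before the pool thread has run the work item.
            done.WaitOne();
        }
        return result;
    }
}
```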

The default ThreadPool limits are:

  • .NET 2.0 – 25 threads per core
  • .NET 3.5 – 250 threads per core
  • .NET 4.0 (32-bit) – 1,023 threads
  • .NET 4.0 (64-bit) – 32,768 threads

The BackgroundWorker in .NET 2.0+

When there is a need to execute some non-UI work, the System.ComponentModel.BackgroundWorker runs the operation on a separate thread. It offers progress reporting back to the calling thread, forwarding of exceptions and cancellation of the processing. If the situation warrants using multiple BackgroundWorkers, though, consider the Task Parallel Library instead.

The BackgroundWorker class follows the event-based asynchronous pattern (EAP). The EAP abstracts and manages the multi-threading capabilities while allowing for basic interaction via events. When a class has methods ending in Async paired with events ending in Completed, it may be implementing some form of the EAP. Another similar pattern is the asynchronous programming model (APM), which uses Begin and End methods. Both the EAP and APM work well with the new .NET 4.0 construct Task that is mentioned later in this post.

Besides being used directly, the BackgroundWorker can also be subclassed. This involves overriding the OnDoWork method and handling the RunWorkerCompleted and ProgressChanged events in the consuming class. The subclass provides a better level of abstraction for a single asynchronously executing method.

BackgroundWorker uses the ThreadPool, so it benefits from improvements that have been made with later versions of the .NET framework. Using the ThreadPool also means that calling Abort should not be done. In a case where you want to wait for completion or cancellation of the BackgroundWorker, you may want to consider using the Task Parallel Library.

The Dispatcher in .NET 3.0+

The System.Windows.Threading.Dispatcher class is actually single-threaded in that it doesn’t spawn a new thread. When BeginInvoke is called, it queues the operation and later executes it on the thread the Dispatcher is associated with, which lets other threads hand work to that thread. The reason for the Dispatcher’s existence boils down to thread affinity. A user interface Control or DependencyObject strictly belongs to the thread that instantiated it. For example, in Windows Presentation Foundation (WPF) and Silverlight, the Dispatcher class allows a non-UI thread to “update” a TextBox control’s Text property by marshaling the call to the UI thread.

Parallel LINQ (PLINQ) in .NET 4.0+

PLINQ is a parallel implementation of the Language-Integrated Query (LINQ) pattern. Just like LINQ to Objects and LINQ to XML, PLINQ can operate against any IEnumerable or IEnumerable<T>. The PLINQ operators are defined on the System.Linq.ParallelEnumerable class, but this implementation of LINQ doesn’t force parallel operations on everything. Alongside the parallel versions of the standard query operators, there are additional methods, such as:

  • AsParallel – This is how to enable PLINQ. If the rest of the query can be parallelized, it will do so.
  • AsSequential<T> – Will turn a previously parallelized query back into a sequential one.
  • AsOrdered – Preserve ordering until further instructed by something like an order by clause or AsUnordered<T>.
  • AsUnordered<T> – No longer preserve ordering of the query.
  • ForAll<T> – Allows for processing in parallel instead of requiring a merge back to the consuming thread.
  • Aggregate – Provides intermediate and final aggregation of results.
  • and a few more.

The AsParallel method is very straightforward to try, as the call is made directly on the data source within a LINQ query or foreach loop.

PLINQ does not guarantee that the query will be executed in parallel. It checks whether it is safe to parallelize and whether doing so will likely provide an improvement. If those conditions are not satisfied, it executes the query sequentially. By calling the optional WithExecutionMode method with ParallelExecutionMode.ForceParallelism, PLINQ can be told to parallelize the query even when its own checks would choose sequential execution.
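A small sketch of these methods working together, assuming a simple in-memory source: AsParallel opts in, AsOrdered keeps the source order in the output, and WithExecutionMode requests parallel execution. PlinqDemo and EvenSquares are illustrative names only.

```csharp
using System;
using System.Linq;

public static class PlinqDemo
{
    // Returns the squares of the even numbers in 1..count, in source order.
    public static int[] EvenSquares(int count)
    {
        return Enumerable.Range(1, count)
            .AsParallel()                                               // enable PLINQ
            .AsOrdered()                                                // preserve source ordering
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)  // do not fall back to sequential
            .Where(n => n % 2 == 0)
            .Select(n => n * n)
            .ToArray();
    }
}
```

Dropping the AsOrdered call would still return the same values, but their order in the array would no longer be guaranteed.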

Exceptions from all the threads are bundled together into an AggregateException. You can iterate through its InnerExceptions to process each exception, or call Flatten to collapse any nested AggregateExceptions into a single one. This special type of exception is used in other areas of .NET 4.0 multi-threading too.
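The sketch below shows one way to catch and unpack an AggregateException from a PLINQ query. The names AggregateExceptionDemo and CollectFailures are hypothetical, and the validation rule (negative numbers are rejected) exists only to trigger a failure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateExceptionDemo
{
    // Runs a parallel validation pass and returns the messages of any failures.
    public static List<string> CollectFailures(int[] inputs)
    {
        var messages = new List<string>();
        try
        {
            inputs.AsParallel()
                  .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
                  .ForAll(n =>
                  {
                      if (n < 0) throw new ArgumentException("negative: " + n);
                  });
        }
        catch (AggregateException ex)
        {
            // Flatten collapses nested AggregateExceptions so one loop sees them all.
            // With several failing items, how many are reported can vary, because
            // the query stops soon after the first failure.
            foreach (Exception inner in ex.Flatten().InnerExceptions)
                messages.Add(inner.Message);
        }
        return messages;
    }
}
```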

Custom partitioning lets a developer specify how the data source should be divided among threads. For instance, if the data source contains hundreds of thousands of rows and testing shows that some of the threads are only given a few hundred rows, a partitioner can be created on the data source accordingly. Custom partitioning is applied to the data source before the query, and the resulting object replaces the data source within the query.
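Writing a fully custom partitioner means deriving from Partitioner<T>, which is more than fits here. As a smaller sketch of the same idea, the built-in Partitioner.Create factory can wrap the data source with load balancing turned on, and the resulting object then stands in for the data source in the query. PartitionerDemo and SumWithLoadBalancing are made-up names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class PartitionerDemo
{
    public static long SumWithLoadBalancing(int[] data)
    {
        // loadBalance: true hands elements out on demand rather than
        // splitting the array into fixed ranges up front.
        OrderablePartitioner<int> partitioner = Partitioner.Create(data, true);

        // The partitioner replaces the data source in the query.
        return partitioner.AsParallel()
                          .Select(n => (long)n)
                          .Sum();
    }
}
```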

The Task Parallel Library (TPL) in .NET 4.0+

The TPL is a collection of constructs in the System.Threading and System.Threading.Tasks namespaces. This post has split PLINQ out above because it resides in a different namespace, but some documentation refers to them together. Some of the same characteristics mentioned in PLINQ apply here too, since PLINQ actually reduces a query into Tasks (defined below).

As mentioned in the opening statement, all of the fine-grained constructs of multi-threading are still available, so what is the need for the TPL? The goal is to make parallel programming easier. The TPL dynamically adjusts the degree of concurrency during execution to make the most effective use of resources. The PLINQ section above covers custom partitioning, which overrides the built-in partitioning. Collectively, the TPL handles the default partitioning of data, the ThreadPool, cancellations and state.

“The Task Parallel Library is the preferred way to write multi-threaded and parallel code.” – MSDN

The Parallel Class

The Parallel class provides the methods For, Invoke and ForEach to process operations in parallel.

  • For – parallel equivalent of the for keyword
  • Invoke – executes Action delegates in parallel
  • ForEach – parallel equivalent of the foreach keyword
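A short sketch of the three methods follows. ParallelDemo and Run are illustrative names, and Interlocked is used because the loop bodies may run on several threads at once.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelDemo
{
    // Returns { forTotal, forEachTotal, invokeTotal }.
    public static int[] Run()
    {
        int forTotal = 0, forEachTotal = 0, invokeTotal = 0;

        // Parallel counterpart of: for (int i = 0; i < 100; i++) forTotal += i;
        Parallel.For(0, 100, i => Interlocked.Add(ref forTotal, i));

        // Parallel counterpart of a foreach over the array.
        Parallel.ForEach(new[] { 1, 2, 3, 4 }, n => Interlocked.Add(ref forEachTotal, n));

        // Runs each Action, potentially in parallel, and returns when all are done.
        Parallel.Invoke(
            () => Interlocked.Increment(ref invokeTotal),
            () => Interlocked.Increment(ref invokeTotal));

        return new[] { forTotal, forEachTotal, invokeTotal };
    }
}
```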

The Task Class

Tasks offer much of the same functionality as previous solutions like Thread, but also include continuations, cancellation tokens, forwarding and context synchronization. The Parallel class reduces its For, ForEach and Invoke methods into Tasks. A Task represents a unit of work much as a Thread does, but it does not require creating a dedicated operating system thread, because by default it is queued to the ThreadPool. Also, multiple Tasks may run on the same Thread. That can be confusing at first, but it offers a lot of flexibility.

Where direct use of the ThreadPool starts a method by calling QueueUserWorkItem, the Task class has a Factory property of type TaskFactory. Calling StartNew on the TaskFactory and passing in a lambda expression queues up the work. By default, Tasks are placed in the ThreadPool. If the option for a long-running operation is specified, the Task will be created on a separate thread. Regardless, these ways of creating a Task mean that execution will be on a background thread. The StartNew method returns a Task object, so keep that reference if you need it later. With that object, traditional functionality such as waiting is available. Tasks also support parent-child relationships, which can be very useful for wait and continuation operations.
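As a sketch of the calls described above, the following starts one Task on the ThreadPool and one with the long-running option, then waits on both through the returned Task objects. TaskDemo and Run are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskDemo
{
    public static int Run()
    {
        // Queued to the ThreadPool by default.
        Task<int> pooled = Task.Factory.StartNew(() => 6 * 7);

        // LongRunning hints that the work should not occupy a pool thread.
        Task<int> dedicated = Task.Factory.StartNew(
            () => 100, TaskCreationOptions.LongRunning);

        // The returned Task objects support waiting and reading results.
        Task.WaitAll(pooled, dedicated);
        return pooled.Result + dedicated.Result;
    }
}
```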


Continuations provide a way to execute code after a Task completes, with the option of using the result from the Task. They offer a fluent syntax that resembles having a Completed method tied to an event. The fluent syntax isn’t required: with a reference to the Task, code can inspect its outcome and decide what to continue with. Multiple continuations can be specified to handle error conditions, cancellations and normal completion of a Task. One of the major goals of continuations is to avoid blocking while waiting for a Thread or Task to complete.
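A minimal sketch of separate continuations for normal completion and for errors is shown below. ContinuationDemo and Run are made-up names, and the blocking calls at the end exist only so the example can return a value; real code would usually chain further continuations instead.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    public static string Run(bool fail)
    {
        Task<int> work = Task.Factory.StartNew(() =>
        {
            if (fail) throw new InvalidOperationException("boom");
            return 21;
        });

        // Runs only if the antecedent finished without an exception.
        Task<string> onSuccess = work.ContinueWith(
            t => "result: " + (t.Result * 2),
            TaskContinuationOptions.OnlyOnRanToCompletion);

        // Runs only if the antecedent threw; reading Exception observes the error.
        Task<string> onError = work.ContinueWith(
            t => "error: " + t.Exception.InnerException.Message,
            TaskContinuationOptions.OnlyOnFaulted);

        // The continuation that does not match the outcome is cancelled,
        // so read the result only from the one that matches.
        try { work.Wait(); } catch (AggregateException) { /* handled by onError */ }
        return work.IsFaulted ? onError.Result : onSuccess.Result;
    }
}
```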

Parallel Primitives and Data Structures in .NET 4.0+

Thread Safe Collections

In .NET 1.0, the System.Collections namespace provides some built-in support for thread safety through the Synchronized method on its collection classes. Microsoft states that the implementation is not scalable and is not completely protected from race conditions.

With .NET 2.0, the System.Collections.Generic namespace brought generic collections, but removed any thread safe capabilities. This means the consumer needs to handle all synchronization, but the type safety, improved performance, and scalability are significant.

.NET 4.0 adds the System.Collections.Concurrent namespace. Its collections are designed to perform better under concurrent access than manually locked .NET 2.0 collections and provide a more complete implementation of thread safety than .NET 1.0. This namespace includes:

  • BlockingCollection<T>
  • ConcurrentBag<T>
  • ConcurrentDictionary<TKey, TValue>
  • ConcurrentQueue<T>
  • ConcurrentStack<T>
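As a brief sketch of one of these collections, the following counts words from several threads with a ConcurrentDictionary and no explicit locks. ConcurrentCollectionsDemo and CountWords are illustrative names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentCollectionsDemo
{
    // Counts occurrences of every word in parallel and returns the count for target.
    public static int CountWords(string[] words, string target)
    {
        var counts = new ConcurrentDictionary<string, int>();

        Parallel.ForEach(words, word =>
            // Adds the key with 1, or atomically replaces the existing count.
            counts.AddOrUpdate(word, 1, (key, current) => current + 1));

        int count;
        return counts.TryGetValue(target, out count) ? count : 0;
    }
}
```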

Lazy Initialization

Lazy initialization defers creating an object until it is first needed, which matters when creation is expensive. The application may never require the expensive objects, so using these new constructs can have a significant impact on performance.

  • Lazy<T> – Thread-safe lazy initialization.
  • ThreadLocal<T> – Lazy initialization specific to each thread.
  • LazyInitializer – Alternative to Lazy<T> by using static methods.
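A small sketch of the thread-safety claim for Lazy<T>: several threads read Value at once, yet with the default settings the factory delegate runs only once. LazyDemo and CountFactoryCalls are hypothetical names.

```csharp
using System;
using System.Threading;

public static class LazyDemo
{
    // Reads one Lazy<string> from several threads and reports how often the factory ran.
    public static int CountFactoryCalls(int threadCount)
    {
        int factoryCalls = 0;
        var lazy = new Lazy<string>(() =>
        {
            Interlocked.Increment(ref factoryCalls);
            return "expensive value";
        });

        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            // The first reader triggers the factory; the others wait and reuse the value.
            threads[i] = new Thread(() => GC.KeepAlive(lazy.Value));
            threads[i].Start();
        }
        foreach (var thread in threads) thread.Join();

        return factoryCalls;
    }
}
```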

Barrier

The Barrier class is interesting because it allows Threads to have checkpoints. Each Barrier represents the end of some block or phase of work. At a checkpoint, it allows a single thread to do some post-phase work before the others continue. Microsoft recommends using Tasks with implicit joins if the Barriers are only doing one or two blocks of work.
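The checkpoint behavior can be sketched as follows: each thread calls SignalAndWait at the end of a phase, and the post-phase action passed to the constructor runs once per phase on a single thread while the others are held. BarrierDemo and RunPhases are made-up names.

```csharp
using System;
using System.Threading;

public static class BarrierDemo
{
    // Runs the given number of phases across the participants and
    // returns how many times the post-phase action executed.
    public static int RunPhases(int participants, int phases)
    {
        int completedPhases = 0;

        // The post-phase action runs once per phase, on one thread,
        // after every participant has reached the checkpoint.
        using (var barrier = new Barrier(participants, b => completedPhases++))
        {
            var threads = new Thread[participants];
            for (int i = 0; i < participants; i++)
            {
                threads[i] = new Thread(() =>
                {
                    for (int p = 0; p < phases; p++)
                    {
                        // ... this thread's share of the phase would go here ...
                        barrier.SignalAndWait();   // checkpoint
                    }
                });
                threads[i].Start();
            }
            foreach (var thread in threads) thread.Join();
        }
        return completedPhases;
    }
}
```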

SpinLock and SpinWait

The SpinLock and SpinWait structs were added because sometimes it’s more efficient to spin than to block. That may seem counter-intuitive, but if the spin will be relatively quick, it can produce major benefits in a highly parallelized application by avoiding context switches.
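A minimal sketch of SpinLock guarding a very short critical section, which is the case where spinning tends to pay off. SpinLockDemo and Run are illustrative names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SpinLockDemo
{
    public static int Run(int iterations)
    {
        // SpinLock is a struct, so it is kept in one variable and never copied.
        var spinLock = new SpinLock(false);   // false: no owner tracking
        int counter = 0;

        Parallel.For(0, iterations, i =>
        {
            bool lockTaken = false;
            try
            {
                spinLock.Enter(ref lockTaken);   // spins instead of blocking
                counter++;                       // keep the protected work tiny
            }
            finally
            {
                if (lockTaken) spinLock.Exit();
            }
        });

        return counter;
    }
}
```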

Miscellaneous Notes

Deadlock Handling

In the case of a deadlock, SQL Server will pick one of the offending processes as the victim and terminate its work. Nothing comparable happens within .NET. A developer must design carefully to avoid deadlocks and can use timeouts when acquiring locks to help recover from them.
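One way to apply a timeout is Monitor.TryEnter, sketched below. The helper gives up instead of waiting forever when the lock cannot be acquired in time. TimeoutLockDemo and TryRunExclusive are hypothetical names, and what to do on failure (retry, log, abort) is left to the caller.

```csharp
using System;
using System.Threading;

public static class TimeoutLockDemo
{
    // Runs action under the lock if it can be acquired within timeout.
    // Returns false, without running action, when the timeout expires.
    public static bool TryRunExclusive(object gate, TimeSpan timeout, Action action)
    {
        bool lockTaken = false;
        try
        {
            Monitor.TryEnter(gate, timeout, ref lockTaken);
            if (!lockTaken) return false;   // possible deadlock or heavy contention

            action();
            return true;
        }
        finally
        {
            if (lockTaken) Monitor.Exit(gate);
        }
    }
}
```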

Forcing Processor Affinity

In some cases, running in parallel can be problematic. One way to avoid such complications is to set the processor affinity through the Process class. Call the GetCurrentProcess method and then use the ProcessorAffinity property to get or set the affinity as needed.

Debugging Parallel Applications in Visual Studio 2010+

Visual Studio 2010 adds two new debugging windows: the Parallel Stacks window and the Parallel Tasks window. The Parallel Stacks window provides a diagram layout based on either Tasks or Threads and lets the developer see the call stack for each construct. The Parallel Tasks window resembles the Threads window with a grid of all Tasks.

Task Parallel Library (TPL) in .NET 4.5

The most notable changes in .NET 4.5 will most likely be the async and await keywords. There is a major focus on making continuations as fast as possible, and the await keyword will hopefully simplify writing continuations.
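As a rough sketch of how the keywords read, the method below awaits a Task and then carries on with its result, with the code after the await acting as the continuation. AsyncDemo and DoubleThenAddOneAsync are made-up names, and the example assumes a compiler and framework with async support (.NET 4.5 or later).

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    public static async Task<int> DoubleThenAddOneAsync(int value)
    {
        // The method returns to its caller here while the work runs on the ThreadPool.
        int doubled = await Task.Run(() => value * 2);

        // Everything after the await runs as the continuation.
        return doubled + 1;
    }
}
```

A caller would normally await the returned Task as well, rather than blocking on its Result.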


There is a lot of support for multi-threaded, parallel and asynchronous programming within the .NET framework. Hopefully you now have a better understanding of what each construct does. The latest addition, the TPL, has some major improvements and should be added to your toolbox. Pay attention to what .NET 4.5 will provide as it aims to make things even easier.

Further Reading