Using Parallel Extensions for .NET 4 in ASP.NET apps

ASP.NET

ASP.NET applications already get a lot of concurrency for free. The .NET Framework load balances incoming requests among ThreadPool worker threads, striving for optimal use of available CPUs. As long as you minimize blocking in your ASP.NET page code, ASP.NET will process requests concurrently. In most cases, and in particular for Web applications with heavy usage, it is probably not necessary to introduce extra parallelism since adding more work items will only result in competition for CPU time and ultimately reduce request throughput.

Dealing with I/O bound work

If most of the work being done in an ASP.NET request is asynchronous in nature (such as I/O), doing the asynchronous work synchronously can be a huge scalability bottleneck. Solutions based on Asynchronous Programming Model (APM) and Event-based Asynchronous Pattern (EAP) have been recommended to ease this bottleneck. For an in-depth discussion on this refer to Scalable Apps with Asynchronous Programming in ASP.NET and Asynchronous Pages in ASP.NET 2.0. The article Improving ASP.NET Performance also has some good pointers to improving the scalability of your web applications.

New features in the .NET Framework 4 can also make programming asynchronous pages easier. The System.Threading.Tasks.Task class (and the Task<TResult> class that derives from it) can be used to represent asynchronous operations. Both classes implement IAsyncResult, and they provide capabilities for coordinating between multiple asynchronous activities. Since part of ASP.NET’s asynchronous pages support is based on the Asynchronous Programming Model (APM) pattern and IAsyncResult, Task can play a role in easing the implementation of asynchronous pages. In particular, Task is most useful if you want to structure your code with continuations, which helps when multiple stages of asynchronous activity need to happen before the rest of the page continues execution. For more details, refer to Tasks and the Event-based Asynchronous Pattern and Tasks and the APM Pattern.
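As a rough illustration of wrapping an APM operation in a Task and chaining a continuation, here is a minimal console sketch. The class name ApmToTaskSketch and the helper ReadTextAsync are hypothetical names made up for this example, and a MemoryStream stands in for a real I/O source; it is not taken from the articles referenced above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

static class ApmToTaskSketch
{
    // Hypothetical helper: wraps one BeginRead/EndRead pair in a Task<int>,
    // then chains a continuation that decodes the bytes that were read.
    public static Task<string> ReadTextAsync(Stream source, int maxBytes)
    {
        byte[] buffer = new byte[maxBytes];

        // FromAsync turns the APM Begin/End pair into a Task<int>.
        Task<int> read = Task<int>.Factory.FromAsync(
            source.BeginRead, source.EndRead, buffer, 0, buffer.Length, null);

        // The continuation runs once the read completes; t.Result is the byte count.
        return read.ContinueWith(t => Encoding.UTF8.GetString(buffer, 0, t.Result));
    }

    static void Main()
    {
        // Assumption: an in-memory stream replaces a real file or network stream.
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes("hello")))
        {
            Task<string> text = ReadTextAsync(stream, 16);
            Console.WriteLine(text.Result);
        }
    }
}
```

Because the returned Task<string> also implements IAsyncResult, an object like this could in principle be handed back from the Begin half of an asynchronous page handler, although the wiring for that is not shown here.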

Dealing with CPU intensive work

Web applications that need to perform expensive computations may still benefit from parallelism if the latency of an individual request is more important than overall request throughput. If this is the case, the new APIs for parallelism in .NET 4 such as Task Parallel Library and PLINQ can simplify writing the parallel code. When integrating parallelism into your web application, consider the following factors:


If requests are computationally cheap to process, then parallelism is probably an unnecessary overhead.

If the incoming request rate is high, then adding more parallelism will likely yield few benefits and could actually decrease performance, since the incoming rate of work may be high enough to keep the CPUs busy.

If the incoming request rate is low, then the Web application could benefit from parallelism by using the idle CPU cycles to speed up the processing of an individual request. We can use either PLINQ or TPL (either Parallel loops or the Task class) to parallelize the computation over all the processors. Note, however, that by default the PLINQ implementation in .NET 4 will tie up one ThreadPool worker per processor for the entire execution of the query. As such, it should only be used in Web applications that see few but expensive requests.

If the incoming request rate is variable, i.e. there are long periods when the request rate is low (say, at night) and other periods when it is high (say, midday), we need a strategy that will dynamically adjust to the available resources. When the load is high, we don’t want to add to the contention, but when the load is low, we want to use the idle resources. For this scenario, we can use TPL’s Parallel or Task constructs since they can adapt to use available resources within a process. If the server is already loaded, the Parallel loops can use as few as one worker and still make forward progress. If the server is mostly free, they can grow to use as many workers as the ThreadPool can spare.

Developing libraries for ASP.NET

If you’re developing a library that uses the parallel programming features of .NET 4, you should consider whether it is going to be used within ASP.NET. If it is, you should consider exposing knobs from your library that control how much parallelism the library employs. This is particularly important for libraries that use PLINQ. In .NET 4, PLINQ by default uses a fixed number of workers equal to the number of logical processors. By exposing control to the consumer of the library, the consumer can specify a maximum amount of parallelism to be employed, and this value can be configured based on the environment. The number of workers PLINQ uses is controllable through the WithDegreeOfParallelism operator; the maximum number of workers used by the Parallel loops is controllable through the ParallelOptions class, an instance of which is supplied as a parameter to overloads of the looping constructs.
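A library might surface such a knob as an ordinary method parameter. The sketch below is only an illustration: the class ParallelKnobSketch, its two methods and the sum-of-squares workload are all hypothetical, while WithDegreeOfParallelism and ParallelOptions.MaxDegreeOfParallelism are the .NET 4 APIs named above.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelKnobSketch
{
    // PLINQ variant: the caller caps the number of workers the query uses.
    public static long SumOfSquaresPlinq(int[] values, int maxDegreeOfParallelism)
    {
        return values.AsParallel()
                     .WithDegreeOfParallelism(maxDegreeOfParallelism)
                     .Sum(v => (long)v * v);
    }

    // Parallel.For variant: the same cap is passed through ParallelOptions.
    public static long SumOfSquaresLoop(int[] values, int maxDegreeOfParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
        long total = 0;

        Parallel.For(0, values.Length, options,
            () => 0L,                                                  // per-worker running sum
            (i, state, local) => local + (long)values[i] * values[i],  // accumulate without sharing
            local => Interlocked.Add(ref total, local));               // merge once per worker

        return total;
    }

    static void Main()
    {
        int[] data = Enumerable.Range(1, 100).ToArray();

        // Assumption: a host such as an ASP.NET app would read this cap from configuration.
        int cap = 2;
        Console.WriteLine(SumOfSquaresPlinq(data, cap));
        Console.WriteLine(SumOfSquaresLoop(data, cap));
    }
}
```

Taking the cap as a parameter leaves the decision with the hosting application, which knows whether it is running on a busy Web server or a mostly idle machine.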

Conclusion

ASP.NET already takes advantage of multiple processors on your server. Most developers will not need to explicitly add any parallelism to their ASP.NET Web applications. However, if your particular situation requires explicit parallelism, the new parallelism APIs in .NET 4 can be beneficial to you.

Gallery of Processor Cache Effects

Igor Ostrovsky is a developer on the Parallel Extensions team. On his blog, he’s documented a great set of examples of how caches can affect application performance. This is important to think through when writing parallel applications, but as Igor demonstrates, it applies equally to serial applications. Check out his post, Gallery of Processor Cache Effects.

DryadLINQ now also available for non-academic use

Several months ago, Microsoft announced the availability of DryadLINQ for academic customers. DryadLINQ is a LINQ provider developed by Microsoft Research that enables .NET developers to use the LINQ programming model for writing distributed queries and computations against a cluster of computers using Windows HPC Server. DryadLINQ enables developers to harness and tame the distributed data storage and computational resources of a cluster, all with a familiar LINQ-based syntax, just as PLINQ enables developers to more easily take advantage of multi-core and manycore. (In fact, DryadLINQ is capable of using PLINQ internally to harness multiple cores available on each cluster node.)

It’s a pleasure to announce that, as of today, MSR has also released DryadLINQ under an additional license agreement, one that allows for non-academic use. The academic and non-academic releases are largely identical: the key differences are in the licenses themselves, and in that the academic release supplies source code for the programming model layer whereas the commercial release is a binary-only distribution.

In order to download the new release, you will need to register on the Dryad connect site. You will also need a Windows HPC Server cluster (three nodes will suffice), for which you can download a free evaluation version at http://www.microsoft.com/hpc/en/us/try-it.aspx.

We are looking forward to receiving your feedback about this release!


Colorful Stickers Part 6

01 February, 2010

20 Icons

“Colorful Stickers part 6”, the newest addition to the popular “Colorful Stickers” series, is finally available. We’ve listened to the needs of the many, and composed the set as a compilation of user-requested icons.
The pencil and brush icons, along with 18 other icons fit perfectly into a compact whole recognizable by the unique “Colorful Stickers” style. They will surely embellish the look of your projects, and perhaps even inspire new development directions.
The icons are available in these sizes: 16×16px, 24×24px, 32×32px, 48×48px, 64×64px, 128×128px and 256×256px in 32-bit transparency PNG file format. The icons also come in ICO and ICNS file formats in 128×128px size.

Are you using the CCR? We’d love to hear about it.

Are you using the CCR (Microsoft Robotics’ “Concurrency & Coordination Runtime”) today in production applications or libraries, and in particular for non-robotics purposes?  If so, we’d love to hear about your experiences, and any and all information you’re willing to share would be very welcome. 


What do you like about it and the programming model it employs?  What don’t you like about it?  What features are crucial to you, and what features do you never use? How are you architecting your applications with it?  Are there any key code samples representative of your application that you’d like to share, or even better, a standalone implementation that highlights how you use it?  If you experimented with it but ended up not using it, why? Etc. If you’re interested and willing to share, please send me an email at stoub at microsoft dot com.


We’re excited to hear from you!


Thanks,
Stephen


FAQ :: Are all of the new concurrent collections lock-free?

(This answer is based on the .NET Framework 4.  As the details below are undocumented implementation details, they may change in future releases.)


No.  All of the collections in the new System.Collections.Concurrent namespace employ lock-free techniques to some extent in order to achieve general performance benefits, but traditional locks are used in some cases.


It’s worth noting that purely relying on lock-free techniques is sometimes not the most efficient solution.  When we say “lock-free,” we mean that locks (in .NET, traditional mutual exclusion locks are available via the System.Threading.Monitor class, typically via the C# “lock” keyword or the Visual Basic “SyncLock” keyword) have been avoided by using memory barriers and compare-and-swap CPU instructions (in .NET, “CAS” operations are available via the System.Threading.Interlocked class).


ConcurrentQueue<T> and ConcurrentStack<T> are completely lock-free in this way. They will never take a lock, but they may end up spinning and retrying an operation when faced with contention (when the CAS operations fail).


ConcurrentBag<T> employs a multitude of mechanisms to minimize the need for synchronization.  For example, it maintains a local queue for each thread that accesses it, and under some conditions, a thread is able to access its local queue in a lock-free manner with little or no contention.  Therefore, while ConcurrentBag<T> sometimes requires locking, it is a very efficient collection for certain concurrent scenarios (e.g. many threads both producing and consuming at the same rate).


ConcurrentDictionary<TKey,TValue> uses fine-grained locking when adding to or updating data in the dictionary, but it is entirely lock-free for read operations.  In this way, it’s optimized for scenarios where reading from the dictionary is the most frequent operation.
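To make the "spin and retry when the CAS fails" idea concrete, here is a small sketch of a lock-free update built on Interlocked.CompareExchange. It is not the implementation of any of the collections above (those details are undocumented); the class CasRetrySketch and the method InterlockedMax are hypothetical names, and the "atomic maximum" is just a convenient operation to demonstrate the pattern.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CasRetrySketch
{
    // Raises 'target' to 'candidate' if candidate is larger, without taking a lock.
    public static void InterlockedMax(ref int target, int candidate)
    {
        int observed = target;
        while (candidate > observed)
        {
            // Publish 'candidate' only if 'target' still holds the value we observed.
            int previous = Interlocked.CompareExchange(ref target, candidate, observed);
            if (previous == observed)
            {
                return; // CAS succeeded
            }

            // CAS failed: another thread changed 'target' first. Retry with the newer value.
            observed = previous;
        }
    }

    static void Main()
    {
        int max = 0;

        // Many threads contend on the same variable; failed CAS attempts simply loop again.
        Parallel.For(0, 100000, i => InterlockedMax(ref max, i));

        Console.WriteLine(max);
    }
}
```

No thread ever blocks here, but under heavy contention a thread may go around the loop several times, which is the trade-off described above for ConcurrentQueue<T> and ConcurrentStack<T>.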


FAQ :: You talk about performance, speedup, and efficiency…what do you mean exactly?

All of these terms are overloaded, even in the context of parallel computing.  However, we’ve used them extensively to describe how well our parallel algorithms and demo applications work.  And sometimes, we throw them around carelessly on the blog, forums, etc., so here are our general definitions.


Performance is an attribute that refers to the total elapsed time of an algorithm’s execution.  Less elapsed time means higher performance.


Speedup is a metric that quantifies performance by comparing two elapsed time values.  In parallel computing, these two values are usually generated by the execution of a serial algorithm and a parallelized version of the same algorithm.  Speedup is then calculated using the following equation:


Speedup = Serial Execution Time / Parallel Execution Time


So if a serial algorithm takes 100 seconds to complete, and the parallel version takes 40 seconds, the speedup is “2.5x”.


Efficiency is a metric that builds on top of speedup by adding awareness of the underlying hardware.  It is usually calculated using the following equation:


Efficiency = Speedup / # of cores


So if speedup is “2.5x” on a 4-core machine, efficiency is 0.625 or 62.5%.


Scalability is an attribute that refers to the speedup of an algorithm given different numbers of cores/processors.  The efficiency metric is good for quantifying scalability, because if efficiency holds constant as the number of cores changes, we have linear scaling (or awesome scalability).
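The two formulas above translate directly into code. The following sketch uses a hypothetical helper class, ScalingMetrics, and plugs in the numbers from the examples in this post (100 seconds serial, 40 seconds parallel, 4 cores); in practice the elapsed times would come from your own measurements.

```csharp
using System;

static class ScalingMetrics
{
    // Speedup = Serial Execution Time / Parallel Execution Time
    public static double Speedup(double serialSeconds, double parallelSeconds)
    {
        return serialSeconds / parallelSeconds;
    }

    // Efficiency = Speedup / number of cores
    public static double Efficiency(double speedup, int cores)
    {
        return speedup / cores;
    }

    static void Main()
    {
        double speedup = Speedup(100, 40);          // the 2.5x example above
        double efficiency = Efficiency(speedup, 4); // 0.625, i.e. 62.5%

        Console.WriteLine("speedup = {0}, efficiency = {1}", speedup, efficiency);
    }
}
```

Repeating the calculation for runs on 2, 4 and 8 cores and comparing the efficiency values is one simple way to check how close an algorithm comes to linear scaling.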


FAQ :: The Debugger does not correctly handle Task exceptions?

Just My Code Mode

The following code correctly observes and handles a Task exception and should print “gotcha!” to the console.  By default though, the Debugger will report a crash.

Task t = Task.Factory.StartNew(() => { throw new Exception("poo"); });
try { t.Wait(); }
catch (AggregateException) { Console.WriteLine("gotcha!"); }



The issue has to do with the “Just My Code” mode (enabled by default), which causes the Debugger to break in immediately when an exception leaves user code (the Task delegate) and enters non-user code (TPL internal).  This is usually a good thing, because it allows you to pinpoint exactly where an exception is going unhandled.  However, in this case, the Debugger is breaking before TPL can observe the exception.


Running without debugging or disabling “Just My Code” (Tools -> Options -> Debugging -> General) should resolve the issue.  Also, note that the Debugger actually broke in as though it had hit a breakpoint, so Continuing (F5) or Stepping (F10/F11) should allow further execution.



Colorful Stickers Part 5

11 January, 2010

20 Icons

DryIcons releases the next installment in the popular “Colorful Stickers” series, the new free icon set “Colorful Stickers part 5”. Everything has already been said about these icons, so just enjoy them.
This set was created to stick easily to all sorts of projects, including websites, blogs and back-end applications, and to give them that special “extra” look. Fun, elegant and chic, these icons easily “stick” to everyone’s hearts.
This free icon set contains 20 high quality, free icons in these sizes: 16×16px, 24×24px, 32×32px, 48×48px, 64×64px, 128×128px and 256×256px in 32-bit transparency PNG file format. The icons also come in ICO and ICNS file formats in 128×128px size.