08.01.2010 at
11:38 am · Saved under
.NET Help
ConcurrentyDictionary<TKey,TValue> is a new type in the .NET Framework 4, living in the System.Collections.Concurrent namespace. As noted in the MSDN documentation, ConcurrentDictionary “represents a thread-safe collection of key-value pairs that can be accessed by multiple threads concurrently.”
While ConcurrentDictionary implements IDictionary<TKey, TValue> just as does Dictionary<TKey,TValue>, it also has several unique mechanisms for adding / updating key/value pairs in the dictionary, mechanisms specific to its concurrent focus. Here is a summary of when to use which mechanism.
- If you want to…
- Add a new item to the dictionary only if the key doesn’t currently exist in the dictionary…
- Use TryAdd. TryAdd accepts the key and the value, and adds the pair to the dictionary if the key doesn’t currently exist in the dictionary. The method returns whether or not the new pair was added.
- public bool TryAdd(TKey key, TValue value)
- Update the value for an existing key in the dictionary, but only if the existing value for that key is equal to a specific value…
- Use TryUpdate. TryUpdate accepts the key and the new value to set that key to, but it also accepts the value that the key in the dictionary must currently contain in order for the update to succeed. You can think of TryUpdate like Interlocked.CompareExchange but for an element in the dictionary.
- public bool TryUpdate(TKey key, TValue newValue, TValue comparisonValue)
- Store a key/value pair into the dictionary unconditionally, overwriting any value for that key if the key already exists…
- Use the indexer’s setter, e.g. dictionary[key] = newValue.
- public TValue this[TKey key] { get; set; }
- Add a key/value pair to the dictionary if the key doesn’t exist in the dictionary, or if the key does exist, update the value for the key based on the key’s existing value…
- Use AddOrUpdate. AddOrUpdate has two overloads:
- One overload accepts the key as well as two delegates. The first delegate is used if the key doesn’t exist in the dictionary; it accepts the key and returns the value that should be added for the key. The second delegate is used if the key does exist: it accepts both the key and the current value for the key, and it returns the new value that should be set for the key.
- public TValue AddOrUpdate(TKey key, Func<TKey, TValue> addValueFactory, Func<TKey, TValue, TValue> updateValueFactory)
- The other overload accepts the key, an add value, and the update delegate. This is exactly the same as the former overload, except that if the key isn’t in the dictionary, the provided value is added for the key, rather than invoking a delegate to get the necessary value
- public TValue AddOrUpdate(TKey key, TValue addValue, Func<TKey, TValue, TValue> updateValueFactory)
- Get the value for a key in the dictionary, adding the value to the dictionary (and returning it) if the key doesn’t exist…
- Use GetOrAdd. You can think of this as lazy-initialization for a key/value pair in the dictionary, getting the value if it’s there, or adding it and then getting it if it’s not. As with AddOrUpdate, GetOrAdd provides two overloads: one takes the value to be added if the key doesn’t exist, and the other takes a delegate that will generate the value if the key doesn’t exist.
- public TValue GetOrAdd(TKey key, TValue value)
- public TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory)
All of these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary. The only caveat to the atomicity of each operation is for those which accept a delegate, namely AddOrUpdate and GetOrAdd. For modifications / writes to the dictionary, ConcurrentDictionary employs fine-grained locking to ensure thread-safety (reads on the dictionary are performed in a lock-free manner); however, these delegates are invoked outside of the locks in order to avoid the myriad of problems that can arise from executing unknown code under a lock. As such, the code executed by these delegates is not subject to the atomicity of the operation.

05.11.2009 at
9:44 pm · Saved under
.NET Help
Rejoice! Visual Studio 2010 Beta 2 is upon us and it includes lots of great changes for the Parallel Extensions. TPL guru Danny Shih has already covered what’s new in TPL and PLINQ aficionado Ed Essey has covered the best way to use LINQ (specifically between Beta 1 and Beta 2) but your favorite set of thread-safe collections and synchronization primitives are in on the action too.
So what’s changed? Overall, we’ve been fixing bugs, of course, but we’ve also spent a lot of time fine-tuning these types to get the best performance possible. That said, performance varies greatly from architecture to architecture and from scenario to scenario, so make sure you stick on your detective hat, fire up the Concurrency Visualizer and go sleuthing around your app’s performance profile to ensure that you’re using the right types in the right way. We’re always happy to see perf results and field perf-related questions on our forum.
I’m not going to talk about specific bugs or perf improvements here but let’s jump into functionality that’s been added or changed:
Barrier
In Beta 1, Barrier used to track phase numbers using a 32-bit integer. While most scenarios for Barrier couldn’t conceive of exhausting a range of 4+ billion numbers, for long running services, it’s not infeasible. That’s why we’ve decided to change Barrier to track phases using a 64-bit integer (so check for any code that called AddParticipant{s} or CurrentPhaseNumber). We encourage you to take comfort in the fact that your tasks and threads can joyously join together in cyclic fashion for an eternity*.
Also, in previous releases, if you decided to give Barrier a Post Phase Action to execute when all threads completed a phase, any exceptions thrown from that action would be thrown unwrapped and only on the thread that actually ended up executing the action. We’ve since realized that that’s a little silly. If your Post Phase Action faulted, more than likely, your algorithm is corrupt but only one thread would know about it. In Beta 2, all of your threads participating in a Barrier are now privy to these kinds of failures: any exceptions thrown from your action are wrapped in a BarrierPostPhaseException and thrown on all the threads that participated in that phase.
*Not really an eternity. A really really long time though.
BlockingCollection<T>
Good ol’ blocking collection. It’s giving producer threads and consumer threads everywhere a common ground upon which to communicate and brimming with thread-safety at every API. What’s that you say? In Beta 1 BlockingCollection<T>.Add() and BlockingCollection<T>.CompleteAdding() were not thread-safe with respect to each other? Using them concurrently might’ve corrupted the data structure so in Beta 1 you couldn’t do something like a nice parallel graph-traversal?
var targetNode = …;
var bc = new BlockingCollection<Node>(startingNodes);
// since we expect GetConsumingEnumerable to block, limit parallelism to the number of
// procs, avoiding too much thread injection
var parOpts = new ParallelOptions() { MaxDegreeOfParallelism = Enivronment.ProcessorCount };
Parallel.ForEach(bc.GetConsumingEnumerable(), parOpts, (node,loop) =>
{
if (node == targetNode)
{
Console.WriteLine(“hooray!”);
bc.CompleteAdding();
loop.Stop();
}
else
{
foreach(var neighbor in node.Neighbors) bc.Add(neighbor);
}
});
Well worry no longer. Our team of concurrency-geniuses has crammed even more thread-safety into BlockingCollection<T> for your graph-traversing pleasure by making these two methods thread-safe. (Note that Add() will now throw an exception if called after CompleteAdding() has been called so with Beta 2, you’ll need to put a catch block around the call to Add in this example.)
ConcurrentDictionary<TKey,TValue>
Since Beta 1 has released, we’ve gotten a lot of feedback, both internal and external, about ConcurrentDictionary<TKey,TValue>. CD is slightly more special than the other concurrent collections added in .NET 4: unlike the other collections, it provides direct access to all of it’s elements. This, of course, supports a whole different set of scenarios. While ConcurrentStack<T>, ConcurrentQueue<T>, and ConcurrentBag<T> are really about processing arbitrary elements, ConcurrentDictionary<TKey,TValue> is about processing elements associated with known keys. With the former, threads will only race on access to the end elements of the collection. With the latter, access to every element may result in a race. That said, in Beta 1, it was very difficult to do any multi-step operations with the elements in a ConcurrentDictionary<TKey,TValue>. Suppose you just wanted to create a simple thread-safe cache. You were forced to check if the value existed and add it if it did not. Of course, in the time between checking for the item and adding it, an element with the same key could have been added, forcing you to check for such a condition and trying again.
Enter two new atomic multi-step operations* on CD: GetOrAdd and AddOrUpdate.
GetOrAdd allows you to specify a key as well as a value (or delegate to produce the value). The method will atomically check for the existence of a given key and, if the key already exists, return the existing value. If the key does not exist, this method will add the new value you’ve specified (with or without the delegate). For example:
private ConcurrentDictionary<string, Data> _cache =
new ConcurrentDictionary<string,Data>();
// on multiple threads
public Data ProcessRequest(string input)
{
return _cache.GetOrAdd(input, (key) =>
{
…;
return …;
}
}
AddOrUpdate enables you to either add a new element at a given key or update an existing one if that key already exists, great for scenarios that need thread-safe lookup-tables like counting the number of occurrences of a given string in parallel:
private ConcurrentDictionary<string,int> _counts =
new ConcurrentDictionary<string,int>();
// on multiple threads
public int CountRequest(string input)
{
return _counts.AddOrUpdate(input, 1, (key,old) => old + 1);
}
With AddOrUpdate, you specify the key that you’re interested in, the value to add if the key is not present (or a delegate to create that value), and, finally, a delegate that will generate the new value in the case that the key is already present. This delegate passes along the key and existing value key so your delegate can use it to determine the new value.
Nifty!
*Atomic with regards to other mutating methods on the collection (e.g. TryAdd/TryUpdate/TryRemove/etc.), excluding the execution of the user-provided delegate. This means that if multiple threads race to call GetOrAdd all of their delegates may be called and some delegates may be called multiple times. Users are expected to rollback any unwanted side-effects.
ConcurrentLinkedList<T> and ConcurrentLinkedListNode<T>
In each and every software professional’s career there comes a point where he or she might have to swallow their pride and let a creation that they love go. For some reason or another, their fancy invention just ultimately, doesn’t provide enough value to justify its existence. Now I know we went and got you all excited about ConcurrentLinkedList<T> in Beta 1 but we had to let it go (we did warn you though!). Unfortunately, with the time we had available we just couldn’t get it to be usable and perform well. There seem to be many thread-safe linked list implementations that are very scalable but usually that scalability is based on some assumption or odd caveat in the design that severely degrades the practicality of the type. It hurts us deeply to take CLL<T> out but its performance just wasn’t good enough to ship it. No need to send any flowers.
Lazy<T>, LazyInitializer, and ThreadLocal<T>
Oh boy has Lazy<T> been fun! In Beta 1, we told you that we renamed LazyInit<T> to Lazy<T>, added a new struct-based version called LazyVariable<T> and moved the thread-local mode over to its own type (ThreadLocal<T>). Well, in the interest of making these primitives as usable as possible, we’ve gone and done it again! Here’s a breakdown of the lazy-initialization primitives in Beta 2:
- Lazy<T> is still a class-based lazy initialization primitive that’s safe to pass around. We’ve given it it a simple constructor that accepts a bool to indicate whether it needs to be thread-safe or not.*
Also, we now detect recursive calls to Value in the valueFactory delegate. If your valueFactory delegate attempts to access the Value property on the Lazy<T> to which it was passed, an InvalidOperationException will be thrown.
Furthermore, we firmed up our stance on exception handling. In the interest of keeping the type observationally pure, if a valueFactory delegate throws an exception, that exception will be cached and subsequent calls to Value will result in that same exception being rethrown.
- LazyExecutionMode has been removed now that we have only two simple initialization modes on Lazy<T>*
- LazyVariable<T> has also been removed (we said it would!). There were just too many issues with having a struct-based version. If you need lighter-weight lazy initialization, use LazyInitializer. If you need the lighter-weight and an object to pass around, it’s simple enough to create your own struct that uses LazyInitializer’s methods internally.
- LazyInitializer still contains all the advanced static methods for lazy initialization patterns.
- ThreadLocal<T> remains unchanged other than some great perf improvements!
*Though in Beta 2 we reduced the number of execution modes to simply thread-safe and not-thread-safe, customer requests have proven that we still need the three. By RTM, there will be an additive constructor to support advanced developers that need tighter control over how thread-safety is achieved. Mostly, this is to enable library writers to use third-party initialization delegates without having to worry if they take locks and will potentially cause a deadlock. Regardless, feel free to use the constructors provided today as any changes are purely additive.
Serialization
In Beta 1, ConcurrentBag<T>, ConcurrentDictionary<TKey,TValue>, ConcurrentQueue<T>, ConcurrentStack<T>, and Lazy<T> all implemented ISerializable. Turns out we were a little behind the times. ISerializable is not version tolerant. We’ve remedied the situation and all of these types now use VTS for serialization, with little to no impact on you!
Have fun with Beta 2 and drop us a line. We’d love any feedback you have on the changes!
Josh and the coordination data structures crew