There is a lot to like in the upcoming release of .NET 4.0, but one thing that recently caught my eye was the parallel programming support. In a nutshell, the parallel programming classes allow you to leverage multithreading without having to manage threads yourself. In my specific case, I was dealing with a collection of objects where I needed to perform a specific operation on each item in the collection. The current code was simply using a foreach loop and calling the method on each object within the body of the loop. I decided to see how big of an improvement I could get by refactoring the code to use the parallel programming classes.
First, let’s look at the class that was being stored in the collection:
public class MyAccountingMethod
{
    public MyAccountingMethod() { }

    private int _theAnswer;

    public int TheAnswer
    {
        get { return _theAnswer; }
    }

    public void CalculateTheAnswer()
    {
        //Substitute this with actual work
        System.Threading.Thread.Sleep(10);
        Random rnd = new Random();
        _theAnswer = rnd.Next();
    }
}
So, the scenario was that I had a collection of MyAccountingMethod objects and needed to call the CalculateTheAnswer method on each one (the results would be consumed later). You’ll have to use your imagination and replace the body of the method with something more useful.
The first step is to generate the collection of MyAccountingMethod objects. Enumerable.Range combined with Select comes in very handy for this, since it constructs a separate object for each element (Enumerable.Repeat would instead place the same instance in every slot of the list):
private List<MyAccountingMethod> GetAccountingList(int count)
{
    return Enumerable.Range(0, count).Select(i => new MyAccountingMethod()).ToList();
}
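One detail worth verifying when building a list like this: Enumerable.Repeat evaluates its argument once and yields that same reference for every element, whereas Range plus Select runs the constructor once per element. The following is a minimal sketch of the difference; Widget is a hypothetical stand-in for MyAccountingMethod, not a class from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MyAccountingMethod.
public class Widget { }

public static class RepeatVersusSelect
{
    public static void Main()
    {
        // "new Widget()" is evaluated once; the list holds 5 copies of one reference.
        List<Widget> repeated = Enumerable.Repeat(new Widget(), 5).ToList();

        // The lambda runs once per element, so the list holds 5 separate objects.
        List<Widget> distinct = Enumerable.Range(0, 5).Select(i => new Widget()).ToList();

        // Distinct() uses reference equality here because Widget does not override Equals.
        Console.WriteLine(repeated.Distinct().Count()); // prints 1
        Console.WriteLine(distinct.Distinct().Count()); // prints 5
    }
}
```

This matters more in the parallel version, where a single shared instance would be written to by several threads at once.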
Next, let’s mimic the existing code by using a typical foreach loop:
private void button1_Click(object sender, EventArgs e)
{
    List<MyAccountingMethod> accountingList = GetAccountingList(1000);
    System.Diagnostics.Stopwatch timer = System.Diagnostics.Stopwatch.StartNew();
    foreach (MyAccountingMethod accounting in accountingList)
    {
        accounting.CalculateTheAnswer();
    }
    timer.Stop();
    MessageBox.Show(timer.ElapsedMilliseconds.ToString());
}
Followed by the parallel programming approach:
private void button2_Click(object sender, EventArgs e)
{
    List<MyAccountingMethod> accountingList = GetAccountingList(1000);
    System.Diagnostics.Stopwatch timer = System.Diagnostics.Stopwatch.StartNew();
    System.Threading.Tasks.Parallel.ForEach(accountingList, accounting =>
    {
        accounting.CalculateTheAnswer();
    });
    timer.Stop();
    MessageBox.Show(timer.ElapsedMilliseconds.ToString());
}
Notice the foreach loop has been swapped out for a call to Parallel.ForEach. The method has several overloads; the one used here takes two arguments: the IEnumerable collection to iterate over and a delegate to be executed in each iteration of the loop. The code uses a lambda expression to define the delegate. See John’s excellent discussion on delegates and lambdas if the syntax is unfamiliar.
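To make the "delegate" part a bit more concrete, here is a small sketch showing the same two-argument call with an explicit Action<T> instead of a lambda, plus one of the other overloads, which accepts a ParallelOptions object. The Process method and the cap of 2 workers are illustrative assumptions, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForEachOverloads
{
    private static int _processed;

    // Hypothetical unit of work; any method matching Action<int> would do.
    private static void Process(int item)
    {
        Thread.Sleep(10);                       // placeholder for real work
        Interlocked.Increment(ref _processed);  // thread-safe counter update
    }

    public static void Main()
    {
        List<int> items = Enumerable.Range(0, 100).ToList();

        // Two-argument overload: the source plus an Action<T> delegate.
        Action<int> work = Process;
        Parallel.ForEach(items, work);

        // Overload with ParallelOptions: limit how many iterations run at once.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };
        Parallel.ForEach(items, options, item => Process(item));

        Console.WriteLine(_processed); // prints 200
    }
}
```

Capping MaxDegreeOfParallelism can be useful when the work inside the loop competes for a limited resource, though the default is usually a reasonable starting point.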
The result? Your mileage may vary depending on how many cores your machine has, but on my test system the processing of the loop went from an average of 10,700 milliseconds using the foreach loop to 3,400 milliseconds using Parallel.ForEach. (The sequential figure is about what you would expect: 1,000 items at a 10 millisecond sleep each is at least 10,000 milliseconds.) A fairly significant boost.
[UPDATE]
As Pablo Gazmuri has correctly pointed out, you still need to make sure that whatever you are doing inside the Parallel.ForEach is done in a thread-safe manner, so keep that in mind. Specific to collections, .NET 4.0 introduces a new set of thread-safe collections in the System.Collections.Concurrent namespace.
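As a rough illustration of that point, the sketch below collects results from a Parallel.ForEach into a ConcurrentBag<T>, one of the thread-safe collections; adding to a plain List<T> from multiple threads would not be safe. The SquareAll helper is a made-up example, not code from the original project.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ThreadSafeCollection
{
    // Hypothetical helper: squares each input in parallel and returns the sorted results.
    public static int[] SquareAll(int[] inputs)
    {
        // ConcurrentBag<T> allows concurrent Add calls without explicit locking.
        var results = new ConcurrentBag<int>();

        Parallel.ForEach(inputs, n => results.Add(n * n));

        // The bag is unordered, so sort before returning.
        return results.OrderBy(x => x).ToArray();
    }

    public static void Main()
    {
        int[] squares = SquareAll(Enumerable.Range(1, 1000).ToArray());
        Console.WriteLine(squares.Length); // prints 1000
    }
}
```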
Finally, for anyone interested in digging deeper, Patterns for Parallel Programming is a good read.

