PLINQ in .NET 4.0

Continuing the parallel programming discussion, I’d just like to give a brief introduction to PLINQ. Over the last few years, we have not seen a big jump in CPU clock speeds. What we have seen instead is chip makers putting multiple cores on a single chip, and there has not been an easy way for developers to take advantage of them. One of the areas that can benefit from parallel processing is LINQ. The LINQ syntax is very easy to write and understand, but over large data sets a query can be slow because it runs on a single core. Fortunately, in .NET 4.0 Microsoft added extension methods to LINQ so that our queries can use multiple cores.


Let’s take a look at the syntax and speed differences between a PLINQ and LINQ query. First is our LINQ query:


_sequentialQuery = from b in _babies
                   where b.Name.Equals(_userQuery.Name, StringComparison.InvariantCultureIgnoreCase) &&
                         b.State == _userQuery.State &&
                         b.Year >= YEAR_START && b.Year <= YEAR_END
                   orderby b.Year
                   select b;


and the PLINQ query:


_parallelQuery = from b in _babies.AsParallel().WithDegreeOfParallelism(numProcs)
                 where b.Name.Equals(_userQuery.Name, StringComparison.InvariantCultureIgnoreCase) &&
                       b.State == _userQuery.State &&
                       b.Year >= YEAR_START && b.Year <= YEAR_END
                 orderby b.Year
                 select b;


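So the only difference is the call to the extension methods AsParallel().WithDegreeOfParallelism(numProcs). AsParallel() is what turns the query into a PLINQ query. WithDegreeOfParallelism(numProcs) is optional: it caps the number of concurrently executing tasks for this query at numProcs, which in practice limits how many cores the query will use.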
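To try the parallel query outside of the original application, here is a minimal, self-contained sketch. The Baby class, the year bounds, and the sample data are assumptions made up for illustration; the post does not show the real types behind _babies or the values of YEAR_START and YEAR_END.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the real class behind _babies is not shown in the post.
public class Baby
{
    public string Name { get; set; }
    public string State { get; set; }
    public int Year { get; set; }
}

public static class PlinqSketch
{
    // Placeholder bounds standing in for YEAR_START and YEAR_END.
    private const int YearStart = 1950;
    private const int YearEnd = 2000;

    public static List<Baby> Search(IEnumerable<Baby> babies, string name, string state, int numProcs)
    {
        var query = from b in babies.AsParallel().WithDegreeOfParallelism(numProcs)
                    where b.Name.Equals(name, StringComparison.InvariantCultureIgnoreCase) &&
                          b.State == state &&
                          b.Year >= YearStart && b.Year <= YearEnd
                    orderby b.Year
                    select b;

        // The query is deferred; ToList() is what actually executes it.
        return query.ToList();
    }

    public static void Main()
    {
        var babies = new List<Baby>
        {
            new Baby { Name = "Emma", State = "NY", Year = 1990 },
            new Baby { Name = "emma", State = "NY", Year = 1960 },
            new Baby { Name = "Emma", State = "CA", Year = 1975 },
            new Baby { Name = "Liam", State = "NY", Year = 1980 },
            new Baby { Name = "Emma", State = "NY", Year = 2010 }
        };

        // Use at most 2 concurrent tasks for this query.
        foreach (var b in Search(babies, "Emma", "NY", 2))
        {
            Console.WriteLine("{0} {1} {2}", b.Name, b.State, b.Year);
        }
    }
}
```

With this sample data, only the two New York "Emma" rows inside the year range match, and because of the orderby clause they come back sorted by year even though the filtering ran in parallel.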


Now, not all is rosy. There are some caveats to using PLINQ. Partitioning the data, coordinating the worker threads, and merging the results all add overhead, and for smaller data sets the PLINQ version may actually be slower. Below are two examples of this.


Here is a screenshot of running the above query and returning 3 million records:


[Screenshot: PLINQ_3M]


We see a big improvement with PLINQ. If we only return 300 records however, we get:


[Screenshot: PLINQ_300]



So we can see the cost of the overhead. As always, using PLINQ (or any new technology) requires a lot of testing to be sure it is the right approach for your current project.
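If you want to run that kind of test yourself, a rough sketch of a timing harness might look like the following. It is not the code behind the screenshots above: the IsMatch filter and the integer data are stand-ins invented for illustration, and the numbers you get will depend on your machine and on how expensive the per-record work is.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class PlinqTiming
{
    // Stand-in for the per-record work in a real query; purely illustrative.
    public static bool IsMatch(int n)
    {
        return n % 7 == 0;
    }

    public static int CountSequential(int[] data)
    {
        return data.Where(n => IsMatch(n)).Count();
    }

    public static int CountParallel(int[] data)
    {
        return data.AsParallel().Where(n => IsMatch(n)).Count();
    }

    // Runs the action once and returns the elapsed time in milliseconds.
    public static long TimeMs(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Compare(int count)
    {
        int[] data = Enumerable.Range(0, count).ToArray();
        int seqHits = 0;
        int parHits = 0;

        long seqMs = TimeMs(() => { seqHits = CountSequential(data); });
        long parMs = TimeMs(() => { parHits = CountParallel(data); });

        Console.WriteLine("{0} items: LINQ {1} ms ({2} hits), PLINQ {3} ms ({4} hits)",
            count, seqMs, seqHits, parMs, parHits);
    }

    public static void Main()
    {
        // Small and large inputs, mirroring the 300 vs. 3 million comparison.
        Compare(300);
        Compare(3000000);
    }
}
```

Keep in mind that the first run of each query also pays for JIT compilation and thread pool start-up, so it is worth running each measurement several times before drawing conclusions.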
