Performance of lock() vs. Interlocked.CompareExchange()

Trips

Well-known member
Joined
Aug 7, 2010
Messages
2,788
<p style="margin:0cm 0cm 10pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri We have an analytical application which runs a bunch of tasks in parallel. For the most part the tasks do not need to modify any shared data, but
there is one exception, which causing us some headache.<span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri Basically all the tasks need to perform some adds into an array of doubles, like this:
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri m_sharedData[j] += number;
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri We have tried two different ways protecting the shared data:
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri lock(m_lock)
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri {
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri <span>
m_sharedData[j] += number;
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri }
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri And

<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri double oldVal;
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri double unChangedVal;
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri do
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri {
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri <span>
oldVal = m_sharedData[j];
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri <span>
unChangedVal = Interlocked.CompareExchange(ref m_sharedData[j], oldVal + val, oldVal);
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri }
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri while (unChangedVal != oldVal);
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri A few things to know:
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri This code is in the innermost loop and to will be called very frequently. A typical analysis takes about an hour and that may generate something
like 50 billion calls to this code.
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri On the other hand: There is a significant memory overhead for each task, so we do not run any more tasks than there are CPU-cores on the system,
so on a pretty big box something like 24 tasks/threads, and m_sharedData will typically have something like 20000 to 100000 elements, so in many cases it will only be a (relatively) rare occurrence that two threads hit the same array-element at the same time.
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri The parallelism is introduced with a .net 4 parallel loop:
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri var options = new ParallelOptions { MaxDegreeOfParallelism = m_numberOfThreads };
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri Parallel.ForEach(GetJobs(), options, DoWork);<span>

<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri There will typically be around 1000 Jobs to run. The entire ForEach-operation will take about 5 seconds to complete (with 24 CPU-cores). (We get
to a full hour since there is an outer loop around this.)
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri We were expecting the CompareExchange-version to clearly outperform the lock-version of the code, but that is not what we see.
<span> In fact, CompareExchange seems to be anywhere from 30 to 300 % slower that lock.
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-family:Calibri; font-size:small
<p style="margin:0cm 0cm 0pt <span lang="EN-US <span style="font-size:small <span style="font-family:Calibri Weâve wanted to measure how frequently two threads hit the same array element at the same time and we came across a performance counter: Contention
Rate / sec in the .NET CLR LocksAndThreads category, but I am not sure that we can trust it? In some situations (where we do run with the lock version of the code, this counter just gives 0 contentions over extended periods of time. Thatâs strange to
me.

View the full article
 

Similar threads

Back
Top