My Blahg

February 21, 2007

List<T> is slow across multiple threads

Filed under: c#, dotnet, threading — treyhutcheson @ 3:46 pm

I recently saw some interesting performance characteristics when dealing with List. I had 300 methods queued to the thread pool. Each worker item used a local copy of List for some processing. At first, I cloned this list for each worker thread. Then I realized that since the data were read-only, I decided to just pass a reference to the original list. The logic was that the cloning operation was potentially expensive depending on the size of the list(in this case, 22000 items long), and that by merely passing a reference I wouldn’t need to clone the list any more.

So I changed the code, and performance plummeted. I used my high performance timer object to track certain blocks of code, and sure enough, removing the cloning operation sped that portion up considerably. But the elapsed time for each worker thread execution increased by an order of magnitude.

My guess is that internally the list serializes access to its contents, which caused an indeterminate number of worker threads to block, over and over again.

I’ll demonstrate through some sample code. The first bit is the code that clones the original list:


List universe = AcquireRecordIds(); //this returns a list of approximately 22,000 integers

//fire off each bucket into the thread pool; approximately 300 buckets
foreach( Bucket in buckets )
{
  //clone the list for each bucket
  List bucketUniverse = new List( universe );

  //send off the bucket to the thread pool
  EnqeueuBucketWorkerItem( bucketUniverse , bucket );
} //END bucket loop

In this case of of this bit of pseudocode, the cost of cloning the list was ~40 milliseconds on my laptop. Total execution time for all 300 worker items was ~100 milliseconds.

Now here’s some sample pseudocode that just used an object reference for the list


List universe = AcquireRecordIds(); //this returns a list of approximately 22,000 integers

//fire off each bucket into the thread pool; approximately 300 buckets
foreach( Bucket in buckets )
{
  //send off the bucket to the thread pool
  EnqeueuBucketWorkerItem( universe , bucket );
} //END bucket loop

In this case, there was zero time for cloning because that operation was removed. However, total execution time for all 300 worker items approached 2000 milliseconds.

I have to assume this is because of internal locking on behalf of the List. In my scenario, I was simply reading data from the list(each bucket read different, but potentially overlapping portions) from it’s copy of the list. The more often the list reference was accessed, the more often threads would be blocked. Basically all I mean is that your mileage may vary, so to speak.

In summary, the cloning operation cost me ~40 milliseconds, but saved me ~1900 milliseconds when a bunch of background threads were processed.

February 19, 2007

Custom Threading Event Objects Part2

Filed under: c#, dotnet, threading — treyhutcheson @ 9:22 pm

In my last post I detailed the StackResetEvent. In this post I will describe the CounterResetEvent.

We’ve all seen threading examples for using the Thread Pool. But what happens when you need to queue a bunch of items to the pool but you have to wait for each of them to complete? If it’s a deterministic situataion, in this case meaning that you know how many worker items you will spawn beforehand, you can use the CounterResetEvent.

The CounterResetEvent extends EventWaitHandle, and is capable of using both Manual or Auto reset semantics. The concept is pretty simple: the event is unsignaled until its internal counter achieves a certain threshold, then it becomes signaled. So for example, if your main thread is going to queue 1000 items to the thread pool, it can initialize a new CounterResetEvent with a max value of 1000. The main thread would then be suspended with a call to WaitOne on the reset event. Each worker item, when complete, will call the event’s Increment method to increment it’s internal counter. When the internal counter reaches the maximum value of 1000, the thread becomes signaled, resuming any blocked threads(the main thread in this case).

It’s a pretty simple object, really. But like the StackResetEvent, it suffers from the fault that if the counter is not incremented as expected, for whatever reason, blocked threads will never become unblocked because the signal condition will never be true.

Here’s the code. Again forgive the formatting and lack of comments.

public class CounterResetEvent : EventWaitHandle
{
private int _counter;

private int _maxValue;

private object _lockObject;

public CounterResetEvent( int maxValue )
: this( maxValue , false )
{ } //END constructor

public CounterResetEvent( int maxValue , bool initialState )
: this( maxValue , initialState , EventResetMode.ManualReset )
{
} //END constructor

public CounterResetEvent( int maxValue , bool initialState , EventResetMode resetMode )
: base( initialState , resetMode )
{
} //END constructor

public int MaxValue
{
get
{
return _maxValue;
}
} //END MaxValue property

public void Increment()
{
lock( _lockObject )
{
_counter++;
if( _counter >= _maxValue )
Set();
} //release

} //END Increment method

} //END CounterResetEvent class

February 16, 2007

Custom Threading Event Objects

Filed under: c#, dotnet, threading — treyhutcheson @ 4:00 pm

I recently ran into two scenarios where dotnet’s built-in threading objects weren’t sufficient, so I ended up rolling my own and thought these classes might be useful for some one else.

The first scenario is pretty simple. I’ve got simple service that will be hosted either as a Windows Service or by ASP.Net. I’m not managing the threads used by the incoming requests. If two request arrive on two separate threads, then each will be executed on it’s own thread. Each request will access some read-only data stored globally(singleton/service locator pattern, not a real global). This global data will become stale over time, and I need to unload it after it hasn’t been used in a while. This data is basically a big blob allocated on the unmanaged heap. So I developed a simple cache model that tracks cache hits any time the data is accessed.

You can think of this as my own custom unsafe garbage collector. As cache items age, they are advanced to the next generation. When a cache item reaches its third generation, it’s buffer is freed from the heap and the item is removed from the cache. I’ve got a background timer thread that runs every few hours to perform cache analysis and collection.

There are a few caveats to make this model work. First is that the cache items must be thread protected so that cache hits will correctly renew the cache item. Another condition requires that the cache analysis and collection cannot execute while the cache is being accessed. This condition leads to a third condition: request threads must notify the cache manager somehow that a request is being serviced, and when it has completed.

The problem is that the requests enter the service non-deterministically; i.e., I have no way of knowing when a request will arrive. Another problem is that I don’t know how long it will take for a request to execute, and that multiple requests may be serviced simultaneously.

A standard EventWaitHandle does not meet these conditions. So I created a new class named StackResetEvent that derives from EventWaitHandle. The concept is pretty simple: it’s an event wait handle that behave like a stack. Each incoming request call’s the event’s Push method, and when the request is complete, the event’s Pop method is called. Each time Push is invoked, the reset event is reset to non-signaled. When Pop has been called and the stack is empty, the event becomes signaled. Internally, there isn’t a stack. There’s a simple counter that is incremented or decremented on calls to Push/Pop. These operations are thread-safe.

It is possible that the internal counter may become out-of-sync; if a thread calls Push, encounters an exception and never calls Pop, then the event will never become signaled. For me that’s not an issue because I call Pop in a finally clause.

So how do I use this object to solve my problem? The cache manager has an instance of this class publicly available. When a request is serviced, it Pushes the reset event and asks the cache manager for specific data. When the request is complete, it Pops the reset event. If there are no more requests, the event is now signaled.

The cache collection background thread runs every hour. Before it attempts and cache analysis or collection, it calls WaitOne to wait for all requests to be serviced. This will suspend the timer thread indefinitely. It is possible that the thread will be suspended beyond the timer’s callback timespan, causing another entrance into the timer callback method. I simply check to see if an analysis is already being performed and leave if this is so.

That takes care of making sure that the cache manager doesn’t accidentally unallocate heap memory that might be in use in another thread. However, what happens if another request is received while a cache analysis/collection is being performed?

That’s simple. I have a normal ManualResetEvent that is reset to unsignaled when the cache manager is doing it’s thing. When it’s done, the event becomes signaled. So before an incoming request does anything else, it invokes WaitOne on this event to make sure it is suspended while any memory is being collected.

Here’s the source code for the StackResetEvent class. I have removed comments because they’re not formatting correctly with this blog tool.


public class StackResetEvent : EventWaitHandle
{

private int _counter;

private object _lockObject = new object();

public StackResetEvent() : this( false )
{
} //END constructor

public StackResetEvent( bool initialState )
: this( initialState , EventResetMode.AutoReset )
{
} //END constructor

public StackResetEvent( bool initialState , EventResetMode mode )
: base( initialState , mode )
{
} //END constructor

public void Push()
{
lock( _lockObject )
{
_counter++;
Reset();
}
} //END Push method

public void Pop()
{
lock( _lockObject )
{
_counter–;
if( _counter == 0 )
Set();
}
} //END Pop method

} //END StackResetEvent class

Blog at WordPress.com.