Why developers think spiffy dual-core processors suck
Both Wesner Moise (in a blog post on Concurrency) and Joe Duffy (in an MSDN Magazine End Bracket column on Transactions for Memory) have predicted that Software Transactional Memory may have an impact on concurrency comparable to the one garbage collection has had on memory management. I agree, and I also think these technologies may hit the mainstream sooner than we expect. One of the reasons is that, with the multi-core revolution upon us, concurrency has just taken an enormous jump up the ladder of complicated issues that matter to average developers. As Herb Sutter puts it in the title of his article in the March 2005 issue of DDJ: The free lunch is over.
Multi-threading has always been a pain. The biggest problem, deadlock, I’ve found to be reasonably easy to deal with, mostly because when one occurs the application is stalled in a specific place and you can rewrite it to prevent it from reaching that state. Not always easy, but doable. Over the years, however, no matter how many multi-threaded applications I’ve written, occasionally some strange race condition pops up, often related to some undocumented behaviour of a component I’m using. And the typical problems are always there: race conditions are hard to reproduce and identify, and most of the time it is even harder to validate that one has actually been completely fixed afterwards. Everything just gets harder, such as writing good unit tests to validate multi-threaded behaviour.
But that was not such an incredible problem, because most applications running on standard machines (either desktop or server) didn’t really need to get the most out of multi-threading. Most of it was done out of convenience, such as dividing the GUI and application work between threads so that the GUI would stay responsive, display a progress bar or even allow cancelling during long operations. But that will change now that increases in processor performance come from additional cores instead of additional speed of a single core. If one of your applications is running too slowly and you upgrade your ~3GHz machine to a brand-new dual-core ~3GHz machine, the old application will run at approximately the same speed unless it can spread its work over both cores.
Software Transactional Memory can provide us developers with a tool that makes it easy to write applications that split up their tasks between multiple threads without introducing a large number of complicated problems to watch out for (and end up with anyway). The concept is the same as with a database transaction: whenever you have a group of operations whose results you want to commit to memory atomically, just group them using a transactional memory construct. Inside the transaction, reading from a memory location yields the most recent write made inside the transaction. Outside the transaction, for instance in another thread, the same read yields the most recent write from before the transaction started, until the transaction commits.
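To make the idea a bit more concrete, here is a minimal sketch of a toy transactional memory in Python. It is not taken from any of the articles or papers mentioned in this post, and all names in it (TVar, Transaction, atomically, transfer, demo) are made up for illustration. It also assumes the simplest scheme I could think of: writes are buffered privately, and a single global lock is held only for reading a snapshot and for committing. Real implementations are considerably more sophisticated.

```python
import threading

# Assumption: one global lock guards snapshots and commits only.
# Transaction bodies themselves run without holding it.
_commit_lock = threading.Lock()


class TVar:
    """A hypothetical transactional variable: a value plus a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> (version, value) seen at first read
        self.writes = {}  # TVar -> pending value, private until commit

    def read(self, tvar):
        # Inside the transaction we see our own most recent write.
        if tvar in self.writes:
            return self.writes[tvar]
        # Otherwise remember what we saw, so commit can detect conflicts.
        if tvar not in self.reads:
            with _commit_lock:
                self.reads[tvar] = (tvar.version, tvar.value)
        return self.reads[tvar][1]

    def write(self, tvar, value):
        # Buffered: other threads cannot see this until commit succeeds.
        self.writes[tvar] = value

    def commit(self):
        with _commit_lock:
            # Validate: fail if anything we read was changed in the meantime.
            for tvar, (version, _) in self.reads.items():
                if tvar.version != version:
                    return False
            # Publish all buffered writes in one step.
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1
            return True


def atomically(body):
    """Run body(tx) and retry from scratch until it commits without conflict."""
    while True:
        tx = Transaction()
        result = body(tx)
        if tx.commit():
            return result


def transfer(src, dst, amount):
    def body(tx):
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    atomically(body)


def demo():
    a, b = TVar(1000), TVar(0)
    threads = [threading.Thread(target=transfer, args=(a, b, 1))
               for _ in range(100)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.value, b.value
```

In this sketch, calling demo() moves 1 unit from one account to the other on each of 100 threads, and no update is lost even though transfer contains no explicit locking: a transaction that read stale data simply fails to commit and is run again. The price of that convenience is that the body may execute more than once, so it should not have side effects outside of the transactional variables.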
Currently, transactional memory is still mostly in development at research labs. Some interesting papers to read on the subject are the original paper on Software Transactional Memory by Nir Shavit and Dan Touitou, "Concurrent Programming Without Locks" by Tim Harris and Keir Fraser (a good introduction, although a bit theoretical), and "Software Transactional Memory Should Not Be Obstruction-Free" by Robert Ennals. The last two express some opposing views on how transactional memory should be implemented.
Using such constructs, spawning a bunch of threads to work on the same problem will be a lot easier than it is now, although designing your software so that the work can effectively be broken up into separate tasks is still something you will have to figure out yourself. But that’s a different story.