The wacky world of parallel SQL

I was asked to look a SQL Server problem where a single user was running a single batch query and getting a deadlock, which on the face of it is a pretty neat trick. The error stated; ‘was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim’. Why piqued my interest was the phrase ‘communication buffer’. The query was pretty complex with a large number of joins and some casts on the join conditions but I was reassured that the code had worked fine on the development servers just not on the pre-live server. So what is a communication buffer in this context? I made a leap and assumed it was where the plan has split the work into its various stages and the communication was where the data was been processed and combined as the plan is executed. But still why would that cause a deadlock…ah well the title of the blog probably gives it away ;). The pre-live server is a pretty big beast with more cores than the average orchard and with only one user (no stress) it was likely that SQL was going to attempt to use a number of them. So in the great tradition of using a hammer to crack to a nut we switched the degree of parallelism on the server to be 1 rather than 0, i.e. don’t do it. The query ran fine. I probably should report this as a bug with SQL 2005 but if you do run into this problem then I’d suggest you use the maxdop hint in your query…or turn it off at the server, afterall what other nasties could this feature cause?
 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s