My colleagues were investigating a deadlock issue with a couple of long-ish transactions. The basic activity is:
TXA -> 1) Insert row X -> 2) Join using X -> 3) Complete
TXB -> 1) Insert row Y -> 2) Join using Y -> 3) Complete
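To make that concrete, here is a minimal sketch of the shape of each transaction. The dbo.Orders table, its columns and the @NewIds table variable are all invented for illustration; the real queries were rather more involved.

    DECLARE @NewIds TABLE (OrderId INT);   -- no PRIMARY KEY yet; more on that below

    BEGIN TRAN;

    -- 1) Insert the new row; SQL Server takes an exclusive key lock on it
    INSERT INTO dbo.Orders (OrderId, CustomerId) VALUES (1001, 7);
    INSERT INTO @NewIds (OrderId) VALUES (1001);

    -- 2) Join back onto the table, driven by the key(s) just inserted
    SELECT o.OrderId, o.CustomerId
    FROM dbo.Orders AS o
    JOIN @NewIds AS i ON i.OrderId = o.OrderId;

    -- 3) Complete
    COMMIT;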
The deadlock was occurring because (greatly simplified):
TXA -> 1) Exclusive key locks (on X) -> 2) Range locks (X + others)
TXB -> 2) Exclusive key locks (on Y) -> Deadlock!
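Incidentally, you can see these locks for yourself: pause one of the transactions partway through (or catch it while it is blocked) and look at what it holds. On SQL 2000 sp_lock does the job; on SQL 2005 sys.dm_tran_locks shows the same picture in more detail.

    -- List the locks currently held or requested; filter the output by the
    -- spid of the suspect transaction. Key-range locks show up as
    -- Mode = RangeS-S, RangeX-X etc. against Type = KEY rows; ordinary
    -- individual key locks are plain S or X.
    EXEC sp_lock;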
So the problem seemed to be that the optimizer had asked for more rows to be locked in step 2 than was necessary. Examining the query, it was using a table variable, and that variable usually only had one row in it, so why did the optimizer think it needed to take out such a wide lock?

Drawing on my experience with OPENXML, I suggested we try putting a PRIMARY KEY on the table variable. I hadn't done this before because I was concerned about the overhead of creating a key for a table variable that would usually only hold one row. However, the effect was dramatic. With the primary key in place the optimizer now seemed to realize that the table variable could only contain unique key values, and therefore it could take out individual key locks rather than range locks. The result was that the deadlocks disappeared.

This is obviously great news, since it represents a significant boost in concurrency. I do have one note of warning: should the number of rows inserted into the table variable grow to… well, I don't know and wouldn't like to second-guess, but at some point the optimizer may switch back to range locks. I hope not, and by that time I hope to be exclusively on SQL 2005+ and able to supply my own plans if necessary.
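In code terms, and sticking with the invented names from the sketch above, the whole fix was a one-line change to the table variable declaration:

    -- Before: the table variable is a heap with no index and no statistics
    -- DECLARE @NewIds TABLE (OrderId INT);

    -- After: the PRIMARY KEY gives the optimizer a unique index to seek on,
    -- so the join can take individual key locks instead of range locks
    DECLARE @NewIds TABLE (OrderId INT PRIMARY KEY);

The rest of the transaction stays exactly as it was.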
The (ahem) key point here is: don't assume the query optimizer will get it right every time, especially on "tables" with no statistics. You need to give it as much information as you can.
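If you want to see just how little the optimizer knows about a table variable, one way (using the same invented names as above) is to turn on SET STATISTICS PROFILE and compare the estimated row counts with the actual ones:

    SET STATISTICS PROFILE ON;

    DECLARE @NewIds TABLE (OrderId INT PRIMARY KEY);
    INSERT INTO @NewIds (OrderId) VALUES (1001);

    SELECT o.OrderId
    FROM dbo.Orders AS o
    JOIN @NewIds AS i ON i.OrderId = o.OrderId;

    SET STATISTICS PROFILE OFF;
    -- Compare the EstimateRows column with the actual Rows column for the
    -- operator that reads @NewIds.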