Another reason for a different query plan

While investigating a poorly performing query, a colleague realised that although it ran quickly in SQL Workbench it was causing timeouts from .NET clients. He used a nice trick: running the same query through OSQL (an ADO client) and simply waiting until it completed. Why does this work? I believe the problem is that the query is too complicated for the plan to be created before the command executing the query has timed out. But why doesn't boot-strapping the plan in SQL Workbench help? I've struggled to answer this before, but I may (may) have found the answer in SET options. Query plans are cached per combination of SET options, so there is a fair chance that the default ADO SET options differ from those used by SQL Workbench. Although that in itself is interesting, it still doesn't explain why one set of options produces plans faster than another. It sounds a little odd to me, but it's worth investigating further.
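
If I do get around to investigating, SQL 2005's plan cache at least exposes the SET options each plan was compiled with, so something along these lines (the LIKE filter is just a placeholder for the query in question) should show whether the Workbench and ADO plans really are cached separately:

SELECT st.text,
       cp.usecounts,
       pa.value AS set_options   -- different values here mean separate cached plans
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) AS pa
WHERE pa.attribute = 'set_options'
  AND st.text LIKE '%MyProblemQuery%';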

The wacky world of parallel SQL

I was asked to look at a SQL Server problem where a single user running a single batch query was getting a deadlock, which on the face of it is a pretty neat trick. The error stated: 'was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim'. What piqued my interest was the phrase 'communication buffer'. The query was pretty complex, with a large number of joins and some casts on the join conditions, but I was reassured that the code had worked fine on the development servers, just not on the pre-live server. So what is a communication buffer in this context? I made a leap and assumed it was where the plan has split the work into its various stages, and the communication is where the data is passed around and combined as the plan is executed. But still, why would that cause a deadlock…ah well, the title of the blog probably gives it away ;). The pre-live server is a pretty big beast with more cores than the average orchard, and with only one user (no stress) SQL was likely to attempt to use a number of them. So, in the great tradition of using a hammer to crack a nut, we switched the max degree of parallelism on the server to 1 rather than 0, i.e. don't do it at all. The query ran fine. I probably should report this as a bug with SQL 2005, but if you do run into this problem then I'd suggest you use the MAXDOP hint in your query…or turn it off at the server; after all, what other nasties could this feature cause?
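
For the record, both fixes look roughly like this (the table names are made up); the hint is the politer option:

SELECT a.Id, b.SomeColumn
FROM dbo.BigTable AS a
JOIN dbo.OtherTable AS b ON b.Id = a.Id
OPTION (MAXDOP 1);   -- cap just this query at a single scheduler

-- or the hammer: turn parallelism off for the whole server
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;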
 

SQL 2008 MSDN seminar

Just got back from a SQL 2008 MSDN UK seminar at Thames Valley Park (TVP). It was a fairly simple set of presentations to give a quick flavour of what 2008 may mean for developers. So here is an even quicker flavour…
Data Types: Some new data types, including lots of date/time types (probably the ones that were in the 2005 CTPs) and FILESTREAM (for data that is actually stored as files on disk), but the most interesting one was hierarchyid. This allows the developer to construct real hierarchy relationships and ask questions such as "who is an ancestor of x?". However, it does need to be manipulated via a whole set of related functions. The seminar also spent far too long on spatial types; OK, map data is all the rage, but who cares about this? Ever since SQL 2005 I've been worried about the potential explosion of CLR types in the database, and this just seems to confirm my worst fears. I think you can smell a CLR type because of the extra functions, and I really wonder how well (or not) they'll perform.
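
From what I picked up at the seminar, the usage looks roughly like this (a made-up staff table, and I haven't run this myself):

-- Hypothetical staff table using the new hierarchyid type
CREATE TABLE dbo.Staff
(
    Node      hierarchyid PRIMARY KEY,
    StaffName nvarchar(100) NOT NULL
);

-- Who reports (directly or indirectly) to a given node?
DECLARE @Manager hierarchyid;
SELECT @Manager = Node FROM dbo.Staff WHERE StaffName = N'Alice';

SELECT StaffName
FROM dbo.Staff
WHERE Node.IsDescendantOf(@Manager) = 1;

-- Or walk upwards: the node one level above Bob
SELECT Node.GetAncestor(1).ToString()
FROM dbo.Staff
WHERE StaffName = N'Bob';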

Semi-structured data: An interesting change where you can add sparse columns to a table. The idea is that you can create many columns (a wide table) to cover many attributes without overloading the table. This could be useful for storing derived types in the database. The classic example would be animal classification: not every animal has the same attributes, but you can put the whole range in the table schema when in reality only a few will be used on any one row. I'm not sure how the Codd brigade will take to this!
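
I imagine the animal example would be declared something like this (a hypothetical table, not the seminar's exact one), with the per-species attributes marked SPARSE:

CREATE TABLE dbo.Animal
(
    AnimalId      int IDENTITY PRIMARY KEY,
    AnimalName    nvarchar(100) NOT NULL,
    Species       nvarchar(50)  NOT NULL,
    WingSpanCm    int SPARSE NULL,   -- only birds use this
    MaxDepthM     int SPARSE NULL,   -- only fish use this
    GestationDays int SPARSE NULL    -- only mammals use this
);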

Filtered Index: Didn't really cover this, but it seems that you can create an index that only includes data meeting certain criteria. For example, you could create a dog index for the animal table that would only contain rows matching the "dog" value.
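
Assuming an animal table like the sketch above, I'd expect the syntax to be along these lines:

CREATE NONCLUSTERED INDEX IX_Animal_Dogs
ON dbo.Animal (AnimalName)
WHERE Species = N'dog';   -- only the 'dog' rows end up in the index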

Integration Services: The use of parallel execution and the inclusion of ADO.NET sources.

Tracking Changes: Oh yes, this stuff could be great, but will it perform well? Two types: Change Tracking, aimed at synchronisation activities, conflict resolution, etc.; and Change Data Capture, aimed at replaying changes to a different database.
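
From what was shown, the two are switched on quite differently; roughly this (database and table names are placeholders):

-- Change Tracking: enabled at the database, then per table
ALTER DATABASE MyDatabase
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.Animal
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);

-- Change Data Capture: enabled via system procedures instead
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name   = N'Animal',
     @role_name     = NULL;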

Merge: This looked very powerful: the ability to use one statement to merge data from one table into another. This seems to avoid the age-old SQL problem of knowing whether you need to insert a new row or update an existing one; with MERGE you can do it all in one statement. Looks suspiciously like an XQuery to me, and again the spectre of performance rises.
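
A quick sketch of the upsert pattern as I understood it (table names made up):

MERGE dbo.AnimalTarget AS t
USING dbo.AnimalStaging AS s
    ON t.AnimalId = s.AnimalId
WHEN MATCHED THEN
    UPDATE SET t.AnimalName = s.AnimalName, t.Species = s.Species
WHEN NOT MATCHED THEN
    INSERT (AnimalId, AnimalName, Species)
    VALUES (s.AnimalId, s.AnimalName, s.Species);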

Query Analyser vs. SQL Management Studio

I'm a big fan of using SQL Server 2005's Management Studio for writing my TSQL regardless of whether my database is SQL 2000 or 2005. However, I recently hit a strange problem. I'd written a number of scripts to automate a very dull task and published them to my colleagues. However, whenever they ran the scripts they got lots of errors complaining about incorrectly terminated strings. The only difference we could see was that I was using SQL Management Studio and they were using Query Analyser. Once we both used Management Studio it worked fine. I'm quite surprised by this since I always thought they'd simply send the text in the editor to SQL Server, but obviously that isn't true; it looks like some sort of parsing or transforming is going on. Very odd. If I have time I'll get a network sniffer on the job, but for now I'll simply issue this warning…you can't assume the scripts you write in Management Studio will work in Query Analyser.

Better locking behaviour from Table variables

SQL 2000 introduced a variable type called TABLE. This for me was one of the most interesting changes in the product, since I've had a long and bitter battle with temp tables (perhaps a subject for another blog) and I use table vars in a number of places. I've already discovered that you can use table vars to get better performance from queries that involve OPENXML. I've never got to the bottom of the exact reason, but since it's so easy to reproduce I think it's more Law than Theory. The premise seems to be that the query optimizer really has no clue what OPENXML will produce, so any query that needs a decent plan (i.e. one involving reasonable amounts of joined data) tends to suffer. However, if you simply insert all the data from OPENXML into a table var and substitute the var for the OPENXML in the join, the optimizer seems to produce much better plans. So I can sort of understand that and, as I've said, the proof is in the executing and it does work…almost always; there is an issue…
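
For what it's worth, the pattern I use looks roughly like this (hypothetical Orders/Customer names, with @xml being the incoming document, e.g. a proc parameter):

DECLARE @hDoc int;
DECLARE @Orders TABLE (OrderId int, CustomerId int);

EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

-- Shred the XML once into the table variable...
INSERT INTO @Orders (OrderId, CustomerId)
SELECT OrderId, CustomerId
FROM OPENXML(@hDoc, '/Orders/Order', 2)
     WITH (OrderId int, CustomerId int);

EXEC sp_xml_removedocument @hDoc;

-- ...then join against the variable instead of putting OPENXML in the FROM clause
SELECT c.CustomerName, o.OrderId
FROM @Orders AS o
JOIN dbo.Customer AS c ON c.CustomerId = o.CustomerId;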

My colleagues were investigating a deadlock issue with a couple of long-ish transactions. The basic activity is:
TXA -> 1) Inserts Row X -> 2) Join using X -> 3) Complete
TXB -> 1) Inserts Row Y -> 2) Join using Y -> 3) Complete

The deadlock was occurring because (greatly simplified):
TXA -> 1) Exclusive key locks (on X) -> 2) Range Locks (X+ others)
TXB -> 2) Exclusive key locks (on Y) !Deadlock

So the problem seemed to be that the optimizer had asked for more rows to be locked in (2) than was necessary. Examining the query, it was using a table var, and the var usually only had one row in it, so why did it think it needed to take out such a wide lock? Drawing on my experience with OPENXML, I suggested we try putting a PRIMARY KEY on the table var. Previously I'd not done this because I was concerned about the overhead of creating a key for a table var that would usually only have one row in it. However, the effect was dramatic. With the primary key in place the optimizer seemed to realize that the table var would only contain key data, and therefore it could take out individual key locks rather than range locks. The result was that the deadlocks disappeared. This is obviously great news since it represents a significant boost in concurrency. I do have a warning note: should the number of rows inserted into the table var grow to…well, I don't know and wouldn't like to second-guess, the optimizer may at some point switch back to range locks. I hope not, and by that time I hope to be exclusively on SQL 2005+ and able to supply my own plans if necessary.
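
In other words, the change was as small as this (dbo.SomeTable standing in for the real join):

-- Before: no key, so no uniqueness information for the optimizer
DECLARE @Ids TABLE (Id int);

-- After: with the primary key in place the optimizer, in our case, took individual key locks
DECLARE @IdsKeyed TABLE (Id int PRIMARY KEY);

INSERT INTO @IdsKeyed (Id) VALUES (42);

SELECT t.*
FROM dbo.SomeTable AS t
JOIN @IdsKeyed AS k ON k.Id = t.Id;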

The (ahem) key point here is: don't assume the query optimizer will get it right every time, especially on "tables" with no statistics; you need to give it as much information as you can.

Reporting Services

Over the years I've kept an eye on SQL Reporting Services but I've never really tried it. Some of the demos have looked very promising, and a particular project I'm currently involved with could really do with overhauling its reporting mechanism. So I booked myself onto an evening Reporting Services session at Microsoft to learn a bit more. However, before I attend the course I thought I'd better crack open the seal and have a poke around for myself…
 
One of the constant complaints levelled at Reporting Services is that it is difficult to install and, if it goes wrong, it's really difficult to figure out why. Well, I now have some sympathy for that. The installation was pretty easy, but you do have to wear a number of different hats and there isn't a lot of help when you miss part of the configuration. For example, I was happily running through a tutorial and it said, "now select run and deploy", only to be faced with an authentication prompt for a web site. "What site? Ah, I was supposed to have created a site? OK, off to the configuration tool. Big green tick next to the site, so that seems to have been done, hmm. Oh well, press create and see what happens. Ah, no prompts, that seems to have done the trick." Not exactly the nicely task-led process I expect, nay demand, of modern products. So I decided to plough through the tutorials. Nothing really exciting, although the mix of VB-like expressions and SQL syntax is a bit odd; I would've expected you'd select one or the other rather than typing into one box and letting it figure it out, and the code checking wasn't very helpful, wot-no-intellisense. My first problem came with using parameters. I followed the instructions, even used the "copy text" facility, but it just displayed an error when I went to the preview, and the error was the very definition of vague. I poked around and couldn't persuade it to run. So I created a new report with my own settings and used a parameter from the start, and that seemed to work. I copied the same code over and it carried on working, very strange. However, when I ran the report it would just repeat the first row's data for every row returned in the join, WHAT!!! So the join works 'cause I can see that in the "run grid" and the number of rows is correct, but it's not getting the data for each row, very odd. Finally I just junked the whole layout, started again, and it worked fine. So it seems very sensitive to the order you do things in; the design surface seems to get very confused very easily. Good grief, and I'm expecting some business users to write some of these; plus, if I'm having this much hassle with a simple tutorial, what happens when I'm writing something far more complicated? I'll have no idea whether I've written it incorrectly or simply done something in the wrong order. This is certainly a topic I'll want to raise with the good folks at Microsoft. Still, it promises so much that I really want it to work; perhaps it will just be hope vs. reality?
 
 

Cannot add databases on Windows XP or can I?

When I installed SQL Server 2005 on my Windows XP machine everything seemed to work fine. However, whenever I tried to create a new database via the Workbench I'd get an error complaining about COM+ initialization. I tried "everything" but couldn't persuade it to work. Eventually I gave up! However, I recently installed IE7 and discovered that I couldn't type a new URL into the address bar and have IE go to that page. After trawling the web I found someone with the same problem, and the solution was to uninstall the NVidia (motherboard) firewall client. It suddenly struck me that perhaps that was the cause of my SQL Server problems and…yes, it was! SQL Workbench is now working correctly. Bad old NVidia 😉
 

Problem with SqlCommandBuilder.DeriveParameters

SqlCommandBuilder.DeriveParameters is a helper function that returns the parameters used by a stored procedure, a bit like the old ADO Command.Refresh. However, there is a problem with it. Consider the following pair of stored procedures…

Create Proc Bob.MyProcTest(@Name nvarchar(100))
as
select 'hello' + @Name

and…

Create Proc Fred.MyProcTest(@Name nvarchar(100))
as
select 'hello' + @Name

So we've got the same procedure differing only by the owner. When you ask SqlCommandBuilder.DeriveParameters to fetch the parameters it incorrectly returns four parameters rather than two (remember that the return value is counted as a parameter). It's fairly obvious why, but it shouldn't do it and, as I've found to my cost, it can easily cause problems.

[Edit] I've since been informed that if you prefix the stored procedure name with the owner it works fine. It does, and that's certainly better than nothing, but I still believe returning an appended result could never be the correct answer. SqlCommandBuilder.DeriveParameters is there to aid in constructing calling code, so I believe it should always give useful answers: if the parameter sets match it should only return one, and if there is more than one possible set it should raise an exception and ask you to be more specific.