Catalog,  Commerce,  Episerver,  Performance,  SQL Server

A curious case of SQL execution plan, part 2

Recently I wrote about how to look into, identify and solve the problem with a SQL Server execution plan – as you can read here:

I have some more time to revisit the query now, and I realized I made a “small” mistake. The “optimized” query is using a Clustered Index Scan

So it’s not as fast as it should be, and it will perform quite poorly in no cache scenario (when the buffer is empty, for example) – it takes about 40s to complete. Yes it’s still better than the original one, both in non cached and cached cases. But it’s not good enough. An index scan, even cached, is not only slower, but also more prone to deadlocks. It’s also worse in best case scenario, when the original one can use the proper index seek.

I don’t really know why I let that slipped out of my attention. I blame the late hour when I worked on it. Yes, I usually work more than I should and my boss even “warned” me about that. (In a side note, I must be very lucky to have a boss who cares about your well being and work-life balance. I usually take the excuses of “this is interesting”, and “this is my personal learning” for working outside of working hours)

Time to fix it then!

The first attempt is from one of my colleagues:

So he replaced the INNER JOIN with CROSS APPLY. Looks reasonable. Is it fast? Yes! Is it stable? Uhm – no. It has the sample problem with the original query when it is tested with the big query. SQL Server is too smart to optimize the CROSS APPLY, and when it sees then chance, it swaps the order of join, and … fails. So we took a step into a … different direction, not just forward.

I was wondering why SQL Server use the index scan with the query hint. Can we force it to use the index we want? Yes – we can. My second attempt looks like this:

So we replace the query hint to force the join order with the query hint to force use the clustered index. Is it fast? Yes! Is it stable? Yes! Is it perfect? Uhm, no?

I’ve learned to be skeptical about query hints – they should be used with caution. Especially we are working on a framework, so we have to take that seriously. Performance aside, we need to care about maintainability as well. OPTION (FORCE ORDER)  is quite clear about what we are doing, but WITH (INDEX(IDX_ecfVersionProperty_ContentID)) will probably raise questions about why do we need to use it. And even worse, the index might be renamed (it’s unlikely, but it’s still a possibility). We have tests that would cache the error very soon in development cycle, but still.

Do we have to fix the query just by changing the query?

Let’s look at another angle. SQL Server try to join on MetaFieldId because it sees an index there (and is fooled that is a good index, while it’s not really is). A good index is an index which very good selectivity (the perfect case it’s unique), and is used by (multiple) queries.

What if we remove that index? That is a suggestion from another colleague.

Let’s try to drop that index. And as expected, SQL Server is forced to use the clustered index always. So yes, it’s fast. Yes, it’s stable. But should we go with it?

You would likely hear about adding an index, more than removing an index. But removing an unused index has its benefits. Having an index is not without drawbacks – it adds overheads to insert, delete and update operations, it takes more disk space, so if the index is “useless”, or if its value does not outweigh those, it should be removed.

Unlike the query which can be quickly verified by SQL Server Management Studio (of course to make sure it works well in all scenarios, you will have to test a bit), removing an index needs more work – we have to be sure that it is not used by any other queries. At least it’s less risky knowing this is not a good index, so even if we remove it by mistake and another query is using it, that would not be a big performance hit.

After a careful checking (and with the help of our performance tests), we found that the index was not effectively used by any queries. Checking history, it was not added by any reasons, more than of a “let’s add an index, just to be sure”.

And it’s gone!

Lessons learned from this investigation:

  • Try to test your queries on a big-enough dataset, on multiple scenarios.
  • Data distribution is important
  • Fixing a query can be done by removing an index, not just by adding one.

One Comment

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: