The art of paging

No, this is not really about “art” – I just wanted a more clickbaity title. It’s about understanding what you have at your disposal and using it to your advantage – in this case, how a newer SQL statement can drastically improve your performance.

In this blog post we will look into the paging feature of SQL Server. In Commerce we usually work with large sets of data – millions of rows are fairly common – and it’s natural to load data by page. There is no point loading thousands, or even millions, of rows in one go. First, it’s not practical to display all of them. Second, you’ll likely end up with a timeout exception and/or an out of memory exception. Even if you are lucky enough to get through, such a query can still bring your SQL Server instance to its knees, and transferring that much data over the network will be another bottleneck for your system. So my friends, the best practice for loading data is to do it in batches, not everything at once.

How many items to load in one batch is open to question; there is no definitive number that is better than the others, so you might want to experiment a bit – 10, 20, 50, 100 can all work depending on the type of data you are loading. Note that opening a connection to SQL Server is not free: even if the connection is reused from the pool, there is always an overhead. If you want the optimal value, try different settings and see how your system behaves.

Let’s go back to the main topic of this post. Several months ago we received this report:  . So basically it suggests that we disable the initial search in the order search screen in Commerce Manager. Yes, we could do that, but that would just hide the problem. Can we do better by fixing the actual problem – the slow SP that needs fixing? If we have a good enough SP, then an extra initial search is not a problem.

As a side note – this is why we highly appreciate performance reports from you. We try to do our best to optimize performance, but we 1. don’t know everything, and 2. don’t have the data sets that expose the problems. Having the problem reported allows us to look into it and probably make improvements which benefit not only you, but other customers as well.

The reported SP is ecf_Search_PurchaseOrder, but the actual one behind the slowness is ecf_OrderSearch. You can of course look at it – the SPs are “open source” in the sense that you can easily view what’s in them – but mind you, that specific SP is probably the most complex one we have, with a lot of string concatenation. This is to serve the extremely powerful and flexible order search APIs. For the purpose of this blog post, though, we don’t need to go into detail on all of that. The stored procedure call in the report above:

declare @p8 int
set @p8=476
exec ecf_Search_PurchaseOrder @SQLClause=N'(1=1)',@MetaSQLClause=N'',@OrderBy=N'OrderGroupId DESC',@Namespace=N'Mediachase.Commerce.Orders',@Classes=N'PurchaseOrder',@StartingRec=0,@NumRecords=10,@RecordCount=@p8 output
select @p8

ultimately leads to this call:

declare @Page_temp 
table (TotalRecords int, OrderGroupId int)
;with OrderedResults AS (
SELECT count([OrderGroup].OrderGroupId) OVER() TotalRecords, 
[OrderGroup].OrderGroupId, 
ROW_NUMBER() OVER(ORDER BY [OrderGroup].OrderGroupId DESC) RowNumber 
FROM [OrderGroup] OrderGroup 
INNER JOIN OrderGroup_PurchaseOrder META 
ON OrderGroup.[OrderGroupId] = META.ObjectId  
WHERE 1 = 1 AND ((1=1))) 
INSERT INTO @Page_temp (TotalRecords, OrderGroupId) 
SELECT top(10) TotalRecords, OrderGroupId FROM OrderedResults WHERE RowNumber > 0;
SELECT OrderGroupId from @Page_temp;

If you’ve written T-SQL before, you probably recognize the pattern – the title of this post has “paging” in it for a reason. This code tries to get the 10 latest OrderGroupId values, along with the total number of OrderGroupId rows. To support the paging, it uses a CTE with the ROW_NUMBER() OVER pattern. However, because you can only access a CTE once, you have to insert the results into a temporary table and query from it again.

It’s quite a simple query (well, not that simple, but if you know the pattern it’s easy enough to understand). But for SQL, the most important thing is performance (correctness is, of course, non-negotiable). Does the query have the performance we need?

If you have a few dozen thousand orders, it runs pretty fast, and everything is “good enough”. But what if you have a few million orders? The particular database I have contains around 5.2M orders. Can you guess how long that query runs on my machine?

10 – 20 minutes, or even more, depending on how warm the cache is. Yes, the database is on my slow HDD, but that’s because of this enormous amount of I/O required:

Table '#A48F835D'. Scan count 0, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 3, logical reads 15786703, physical reads 12575, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'OrderGroup'. Scan count 0, logical reads 27667862, physical reads 23115, read-ahead reads 352939, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'OrderGroup_PurchaseOrder'. Scan count 1, logical reads 392365, physical reads 3, read-ahead reads 392196, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The execution plan also shows clearly what is wrong: when you see a scan over this number of rows, it’s almost always bad.

It also explains why the first view is so slow – the initial load runs without parameters, so all of the rows must be examined, which makes it even slower. When you search with filters, the query has to check fewer rows and performance improves.

Now that we know what is wrong, let’s fix it, shall we? As usual, my favorite method of tackling a problem is to start small, fixing the things that are “obvious”. If those improvements are not enough, then we can take a step back and look at the problem from a different angle.

At first it looks like the temp table and the insert can be removed. The reason a temp table was used is that we need both the OrderGroupId values and the total count. However, we can get rid of the temp table by querying both directly from the CTE. That would make the result less “pretty”, but that’s not really a concern – after all, the result will be processed by the SP which calls ecf_OrderSearch, and with a bit of convention we can make that work. But reading carefully, we can see that it actually inserts only 10 rows, so the impact is too small to make any difference.
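For illustration, that single-pass version could look roughly like this – a sketch based on the query above, not the actual SP code:

```sql
-- Sketch: return both the total count and the page in one pass,
-- skipping the temp table. Columns follow the original query.
;WITH OrderedResults AS (
    SELECT COUNT([OrderGroup].OrderGroupId) OVER() TotalRecords,
           [OrderGroup].OrderGroupId,
           ROW_NUMBER() OVER(ORDER BY [OrderGroup].OrderGroupId DESC) RowNumber
    FROM [OrderGroup] OrderGroup
    INNER JOIN OrderGroup_PurchaseOrder META
        ON OrderGroup.[OrderGroupId] = META.ObjectId
)
SELECT TOP(10) TotalRecords, OrderGroupId
FROM OrderedResults
WHERE RowNumber > 0;
```

The caller then reads TotalRecords from any returned row instead of a separate result set.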

Let’s take a step back and think about it. If we didn’t need paging, this query would run very fast:

SELECT TOP(10) OrderGroupId FROM [OrderGroup] OrderGroup 
INNER JOIN OrderGroup_PurchaseOrder META 
ON OrderGroup.[OrderGroupId] = META.ObjectId

It takes less than 1s, with very few reads:

Table 'OrderGroup'. Scan count 0, logical reads 45, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'OrderGroup_PurchaseOrder'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

We don’t even need to sort, because OrderGroupId is already the clustered index. So what makes the original query so slow? Apparently it’s not the CTE, because we can update the query above to use a CTE and it will execute more or less the same. The culprit here is clear: ROW_NUMBER.
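To see that the CTE itself is harmless, you can wrap the simplified query in one – a quick sketch:

```sql
-- Same simplified query, just wrapped in a CTE: the I/O stays
-- essentially the same as the plain TOP(10) version above.
;WITH Results AS (
    SELECT OrderGroup.OrderGroupId
    FROM [OrderGroup] OrderGroup
    INNER JOIN OrderGroup_PurchaseOrder META
        ON OrderGroup.[OrderGroupId] = META.ObjectId
)
SELECT TOP(10) OrderGroupId FROM Results;
```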

In order to do the paging, SQL Server has to walk through every row and assign a row number to it. That’s 5.2M rows. If we want to get faster, we need to get rid of that.

Luckily for us, Commerce 11 requires at least SQL Server 2012, and that version allows us to use the “new” OFFSET ... ROWS FETCH NEXT ... ROWS ONLY syntax.

The original query can be rewritten like this:

SELECT OrderGroupId FROM dbo.OrderGroup INNER JOIN OrderGroup_PurchaseOrder META ON OrderGroup.[OrderGroupId] = META.ObjectId  
WHERE 1=1  
ORDER BY OrderGroupId DESC 
OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY

It’s not only easier to write and easier to understand – it’s also much faster. The I/O operations are the same as for the “simplified query” above, and the execution plan is perfect:

The only thing we can’t really optimize is the count of the total number of rows. To do that, we still need an index scan, and that can’t be lightning fast. On my machine it takes about 10s on a cold cache, and ~1s on a warm cache. Still, that is much better than the previous approach.
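That count becomes a separate query along these lines – a sketch; the actual SP builds its SQL dynamically:

```sql
-- Sketch: the total count still requires scanning the index once,
-- which is why it can't be lightning fast on 5.2M rows.
SELECT COUNT(OrderGroup.OrderGroupId)
FROM [OrderGroup] OrderGroup
INNER JOIN OrderGroup_PurchaseOrder META
    ON OrderGroup.[OrderGroupId] = META.ObjectId
WHERE 1 = 1;
```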

In the end, we rewrote ecf_OrderSearch to use the new syntax, and its performance was like night and day. We did have to take care of the case when the page size is 0 (when you only want the total number of orders, without fetching any actual orders), because T-SQL complains that FETCH NEXT can’t take 0 ROWS. But that was just some simple conditional checking.
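The guard could look something like this – a sketch reusing the parameter names from the ecf_Search_PurchaseOrder call shown earlier, not the actual SP code:

```sql
-- Sketch: FETCH NEXT 0 ROWS is invalid, so branch when only the
-- total count is requested.
IF @NumRecords = 0
BEGIN
    -- Count only; no rows are fetched.
    SELECT COUNT(OrderGroup.OrderGroupId)
    FROM [OrderGroup] OrderGroup
    INNER JOIN OrderGroup_PurchaseOrder META
        ON OrderGroup.[OrderGroupId] = META.ObjectId;
END
ELSE
BEGIN
    SELECT OrderGroupId
    FROM [OrderGroup] OrderGroup
    INNER JOIN OrderGroup_PurchaseOrder META
        ON OrderGroup.[OrderGroupId] = META.ObjectId
    ORDER BY OrderGroupId DESC
    OFFSET @StartingRec ROWS FETCH NEXT @NumRecords ROWS ONLY;
END
```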

Morals of the story:

  • Use the latest version. We go through these struggles so you don’t have to. Just enjoy the performance gains, for (almost) free.
  • ROW_NUMBER is so 2010. OFFSET ... ROWS FETCH NEXT ... ROWS ONLY is the cool new thing.
  • Make sure your development and production environments have at least SQL Server 2012 installed 😉

