Update: In Episerver CMS 11, released today (November 21st 2017), the simpleaddress router has been moved to the last of the route table.
This is an unusual post – it is not about Commerce – my area of expertise, but about CMS. Recently I’ve been working on some support cases where SQL Server instance is on high utilization, and in some scenarios it eventually slows down the site. After investigation, it’s likely to come from a small, simple and helpful feature: Simple address.
CMS content can have a property named “Simple address”, which allows you to create a “shortcut” url for that content. So if you have one page with “name in url” as “contact-us” under a page name “about-us” under Home page, you can access it via https://mysupercoolsite.com/en/about-us/contact-us. Or you can set the Simple address for that page as “contact-us”, and then you can access it directly via https://mysupercoolsite.com/contact-us.
One of the questions customers usually raise during evaluation of Episerver Commerce is : “Can it support our catalog size? We have (a very big number of ) SKUs. Will it work?”
I’ve seen “big” catalogs. Some very big. Million and more products.
The correct term would be “Entries”, as in a standard implementation your catalog can contain products, variants (of those products, or standalone ones), bundle and package. However for the sake of simplicity, we will refer to them as “products”. So when we talk about 2 million products, it is the 2 million entries. (you can have 100k products, 1.800 variants and 100k packages)
Theoretically, Episerver Commerce can support up to 512 millions 1 billion of entries (!), so you can have pretty much anything in your catalog until you reach a hard technical limit. Just for comparison, Amazon.com, which is arguably the biggest eCommerce site on the world, has about 500 millions SKU(s) in 2015. But the number of entries is not everything. There are several factors which determine your catalog “size”. the number of entries is an important factor, but there are several other factors as well.
Recently we were tasked to help a customer having a problem with a query. This specific query ate a lot of CPU resources (30-40%) and causing performance problem for other queries – as it slows the entire SQL Server instance down.
Upon investigation, we discovered that the query was accessing a table with an outdated index. The index was supposedly updated in Episerver Commerce 7.10.3, which was released almost 3 years ago.
For some reasons, the index was not updated in customer’s table. Instead of just having to do a index seek, SQL Server was forced to do a full table scan, which is much slower, causing the problem.
If you want to go into details, it’s mdpsp_getchildrenbysegment stored procedure, which looks into UriSegment column of CatalogItemSeo table, previously, the index was like this:
CREATE NONCLUSTERED INDEX [IX_CatalogItemSeo_UniqueSegment_CatalogEntry] ON [dbo].[CatalogItemSeo]
You can see the problem: The order of the index was bad – because ApplicationId was not distinctive (in fact, in most of the cases it’s the same for every row), and because UriSegment was not the first column in the index, this index will not be used if a query uses UriSegment only. (more…)
I supposed this is a well known feature, but I was asked more than once about it, so it’s better to write something here to clarify the confusions.
If you have some very, very big catalogs, you probably have seen this “notification” in Catalog UI
By default, the Catalog UI groups a product and its variations in a parent-children view (they are not exactly parent-children, by the way). However, to do that, it needs to know about all the entries in that specific category. If it’s a small category, it should be no problem, but if it’s big one, then it’s inevitable slow. The lazy loading which the catalog content list only loads the contents when you scroll to them is not helping in this matter. Moreover, the grouping introduces an overhead for the UI, and having too many groups can severely affect the performance. Trust me, you won’t like a sluggish UI.
This improvement was introduced way back – 7.11 if I remember correctly – thanks to my colleague Magnus Rahl. To this day it’s still valuable – the performance was improved – but not that much to remove the threshold completely (And the improvement to the catalog versioning in Commerce 9 should have nothing to do with this).
When you see this notification, and if you’re unhappy with it, you have two (primary) options: Either to sub-categorize your category – i.e. introduce sub categories so each will have a smaller number of entries. Or increase the value of threshold.
Each approach has its own disadvantages. Sub-categorizing might break your SEO, while the second approach will undoubtedly effect the UI performance. Your call!
Now – the tricky part – which number to configure in SimplifiedCatalogListingThreshold setting. Obviously, it must be greater than the biggest number of entries in a category. But how to obtain that number? I’ve seen the confusion to raise that value to 3000, 5000, or even 10000 and it’s still not working. No, you can’t guess, you have to know for sure.
One simple option is to look at Commerce Manager Catalog Management. There is a small text in right corner of the list which shows the number of entries in that category (No, it’s not available in the Catalog UI, but I assume it would be helpful?)
The nuke option is to look at the database. Usually we recommend to avoid manipulate the database directly, as it can be dangerous – but here is a little code which only queries data (so practical harmless)
SELECT CatalogNodeId, Count(CatalogEntryId)
GROUP BY CatalogNodeId
ORDER BY Count(CatalogEntryId) DESC
Now you know the biggest number of the entries in a category – just change the threshold in setting. Try it and see if the UI Performance is acceptable to you.
Why LOH? It’s a special heap contains the memory objects which are more than 85000 bytes in size – which, previously, never compacted (that was changed with .NET 4.5 when you have an option to compact LOH, but beware of the consequences).
If you know about the generation garbage collection in CLR, you already know that when an object is no longer used, its memory will be claimed back for later use. GC do more than that, by trying to compact the “free” memory – so it’ll move (via copy and delete) the survived objects next to each other, therefore you’ll be a big, continuous free memory. This helps improving performance in upcoming allocations, but only until a point. If the objects are too big, then the costs of moving the objects around might outweigh the benefits. Microsoft did a bunch of performance test and they decided that 85000 bytes is the break point, where the costs are bigger than the performance gain. Then we have LOH, where the memory is almost never compacted.
This is the second part of the series on how can you improve the performance of Episerver Commerce site – or more precisely, to avoid the deadlocks and 100% CPU usage. This is not Commerce specific actually, and you can apply the knowledge and techniques here for a normal CMS site as well.
It’s a common and well-known best practice to store the slow-to-retrieve data in cache. These days memory is cheap – not free – but cheap. Yet it is still much faster than the fastest PCIe SSD in the market (if your site is running on traditional HDD, it’s not even close). And having objects in cache means you won’t have to open the connection to SQL Server, wait for it to read the data and send back to you – which all cost time. And if the object you need is a complex one, for example a Catalog content, you will also save the time needed to construct the object. Even if it’s fast, it is still not instantaneous, and it will cost you both memory and CPU cycles. All in all – caching is the right way to go. But how to get it right?
One common mistake for to have no lock when you load the data for the first time and insert it into cache.
This is a first part of a long series (which have no planned number of parts) as the lessons I learned during trouble shouting customers’ performance problems. I’m quite of addicted to the support cases reported by customers, especially the ones with performance problems. Every time I jump into such support case, I’ll be with less hairs, but I also learn some new things: Implementations are different from cases to cases, but there are some common mistakes which will hurt your website performance. This series will try to point out those mistakes so you get your performance gain, for (almost) free:
Mistake 1: Loading to much content
It’s easy to load contents, especially with the new content APIs. Given an universal ContentReference, you can load a content with a simple line of code. By default, the loaded content is cached, so you might think it’s cheap, or even free to load a content. Think again.
Recently I worked on two support cases from our customers as they see SQL Server errors, such as “System.Data.SqlClient.SqlException (0x80131904): The INSERT statement conflicted with the FOREIGN KEY constraint “FK_ShipmentEx_Shipment”. The conflict occurred in database “dbCommerce”, table “dbo.Shipment”, column ‘ShipmentId’“, or “System.Data.SqlClient.SqlException (0x80131904): The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.”
These errors happened randomly, during the high load times – it seems to be affected by the concurrency level.
What was wrong? and why?
It took me a good amount of time, and good amount of hairs, too. The actual error is another one, and the one above is just the “by product”.
The cart system in Episerver Commerce suffers from a design flaw: it shares (almost) everything with the purchase orders. ShoppingCart is just another metaclass extended OrderGroup, so it’ll use the same OrderGroup, OrderForm, Shipment, LineItem and OrderAddress tables in the database, like PurchaseOrder and PaymentPlan. At first, it seems to be reasonable approach. But when you have hundreds, or thousands of customers visiting your website (and you would be happy to see that ;)) – problems start to appear.