Fixing ASP.NET Membership performance – part 1

Even though it is not the best identity management system in the .NET world, ASP.NET Membership provider is still fairly widely used, especially for systems that have been running for quite long time with a significant amount of users: migrating to a better system like AspNetIdentity does not comes cheap. However, built from early days of ASP.NET mean Membership provider has numerous significant limitations: beside the “architecture” problems, it also has limited performance. Depends on who you ask, the ultimate “maximum” number of customers that ASP.NET membership provider can handle ranges from 30.000 to 750.000. That does not sound great. Today if you start a new project, you should be probably better off with AspNetIdentity or some other solutions, but if your website is using ASP.NET membership provider and there is currently no plan to migrate, then read on.

The one I will be used for this blog post has around 950.000 registered users, and the site is doing great – but that was achieved by some very fine grained performance tuning, and a very high end Azure subscription.

A performance overview

I have been using ASP.NET membership provider for years, but I have never looked into it from performance aspects. (Even though I have done some very nasty digging to their table structure). And now I have the chance, I realize how bad it is.

It’s a fairly common seen in the aspnet_* tables that the indexes have ApplicationId as the first column. It does not take a database master to know it is a very ineffective way to create an index – in most of the cases, you only have on ApplicationId in your website, making those indexes useless when you want to, for example, query by UserId. This is a rookie mistake – a newbie tends to make order of columns in the index as same as they appear in the table, thinking, that that SQL Server will just do magic to exchange the order for the best performance. It’s not how SQL Server – or in general – RDBMS systems work.

It is OK to be a newbie or to misunderstand some concepts. I had the very same misconception once, and learned my lessons. However, it should not be OK for a framework to make that mistake, and never correct it.

That is the beginning of much bigger problems. Because of the ineffective order of columns, the builtin indexes are as almost useless. That makes the queries, which should be very fast, become unnecessarily slow, wasting resources and increasing your site average response time. This is of course bad news. But good news is it’s in database level, so we can change it for the better. It if were in the application level then our chance of doing that is close to none.

Missing indexes

If you use Membership.GetUserNameByEmail on your website a lot, you might notice that it is … slow. It leads to this query:

        SELECT  u.UserName
        FROM    dbo.aspnet_Applications a, dbo.aspnet_Users u, dbo.aspnet_Membership m
        WHERE   LOWER(@ApplicationName) = a.LoweredApplicationName AND
                u.ApplicationId = a.ApplicationId    AND
                u.UserId = m.UserId AND
                LOWER(@Email) = m.LoweredEmail

Let’s just ignore the style for now (INNER JOIN would be a much more popular choice), and look into the what is actually done here. So it joins 3 tables by their keys. The join with aspnet_Applications would be fairly simple, because you usually have just one application. The join between aspnet_Users and aspnet_Membership is also simple, because both of them have index on UserId – clustered on aspnet_Users and non-clustered on aspnet_Membership.

The last one is actually problematic. The clustered index on aspnet_Membership actually looks like this

CREATE CLUSTERED INDEX [aspnet_Membership_index]
    ON [dbo].[aspnet_Membership]([ApplicationId] ASC, [LoweredEmail] ASC);

Uh oh. Even if this contains LoweredEmail, it’s the worst possible kind of index. By using the least distinctive column in the first, it defeats the purpose of the index completely. Every request to get user name by email address will need to perform a full table scan (oops!)

This is a the last thing you want to see in a execution plan, especially with a fairly big table.

It should have been just

CREATE CLUSTERED INDEX [aspnet_Membership_index]
    ON [dbo].[aspnet_Membership]([LoweredEmail] ASC);

which helps SQL Server to use the optimal execution plan

If you look into Azure SQL Database recommendation, it suggest you to create a non clustered index on LoweredEmail. That is not technically incorrect, and it still helps. However, keep in mind that each non clustered index will have to “duplicate” the clustered index, for the purpose of identify the rows, so keeping the useless clustered index actually increases wastes and slows down performance (even just a little, because you have to perform more reads to get the same data). However, if your database is currently performing badly, adding a non clustered index is a much quicker and safer option. The change to clustered index should be done with caution at low traffic time.

Tested the stored procedure on database above, without any additional index

Table 'aspnet_Membership'. Scan count 9, logical reads 20101, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'aspnet_Applications'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'aspnet_Users'. Scan count 0, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 237 ms,  elapsed time = 182 ms.

With new non clustered index


(1 row affected)
Table 'aspnet_Applications'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'aspnet_Users'. Scan count 0, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'aspnet_Membership'. Scan count 1, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 89 ms.

With new clustered index:

(1 row affected)
Table 'aspnet_Applications'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'aspnet_Users'. Scan count 0, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'aspnet_Membership'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row affected)

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 89 ms.

Don’t we have a clear winner?

Speed up catalog routing if you have multiple children under catalog

A normal catalog structure is like this: you have a few high level categories under the catalog, then each high level category has a few lower level categories under it, then each lower level category has their children, so on and so forth until you reach the leaves – catalog entries.

However it is not uncommon that you have multiple children (categories and entries) directly under catalog. Even though that is not something you should do, it happens.

But that is not without drawbacks. You might notice it is slow to route to a product. It might not be visible to naked eyes, but if you use some decent profilers (which I personally recommend dotTrace), it can be fairly obvious that your site is not routing optimally.

Why?

To route to a specific catalog content, for example http://commerceref/en/fashion/mens/mens-shirts/p-39101253/, the default router have to figure out which content is mapped to an url segment. So with default registration where the catalog root is the default routing root, we will start with the catalog which maps to the first part of route (fashion ). How do it figure out which content to route for the next part (mens ) ?

Until recently, what it does it to call GetChildren on the catalog ContentReference . Now you can see the problem. Even with a cached result, that is still too much – GetChildren with a big number of children is definitely expensive.

We noticed this behavior, thanks to Erik Norberg. An improvement have been made in Commerce 12.10 to make sure even with a number of children directly under Catalog, the router should perform adequately efficient.

If you can’t upgrade to 12.10 or later (you should!), then you might have a workaround that improve the performance. By adding your own implementation of HierarchicalCatalogPartialRouter, you can override how you would get the children content – by using a more lightweight method (GetBySegment)

    public class CustomHierarchicalCatalogPartialRouter : HierarchicalCatalogPartialRouter
    {
        private readonly IContentLoader _contentLoader;

        public CustomHierarchicalCatalogPartialRouter(Func<ContentReference> routeStartingPoint, CatalogContentBase commerceRoot, bool enableOutgoingSeoUri) : base(routeStartingPoint, commerceRoot, enableOutgoingSeoUri)
        {
        }

        public CustomHierarchicalCatalogPartialRouter(Func<ContentReference> routeStartingPoint, CatalogContentBase commerceRoot, bool supportSeoUri, IContentLoader contentLoader, IRoutingSegmentLoader routingSegmentLoader, IContentVersionRepository contentVersionRepository, IUrlSegmentRouter urlSegmentRouter, IContentLanguageSettingsHandler contentLanguageSettingsHandler, ServiceAccessor<HttpContextBase> httpContextAccessor) : base(routeStartingPoint, commerceRoot, supportSeoUri, contentLoader, routingSegmentLoader, contentVersionRepository, urlSegmentRouter, contentLanguageSettingsHandler, httpContextAccessor)
        {
            _contentLoader = contentLoader;
        }

        protected override CatalogContentBase FindNextContentInSegmentPair(CatalogContentBase catalogContent, SegmentPair segmentPair, SegmentContext segmentContext, CultureInfo cultureInfo)
        {
            return _contentLoader.GetBySegment(catalogContent.ContentLink, segmentPair.Next, cultureInfo) as CatalogContentBase;
        }
    }

And then instead of using CatalogRouteHelper.MapDefaultHierarchialRouter , you register your router directly

 var referenceConverter = ServiceLocator.Current.GetInstance<ReferenceConverter>();
            var contentLoader = ServiceLocator.Current.GetInstance<IContentLoader>();
            var commerceRootContent = contentLoader.Get<CatalogContentBase>(referenceConverter.GetRootLink());
            routes.RegisterPartialRouter(new HierarchicalCatalogPartialRouter(startingPoint, commerceRootContent, enableOutgoingSeoUri));

(ServiceLocator is just to make it easier to understand the code. You should do this in an IInitializationModule, so use context.Locate.Advanced instead.

This is applicable from 9.2.0 and newer versions.

Moral of the story:

Catalog structure can play a big role when it comes to performance.
You should do profiling whenever you can
We do that too, and we make sure to include improvements in later versions, so keeping your website up to date is a good way to tune performance.

Commerce batching performance – part 2: Loading prices and inventories

UPDATE: When looked into it, I realize that I have a lazy loading collection of entry codes, so each test had to spent time to resolve the entry code(s) from the content links. That actually costs quite a lot of time, and therefore causing the performance tests to return incorrect results. That was corrected and the results are now updated.

In previous post we talked about how loading orders in batch can actually improve your website performance, and we came to a conclusion that 1000-3000 orders per batch probably yields the best performance result.

But orders are not the only thing you would need to load on your website. A more common scenario is to load prices and inventories for product. So If you are displaying a product listing page, it’s quite common to load prices and inventories for all products in that page. How should it be loaded?

Continue reading “Commerce batching performance – part 2: Loading prices and inventories” →

Commerce batching performance – part 1: Loading orders

One of best practices for better performance – not just with Commerce or Episerver Commerce, is to batch your calls to load data. In theory, if you want to load a lot of data, loading by both end will be problematic: if you load each record one by one, the overhead for opening the connection and retrieve data will be too much. But if you load all of them, then it is likely that you will end up with either time out exception in database end, or out of memory exception in your application. The better way is to of course, loading them by smaller batch: either 10, 20, or 50 records at one and repeat until the end.

That is the theory, but is it really better in practice? And if it is, which size of batch works best? As they usually say, reality is the golden test for theory, so let’s do it.

Continue reading “Commerce batching performance – part 1: Loading orders” →

Index or no index, that’s the question

If you do (and you should) care about your Episerver Commerce site performance, you probably know that database access is usually the bottleneck. Allowing SQL Server works smoothly and effectively is a very important key to the great performance.

We are of course, very well aware of this fact, and we have spent a considerable amount of time making sure Commerce database works as fast as we could. Better table schema, better stored procedures, better indexes, … we have done all of that and will continue doing so when we have the chances. (And if you find anything that can be improved, you are very welcome to share your finding with us)

But there are places where the database performance improvement is in your hand.

Continue reading “Index or no index, that’s the question” →

Useful T-SQL snippets for development and troubleshooting

This post is more of a note-to-self. These are the useful T-SQL statements which can be incredibly useful in development and troubleshooting

SET STATISTICS IO ON

Turn on the IO statistics for statements run after that until set to OFF explicitly. We then switch to Messages tab to see how many IO operations were done on each table.

SET STATISTICS TIME ON

Find out about the statements were executed: which statements, its texts, how many reads (logical), how many time was spent on CPU and how many time was spent total

Continue reading “Useful T-SQL snippets for development and troubleshooting” →

Speed up your Catalog incremental indexing

As your products are being constantly updated, you would naturally want them to be properly (and timely) indexed – as that’s crucial to have the search results that would influence your customers into buying stuffs. For example, if you just drop the prices of your products , you would want those products to appear in new price segment as soon as possible.

This should be very easy with Find.Commerce – so if you are using Find (which you should) – stop reading, nothing for you here. Things, however, can be more complicated if you are using the more “traditional” SearchProvider.

Continue reading “Speed up your Catalog incremental indexing” →

Mass update catalog entries

This is something you don’t do daily, but you will probably need one day, so it might come in handy.

Recently we got a question on how to update the code of all entries in the catalog. This is interesting, because even thought you don’t update the codes that often (if at all, as the code is the identity to identify the entries with external system, such as ERPs or PIMs), it raises a question on how to do mass update on catalog entries.

- Update the code directly via database query. It is supposedly the fastest to do such thing. If you have been following my posts closely, you must be familiar with my note regarding how Episerver does not disclose the database schema. I list it here because it’s an option, but not the good one. It easily goes wrong (and cause catastrophes), you have to deal with versions and cache, and those can be hairy to get right. Direct data manipulation should be only used as the last resort when no other option is available.

Continue reading “Mass update catalog entries” →

A curious case of SQL Server function

This time, we will talk about ecfVersion_ListFiltered, again.

This stored procedure was previously the subject of several blog posts regarding SQL Server performance optimizations. When I thought it is perfect (in term of performance), I learned something more.

Recently we received a performance report from a customer asking about an issue after upgrading from Commerce 10.4.2 to Commerce 10.8 (the last version before Commerce 11). The job “Publish Delayed Content Versions” starts to throw timeout exceptions.

This scheduled job calls to a ecfVersion_ListFiltered to load the content versions which are in status DelayedPublish, it looks like this when it reaches SQL Server:

declare @s [udttIdTable]
insert into @s values(6)
exec dbo.ecfVersion_ListFiltered @Statuses = @s, @StartIndex = 0, @MaxRows = 2147483646

This query is known to be slow. The reason is quite obvious – Status contains only 5 or 6 distinct values, so it’s not indexed. SQL Server will have to do a Clustered Index Scan, and if ecfVersion is big enough, it’s inevitably slow.

Continue reading “A curious case of SQL Server function” →

Optimizing T-SQL COUNT

This is a continuation of my previous post about paging in SQL Server. When it comes to paging, you would naturally want to know the total number of rows satisfying, so you can display some nice, useful information to your end-users.

You would think, well, it’s just a count, and a simple query like this would be enough:

SELECT COUNT(Id) FROM MySecretTable

There should be nothing to worry about, right? Actually, there is.

Let’s get back to the example in previous post – we have to count the total number of orders in that big table.

SELECT COUNT(ObjectId) FROM OrderGroup_PurchaseOrder

Because ObjectId is the clustered index of OrderGroup_PurchaseOrder, I did expect it to be use that index and be pretty fast. But does it? To my surprises, no.

Continue reading “Optimizing T-SQL COUNT” →