Potential performance issue with Maxmind.db

From time to time, I have to dig into some customers’ profiler traces to figure out why their site is slow (yeah, if you follow me, you’d know that’s kind of my main job). There are multiple issues that can eat your website performance for breakfast, from loading too much content, to unmaintained database indexes. While my blog does not cover everything, I think you can get a good grasp of what mistakes to avoid.

But sometimes the problem might come from a third-party library or framework. That's not new – we have seen it before with A curious case of memory dump diagnostic: How Stackify can cause troubles to your site – Quan Mai's blog (vimvq1987.com). The problem with these types of issues is that they are usually overlooked.

The library we'll be investigating today is Maxmind.db. To be honest, I've never used it myself, but it seems to be a very popular choice for geolocating visitors. It's usually used by Optimizely sites for that purpose, via visitor groups (which is why it came onto my radar).

For several sites that use it, requests seem, more often than not, to get stuck in this stack:

It's easy to suspect that CreateActivator is doing something heavy here (evidently the LambdaCompiler.Compile part), and a peek at the decompiled code confirms that yes, it is heavy. I'm not quite sure I can post the decompiled code here without violating any agreement (I did, in fact, accept no agreement at this point), but it's quite straightforward: TypeActivatorCreator uses reflection to get the constructors of the Type passed to it, sees if any constructor is decorated with the MaxMind.Db.Constructor attribute, prepares some parameters, and creates a LambdaExpression that creates an instance of that Type using the found constructor (which is a good thing, because a compiled expression performs much better than a plain reflection call).

(I'm using Maxmind.db 2.0.0, for the record)

The code is straightforward, but it is also slow – as any code that involves reflection and lambda compilation would be. The essential step is to cache the result. This is actually a very good place for a cache: the number of types is fixed during runtime (except for very edge cases where you dynamically create new types), so you won't have to worry about cache invalidation. A cache would significantly improve the performance of the code above.
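
As a generic illustration of that pattern (my own sketch, not MaxMind's code): compiling a constructor call into a delegate is slow, but the resulting delegate is fast, so you compile once per type and reuse the delegate for every subsequent instantiation.

using System;
using System.Linq.Expressions;

public static class CompiledActivatorSketch
{
    // Build a compiled "new T()" delegate for a type. The Compile() call is the
    // expensive part; invoking the returned delegate afterwards is nearly as fast
    // as calling the constructor directly.
    public static Func<object> CreateFactory(Type type)
    {
        var ctor = type.GetConstructor(Type.EmptyTypes)
            ?? throw new InvalidOperationException($"{type} has no parameterless constructor.");

        var body = Expression.Convert(Expression.New(ctor), typeof(object));
        return Expression.Lambda<Func<object>>(body).Compile();
    }
}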

And TypeActivatorCreator does have a cache. It is a simple ConcurrentDictionary<Type, TypeActivator>, which returns a TypeActivator if the Type was requested before, or creates one and caches it if it hasn't been. As I said, this is a very good place for a cache.

So there is a cache, which is good. However, the very important tidbit here is that the dictionary is not static. That means the cache only works if the class is registered as a singleton (by itself, or via another class down the dependency chain) – in other words, if only one instance is created and shared between threads (which is why the ConcurrentDictionary part is important).

Except it's not.

When I looked at a memory dump collected for a customer using Maxmind.db, this is what I got:

0:000> !dumpheap -stat -type TypeAcivatorCreator
Statistics:
MT Count TotalSize Class Name
00007ffa920f67e0 1 24 MaxMind.Db.TypeAcivatorCreator+<>c
00007ffa920f6500 147 3528 MaxMind.Db.TypeAcivatorCreator
Total 148 objects

So there were 147 instances of TypeActivatorCreator. Note that this is only the number of instances still alive; there might have been other instances that were already garbage collected by the CLR.

Now it's clear why it has been performing badly. Essentially every request creates a new instance of TypeActivatorCreator, whose internal cache is of course empty (it was just created). Each request therefore goes through the expensive CreateActivator path, and performance suffers.

The obvious fix here is to make the dictionary static, or to make the TypeActivatorCreator class a singleton. I don't have the full source code of Maxmind.Db to determine which is better, but I'd lean toward the former.
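
To illustrate the former, here is roughly what such a fix could look like – my own sketch based on the decompiled structure described above, not MaxMind's actual code:

using System;
using System.Collections.Concurrent;

internal class TypeActivator
{
    // stand-in for MaxMind's TypeActivator, which wraps the compiled factory delegate
}

internal class TypeActivatorCreator
{
    // static: one cache per process, shared by every TypeActivatorCreator instance,
    // no matter how (or how often) the class is constructed by the DI container
    private static readonly ConcurrentDictionary<Type, TypeActivator> _typeActivators =
        new ConcurrentDictionary<Type, TypeActivator>();

    internal TypeActivator GetActivator(Type type) =>
        _typeActivators.GetOrAdd(type, CreateActivator);

    private static TypeActivator CreateActivator(Type type)
    {
        // the expensive reflection + LambdaCompiler.Compile work described above stays here,
        // but now runs at most once per type for the lifetime of the process
        throw new NotImplementedException();
    }
}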

Moral of the story:

  • Caching is very, very important, especially when you are dealing with reflection and lambda compilation
  • You can get it 99% right, but the remaining 1% can still destroy performance.

Update:

I reached out to Maxmind.db regarding this issue on November 9th, 2021

About six hours later, they replied with this:

I was at first confused, then somewhat disappointed. It is a small fix that would improve overall performance for everyone, rather than relying on (or expecting) customers to do what the documentation says. But well, let's just say we have different opinions.

Announcing Pro Optimizely Commerce Cloud

Yes, you guessed it right: it's a(nother) book.

It has been 5 years since I started Pro Episerver Commerce back in early 2016. The book was a success – not as big as I hoped for, but definitely bigger than I expected. Tackling a niche market, it was fairly popular within the community, and it gave me a lot of happiness (and some pocket change) to see that it helped many developers understand and use the framework – which I helped create, and love – better.

So much has changed in the last 5 years.

I had my first kid, and then a second one. I left the Commerce development team to work on my own, and later got a small team. Episerver bought Optimizely, then rebranded.

And so much more has happened with Episerver Commerce, more than just being renamed to Optimizely Commerce Cloud.

It deserves a new book!

To celebrate my 10th anniversary with Episerver (now Optimizely), I am proud and excited to announce the second edition of Pro Episerver Commerce – Pro Optimizely Commerce Cloud. Most of the content written in Pro Episerver Commerce is still very much applicable, but I feel there is a need to refocus and expand on the important parts.

You can register your interest today at https://leanpub.com/prooptimizelycommercecloud

Purchasers of Pro Episerver Commerce – even if you obtained the free version – will receive a 40% discount code for the new book – so don’t miss it.

I will see you there!

Commerce relation(ship), a story

There are two big types of relations in Episerver (Optimizely) B2C Commerce: relations between entries and nodes, and relations between nodes. As you will most likely have to touch one or the other, or both, sooner or later, this post aims to give you some understanding of how they are structured and how they work.

Node-Entry relation

When you add a product (or variant, or package, or bundle) to a category, you are creating a NodeEntryRelation. There are two types of NodeEntryRelation:

  • Primary NodeEntryRelation, which means the category is counted as the true parent of the entry. Each entry can have at most one primary NodeEntryRelation (which means it can also have none at all).
  • Secondary NodeEntryRelation, which means the entry is linked to the category. You do that when it makes sense for the product to be in additional categories. For example, a GoPro can be in Camera (its primary category), but can also be in Sport Gears (linked). An entry can have zero or many secondary NodeEntryRelations.

The concept of a primary NodeEntryRelation was added in Commerce 11. Before that, it was a bit more of a guessing game – the "main" category was determined by sort order: the relation with the lowest sort order was considered the "main relation". That introduced some inconsistency, which prompted the rework.

What is the main difference between those two? For one thing, access rights. In Commerce, you can set access rights down to categories, but not to entries. Entries inherit access rights from their true parents (i.e. their primary nodes). An entry without a primary node-entry relation is considered a direct child of the catalog and inherits its access right settings.

Another, smaller difference is that if you delete a category, its true child entries are deleted as well, while linked entries are only detached from that category.

NodeEntryRelation can be managed fully through IRelationRepository, using the NodeEntryRelation type, and there are a few extension methods that make it easier – for example EntryContentBase.GetCategories().
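
To make that a bit more concrete, here is a minimal sketch of reading and creating node-entry relations. The Parent/Child/SortOrder properties and the UpdateRelations call are how I remember the Relation API – verify against the Commerce version you are on.

using System.Linq;
using EPiServer;
using EPiServer.Commerce.Catalog.ContentTypes;
using EPiServer.Commerce.Catalog.Linking;
using EPiServer.Core;

public class NodeEntryRelationSample
{
    private readonly IRelationRepository _relationRepository;
    private readonly IContentLoader _contentLoader;

    public NodeEntryRelationSample(IRelationRepository relationRepository, IContentLoader contentLoader)
    {
        _relationRepository = relationRepository;
        _contentLoader = contentLoader;
    }

    // List the categories (node links) an entry belongs to, via the extension method mentioned above.
    public ContentReference[] GetCategoryLinks(ContentReference entryLink)
    {
        var entry = _contentLoader.Get<EntryContentBase>(entryLink);
        return entry.GetCategories().ToArray();
    }

    // Link an entry to an additional category by creating a (secondary) node-entry relation.
    public void LinkToCategory(ContentReference entryLink, ContentReference categoryLink)
    {
        var relation = new NodeEntryRelation
        {
            Parent = categoryLink, // the node
            Child = entryLink,     // the entry
            SortOrder = 100
        };
        _relationRepository.UpdateRelations(new[] { relation });
    }
}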

Here is how your actions in the Catalog UI are reflected at the data level:

  • When you create a new entry (product/SKU/etc.) in a category, you create a primary node-entry relation for it
  • When you move (cut then paste) an entry to a new category, you create a new primary node-entry relation. If the entry already had a primary node-entry relation, the new one takes over.
  • When you link/detach an entry to/from a category, you create/remove a non-primary node-entry relation

Node-Node Relation

Like the Node-Entry relation, a node can be a true parent of another node, or just a "linked" parent.

Unlike the Node-Entry relation, the Node-Node relation is split across two places:

  • Linked nodes are represented by NodeRelation(s). (It might be interesting to know that NodeRelation is the parent class of NodeEntryRelation – the interpretation is that a NodeRelation is a relation of a node, which can be with another node, or with an entry.)
  • There is no primary NodeRelation; the true parent node is identified by a property on the category itself. When you have a NodeContent, its ParentLink property points to the true parent.

For that reason, a node always has a true parent – either a catalog, or another node. You can't use IRelationRepository (and therefore the Service API) to manage (delete or update) the true parent of a node; instead, you would have to either:

  • Set its ParentLink to something else
  • Use IContentRepository.Move to move it to a new parent.

Note that this is a limitation of the Relations API in the Service API. You can technically change the ParentLink of a node and update it via POST – it's just more work and not as intuitive as the Relations API.
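
For completeness, a minimal sketch of the second option (the AccessLevel arguments simply state which access checks to perform – adjust them to your needs):

using EPiServer;
using EPiServer.Core;
using EPiServer.Security;

public class MoveNodeSample
{
    private readonly IContentRepository _contentRepository;

    public MoveNodeSample(IContentRepository contentRepository)
    {
        _contentRepository = contentRepository;
    }

    // Re-parent a category by moving it under a new node (or the catalog).
    // This changes the node's true parent (its ParentLink), which the Relations API cannot do.
    public void ChangeTrueParent(ContentReference nodeLink, ContentReference newParentLink)
    {
        _contentRepository.Move(nodeLink, newParentLink, AccessLevel.NoAccess, AccessLevel.NoAccess);
    }
}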

Why the disparity, you might ask? Well, a lot of design decisions in Commerce come from historical reasons, and after that, constrained resources (time/manpower) and priorities. While the disparity is probably not what you would want, it still works fairly well, and if you understand the quirk, all is well.

Get contact by email address

If you are using Episerver Commerce (or should I say, Optimizely B2C Commerce), you will, at some point, need to get a contact by email address. That sounds like an easy enough task, until you realize that CustomerContext, the class for managing customer contacts, has no such method. You will need to find another way, and this is one way you can do it:

CustomerContact contact = customerContext.GetContacts().Where(m => m.Email == email).FirstOrDefault();

Of course this is not optimal – avoid it if you can. First of all, it loads a lot of contacts just to find one. Also, while it looks like you are getting all contacts (which is of course something to avoid in itself), you only get the first 1000 contacts by default, so the code above can return an inaccurate result.

Is there a better way?

Yes of course.

Contacts are built on "Business Foundation" – think of it as an ORM with extensions. Business Foundation allows great flexibility, with a few caveats. This is how you can find a contact by email address:

        private CustomerContact FindContactByEmail(string email)
        {
            try
            {
                // Business Foundation query: filter the Contact entity on its Email field
                var filterEl = new FilterElement("Email", FilterElementType.Equal, email);
                return BusinessManager.List(ContactEntity.ClassName, new[] { filterEl })
                    .OfType<CustomerContact>()
                    .FirstOrDefault();
            }
            catch (ObjectNotFoundException)
            {
                // Safeguard: no contact with that email address exists
                return null;
            }
        }

BusinessManager does not have a Get method, so we use List instead. Note that Email is supposed to be unique, so this should not have any performance downside (but see the note below).

This is not limited to email, or to contacts. You can find contacts by other properties, or get an Organization with the same technique.
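
For example, here is a sketch of looking up an organization by name with the same API. The Business Foundation class name "Organization" and the "Name" field are assumptions on my part – check your metadata if they differ.

        private Organization FindOrganizationByName(string name)
        {
            try
            {
                // "Organization" is assumed to be the Business Foundation class name here
                var filterEl = new FilterElement("Name", FilterElementType.Equal, name);
                return BusinessManager.List("Organization", new[] { filterEl })
                    .OfType<Organization>()
                    .FirstOrDefault();
            }
            catch (ObjectNotFoundException)
            {
                return null;
            }
        }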

It is very important to note, however, that this needs a proper index on the Contact tables to make sure you are not killing your database.

Don’t share HttpContext between threads

HttpContext is not designed to be thread-safe, which means you can run into nasty things if you share an HttpContext between threads. For example, this is one of the things that can happen: one (or more) threads get stuck in this:

mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].FindEntry(System.__Canon)+ed
EPiServer.Web.Routing.Segments.Internal.RequestSegmentContext.GetOrResolveContextMode(System.Web.HttpContextBase)+9b
EPiServer.DataAbstraction.Internal.ProjectResolver.IsInEditMode()+17
EPiServer.DataAbstraction.Internal.ProjectResolver.GetCurrentProjects()+17
EPiServer.Core.Internal.ProjectPipeline.Pipe(EPiServer.Core.ContentReference, System.Globalization.CultureInfo, EPiServer.Core.LoaderOptions)+1c
EPiServer.Core.Internal.ProviderPipelineImplementation.GetItem(EPiServer.Core.ContentProvider, EPiServer.Core.ContentReference, EPiServer.Core.LoaderOptions)+109
EPiServer.Core.Internal.DefaultContentLoader.TryGet[[System.__Canon, mscorlib]](EPiServer.Core.ContentReference, EPiServer.Core.LoaderOptions, System.__Canon ByRef)+11e
[[StubHelperFrame]]
EPiServer.Core.Internal.DefaultContentLoader.Get[[System.__Canon, mscorlib]](EPiServer.Core.ContentReference, EPiServer.Core.LoaderOptions)+63

If you don't know why this is scary: this is an infinite loop, meaning the CPU spends 100% of its time in that Dictionary.FindEntry call, unable to do anything else. The only way to solve the problem is to restart the instance.

That is caused by unsafe access to a Dictionary – if one thread is enumerating it while another thread is writing to it, it is possible to end up in a dead end like this.

And HttpContext just happens to have many Dictionary properties and sub-properties. HttpContext.Request.RequestContext.RouteData.DataTokens is one of them (and the reason the code above ended in disaster), making it vulnerable to this kind of problem. Which is exactly why it is not recommended to share an HttpContext between threads.

Sometimes, you can just set HttpContext.Current to null. Sometimes, you need to take a step back and ask yourself: do you really need to run things in parallel?
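
If you do need to run work in the background, a safer pattern is to copy the values you need out of the HttpContext while you are still on the request thread, and pass those plain values along. A minimal sketch (the method names are made up for illustration):

using System.Threading.Tasks;
using System.Web;

public static class BackgroundWorkSample
{
    // Copy what you need from HttpContext on the request thread,
    // then hand plain values to the background work - never the HttpContext itself.
    public static Task ProcessInBackground(HttpContext httpContext)
    {
        var rawUrl = httpContext.Request.RawUrl;
        var userAgent = httpContext.Request.UserAgent;

        return Task.Run(() => DoWork(rawUrl, userAgent));
    }

    private static void DoWork(string rawUrl, string userAgent)
    {
        // ... the actual processing, with no dependency on HttpContext ...
    }
}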

Get exported Personalization catalog feeds

There are cases where you want to get your Personalization catalog feeds in zip format – maybe to make sure the customizations you have done are there (and are exactly what you want), or because you need to send them to the developer support service for further assistance (for example, to figure out why your catalog feeds are not properly imported). Theoretically you can log in as an admin and go to https://<yoursite>/episerverapi/downloadcatalogfeed to download the feed. But there are a few problems with that.

First of all, you might have more than one catalog, and the link above only lets you download the latest one. Secondly, it might fail to give you any catalog feed at all (with a "There is no product feed available. Try run the scheduled job first." error, even if you already ran the Export Catalog Feed job – there is currently no known fix for that). Is there a way to simply get the data?

Yes, there is. It's not fancy, but it works. All your catalog feeds are put in the appdata\blobs\d4a76096689649908bce5881979b7c1a folder, so just go there and grab the latest ones. appdata is the path defined in your <episerver.framework>\<appData> section.

What if you are running on Azure? It is the same “folder”. Use Azure Storage Explorer to locate the blob. If you are running on DXP, get in touch with developer support service and they’d be happy to help.

I originally planned to write a small tool to easily download the catalog feeds, but it turned out the Episerver Blob APIs have no way to list the contents of a container, so a simple manual approach is better this time.

Don’t let order search kill your site

Episerver Commerce order search is a powerful feature. My colleague Shannon Gray wrote about it long ago at https://world.episerver.com/blogs/Shannon-Gray/Dates/2012/12/EPiServer-Commerce-Order-Search-Made-Easy/ , and so did I at https://world.episerver.com/blogs/Quan-Mai/Dates/2014/10/Order-searchmade-easy/

But because of its power and flexibility, it can be complicated to get right. People usually stop at making the query work. Performance is usually an afterthought, as it only becomes visible in the production environment, when there are enough requests to bring your database to its knees.

Let me be very clear about it: during my years helping customers with performance issues (and you can guess, that is a lot of customers), order search is one of the most common causes of database spikes, if not the most common.

Trust me, you never want your database to look like this:

As your commerce database is brought to its knees, your entire website performance suffers. Your response time suffers. Your visitors are unhappy and that makes your business suffer.

But what is so bad about order search?

Order search allows you to find orders by almost any criteria. To do that, you often join different tables in the database. Search for orders with specific line items? Join with the LineItem table on a match of the CatalogEntryId column. Search for orders with a specific shipping method? Join with the Shipment table on a match of ShippingMethodId, and so on. The SqlWhereClause and SqlMetaWhereClause of OrderSearchParameters are extremely flexible, and that is both a cure and a curse.
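
To make that concrete, here is a rough sketch of the first example using the classic OrderContext search API. The SQL in SqlWhereClause is mine, and the namespace/class values are the usual defaults – treat it as an illustration rather than copy-paste code.

using Mediachase.Commerce.Orders;
using Mediachase.Commerce.Orders.Search;

public class OrderSearchSample
{
    // Find purchase orders containing a line item for a given catalog entry code.
    // The subquery is exactly the kind of join that hits the LineItem table on
    // CatalogEntryId - which, as described below, has no index out of the box.
    public PurchaseOrder[] FindOrdersContaining(string catalogEntryCode)
    {
        var parameters = new OrderSearchParameters
        {
            SqlWhereClause = "OrderGroupId IN (SELECT OrderGroupId FROM LineItem WHERE CatalogEntryId = '"
                             + catalogEntryCode.Replace("'", "''") + "')"
        };

        var options = new OrderSearchOptions
        {
            Namespace = "Mediachase.Commerce.Orders",
            RecordsToRetrieve = 50
        };
        options.Classes.Add("PurchaseOrder");

        return OrderContext.Current.FindPurchaseOrders(parameters, options);
    }
}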

Let's examine the first example in closer detail. The query is easy to write. But did you know that there is no index on the CatalogEntryId column? That means every such order search ends up doing a full table scan of LineItem.

There are two pieces of bad news in that: your LineItem table usually has many rows already, which makes the scan slow and resource intensive. And as it's an ever-growing table, the situation only gets worse over time.

That is only a start, and a simple one, because it can be resolved by adding an index on CatalogEntryId. But there are more complicated cases where adding an index simply can't solve the problem – because there is no good index to add. For example, searching for orders by a custom field of type bit: bit is essentially the worst type when it comes to index-ability, so an index on it will be much less effective than you want it to be. A full table scan will likely be used anyway.

In short:

Order search is flexible, and powerful. But "with great power comes great responsibility". Think about what you join on in your SqlWhereClause and SqlMetaWhereClause statements, whether your query is covered by an index, and whether adding an index makes sense in your case (I have a few guidelines for a good index here: https://vimvq1987.com/index-or-no-index-thats-the-question/). Or whether you can limit the number of orders you search through.

Your database will thank you, later.

Iterate through all carts/orders

While it's not a common task, you might want to iterate through all carts, or all carts matching specific criteria. For example, you might want to load all carts last modified more than 1 week ago but less than 2 weeks ago, so you can send a reminder to the buyer (ideas on the implementation of that feature are discussed in my book – Episerver Commerce A Problem-Solution Approach). Or you simply want to delete all the carts, as asked here: https://world.episerver.com/forum/developer-forum/Episerver-Commerce/Thread-Container/2021/1/removing-all-active-carts/ . How?

In previous versions of Episerver Commerce, what you could do was use OrderContext to find orders and carts via the order search API. However, that does not work with non-default implementations, such as serializable carts. A better way is to use the newer abstraction – IOrderSearchService. It takes an OrderSearchFilter, which allows things like paging to be set, and returns an OrderSearchResults<T>, which contains the matching collection of carts or orders and the total count. When you have a lot of carts or orders to process, it's nice (even important) to let the end users know the progress. However, it's also important to know that counting the matching carts/orders can be very expensive, so I'd suggest avoiding doing it on every call.

The pattern you can use is to do a first call (which loads only one cart) to get the total count. For subsequent calls you only load the carts, setting ReturnTotalCount to false to skip counting. If you want to delete all the carts (for fun and profit – obviously do not try this on production, unless this is exactly what you want), the code can be written like this, where _orderSearchService is an instance of IOrderSearchService and _orderRepository is an instance of IOrderRepository:

            var deletedCartsTotalCount = 0;
            var cartFilter = new CartFilter
            {
                RecordsToRetrieve = 1,
                ExcludedCartNames = excludedCartNames,
                ReturnTotalCount = true
            };

            //Get the total carts for status update.
            var orderSearchResults = _orderSearchService.FindCarts(cartFilter);
            var totalCount = orderSearchResults.TotalRecords;
            cartFilter.ReturnTotalCount = false;
            cartFilter.RecordsToRetrieve = 100;

            var cartLoaded = 0;
            do
            {
                var searchResults = _orderSearchService.FindCarts(cartFilter);
                foreach (var cart in searchResults.Orders)
                {
                    _orderRepository.Delete(cart.OrderLink);
                    deletedCartsTotalCount++;
                }

                OnStatusChanged($"Deleted {deletedCartsTotalCount} in {totalCount} carts.");
                cartLoaded = searchResults.Orders.Count();
            }
            while (cartLoaded > 0);

A few notes:

  • You might or might not exclude carts based on name
  • CartFilter has a few filters that you can play with, not just names.

Register your custom implementation, the sure way

The point of Episerver dependency injection is that you can plug in your custom implementation for, well, almost everything. But it can be tricky at times to properly register your custom implementation.

The default DI framework (and probably any other popular DI framework) works in such a way that an implementation registered later wins, i.e. it overrides any implementation registered before it. To make Episerver use your implementation, you have to make sure yours is registered last.

  • Never register your custom implementation using ServiceConfiguration. Implementations with that attribute are registered first in the initialization pipeline. You will run into either:
    • The default implementation was registered in an IConfigurableModule.ConfigureContainer. As those are registered later than any implementation using ServiceConfiguration, yours will be overridden by the default one.
    • The default implementation was also registered using ServiceConfiguration. Now you run into a non-deterministic situation – the order is randomized every time your website starts. Sometimes it's yours, sometimes it's the default one, and that might cause some nasty bugs (a Heisenbug, if you know the reference 😉 )
  • That leaves you with registering your implementation in IConfigurableModule.ConfigureContainer. In many cases, registering your implementation here will just work, because the default implementations are registered via the ServiceConfiguration attribute. However, that is not always the case. There is a possibility that the default one was also registered using IConfigurableModule.ConfigureContainer, and then things get tricky. First of all, unlike IInitializableModule, where you can make your module depend on a specific module, the order in which IConfigurableModule.ConfigureContainer is executed is not determined. And even if you could declare such a dependency, it's not clear which module you should depend on – in many cases that module is internal, so you can't reference it.

That is the point of this post, then. To make sure your implementation is registered regardless of how the default one is registered, you can always fall back to the ConfigurationComplete event of ServiceConfigurationContext. It is raised once all ConfigureContainer calls have run, so you can be sure the default implementation is already registered – time to override it!

        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            context.ConfigurationComplete += Context_ConfigurationComplete;
        }

        private void Context_ConfigurationComplete(object sender, ServiceConfigurationEventArgs e)
        {
            e.Services.AddSingleton<IOrderRepository, CustomOrderRepository>();
        }

Simple as that!

Note that this only applies to cases where you want to override a default implementation. If you are registering an implementation of your own interfaces/abstract classes, or adding an implementation rather than overriding the default one (for example an implementation of IShippingPlugin), you can register it in any way you like.

Name or Display name in Catalog UI: you can choose

Since the beginning of the Catalog UI, it had always shown Name, in both the catalog tree and the catalog content list.

That, however, was changed to DisplayName in 13.14, due to a popular feature request here: https://world.episerver.com/forum/developer-forum/Feature-requests/Thread-Container/2019/12/use-localized-catalog-in-commerce-catalog-ui/#214650

All is good and the change was positively received. However, not everyone is happy with it – some want it the old way, i.e. `Name` to be displayed. From a framework perspective, it might be complex to let partners configure which field to display. But if you are willing to do some extra work, it's quite easy.

Catalog content is transformed by CatalogContentModelTransform – this is where DisplayName is added to the data returned to the client. If you override that, you can set DisplayName to whatever you want, for example Name.

Here is what the implementation would look like

using EPiServer.Cms.Shell.UI.Rest.Models.Transforms;
using EPiServer.Commerce;
using EPiServer.Commerce.Catalog;
using EPiServer.Commerce.Catalog.ContentTypes;
using EPiServer.Commerce.Catalog.Linking;
using EPiServer.Commerce.Shell.Rest;
using EPiServer.Framework.Localization;
using EPiServer.ServiceLocation;
using Mediachase.Commerce.Catalog;
using Mediachase.Commerce.Customers;
using Mediachase.Commerce.InventoryService;
using Mediachase.Commerce.Markets;
using Mediachase.Commerce.Pricing;

namespace EPiServer.Reference.Commerce.Site.Infrastructure
{
    [ServiceConfiguration(typeof(IModelTransform))]
    public class BlahBlahBlah : CatalogContentModelTransform
    {
        public BlahBlahBlah(ExpressionHelper expressionHelper, IPriceService priceService, IMarketService marketService, IInventoryService inventoryService, LocalizationService localizationService, ICatalogSystem catalogContext, IRelationRepository relationRepository, ThumbnailUrlResolver thumbnailUrlResolver, CustomerContext customerContext) : base(expressionHelper, priceService, marketService, inventoryService, localizationService, catalogContext, relationRepository, thumbnailUrlResolver, customerContext)
        {
        }

        public override TransformOrder Order
        {
            // Yes, this is very important to make it work: run after the base transform so this value wins
            get { return base.Order + 1; }
        }

        protected override void TransformInstance(IModelTransformContext context)
        {
            var catalogContent = context.Source as CatalogContentBase;
            var properties = context.Target.Properties;

            if (catalogContent is NodeContent nodeContent)
            {
                properties["DisplayName"] = nodeContent.Name;
            }
            if (catalogContent is EntryContentBase entryContent)
            {
                properties["DisplayName"] = entryContent.Name;
            }
        }
    }
}

And here is how it looks:

A few notes:

  • CatalogContentModelTransform, and other APIs in Commerce.Shell, are not considered public APIs, so they might change without notice. There is a risk in adding this; however, it's quite low.
  • This (or the bug fix) does not affect the breadcrumb – it has been, and still is, showing Name.