Use Find for CSR UI

If you have been using Find, you might be surprised to find that CSR UI uses the SearchProvider internally. This is a bit unfortunate because you likely are using Find, and that creates unnecessary complexity. For starter, you need to configure a SearchProvider, then you need to index the entries, separately from the Find index. If you install EPiServer.CloudPlatform.Commerce, it will setup the DXPLucenceSearchProvider for you, which is basically a wrapper of LuceneSearchProvider to let it work on DXP (i.e. Azure storage). But even with that, you have to index your entries anyway. You can use FindSearchProvider, but that actually just creates another problem – it uses a different index compared to Find, so you double your index count, yet you have still make sure to index your content. Is there a better way – to use the existing Find indexed content?

Yes, there is

Searches for entries in CSR is done by IEntrySearchService which the default implementation uses the configured SearchProvider internally . Fortunately for us, as with most thing in Commerce, we can create our own implementation and inject it. Now that’s with a caveat – IEntrySearchService is marked as BETA remark, so prepare for some breaking changes without prior notice. However it has not changed much since its inception (funny thing, when I checked for its history, I was the one who created it 6 years ago, in 2017. Feeling old now), and if it is changed, it would be quite easy to adapt for such changes.

IEntrySearchService is a simple with just one method:


IEnumerable<int> Search(string keyword, MarketId marketId, Currency currency, string siteId);

It is a bit weird to return an IEnumerable<int> (what was I thinking ? ), but it was likely created as a scaffolding of SearchManager.Search which returns an IEnumerable<int>, and was not updated later. Anyway, an implementation using Find should look like this:

    public class FindEntrySearchService : IEntrySearchService
    {
        private EPiServer.Find.IClient _searchClient;

        public FindEntrySearchService(EPiServer.Find.IClient searchClient) => _searchClient = searchClient;

        public IEnumerable<int> Search(string keyword, MarketId marketId, Currency currency, string siteId)
        {
            return _searchClient.Search<EntryContentBase>()
                 .For(keyword)
                 .Filter(x => x.MatchMarketId(marketId))
                 .Filter(x => x.SiteId().Match(siteId))
                 .Filter(x => FilterPriceAvailableForCurrency<IPricing>(y => y.Prices(), currency))
                 .GetResult()
                 .Select(x => x.ContentLink.ID);
        }

        public FilterExpression<Price> FilterPriceAvailableForCurrency<T>(Expression<Func<T, IEnumerable<Price>>> prices, Currency currency)
        {
            var currencyCode = currency != null ? currency.CurrencyCode : string.Empty;

            return new NestedFilterExpression<T, Price>(prices, price => price.UnitPrice.Currency.CurrencyCode.Match(currencyCode), _searchClient.Conventions);
        }
    }

Note that I am not an expert on Find, especially on NestedFilterExpression, so my FilterPriceAvailableForCurrency might be wrong. Feel free to correct it, the code is not copyrighted and is provided as-is.

As always, you need to register this implementation for IEntrySearchService. You can add it anywhere you like as long as it’s after .AddCommerce.

_services.AddSingleton<IEntrySearchService, FindEntrySearchService>();

Skip validating carts

Carts are meant to be validated. Prices changed, customers add more quantity than allowed, promotions expired, stock ran out, etc.. All kinds of stuffs that make the items in carts need validation to make sure they up to date, and be ready to be converted to an order.

However, there are cases when you don’t want your carts to be validated. The most common case is of course, wish list – a special cart that allows customer to add items to, just to keep track of. You certainly don’t want to touch it. Another example is quote – when you give a specific item at specific price for a customer, and you don’t want it to be automatically changed to the public prices, which is different from that said price.

By default, when it is called to validate a cart, these things will be done:

  • Remove items that no longer available (either deleted or end of line)
  • Update prices of the items to the latest applicable prices, or remove items that have no prices.
  • Update quantity of the items (to comply with the settings or in stock quantity), or remove items that are out of stock
  • Apply promotions
  • Update quantity again (As promotions could do things like adding free items to the cart)

There are two ways you can avoid carts being validated, let’s see what we can do.

The “Wish list names” route

With OrderOptions you can set certain wishlist names to be exempted from the validation, using WishListCartNames. By default, it’s only “Wishlist”, but you can set several using the comma separator, like this

orderOptions.WishListCartNames = "Wishlist,Quote";

However there is a caveat, with this approach, carts in those names will not be shown in the Order Management (If you want, you can change that, however it is not an easy or quick one)

The OrderValidationService route

The validation of carts (or rather, order types in general) is done by OrderValidationService. And that class is meant to be extended if necessary. Here is how you would avoid validation carts with name “Quote”, using OrderValidationService

    public class CustomOrderValidationService : OrderValidationService
    {
        public CustomOrderValidationService(ILineItemValidator lineItemValidator, IPlacedPriceProcessor placedPriceProcessor, IPromotionEngine promotionEngine, IInventoryProcessor inventoryProcessor, OrderOptions orderOptions) : base(lineItemValidator, placedPriceProcessor, promotionEngine, inventoryProcessor, orderOptions)
        {
        }

        public override IDictionary<ILineItem, IList<ValidationIssue>> ValidateOrder(IOrderGroup orderGroup)
        {
            if (orderGroup.Name.Equals("Quote", System.StringComparison.OrdinalIgnoreCase))
            {
                return new Dictionary<ILineItem, IList<ValidationIssue>>();
            }
            return base.ValidateOrder(orderGroup);
        }
    }

And as OrderValidationService is registered by ServiceConfiguration attribute, you can register yours by

services.AddTransient<OrderValidationService, CustomOrderValidationService>();

One caveat though, making changes to OrderValidationService means those changes will apply to the entire website, so make sure the changes are actually the ones you want site-wide, not just in specific places.

Index only Catalog content

If you are using Find to index your content, you likely have used the Find Indexing job – which would index everything in one go. Today I stumped upon this question – A way to run indexing job for Commerce only | Optimizely Develope – and it is a good one – if you have many of content in CMS side, and they don’t change that often, if at all – you certain don’t want to waste time and resource in trying to reindex them again. Is there away to just index catalog content?

Yes, there is. It is a bit hacky solution, but it can certain work. But first, let’s dive in on how Find indexing job does it work. It relies on IIndexingJobService , which itself relies on ContentIndexer to do the job. In its turn, ContentIndexer uses a list of IReindexInformation to know which content to index, and in which languages. Here’s what it looks like

    public interface IReindexInformation
    {
        /// <summary>
        /// Content links to be reindexed.
        /// </summary>
        IEnumerable<ReindexTarget> ReindexTargets { get; }

        /// <summary>
        /// Gets the root to index.
        /// </summary>
        ContentReference Root { get; }
    }

It has one Root, and multiple ReindexTarget, which contains

    public class ReindexTarget
    {
        /// <summary>
        /// The content references.
        /// </summary>
        public IEnumerable<ContentReference> ContentLinks { get; set; }

        /// <summary>
        /// The languages the collection of <see cref="ContentReference"/> are enabled on.
        /// </summary>
        public IEnumerable<CultureInfo> Languages { get; set; }

        /// <summary>
        /// The site that the collection of <see cref="ContentReference"/> appears on
        /// or <c>null</c> if unknown.
        /// </summary>
        public SiteDefinition SiteDefinition { get; set; }
    }

As you might have guessed, Commerce has its own IReindexInformation to index catalog content. If we can only use that to run our job. This is how our “hack” begins

The interface IContentIndexer has no method to control the IReindexInformation`, but the default implementation ContentIndexer does. We set it to the only one we need, so here it is

        List<IReindexInformation> targets;
        var contentIndexer = _contentIndexer as ContentIndexer;
        if (contentIndexer != null)
        {
            targets = contentIndexer.ReindexInformation.ToList();
            var commerceReIndexInformation = targets.FirstOrDefault(x => x.GetType() == typeof(CommerceReIndexInformation));
            contentIndexer.ReindexInformation = new List<IReindexInformation>() { commerceReIndexInformation };
            _indexingJobService.Start(OnStatusChanged);

            contentIndexer.ReindexInformation = targets;
        }

A note is that you will still see the “Indexing Global assets and other data” message, because IIndexingJobService implementation will go through all SiteDefinition regardless and show that message, but the internal ContentIndexer will skip if the SiteDefinition passed to it does not match the SiteDefinition in the IReindexInformation (and for CommerceReIndexInformation it’s SiteDefinition.Empty

As I mentioned in the beginning, this is a bit hacky solution, as you have to cast IContentIndexer to its concrete implementation. The proper solution would be implement IContentIndexer yourself. Given that’s not a trivial job, I’ll leave at that.

A curious case of cookie threading issue – part 2

A while back, I wrote this post A curious case of cookie threading issue – Quan Mai’s blog (vimvq1987.com) as an example of how to use WinDbg to diagnose a race condition (with results in an infinite loop). Naturally that issue needs to be fixed. A starting point to fix a race condition is to put locks whenever you change the data. But it’s not that simple. Too much locking would definitely hurt your application performance, and increase the chance of you running into another nasty problem (namely, deadlock). So the art of locking is “lagom” – Swedish word for “just right”, not too much, not too little.

ReaderWriteLockSlim comes to rescue. You only need to lock when writing, not locking. So this for reading

            readerWriterLockSlim.EnterReadLock();
            try
            {
                return httpRequest.Cookies[CookieKey];
            }
            finally
            {
                readerWriterLockSlim.ExitReadLock();
            }

And this for writing

            readerWriterLockSlim.EnterWriteLock();
            try
            {
                httpRequest.Cookies.Remove(CookieKey);               
            }
            finally
            {
                readerWriterLockSlim.ExitWriteLock();
            }

Would be enough, right?

No. The issue runs deeper than that.

If we go back to the stacktrace of the issue

0:039> !clrstack
OS Thread Id: 0x1e34 (39)
        Child SP               IP Call Site
000000740273cff0 00007ff9297d9f59 System.Collections.Generic.HashSet`1[[System.__Canon, mscorlib]].Remove(System.__Canon)
000000740273d080 00007ff92a8eb4e3 System.Web.HttpCookieCollection.Get(System.String)
000000740273d0c0 00007ff9314de98d 

and look closer. Something is not right. If you do not spot the issue – it’s fine, I missed it too. The problem is that we have a HashSet.Remove inside a HttpCookieCollection.Get. Uh oh.

If we look at source code of HttpCookieCollection, inside Get, it calls to EnsureKeyValidated

if (cookie != null) {
                EnsureKeyValidated(name, cookie.Value);
            }

referencesource/HttpCookieCollection.cs at master · microsoft/referencesource · GitHub

which itself calls to

            _keysAwaitingValidation.Remove(key);

referencesource/HttpCookieCollection.cs at master · microsoft/referencesource · GitHub

_keysAwaitingValidation is a HashSet<string> . That explains the Remove we saw. And that explains why ReaderWriterLockSlim is not enough – changes are made within the supposedly read only action as well.

The only valid solution here is to lock both read and write actions on HttpCookieCollection. However, as HttpCookieCollection is per request, so our lock object should only be per request as well (we certainly do not want every thread to be locked when we get cookie on a request).

Moral of the story:

  • Look closer. There might always be something underneath. Close to the truth is still not the truth.
  • Never assume that a Get method is thread-safe. The implementation can do plenty of unexpected things under the hood.
  • You might ask why this happen, as HttpCookieCollection is per request. Turned out that there is a code that use ThreadPool.QueueUserWorkItem to queue tasks which share the same HttpContext object. As we learned, that’s the recipe for disaster. Think twice (or thrice) before sharing an object that is not thread-safe between threads.

Delete a content – directly from database

Before we even start, I would reiterate that manipulating data directly should be avoided unless absolutely necessary, it should be used as the last resort, and should be proceeded with cautions – always back up first and test your queries on development database first before running it in production. And if the situation dictates that you have to run the query, better do it with the 4 eyes principle – having a colleague double check it for you. When it comes to production database, nothing is too careful.

Now back to the question, if you absolutely have to delete a content, you should do like this

exec editDeletePage @pageId = 123, @ForceDelete = 1

It is basically what Content Clouds (i.e. CMS) does under the hood, without the cache validation on the application layer of course.

So the moral of the story – do everything with API if you can. If you absolutely have to, use the built-in stored procedures – they are tested vigorously and should have minimal issues/bugs, and should take care of everything, data-wise for you. Only write your own query if there is no SP that can be used.

Update: Initially I mentioned Tomas’ post in this, and that gave impression his way is incorrect. I should have written better. My apologies to Tomas

Storing 100.000 prices per SKU – part 1

One of the questions I have received, from time to time, is that how to store a lot of prices per SKU in Optimizely (B2C) Commerce Cloud. While this is usually a perfect candidate for Optimizely B2B Commerce, there are many customers invested in B2C and want to make the best out of it. Is it possible?

It’s important to understand the pricing system of Optimizely Commerce (which is, written in detail in my book – shameless plug). But in short:

  • There are two price systems, IPriceService and IPriceDetailService
  • One is handling prices in batch – i.e. prices per SKU (IPriceService), and one is handling prices per individual price (IPriceDetailService)
  • Both are cached in latest version (cache for IPriceDetailService was added in late 13.x version)

With that in mind, it would be very problematic if you use IPriceService for such high number of prices per SKU, because each time you save a price, you save a lot of prices at once (same as loading prices). This is how the default IPriceService implementation saves prices of a SKU

create procedure dbo.ecf_Pricing_SetCatalogEntryPrices
    @CatalogKeys udttCatalogKey readonly,
    @PriceValues udttCatalogEntryPrice readonly
as
begin
    begin try
        declare @initialTranCount int = @@TRANCOUNT
        if @initialTranCount = 0 begin transaction

        delete pv
        from @CatalogKeys ck
        join dbo.PriceGroup pg on ck.CatalogEntryCode = pg.CatalogEntryCode
        join dbo.PriceValue pv on pg.PriceGroupId = pv.PriceGroupId

        merge into dbo.PriceGroup tgt
        using (select distinct CatalogEntryCode, MarketId, CurrencyCode, PriceTypeId, PriceCode from @PriceValues) src
        on (    tgt.CatalogEntryCode = src.CatalogEntryCode
            and tgt.MarketId = src.MarketId
            and tgt.CurrencyCode = src.CurrencyCode
            and tgt.PriceTypeId = src.PriceTypeId
            and tgt.PriceCode = src.PriceCode)
        when matched then update set Modified = GETUTCDATE()
        when not matched then insert (Created, Modified, CatalogEntryCode, MarketId, CurrencyCode, PriceTypeId, PriceCode)
            values (GETUTCDATE(), GETUTCDATE(), src.CatalogEntryCode, src.MarketId, src.CurrencyCode, src.PriceTypeId, src.PriceCode);

        insert into dbo.PriceValue (PriceGroupId, ValidFrom, ValidUntil, MinQuantity, MaxQuantity, UnitPrice)
        select pg.PriceGroupId, src.ValidFrom, src.ValidUntil, src.MinQuantity, src.MaxQuantity, src.UnitPrice
        from @PriceValues src
        left outer join PriceGroup pg
            on  src.CatalogEntryCode = pg.CatalogEntryCode
            and src.MarketId = pg.MarketId
            and src.CurrencyCode = pg.CurrencyCode
            and src.PriceTypeId = pg.PriceTypeId
            and src.PriceCode = pg.PriceCode

        delete tgt
        from dbo.PriceGroup tgt
        join @CatalogKeys ck on tgt.CatalogEntryCode = ck.CatalogEntryCode
        left join dbo.PriceValue pv on pv.PriceGroupId = tgt.PriceGroupId
        where pv.PriceGroupId is null

        if @initialTranCount = 0 commit transaction
    end try
    begin catch
        declare @msg nvarchar(4000), @severity int, @state int
        select @msg = ERROR_MESSAGE(), @severity = ERROR_SEVERITY(), @state = ERROR_STATE()
        if @initialTranCount = 0 rollback transaction
        raiserror(@msg, @severity, @state)
    end catch
end

If you have experience with SQL (which you probably should), you will see that it’s a deletion of rows in PriceValue that have CatalogEntryCode same as , then a merge, then a deletion of left over rows. To make matters worse, IPriceService system stores data on 3 tables: PriceValue, PriceGroup and PriceType. Imagine doing that with a few dozen of thousands rows.

Even if you change just one price, all prices of that specific SKU will be touched. It’d be fine if you have like ten prices, but if you have ten thousands prices, it’ll be a huge waste.

Not just that. To save one price, you would still need to load all prices of that specific SKU. That’s two layers of waste: the read operations at database layer, and then on application, a lot of price objects will need to be constructed, and then you need to recreate a datatable to send all the data back to the database to do the expensive operation above.

And wait, because the prices saved to IPriceService needs to be synchronized to IPriceDetailService (however, you can disable this). Prices that were changed (which is, all of them) need to be replicated to another table.

So in short, IPriceService was not designed to handle many prices per SKU. If you have less than a few hundred prices per SKU (on average), it’s fine. But if you have more than 1000 prices per SKU, it’s time to look at other options.

Where to store big collection data

No, I do not mean that big, big data (in size of terabytes or more). It’s big collection, like when you have a List<string> and it has more than a few hundreds of items. Where to store it?

Naturally, you would want to store that data as a property of a content. it’s convenient and it just works, so you definitely can. But the actual question is: should you?

It’s as simple as this

public virtual IList<String> MyBigProperty {get;set;}

But under the hood, it’s more than just … that. Let’s ignore UI for a moment (rendering such long list is a bad UX no matters how you look at it, but you can simply ignore rendering that property by appropriate attributes), and focus on the backend aspects of it.

List<T> properties are serialized as a long strings, and save at once to database. If you have a big property in your content, this will happen every time you load your content:

  • The data must be read from database, then transferred through the network
  • The data must be parsed to create an array (the underlying data structure of List<T>. The original string is tossed away.
  • Now you have a big array that you might not use every time. it’s just there taking your previous LOH (as with the original string)

Same thing happens when you actually save that property

  • The data must be serialized as string, the List<T> is now tossed away
  • The data then must be transferred through the network
  • The data then saved to database. Even though it is a very long string and you changed, maybe 10 characters, it’s completely rewritten. Due to its size, there might be multiple page writes needed.

As you can see, it can create a lot of waste, especially if you rarely use that property. To make the matter worse, due to the size of the property, it means they are taking up space in LOH (large objects heap).

And imagine if you have such properties in each and every of your content. The waste is multiplied, and your site is now at risk of some frequent Gen 2 Garbage collection. Nobody likes visiting a website that freezes (if not crashes) once every 30 minutes.

Then when to store such big collection data?

The obvious answer is … somewhere else. Without other inputs, it’s hard to give you some concrete suggestions, but how’s about a normalized custom table? You have the key as the content reference, and the other column is each value of the list. Just an idea. Then you only load the data when you absolutely need it. More work, yes, but it’s the better way to do it.

Just a reminder that whatever you do, just stay away from DDS – Dynamic Data Store. It’s the worst option of all. Just, don’t 🙂

Getting being delete variant codes

I got the question here Content Events and Service API | Optimizely Developer Community , and CHKN contacted Optimizely developer support service which is the good/right thing to do. However it could be beneficial for a wider audience, hence this blogpost.

If you are developing new Optimizely Commerce Cloud now, you should be using the latest version (CMS 12/ Commerce 14), or at least Commerce 13.31 if you have to stay with .NET 4.8. You then could use the IContentEvents to listen to any events that might be fired from Service API.

However, if you are using older versions, you might be limited to lower level, non content event through EventContext. It works, but with a catch: there is no EntryDeleting event in EventContext. At this point I’m not entirely sure why, probably just an overlook. But it’s not impossible to work around that issue.

As I suggested in the post, EntryUpdating is like a blanket event – every change to the entries goes through there. The sender is a CatalogEntryDto, which should contain information about the being deleted entries.

        private void Instance_EntryUpdating(object sender, Mediachase.Commerce.Catalog.Events.EntryEventArgs e)
        {
            var dto = sender as CatalogEntryDto;
            if (dto != null)
            {
                var rows = dto.CatalogEntry.Where(x => x.RowState == DataRowState.Deleted);
                var deletingCodes = rows.Select(x => (string)x["Code", DataRowVersion.Original]);
                //do stuffs with codes being deleted.
            }
        }

And then to listen to the event in one of your IInitializationModule

EventContext.Instance.EntryUpdating += Instance_EntryUpdating;

However, there is a caveat here: as EntryUpdating happens before the entry deletion, it is possible that the change did not go through (i.e. the changes were to be reverted). It is unlikely, but it’s a possibility. You might accept that, or you can:

  • Store the id and code in a “Deleting entry” dictionary
  • Listen to EntryDeleted event and match from DeletedEntryEventArgs parameter (which contains EntryId ) to get the deleted Code, and continue from there.

The hidden gotcha with IObjectInstanceCache

It’s not a secret that cache is one of the most, if not the most, important factors to their website performance. Yes cache is great, and if you are using Optimizely Content/Commerce Cloud, you should be using ISynchronizedObjectInstanceCache to cache your objects whenever possible.

But caching is not easy. Or rather, the cache invalidation is not easy.

To ensure that you have effective caching strategy, it’s important that you have cache dependencies, i.e. In principles, there are two types of dependencies:

  • Master keys. This is to control a entire cache “segment”. For example, you could have one master key for the prices. If you needs to invalidate the entire price cache, just remove the master key and you’re done.
  • Dependency keys. This is to tell cache system that your cache item depends on this or that object. If this or that object is invalidated, your cache item will be invalidated automatically. This is particularly useful if you do not control this or that object.

ISynchronizedObjectInstanceCache allows you to control the cache dependencies by CacheEvictionPolicy . There are a few ways to construct an instance of CacheEvictionPolicy, from if the cache expiration will be absolute (i.e. it will be invalidated after a fixed amount of time), or sliding (i.e. if it is accessed, its expiration will be renewed), to if your cache will be dependent on one or more master keys, and/or one or more dependency keys, like this

   
        /// <summary>
        /// Initializes a new instance of the <see cref="CacheEvictionPolicy"/> class.
        /// </summary>
        /// <param name="cacheKeys">The dependencies to other cached items, idetified by their keys.</param>
        public CacheEvictionPolicy(IEnumerable<string> cacheKeys)

        /// <summary>
        /// Initializes a new instance of the <see cref="CacheEvictionPolicy"/> class.
        /// </summary>
        /// <param name="cacheKeys">The dependencies to other cached items, idetified by their keys.</param>
        /// <param name="masterKeys">The master keys that we depend upon.</param>
        public CacheEvictionPolicy(IEnumerable<string> cacheKeys, IEnumerable<string> masterKeys)

The constructors that takes master keys and dependency keys look pretty the same, but there is an important difference/caveat here: if there is no dependency key already existing in cache, the cache item you are inserting will be invalidated (i.e. removed from cache) immediately. (For master keys, the framework will automatically add an empty object (if none existed) for you.)

That will be some unpleasant surprise – everything seems to be working fine, no error whatsoever. But, if you look closely, your code seems to be hitting database more than it should. But other than that, your website performance is silently suffering (sometimes, not “silently”)

This is an easy mistake to make – I did once myself (albeit long ago) in an important catalog content cache path. And I saw some very experienced developers made the same mistake as well (At this point you might wonder if the API itself is to blame)

Take away:

  • Make sure you are using the right constructor when you construct an instance of CacheEvictionPolicy . Are you sure that the cache keys you are going to depend on, actually exist?

In newer version of CMS, there would be a warning in log if the cache is invalidated immediately, however, it could be missed, unless you are actively looking for it.

Note that this behavior is the same with ISynchronizedObjectInstanceCache as it extends IObjectInstanceCache.

Potential performance issue with Maxmind.db

From time to time, I have to dig into some customers’ profiler traces to figure out why their site is slow (yeah, if you follow me, you’d know that’s kind of my main job). There are multiple issues that can eat your website performance for breakfast, from loading too much content, to unmaintained database indexes. While my blog does not cover everything, I think you can get a good grasp of what mistakes to avoid.

But sometimes the problem might come from a 3rd party library/framework. It’s not new, as we have seen it with A curious case of memory dump diagnostic: How Stackify can cause troubles to your site – Quan Mai’s blog (vimvq1987.com). The problem with those types of issues is that they are usually overlooked.

The library we’ll be investigating today would be Maxmind.db. To be honest, I’ve never used it my own, but it seems to be a very popular choice to geography-map the visitors. It’s usually used by Optimizely sites for that purpose, using VisitorGroup (which is why it came under my radar).

For several sites that use it, it seems more often than not stuck in this stack

It’s essentially to think that CreateActivator is doing something heavy here (evidently with the LambdaCompiler.Compile part. A peek from decompiling actually shows that yes, it’s heavy. I’m not quite sure I can post the decompiled code here without violating any agreement (I did, in fact, accepted no agreement at this point), but it’s quite straightforward code: TypeActivatorCreator uses reflection to get the constructors of the Type passed to it, to sees if there is any constructor decorated with MaxMind.Db.Constructor attribute, then prepares some parameters, and creates an LambdaExpression that would create an instance of that Type, using found constructor (which is a good thing because a compiled expression would perform much better than just a reflection call).

(I’m using Mindmax.db 2.0.0, for the record)

The code is straightforward, but it is also slow – as any code which involves reflection and lambda compilation would be. The essential step would be to cache any result of this. This is actually a very good place to cache. The number of types are fixed during runtime (except for very edge cases where you dynamically create new types), so you won’t have to worry about cache invalidation. The cache would significantly improve the performance of above code.

And in TypeActivatorCreator there is a cache for it. It is a simple ConcurrentDictionary<Type, TypeActivator> , which would return an TypeActivator if the Type was requested before, or create one and cache it, it it hasn’t been. As I said, this is a very good place to add cache to this.

There is a cache for that, which is good. However, the very important tidbit here is that the dictionary is not static. That means, the cache only works, if the class is registered as Singleton (by itself, or by another class down the dependency chain), meaning, only one of the instance is created and shared between thread (which is why the ConcurrentDictionary part is important).

But except it’s not.

When I look at a memory dump that collected for a customer that is using Maxmind.db, this is what I got:

0:000> !dumpheap -stat -type TypeAcivatorCreator
Statistics:
MT Count TotalSize Class Name
00007ffa920f67e0 1 24 MaxMind.Db.TypeAcivatorCreator+<>c
00007ffa920f6500 147 3528 MaxMind.Db.TypeAcivatorCreator
Total 148 objects

So there were 147 instances of TypeAcivatorCreator. Note that this is only the number of existing instances. There might be other instances that were disposed and garbaged by CLR.

Now it’s clear why it has been performing bad. For supposedly every request, a new instance of TypeActivatorCreator is created, and therefore its internal cache is simply empty (it is just newly created, too). Therefore each of request will go through the expensive path of CreateActivator, and performance suffers.

The obvious fix here is to make the dictionary static, or making the TypeActivatorCreator class Singleton. I don’t have the full source code of Mindmax.Db to determine which is better, but I’d be leaning toward the former.

Moral of the story:

  • Caching is very, very important, especially when you are dealing with reflection and lambda compilation
  • You can get it right 99%, but the 1% left could still destroy performance.

Update:

I reached out to Maxmind.db regarding this issue on November 9th, 2021

About 6h later they replied with this

I was at first confused, then somewhat disappointed. It is a small thing to fix to improve overall performance, rather than relying on/expecting customers to do what you say in documentation. But well, let’s just say we have different opinions.