The search for dictionary key

Recently I helped to chase down a ghost (and you might be surprised to know that I, for most part, spend hours to be a ghostbuster, it could be fun, sometimes). A customer reported a weird issue when a visitor goes to their website, have every thing correct in the cart, including the discount, only to have the discount disappeared when they check out. That would be a fairly easy task to debug and fix if not for the problem is random in nature. It might happen once in a while, but on average, daily. It could not be reproduced locally, or reproduced consistently on production, so all fix is based on guess work.

After a lot of dry code reading and then log reading, it turned out that it seems the problem with missing discount was problem with the missing cache. Once in a while, the cache that contains the promotion list is returned empty, resulting that no discount is applied to the order.

But why?

After a few guesses, it eventually came to me that the problem is with the caching using Dictionary, more specifically, campaigns are loaded and cached using a Dictionary, using IMarket as a key. It would be fine and highly efficient and well, if not for the fact that the default implementation of IMarket is not suitable to be a Dictionary key. It does not implement IComparable<T> and IEquatable<T> which means, for the only case that two IMarket instances to be equal, is that they are the same instances. Otherwise even if their properties all equal in value, they will not be equal.

This is a short program that demonstrates the problem. You can expect it write “False” to the output console.

public class Program
{
    private static Dictionary<AClass, int> dict = new Dictionary<AClass, int>();
    public static void Main()
    {
        dict.Add(new AClass("abc", 1), 1);
        dict.Add(new AClass("xyz", 2), 2);

        Console.WriteLine(dict.ContainsKey(new AClass("abc", 1)));
    }
}


public class AClass
{
    public AClass(string a, int b)
    {
        AString = a;
        AnInt = b;
    }

    public string AString { get; set; }
    public int AnInt { get; set; }
}

The question arises is that if the key is not matched and an empty list of campaigns returns, why this only happens sometimes. The answer is the IMarket instances themselves are cached, by default in 5 minutes. So for the problem to occur, a cache for campaigns must be loaded in memory, just before the cache for IMarket instances to be expired (then new instances are created). Once the new IMarket instances are loaded, then the campaigns cache must be accessed again before itself expires (default to 30 seconds). The timing needs to be “right” which causes this problem elusive and hard to find from normal testing – both automated and manual.

Time to some blaming and finger pointing. When I fix something I usually try to check the history of the code to understand the reason behind the original idea and intention. Was there a reason or just an overlook. And most importantly

Who wrote such code?

Me, about 7 months ago.

Uh oh.

The fix was simple enough. Instead of IMarket, we can change the key to MarketId which implements both IEquatable<T> and IComparer<T>. So it does not matter if you have two different instances of MarketId, as long as they have the same value, they will be equal.

A workaround was sent to the customer to test and after a week or so they reported back the problem is gone. The official fix is in Commerce 14.31 which was released yesterday https://nuget.optimizely.com/package/?id=EPiServer.Commerce.Core&v=14.31.0 , so you’re, always, highly recommended to upgrade.

Lessons learned:

Pick the dictionary key carefully. It should implement IEquatable<T> and IComparable<T> , properly I might ask. In general, a struct is a better choice than a class, if you can.
No matter how “experienced” you think you are, you are still a human being and can make mistake. It’s important to have someone to check your work from time to time, spotting problems that you couldn’t.

The hidden gotcha with IObjectInstanceCache

It’s not a secret that cache is one of the most, if not the most, important factors to their website performance. Yes cache is great, and if you are using Optimizely Content/Commerce Cloud, you should be using ISynchronizedObjectInstanceCache to cache your objects whenever possible.

But caching is not easy. Or rather, the cache invalidation is not easy.

To ensure that you have effective caching strategy, it’s important that you have cache dependencies, i.e. In principles, there are two types of dependencies:

Master keys. This is to control a entire cache “segment”. For example, you could have one master key for the prices. If you needs to invalidate the entire price cache, just remove the master key and you’re done.
Dependency keys. This is to tell cache system that your cache item depends on this or that object. If this or that object is invalidated, your cache item will be invalidated automatically. This is particularly useful if you do not control this or that object.

ISynchronizedObjectInstanceCache allows you to control the cache dependencies by CacheEvictionPolicy . There are a few ways to construct an instance of CacheEvictionPolicy, from if the cache expiration will be absolute (i.e. it will be invalidated after a fixed amount of time), or sliding (i.e. if it is accessed, its expiration will be renewed), to if your cache will be dependent on one or more master keys, and/or one or more dependency keys, like this

   
        /// <summary>
        /// Initializes a new instance of the <see cref="CacheEvictionPolicy"/> class.
        /// </summary>
        /// <param name="cacheKeys">The dependencies to other cached items, idetified by their keys.</param>
        public CacheEvictionPolicy(IEnumerable<string> cacheKeys)

        /// <summary>
        /// Initializes a new instance of the <see cref="CacheEvictionPolicy"/> class.
        /// </summary>
        /// <param name="cacheKeys">The dependencies to other cached items, idetified by their keys.</param>
        /// <param name="masterKeys">The master keys that we depend upon.</param>
        public CacheEvictionPolicy(IEnumerable<string> cacheKeys, IEnumerable<string> masterKeys)

The constructors that takes master keys and dependency keys look pretty the same, but there is an important difference/caveat here: if there is no dependency key already existing in cache, the cache item you are inserting will be invalidated (i.e. removed from cache) immediately. (For master keys, the framework will automatically add an empty object (if none existed) for you.)

That will be some unpleasant surprise – everything seems to be working fine, no error whatsoever. But, if you look closely, your code seems to be hitting database more than it should. But other than that, your website performance is silently suffering (sometimes, not “silently”)

This is an easy mistake to make – I did once myself (albeit long ago) in an important catalog content cache path. And I saw some very experienced developers made the same mistake as well (At this point you might wonder if the API itself is to blame)

Take away:

Make sure you are using the right constructor when you construct an instance of CacheEvictionPolicy . Are you sure that the cache keys you are going to depend on, actually exist?

In newer version of CMS, there would be a warning in log if the cache is invalidated immediately, however, it could be missed, unless you are actively looking for it.

Note that this behavior is the same with ISynchronizedObjectInstanceCache as it extends IObjectInstanceCache.

Don’t insert an IEnumerable to cache

You have been told, cache is great. If used correctly, it can greatly improve your website performance, sometimes, it can even make the difference between life and death.

While that’s true, cache can be tricky to get right. The #1 issue with cache is cache invalidation, which we will get into detail in another blog post. The topic of today is a hidden, easy to make mistake, but can wreck havok in production.

Can you spot the problem in this snippet?

var someData = GetDataFromDatabase();                
var dataToCache = someData.Concat(someOtherData);
InsertToCache(cacheKey, dataToCache);

If you can’t, don’t worry – it is more common than you’d imagine. It’s easy to think that you are inserting your data correctly to cache.

Except you are not.

dataToCache is actually just an enumerator. It’s not until you get your data back and actually access the elements, the enumerator is actually called to fetch the data. If GetDataFromDatabase does not return a List<T>, but a lazy loading collection, that is when unpredictable things happen.

Who like to have unpredictability on a production website?

A simple, but effective advice is to always make sure you have the actual data in the object you are inserting to cache. Calling either .ToList() or ToArray() before inserting the data to cache would solve the problem nicely.

And that’s applied to any other lazy loading type of data as well.

Episerver Commerce performance optimization – part 2

Or lock or no lock – that’s the question.

This is the second part of the series on how can you improve the performance of Episerver Commerce site – or more precisely, to avoid the deadlocks and 100% CPU usage. This is not Commerce specific actually, and you can apply the knowledge and techniques here for a normal CMS site as well.

It’s a common and well-known best practice to store the slow-to-retrieve data in cache. These days memory is cheap – not free – but cheap. Yet it is still much faster than the fastest PCIe SSD in the market (if your site is running on traditional HDD, it’s not even close). And having objects in cache means you won’t have to open the connection to SQL Server, wait for it to read the data and send back to you – which all cost time. And if the object you need is a complex one, for example a Catalog content, you will also save the time needed to construct the object. Even if it’s fast, it is still not instantaneous, and it will cost you both memory and CPU cycles. All in all – caching is the right way to go. But how to get it right?

One common mistake for to have no lock when you load the data for the first time and insert it into cache.

Continue reading “Episerver Commerce performance optimization – part 2” →

Episerver Commerce performance optimization – part 1

This is a first part of a long series (which have no planned number of parts) as the lessons I learned during trouble shouting customers’ performance problems. I’m quite of addicted to the support cases reported by customers, especially the ones with performance problems. Every time I jump into such support case, I’ll be with less hairs, but I also learn some new things: Implementations are different from cases to cases, but there are some common mistakes which will hurt your website performance. This series will try to point out those mistakes so you get your performance gain, for (almost) free:

Mistake 1: Loading to much content

It’s easy to load contents, especially with the new content APIs. Given an universal ContentReference, you can load a content with a simple line of code. By default, the loaded content is cached, so you might think it’s cheap, or even free to load a content. Think again.

Continue reading “Episerver Commerce performance optimization – part 1” →