Delete orphaned assets

I was asked this question: we have about 3TB of assets – is there any way to clean them up?

These days, storage is cheap, but still not free. And big storage means you need space for backups, and with that, bandwidth and time.

Is there a way to clean up things you no longer need?

Yes!

Optimizely Content already has a scheduled job named Remove Abandoned BLOBs, but this job only removes blobs that have no content associated with them, i.e. the content was deleted by IContentRepository.Delete but the blob was left behind. The job uses the log to find out which content was deleted, then finds those blobs.

How about the assets that still have content associated with them, but are not used anywhere? Time to get your hands dirty!

Due to the nature of this task, it is best to make it a scheduled job.

All of the assets are children under the global asset root. By iterating over them, we can check whether each of them is being used by any other content. If not, we add it to a list for later deletion. Before deleting the content, we find its blob and delete that as well. Easy, right?

To get the content recursively, we use this little piece of code:

        public virtual IEnumerable<T> GetAssetRecursive<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : MediaData
        {
            foreach (var folder in LoadChildrenBatched<ContentFolder>(parentLink, defaultCulture))
            {
                foreach (var entry in GetAssetRecursive<T>(folder.ContentLink, defaultCulture))
                {
                    yield return entry;
                }
            }

            foreach (var entry in LoadChildrenBatched<T>(parentLink, defaultCulture))
            {
                yield return entry;
            }
        }

        private IEnumerable<T> LoadChildrenBatched<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : IContent
        {
            var start = 0;

            while (!_isStopped)
            {
                var batch = _contentRepository.GetChildren<T>(parentLink, defaultCulture, start, 50);
                if (!batch.Any())
                {
                    yield break;
                }
                foreach (var content in batch)
                {
                    // Skip linked content (where ParentLink differs) to avoid processing it multiple times during traversal
                    if (!parentLink.CompareToIgnoreWorkID(content.ParentLink))
                    {
                        continue;
                    }

                    yield return content;
                }
                start += 50;
            }
        }

We start from SiteDefinition.Current.GlobalAssetsRoot, and use IContentRepository.GetReferencesToContent to see if each asset is used in any content (both CMS and Catalog). If not, we add it to a list. Later, we use IPermanentLinkMapper to see if it has a blob associated, and delete that as well:

            foreach (var asset in GetAssetRecursive<MediaData>(SiteDefinition.Current.GlobalAssetsRoot, CultureInfo.InvariantCulture))
            {
                totalAsset++;
                if (!_contentRepository.GetReferencesToContent(asset.ContentLink, false).Any())
                {
                    toDelete.Add(asset.ContentLink.ToReferenceWithoutVersion());
                }

                if (toDelete.Count >= 50)
                {
                    var maps = _permanentLinkMapper.Find(toDelete);
                    foreach (var map in maps)
                    {
                        deletedAsset++;
                        _contentRepository.Delete(map.ContentReference, true, EPiServer.Security.AccessLevel.NoAccess);
                        var container = Blob.GetContainerIdentifier(map.Guid);
                        // Probably redundant - we could just delete directly
                        var blob = _blobFactory.GetBlob(container);
                        if (blob != null)
                        {
                            _blobFactory.Delete(container);
                        }
                        OnStatusChanged($"Deleting asset with id {map.ContentReference}");
                    }
                    toDelete.Clear();
                }
            }

We need another round of deletion after the loop to clean up the leftovers (or to handle the case when we have fewer than 50 abandoned assets), as sketched below.
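A minimal sketch of that final pass, reusing the same fields as the job above:

            var remainingMaps = _permanentLinkMapper.Find(toDelete);
            foreach (var map in remainingMaps)
            {
                deletedAsset++;
                _contentRepository.Delete(map.ContentReference, true, EPiServer.Security.AccessLevel.NoAccess);
                // Delete the associated blob container as well
                _blobFactory.Delete(Blob.GetContainerIdentifier(map.Guid));
                OnStatusChanged($"Deleting asset with id {map.ContentReference}");
            }
            toDelete.Clear();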

And we’re done!

Testing this job is simple – upload a few assets to your CMS without using them anywhere, then run the job. It should delete those assets.

Things to improve: we might want to make sure that only assets created more than a certain number of days ago are deleted. This allows editors to upload assets for later use without having to use them immediately.
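For example, inside the loop we could skip anything that is too new. A minimal sketch, assuming the asset implements IChangeTrackable (which exposes the Created timestamp) – the 30-day threshold is arbitrary:

                // Skip assets created recently, so editors can upload assets
                // for later use. The 30-day threshold is arbitrary - make it
                // configurable if needed.
                if (asset is IChangeTrackable trackable && trackable.Created > DateTime.Now.AddDays(-30))
                {
                    continue;
                }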

The code has been open-sourced at vimvq1987/DeleteUnusedAssets: Delete unused assets from an Optimizely/Episerver site (github.com), and I have uploaded a NuGet package to Packages (optimizely.com) to be reviewed.

Delete a content – directly from database

Before we even start, I would reiterate that manipulating data directly should be avoided unless absolutely necessary. It should be used only as a last resort, and with caution – always back up first, and test your queries on a development database before running them in production. And if the situation dictates that you have to run the query, better do it with the four-eyes principle – have a colleague double-check it for you. When it comes to the production database, nothing is too careful.

Now back to the question: if you absolutely have to delete content, you should do it like this:

exec editDeletePage @pageId = 123, @ForceDelete = 1

It is basically what Content Cloud (i.e. the CMS) does under the hood, without the cache invalidation on the application layer, of course.

So the moral of the story – do everything with the API if you can. If you absolutely have to go to the database, use the built-in stored procedures – they are tested rigorously, should have minimal issues/bugs, and take care of everything, data-wise, for you. Only write your own query if there is no SP that can be used.

Update: I initially mentioned Tomas’s post here, and that gave the impression that his way is incorrect. I should have written it better. My apologies to Tomas.

RedirectToAction is dead, long live RedirectToContent

In .NET 4.8/CMS 11.x and earlier, this was a very common way to redirect to an action:

return RedirectToAction("Index", new{ node = contentLink });

Which will redirect the user to

public TResult Index(CheckoutPage currentPage);

and you will get the currentPage parameter set to the content specified by contentLink.

However, in .NET 5, currentPage will be null after the redirect. This is due to how .NET 5 handles routing. The correct way is to use this:

return RedirectToContent(currentPage.ContentLink, "Index");

There is an action you can use – RedirectToContent. Note that the order of the parameters is reversed – you pass in the content link to the content first, then the name of the action.

And that’s how it’s done in .NET 5/CMS 12.

Where to store big collection data

No, I do not mean that big, big data (terabytes or more in size). I mean big collections – like when you have a List<string> with more than a few hundred items. Where should you store it?

Naturally, you would want to store that data as a property of a content. It’s convenient and it just works, so you definitely can. But the actual question is: should you?

It’s as simple as this

public virtual IList<string> MyBigProperty { get; set; }

But under the hood, it’s more than just … that. Let’s ignore the UI for a moment (rendering such a long list is bad UX no matter how you look at it, but you can simply skip rendering that property with the appropriate attributes), and focus on the backend aspects.

List<T> properties are serialized as long strings and saved to the database in one go. If you have a big property in your content, this happens every time you load that content:

  • The data must be read from the database, then transferred over the network
  • The data must be parsed to create an array (the underlying data structure of List<T>). The original string is tossed away.
  • Now you have a big array that you might not use every time – it’s just sitting there, taking up your precious LOH (as did the original string).

The same thing happens when you actually save that property:

  • The data must be serialized into a string; the List<T> is then tossed away
  • The data must then be transferred over the network
  • The data is then saved to the database. Even though it is a very long string and you changed maybe 10 characters, it is completely rewritten. Due to its size, multiple page writes might be needed.

As you can see, this can create a lot of waste, especially if you rarely use that property. To make matters worse, due to the size of these objects, they take up space in the LOH (large object heap).

And imagine you have such properties on each and every one of your contents. The waste is multiplied, and your site is now at risk of frequent Gen 2 garbage collections. Nobody likes visiting a website that freezes (if not crashes) once every 30 minutes.

So where should we store such big collection data?

The obvious answer is … somewhere else. Without other inputs it’s hard to give concrete suggestions, but how about a normalized custom table? You have the content reference as the key, and one row per value of the list. Just an idea. Then you only load the data when you absolutely need it. More work, yes, but it’s the better way to do it.
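A minimal sketch of that idea, using plain System.Data.SqlClient – the table, names, and method here are illustrative only, not an existing API:

    // Assumed table:
    //   CREATE TABLE dbo.BigListValue (
    //       ContentId INT NOT NULL,
    //       Value NVARCHAR(400) NOT NULL
    //   );
    //   CREATE INDEX IX_BigListValue_ContentId ON dbo.BigListValue (ContentId);
    public IList<string> LoadBigList(ContentReference contentLink, string connectionString)
    {
        var result = new List<string>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT Value FROM dbo.BigListValue WHERE ContentId = @contentId", connection))
        {
            command.Parameters.AddWithValue("@contentId", contentLink.ID);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // One row per list item - only materialized when needed
                    result.Add(reader.GetString(0));
                }
            }
        }
        return result;
    }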

Just a reminder: whatever you do, stay away from DDS – the Dynamic Data Store. It’s the worst option of all. Just don’t 🙂

The hidden gotcha with IObjectInstanceCache

It’s no secret that caching is one of the most important factors – if not the most important one – for website performance. Yes, caching is great, and if you are using Optimizely Content/Commerce Cloud, you should be using ISynchronizedObjectInstanceCache to cache your objects whenever possible.

But caching is not easy. Or rather, the cache invalidation is not easy.

To ensure you have an effective caching strategy, it’s important to set up cache dependencies correctly. In principle, there are two types of dependencies:

  • Master keys. These control an entire cache “segment”. For example, you could have one master key for the prices. If you need to invalidate the entire price cache, just remove the master key and you’re done.
  • Dependency keys. These tell the cache system that your cache item depends on this or that object. If that object is invalidated, your cache item is invalidated automatically. This is particularly useful if you do not control that object.

ISynchronizedObjectInstanceCache lets you control cache dependencies via CacheEvictionPolicy. There are a few ways to construct an instance of CacheEvictionPolicy: the expiration can be absolute (i.e. the item is invalidated after a fixed amount of time) or sliding (i.e. its expiration is renewed whenever it is accessed), and your cache item can depend on one or more master keys, and/or one or more dependency keys, like this:

        /// <summary>
        /// Initializes a new instance of the <see cref="CacheEvictionPolicy"/> class.
        /// </summary>
        /// <param name="cacheKeys">The dependencies to other cached items, identified by their keys.</param>
        public CacheEvictionPolicy(IEnumerable<string> cacheKeys)

        /// <summary>
        /// Initializes a new instance of the <see cref="CacheEvictionPolicy"/> class.
        /// </summary>
        /// <param name="cacheKeys">The dependencies to other cached items, identified by their keys.</param>
        /// <param name="masterKeys">The master keys that we depend upon.</param>
        public CacheEvictionPolicy(IEnumerable<string> cacheKeys, IEnumerable<string> masterKeys)

The constructors that take master keys and dependency keys look pretty much the same, but there is an important difference/caveat: if a dependency key does not already exist in the cache, the cache item you are inserting will be invalidated (i.e. removed from the cache) immediately. (For master keys, the framework automatically adds an empty object for you if none exists.)
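To illustrate, here is a sketch using the constructors above – the keys are hypothetical, and _cache is an injected ISynchronizedObjectInstanceCache:

        // Works: the framework inserts an empty object for the missing master
        // key, so this item stays cached until the master key is removed.
        _cache.Insert("price:42", price, new CacheEvictionPolicy(null, new[] { "prices:masterkey" }));

        // Gotcha: if "prices:42:dependency" does not already exist in the cache,
        // this item is evicted immediately - no error, just silence.
        _cache.Insert("price:42", price, new CacheEvictionPolicy(new[] { "prices:42:dependency" }));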

That can be an unpleasant surprise – everything seems to be working fine, no errors whatsoever. But if you look closely, your code is hitting the database more than it should, and your website performance is quietly suffering (sometimes not so quietly).

This is an easy mistake to make – I made it once myself (albeit long ago), in an important catalog content cache path, and I have seen some very experienced developers make the same mistake (at this point you might wonder if the API itself is to blame).

Takeaway:

  • Make sure you are using the right constructor when constructing an instance of CacheEvictionPolicy. Are you sure the cache keys you are going to depend on actually exist?

In newer versions of the CMS, a warning is written to the log if a cache item is invalidated immediately; however, it can easily be missed unless you are actively looking for it.

Note that this behavior is the same with ISynchronizedObjectInstanceCache as it extends IObjectInstanceCache.

Potential performance issue with Maxmind.db

From time to time I have to dig into customers’ profiler traces to figure out why their site is slow (yeah, if you follow me, you know that’s kind of my main job). There are multiple issues that can eat your website’s performance for breakfast, from loading too much content to unmaintained database indexes. While my blog does not cover everything, I think you can get a good grasp of which mistakes to avoid.

But sometimes the problem might come from a 3rd party library/framework. It’s not new, as we have seen it with A curious case of memory dump diagnostic: How Stackify can cause troubles to your site – Quan Mai’s blog (vimvq1987.com). The problem with those types of issues is that they are usually overlooked.

The library we’ll be investigating today is Maxmind.db. To be honest, I have never used it myself, but it seems to be a very popular choice for geo-locating visitors, and it is commonly used on Optimizely sites for that purpose via visitor groups (which is why it came onto my radar).

For several sites that use it, the traces show it stuck, more often than not, in a stack around TypeActivatorCreator.CreateActivator and LambdaCompiler.Compile.

It’s easy to think that CreateActivator is doing something heavy here (evidently the LambdaCompiler.Compile part), and a peek at the decompiled code shows that yes, it is heavy. I’m not quite sure I can post the decompiled code here without violating any agreement (I did, in fact, accept no agreement at this point), but it’s quite straightforward code: TypeActivatorCreator uses reflection to get the constructors of the Type passed to it, sees if any constructor is decorated with the MaxMind.Db.Constructor attribute, prepares some parameters, and creates a LambdaExpression that creates an instance of that Type using the found constructor (which is a good thing, because a compiled expression performs much better than a plain reflection call).

(I’m using MaxMind.Db 2.0.0, for the record.)

The code is straightforward, but it is also slow – as any code involving reflection and lambda compilation would be. The essential step is to cache the result, and this is actually a very good place for a cache: the number of types is fixed at runtime (except for the very edge cases where you dynamically create new types), so you won’t have to worry about cache invalidation, and a cache would significantly improve the performance of the code above.

And TypeActivatorCreator does have a cache for it: a simple ConcurrentDictionary<Type, TypeActivator>, which returns a TypeActivator if the Type was requested before, or creates and caches one if it hasn’t been. As I said, this is a very good place to add a cache.

There is a cache, which is good. However, the very important tidbit here is that the dictionary is not static. That means the cache only works if the class is registered as a singleton (by itself, or by another class down the dependency chain) – i.e. only one instance is created and shared between threads (which is why the ConcurrentDictionary part matters).

Except it’s not.

When I looked at a memory dump collected for a customer using Maxmind.db, this is what I got:

0:000> !dumpheap -stat -type TypeAcivatorCreator
Statistics:
MT Count TotalSize Class Name
00007ffa920f67e0 1 24 MaxMind.Db.TypeAcivatorCreator+<>c
00007ffa920f6500 147 3528 MaxMind.Db.TypeAcivatorCreator
Total 148 objects

So there were 147 instances of TypeAcivatorCreator. Note that this is only the number of instances alive at that point – there might have been other instances that were already garbage collected by the CLR.

Now it’s clear why it has been performing badly. For (presumably) every request, a new instance of TypeActivatorCreator is created, and therefore its internal cache is simply empty (it was just created, too). Each request then goes through the expensive CreateActivator path, and performance suffers.

The obvious fix is to make the dictionary static, or to make the TypeActivatorCreator class a singleton. I don’t have the full source code of MaxMind.Db to determine which is better, but I’d lean toward the former.
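A sketch of the former – not MaxMind.Db’s actual code, just the general pattern (assuming using System and System.Collections.Concurrent):

    public class ActivatorCache
    {
        // static: one cache for the whole process, no matter how many
        // ActivatorCache instances are created per request.
        private static readonly ConcurrentDictionary<Type, Func<object>> _activators =
            new ConcurrentDictionary<Type, Func<object>>();

        public object CreateInstance(Type type)
        {
            // The expensive reflection/compilation happens only once per type.
            var activator = _activators.GetOrAdd(type, CreateActivator);
            return activator();
        }

        private static Func<object> CreateActivator(Type type)
        {
            // Stand-in for the real reflection + LambdaCompiler.Compile work.
            return () => Activator.CreateInstance(type);
        }
    }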

Moral of the story:

  • Caching is very, very important, especially when you are dealing with reflection and lambda compilation
  • You can get it 99% right, but the remaining 1% can still destroy your performance.

Update:

I reached out to Maxmind.db regarding this issue on November 9th, 2021

About six hours later they replied, essentially pointing me to their documentation.

I was at first confused, then somewhat disappointed. It seems like a small fix that would improve overall performance for everyone, rather than relying on customers to do what the documentation says. But well, let’s just say we have different opinions.

Don’t share HttpContext between threads

HttpContext is not designed to be thread-safe, which means you can run into nasty things if you share an HttpContext between threads. For example, this is one of the things that can happen: one (or more) threads get stuck in this:

mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].FindEntry(System.__Canon)+ed
EPiServer.Web.Routing.Segments.Internal.RequestSegmentContext.GetOrResolveContextMode(System.Web.HttpContextBase)+9b
EPiServer.DataAbstraction.Internal.ProjectResolver.IsInEditMode()+17
EPiServer.DataAbstraction.Internal.ProjectResolver.GetCurrentProjects()+17
EPiServer.Core.Internal.ProjectPipeline.Pipe(EPiServer.Core.ContentReference, System.Globalization.CultureInfo, EPiServer.Core.LoaderOptions)+1c
EPiServer.Core.Internal.ProviderPipelineImplementation.GetItem(EPiServer.Core.ContentProvider, EPiServer.Core.ContentReference, EPiServer.Core.LoaderOptions)+109
EPiServer.Core.Internal.DefaultContentLoader.TryGet[[System.__Canon, mscorlib]](EPiServer.Core.ContentReference, EPiServer.Core.LoaderOptions, System.__Canon ByRef)+11e
[[StubHelperFrame]]
EPiServer.Core.Internal.DefaultContentLoader.Get[[System.__Canon, mscorlib]](EPiServer.Core.ContentReference, EPiServer.Core.LoaderOptions)+63

If you don’t know why this is scary: this is an infinite loop, meaning your CPU spends 100% of its time inside Dictionary.FindEntry, unable to do anything else. The only way to solve the problem is to restart the instance.

It is caused by unsafe access to a Dictionary – if one thread is enumerating it while another thread writes to it, it is possible to run into a dead end like this.

And HttpContext just happens to have many Dictionary properties and sub-properties. HttpContext.Request.RequestContext.RouteData.DataTokens is one of them (and the reason the code above ended in disaster), making it vulnerable to this kind of problem. Which is exactly why it is not recommended to share an HttpContext between threads.

Sometimes you can just set HttpContext.Current to null in the background thread. Sometimes you need to take a step back and ask yourself: do you really need to run things in parallel?
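A sketch of the safer approach, if you do need background work (DoWork is a hypothetical method):

// Bad: the background thread shares the request's HttpContext,
// including its non-thread-safe Dictionary properties.
Task.Run(() => DoWork(HttpContext.Current));

// Better: copy what you need while still on the request thread,
// and hand plain data to the background work.
var userAgent = HttpContext.Current.Request.UserAgent;
Task.Run(() => DoWork(userAgent));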

Don’t use AspNetIdentity FindByEmailAsync/FindByIdAsync

Or any of its equivalents – FindByEmail/FindById, etc.

Why?

It’s slow. Slow enough to effectively kill your database, and therefore, your website.

If you dig into the default implementation (which uses EntityFramework), this is what you end up with, whether you use FindByEmailAsync or its synchronous equivalent FindByEmail:

    public virtual Task<TUser> FindByEmailAsync(string email)
    {
      this.ThrowIfDisposed();
      return this.GetUserAggregateAsync((Expression<Func<TUser, bool>>) (u => u.Email.ToUpper() == email.ToUpper()));
    }

It finds the user by matching the email, but not before applying ToUpper to both sides of the equation. This is to ensure correctness, because a user can register with their email in one casing and then try to log in with it in another. If the database uses a CS (case-sensitive) collation, that would not be a match.

That is fine for C#/.NET, but it is bad for SQL Server. When it reaches the database, this query is generated:

(@p__linq__0 nvarchar(4000))SELECT TOP (1) 
    [Extent1].[Id] AS [Id], 
    [Extent1].[NewsLetter] AS [NewsLetter], 
    [Extent1].[IsApproved] AS [IsApproved], 
    [Extent1].[IsLockedOut] AS [IsLockedOut], 
    [Extent1].[Comment] AS [Comment], 
    [Extent1].[CreationDate] AS [CreationDate], 
    [Extent1].[LastLoginDate] AS [LastLoginDate], 
    [Extent1].[LastLockoutDate] AS [LastLockoutDate], 
    [Extent1].[Email] AS [Email], 
    [Extent1].[EmailConfirmed] AS [EmailConfirmed], 
    [Extent1].[PasswordHash] AS [PasswordHash], 
    [Extent1].[SecurityStamp] AS [SecurityStamp], 
    [Extent1].[PhoneNumber] AS [PhoneNumber], 
    [Extent1].[PhoneNumberConfirmed] AS [PhoneNumberConfirmed], 
    [Extent1].[TwoFactorEnabled] AS [TwoFactorEnabled], 
    [Extent1].[LockoutEndDateUtc] AS [LockoutEndDateUtc], 
    [Extent1].[LockoutEnabled] AS [LockoutEnabled], 
    [Extent1].[AccessFailedCount] AS [AccessFailedCount], 
    [Extent1].[UserName] AS [UserName]
    FROM [dbo].[AspNetUsers] AS [Extent1]
    WHERE ((UPPER([Extent1].[Email])) = (UPPER(@p__linq__0))) OR ((UPPER([Extent1].[Email]) IS NULL) AND (UPPER(@p__linq__0) IS NULL))

If you can’t spot the problem, don’t worry – I have seen experienced developers make the same mistake. By using the UPPER function on the column, you effectively remove any performance benefit of the index that might be on the Email column. That means this query does an index scan every time it is called. The TOP (1) somewhat reduces the impact (the scan can stop as soon as it finds a match), but if there is no match – e.g. no such registered email – it becomes a full index scan.

If you have a lot of registered customers, frequent calls to that query can effectively kill your database.

And how to fix it

Fixing this issue is a bit cumbersome, because the code is well hidden inside the EntityFramework implementation of AspNetIdentity. But it is not impossible. First, we need a UserStore that does not use ToUpper for comparison:

public class FoundationUserStore<TUser> : UserStore<TUser> where TUser : IdentityUser, IUIUser, new()
{
    public FoundationUserStore(DbContext context)
        : base(context)
    { }

    public override Task<TUser> FindByEmailAsync(string email)
    {
        return GetUserAggregateAsync(x => x.Email == email);
    }

    public override Task<TUser> FindByNameAsync(string name)
    {
        return GetUserAggregateAsync(x => x.UserName == name);
    }
}

And then a new UserManager that uses that new UserStore:

    public class CustomApplicationUserManager<TUser> : ApplicationUserManager<TUser> where TUser : IdentityUser, IUIUser, new()
    {
        public CustomApplicationUserManager(IUserStore<TUser> store)
            : base(store)
        {
        }

        public static new ApplicationUserManager<TUser> Create(IdentityFactoryOptions<ApplicationUserManager<TUser>> options, IOwinContext context)
        {
            var manager = new ApplicationUserManager<TUser>(new FoundationUserStore<TUser>(context.Get<ApplicationDbContext<TUser>>()));

            // Configure validation logic for usernames
            manager.UserValidator = new UserValidator<TUser>(manager)
            {
                AllowOnlyAlphanumericUserNames = false,
                RequireUniqueEmail = true
            };

            // Configure validation logic for passwords
            manager.PasswordValidator = new PasswordValidator
            {
#if DEBUG
                RequiredLength = 2,
                RequireNonLetterOrDigit = false,
                RequireDigit = false,
                RequireLowercase = false,
                RequireUppercase = false
#else
                RequiredLength = 6,
                RequireNonLetterOrDigit = true,
                RequireDigit = true,
                RequireLowercase = true,
                RequireUppercase = true

#endif
            };

            // Configure user lockout defaults
            manager.UserLockoutEnabledByDefault = true;
            manager.DefaultAccountLockoutTimeSpan = TimeSpan.FromMinutes(5);
            manager.MaxFailedAccessAttemptsBeforeLockout = 5;

            var provider = context.Get<ApplicationOptions>().DataProtectionProvider.Create("EPiServerAspNetIdentity");
            manager.UserTokenProvider = new DataProtectorTokenProvider<TUser>(provider);

            return manager;
        }
    }

And then a way to register our UserManager:

    public static IAppBuilder AddCustomAspNetIdentity<TUser>(this IAppBuilder app, ApplicationOptions applicationOptions) where TUser : IdentityUser, IUIUser, new()
    {
        applicationOptions.DataProtectionProvider = app.GetDataProtectionProvider();

        // Configure the db context, user manager and signin manager to use a single instance per request
        app.CreatePerOwinContext<ApplicationOptions>(() => applicationOptions);
        app.CreatePerOwinContext<ApplicationDbContext<TUser>>(ApplicationDbContext<TUser>.Create);
        app.CreatePerOwinContext<ApplicationRoleManager<TUser>>(ApplicationRoleManager<TUser>.Create);
        app.CreatePerOwinContext<ApplicationUserManager<TUser>>(CustomApplicationUserManager<TUser>.Create);
        app.CreatePerOwinContext<ApplicationSignInManager<TUser>>(ApplicationSignInManager<TUser>.Create);

        // Configure the application
        app.CreatePerOwinContext<UIUserProvider>(ApplicationUserProvider<TUser>.Create);
        app.CreatePerOwinContext<UIRoleProvider>(ApplicationRoleProvider<TUser>.Create);
        app.CreatePerOwinContext<UIUserManager>(ApplicationUIUserManager<TUser>.Create);
        app.CreatePerOwinContext<UISignInManager>(ApplicationUISignInManager<TUser>.Create);

        // Save the connection string in case the DbContext is requested from a non-web context
        ConnectionStringNameResolver.ConnectionStringNameFromOptions = applicationOptions.ConnectionStringName;

        return app;
    }

Finally, replace the normal app.AddAspNetIdentity with this:

        app.AddCustomAspNetIdentity<SiteUser>(new ApplicationOptions
        {
            ConnectionStringName = commerceConnectionStringName
        });

As I mentioned, this is cumbersome to do. If you know a better way, I’m all ears ;).

We are also skipping the case-sensitivity part. In most cases it will be fine, as you are most likely using a CI collation. But it’s better to be sure than to leave it to chance – we will address that in the second part of this blog post.

Register your custom implementation, the sure way

The point of Episerver dependency injection is that you can plug in your custom implementation for, well, almost everything. But it can be tricky at times to register your custom implementation properly.

The default DI framework (and possibly any other popular DI framework) works in such a way that the implementation registered later wins, i.e. it overrides any implementation registered before it. To make Episerver use your implementation, you have to make sure yours is registered last.
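For example, with two hypothetical implementations:

        services.AddSingleton<IOrderRepository, DefaultOrderRepository>();
        services.AddSingleton<IOrderRepository, CustomOrderRepository>();

        // Resolving IOrderRepository now returns CustomOrderRepository -
        // the last registration wins.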

  • Never register your custom implementation using ServiceConfiguration. Implementations with that attribute are registered first in the initialization pipeline. You will run into one of these:
    • The default implementation was registered in an IConfigurableModule.ConfigureContainer. As those are registered later than any implementation using ServiceConfiguration, yours will be overridden by the default one.
    • The default implementation was also registered using ServiceConfiguration. Now you are in a nondeterministic situation – the order is randomized every time your website starts. Sometimes yours wins, sometimes the default one does, and that can cause some nasty bugs (a Heisenbug, if you know the reference 😉 )
  • That leaves you with registering your implementation in IConfigurableModule.ConfigureContainer. In many cases, registering your implementation there will just work, because the default implementations are registered by the ServiceConfiguration attribute. However, that is not always the case: the default one might also be registered in IConfigurableModule.ConfigureContainer, and then things get tricky. First of all, unlike IInitializableModule, where you can make your module depend on a specific module, the order in which IConfigurableModule.ConfigureContainer is executed is not determined. And even if you could declare such a dependency, it is not clear which module you should depend on – in many cases that module is internal, so you cannot reference it.

That is the point of this post, then. To make sure your implementation is registered regardless of how the default one is registered, you can always fall back to the ConfigurationComplete event of ServiceConfigurationContext. It is raised once all ConfigureContainer calls have run, so you can be sure the default implementation has been registered – time to override it!

        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            context.ConfigurationComplete += Context_ConfigurationComplete;
        }

        private void Context_ConfigurationComplete(object sender, ServiceConfigurationEventArgs e)
        {
            e.Services.AddSingleton<IOrderRepository, CustomOrderRepository>();
        }

Simple as that!

Note that this only applies when you want to override a default implementation. If you are registering an implementation of your own interfaces/abstract classes, or adding an implementation rather than overriding a default one (for example, an implementation of IShippingPlugin), you can register it in any way.
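For example, this is fine anywhere in ConfigureContainer (MyShippingPlugin being a hypothetical implementation of your own):

        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            // Adding one more IShippingPlugin - there is nothing to override,
            // so the registration order does not matter.
            context.Services.AddTransient<IShippingPlugin, MyShippingPlugin>();
        }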

Don’t insert an IEnumerable to cache

You have been told that cache is great. Used correctly, it can greatly improve your website’s performance – sometimes it can even make the difference between life and death.

While that’s true, caching can be tricky to get right. The #1 issue with caching is cache invalidation, which we will get into in another blog post. The topic of today is a hidden, easy-to-make mistake that can wreak havoc in production.

Can you spot the problem in this snippet?

var someData = GetDataFromDatabase();                
var dataToCache = someData.Concat(someOtherData);
InsertToCache(cacheKey, dataToCache);

If you can’t, don’t worry – it is more common than you’d imagine. It’s easy to think that you are inserting your data into the cache correctly.

Except you are not.

dataToCache is actually just a lazily evaluated enumerable. It is not until you get the data back and actually access the elements that the enumerable is evaluated and the data is fetched. If GetDataFromDatabase does not return a List<T> but a lazy-loading collection, that is when unpredictable things happen.

Who likes to have unpredictability on a production website?

A simple but effective piece of advice: always make sure you have the actual data in the object you are inserting into the cache. Calling either .ToList() or .ToArray() before inserting the data into the cache solves the problem nicely.
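Applied to the snippet above:

var someData = GetDataFromDatabase();
// Materialize the sequence so the cached object holds the actual data,
// not a deferred query.
var dataToCache = someData.Concat(someOtherData).ToList();
InsertToCache(cacheKey, dataToCache);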

And that applies to any other lazily loaded type of data as well.