Don’t use AspNetIdentity FindByEmailAsync/FindByIdAsync

Or any of its equivalent – FindByEmail/FindById etc.

Why?

Reason? It’s slow. Slow enough to effectively kill your database, and therefore, your website.

If you want dig into the default implementation (which is using EntityFramework), this is what you end up with, either if you are using FindByEmailAsync, or its synchronous equivalent FindByEmail

    public virtual Task<TUser> FindByEmailAsync(string email)
    {
      this.ThrowIfDisposed();
      return this.GetUserAggregateAsync((Expression<Func<TUser, bool>>) (u => u.Email.ToUpper() == email.ToUpper()));
    }

It finds user by matching the email, but not before it uses ToUpper in both sides of the equation. This is to ensure correctness because an user can register with “[email protected]” but then try to login with “[email protected]”. If the database is set to be CS – case sensitive collation, that is not a match.

That is fine for C#/.NET, but it is bad for SQL Server. When it reaches database, this query is generated

(@p__linq__0 nvarchar(4000))SELECT TOP (1) 
    [Extent1].[Id] AS [Id], 
    [Extent1].[NewsLetter] AS [NewsLetter], 
    [Extent1].[IsApproved] AS [IsApproved], 
    [Extent1].[IsLockedOut] AS [IsLockedOut], 
    [Extent1].[Comment] AS [Comment], 
    [Extent1].[CreationDate] AS [CreationDate], 
    [Extent1].[LastLoginDate] AS [LastLoginDate], 
    [Extent1].[LastLockoutDate] AS [LastLockoutDate], 
    [Extent1].[Email] AS [Email], 
    [Extent1].[EmailConfirmed] AS [EmailConfirmed], 
    [Extent1].[PasswordHash] AS [PasswordHash], 
    [Extent1].[SecurityStamp] AS [SecurityStamp], 
    [Extent1].[PhoneNumber] AS [PhoneNumber], 
    [Extent1].[PhoneNumberConfirmed] AS [PhoneNumberConfirmed], 
    [Extent1].[TwoFactorEnabled] AS [TwoFactorEnabled], 
    [Extent1].[LockoutEndDateUtc] AS [LockoutEndDateUtc], 
    [Extent1].[LockoutEnabled] AS [LockoutEnabled], 
    [Extent1].[AccessFailedCount] AS [AccessFailedCount], 
    [Extent1].[UserName] AS [UserName]
    FROM [dbo].[AspNetUsers] AS [Extent1]
    WHERE ((UPPER([Extent1].[Email])) = (UPPER(@p__linq__0))) OR ((UPPER([Extent1].[Email]) IS NULL) AND (UPPER(@p__linq__0) IS NULL))

If you can’t spot the problem – don’t worry because I have seen experienced developers made the same mistake. By using the TOUPPER function on the column you are effectively remove any performance benefit of the index that might be on Email column. That means this query will do an index scan every time it is called. We have the TOP(1) statement which somewhat reduces the impact (it can stop as soon as it finds a match), but if there is no match – e.g. no registered email, it will be a full index scan.

If you have a lot of registered customers, frequent calls to that query can effectively kill your database.

And how to fix it

Fixing this issue will be a bit cumbersome, because the code is well hidden inside the implementation of AspNetIdentity EntityFramework. But it’s not impossible. First we need an UserStore which does not use the Upper for comparison:

public class FoundationUserStore<TUser> : UserStore<TUser> where TUser : IdentityUser, IUIUser, new()
{
    public FoundationUserStore(DbContext context)
        : base(context)
    { }

    public override Task<TUser> FindByEmailAsync(string email)
    {
        return GetUserAggregateAsync(x => x.Email == email);
    }

    public override Task<TUser> FindByNameAsync(string name)
    {
        return GetUserAggregateAsync(x => x.UserName == name);
    }
}

And then a new UserManager to use that new UserStore

    public class CustomApplicationUserManager<TUser> : ApplicationUserManager<TUser> where TUser : IdentityUser, IUIUser, new()
    {
        public CustomApplicationUserManager(IUserStore<TUser> store)
            : base(store)
        {
        }

        public static new ApplicationUserManager<TUser> Create(IdentityFactoryOptions<ApplicationUserManager<TUser>> options, IOwinContext context)
        {
            var manager = new ApplicationUserManager<TUser>(new FoundationUserStore<TUser>(context.Get<ApplicationDbContext<TUser>>()));

            // Configure validation logic for usernames
            manager.UserValidator = new UserValidator<TUser>(manager)
            {
                AllowOnlyAlphanumericUserNames = false,
                RequireUniqueEmail = true
            };

            // Configure validation logic for passwords
            manager.PasswordValidator = new PasswordValidator
            {
#if DEBUG
                RequiredLength = 2,
                RequireNonLetterOrDigit = false,
                RequireDigit = false,
                RequireLowercase = false,
                RequireUppercase = false
#else
                RequiredLength = 6,
                RequireNonLetterOrDigit = true,
                RequireDigit = true,
                RequireLowercase = true,
                RequireUppercase = true

#endif
            };

            // Configure user lockout defaults
            manager.UserLockoutEnabledByDefault = true;
            manager.DefaultAccountLockoutTimeSpan = TimeSpan.FromMinutes(5);
            manager.MaxFailedAccessAttemptsBeforeLockout = 5;

            var provider = context.Get<ApplicationOptions>().DataProtectionProvider.Create("EPiServerAspNetIdentity");
            manager.UserTokenProvider = new DataProtectorTokenProvider<TUser>(provider);

            return manager;
        }
    }

And then a way to register our UserManager

    public static IAppBuilder AddCustomAspNetIdentity<TUser>(this IAppBuilder app, ApplicationOptions applicationOptions) where TUser : IdentityUser, IUIUser, new()
    {
        applicationOptions.DataProtectionProvider = app.GetDataProtectionProvider();

        // Configure the db context, user manager and signin manager to use a single instance per request
        app.CreatePerOwinContext<ApplicationOptions>(() => applicationOptions);
        app.CreatePerOwinContext<ApplicationDbContext<TUser>>(ApplicationDbContext<TUser>.Create);
        app.CreatePerOwinContext<ApplicationRoleManager<TUser>>(ApplicationRoleManager<TUser>.Create);
        app.CreatePerOwinContext<ApplicationUserManager<TUser>>(CustomApplicationUserManager<TUser>.Create);
        app.CreatePerOwinContext<ApplicationSignInManager<TUser>>(ApplicationSignInManager<TUser>.Create);

        // Configure the application
        app.CreatePerOwinContext<UIUserProvider>(ApplicationUserProvider<TUser>.Create);
        app.CreatePerOwinContext<UIRoleProvider>(ApplicationRoleProvider<TUser>.Create);
        app.CreatePerOwinContext<UIUserManager>(ApplicationUIUserManager<TUser>.Create);
        app.CreatePerOwinContext<UISignInManager>(ApplicationUISignInManager<TUser>.Create);

        // Saving the connection string in the case dbcontext be requested from none web context
        ConnectionStringNameResolver.ConnectionStringNameFromOptions = applicationOptions.ConnectionStringName;

        return app;
    }

Finally, replace the normal app.AddAspNetIdentity with this:

        app.AddCustomAspNetIdentity<SiteUser>(new ApplicationOptions
        {
            ConnectionStringName = commerceConectionStringName
        });

As I mentioned, this is cumbersome to do. If you know a better way to do, I’m all ear ;).

We are also skipping the case sensitivity part. In most of the cases, it’ll be fine as you are most likely using CI collation instead. But it’s better to be sure than leave it to chance. We will address that in the second part of this blog post.

Register your custom implementation, the sure way

The point of Episerver dependency injection is that you can plug in your custom implementation for, well almost, everything. But it can be tricky at times how to properly register your custom implementation.

The default DI framework (and possibly any other popular DI frameworks) works in the way that implementation registered later wins, i.e. it overrides any other implementation registered before it. To make Episerver uses your implementation, you have to make sure yours is registered last.

  • Never register your customer implementation using ServiceConfiguration. Implementation with that attributes will be registered first in the initialization pipeline. You will run into either
    • The default implementation was registered in an IConfigurableModule.ConfigureContainer , which means it will override yours
    • The default implementation was also registered using ServiceConfiguration. As those will be registered than any implementation using ServiceConfiguration , yours will be overridden by the default ones.
  • That leaves you with registering your implementation by IConfigurableModule.ConfigureContainer . In many cases, registering your implementations here will just work, because the default implementations are registered by ServiceConfiguration attribute. However, that is not always the case. There is a possibility that the default one was registered using IConfigurableModule.ConfigureContainer, and things will be tricky. First of all, unlike IInitializationModule when you can make your module depends on a specific module, the order in which IConfigurationModule.ConfigureContainer is executed is not determined. Even if you were allowed to make the dependency, it’s not clear which module you should depend on, and in many cases, that module is internal, so you can’t specify it

That is the point of this post then. To make sure your implementation is registered regardless of how the default one is registered, you can always fallback to use the ConfigurationComplete event of ServiceConfigurationContext. This is called once all ConfigureContainer have been called, so you can be sure that the default implementation is registered – time to override it then!

        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            context.ConfigurationComplete += Context_ConfigurationComplete;
        }

        private void Context_ConfigurationComplete(object sender, ServiceConfigurationEventArgs e)
        {
            e.Services.AddSingleton<IOrderRepository, CustomOrderRepository>();
        }

Simple as that!

Note that this only applies to cases when you want to override the default implementation. If you register an implementation of your own interfaces/abstract classes, or you will be adding your implementation (not overriding the default one, an example is if you have an implementation of IShippingPlugin), you can register it in any way.

Don’t insert an IEnumerable to cache

You have been told, cache is great. If used correctly, it can greatly improve your website performance, sometimes, it can even make the difference between life and death.

While that’s true, cache can be tricky to get right. The #1 issue with cache is cache invalidation, which we will get into detail in another blog post. The topic of today is a hidden, easy to make mistake, but can wreck havok in production.

Can you spot the problem in this snippet?

var someData = GetDataFromDatabase();                
var dataToCache = someData.Concat(someOtherData);
InsertToCache(cacheKey, dataToCache);

If you can’t, don’t worry – it is more common than you’d imagine. It’s easy to think that you are inserting your data correctly to cache.

Except you are not.

dataToCache is actually just an enumerator. It’s not until you get your data back and actually access the elements, the enumerator is actually called to fetch the data. If GetDataFromDatabase does not return a List<T>, but a lazy loading collection, that is when unpredictable things happen.

Who like to have unpredictability on a production website?

A simple, but effective advice is to always make sure you have the actual data in the object you are inserting to cache. Calling either .ToList() or ToArray() before inserting the data to cache would solve the problem nicely.

And that’s applied to any other lazy loading type of data as well.

Beware of GetContentResult()

If you are using Episerver Search & Navigation (Formerly Find), you are likely using GetContentResult() . While I’m fairly skeptical about it (if you do not need the content explicitly, it’s probably a good idea to use GetResult instead), there are legitimate uses of the APIs. However, it can come with a surprise, and a not very pleasant one.

Your code probably looks like this

SearchClient.Instance.Search<IContent>()
  .For("banana")
  .StaticallyCacheFor(15)
  .GetContentResult();

Looks good, right? You search for content with keyword “banana” and cache the result for 15 minutes.

But is it? Does your search result really get cached for 15 minutes?

As it might surprise you, it doesn’t. GetContentResult() caches the result by default (unlike GetResult() which does not), but it only caches for 1 minutes. Even thought you asked it to cache for 15 minutes.

The right way to do it is to use other overload which takes an int as parameter. That is the cache time in seconds, like this

SearchClient.Instance.Search<IContent>()
  .For("banana")
  .GetContentResult(900);

In upcoming version of Find, the surprise will be fixed, but it’s probably a good idea to specify the cache time out explicitly.

Name or Display name in Catalog UI: you can choose

Since the beginning of Catalog UI, it had always shown Name, in both Catalog Tree and the Catalog content list.

That, however, was changed to DisplayName since 13.14 due to a popular feature request here https://world.episerver.com/forum/developer-forum/Feature-requests/Thread-Container/2019/12/use-localized-catalog-in-commerce-catalog-ui/#214650

All is good and the change was positively received. However not every is happy with it – some want it the old way, i.e. `Name` to be displayed. From a framework perspective, it might be complex to let partners configure which field to display. But if you are willing to do some extra work, then it’s all easy.

Catalog content is transformed using CatalogContentModelTransform, this is where DisplayName is added to the data returned to the client. If you override that, you can set DisplayName to whatever you want, for example, Name.

Here is what the implementation would look like

using EPiServer.Cms.Shell.UI.Rest.Models.Transforms;
using EPiServer.Commerce;
using EPiServer.Commerce.Catalog;
using EPiServer.Commerce.Catalog.ContentTypes;
using EPiServer.Commerce.Catalog.Linking;
using EPiServer.Commerce.Shell.Rest;
using EPiServer.Framework.Localization;
using EPiServer.ServiceLocation;
using Mediachase.Commerce.Catalog;
using Mediachase.Commerce.Customers;
using Mediachase.Commerce.InventoryService;
using Mediachase.Commerce.Markets;
using Mediachase.Commerce.Pricing;

namespace EPiServer.Reference.Commerce.Site.Infrastructure
{
    [ServiceConfiguration(typeof(IModelTransform))]
    public class BlahBlahBlah : CatalogContentModelTransform
    {
        public BlahBlahBlah(ExpressionHelper expressionHelper, IPriceService priceService, IMarketService marketService, IInventoryService inventoryService, LocalizationService localizationService, ICatalogSystem catalogContext, IRelationRepository relationRepository, ThumbnailUrlResolver thumbnailUrlResolver, CustomerContext customerContext) : base(expressionHelper, priceService, marketService, inventoryService, localizationService, catalogContext, relationRepository, thumbnailUrlResolver, customerContext)
        {
        }

        public override TransformOrder Order
        {
            ///Yes, this is very important to make it work
            get { return base.Order + 1; }
        }

        protected override void TransformInstance(IModelTransformContext context)
        {
            var catalogContent = context.Source as CatalogContentBase;
            var properties = context.Target.Properties;

            if (catalogContent is NodeContent nodeContent)
            {
                properties["DisplayName"] = nodeContent.Name;
            }
            if (catalogContent is EntryContentBase entryContent)
            {
                properties["DisplayName"] = entryContent.Name;
            }
        }
    }
}

And here is how it looks




A few notes:

  • CatalogContentModelTransform, and other APIs in Commerce.Shell, are not considered public APIs, so they might change without notice. There is a risk for adding this, however, it’s quite low.
  • This (or the bug fix) does not affect breadcrumb, it has been, and still is, showing Name.

Export catalog, with linked assets

If you are already using a PIM system, you can stop reading!

If you have been using Commerce for a while, you probably have seen this screen – yes, in Commerce Manager

This allow you to export a catalog, but without a caveat: the exported catalog, most likely, does not contains any linked assets. The reason for that was the asset content types need to be present at the context of the site. In Commerce Manager, the general advice is to not deploy the content types there for simpler management.

Import/Export are also missing features in Catalog UI compared to Commerce Manager. I wish I could have added it, but given my Dojo skill, it’s better to write something UI-less, and here you go: a controller to let you download a catalog with everything attached. Well, here is the entire code that you can drop into your project and build:

using EPiServer.Data;
using EPiServer.Framework.Blobs;
using EPiServer.Logging;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Web.Http;
using EPiServer.Commerce.Catalog.ContentTypes;
using Mediachase.Commerce.Catalog;
using Mediachase.Commerce.Catalog.ImportExport;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml;

namespace EPiServer.Personalization.Commerce.CatalogFeed.Internal
{
    /// <summary>
    /// Download a catalog.
    /// </summary>
    public class CatalogExportController : ApiController
    {
        private readonly CatalogImportExport _importExport;
        private readonly IBlobFactory _blobFactory;
        private readonly IContentLoader _contentLoader;
        private readonly ReferenceConverter _referenceConverter;
        internal const string DownloadRoute = "episerverapi/catalogs/";
        private static readonly Guid _blobContainerIdentifier = Guid.Parse("119AD01E-ECD1-4781-898B-6DEC356FC8D8");

        private static readonly ILogger _logger = LogManager.GetLogger(typeof(CatalogExportController));

        /// <summary>
        /// Initializes a new instance of the <see cref="CatalogExportController"/> class.
        /// </summary>
        /// <param name="importExport">Catalog import export</param>
        /// <param name="blobFactory">The blob factory.</param>
        /// <param name="contentLoader">The content loader.</param>
        /// <param name="referenceConverter"></param>
        public CatalogExportController(CatalogImportExport importExport,
            IBlobFactory blobFactory,
            IContentLoader contentLoader,
            ReferenceConverter referenceConverter)
        {
            _importExport = importExport;
            _blobFactory = blobFactory;
            _contentLoader = contentLoader;
            _referenceConverter = referenceConverter;

            _importExport.IsModelsAvailable = true;
        }

        /// <summary>
        /// Direct download catalog export for admins.
        /// </summary>
        /// <param name="catalogName">Name of catalog to be exported.</param>
        /// <returns>
        /// Catalog.zip if successful else HttpResponseMessage containing error.
        /// </returns>
        [HttpGet]
        [Authorize(Roles = "CommerceAdmins")]
        [Route(DownloadRoute)]
        public HttpResponseMessage Index(string catalogName)
        {
            var catalogs = _contentLoader.GetChildren<CatalogContent>(_referenceConverter.GetRootLink());
            var catalog = catalogs.First(x => x.Name.Equals(catalogName, StringComparison.OrdinalIgnoreCase));
            if (catalog != null)
            {
                return GetFile(catalog.Name);
            }

            return new HttpResponseMessage
            { Content = new StringContent($"There is no catalog with name {catalogName}.") };
        }

        private HttpResponseMessage GetFile(string catalogName)
        {
            var container = Blob.GetContainerIdentifier(_blobContainerIdentifier);
            var blob = _blobFactory.CreateBlob(container, ".zip");
            using (var stream = blob.OpenWrite())
            {
                using (var zipArchive = new ZipArchive(stream, ZipArchiveMode.Create, false))
                {
                    var entry = zipArchive.CreateEntry("catalog.xml");

                    using (var entryStream = entry.Open())
                    {
                        _importExport.Export(catalogName, entryStream, Path.GetTempPath());
                    }
                }
            }

            var response = new HttpResponseMessage
            {
                Content = new PushStreamContent(async (outputStream, content, context) =>
                {
                    var fileStream = blob.OpenRead();

                    await fileStream.CopyToAsync(outputStream)
                        .ContinueWith(task =>
                        {
                            fileStream.Close();
                            outputStream.Close();

                            if (task.IsFaulted)
                            {
                                _logger.Error($"Catalog download failed", task.Exception);
                                return;
                            }

                            _logger.Information($"Feed download completed.");

                        });
                }, new MediaTypeHeaderValue("application/zip"))
            };
            return response;
        }
    }
}

And now you can access to this path http://yoursite.com/episerverapi/catalogs?catalogName=fashion to download the catalog named “Fashion”.

A few notes:

  • This requires Admin access, for obvious reasons. You will need to log in to your website first before accessing the path above
  • It can take some time for big catalogs, so be patient if that’s the case ;). Yes another approach is to have this as a scheduled job when you can export the catalog in background, but that make the selection of catalog to export much more complicated. If you have only one catalog, then go ahead!

Include/IncludeOn/Exclude/ExcludeOn: a simple explanation

When I come across this question https://world.episerver.com/forum/developer-forum/-Episerver-75-CMS/Thread-Container/2020/3/trouble-with-availablecontenttypesattribute-excludeonincludeon/ I was rather confused by the properties of AvailableContentTypesAttribute (admittedly I don’t use them that often!). Looking at the code that defined them, or the XML documentation does not really help. I only come to an understanding when I look into how they are used, and I guess many other developers, especially beginners, might have same confusion, so here’s a simple explanation.

Include : defines content types that can be created as children of a content of this type (being decorated by the attribute)

IncludeOn: defines content types that can be parent of a content of this type

Exclude: defines content types that can not be created as children of a content of this type

ExcludeOn: defines content types that can not be parent of a content of this type.

If there is a conflict between those properties, for example content type A has Include with content type B, and content type B has ExcludeOn with content type A, then Exclude and ExcludeOn take priority (i.e. they will override Include and IncludeOn. In the example above then content type B will not be able to be children of content type A)

While AvailableContentTypesAttribute is extremely helpful, the property naming is not the best – they are short and symmetric, but they are not easy to understand and remember. An “improved” example might be

CanBeParentOf

CanBeChildrenOf

CannotBeParentOf

CannotBeChildrenOf

Yes they are more verbose, but they are unambiguous and you will not have to check the document (or this blog post) when you use them.

This is not the first time we have something that rather confusing in our API. One notable example is the old (now removed) ILinksRepository with the Source and Target properties in Relation . For quite some time I always had to check the code to know what to use, and then had the documentation updated, and eventually, changed to Parent and Child. No API is created perfect, but we can improve over time.

Disabling Catalog Dto cache: maybe, don’t?

Recently (as recent as this morning) I was asked to look into a case when the Find indexing performance was subpar. Upon investigation – looking at a properly captured trace from dotTrace – it was clear that at least 30% percent of time was spending in loading the CatalogDto

This is something that should not happen, as the CatalogDto should have been cached. Also, a normal site should have very few catalogs, so the cache should be very effective. However, data does not lie – it has been hitting database a lot, and a quick check on the site settings revealed that the entire DTO cache has been indeed, disabled

 <Cache enabled="false" collectionTimeout="0:0:0" entryTimeout="0:0:0" nodeTimeout="0:0:0" schemaTimeout="0:0:0" /> 

By setting these timeout settings to 0, the cache is immediately invalidated, rendering them useless. The CatalogDto, therefore, is loaded everytime from database, causing the bottleneck.

The reason for setting those timeout to 0 was probably – I guess – to reduce the memory footprint of the site. However, Catalog DTOs are fairly small in size, and since Commerce 11, it has been smart enough to skip caching the DTOs if there is cache on a higher (content ) level, thanks to my colleague Magnus Rahl. So DTOs should not be of any concerns, if you are not actively using them (and in most of the case, you should not). By re-enabling the cache, the indexing time can be cut, at least 30%, according to the aforementioned trace.

As you might wonder, Catalog content provider still uses the DTOs internally, therefore it would load those for data.

Moral of the story:

  • The cache settings are there, but because you can, does not mean you should. I personally think cache settings should be as hidden as possible from accidental changes. Disabling cache, and in a lesser extend, changing default cache timeout, can have unforeseeable consequences. Only do so if you have strong reasons to do so. Or better, let us know why you need to do that, and we can figure out a compromise.

Looking for static class fields in Windbg

I am looking into the ever growing problem with LambdaExpression cache in Find, as reported here https://world.episerver.com/forum/developer-forum/Problems-and-bugs/Thread-Container/2019/9/episerver-find-lambdaexpressionextensions-_cache-keeps-growing-indefinately/ . One important part of analyzing cache is to understand how many items are in cache, and how is the cache hit ratio. I have received the memory dumps from our partners, time to fire up some Windbg. Luckily for us that is stored in the class as fields. Unluckily for us, the class in question is a static one, it is when you find out !dumpheap -type is not working for you.

The right way would be to use !name2ee

0:000> !name2ee episerver.find.dll EPiServer.Find.LambdaExpressionExtensions
Module:      00007ffd770bef80
Assembly:    EPiServer.Find.dll
Token:       000000000200000d
MethodTable: 00007ffd79c998e8
EEClass:     00007ffd79c8d268
Name:        EPiServer.Find.LambdaExpressionExtensions

!name2ee takes two parameters, the first one is the module name (basically the name of the assembly), and the second one is the name of the class. It is important to note that the class name is case sensitive, so you have to give it name with correct casing.

Now you have the EEClass and you just need to dump it using !Dumpclass

0:000> !DumpClass /d 00007ffd79c8d268
Class Name:      EPiServer.Find.LambdaExpressionExtensions
mdToken:         000000000200000d
File:            C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root5a589a\ddb1376c\assembly\dl3c58139
0:000> !DumpClass /d 00007ffd79c8d268
Class Name:      EPiServer.Find.LambdaExpressionExtensions
mdToken:         000000000200000d
File:            C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\775a589a\ddb1376c\assembly\dl3\26c58139\00ecff94_a3aed501\EPiServer.Find.dll
Parent Class:    00007ffd76045498
Module:          00007ffd770bef80
Method Table:    00007ffd79c998e8
Vtable Slots:    4
Total Method Slots:  7
Class Attributes:    100181  Abstract, 
Transparency:        Critical
NumInstanceFields:   0
NumStaticFields:     3
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffd7d0c4e28  4000009        8 ...egate, mscorlib]]  0   static 000001e6008e6768 _cache
00007ffd760d0d90  400000a      398         System.Int64  1   static 87633 _compiles
00007ffd760d0d90  400000b      3a0         System.Int64  1   static 34738206 _calls
ecff94_a3aed501\EPiServer.Find.dll Parent Class: 00007ffd76045498 Module: 00007ffd770bef80 Method Table: 00007ffd79c998e8 Vtable Slots: 4 Total Method Slots: 7 Class Attributes: 100181 Abstract, Transparency: Critical NumInstanceFields: 0 NumStaticFields: 3 MT Field Offset Type VT Attr Value Name 00007ffd7d0c4e28 4000009 8 ...egate, mscorlib]] 0 static 000001e6008e6768 _cache 00007ffd760d0d90 400000a 398 System.Int64 1 static 87633 _compiles 00007ffd760d0d90 400000b 3a0 System.Int64 1 static 34738206 _calls

And voilĂ !

Dynamic data store is slow, (but) you can do better.

If you have been developing with Episerver CMS for a while, you probably know about its embedded “ORM”, called Dynamic Data Store, or DDS for short. It allows you to define strongly typed types which are mapped to database directly to you. You don’t have to create the table(s), don’t have to write stored procedures to insert/query/delete data. Sounds very convenient, right? The fact is, DDS is quite frequently used, and more often than you might think, mis-used.

As Joel Spolsky once said Every abstraction is leaky, an ORM will likely make you forget about the nature of the RDBMS under neath, and that can cause performance problems, sometime severe problems.

Let me make it clear to you

DDS is slow, and it is not suitable for big sets of data.

If you want to store a few settings for your website, DDS should be fine. However, if you are thinking about hundreds of items, it is probably worth looking else. Thousands and more items, then it would be a NO.

I did spend some time trying to bench mark the DDS to see how bad it is. A simple test is to add 10.000 items to a store, then query by each item, then deleted by each item, to see how long does it take

The item is defined like this, this is just another boring POCO:

internal class ShippingArea : IDynamicData
{
    public Identity Id { get; set; }

    public string PostCode { get; set; }

    public string Area { get; set; }

    public DateTime Expires { get; set; }
}

The store is defined like this

    public class ShippingAreaStore
    {
        private const string TokenStoreName = "ShippingArea";

        internal virtual ShippingArea CreateNew(string postCode, string area)
        {
            var token = new ShippingArea
            {
                Id = Identity.NewIdentity(),
                PostCode = postCode,
                Area = area,
                Expires = DateTime.UtcNow.AddDays(1)
            };
            GetStore().Save(token);
            return token;
        }

        internal virtual IEnumerable<ShippingArea> LoadAll()
        {
            return GetStore().LoadAll<ShippingArea>();
        }

        internal virtual IEnumerable<ShippingArea> Find(IDictionary<string, object> parameters)
        {
            return GetStore().Find<ShippingArea>(parameters);
        }

        internal virtual void Delete(ShippingArea shippingArea)
        {
            GetStore().Delete(shippingArea);
        }

        internal virtual ShippingArea Get(Identity tokenId)
        {
            return GetStore().Load<ShippingArea>(tokenId);
        }

        private static DynamicDataStore GetStore()
        {
            return DynamicDataStoreFactory.Instance.CreateStore(TokenStoreName, typeof(ShippingArea));
        }
    }

Then I have some quick and dirty code in QuickSilver ProductController.Index to measure the time (You will have to forgive some bad coding practices here ;). As usual StopWatch should be used on demonstration only, it should not be used in production. If you want a good break down of your code execution, use tools like dotTrace. If you want to measure production performance, use some monitoring system like NewRelic or Azure Application Insights ):

        var shippingAreaStore = ServiceLocator.Current.GetInstance<ShippingAreaStore>();
        var dictionary = new Dictionary<string, string>();
        for (int i = 0; i < 10000; i++)
        {
            dictionary[RandomString(6)] = RandomString(10);
        }
        var identities = new List<ShippingArea>();
        var sw = new Stopwatch();
        sw.Start();
        foreach (var pair in dictionary)
        {
            shippingAreaStore.CreateNew(pair.Key, pair.Value);
        }
        sw.Stop();
        _logger.Error($"Creating 10000 items took {sw.ElapsedMilliseconds}");
        sw.Restart();
        foreach (var pair in dictionary)
        {
            Dictionary<string, object> parameters = new Dictionary<string, object>();
            parameters.Add("PostCode", pair.Key);
            parameters.Add("Area", pair.Value);
            identities.AddRange(shippingAreaStore.Find(parameters));
        }

        sw.Stop();
        _logger.Error($"Querying 10000 items took {sw.ElapsedMilliseconds}");
        sw.Restart();

        foreach (var id in identities)
        {
            shippingAreaStore.Delete(id);
        }
        sw.Stop();
        _logger.Error($"Deleting 10000 items took {sw.ElapsedMilliseconds}");

Everything is ready. So a few tries gave us a fairly stable result:

2019-12-02 13:33:01,574 Creating 10000 items took 11938

2019-12-02 13:34:59,594 Querying 10000 items took 118009

2019-12-02 13:35:24,728 Deleting 10000 items took 25131

And this is strictly single-threaded, the site will certainly perform worse when it comes to real site with a lot of traffic, and thus multiple insert-query-delete at the same time.

Can we do better?

There is a little better attribute that many people don’t know about DDS: you can mark a field as indexed, by adding [EPiServerDataIndex] attribute to the properties. The new class would look like this.

    [EPiServerDataStore]
    internal class ShippingArea : IDynamicData
    {
        public Identity Id { get; set; }

        [EPiServerDataIndex]
        public string PostCode { get; set; }

        [EPiServerDataIndex]
        public string Area { get; set; }

        public DateTime Expires { get; set; }
    }

If you peek into the database during the test, you can see that the data is now being written to Indexed_String01 and Indexed_String02 columns, instead of String01 and String02 as without the attributes. Such changes give us quite drastic improvement:

2019-12-02 15:38:16,376 Creating 10000 items took 7741

2019-12-02 15:38:19,245 Querying 10000 items took 2867

2019-12-02 15:38:44,266 Deleting 10000 items took 25019

The querying benefits greatly from the new index, as it no longer has to do a Clustered Index Scan, it can now do a non clustered index seek + Key look up. Deleting is still equally slow, because the delete is done by a Clustered Index delete on the Id column, which we already have, and the index on an Uniqueidentifier column is not the most effective one.

Before you are happy which such improvement, keep in mind that there are two indexes added for Indexed_String01 and Indexed_String02 separately. Naturally, we would want a combination, clustered even, on those columns, but we just can’t.

What if we want to go bare metal and create a table ourselves, write the query ourselves? Our repository would look like this

public class ShippingAreaStore2
    {
        private readonly IDatabaseExecutor _databaseExecutor;

        public ShippingAreaStore2(IDatabaseExecutor databaseExecutor)
        {
            _databaseExecutor = databaseExecutor;
        }

        /// <summary>
        /// Creates and stores a new token.
        /// </summary>
        /// <param name="blobId">The id of the blob for which the token is valid.</param>
        /// <returns>The id of the new token.</returns>
        internal virtual ShippingArea CreateNew(string postCode, string area)
        {
            var token = new ShippingArea
            {
                Id = Identity.NewIdentity(),
                PostCode = postCode,
                Area = area,
                Expires = DateTime.UtcNow.AddDays(1)
            };
            _databaseExecutor.Execute(() =>
            {
                var cmd = _databaseExecutor.CreateCommand();
                cmd.CommandText = "ShippingArea_Add";
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Id", token.Id.ExternalId));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("PostCode", token.PostCode));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Area", token.Area));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Expires", token.Expires));
                cmd.ExecuteNonQuery();
            });

            return token;
        }

        internal virtual IEnumerable<ShippingArea> Find(IDictionary<string, object> parameters)
        {
            return _databaseExecutor.Execute<IEnumerable<ShippingArea>>(() =>
            {
                var areas = new List<ShippingArea>();
                var cmd = _databaseExecutor.CreateCommand();
                cmd.CommandText = "ShippingArea_Find";
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("PostCode", parameters.Values.First()));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Area", parameters.Values.Last()));
                var reader = cmd.ExecuteReader();
                while (reader.Read())
                {
                    areas.Add(new ShippingArea
                    {
                        Id = (Guid)reader["Id"],
                        PostCode = (string)reader["PostCode"],
                        Area = (string)reader["Area"],
                        Expires = (DateTime)reader["Expires"]
                    });
                }
                return areas;
            });
        }

        /// <summary>
        /// Deletes a token from the store.
        /// </summary>
        /// <param name="token">The token to be deleted.</param>
        internal virtual void Delete(ShippingArea area)
        {
            _databaseExecutor.Execute(() =>
            {
                var cmd = _databaseExecutor.CreateCommand();
                cmd.CommandText = "ShippingArea_Delete";
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("PostCode", area.PostCode));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Area", area.Area));
                cmd.ExecuteNonQuery();
            });
        }
    }

And those would give us the results:

2019-12-02 16:44:14,785 Creating 10000 items took 2977

2019-12-02 16:44:17,114 Querying 10000 items took 2315

2019-12-02 16:44:20,307 Deleting 10000 items took 3190

Moral of the story?

DDS is slow and you should be avoid using it if you are working with fairly big set of data. If you have to use DDS for whatever reason, make sure to at least try to index the columns that you query the most.

And in the end of the days, hand-crafted custom table + query beats everything. Remember that you can use some tools like Dapper to do most of the works for you.