Dynamic data store is slow, (but) you can do better.

If you have been developing with Episerver CMS for a while, you probably know about its embedded “ORM”, called Dynamic Data Store, or DDS for short. It allows you to define strongly typed types which are mapped to database directly to you. You don’t have to create the table(s), don’t have to write stored procedures to insert/query/delete data. Sounds very convenient, right? The fact is, DDS is quite frequently used, and more often than you might think, mis-used.

As Joel Spolsky once said Every abstraction is leaky, an ORM will likely make you forget about the nature of the RDBMS under neath, and that can cause performance problems, sometime severe problems.

Let me make it clear to you

DDS is slow, and it is not suitable for big sets of data.

If you want to store a few settings for your website, DDS should be fine. However, if you are thinking about hundreds of items, it is probably worth looking else. Thousands and more items, then it would be a NO.

I did spend some time trying to bench mark the DDS to see how bad it is. A simple test is to add 10.000 items to a store, then query by each item, then deleted by each item, to see how long does it take

The item is defined like this, this is just another boring POCO:

internal class ShippingArea : IDynamicData
{
    public Identity Id { get; set; }

    public string PostCode { get; set; }

    public string Area { get; set; }

    public DateTime Expires { get; set; }
}

The store is defined like this

    public class ShippingAreaStore
    {
        private const string TokenStoreName = "ShippingArea";

        internal virtual ShippingArea CreateNew(string postCode, string area)
        {
            var token = new ShippingArea
            {
                Id = Identity.NewIdentity(),
                PostCode = postCode,
                Area = area,
                Expires = DateTime.UtcNow.AddDays(1)
            };
            GetStore().Save(token);
            return token;
        }

        internal virtual IEnumerable<ShippingArea> LoadAll()
        {
            return GetStore().LoadAll<ShippingArea>();
        }

        internal virtual IEnumerable<ShippingArea> Find(IDictionary<string, object> parameters)
        {
            return GetStore().Find<ShippingArea>(parameters);
        }

        internal virtual void Delete(ShippingArea shippingArea)
        {
            GetStore().Delete(shippingArea);
        }

        internal virtual ShippingArea Get(Identity tokenId)
        {
            return GetStore().Load<ShippingArea>(tokenId);
        }

        private static DynamicDataStore GetStore()
        {
            return DynamicDataStoreFactory.Instance.CreateStore(TokenStoreName, typeof(ShippingArea));
        }
    }

Then I have some quick and dirty code in QuickSilver ProductController.Index to measure the time (You will have to forgive some bad coding practices here ;). As usual StopWatch should be used on demonstration only, it should not be used in production. If you want a good break down of your code execution, use tools like dotTrace. If you want to measure production performance, use some monitoring system like NewRelic or Azure Application Insights ):

        var shippingAreaStore = ServiceLocator.Current.GetInstance<ShippingAreaStore>();
        var dictionary = new Dictionary<string, string>();
        for (int i = 0; i < 10000; i++)
        {
            dictionary[RandomString(6)] = RandomString(10);
        }
        var identities = new List<ShippingArea>();
        var sw = new Stopwatch();
        sw.Start();
        foreach (var pair in dictionary)
        {
            shippingAreaStore.CreateNew(pair.Key, pair.Value);
        }
        sw.Stop();
        _logger.Error($"Creating 10000 items took {sw.ElapsedMilliseconds}");
        sw.Restart();
        foreach (var pair in dictionary)
        {
            Dictionary<string, object> parameters = new Dictionary<string, object>();
            parameters.Add("PostCode", pair.Key);
            parameters.Add("Area", pair.Value);
            identities.AddRange(shippingAreaStore.Find(parameters));
        }

        sw.Stop();
        _logger.Error($"Querying 10000 items took {sw.ElapsedMilliseconds}");
        sw.Restart();

        foreach (var id in identities)
        {
            shippingAreaStore.Delete(id);
        }
        sw.Stop();
        _logger.Error($"Deleting 10000 items took {sw.ElapsedMilliseconds}");

Everything is ready. So a few tries gave us a fairly stable result:

2019-12-02 13:33:01,574 Creating 10000 items took 11938

2019-12-02 13:34:59,594 Querying 10000 items took 118009

2019-12-02 13:35:24,728 Deleting 10000 items took 25131

And this is strictly single-threaded, the site will certainly perform worse when it comes to real site with a lot of traffic, and thus multiple insert-query-delete at the same time.

Can we do better?

There is a little better attribute that many people don’t know about DDS: you can mark a field as indexed, by adding [EPiServerDataIndex] attribute to the properties. The new class would look like this.

    [EPiServerDataStore]
    internal class ShippingArea : IDynamicData
    {
        public Identity Id { get; set; }

        [EPiServerDataIndex]
        public string PostCode { get; set; }

        [EPiServerDataIndex]
        public string Area { get; set; }

        public DateTime Expires { get; set; }
    }

If you peek into the database during the test, you can see that the data is now being written to Indexed_String01 and Indexed_String02 columns, instead of String01 and String02 as without the attributes. Such changes give us quite drastic improvement:

2019-12-02 15:38:16,376 Creating 10000 items took 7741

2019-12-02 15:38:19,245 Querying 10000 items took 2867

2019-12-02 15:38:44,266 Deleting 10000 items took 25019

The querying benefits greatly from the new index, as it no longer has to do a Clustered Index Scan, it can now do a non clustered index seek + Key look up. Deleting is still equally slow, because the delete is done by a Clustered Index delete on the Id column, which we already have, and the index on an Uniqueidentifier column is not the most effective one.

Before you are happy which such improvement, keep in mind that there are two indexes added for Indexed_String01 and Indexed_String02 separately. Naturally, we would want a combination, clustered even, on those columns, but we just can’t.

What if we want to go bare metal and create a table ourselves, write the query ourselves? Our repository would look like this

public class ShippingAreaStore2
    {
        private readonly IDatabaseExecutor _databaseExecutor;

        public ShippingAreaStore2(IDatabaseExecutor databaseExecutor)
        {
            _databaseExecutor = databaseExecutor;
        }

        /// <summary>
        /// Creates and stores a new token.
        /// </summary>
        /// <param name="blobId">The id of the blob for which the token is valid.</param>
        /// <returns>The id of the new token.</returns>
        internal virtual ShippingArea CreateNew(string postCode, string area)
        {
            var token = new ShippingArea
            {
                Id = Identity.NewIdentity(),
                PostCode = postCode,
                Area = area,
                Expires = DateTime.UtcNow.AddDays(1)
            };
            _databaseExecutor.Execute(() =>
            {
                var cmd = _databaseExecutor.CreateCommand();
                cmd.CommandText = "ShippingArea_Add";
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Id", token.Id.ExternalId));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("PostCode", token.PostCode));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Area", token.Area));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Expires", token.Expires));
                cmd.ExecuteNonQuery();
            });

            return token;
        }

        internal virtual IEnumerable<ShippingArea> Find(IDictionary<string, object> parameters)
        {
            return _databaseExecutor.Execute<IEnumerable<ShippingArea>>(() =>
            {
                var areas = new List<ShippingArea>();
                var cmd = _databaseExecutor.CreateCommand();
                cmd.CommandText = "ShippingArea_Find";
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("PostCode", parameters.Values.First()));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Area", parameters.Values.Last()));
                var reader = cmd.ExecuteReader();
                while (reader.Read())
                {
                    areas.Add(new ShippingArea
                    {
                        Id = (Guid)reader["Id"],
                        PostCode = (string)reader["PostCode"],
                        Area = (string)reader["Area"],
                        Expires = (DateTime)reader["Expires"]
                    });
                }
                return areas;
            });
        }

        /// <summary>
        /// Deletes a token from the store.
        /// </summary>
        /// <param name="token">The token to be deleted.</param>
        internal virtual void Delete(ShippingArea area)
        {
            _databaseExecutor.Execute(() =>
            {
                var cmd = _databaseExecutor.CreateCommand();
                cmd.CommandText = "ShippingArea_Delete";
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("PostCode", area.PostCode));
                cmd.Parameters.Add(_databaseExecutor.CreateParameter("Area", area.Area));
                cmd.ExecuteNonQuery();
            });
        }
    }

And those would give us the results:

2019-12-02 16:44:14,785 Creating 10000 items took 2977

2019-12-02 16:44:17,114 Querying 10000 items took 2315

2019-12-02 16:44:20,307 Deleting 10000 items took 3190

Moral of the story?

DDS is slow and you should be avoid using it if you are working with fairly big set of data. If you have to use DDS for whatever reason, make sure to at least try to index the columns that you query the most.

And in the end of the days, hand-crafted custom table + query beats everything. Remember that you can use some tools like Dapper to do most of the works for you.

Listing permissions per user/group

This week I came cross this question on Episerver World forum https://world.episerver.com/forum/developer-forum/Episerver-Commerce/Thread-Container/2019/5/get-rolepermission-data/ , and while it is not Commerce-related. it is quite interesting to solve. Perhaps this short post will help the original poster, as well future visitors.

As in the thread, I replied the first piece to solve the puzzle:


You can use PermissionTypeRepository to get the registered PermissionTypes, then PermissionRepository to figure out which groups/users have a specific permission 

If you want to list permissions granted to a specific role or user, it is just a simple reversion using a dictionary:

            var rolePermissionMap = new Dictionary<string, HashSet<PermissionType>>(StringComparer.OrdinalIgnoreCase);
            var permissionTypes = _permissionTypeRepository.List();
            foreach (var permissionType in permissionTypes)
            {
                var securityEntities = _permissionRepository.GetPermissions(permissionType);
                foreach (var securityEntity in securityEntities)
                {
                    if (rolePermissionMap.ContainsKey(securityEntity.Name))
                    {
                        rolePermissionMap[securityEntity.Name].Add(permissionType);
                    }
                    else
                    {
                        rolePermissionMap[securityEntity.Name] = new HashSet<PermissionType>() { permissionType };
                    }
                }
            }

As suggested above, we use
PermissionTypeRepository to list the registered PermissionType(s) , and then for each PermissionType we get the list of SecurityEntity it is granted for. A SecurityEntity can be an user, a group, or a virtual role, and is identified by the name. For purpose of demonstration, we only use names: For each SecurityEntity granted a permission, we check if it is in our dictionary already, if yes, then add the permission to the list, otherwise add a new entry.

Simple, eh?

Unless if you are assigning/un-assigning permissions a lot, it is probably a good idea to keep this Dictionary in cache for some time, because it is not exactly cheap to build.

Watch out for Singletons

If you are a seasoned Episerver developer, you should (and probably, already) know about the foundation of the framework: dependency injection. With the Inversion of control framework (most common, Structuremap, but recent versions of Framework allow much more flexible options), you can easily register your implementations, without having to manually create each and every instance by new operator. Sounds great, right? Yes it is.

And Episerver Framework allows you to make it even easier by this nice ServiceConfiguration attribute:

[ServiceConfiguration]
public class MyClass 
{
}

so your class will be automatically registered, and whenever you need an instance of MyClass, IoC framework will get the best instance for you, automatically, without breaking a sweat. Isn’t it nice? Yes it is.

But I guess you also saw this from place to place

[ServiceConfiguration(LifeCycle = ServiceInstanceScope.Singleton)]
public class MyClass 
{
}

So instead of creating a new instance every time you ask it to, IoC framework only creates the instance once and reuses it every time. You must think to yourself: even nicer, that would save you a lot of time and memory.

But is it (nicer)?

You might want to think again.

Singleton means one thing: shared state (or even worse, Global state). When a class is marked with Singleton, the instance of that class is supposed to be shared across the site. The upside, is, well, if your constructor is expensive to create, you can avoid just that. The downside, of course, shared state can be a real b*tch and it might come back to bite you. What if MyClass holds the customer address of current user. If I set my address to that, and because you get the same instance, you’ll gonna see mine. In Sweden it’s not a real problem as you can easily know where I live (even my birthday if you want to send a gift, or flowers), but I guess in the bigger parts of the world that is a serious privacy problem. And what if it’s not just address?

And it’s not just that, Singleton also make things complicated with “inherited singleton”. Let’s take a look at previous example. Now we see Singleton is bad, so let’s remove that on our class. But what if other class depends on our little MyClass:

[ServiceConfiguration(LifeCycle = ServiceInstanceScope.Singleton)]
public class MyOtherClass 
{
   private MyClass _myClass;
   public MyOtherClass(MyClass myClass)
   {
        _myClass = myClass;
   }
}

Now I hope you see where the problem is. One instance of MyOtherClass is shared inside the side. And it comes with an attached MyClass instance. Even if you don’t intend to, that MyClass instance will also be shared. Same problem after all.

Singleton was there to solve one problem (or two), but it can also introduce other problems if you don’t really think about if your instance should be shared or not. And not just your class, the classes which have dependency on your class as well.

And it’s not just Singleton , HttpContext and Hybrid might also subject to same problem, but to a lesser extend. Any lifecycle that shares state should be considered: if you really need it and what you are sharing.

Lifecycle is hard, but it can also work wonder, so please take your time to make it right. It’s worth it.

Adding backslash ending to your URLs by UrlRewrite

It’s generally a best practice to add a backslash to your URLs, for performance reasons (let’s get back to that later in another post). There are several ways to do it, but the best/simplest way, IMO, is using UrlRewrite module. In most of the case, processing the Urls before it reaching the server code will be most effective, and here by using UrlRewrite we trust IIS to do the right thing. Our application does not even need to know about it. It’s also a matter of reusable. You can simply copy a working, well tested rule to another sites without having to worry much about compatibility.

Before getting started, it’s worth noting that UrlRewrite is an IIS module which is not installed by default, if you want to use it, then you would have to install it explicitly. Download and install it from https://www.iis.net/downloads/microsoft/url-rewrite .

And then we can simple add rules to <system.webServer><rewrite><rules> section in web.config. For the purpose of this post, this rule should be enough:

<rule name="Add trailing slash" stopProcessing="true">
  <match url="(.*[^/])$" />
  <conditions>
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
  </conditions>
  <action type="Redirect" redirectType="Permanent" url="{R:1}/" />
</rule>

If you are new to UrlRewrite, may be it’s useful to explain a bit on the rule. Here we want to add a rule which matches every url, except the urls which already end with backslash (/) already.

Continue reading “Adding backslash ending to your URLs by UrlRewrite”

Maintaining your indexes

Indexes are crucial to SQL Server performance. Having the right indexes might make the difference of day and night with your application performance – as I once talked here.

However, even having the right indexes is not everything. You have to keep them healthy. Indexes, as any other kinds of storage, is subjected to fragmentation. SQL Server works best if the index structure is compact and continuous, but with all of the inserts/updates/deletes, it’s inevitable to get fragmented. When the fragmentation grows, it starts affecting the performance of SQL Server: Instead of having to read just one page, it now have to read two, which increases both time and resource needed, and so on and so forth.

Continue reading “Maintaining your indexes”

Episerver caching issue with .NET 4.7

Update 1: The bug is fixed in .NET 4.7.1 (thanks to Pascal van der Horst for the information)

Update 2: The related bug is fixed in CMS Core 10.10.2 and 9.12.5. If upgrading to that version is not an option, you can contact Episerver support service for further assistance.

Original post:

If you are using Episerver and update to .NET 4.7 (even involuntarily, such as you are using DXC/Azure to host your websites. Microsoft updated Azure to .NET 4.7 on June 26th) , you might notice some weird performance issues. If your servers are in Europe, Asia or Australia, then you can see a peak in memory usage. If your servers in North America, then you can see the number of database calls increased. In both cases, your website performance is affected, the former can cause your websites to constantly restarts as memory usage reaches a threshold limit, and even more obvious in the latter. Why?

It was a known issue in .NET 4.7, as mentioned here: https://support.microsoft.com/en-us/help/4035412/fix-expiration-time-issue-when-you-insert-items-by-using-the-cache-ins

Continue reading “Episerver caching issue with .NET 4.7”

Episerver CMS performance optimization – part 1

Update: In Episerver CMS 11, released today (November 21st 2017), the simpleaddress router has been moved to the last of the route table.

Original post:

This is an unusual post – it is not about Commerce – my area of expertise, but about CMS. Recently I’ve been working on some support cases where SQL Server instance is on high utilization, and in some scenarios it eventually slows down the site. After investigation, it’s likely to come from a small, simple and helpful feature: Simple address.

CMS content can have a property named “Simple address”, which allows you to create a “shortcut” url for that content. So if you have one page with “name in url” as “contact-us” under a page name “about-us” under Home page, you can access it via https://mysupercoolsite.com/en/about-us/contact-us. Or you can set the Simple address for that page as “contact-us”, and then you can access it directly via https://mysupercoolsite.com/contact-us.

Continue reading “Episerver CMS performance optimization – part 1”