Fix your Search & Navigation (Find) indexing job, please

Once upon a time, a colleague asked me to look into a customer database with weird spikes in database log usage. (You might start to wonder why I am always the one who looks into weird things – is there a pattern here?)

Upon reviewing the query store, I noticed very high logical reads related to tblScheduledItem. From past experience, this was likely caused by index fragmentation in that table (which has only one clustered index). I took a quick look at the table and confirmed the index indeed had high fragmentation. I suggested rebuilding that index and seeing what happens. It could have been one of those simple, quick daily questions, and I almost forgot about it.

A few days passed, and the colleague pinged me again. Apparently they had rebuilt the index, but it did not really help. That raised my eyebrows a little, so I dug deeper.

To my surprise, the problem was not really fragmentation (although it definitely contributed). The problem is that the table has a column of type nvarchar(max), used for recording the last text from the Execute method of the scheduled job. It was meant for something short like “The job failed successfully”, or “Deleted 12345 versions from 1337 contents”. But because it’s nvarchar(max), it can be very, very long. You could, in theory, store the entire contents of a few very big libraries in there.

Of course, just because you can does not mean you should. When your column is long, each read from the table becomes a burden to SQL Server. And the offending job was nothing less than our S&N indexing job.

In theory, any job could have caused this issue; however, it’s much more likely to happen with the S&N indexing job for a reason – it keeps track of every exception thrown during the indexing process, and because it indexes each and every content of your website (except the ones you specifically, explicitly tell it not to), the chance of it running into a recurring issue that affects multiple (read: a lot of) contents is much higher than for any other built-in job.

I asked, this time, to trim the column, and most importantly, to fix any exceptions that might be thrown during indexing. I was on my day off when my colleague notified me that the job had been running for 10 hours without errors after they fixed it. Curious, I checked some statistics. Well, let those screenshots speak for themselves:

The query itself went from 16,000ms to a mere 2.27ms. Even better, each call to get the list of scheduled jobs previously resulted in 3.5GB of logical reads. Now? 100KB. A lot of resources saved!

So, make sure your job is not throwing a lot of errors. And fix your S&N indexing job.

P/S: I do think the S&N indexing job should have a simpler return result. Maybe “Indexed 100.000 contents with 1234 errors”, with the exceptions logged instead. But that’s debatable. For now, you can do your part!

Migrate Catalog content properties

A colleague asked me yesterday – how do we migrate properties of catalog content? There is, unfortunately, no official way to do it. There are several unofficial ways, however. Today we will explore the one I personally recommend – for its safety and backward compatibility.

Let’s say we have FashionProduct with an MSRP property of type Money, and now we want to change it to Decimal. There are some hacky ways to do this, but all of them require direct database manipulation, which we should avoid if possible.

First we will need this piece of code. It was “stolen” from a colleague of mine and has been used countless times. You probably want to bookmark it as it’ll likely be useful in the future (I should probably do so myself as I have to find it every time I need it). It is a snippet to traverse the catalog structure based on the content type you want.

public virtual IEnumerable<T> GetEntriesRecursive<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : EntryContentBase
{
    foreach (var nodeContent in LoadChildrenBatched<NodeContent>(parentLink, defaultCulture))
    {
        foreach (var entry in GetEntriesRecursive<T>(nodeContent.ContentLink, defaultCulture))
        {
            yield return entry;
        }
    }

    foreach (var entry in LoadChildrenBatched<T>(parentLink, defaultCulture))
    {
        yield return entry;
    }
}

private IEnumerable<T> LoadChildrenBatched<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : IContent
{
    var start = 0;

    while (true)
    {
        var batch = _contentLoader.GetChildren<T>(parentLink, defaultCulture, start, 50);
        if (!batch.Any())
        {
            yield break;
        }

        foreach (var content in batch)
        {
            // Don't include linked products to avoid including them multiple times when traversing the catalog
            if (!parentLink.CompareToIgnoreWorkID(content.ParentLink))
            {
                continue;
            }

            yield return content;
        }
        start += 50;
    }
}

To make sure we don’t load too many contents at once, the batch size is set to 50, but that is of course configurable (up to you)!

Now the fun part, where it actually does the work. Once we have the content, we will need to actually migrate the data, which can be as simple as this

private void MigrateProperty<T>(IEnumerable<T> contents) where T : EntryContentBase
{
    var batch = new List<T>();
    foreach (var content in contents)
    {
        var writableClone = content.CreateWritableClone<T>();
        Transform(writableClone);
        batch.Add(writableClone);
    }

    _contentRepository.Publish(batch, PublishAction.SyncDraft);
}

With the Transform method you can do whatever you want with the property value. If you just want to rename the property, it can do nothing more than assign the value to the new property. Or, in the case we mentioned at the beginning, converting Money to Decimal is an easy task (Money is a less precise version of Decimal). Note that if you convert between data types, for example from double to int, there is potential data loss, but you are probably aware of that already.
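As an illustration, a minimal Transform for the MSRP scenario could look like the sketch below, assuming we run the migration with T being FashionProduct. The property names (Msrp as the old Money property, MsrpDecimal as the new Decimal one) are hypothetical – adjust them to your own model.

private void Transform(FashionProduct product)
{
    // Hypothetical properties: Msrp is the old Money property, MsrpDecimal is the new decimal one
    if (product.Msrp != null)
    {
        // Money wraps an Amount (decimal) and a Currency; we only keep the amount here
        product.MsrpDecimal = product.Msrp.Amount;
    }
}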

The final step is to publish the changes. For performance reasons, it is probably best to use the Publish extension method of IContentRepository and save multiple contents in one batch – maybe of size 50 or 100. That will skip things like creating new versions, for optimal performance. You can read about it here: New simple batch saving API for Commerce | Optimizely Developer C
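If your catalog is large, you probably don’t want to build one giant list in memory either. Below is a rough sketch of saving in batches of 50, reusing the Publish call and the Transform method from above – MigratePropertyInBatches is just an illustrative name, not an official API.

private void MigratePropertyInBatches<T>(IEnumerable<T> contents, int batchSize = 50) where T : EntryContentBase
{
    var batch = new List<T>(batchSize);
    foreach (var content in contents)
    {
        var writableClone = content.CreateWritableClone<T>();
        Transform(writableClone);
        batch.Add(writableClone);

        if (batch.Count == batchSize)
        {
            // Publish the current batch and start a new one
            _contentRepository.Publish(batch, PublishAction.SyncDraft);
            batch.Clear();
        }
    }

    // Publish whatever is left over
    if (batch.Count > 0)
    {
        _contentRepository.Publish(batch, PublishAction.SyncDraft);
    }
}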

The remaining question is where to put it. In a perfect world, I’d say in a migration step (i.e. a class that implements IMigrationStep), so you ensure that your data is properly migrated before anything else runs, for example your new code that accesses the new property, or indexing of your content after migration. But if you have a sizeable catalog, this will take time, and it might not be a good idea to let your users wait for it to complete. In that case it makes sense to do this in a scheduled job, and when it completes, you make the switch.
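For the scheduled-job route, a minimal skeleton wiring the snippets above together might look like this. The job name, the FashionProduct type and starting the traversal from ReferenceConverter.GetRootLink() are my assumptions, and error handling/status reporting are left out for brevity.

[ScheduledPlugIn(DisplayName = "Migrate MSRP property")]
public class MigrateMsrpJob : ScheduledJobBase
{
    private readonly IContentLoader _contentLoader;
    private readonly IContentRepository _contentRepository;
    private readonly ReferenceConverter _referenceConverter;

    public MigrateMsrpJob(IContentLoader contentLoader, IContentRepository contentRepository, ReferenceConverter referenceConverter)
    {
        _contentLoader = contentLoader;
        _contentRepository = contentRepository;
        _referenceConverter = referenceConverter;
    }

    public override string Execute()
    {
        // Traverse the catalog from the root and migrate every FashionProduct found
        var entries = GetEntriesRecursive<FashionProduct>(_referenceConverter.GetRootLink(), CultureInfo.InvariantCulture);
        MigrateProperty(entries);
        return "Migration completed";
    }

    // GetEntriesRecursive, LoadChildrenBatched, MigrateProperty and Transform from the snippets above go here
}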

Migrating properties is not an easy or quick task, but it can be done with relative ease. It is also a reminder about modeling – try to get it right from the beginning so we don’t have to migrate. In the end, the fastest code is the code that does not need to run!

Command timeout for Commerce 14

While we always want fast database queries, that is not achievable all the time. Sometimes we need to run slow queries, and we need to tell the underlying framework that this query can take some time to complete, and we’re fine with it. Otherwise, it will try to terminate the query after 30 seconds (the default timeout limit).

There is a difference between connection timeout and command timeout. Connection timeout is the time .NET will try to connect to the database before giving up. Command timeout is the time .NET will try to execute a command before giving up.

With Commerce 13, we had a setting added in 9.23.1, as we talked about here: Episerver Commerce commandTimeout configuration – Quan Mai’s blog (vimvq1987.com). However, in Commerce 14, it’s … different.

Things are a bit complicated when it comes to command timeout with .NET 5 and up. With later versions of Microsoft.Data.SqlClient, it is possible to set the command timeout directly in the connection string. It is indeed a simple way to do it, but with a caveat.

The new setting is not recognized by Entity Framework/Entity Framework Core, and it will throw an exception if you try to use a connection string containing the command timeout setting. EF has its own way to set the command timeout per DbContext, but it does not accept the setting via the connection string – it will throw “Keyword not supported: ‘command timeout'” if such a setting is present.

The workaround is to configure the command timeout for the EcfSqlConnection connection string, and use a separate connection string without command timeout just for Entity Framework.
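In appsettings.json that could look roughly like this. The server/database values are placeholders, the timeout value is just an example, and the second connection string name is hypothetical – how Entity Framework is pointed at it is outside the scope of this post.

"ConnectionStrings": {
  // Used by Commerce – command timeout set directly in the connection string
  "EcfSqlConnection": "Server=.;Database=Commerce;Trusted_Connection=True;Command Timeout=120",
  // Hypothetical extra connection string for Entity Framework – same database, no Command Timeout keyword
  "EcfEntityFrameworkConnection": "Server=.;Database=Commerce;Trusted_Connection=True"
}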

However, that comes with a caveat: using command timeout in the connection string means the value applies to all database queries. As we discussed in the previous post above, doing so is not without drawbacks – it hides slow queries rather than letting them fail. A failed query might not bring down your website, but an overloaded database likely will.

In Commerce 14.15.6, which should be released shortly, we introduce a new setting, SqlDataProviderOptions.CommandTimeout, which lets you set the command timeout for queries that use SqlDataProvider – most notably the MetaDataPlus system, like orders.

The important advantage of the new setting is that you can set it on the fly. If you know that some operation will be slow, you can set the command timeout to a higher value just for that operation, then set it back to the default value. In most cases, you can leave it at the default value (30 seconds) and do optimization at the application level (reduce batch size, for example) or the database layer (rewrite the query, add/optimize indexes, etc.). But sometimes you know the query will be slow and you acknowledge that – this is the tool.
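A rough sketch of that pattern is below – I’m assuming the options instance can be resolved from the container and that CommandTimeout is a TimeSpan; verify the actual type against your Commerce 14.15.6+ package.

// Sketch only – assumes SqlDataProviderOptions is resolvable and CommandTimeout is a TimeSpan
var sqlOptions = ServiceLocator.Current.GetInstance<SqlDataProviderOptions>();
var defaultTimeout = sqlOptions.CommandTimeout;

sqlOptions.CommandTimeout = TimeSpan.FromSeconds(180);
try
{
    // run the known-to-be-slow MetaDataPlus/order operation here
}
finally
{
    // restore the default so other queries are not affected
    sqlOptions.CommandTimeout = defaultTimeout;
}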

Performance optimization – the hardcore series – part 3

“In 99% of the cases, premature optimization is the root of all evil”

This quote is usually attributed to Donald Knuth, who is regarded as the “father of the analysis of algorithms”. His actual quote is a bit different:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3%.

If you have read my posts, you know that I always ask you to measure your application before diving into optimization. But that’s not the whole story. Without profiling, your optimization effort might be futile. Still, there are things you can “optimize” right away without any profiling – because they are easy to do, they make your code simpler and easier to follow, and you can be certain they are faster.

Let’s see if you can spot the potentially problematic piece of code in this snippet

public Something GetData()
{
    var market = list.FirstOrDefault(x => x.MarketId == GetCurrentMarket().MarketId);
    //do some stuffs with market
}

If you are writing similar code, don’t be discouraged. It’s easy to overlook the problem – when you call FirstOrDefault, you actually iterate over the list until you find the first matching element. And for each and every of those iterations, GetCurrentMarket() will be called.

Because we can’t be sure when we will find the matching element – it might be the first element, or the last, or it might not exist at all, or anywhere in between – on average GetCurrentMarket will be called half the size of the list times.

We don’t know if GetCurrentMarket is a very lightweight implementation, or if list is a very small set, but we do know that if this is in a hot path, the cost can be (very) significant. These are the allocations made by said GetCurrentMarket:

This is a custom implementation of IMarketService – the default implementation is much more lightweight and should not be of concern. Of course, fewer calls are always better – no matter how quick something is.

In this specific example, a simple call to get the current market and store it in a local variable to be used in the scope of the entire method should be enough. You don’t need profiling to make such an “optimization” (and as we proved, profiling only confirms our suspicion).
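Applied to the snippet above, a minimal sketch of the fix looks like this:

public Something GetData()
{
    // Resolve the current market once, outside the lambda
    var currentMarketId = GetCurrentMarket().MarketId;
    var market = list.FirstOrDefault(x => x.MarketId == currentMarketId);
    //do some stuffs with market
}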

Moral of the story

  • For optimization, less is almost always more
  • You definitely should profile before spending any considerable amount of time optimizing your code. But there are things you can optimize almost automatically. Make them your habit.

Performance optimization – the hardcore series – part 2

Earlier we started a new series about performance optimization, here: Performance optimization – the hardcore series – part 1 – Quan Mai’s blog (vimvq1987.com). There are a ton of places where things can go wrong. A seasoned developer can, from experience, avoid some obvious performance mistakes. But as we will soon learn, a small thing can make a huge impact if it is called repeatedly, and a big thing might be OK to use as long as it is called only once.

Let’s take this example – what do you think about this snippet? CategoryIds is a list of strings converted from ContentReference.

if (CategoryIds.Any(x => new ContentReference(x).ToReferenceWithoutVersion() == contentLink))
{
    //do stuff
}

If this is in any “cool” path that runs a few hundred times a day, you will be fine. It’s not “elegant”, but it works, and maybe you can get away with it. However, if it is in a hot path that is executed every time a visitor visits a product page on your website, it can create a huge problem.

And can you guess what it is?

new ContentReference(string) is fairly lightweight, but if it is called a lot, this is what happens. These are the allocations from the constructor alone, and only within the 220 seconds of the trace:

A lot of allocations which could have been avoided if CategoryIds were just an IEnumerable<ContentReference> instead of IEnumerable<string>.

For comparison, this is how 10,000 and 1,000,000 new ContentReference calls would allocate:

Things are similar if you use .ToReferenceWithoutVersion() to compare with another ContentReference (although to a lesser extent, as ToReferenceWithoutVersion returns the same ContentReference if the WorkId is 0, and it uses cloning instead of new). The correct way to compare two instances of ContentReference without caring about versions is to use .Compare with ContentReferenceComparer.IgnoreVersion.
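A sketch of how the check could look once CategoryIds is an IEnumerable<ContentReference> – I’m using the equality-comparer form of ContentReferenceComparer.IgnoreVersion here, so double-check the exact member available in your CMS version:

// Sketch – assumes CategoryIds is now IEnumerable<ContentReference>
if (CategoryIds.Any(x => ContentReferenceComparer.IgnoreVersion.Equals(x, contentLink)))
{
    //do stuff
}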

Moral of the story

  • It is not only what you do, but also how you do it
  • Small things can make big impacts, don’t guess, measure!

Performance optimization – the hardcore series – part 1

Hi again everybody. New day – new thing to write about. Today we will talk about memory allocation, and the effect it has on your website performance. With .NET, memory allocations are usually overlooked because the CLR handles that for you. Except in the rare cases where you handle unmanaged resources and have to be conscious about releasing them yourself, it’s usually a fire-and-forget approach.

The truth is, it is more complicated than that. The more objects you create, the more memory you need, and the more time the CLR needs to clean up after you. While you might have written code that executes blazingly fast in your benchmarks, in reality your website might still struggle to perform well in the long run – and that’s because of garbage collection. Occasional GC is not a concern – it’s the nature of the .NET CLR – but frequent GC, especially Gen 2 GC, is definitely something you should look into and fix, because it negatively affects your website performance.

The follow-up question – how do you fix that?

Of course, the first step is always measuring the memory allocations of your app. Locally you can use something like JetBrains dotMemory to profile your website, but that has a big caveat – you can’t really mimic the actual traffic to your website. Sure, it is very helpful for profiling something like a scheduled job, but it is less than optimal for seeing how your website performs in reality. To do that, we need another tool, and I’ve found nothing better than Application Insights Profiler traces on Azure. It will sample your website periodically, taking ETL (event trace log) files over 220 seconds. (Note: depending on your .NET version, you might download a .diagsession or a .netperf.zip file from Application Insights, but they are essentially the same inside – zipped .ETL.) Those files are extremely informative; they contain a whole load of information which might be overwhelming if you’re new, but take small steps and you’ll get there.

To open an ETL file, the common tool is PerfView (microsoft/perfview: PerfView is a CPU and memory performance-analysis tool (github.com)). Yes, it has a certain year-2000 look, like other analysis tools (remember WinDbg?), but it is fast, efficient, and gets the job done.

Note that once extracted, an ETL file can be very big – often in the 1GB-or-more range. PerfView has to go through all those event logs, so it’s extremely memory hungry as well, especially if you open multiple ETL files at once. My PerfView kept crashing on my 16GB RAM machine (I had several Visual Studio instances open), and that was solved when I switched to 32GB RAM.

The first step is to confirm the allocation problems with GCStats (this is one of the extreme ones, but it does happen):

Two main things to look into – Total Allocs, i.e. the total size of objects allocated, and the time spent in garbage collection. They are naturally closely related, but not always. Total allocation might not be high while time in GC is – in the case of large object allocations (we will talk about it in a later post). Then, for the purpose of memory allocation analysis, this is where you should look:

What you find in there, might surprise you. And that’s the purpose of this series, point out possible unexpected allocations that are easy – or fairly easy – to fix.

In this first post, we will talk about a somewhat popular feature – Injected<T>.

We all know that in Optimizely Content/Commerce, the preferred way of dependency injection is constructor injection. I.e. if your class has a dependency on a certain type, that dependency should be declared as a parameter of the constructor. That’s nice and all, but not always possible. For example, you might have a static class (used for extension methods), so no constructor is available. Or, in some rare cases, you can’t add a new parameter to the constructor because it would be a breaking change.

Adding Injected<T> as a hidden dependency in your class at least works, so can you forget about it?

Not quite!

This is how the use of Injected<T> results in allocation of StructureMap objects – yes, every time you call Injected<T>.Service, the whole dependency tree must be built again.

And that’s not everything. During that process, other objects need to be created as well. You can right-click on a path and select “Include item”. The allocations below are for anything that was created by module episerver.framework episerver.framework!EPiServer.ServiceLocation.Injected`1[System.__Canon].get_Service(), i.e. all object allocations related to Injected<T>.

You can expand further to see which Injected<T>s have the most allocations, and are therefore the ones that should be fixed.

How can one fix an Injected<T> then? The best fix is to make it a constructor dependency, but that might not always be possible. An alternative fix is to use ServiceLocator.GetInstance, and to make that variable static if possible. That way you won’t have to call Injected<T>.Service every time you need the instance.
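A minimal sketch of that alternative, for a static extension class – the class name, the extension method and the choice of IContentLoader are made up for illustration:

public static class EntryExtensions   // hypothetical extension class
{
    // Resolved once, lazily, instead of building the dependency tree on every call
    private static readonly Lazy<IContentLoader> _contentLoader =
        new Lazy<IContentLoader>(() => ServiceLocator.Current.GetInstance<IContentLoader>());

    public static EntryContentBase GetEntry(this ContentReference entryLink)
    {
        // _contentLoader.Value reuses the cached instance on every call
        return _contentLoader.Value.Get<EntryContentBase>(entryLink);
    }
}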

There are cases that you indeed need a new instance every time, then the fix might be more complicated, and you might want to check if you need the whole dependency tree, or just a data object.

Moral of the story

  • Performance can’t be guessed, it must be measured
  • Injected<T> is not your good friend. You can use it if you have no other choice, but definitely avoid it in hot paths.

The do nothing SearchProvider

With the Find-backed IEntrySearchService in the previous post, we can now put SearchProvider to rest. There are, however, parts of the framework that still rely on SearchManager, and it expects a configured, working SearchProvider. The Full search index job and the Incremental search index job are two examples. To make sure we don’t break the system, we might want to give SearchManager something to chew on. A do-nothing SearchProvider, that is!

And we need a DoNothingSearchProvider

    public class DoNothingSearchProvider : SearchProvider
    {
        public override string QueryBuilderType => GetType().ToString();

        public override void Close(string applicationName, string scope) { }
        public override void Commit(string applicationName) { }
        public override void Index(string applicationName, string scope, ISearchDocument document) { }
        public override int Remove(string applicationName, string scope, string key, string value)
        { return 42; }

        public override void RemoveAll(string applicationName, string scope)
        {
        }
        public override ISearchResults Search(string applicationName, ISearchCriteria criteria)
        {
            return new SearchResults(new SearchDocuments(), new CatalogEntrySearchCriteria());
        }
    }

    

And a DoNothingIndexBuilder

    public class DoNothingIndexBuilder : ISearchIndexBuilder
    {
        public SearchManager Manager { get; set; }
        public IndexBuilder Indexer { get; set; }

        public event SearchIndexHandler SearchIndexMessage;

        public void BuildIndex(bool rebuild) { }
        public bool UpdateIndex(IEnumerable<int> itemIds) { return true; }
    }

What remains is simply registering them in your appsettings.json

                                       "SearchOptions":  {
                                                             "DefaultSearchProvider":  "DoNothingSearchProvider",
                                                             "MaxHitsForSearchResults":  1000,
                                                             "IndexerBasePath":  "[appDataPath]/Quicksilver/SearchIndex",
                                                             "IndexerConnectionString":  "",
                                                             "SearchProviders":  [
                                                              {
                                                                "Name": "DoNothingSearchProvider",
                                                                "Type": "EPiServer.Reference.Commerce.Site.Infrastructure.Indexing.DoNothingSearchProvider, EPiServer.Reference.Commerce.Site",
                                                                "Parameters": {
                                                                  "queryBuilderType": "EPiServer.Reference.Commerce.Site.Infrastructure.Indexing.DoNothingSearchProvider, EPiServer.Reference.Commerce.Site",
                                                                  "storage": "[appDataPath]/Quicksilver/SearchIndex",
                                                                  "simulateFaceting": "true"
                                                                }
                                                              }
                                                                                 ],
                                                             "Indexers":  [
                                                                              {
                                                                                  "Name":  "catalog",
                                                                                  "Type":  "EPiServer.Reference.Commerce.Site.Infrastructure.Indexing.DoNothingIndexBuilder, EPiServer.Reference.Commerce.Site"
                                                                              }
                                                                          ]
                                                         },

And that’s it.

Use Find for CSR UI

If you have been using Find, you might be surprised to learn that the CSR UI uses the SearchProvider internally. This is a bit unfortunate because you likely are using Find, and that creates unnecessary complexity. For starters, you need to configure a SearchProvider, then you need to index the entries, separately from the Find index. If you install EPiServer.CloudPlatform.Commerce, it will set up the DXPLucenceSearchProvider for you, which is basically a wrapper of LuceneSearchProvider that lets it work on DXP (i.e. Azure storage). But even with that, you have to index your entries anyway. You can use FindSearchProvider, but that just creates another problem – it uses a different index than Find, so you double your index count, yet you still have to make sure your content is indexed. Is there a better way – to use the existing Find indexed content?

Yes, there is

Searches for entries in CSR are done by IEntrySearchService, whose default implementation uses the configured SearchProvider internally. Fortunately for us, as with most things in Commerce, we can create our own implementation and inject it. Now, that comes with a caveat – IEntrySearchService is marked with a BETA remark, so prepare for breaking changes without prior notice. However, it has not changed much since its inception (funny thing: when I checked its history, I was the one who created it 6 years ago, in 2017 – feeling old now), and if it is changed, it would be quite easy to adapt to such changes.

IEntrySearchService is a simple interface with just one method:


IEnumerable<int> Search(string keyword, MarketId marketId, Currency currency, string siteId);

It is a bit weird to return an IEnumerable<int> (what was I thinking?), but it was likely created as scaffolding over SearchManager.Search, which returns an IEnumerable<int>, and was not updated later. Anyway, an implementation using Find could look like this:

    public class FindEntrySearchService : IEntrySearchService
    {
        private EPiServer.Find.IClient _searchClient;

        public FindEntrySearchService(EPiServer.Find.IClient searchClient) => _searchClient = searchClient;

        public IEnumerable<int> Search(string keyword, MarketId marketId, Currency currency, string siteId)
        {
            return _searchClient.Search<EntryContentBase>()
                 .For(keyword)
                 .Filter(x => x.MatchMarketId(marketId))
                 .Filter(x => x.SiteId().Match(siteId))
                 .Filter(x => FilterPriceAvailableForCurrency<IPricing>(y => y.Prices(), currency))
                 .GetResult()
                 .Select(x => x.ContentLink.ID);
        }

        public FilterExpression<Price> FilterPriceAvailableForCurrency<T>(Expression<Func<T, IEnumerable<Price>>> prices, Currency currency)
        {
            var currencyCode = currency != null ? currency.CurrencyCode : string.Empty;

            return new NestedFilterExpression<T, Price>(prices, price => price.UnitPrice.Currency.CurrencyCode.Match(currencyCode), _searchClient.Conventions);
        }
    }

Note that I am not an expert on Find, especially on NestedFilterExpression, so my FilterPriceAvailableForCurrency might be wrong. Feel free to correct it, the code is not copyrighted and is provided as-is.

As always, you need to register this implementation for IEntrySearchService. You can add the registration anywhere you like, as long as it’s after .AddCommerce:

_services.AddSingleton<IEntrySearchService, FindEntrySearchService>();

Delete orphaned assets

I was asked this question: we have about 3TB of assets – is there any way to clean them up?

These days storage is cheap, but still not free. And big storage means you need space for backups, and with that, bandwidth and time.

Is there a way to clean up things you no longer need?

Yes!

Optimizely Content already has a scheduled job named Remove Abandoned BLOBs, but this job only removes blobs that have no content associated, i.e. the content was deleted by IContentRepository.Delete but the blob was left behind. The job uses the log to find out which contents were deleted, then finds those blobs.

How about assets that still have content associated with them, but are not used anywhere? Time to get your hands dirty!

Due to the nature of this task, it is best to make it a scheduled job.

All of the assets are children under the global asset root. By iterating over them, we can check if each of them is being used by any other content. If not, we add them to a list for later deletion. When deleting the content, we also find the associated blob and delete it as well. Easy, right?

To get the content recursively we use this little piece of code

        public virtual IEnumerable<T> GetAssetRecursive<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : MediaData
        {
            foreach (var folder in LoadChildrenBatched<ContentFolder>(parentLink, defaultCulture))
            {
                foreach (var entry in GetAssetRecursive<T>(folder.ContentLink, defaultCulture))
                {
                    yield return entry;
                }
            }

            foreach (var entry in LoadChildrenBatched<T>(parentLink, defaultCulture))
            {
                yield return entry;
            }
        }

        private IEnumerable<T> LoadChildrenBatched<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : IContent
        {
            var start = 0;

            while (!_isStopped)
            {
                var batch = _contentRepository.GetChildren<T>(parentLink, defaultCulture, start, 50);
                if (!batch.Any())
                {
                    yield break;
                }
                foreach (var content in batch)
                {
                    // Skip content whose primary parent is a different folder, to avoid processing it multiple times while traversing
                    if (!parentLink.CompareToIgnoreWorkID(content.ParentLink))
                    {
                        continue;
                    }

                    yield return content;
                }
                start += 50;
            }
        }

We start from SiteDefinition.Current.GlobalAssetsRoot, and use IContentRepository.GetReferencesToContent to see if the asset is used in any content (both CMS and Catalog). If not, we add it to a list. Later, we use IPermanentLinkMapper to see if it has any blob associated, and delete that as well.

            foreach (var asset in GetAssetRecursive<MediaData>(SiteDefinition.Current.GlobalAssetsRoot, CultureInfo.InvariantCulture))
            {
                totalAsset++;
                if (!_contentRepository.GetReferencesToContent(asset.ContentLink, false).Any())
                {
                    toDelete.Add(asset.ContentLink.ToReferenceWithoutVersion());
                }

                if (toDelete.Count % 50 == 0)
                {
                    var maps = _permanentLinkMapper.Find(toDelete);
                    foreach (var map in maps)
                    {
                        deletedAsset++;
                        _contentRepository.Delete(map.ContentReference, true, EPiServer.Security.AccessLevel.NoAccess);
                        var container = Blob.GetContainerIdentifier(map.Guid);
                        //Probably redundant, we could just delete directly
                        var blob = _blobFactory.GetBlob(container);
                        if (blob != null)
                        {
                            _blobFactory.Delete(container);
                        }
                        OnStatusChanged($"Deleting asset with id {map.ContentReference}");
                    }
                    toDelete.Clear();
                }
            }

We need another round of deletion after the loop to clean up the leftovers (or in case we have fewer than 50 abandoned assets).

And we’re done!

Testing this job is simple – upload a few assets to your CMS and do not use them anywhere, then run the job. It should delete those assets.

Things to improve: we might want to make sure only assets that were created more than a certain number of days ago are deleted. This allows editors to upload assets for later use without having to use them immediately.
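For example, inside the loop above we could skip anything younger than 30 days – this assumes the asset type implements IChangeTrackable (MediaData does), and the threshold is of course arbitrary:

                // Skip assets uploaded recently – editors might intend to use them soon
                if (asset is IChangeTrackable changeTrackable
                    && changeTrackable.Created > DateTime.UtcNow.AddDays(-30))
                {
                    continue;
                }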

The code has been open sourced at vimvq1987/DeleteUnusedAssets: Delete unused assets from an Optimizely/Episerver site (github.com) , and I have uploaded a nuget package to Packages (optimizely.com) to be reviewed.

Skip validating carts

Carts are meant to be validated. Prices change, customers add more quantity than allowed, promotions expire, stock runs out, etc. All kinds of stuff that makes the items in carts need validation to ensure they are up to date and ready to be converted to an order.

However, there are cases when you don’t want your carts to be validated. The most common case is, of course, the wish list – a special cart that allows a customer to add items just to keep track of them. You certainly don’t want to touch it. Another example is a quote – when you give a specific item at a specific price to a customer, and you don’t want it to be automatically changed to the public price, which may differ from the quoted one.

By default, when a cart is validated, these things are done:

  • Remove items that are no longer available (either deleted or end of line)
  • Update prices of the items to the latest applicable prices, or remove items that have no prices
  • Update quantity of the items (to comply with the settings or in-stock quantity), or remove items that are out of stock
  • Apply promotions
  • Update quantity again (as promotions can do things like adding free items to the cart)

There are two ways to avoid carts being validated; let’s see what we can do.

The “Wish list names” route

With OrderOptions you can exempt carts with certain names from validation, using WishListCartNames. By default, it’s only “Wishlist”, but you can set several names using a comma separator, like this:

orderOptions.WishListCartNames = "Wishlist,Quote";
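Where to apply that line depends on your setup – one option, assuming OrderOptions follows the standard options pattern in Commerce 14, is to configure it at startup. Treat this as a sketch rather than the one true way:

// In Startup.ConfigureServices – assumes OrderOptions can be configured via the options pattern
services.Configure<OrderOptions>(orderOptions =>
{
    orderOptions.WishListCartNames = "Wishlist,Quote";
});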

However, there is a caveat with this approach: carts with those names will not be shown in Order Management. (If you want, you can change that, but it is not an easy or quick task.)

The OrderValidationService route

The validation of carts (or rather, order types in general) is done by OrderValidationService, and that class is meant to be extended if necessary. Here is how you would avoid validating carts named “Quote”, using OrderValidationService:

    public class CustomOrderValidationService : OrderValidationService
    {
        public CustomOrderValidationService(ILineItemValidator lineItemValidator, IPlacedPriceProcessor placedPriceProcessor, IPromotionEngine promotionEngine, IInventoryProcessor inventoryProcessor, OrderOptions orderOptions) : base(lineItemValidator, placedPriceProcessor, promotionEngine, inventoryProcessor, orderOptions)
        {
        }

        public override IDictionary<ILineItem, IList<ValidationIssue>> ValidateOrder(IOrderGroup orderGroup)
        {
            if (orderGroup.Name.Equals("Quote", System.StringComparison.OrdinalIgnoreCase))
            {
                return new Dictionary<ILineItem, IList<ValidationIssue>>();
            }
            return base.ValidateOrder(orderGroup);
        }
    }

And as OrderValidationService is registered by the ServiceConfiguration attribute, you can register yours like this:

services.AddTransient<OrderValidationService, CustomOrderValidationService>();

One caveat though: making changes to OrderValidationService means those changes apply to the entire website, so make sure the changes are actually the ones you want site-wide, not just in specific places.