Mass update catalog entries

This is not something you do daily, but you will probably need it one day, so it might come in handy.

Recently we got a question on how to update the code of all entries in the catalog. This is interesting because, even though you don’t update codes that often (if at all, as the code is what identifies an entry to external systems such as ERPs or PIMs), it raises the question of how to do mass updates on catalog entries.

  • Update the code directly via a database query. This is supposedly the fastest way to do it. If you have been following my posts closely, you will be familiar with my note that Episerver does not disclose the database schema. I list it here because it’s an option, but not a good one. It easily goes wrong (and causes catastrophes), and you have to deal with versions and cache, which can be hairy to get right. Direct data manipulation should only be used as a last resort, when no other option is available.

  • Update via content APIs. This snippet comes from my previous post:
        public void UpdateCodes(ContentReference contentLink)
        {
            // Load the children in the master language only
            var children = _contentRepository.GetChildren<CatalogContentBase>(contentLink, new LoaderOptions() { LanguageLoaderOption.MasterLanguage() });
            foreach (var child in children)
            {
                if (child is EntryContentBase entry)
                {
                    UpdateProductCodes(entry);
                }
                else if (child is NodeContent || child is CatalogContent)
                {
                    // Recurse into nodes (and catalogs) to reach their entries
                    UpdateCodes(child.ContentLink);
                }
            }
        }

        private void UpdateProductCodes(EntryContentBase entryContent)
        {
            // Loaded content is read-only, so work on a writable clone
            var writableCopy = (EntryContentBase)entryContent.CreateWritableClone();
            writableCopy.Code = writableCopy.Code + "_edited";
            // AccessLevel.NoAccess skips the access check for the current user
            _contentRepository.Save(writableCopy, SaveAction.Publish, AccessLevel.NoAccess);
        }

(I wrote this code in the WordPress editor, so it might not be entirely correct, but you get the idea. You can even take it further by force-saving the current version, so it will not create a new version you don’t care about.)
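As a rough sketch of the force-save idea, the save call can combine SaveAction.Publish with the SaveAction.ForceCurrentVersion flag. The class name EntryCodeUpdater and the method name UpdateCodeInPlace are made up for this illustration, and I have not measured how this behaves on a large catalog, so treat it as a starting point rather than a tested recipe.

```csharp
using EPiServer;
using EPiServer.Commerce.Catalog.ContentTypes;
using EPiServer.DataAccess;
using EPiServer.Security;

// Hypothetical helper class, named only for this example
public class EntryCodeUpdater
{
    private readonly IContentRepository _contentRepository;

    public EntryCodeUpdater(IContentRepository contentRepository)
    {
        _contentRepository = contentRepository;
    }

    // Hypothetical method: same update as before, but overwrites the
    // current version instead of creating a new one
    public void UpdateCodeInPlace(EntryContentBase entryContent)
    {
        var writableCopy = (EntryContentBase)entryContent.CreateWritableClone();
        writableCopy.Code = writableCopy.Code + "_edited";

        // ForceCurrentVersion asks the repository to save on top of the
        // existing version rather than adding another one
        _contentRepository.Save(
            writableCopy,
            SaveAction.Publish | SaveAction.ForceCurrentVersion,
            AccessLevel.NoAccess);
    }
}
```

This only avoids the extra versions. It still saves one entry at a time through the full content pipeline.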

Now that we are using the (highly) recommended APIs, is this solution good? As much as I praise the content APIs, this is not performance-optimized. There are two things that can slow it down. First, we are loading in batches but saving individually. Second, saving a catalog content via the content APIs has significant overhead (events, pipeline, you name it). That overhead is actually useful, and recent versions of Commerce did a good job of reducing it by not saving unchanged properties, but for our purpose it’s just waste. It’s perfectly fine to save a few hundred catalog contents via the content APIs, but when it comes to tens of thousands, or even more, you will very likely have to wait for a long time.

Is there a better option that compromises neither best practices nor performance?

The Search API comes to the rescue.

I wrote quite a while ago about why using the Search APIs for search functionality is not a good idea. But those APIs exist for one reason: batch editing.

The problem we have can be solved like this:

            CatalogSearchOptions options = new CatalogSearchOptions();
            CatalogSearchParameters searchParams = new CatalogSearchParameters();
            int totalCount = 0;
            // The first call is only used to get the total number of entries
            _catalogSystem.FindItemsDto(searchParams, options, ref totalCount);
            int recordsCount = 0;
            while (totalCount > 0 && recordsCount < totalCount)
            {
                options.RecordsToRetrieve = 500; // We'll get 500 catalog entries each time
                options.StartingRecord = recordsCount;
                var catalogEntriesDto = _catalogSystem.FindItemsDto(searchParams, options, ref totalCount, new CatalogEntryResponseGroup());
                int batchCount = catalogEntriesDto.CatalogEntry.Count;
                if (batchCount == 0)
                {
                    break; // Nothing more to load, so don't loop forever
                }
                foreach (CatalogEntryDto.CatalogEntryRow row in catalogEntriesDto.CatalogEntry)
                {
                    // Update the code, e.g. row.Code = row.Code + "_edited";
                }
                _catalogSystem.SaveCatalogEntry(catalogEntriesDto);
                recordsCount += batchCount; // Move on to the next batch
            }

Here we are using ICatalogSystem.FindItemsDto to get CatalogEntryDto in batches of 500, update them, then save them in batches of 500. By using low-level APIs and doing things in batches, we get a significant performance improvement over the content APIs approach.
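The paging itself has nothing to do with Commerce, and it is the part that is easiest to get wrong, so here is a minimal, framework-free sketch of it. BatchProcessor, fetchPage, update and savePage are hypothetical names: fetchPage stands in for the FindItemsDto call and savePage for the batch save. The two details that matter are advancing by the number of items actually returned and stopping when a page comes back empty.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any Episerver API
public static class BatchProcessor
{
    // fetchPage(start, size) returns up to `size` items starting at `start`.
    // update changes one item in memory; savePage persists a whole page.
    // Returns the number of items that were processed.
    public static int ProcessInBatches<T>(
        int totalCount,
        int batchSize,
        Func<int, int, IList<T>> fetchPage,
        Action<T> update,
        Action<IList<T>> savePage)
    {
        if (batchSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        }

        int processed = 0;
        while (processed < totalCount)
        {
            IList<T> page = fetchPage(processed, batchSize);
            if (page.Count == 0)
            {
                break; // Source ran dry earlier than expected
            }

            foreach (T item in page)
            {
                update(item);
            }

            savePage(page);          // One save per page, not per item
            processed += page.Count; // Advance by what was actually returned
        }

        return processed;
    }
}
```

With five items and a batch size of two, fetchPage is called with start positions 0, 2 and 4, and savePage runs three times.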

It’s worth a reminder that this is NOT something you should do daily. The DTO and ICatalogSystem APIs are still supported, but they are not the recommended way of working with catalog content. Until Episerver provides a way to save catalog contents in batches, though, this is the best we can do.

That raises another question: how would we update fields which are not in the DTO, but in the MetaObjects? Yes, there is a way to do that, but that is a topic for an upcoming blog post.
