Lelit Bianca v3 vs ECM Synchronika vs Profitec Pro 700

If you are looking for an espresso machine with range of $3000 (or around €2500 if you are in EU – this is one of the wins for European), you will most likely come to battle of these three. They are probably the most popular options in this price range, and rightly so. The prices are fairly comparable, with Profitec Pro 700 is the cheapest in the US (around $200), and Lelit Bianca v3 is the cheapest in the EU (also around €200). I did quite intensive research on the topic, and finally come to the conclusion (spoiler alert, in the end of this post).

If you haven’t known already, Profitec is a subsidiary of ECM. Pro 700 is still made in Milan, Italy, but it shares a lot of design with ECM Synchronika. Basically two sibling except for some cosmetic difference. I would expect them to perform very similar. For easier comparison, I will compare Bianca and Synchronika. Let’s go through pros and cons of each, and hopefully it will help you come to a decision

They are very similar espresso machines on definitions. Both are E61, dual boiler machines that target home enthusiasts.

Build quality

This is no contest. Synchronika is a clear winner, Pro 700 a second and Bianca comes last. It is not only that Synchronika has better fit and finish, it has clear internal layout which is like an engineer’s dream. Whole Latte Love has several dive in videos for that, and it means if you ever need to service your Synchronika yourself, you will easily know where to go and what to check/change

Bianca has less fit and finish, and its internal is pretty cramped – more on that below

To be very clear Bianca’s build quality is definitely more than decent, and it would last you a very long time with proper care. The cramped inside has two reasons – due to its smaller size and more features.

Size and look

Of all three, Lelit Bianca is smallest, and is the only one come with wood (walnut) knobs and wands finish by default, while the others come with hard black plastic . It is only 29cm wide and only 40cm deep. Both ECM and Profitec are noticeable larger, with the former is 33.5cm wide and 49cm deep, and the latter 34cm and 47cm, respectively.

While look is definitely subjective – make sure each of these machine can fit into your coffee station, either that is under your cupboard or otherwise. One of the biggest selling point of Bianca is the moveable water tank, you can put it behind, left or right. All three can be plumbed and you can put your days of refilling water behind you, but sometimes plumbing is not an option, and being able to move the tank is a huge plus. As it’s my biggest complaint of Lelit Elizabeth. Now it can be solved easily.

Start up time

If you have very stable schedule every day, start up time might not be of your concern, you can use a smart power plug and schedule it to turn on your coffee machine every day at a fixed time. but let’s be very clear here: all these machines take a significantly long time to be fully heated. Not only they have to heat up both boilers to temp and let them stabilize, they also need to heat up the E61 group head via thermosiphon (baristahustle explains it in great detail here EM 3.04 How the E61 Thermosyphon Works – Barista Hustle – but basically, let hot water flow through the head to heat it up). The E61 is very heavy, like 4kg heavy, so it’s important to make it hot, so the water does not lose too much temperature during brewing.

Synchronika takes significantly longer to heat up. By the test of kaffemacher, it takes a whoping 35 minutes to be able to pull 5 shots without failing (not reaching targeted temp)

That is double of what Bianca v3 needs

That means you can start pulling shorts 16 minutes faster on Bianca. That’s impressive. If you want to brew lighter roasts which need higher temps, say, 96*C, unofficial and unscientific tests showed that Bianca is ready in even shorter time (12 minutes), based on the indication on the PID. It’s not breaking any records, but for E61, that’s nothing to be sniffed at.

Temperature stability

This is one interesting test. Thanks to kaffemacher we have measurements from both machines, and it’s a tie

  • ECM Synchronika has more stability during the shot. i.e. with a 25s shot, the temp between 5s and 25s remains a more straight line (albeit hotter toward the end). With shot after shot however, it tends to be under temp after being idle for some time
  • Lelit Bianca has more stability between shots. Temp within shot is fluctuated a bit, but does not rise up as much as with Synchronika. You can, however, adjust the PID with settings like temp offset to have even better temp stability, especially after your machine has been idling for a while.

Features

Bianca hand down.

Bianca comes with the default flow control by default. ECM Synchronika and Profitec Pro 700 can be retrofitted with the E61 flow control package, which cost you somewhere $200 more, plus installation. As most people has commented, Bianca flow control feels natural and nicer to use. That is of course subjective, but it is not too surprising. The main difference is that Bianca flow control has ~200 degree travel from fully open to fully close, while the E61 flow control is ~720 degree. The former allows some more fine tuning, but it is less intuitive to use.

Bianca can pre-infuse even with water tank, while ECM and Profitec need plumbed in to pre-infuse (using line pressure). Bianca v3 has low flow settings which make pre infusion even more flexible. You can pre-infuse in any way you like.

Lelit is also known to make their Lelit Control Center – LCC settings available to end users and you can fine tune your machine even further. Most notably, the temp offset (between the boiler, and the targeted temp at group head), so you can fine tune your brew temp to what you would like.

Conclusion

When I bought my Lelit Elizabeth, I thought about Bianca as something I wanted but couldn’t get, and if I upgrade, I would pick it. After two years, when I finally decided to upgrade, for some reasons I skipped Bianca. I almost decided to go with Synchronika but slowly and steadily Bianca won me back. And I will be soon one of its owners.

With that said, you can’t go wrong with each option. Those three are the most popular options in their price range, and there’s reason for that – they are that good.

Optimizing an interesting query

It’s not a secret, I love optimizing things. In a sense, I am both an Optimizer (literally) and an optimizer. And today we will be back to basic – optimizing a tricky SQL query.

The query in question is this particular stored procedure ecf_CatalogNode_GetAllChildNodes, this is used to get all children nodes of specific nodes. It is used in between to find all entries that are direct, or indirect children of specific nodes. Why, you might ask, because when you change the url segment of the node, you want to make sure that all entries that are under that node, will have their indexed object refreshed.

Let’s take a look at this stored procedure, this is how it looks like

CREATE PROCEDURE [dbo].[ecf_CatalogNode_GetAllChildNodes]
    @catalogNodeIds udttCatalogNodeList readonly
AS
BEGIN
    WITH all_node_relations AS 
    (
        SELECT ParentNodeId, CatalogNodeId AS ChildNodeId FROM CatalogNode
        WHERE ParentNodeId > 0
        UNION
        SELECT ParentNodeId, ChildNodeId FROM CatalogNodeRelation
    ),
    hierarchy AS
    (
        SELECT 
            n.CatalogNodeId,
            '|' + CAST(n.CatalogNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
        FROM @catalogNodeIds n
        UNION ALL
        SELECT
            children.ChildNodeId AS CatalogNodeId,
            parent.CyclePrevention + CAST(children.ChildNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
        FROM hierarchy parent
        JOIN all_node_relations children ON parent.CatalogNodeId = children.ParentNodeId
        WHERE CHARINDEX('|' + CAST(children.ChildNodeId AS nvarchar(4000)) + '|', parent.CyclePrevention) = 0
    )
    SELECT CatalogNodeId FROM hierarchy
END

I previously wrote about the relations between entities in Commerce catalog, here Commerce relation(ship), a story – Quan Mai’s blog (vimvq1987.com) , so relations between nodes can be a bit complicated – a node can have one true parent defined in CatalogNode table, and then other “linked” nodes in CatalogNodeRelation . So to find all children – and grand children of a node, you need to get from both.

Getting children of a node from CatalogNode or CatalogNodeRelation is simple, but things become more complicated when you have to get grandchildren, then great-grandchildren, and so on, and so forth. with that, CTE needs to be used in a recursive way. But then there is a problem arises – there is a chance, small, but still, that the data was added in a correct way, so circular reference is possible. i.e. A is a parent of B, which is a parent of C, and itself is a parent of A. To stop the SP from running forever, a check needs to be added to make sure any circular reference is cut short.

This brings back memory as the first ever support case I worked on at Optimizely (then Episerver) was with a circular reference. The site would crash whenever someone visited the catalog management in Commerce Manager. That was around June, 2012 (feeling old now?). My “boss” at that time involuntarily volunteered me for the case. See what you made me do, boss.

Now you can grasp the basic of what the SP does – let’s get back to the original problem. it’s slow to run especially with big catalog and complex node structure. As always, to optimize everything you need to find the bottleneck – time to fire up SQL Server Management Studio and turn on the Actual Execution Plan

I decided to go with 66, the “root” catalog node. this query yield around 18k rows

declare @Nodes udttCatalogNodeList 

insert into @Nodes (CatalogNodeId) select 66

exec ecf_CatalogNode_GetAllChildNodes @Nodes

and also 18s of execution.

Mind you, this is on my machine with pretty powerful CPU (AMD Ryzen 7 5800x, 8 cores 16 threads), and a very fast nvme PCIe SSD (Western Digital Black SN850 2TB). If this was executed on Azure Sql database for example, a timeout is almost certainly guaranteed. So time of execution should only be compared relatively with each other.

If we look at the execution plan, it is quite obvious where the bottleneck is. A scan on CatalogNode table is heavy (it read 79M rows on that operation). As suggest by Anders from Timeout when deleting CatalogNodes from a large catalog (optimizely.com), adding a non clustered index on ParentNodeId column would improve it quite a lot. And indeed it does. The execution time is reduced to 5 second.

And the number of rows read on CatalogNode reduced to just 17k

This is of course a very nice improvement. But the customer reported that it is not enough and the SP is still giving timeout, i.e. further optimization is needed.

Naturally, the next step would be to see if we can skip the circular check. It was added as a safe measure to avoid bad data. It should not be there, as the check should be performed at data modification. But it is there for historical reasons and we can’t just change it, not trivially. So let’s try it for our curiousity.

The modified query looks like this (basically just commented out any code related to the CyclePrevention

ALTER PROCEDURE [dbo].[ecf_CatalogNode_GetAllChildNodes]
    @catalogNodeIds udttCatalogNodeList readonly
AS
BEGIN
    WITH all_node_relations AS 
    (
        SELECT ParentNodeId, CatalogNodeId AS ChildNodeId FROM CatalogNode
        WHERE ParentNodeId > 0
        UNION
        SELECT ParentNodeId, ChildNodeId FROM CatalogNodeRelation
    ),
    hierarchy AS
    (
        SELECT 
            n.CatalogNodeId
			--, '|' + CAST(n.CatalogNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
        FROM @catalogNodeIds n
        UNION ALL
        SELECT
            children.ChildNodeId AS CatalogNodeId
			--, parent.CyclePrevention + CAST(children.ChildNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
        FROM hierarchy parent
        JOIN all_node_relations children ON parent.CatalogNodeId = children.ParentNodeId
        --WHERE CHARINDEX('|' + CAST(children.ChildNodeId AS nvarchar(4000)) + '|', parent.CyclePrevention) = 0
    )
    SELECT CatalogNodeId FROM hierarchy
END

And the improvement is quite impressive (more than I expected), the query completes almost instantly (less than 1s). The read on CatalogNodeRelation significantly reduced

A word of warning here, execution plan can’t be simply compared as-is. If I run two versions side by side, it gives quite misleading comparison

Even though the top one (without the circular reference check) is much faster than the original (the bottom one), SQL Server estimates that the first is slower (almost 2x slower than the second). So execution plan should be used to see what has been done and what is likely the bottleneck inside a query, it should not be used as comparison between queries. In most cases, comparing statistics using set statistics io on is the best way to compare.

If not for the fact that we are changing the behavior of the stored procedure, I would be happy with this approach. The chance of running into circular reference is small, but it is not zero. As we said, we can in theory gating the relation during insert/updating, but that would be too big a change to start with. This is one of constraint as we work at framework level – we have to step carefully to not break anything. A breaking change is bad, but a data corruption is simply unacceptable. I spent a few hours (probably more than I should) trying to optimize the circular reference check, but no better solution is found.

The next approach would be – as we can guess, to make sure that we get rid of the Clustered Index Scan happened on the CatalogNodeRelation table. The solution would be quite simple, a non clustered index on the `ParentNodeId should be enough.

Great success. The performance is comparable with the “non circular reference check” approach.

As adding an index is a non breaking change (and albeit in some cases it can cause performance regression, like in A curious case of SQL execution plan – Quan Mai’s blog (vimvq1987.com) , but it is rare, also, in this case the cardinality of the ParentNodeId is most likely quite well distributed).

That is all for today. Hopefully you learn one thing or two about optimizing queries in your daily works.

Delete orphaned assets

I was asked this question: we have about 3TB of assets, any way to clean it up.

These days, storage is cheap, but still not free. and big storage means you need space for back up. and with that, bandwidth and time.

Is there away to clean up things you no longer need?

Yes!

Optimizely Content already has a scheduled job named Remove Abandoned BLOBs, but this job only removes the blobs that have no content associated. I.e. the content is deleted by IContentRepository.Delete but the blob was left behind. The job uses the log to find out which content were deleted, then find those blobs.

How’s about the assets that still have contents associated with them, but not used anywhere? Time to get your hands dirty!

Due to the nature of this task, it is best to make it a scheduled job.

All of the assets are children under the global asset root. By iterating over them, we can check if each of them is being used by another content. If not, we will add them to a list for later delete. Before deleting the content, we will find the blob and then delete it as well. Easy, right?

To get the content recursively we use this little piece of code

        public virtual IEnumerable<T> GetAssetRecursive<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : MediaData
        {
            foreach (var folder in LoadChildrenBatched<ContentFolder>(parentLink, defaultCulture))
            {
                foreach (var entry in GetAssetRecursive<T>(folder.ContentLink, defaultCulture))
                {
                    yield return entry;
                }
            }

            foreach (var entry in LoadChildrenBatched<T>(parentLink, defaultCulture))
            {
                yield return entry;
            }
        }

        private IEnumerable<T> LoadChildrenBatched<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : IContent
        {
            var start = 0;

            while (!_isStopped)
            {
                var batch = _contentRepository.GetChildren<T>(parentLink, defaultCulture, start, 50);
                if (!batch.Any())
                {
                    yield break;
                }
                foreach (var content in batch)
                {
                    // Don't include linked products to avoid including them multiple times when traversing the catalog
                    if (!parentLink.CompareToIgnoreWorkID(content.ParentLink))
                    {
                        continue;
                    }

                    yield return content;
                }
                start += 50;
            }
        }

And we will start from SiteDefinition.Current.GlobalAssetsRoot, and use IContentRepository.GetReferencesToContent to see if it is used in any content (both CMS and Catalog). If not, we add it to a list. Later, we use IPermanentLinkMapper to see if it has any blob associated, and delete that as well

            foreach (var asset in GetAssetRecursive<MediaData>(SiteDefinition.Current.GlobalAssetsRoot, CultureInfo.InvariantCulture))
            {
                totalAsset++;
                if (!_contentRepository.GetReferencesToContent(asset.ContentLink, false).Any())
                {
                    toDelete.Add(asset.ContentLink.ToReferenceWithoutVersion());
                }

                if (toDelete.Count % 50 == 0)
                {
                    var maps = _permanentLinkMapper.Find(toDelete);
                    foreach (var map in maps)
                    {
                        deletedAsset++;
                        _contentRepository.Delete(map.ContentReference, true, EPiServer.Security.AccessLevel.NoAccess);
                        var container = Blob.GetContainerIdentifier(map.Guid);
                        //Probably redundency, can just delete directly
                        var blob = _blobFactory.GetBlob(container);
                        if (blob != null)
                        {
                            _blobFactory.Delete(container);
                        }
                        OnStatusChanged($"Deleting asset with id {map.ContentReference}");
                    }
                    toDelete.Clear();
                }
            }

We need another round of delete after the while loop to clean up the left over (or if we have less than 50 abandoned assets)

And we’re done!

Testing this job is simple – uploading a few assets to your cms and do not use it anywhere, then run the job. it should delete those assets.

Things to improve: we might want to make sure only assets that created more than a certain number of days ago are deleted. This allows editors to upload assets for later uses without having to use them immediately.

The code has been open sourced at vimvq1987/DeleteUnusedAssets: Delete unused assets from an Optimizely/Episerver site (github.com) , and I have uploaded a nuget package to Packages (optimizely.com) to be reviewed.

Skip validating carts

Carts are meant to be validated. Prices changed, customers add more quantity than allowed, promotions expired, stock ran out, etc.. All kinds of stuffs that make the items in carts need validation to make sure they up to date, and be ready to be converted to an order.

However, there are cases when you don’t want your carts to be validated. The most common case is of course, wish list – a special cart that allows customer to add items to, just to keep track of. You certainly don’t want to touch it. Another example is quote – when you give a specific item at specific price for a customer, and you don’t want it to be automatically changed to the public prices, which is different from that said price.

By default, when it is called to validate a cart, these things will be done:

  • Remove items that no longer available (either deleted or end of line)
  • Update prices of the items to the latest applicable prices, or remove items that have no prices.
  • Update quantity of the items (to comply with the settings or in stock quantity), or remove items that are out of stock
  • Apply promotions
  • Update quantity again (As promotions could do things like adding free items to the cart)

There are two ways you can avoid carts being validated, let’s see what we can do.

The “Wish list names” route

With OrderOptions you can set certain wishlist names to be exempted from the validation, using WishListCartNames. By default, it’s only “Wishlist”, but you can set several using the comma separator, like this

orderOptions.WishListCartNames = "Wishlist,Quote";

However there is a caveat, with this approach, carts in those names will not be shown in the Order Management (If you want, you can change that, however it is not an easy or quick one)

The OrderValidationService route

The validation of carts (or rather, order types in general) is done by OrderValidationService. And that class is meant to be extended if necessary. Here is how you would avoid validation carts with name “Quote”, using OrderValidationService

    public class CustomOrderValidationService : OrderValidationService
    {
        public CustomOrderValidationService(ILineItemValidator lineItemValidator, IPlacedPriceProcessor placedPriceProcessor, IPromotionEngine promotionEngine, IInventoryProcessor inventoryProcessor, OrderOptions orderOptions) : base(lineItemValidator, placedPriceProcessor, promotionEngine, inventoryProcessor, orderOptions)
        {
        }

        public override IDictionary<ILineItem, IList<ValidationIssue>> ValidateOrder(IOrderGroup orderGroup)
        {
            if (orderGroup.Name.Equals("Quote", System.StringComparison.OrdinalIgnoreCase))
            {
                return new Dictionary<ILineItem, IList<ValidationIssue>>();
            }
            return base.ValidateOrder(orderGroup);
        }
    }

And as OrderValidationService is registered by ServiceConfiguration attribute, you can register yours by

services.AddTransient<OrderValidationService, CustomOrderValidationService>();

One caveat though, making changes to OrderValidationService means those changes will apply to the entire website, so make sure the changes are actually the ones you want site-wide, not just in specific places.

Index only Catalog content

If you are using Find to index your content, you likely have used the Find Indexing job – which would index everything in one go. Today I stumped upon this question – A way to run indexing job for Commerce only | Optimizely Develope – and it is a good one – if you have many of content in CMS side, and they don’t change that often, if at all – you certain don’t want to waste time and resource in trying to reindex them again. Is there away to just index catalog content?

Yes, there is. It is a bit hacky solution, but it can certain work. But first, let’s dive in on how Find indexing job does it work. It relies on IIndexingJobService , which itself relies on ContentIndexer to do the job. In its turn, ContentIndexer uses a list of IReindexInformation to know which content to index, and in which languages. Here’s what it looks like

    public interface IReindexInformation
    {
        /// <summary>
        /// Content links to be reindexed.
        /// </summary>
        IEnumerable<ReindexTarget> ReindexTargets { get; }

        /// <summary>
        /// Gets the root to index.
        /// </summary>
        ContentReference Root { get; }
    }

It has one Root, and multiple ReindexTarget, which contains

    public class ReindexTarget
    {
        /// <summary>
        /// The content references.
        /// </summary>
        public IEnumerable<ContentReference> ContentLinks { get; set; }

        /// <summary>
        /// The languages the collection of <see cref="ContentReference"/> are enabled on.
        /// </summary>
        public IEnumerable<CultureInfo> Languages { get; set; }

        /// <summary>
        /// The site that the collection of <see cref="ContentReference"/> appears on
        /// or <c>null</c> if unknown.
        /// </summary>
        public SiteDefinition SiteDefinition { get; set; }
    }

As you might have guessed, Commerce has its own IReindexInformation to index catalog content. If we can only use that to run our job. This is how our “hack” begins

The interface IContentIndexer has no method to control the IReindexInformation`, but the default implementation ContentIndexer does. We set it to the only one we need, so here it is

        List<IReindexInformation> targets;
        var contentIndexer = _contentIndexer as ContentIndexer;
        if (contentIndexer != null)
        {
            targets = contentIndexer.ReindexInformation.ToList();
            var commerceReIndexInformation = targets.FirstOrDefault(x => x.GetType() == typeof(CommerceReIndexInformation));
            contentIndexer.ReindexInformation = new List<IReindexInformation>() { commerceReIndexInformation };
            _indexingJobService.Start(OnStatusChanged);

            contentIndexer.ReindexInformation = targets;
        }

A note is that you will still see the “Indexing Global assets and other data” message, because IIndexingJobService implementation will go through all SiteDefinition regardless and show that message, but the internal ContentIndexer will skip if the SiteDefinition passed to it does not match the SiteDefinition in the IReindexInformation (and for CommerceReIndexInformation it’s SiteDefinition.Empty

As I mentioned in the beginning, this is a bit hacky solution, as you have to cast IContentIndexer to its concrete implementation. The proper solution would be implement IContentIndexer yourself. Given that’s not a trivial job, I’ll leave at that.

Loading the contacts/organizations, the right way

If you have been using Business Foundation, you most likely know about a limitation – you can only load the first 1000 objects using the GetXXX methods. For example, by using CustomerContext.Current.GetOrganizations(), you can load the first 1000 organizations. In theory, you can get more objects by changing the value of MaxObjectsList. However, changing that has consequences. Changing that will affect all types of objects, including contacts, organizations, and your custom objects. Also, loading too much in one go is almost never a good idea.

Is there a better way?

Yes, of course – which is why we have this blog post

There is a “hidden” method from base class of Business Foundation – BusinessManager that takes paging parameters

public static EntityObject[] List(string metaClassName, FilterElement[] filters, SortingElement[] sorting, int? start, int? count)

You will need to convert the results to the type you want. Note that all Business Foundation objects are inherited from EntityObject. So if you want to get the contacts by paging, it would look like this:

                var contacts = BusinessManager.List(ContactEntity.ClassName, new FilterElement[0], new SortingElement[] { new SortingElement(sortField, sortType) }, startIndex, recordsToRetrieve)
                .OfType<CustomerContact>();

Let’s go through the parameters one by one.

  • The first you need is the class name of your objects. For contacts, you can use ContactEntity.ClassName as shown above. For organizations, OrganizationEntity.ClassName
  • Next one is the filter. As you are trying to load all objects, you can just pass in an empty (but not null) instance – new FilterElement[0]
  • Third one is how you want to sort it. If you pass an empty array, it will be sort by default. If you want to sort by Name for example, set your sortField to Name and sortType to one of SortingElementType (Asc or Desc)
  • Forth and fifth ones are what we are looking for, they’re simply paging parameters – which position to start getting, and how many objects to get. Combine this with a simple while loop, you can get all of your Business Foundation objects.

And that’s about it, my friends.

What’s about caching?

Caching with list is always tricky – as you have to keep track of each item in the list to make sure you invalidate the list cache if one of the item is changed (updated/removed). For the purpose of just loading all contacts/organizations, it is probably better to just skip caching, for simplicity.

Delete property no longer available in code

Recently I stumped upon this question Removing a property that no longer exists in the code (optimizely.com) . it’s a valid (and even good) question. It is easy to add a new property to your catalog content type – you can simply add a new property to the model, build and start the site. However the opposite is not easy. In Commerce 14 at least.

A property for the strongly typed content type, is actually mapped and backed by a MetaField in MetaDataPlus system (of course unless you specifically tell it not to, by using IgnoreMetaDataPlusSynchronization attribute). When you add a new property to your content type, build and start your site, your content type is scanned and metafields will be created if necessary. However, if you delete a property from your content type, the scanner will just leave the metafield there. There are a few reasons for that. Firstly, it allows loosely typed content type, i.e. content types with none, or only a few property defined. If you have used some kind of external PIM, you’ll understand why it is important. Lastly, because the property can be mapped with a metafield of different name, the scanner might have trouble figuring out which metafield to delete. All in all, keeping the metafields is the sensible (if not the right) choice.

Then what to do if you want to delete the property and also clean up the metafield? With Commerce 13 and earlier, you can detach a MetaField from its MetaClass(s), then delete it using Commerce Manager. With the dead of CM in Commerce 13, what is your option?

By using code, of course. There are a few APIs – namely MetaField and MetaClass that can be used for that purpose. Note that there are two MetaField and MetaClass, and only the ones in Mediachase.MetaDataPlus.Configurator namespace are what we want (the others are for Business Foundation)

Enough for chit chat, this is the code that you would need to run

        private void DeleteMetaField(string metafieldName)
        {
            var metaField = MetaField.Load(CatalogContext.MetaDataContext, metafieldName);
            if (metaField == null)
            {
                return;
            }
            foreach (int metaClassId in metaField.OwnerMetaClassIdList)
            {
                var metaClass = MetaClass.Load(CatalogContext.MetaDataContext, metaClassId);
                if (metaClass == null)
                {
                    metaClass.DeleteField(metafieldName);
                }
            }
            MetaField.Delete(CatalogContext.MetaDataContext, metaField.Id);
        }

It is pretty straightforward. We load the MetaField by its name, if it is not null, then we remove it from all MetaClass that are using it, then eventually delete it.

In beginning of this post we mentioned strongly typed content type, but note that order system also uses the same metaclass/metafield system, so this code can be used for them as well.

This piece of code can be used in an admin-privilege controller to delete metafields on demand. Until Commerce 14 allows you to do it with a proper UI.

A curious case of cookie threading issue – part 2

A while back, I wrote this post A curious case of cookie threading issue – Quan Mai’s blog (vimvq1987.com) as an example of how to use WinDbg to diagnose a race condition (with results in an infinite loop). Naturally that issue needs to be fixed. A starting point to fix a race condition is to put locks whenever you change the data. But it’s not that simple. Too much locking would definitely hurt your application performance, and increase the chance of you running into another nasty problem (namely, deadlock). So the art of locking is “lagom” – Swedish word for “just right”, not too much, not too little.

ReaderWriteLockSlim comes to rescue. You only need to lock when writing, not locking. So this for reading

            readerWriterLockSlim.EnterReadLock();
            try
            {
                return httpRequest.Cookies[CookieKey];
            }
            finally
            {
                readerWriterLockSlim.ExitReadLock();
            }

And this for writing

            readerWriterLockSlim.EnterWriteLock();
            try
            {
                httpRequest.Cookies.Remove(CookieKey);               
            }
            finally
            {
                readerWriterLockSlim.ExitWriteLock();
            }

Would be enough, right?

No. The issue runs deeper than that.

If we go back to the stacktrace of the issue

0:039> !clrstack
OS Thread Id: 0x1e34 (39)
        Child SP               IP Call Site
000000740273cff0 00007ff9297d9f59 System.Collections.Generic.HashSet`1[[System.__Canon, mscorlib]].Remove(System.__Canon)
000000740273d080 00007ff92a8eb4e3 System.Web.HttpCookieCollection.Get(System.String)
000000740273d0c0 00007ff9314de98d 

and look closer. Something is not right. If you do not spot the issue – it’s fine, I missed it too. The problem is that we have a HashSet.Remove inside a HttpCookieCollection.Get. Uh oh.

If we look at source code of HttpCookieCollection, inside Get, it calls to EnsureKeyValidated

if (cookie != null) {
                EnsureKeyValidated(name, cookie.Value);
            }

referencesource/HttpCookieCollection.cs at master · microsoft/referencesource · GitHub

which itself calls to

            _keysAwaitingValidation.Remove(key);

referencesource/HttpCookieCollection.cs at master · microsoft/referencesource · GitHub

_keysAwaitingValidation is a HashSet<string> . That explains the Remove we saw. And that explains why ReaderWriterLockSlim is not enough – changes are made within the supposedly read only action as well.

The only valid solution here is to lock both read and write actions on HttpCookieCollection. However, as HttpCookieCollection is per request, so our lock object should only be per request as well (we certainly do not want every thread to be locked when we get cookie on a request).

Moral of the story:

  • Look closer. There might always be something underneath. Close to the truth is still not the truth.
  • Never assume that a Get method is thread-safe. The implementation can do plenty of unexpected things under the hood.
  • You might ask why this happen, as HttpCookieCollection is per request. Turned out that there is a code that use ThreadPool.QueueUserWorkItem to queue tasks which share the same HttpContext object. As we learned, that’s the recipe for disaster. Think twice (or thrice) before sharing an object that is not thread-safe between threads.

Delete a content – directly from database

Before we even start, I would reiterate that manipulating data directly should be avoided unless absolutely necessary, it should be used as the last resort, and should be proceeded with cautions – always back up first and test your queries on development database first before running it in production. And if the situation dictates that you have to run the query, better do it with the 4 eyes principle – having a colleague double check it for you. When it comes to production database, nothing is too careful.

Now back to the question, if you absolutely have to delete a content, you should do like this

exec editDeletePage @pageId = 123, @ForceDelete = 1

It is basically what Content Clouds (i.e. CMS) does under the hood, without the cache validation on the application layer of course.

So the moral of the story – do everything with API if you can. If you absolutely have to, use the built-in stored procedures – they are tested vigorously and should have minimal issues/bugs, and should take care of everything, data-wise for you. Only write your own query if there is no SP that can be used.

Update: Initially I mentioned Tomas’ post in this, and that gave impression his way is incorrect. I should have written better. My apologies to Tomas

A curious case of cookie threading issue

Threading is hard. It’s hard to get right. It’s hard to avoid race condition. Even with experienced developers, it’s not always a given (trust me, I’ve been there).

This time, the problem comes from a report that a customer constantly has high CPU situation, on all instances. Memory dumps were taken and I was able to take a look. As always, high CPU can be result of several causes, most likely thread deadlocks. For educational purposes, let’s take this memory dump step by step.

First steps are as with routine Windbg – open it. use Ctrl + D to start debugging a memory dump, then you would need to run .loadby sos clr to load the clr runtime

If you are debugging a memory dump that is captured on Azure, .loadby sos clr will not work with this error (if your Windows is installed to C:\ drive)

0:000> .loadby sos clr
The call to LoadLibrary(D:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll) failed, Win32 error 0n126
    "The specified module could not be found."
Please check your debugger configuration and/or network access.

Simply fix it by copy the path to sos.dll, replacing D with C, and rerun

.load C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll

That should fix it.

Next step would be checking the CPU situation by .threadpool command

0:000> !threadpool
CPU utilization: 98%
Worker Thread: Total: 160 Running: 6 Idle: 134 MaxLimit: 32767 MinLimit: 140
Work Request in Queue: 0
--------------------------------------
Number of Timers: 2
--------------------------------------
Completion Port Thread:Total: 6 Free: 4 MaxFree: 8 CurrentLimit: 6 MaxLimit: 1000 MinLimit: 4

This confirms a high CPU situation. It’s however worth noting this CPU is for the entire instance, not necessarily for the process (w3wp) only, but it’s highly likely for an Azure App Service instance that the w3wp is the problem here.

The next step would be checking if there is any long running thread

0:000> !runaway
 User Mode Time
  Thread       Time
   39:1e34     2 days 11:44:59.546
   40:219c     2 days 8:08:56.765
   41:2198     1 days 4:53:46.687
   17:1480     0 days 0:19:38.171
   20:1334     0 days 0:14:57.718
   18:1798     0 days 0:12:53.625
   19:12f8     0 days 0:12:05.015
   31:a9c      0 days 0:03:39.093
   27:1494     0 days 0:01:35.406
   36:1e14     0 days 0:01:28.890
    5:1124     0 days 0:00:38.578
    6:1560     0 days 0:00:37.218
    3:ff8      0 days 0:00:34.984

Normally, the first few threads in the list are not that interesting, they are usually just timers that run from the application start up. But let’s leave no stone unturned, shall we? Let’s switch to the longest running thread

0:000> ~39s
00007ff9`297d9f59 458bf7          mov     r14d,r15d

Hmm, it does not look like a listener. Let’s see what it actually is with !clrstack

0:039> !clrstack
OS Thread Id: 0x1e34 (39)
        Child SP               IP Call Site
000000740273cff0 00007ff9297d9f59 System.Collections.Generic.HashSet`1[[System.__Canon, mscorlib]].Remove(System.__Canon)
000000740273d080 00007ff92a8eb4e3 System.Web.HttpCookieCollection.Get(System.String)
000000740273d0c0 00007ff9314de98d Abck.Web.Features.User.UserContext.get_CountryCurrency()
000000740273d100 00007ff93154367e Abck.Web.Features.Cart.Services.CountryCurrencyService.GetCountrySettingThreeAlpha(Abck.Web.Features.User.UserContext)
000000740273d140 00007ff93153b6dd Abck.Web.Features.Product.ViewModels.Builders.CoreProductViewModelBuilder+d__29.MoveNext()
000000740273d1f0 00007ff93153b54a System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].Start[[Abck.Web.Features.Product.ViewModels.Builders.CoreProductViewModelBuilder+d__29, Abck.Web]](d__29 ByRef)
000000740273d2a0 00007ff93153b492 Abck.Web.Features.Product.ViewModels.Builders.CoreProductViewModelBuilder.GetProductPriceViewModel(System.String, Abck.Web.Features.User.UserContext)

That looks interesting. If we move to the second longest running thread, it looks oddly familiar

0:040> !clrstack
OS Thread Id: 0x219c (40)
        Child SP               IP Call Site
00000074be23d060 00007ff926c82f2e System.Collections.Generic.HashSet`1[[System.__Canon, mscorlib]].Contains(System.__Canon)
00000074be23d0d0 00007ff9307aebcd System.Web.HttpCookieCollection.EnsureKeyValidated(System.String, System.String)
00000074be23d110 00007ff92a8eb4e3 System.Web.HttpCookieCollection.Get(System.String)
00000074be23d150 00007ff9314de98d Abck.Web.Features.User.UserContext.get_CountryCurrency()
00000074be23d190 00007ff93154367e Abck.Web.Features.Cart.Services.CountryCurrencyService.GetCountrySettingThreeAlpha(Abck.Web.Features.User.UserContext)
00000074be23d1d0 00007ff93153b6dd Abck.Web.Features.Product.ViewModels.Builders.CoreProductViewModelBuilder+d__29.MoveNext()

And the third longest running thread

0:041> !clrstack
OS Thread Id: 0x2198 (41)
        Child SP               IP Call Site
00000078597bd8d0 00007ff9297d9f59 System.Collections.Generic.HashSet`1[[System.__Canon, mscorlib]].Remove(System.__Canon)
00000078597bd960 00007ff92a8eb4e3 System.Web.HttpCookieCollection.Get(System.String)
00000078597bd9a0 00007ff9314de98d Abck.Web.Features.User.UserContext.get_CountryCurrency()
00000078597bd9e0 00007ff93154367e Abck.Web.Features.Cart.Services.CountryCurrencyService.GetCountrySettingThreeAlpha(Abck.Web.Features.User.UserContext)
00000078597bda20 00007ff93153b6dd Abck.Web.Features.Product.ViewModels.Builders.CoreProductViewModelBuilder+d__29.MoveNext()
00000078597bdad0 00007ff93153b54a System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].Start[[Abck.Web.Features.Product.ViewModels.Builders.CoreProductViewModelBuilder+d__29, Abck.Web]](d__29 ByRef)

Any guess on what happened?

It looks like what we have a clear case of infinite loop. There were multiple threads trying to get the same cookie, and under the neath, the HttpCookieCollection.Get is not threadsafe, it tries to access an underlying HashSet<T> without properly locks. While one thread tried to remove items from the hashset, another tried to read it, and messed up the internal hash table. All three threads are then doomed in infinite loops, keep running and use up all CPU resources.

The fix in this case would be as simple as a lock around the code to get the CountryCurrency, but it’s best to be careful. Again, when fixing the problem of lack of lock, it’s easy to run into the problem of too much of lock.