Don’t let the execution plan fool you

Don’t get me wrong, the execution plan is one of the best tools at your disposal if you want to optimize a SQL query. No, it is the must-have tool. It is not the only tool you will need, but if you have to pick only one, pick it.

But it is important to know that the execution plan can be misleading. It is very useful for seeing where the bottleneck is within a statement. It is not as useful when you need to compare two statements.

Let’s compare these two queries that I am working on optimizing

SELECT OG.OrderGroupId
FROM OrderGroup OG
INNER JOIN OrderGroup_PurchaseOrder PO ON OG.OrderGroupId = PO.ObjectId
WHERE 1 = 1
  AND OG.Status IN (SELECT Item FROM ecf_splitlist('Cancelled'))
ORDER BY OG.OrderGroupId DESC
OFFSET 0 ROWS
FETCH NEXT 50 ROWS ONLY

versus

SELECT OG.OrderGroupId
FROM OrderGroup OG
INNER JOIN OrderGroup_PurchaseOrder PO ON OG.OrderGroupId = PO.ObjectId
WHERE 1 = 1
  AND OG.Status IN ('Cancelled')
ORDER BY OG.OrderGroupId DESC
OFFSET 0 ROWS
FETCH NEXT 50 ROWS ONLY

These are 99% similar, except for the OG.Status IN ... predicate: one calls the split function and the other does not.

If you look at the execution plan only, it seems the former is much faster than the latter. It accounts for only 14% of the batch cost, while the latter takes 86%, so based on those figures alone we might think the first one is roughly 6 times faster than the second one.

Except it is not. If we turn on the IO statistics, it is a very different story.
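For reference, the statistics below come from enabling IO statistics for the session before running both queries:

SET STATISTICS IO ON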

The first query has significantly more IO operations than the second:

(50 rows affected)
Table 'OrderGroup_PurchaseOrder'. Scan count 0, logical reads 162, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#BA76F977'. Scan count 1, logical reads 8386, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'OrderGroup'. Scan count 1, logical reads 356, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

versus

(50 rows affected)
Table 'OrderGroup'. Scan count 1, logical reads 356, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'OrderGroup_PurchaseOrder'. Scan count 1, logical reads 143, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

The first has slightly more logical reads on OrderGroup_PurchaseOrder, but a massive number of reads on a temp table (which is used inside the ecf_splitlist function).

The moral of the story? The execution plan is helpful, but not for comparing query against query. In most cases, IO statistics are much more useful.

Migrate Catalog content properties

A colleague asked me yesterday – how do we migrate properties of catalog content? There is, unfortunately, no official way to do it. There are several unofficial ways, however, and today we will explore the one I personally recommend for its safety and backward compatibility.

Let’s say we have a FashionProduct with an MSRP property of type Money, and now we want to change it to Decimal. There are some hacky ways to do this, but all of them require direct database manipulation, which we should avoid if possible.

First we will need this piece of code. It was “stolen” from a colleague of mine and has been used countless times. You probably want to bookmark it, as it’ll likely be useful in the future (I should probably do so myself, as I have to find it every time I need it). It is a snippet that traverses the catalog structure and returns the entries of the content type you want.

public virtual IEnumerable<T> GetEntriesRecursive<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : EntryContentBase
{
    foreach (var nodeContent in LoadChildrenBatched<NodeContent>(parentLink, defaultCulture))
    {
        foreach (var entry in GetEntriesRecursive<T>(nodeContent.ContentLink, defaultCulture))
        {
            yield return entry;
        }
    }

    foreach (var entry in LoadChildrenBatched<T>(parentLink, defaultCulture))
    {
        yield return entry;
    }
}

private IEnumerable<T> LoadChildrenBatched<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : IContent
{
    var start = 0;

    while (true)
    {
        var batch = _contentLoader.GetChildren<T>(parentLink, defaultCulture, start, 50);
        if (!batch.Any())
        {
            yield break;
        }

        foreach (var content in batch)
        {
            // Don't include linked products to avoid including them multiple times when traversing the catalog
            if (!parentLink.CompareToIgnoreWorkID(content.ParentLink))
            {
                continue;
            }

            yield return content;
        }
        start += 50;
    }
}

To make sure we don’t load too much content at once, the batch size is set to 50, but that is of course configurable (up to you)!

Now the fun part, where the actual work happens. Once we have the content, we need to migrate the data, which can be as simple as this

private void MigrateProperty<T>(IEnumerable<T> contents) where T : EntryContentBase
{
    var batch = new List<T>();
    foreach (var content in contents)
    {
        var writableClone = content.CreateWritableClone<T>();
        Transform(writableClone);
        batch.Add(writableClone);
    }
    _contentRepository.Publish(batch, PublishAction.SyncDraft);
}

In the Transform method you can do whatever you want with the property value. If you just want to rename the property, it can do nothing except assign the value to the new property. Or, in the case we mentioned at the beginning, converting Money to Decimal is an easy task (Money is the lower-precision version of Decimal). Note that if you convert between data types, for example from double to int, there is potential data loss, but you are probably aware of that already.
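As a minimal sketch for the MSRP example, assuming the new Decimal property is called MsrpAmount (that name is made up for illustration):

private void Transform(EntryContentBase content)
{
    if (content is FashionProduct product)
    {
        // Copy the old Money value into the new decimal property.
        // MsrpAmount is a hypothetical name for the new Decimal property.
        product.MsrpAmount = product.MSRP.Amount;
    }
}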

The final step is to publish the changes. For performance reasons, it is probably best to use the Publish extension method of IContentRepository and save multiple contents in one batch – maybe of size 50 or 100. It will skip things like creating new versions for optimal performance. You can read about it here: New simple batch saving API for Commerce | Optimizely Developer C

The remaining question is where to put it. In a perfect world, I’d say in a migration step (i.e. a class that implements IMigrationStep), so you ensure that your data is properly migrated before anything else runs, for example your new code that accesses the new property, or the indexing of your content after migration. But if you have a sizeable catalog, this will take time, and it might not be a good idea to let your users wait for it to complete. For that, it makes sense to do this in a scheduled job, and when it completes, you make the switch.
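A rough sketch of such a scheduled job, assuming the two snippets above live in a (hypothetical) CatalogMigrationService with MigrateProperty made accessible, and that "en" is the relevant culture:

[ScheduledPlugIn(DisplayName = "Migrate FashionProduct MSRP property")]
public class MigrateMsrpPropertyJob : ScheduledJobBase
{
    private readonly ReferenceConverter _referenceConverter;
    // Hypothetical service holding GetEntriesRecursive and MigrateProperty from above
    private readonly CatalogMigrationService _migrationService;

    public MigrateMsrpPropertyJob(ReferenceConverter referenceConverter, CatalogMigrationService migrationService)
    {
        _referenceConverter = referenceConverter;
        _migrationService = migrationService;
    }

    public override string Execute()
    {
        // Traverse the whole catalog and migrate every FashionProduct found
        var entries = _migrationService.GetEntriesRecursive<FashionProduct>(
            _referenceConverter.GetRootLink(),
            CultureInfo.GetCultureInfo("en"));

        _migrationService.MigrateProperty(entries);
        return "MSRP migration completed";
    }
}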

Migrating properties is not a quick task, but it can be done with relative ease. It is also a reminder about modeling – try to get it right from the beginning so we don’t have to migrate at all. In the end, the fastest code is the code that does not need to run!

Switching away from serializable cart mode

If you are using Optimizely Customized Commerce, the common wisdom is that you should be using serializable cart mode. It’s not perfect (we discussed the drawbacks in, um, my book), but generally it has performance benefits. But if for any reason you need to use the legacy cart mode, there is a switch you can use – IFeatureSwitch – which can change between those modes.

It is important to remember that IFeatureSwitch only switches between the two modes; it does not migrate any carts between them. There is a one-time cart migration job, but that’s it.

To change the cart mode, it is as simple as this

GetInstance<IFeatureSwitch>().DisableFeature(SerializedCarts.FeatureSerializedCarts);

However, there is a catch here.

To let IOrderRepository use the correct cart system, there is an ICartProvider, which will be either CartProvider or SerializableCartProvider. The problem is that this is decided much earlier in the pipeline than IInitializationModule. In fact, it is determined in IConfigurableModule.ConfigureContainer, which runs before any IInitializationModule.Initialize. Even if we call DisableFeature in another ConfigureContainer, there is no guarantee that our code will be called before the framework code (the order of ConfigureContainer execution is indeterministic).

Fortunately, we can do it inside Startup.Configure. Due to how the feature switch data is structured, it’s not as simple as adding a setting in appsettings.json, but it can be done easily in code:

            services.Configure<ApplicationOptions>(x =>
            {
                x.Features[SerializedCarts.FeatureSerializedCarts] = new ApplicationFeature
                {
                    Feature = "SerializedCarts",
                    State = FeatureState.Disabled,
                    Type = "Mediachase.Commerce.Core.Features.SerializedCarts, Mediachase.Commerce"
                };
            });

Of course, this is a workaround. The feature switch should work as documented, and this will be fixed in the future.

Command timeout for Commerce 14

While we always want to have fast database queries, that is not achievable all the time. Sometimes we need to run slow queries, and we need to tell the underlying framework that the query can take some time to complete and we’re fine with it. Otherwise, it will try to terminate the query after 30 seconds (the default timeout limit).

There is a difference between connection timeout and command timeout. Connection timeout is the time .NET will try to connect to the database before giving up. Command timeout is the time .NET will try to execute a command before giving up.

With Commerce 13, we had a setting added in 9.23.1, as discussed here: Episerver Commerce commandTimeout configuration – Quan Mai’s blog (vimvq1987.com). In Commerce 14, however, it’s … different.

Things are a bit complicated when it comes to command timeout with .NET 5 and up. With later versions of Microsoft.Data.SqlClient, it is possible to set the command timeout directly in the connection string. It is indeed a simple way to do it, but with a caveat.

The new setting is not recognized by Entity Framework/Entity Framework Core, which will throw an exception if you try to use a connection string containing the command timeout setting. EF has its own way to set the command timeout per DbContext, but it does not accept the setting via the connection string: it will throw “Keyword not supported: ‘command timeout’” if such a setting is present.

The workaround is to configure the command timeout on the EcfSqlConnection connection string, and use a separate connection string without the command timeout just for Entity Framework.
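As a sketch, the two connection strings in appsettings.json could look something like this (EcfSqlConnection is the Commerce connection string; the second connection string name, the server/database values, and the 120-second timeout are just examples):

"ConnectionStrings": {
  "EcfSqlConnection": "Server=.;Database=CommerceDb;Trusted_Connection=True;Command Timeout=120",
  "EntityFrameworkConnection": "Server=.;Database=CommerceDb;Trusted_Connection=True"
}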

However, that comes with a caveat: setting the command timeout in the connection string means the value applies to all database queries. As we discussed in the previous post above, doing so is not without drawbacks – it hides slow queries rather than letting them fail. A failed query might not bring down your website, but an overloaded database likely will.

In Commerce 14.15.6, which should be released shortly, we introduce a new setting, SqlDataProviderOptions.CommandTimeout, which lets you set the command timeout for queries that use SqlDataProvider – most notably the MetaDataPlus system, like orders.

The important advantage of the new setting is that you can set it on the fly. If you know that some operation will be slow, you can set the command timeout to a higher value just for that operation, then set it back to the default. In most cases you can leave it at the default value (30 seconds) and optimize at the application level (reduce batch size, for example) or the database level (rewrite the query, add/optimize indexes, etc.). But sometimes you know a query will be slow and you acknowledge that – this is the tool for those cases.

Green coffee bean suppliers in EU

Why the criteria

FOB price: one of the important aspects of specialty coffee is fair trade – farmers produce high quality coffee, and they should get a fair price for what they harvest. The FOB price is basically what the importer (in many cases, the supplier themselves) pays for the coffee. It’s a great deal of transparency, and while it’s not required, I would be happy to know what the farmers are getting.

Cupping score: coffee that is rated as specialty has to score at least 80 points in SCA cupping (it’s not the only criterion, but an important one). If you are paying 200/20 EUR or more per kg for green coffee beans, you should know what you get. A higher cupping score is not necessarily more expensive, as it depends a lot on the supply-demand balance, but I’d be happier to pay more money for a higher cupping score.

Year of harvest: green coffee beans have a much longer shelf life than roasted ones. While you should finish a bag of roasted beans within 2 months (and some purists might say just 1 month), green beans can be kept for 12 months in proper storage conditions, if not more. But they do not last forever, and the fresher the beans, the better, which is why it’s important for the supplier to disclose the year of harvest (or even the month of harvest).

Suppliers in Sweden

I started with Whiteelkcoffee and Rawcoffee. It is too early to tell about the beans’ quality, but I must say I really like Whiteelkcoffee’s packaging. They are a branch/subsidiary of Kafferosteriet Koppar AB, which has some great coffees.

It is said that almost any roaster will sell you raw coffee if you ask nicely, so you could ask your favorite roaster if they have green beans to sell. The commonly accepted “rule” is that green coffee is half the price of roasted, so you might offer that to the roaster.

I would recommend buying from suppliers that focus on coffee beans – they usually know more about their stuff – rather than from suppliers that sell general groceries. But you can always try your luck, maybe there is a hidden gem somewhere.

| Site | Url | Free shipping threshold | Bag size | FOB price | Cupping score | Year of harvest | Note |
|------|-----|-------------------------|----------|-----------|---------------|-----------------|------|
| White elk coffee | Butik: Råkaffe – White Elk Coffee | 350kr | 1, 10kg | No | No | No | 10% for order over 1000kr |
| Raw coffee | Kaffebönor – RawCoffee | 350kr | 1kg | No | Yes | Yes | 10% coupon code for next order |
| Drop coffee | Buy Coffee – scroll down to buy your coffee – Drop Coffee | 300kr | 1kg | Yes | Not specific, 86+ | Yes | Select unroasted coffee from drop downlist |
| Muttley & Jack’s | Ursprungskaffe – Muttley & Jack’s (muttleyandjacks.se) | No free shipping, 69kr regardless of order value | 1kg | No | Yes | No | |
| Baristashopen | RÅKAFFE – ROSTA SJÄLV – Baristashopen | 350kr | 250gr, 1kg | No | No | No | |
| Moccazino | Huvudkategori 3 (moccazino.se) | No free shipping, min 229kr, shipping cost increased with weight | 0,75kg | No | No | Yes | |
| Humlegården | Green coffee for those who roast their own coffee – Humlegårdens Ekolager | No free shipping, 99kr shipping, increased with weight | 500gr | No | No | Yes | Reduced price for 10 bags |

Suppliers outside Sweden

Of course I would prefer buying from Sweden for obvious reasons. But I can’t resist the urge to try some exotic beans, so I have been looking around to see where I can source green beans from within the EU.

This is inspired by EU Green Coffee Sources List 2023 (home-barista.com), but with more information collected in one place. Sadly, many of them do not ship to Sweden, but you can probably arrange shipping if you ask nicely.

| Site | Url | Country | Ship to Sweden | Shipping cost | FOB price | Cupping score | Bag sizes | Year of harvest |
|------|-----|---------|----------------|---------------|-----------|---------------|-----------|-----------------|
| fleurdecafe | Unroasted coffee beans – Green coffee beans \| Fleur de café (fleurdecafe.nl) | Netherlands | No | N/A | No | No | | |
| groenekoffiewinkel | Unroasted Coffee Beans \| Up To 15% Discount At Green Coffee Shop (groenekoffiewinkel.nl) | Netherlands | No | N/A | | | | |
| fascino-coffee | Green Coffee Beans \| Coffee roasting company Fascino (fascino-coffee.com) | Netherlands | No | N/A | | | | |
| godincoffee | Unroasted coffee beans – Green coffee beans \| Fleur de café (fleurdecafe.nl) | Netherlands | No | | | | | |
| roastrebels | Buy green coffee in small quantities (1kg / 5kg / 15kg) \| Roast Rebels | Germany/Swizerland | Yes | 9.9EUR, free if order value is 120EUR or more | No | No | 1, 5, 15kg | Yes |
| rohebohnen | Green Coffee \| Rohebohnen.de | Germany | Yes | 13.9EUR | No | No | 500gr, 1kg | No |
| hoofdkwartier | Unroasted Coffee Beans Archives – Headquarters Coffee Roasting (hoofdkwartier-koffiebranderij.nl) | Netherlands | No | | | | | |
| Shokunin | Green coffee – Shokunin | Netherlands | Yes | 22 EUR, free if order value is 200 EUR or more | No | No | | |
| Roestart | Rohkaffee (roestart.de) | Germany | Yes | 15.9 EUR | No | No | 500gr, 1kg | No |
| goodkarmacoffee | Rohkaffee – Good Karma Coffee | Germany | No | | | | | |
| docklands | Rohkaffee kaufen und selber rösten (docklands-coffee.de) | Germany | Yes | 16EUR | No | No | 1,5 and 10kg | Yes |
| rjavitukan | Surova kava Archives – Rjavi tukan | Slovakia | Yes | 21.9EUR | No | No | 1,3,5, and 10kg | No |
| koffiebranderijdekoepoort | GROENE ONGEBRANDE KOFFIEBONEN – Koffiebranderij de Koepoort | Netherlands | Yes | Flat rate 24.5 EUR | No | No | 1kg | No |
| Falcon-micro | Shop Europe – Falcon Micro (falcon-micro.com) | UK (EU warehouse) | Yes | 246kr for minimum 5kg, increased with weight | No | Yes | 5 and 10kg | No |
| 88 grains | Shop \| 88 Graines | France | Yes | 6.9EUR minimum, increased with weight, free for 300 EUR+ | No | Yes | 1kg | Yes (Month/year) |
| kaffebox | Kaffeabonnement med Skandinavias topp mikrobrennerier – KaffeBox.no | Norway | Yes | Minimum 62.5NOK, increased with weight | No | No | 250gr | No |
| risteriet | Grønne/Rå kaffebønner – Risteriet.dk Webshop | Denmark | Yes | No, 350DKK | No | No | 1kg, 5kg | No |

Cheap coffee stuff from China – a review

A word of warning: buying stuff from China means long shipping times, and you will have almost no support or customer service (including warranty) ever. If things go wrong during transport, there is little you can do: very cheap items are not trackable once they leave China, so the money is potentially lost.

Timemore Black Mirror Basic Plus – $40/400kr

This is the most “luxurious” item I bought, and I think it’s well worth it. It’s well built, it’s fast, it’s accurate. Maybe it’s not as good as an Acaia – I have never been able to justify spending $200 on a scale – but I’d say it’s more than enough. It does not really matter if your espresso is off by a few tenths of a gram.

My rating: Buy!

Spouted cup – 70kr

Once upon a time I made double shots for me and my wife. I used the spouted portafilter to divide the espresso into 2 cups, nice and easy. But that poses 2 problems: first, I lose the fantastic view of a bottomless portafilter extraction. Second, the spouted portafilter is a PITA to clean properly. So I tried a different option – a spouted cup.

You can see in the photo above a cup with spouts that make it easier to divide the espresso. It works, much better than a normal cup. But it is also thinner and loses heat much faster.

We have now switched to 2x18gr shots every time I make coffee, so this cup just sits around idle, as it should.

My rating: Buy if no other option is available to you.

Bottomless portafilter and balanced tamper

The balanced tamper is meant to fix uneven tamping – with a traditional tamper you might tilt it a bit, i.e. it is not completely level, and that might result in an uneven tamp. With the balanced tamper, the plate sits on the wall of the portafilter while the base does the tamping. Because of that, you are guaranteed a level tamp every time.

The balanced tamper is very nice and I liked it a lot. But it has one design flaw – coffee grounds get in between the base and the outer plate. You will have to clean it as often as daily.

My rating:

Bottomless portafilter: Skip. Save up and buy a nicer one.

Balanced tamper: Buy if you can keep up with the cleaning.

No name coffee scale – 20kr (96% off)

I am happy with my Timemore, but I hate moving it back and forth between the grinder and the espresso machine, so I bought another one just for weighing coffee beans – because of a crazy deal on Temu. It is a copy of the Black Mirror but smaller. The scale is quite flimsy and not intuitive to use – you have to hold down the power button for a few seconds to turn it on. It is fairly accurate but slow to respond, and despite being touted as a coffee scale, there is no silicone pad to protect it from heat.

For 20kr, because I got the 96% off discount for my first order on Temu, it’s OK. No way I would buy this otherwise. Certainly not at the “full” 250kr price.

My rating: Avoid. Buy Timemore.

Coffee bean dosing bowl – 65kr

When you get into single dosing, a dosing bowl is a must – it is nice to pour beans into it and then pour them into the grinder. I ordered one, but it arrived broken (who could have thought that china would not withstand the shocks of ~10,000km of travel without a lot of wrapping?).

The bowl looks good in photos and seems practical. And in the end, China is known for its china, so what could go wrong? Well, it’s well made, but with one design flaw – the lip is very low, so beans jump out of the bowl when you pour them in. Not many, but 1 bean out of the bowl is 1 bean too many. The bowl was probably meant for tea (which is not as jumpy as coffee beans).

If you compare the design of this bowl

With the equivalent of Loveramics:

You can clearly see the difference. Loveramics obviously thought about the issue, and their design is meant to fix it! I’m ordering the Loveramics ones, although they are much more expensive!

My rating: Avoid. Buy Loveramics.

WDT – 100kr

You can see this WDT tool in some of the photos above – I actually bought it for much less, but the price you can get now is closer to 100kr. It is as simple as some long, thin needles attached to a base. Surprisingly, it works well to distribute the coffee grounds. This is one thing you should own, and because it is so simple, you can’t go wrong. It is one thing you can buy from AliExpress without much thinking.

My rating: Buy!

Dosing cup – 70kr

When I decided to try single dosing on my Eureka Mignon Specialita, I bought two things: the hopper and the dosing cup.

The dosing cup allows you to grind into it, maybe give it a few shakes, then pour the grounds into the portafilter. It is easier to use in cases where you can’t hold the portafilter under the grinder, and the shakes are roughly equivalent to using WDT (though some people still use WDT after that), so it has some value. However, this dosing cup has marks inside which let coffee grounds stick. You will have to clean it daily to avoid build-up.

Once you get hold of a Niche Zero dosing cup, you immediately notice the difference in craftsmanship and finish. It is much better built, and it is entirely smooth inside. It’s an unfair comparison because the NZ dosing cup is $39.99 without shipping, but as you only need to buy it once, maybe save up for that if you need a dosing cup.

Single Dosing Hopper – 200kr

The idea is that by slamming on the cover, you force the remaining coffee grounds out of the burrs. It is pretty well made and fits my Specialita well (and was advertised to fit any Eureka Mignon grinder). However, it has a bad plastic smell – not really strong, but definitely there – which made me question whether it is food safe. It works, but I hate the smell. The main problem is that the Eureka Mignon Specialita is not designed to be a single dosing grinder, so while it works to some extent, the workflow is not smooth or intuitive.

Closing thoughts

So my advice when it comes to ordering cheap coffee stuff from China (or from Chinese sellers on Amazon.se) is … don’t. If you have to, stick with established brands like Timemore. The others are cheap for a reason, and you shouldn’t expect them to feel nice or perform well.

Performance optimization – the hardcore series – part 4

Let’s take a break from memory allocation and do some optimization on another aspect that is just as important (if not more so) – the database.

We all know that database queries play an essential part in any serious app. It’s almost a given that if you want your app to perform well, your database queries must also perform well. And for them to perform well, you need things like proper design (normalization, references, etc.), properly written queries, and proper indexes. In this post, we will explore how an index can improve query performance, and how we can do even better.

Let’s start with a fairly simple table design

CREATE TABLE [dbo].[UniqueCoupon](
	[Id] [int] identity primary key clustered, 
	[PromotionId] [int] NOT NULL,
	[Code] [nvarchar](10) NOT NULL,
	[ExpiredOn] [datetime] NULL,
	[Redeemed] [bit] NULL
) ON [PRIMARY]

Nothing extraordinary here, pretty common if you ask me. Now for testing purposes, let’s insert 1.000.000 rows into it

INSERT INTO dbo.[UniqueCoupon] (PromotionId, Code)
SELECT
	FLOOR(RAND()*(100)+1),
	SUBSTRING(CONVERT(varchar(255), NEWID()), 0, 7)
GO 1000000

We need to query data by the code, so let’s create a user-defined table type

CREATE TYPE CouponTable AS TABLE (
    Code NVARCHAR(10));

Time to run some queries against the data; let’s go with this

SELECT Id, PromotionId, Code, ExpiredOn, Redeemed FROM dbo.UniqueCoupons
                                                                    WHERE PromotionId = @PromotionId AND Code in (SELECT Code FROM @Data)

This is the complete script, as we need some data

declare @data CouponTable
insert into @data
select top 10 Code from dbo.UniqueCoupon
where PromotionId = 36

SELECT Id, PromotionId, Code, ExpiredOn, Redeemed FROM dbo.UniqueCoupon
WHERE PromotionId = 36 AND Code IN (SELECT Code FROM @data)

As we have learned that the execution plan is not a good way to compare performance, let’s use the statistics, our trusted friends

set statistics io on
set statistics time on

And this is what we get with our default setup (i.e. no index)

(10 rows affected)
Table '#AEDEED61'. Scan count 1, logical reads 1, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'UniqueCoupon'. Scan count 9, logical reads 7070, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

If you are somewhat experienced with SQL Server, you might guess it would not be exactly happy, because, obviously, an index is needed. As we query on PromotionId, it does make sense to add an index on it, and SQL Server does suggest one.

If we just blindly add the index suggested by SQL Server, we already get a decent improvement.
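The suggestion looks roughly like this (a sketch: the index name is made up, and the included columns are inferred from the columns the query selects):

CREATE NONCLUSTERED INDEX IX_UniqueCoupon_PromotionId
ON dbo.UniqueCoupon (PromotionId)
INCLUDE (Code, ExpiredOn, Redeemed)

With that index in place, the statistics look like this: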

(10 rows affected)
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'UniqueCoupon'. Scan count 1, logical reads 53, physical reads 0, page server reads 0, read-ahead reads 5, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#A7AA9B2B'. Scan count 1, logical reads 1, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

But can we do better?

If we look at the index, there’s something not quite optimal about it – we query by both PromotionId and Code, so it does not really make sense to have Code only as an included column. How about we put the index key on both PromotionId and Code?
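Something like this, as a sketch (again, the index name is made up; the other columns are kept as included columns, for the reason explained below):

CREATE NONCLUSTERED INDEX IX_UniqueCoupon_PromotionId_Code
ON dbo.UniqueCoupon (PromotionId, Code)
INCLUDE (ExpiredOn, Redeemed)

The statistics now look like this: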

(10 rows affected)
Table 'UniqueCoupon'. Scan count 10, logical reads 30, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#A1F9F38F'. Scan count 1, logical reads 1, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

And it is indeed better! Going from 53 to 30 logical reads might not sound like a lot, but if you have thousands of queries every hour, it will be fairly significant.

Prepare yourself for some pleasant surprises – when we eventually applied the change to an actual database, the improvement was staggering, much more than what we hoped for. The queries that used to run for 24 hours in total every day now take less than 10 minutes (yes, you read that right, 10 minutes).

At this point you can certainly be happy and move on. But can we do better, for the sake of curiosity? Yes, we can.

SQL Server is smart enough to know that we are selecting the other columns as well, so it includes them in the index to avoid key lookups. Let’s see if we can remove them, and how the query performs then:

(10 rows affected)
Table 'UniqueCoupon'. Scan count 10, logical reads 60, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#B1996E94'. Scan count 1, logical reads 1, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

So it is indeed worse: a key lookup is performed for every row (SQL Server uses the index to track down the rows, then reads the other columns from the clustered index).

There are two ways to get rid of those key lookups – include the columns in the index itself, or, more dramatically, make the index clustered. As the data is accessed by PromotionId and Code, that makes perfect sense.

It is a common belief that the identity column should be the clustered index – it is unique, it is not null. However, that only makes sense if it is the most heavily accessed column. In this case, Id only serves as an identity column; it does not need to be the clustered index (although being unique means it will still have a nonclustered index).

ALTER TABLE [dbo].[UniqueCoupon] DROP CONSTRAINT [PK__UniqueCo__3214EC0744C2FF38] WITH ( ONLINE = OFF )
GO

ALTER TABLE [dbo].[UniqueCoupon] ADD PRIMARY KEY NONCLUSTERED 
(
	[Id] ASC
)
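With the primary key now nonclustered, the clustered index can be created on the two columns we actually query by (a sketch; pick your own index name, and drop the earlier nonclustered index on the same columns if you created it):

CREATE CLUSTERED INDEX CIX_UniqueCoupon_PromotionId_Code
ON dbo.UniqueCoupon (PromotionId, Code)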

Does this bring a dramatic performance change? Unlikely; my tests show no improvement in the statistics. However, there is one important impact here: we significantly reduced the total size of the indexes on the table. (data incoming)

Moral of the story

  • Indexes are crucial.
  • You can almost always do better than the auto suggested indexes.
  • Real test is the only true validation.

Building a better wish list – part 1

If you have been using Optimizely Customized Commerce, you probably know that, by default, the wish list is just a cart with a special name. Can you guess the name? Surprise, surprise, it’s “Wishlist”. It’s been there since forever, from the early days of Mediachase, and was then carried over to the new serializable cart. I have been “fine” with it – i.e. I accepted the approach without thinking much about it. But only very recently did I realize there are several problems with the approach.

Why is it not a very good idea?

First of all, it shares the same table as the normal carts. To search for abandoned carts, you would have to skip the carts with the “Wishlist” name. As there are only a few cart names and they are not evenly distributed, you will have a hard time filtering carts by their names.

But there is more. As most customers are using the serializable cart mode now, ever-growing wishlists also pose another problem – each operation on the wishlist, adding or removing an item, results in a big write to the SerializableCart table. If you have just a few items it might be fine, but a simple test on Commerce shows that with only 9 items in a wishlist, the Data column is more than 2700 characters. And wishlists are meant to be kept forever – they will only grow in size.

My saved-for-later list on Amazon – which is the closest thing to a “wish list”. Imagine having that on Optimizely Customized Commerce.

As wishlists are carts, they have to be in the same format, even though a lot of the fields are redundant/unnecessary.

The biggest benefit of the default wishlist implementation – and I think it trumps all the disadvantages we have listed – is that it’s built in. You can start using it with almost no additional effort: get a cart with the predefined name and you are good to go. Building a different wish list definitely costs time and resources, a luxury not everyone can afford.

For that reason, I have started building a wish list service in my free time. I plan to make it open source when the time is right, but we’ll see about that.

Moral of the story

  • It is critical to take a step back, from time to time, to think about what you have done. Things might make less sense when you see them from a different perspective.
  • You can almost always do better.

Cleaning/maintenance routine for espresso machines

“There is no such thing as a too-clean espresso machine.” That is my favorite quote when it comes to cleaning espresso machines and equipment. When you use your machine, coffee grounds and oil build up, and they can, and will, affect the taste of your espresso. Why spend thousands of USD on a fancy machine, and a few dozen for each bag of specialty coffee, without getting the best out of them?

Properly cleaning and maintaining the machine also helps prolong its life and increases your enjoyment of using it.

For every machine

Keep your group head clean after each use. There are several ways of doing that, and you can use whichever combination of them you like best:

  • Draw some hot water from the group head with an empty portafilter to clean any debris remaining
  • Use a paper filter or a puck screen. This prevents the coffee ground from being attached to the group head.
  • Wipe the group head with a wet cloth (preferably microfiber) after the shot
  • Use this fancy tool from Espazzola to clean it up.

You will also need to backflush – i.e. use a blind basket, a basket without holes so the water can’t escape. The water flows back into the machine and escapes through the OPV (over-pressure valve), bringing with it any coffee grounds and oil inside the group head. Each type of group head needs a different backflush schedule – more on that later.

For milk wand

  • Purge the wand before each use.
  • Wipe the wand right after frothing – immediately if yours is not a no-burn wand, i.e. one that gets very hot to the touch. Otherwise the milk will be baked on and very hard to remove.
  • Purge the wand as soon as possible after each froth.
  • If your milk wand has a removable tip, remove it once a month to check for blockages

For equipment

If you are using a bottomless portafilter, either wipe it or rinse it under running water after each use to remove any stuck coffee grounds. One quick way to check if the basket is clean is to wipe it with a piece of household paper: if it comes out clean, you are good; if it comes out black, you need to clean a bit more.

If you are using a normal portafilter with a spout, pop the basket out and clean both it and the portafilter (if you have never done it, you might be surprised, yuck!). This is also one of the reasons I prefer the bottomless one.

Every week, soak the equipment that has been in contact with coffee grounds in a detergent that can clean coffee oil. I recommend pulycaff as it’s effective, safe (it’s NSF certified), and cheap to use. Add 10gr of pulycaff to 1 liter of hot water, stir it well, soak your equipment for 15 minutes, then clean and rinse it thoroughly.

For Integrated/saturated group head

These group heads can be backflushed as often as needed.

  • Once every week, use 3-4gr of pulycaff in a blind basket, and draw a few shots until the pulycaff has dissolved, then draw a few more until the water in the blind basket is “clean”. Remove the blind basket, and draw a few more shots without the portafilter locked in.
  • Every 3 months, or more often, open the shower head and clean it (tip: make sure the group head has cooled down and is completely comfortable to touch; it can retain heat for a long time).
  • Change your gasket every year if it is rubber (as it degrades with heat), or every other year if it is silicone. That is just a guideline: check whether it has become hard and lost its elasticity.

For E61 group head

The E61 group head needs lubrication with food-grade silicone grease, and backflushing with pulycaff washes that away, so you need to be conservative with it. Instead:

  • Backflush with water only after the final shot of the day.
  • Backflush with pulycaff every other month, then grease your lever. If you do not, your lever will be squeaky, it will feel tight to open/close, and it will wear much faster.
  • Open your shower head every week and clean it up. Use a spoon and gently remove the shower head. If you have a hardened rubber tool to avoid scratches, even better.
  • Change your gasket every year if it is rubber (as it degrades with heat), or every other year if it is silicone. That is just a guideline: check whether it has become hard and lost its elasticity.

Descaling

Limescale is the #1 enemy of espresso machines, especially dual boiler ones with a steam boiler – as the water boils, it leaves its minerals behind, the TDS in the water increases, and the chance of limescale build-up gets higher.

  • If your water is relatively soft, always use the water softener and change it when it is used up.
  • If your tap water is very hard, you might need some other option instead of using it directly. You might have to use distilled water + added minerals (distilled water does not taste good, and it can also be harmful to the electronic components in the boilers: certain sensors rely on the ions available in the water to work properly).
  • Draw 200ml of water from the hot water tap every day to increase the water exchange, and use it to heat your cup. Don’t draw too much, as that can expose the heating element to the air and fry it. This ensures that your steam boiler gets fresh water every day and avoids a high mineral concentration.
  • Descale according to the manufacturer’s guidelines. NOTE: be more cautious if one or both of your boilers are brass, as descaling chemicals can harm them.

Routines

Each use

Draw some water from the steam boiler if you have a dual boiler machine

Clean group head and portafilter

Wipe and purge milk wand

Wipe splashes of coffee (from channeling) or milk (from frothing) if any

Every day

For E61: backflush with water only after last pull of the day

Weekly (or every 3 days, depending on your usage)

Soak portafilter, basket etc. in pulycaff solution, and clean them thoroughly

Clean the drip tray

For saturated group head: backflush with pulycaff

For E61: remove and clean shower head

Every other week

Clean the water tank with some dish soap, rinse it thoroughly

Every other month

For E61: backflush with pulycaff, then lubricate the lever

Every 3 months

For saturated group head: remove and clean the shower head

For E61 with flow control: lubricate the o-rings of the flow control

Every year

Check gaskets and replace them if they have become hard

Remove the cover and check the internals for any signs of leaks

Every other year

Consider descaling if necessary

Performance optimization – the hardcore series – part 3

“In 99% of the cases, premature optimization is the root of all evil”

This quote is usually attributed to Donald Knuth, who is regarded as the “father of the analysis of algorithms”. His actual quote is a bit different:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3%.

If you have read my posts, you know that I always ask you to measure your application before diving into optimization. But that’s not the whole story. Without profiling, your optimization effort might be futile. Still, there are things you can “optimize” right away without any profiling – because they are easy to do, they make your code simpler and easier to follow, and you can be certain they are faster.

Let’s see if you can spot the potentially problematic piece of code in this snippet

public Something GetData()
{
    var market = list.FirstOrDefault(x => x.MarketId == GetCurrentMarket().MarketId);

    // do some stuff

}

If you are writing similar code, don’t be discouraged. It’s easy to overlook the problem: when you call FirstOrDefault, you actually iterate over the list until you find the first matching element. And for each and every element checked, GetCurrentMarket() will be called.

We can’t be sure when we will find the matching element – it might be the first element, or the last, or anywhere in between, or it might not exist at all. On average, GetCurrentMarket will be called half as many times as the size of the list.

We don’t know whether GetCurrentMarket is a very lightweight implementation, or whether list is a very small set, but we do know that if this is in a very hot path, the cost can be (very) significant. These are the allocations made by said GetCurrentMarket:

This is a custom implementation of IMarketService – the default implementation is much more lightweight and should not be of concern. Of course, fewer calls are always better – no matter how quick something is.

In this specific example, a single call to get the current market, stored in a local variable and used for the scope of the entire method, should be enough. You don’t need profiling to make such an “optimization” (and as we proved, profiling only confirms our suspicion).
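A minimal sketch of the fix, reusing the hypothetical names from the snippet above:

public Something GetData()
{
    // Call GetCurrentMarket() once and reuse the result for the whole method,
    // instead of once per element examined by FirstOrDefault
    var currentMarketId = GetCurrentMarket().MarketId;
    var market = list.FirstOrDefault(x => x.MarketId == currentMarketId);

    // do some stuff

}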

Moral of the story

  • For optimization, less is almost always more.
  • You should definitely profile before spending any considerable amount of time optimizing your code. But there are things that can be optimized right away, without profiling. Make them your habit.