.NET developers have been transitioning from synchronous APIs to asynchronous ones for a while now. The move was boosted a lot by the async/await keywords in C# 5.0, but we are now in a dangerous middle ground: there are as many synchronous APIs as there are async ones. Mixing them requires the ability to call async APIs from a synchronous context, and vice versa. Calling synchronous APIs from an async context is simple – you can fire up a task and let it do the work. Calling async APIs from a sync context is much more complicated. And that is where AsyncHelper comes into play.
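To illustrate the easy direction, a minimal sketch could look like this (LoadReportFromDisk is a made-up synchronous API, purely for illustration):

using System.IO;
using System.Threading.Tasks;

public class ReportLoader
{
    // The "easy" direction: offload the blocking, synchronous call to the thread pool and await it.
    public async Task<string> LoadReportAsync(string path)
    {
        return await Task.Run(() => LoadReportFromDisk(path));
    }

    // A made-up synchronous API, included only to make the example complete.
    private string LoadReportFromDisk(string path) => File.ReadAllText(path);
}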
AsyncHelper is commonly used to run async code in a synchronous context. It is a simple helper class with two methods to run async APIs synchronously:
using System;
using System.Globalization;
using System.Threading;
using System.Threading.Tasks;

internal static class AsyncHelper
{
    private static readonly TaskFactory _myTaskFactory = new TaskFactory(CancellationToken.None,
        TaskCreationOptions.None, TaskContinuationOptions.None, TaskScheduler.Default);

    public static TResult RunSync<TResult>(Func<Task<TResult>> func)
    {
        var cultureUi = CultureInfo.CurrentUICulture;
        var culture = CultureInfo.CurrentCulture;
        return _myTaskFactory.StartNew(() =>
        {
            Thread.CurrentThread.CurrentCulture = culture;
            Thread.CurrentThread.CurrentUICulture = cultureUi;
            return func();
        }).Unwrap().GetAwaiter().GetResult();
    }

    public static void RunSync(Func<Task> func)
    {
        var cultureUi = CultureInfo.CurrentUICulture;
        var culture = CultureInfo.CurrentCulture;
        _myTaskFactory.StartNew(() =>
        {
            Thread.CurrentThread.CurrentCulture = culture;
            Thread.CurrentThread.CurrentUICulture = cultureUi;
            return func();
        }).Unwrap().GetAwaiter().GetResult();
    }
}
There are slight variants of it, with and without setting CurrentCulture and CurrentUICulture, but the main part is the same: spawn a new task to run the async delegate, then block and get the result using Unwrap().GetAwaiter().GetResult().
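For context, this is how it typically ends up being called – a synchronous member that needs a result from an async API (CustomerService, Customer and GetCustomerAsync are made-up names, just to illustrate the call site):

using System;
using System.Threading.Tasks;

public class Customer
{
    public Guid Id { get; set; }
}

public class CustomerService
{
    // A synchronous method forced to consume an async API – the typical AsyncHelper call site.
    public Customer GetCustomer(Guid id)
        => AsyncHelper.RunSync(() => GetCustomerAsync(id));

    // Made-up async API, included only to make the example complete.
    private Task<Customer> GetCustomerAsync(Guid id)
        => Task.FromResult(new Customer { Id = id });
}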
One of the reasons it is so popular is that people think it was written by Microsoft, so it must be safe to use. That is not quite true: the class was introduced as an internal class in AspNetIdentity (AspNetIdentity/src/Microsoft.AspNet.Identity.Core/AsyncHelper.cs at main · aspnet/AspNetIdentity (github.com)). That means Microsoft teams can use it when they think it is the right choice; it is not the default recommendation for running async tasks in a synchronous context.
Unfortunately I’ve seen a fair share of threads stuck in the AsyncHelper.RunSync stack trace, most likely victims of a deadlock.
Async/sync is a complex topic and even experienced developers make mistakes. There is no simple way to just run async code in a sync context, and AsyncHelper is certainly not it. It is a simple, convenient way, but it is not guaranteed to be the correct one for your use case. I see it as a shortcut that solves some problems but creates bigger ones down the road.
Just because you can, doesn’t mean you should. That applies to AsyncHelper perfectly.
Recently I helped to chase down a ghost (and you might be surprised to know that I, for the most part, spend hours being a ghostbuster – it can be fun, sometimes). A customer reported a weird issue: a visitor goes to their website, has everything correct in the cart, including the discount, only to have the discount disappear at checkout. That would be a fairly easy thing to debug and fix if not for the fact that the problem is random in nature. It happened only once in a while – but, on average, daily. It could not be reproduced locally, or reproduced consistently on production, so any fix was based on guesswork.
After a lot of dry code reading and then log reading, it turned out that the missing discount was really a problem of a missing cache. Once in a while, the cache that contains the promotion list is returned empty, resulting in no discount being applied to the order.
But why?
After a few guesses, it eventually came to me that the problem is with the caching using Dictionary – more specifically, campaigns are loaded and cached using a Dictionary with IMarket as the key. That would be fine, and highly efficient, if not for the fact that the default implementation of IMarket is not suitable as a dictionary key. It implements neither IComparable<T> nor IEquatable<T>, which means the only case in which two IMarket instances are equal is when they are the very same instance. Even if all their properties are equal in value, they will not be considered equal.
This is a short program that demonstrates the problem. You can expect it to write “False” to the output console.
using System;
using System.Collections.Generic;

public class Program
{
    private static Dictionary<AClass, int> dict = new Dictionary<AClass, int>();

    public static void Main()
    {
        dict.Add(new AClass("abc", 1), 1);
        dict.Add(new AClass("xyz", 2), 2);
        Console.WriteLine(dict.ContainsKey(new AClass("abc", 1)));
    }
}

public class AClass
{
    public AClass(string a, int b)
    {
        AString = a;
        AnInt = b;
    }

    public string AString { get; set; }
    public int AnInt { get; set; }
}
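For comparison, this is roughly what it takes for such a class to work as a dictionary key – a sketch only, not the actual IMarket/MarketId implementation. With value-based Equals and GetHashCode, ContainsKey returns True for a different instance with the same values:

using System;

public class AClass : IEquatable<AClass>
{
    public AClass(string a, int b)
    {
        AString = a;
        AnInt = b;
    }

    public string AString { get; }
    public int AnInt { get; }

    // Value-based equality: two instances with the same values are equal...
    public bool Equals(AClass other)
        => other != null && AString == other.AString && AnInt == other.AnInt;

    public override bool Equals(object obj) => Equals(obj as AClass);

    // ...and, crucially for Dictionary, they also produce the same hash code.
    public override int GetHashCode() => HashCode.Combine(AString, AnInt);
}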
The question that arises is: if the key does not match and an empty list of campaigns is returned, why does this only happen sometimes? The answer is that the IMarket instances are themselves cached, by default for 5 minutes. So for the problem to occur, the campaign cache must be loaded into memory just before the cache of IMarket instances expires (after which new instances are created). Once the new IMarket instances are loaded, the campaign cache must then be accessed again before it expires itself (default 30 seconds). The timing needs to be “right”, which makes this problem elusive and hard to catch in normal testing – both automated and manual.
Time for some blaming and finger pointing. When I fix something I usually try to check the history of the code to understand the original idea and intention. Was there a reason, or just an oversight? And most importantly:
Who wrote such code?
Me, about 7 months ago.
Uh oh.
The fix was simple enough. Instead of IMarket, we can change the key to MarketId, which implements both IEquatable<T> and IComparable<T>. It does not matter if you have two different instances of MarketId – as long as they have the same value, they are equal.
A workaround was sent to the customer to test, and after a week or so they reported that the problem was gone. The official fix is in Commerce 14.31, which was released yesterday https://nuget.optimizely.com/package/?id=EPiServer.Commerce.Core&v=14.31.0 , so you are – as always – highly recommended to upgrade.
Lessons learned:
Pick your dictionary keys carefully. A key type should implement IEquatable<T> and IComparable<T> – properly, I might add. In general, a struct is a better choice than a class, if you can use one.
No matter how “experienced” you think you are, you are still a human being and can make mistakes. It’s important to have someone check your work from time to time, spotting the problems you couldn’t.
A colleague asked me yesterday – how do we migrate properties of catalog content? There is, unfortunately, no official way to do it. There are several unofficial ways, however. Today we will explore the one I personally recommend – for its safety and backward compatibility.
Let’s say we have a FashionProduct with an MSRP property of type Money, and we now want to change it to Decimal. There are some hacky ways to do this, but all of them require direct database manipulation, which we should avoid if possible.
First we will need this piece of code. It was “stolen” from a colleague of mine and has been used countless times. You probably want to bookmark it, as it will likely be useful in the future (I should probably do so myself, as I have to find it every time I need it). It is a snippet to traverse the catalog structure, filtered by the content type you want.
public virtual IEnumerable<T> GetEntriesRecursive<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : EntryContentBase
{
    foreach (var nodeContent in LoadChildrenBatched<NodeContent>(parentLink, defaultCulture))
    {
        foreach (var entry in GetEntriesRecursive<T>(nodeContent.ContentLink, defaultCulture))
        {
            yield return entry;
        }
    }

    foreach (var entry in LoadChildrenBatched<T>(parentLink, defaultCulture))
    {
        yield return entry;
    }
}

private IEnumerable<T> LoadChildrenBatched<T>(ContentReference parentLink, CultureInfo defaultCulture) where T : IContent
{
    var start = 0;

    while (true)
    {
        var batch = _contentLoader.GetChildren<T>(parentLink, defaultCulture, start, 50);
        if (!batch.Any())
        {
            yield break;
        }

        foreach (var content in batch)
        {
            // Don't include linked products to avoid including them multiple times when traversing the catalog
            if (!parentLink.CompareToIgnoreWorkID(content.ParentLink))
            {
                continue;
            }

            yield return content;
        }

        start += 50;
    }
}
To make sure we don’t load too much content at once, the batch size is set to 50, but that is of course configurable (up to you)!
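As a quick usage sketch – collecting every FashionProduct in the catalog, starting from the catalog root (_referenceConverter is assumed to be an injected ReferenceConverter, and "en" is just an example culture):

// Traverse the whole catalog and collect every FashionProduct.
var defaultCulture = CultureInfo.GetCultureInfo("en");
var catalogRoot = _referenceConverter.GetRootLink();
var allFashionProducts = GetEntriesRecursive<FashionProduct>(catalogRoot, defaultCulture);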
Now the fun part, where the actual work happens. Once we have the content, we need to migrate the data, which can be as simple as this:
private void MigrateProperty<T>(IEnumerable<T> contents) where T : EntryContentBase
{
    var batch = new List<T>();

    foreach (var content in contents)
    {
        var writableClone = content.CreateWritableClone<T>();
        Transform(writableClone);
        batch.Add(writableClone);
    }

    _contentRepository.Publish(batch, PublishAction.SyncDraft);
}
In the Transform method you can do whatever you want with the property value. If you just want to rename the property, it does nothing except assign the value to the new property. Or, in the case mentioned at the beginning, converting Money to Decimal is an easy task (Money is the less precise version of Decimal). Note that if you convert between data types, for example from double to int, there is potential data loss – but you are probably aware of that already.
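As a sketch of what Transform could look like for the Money-to-Decimal case – here I pretend MigrateProperty runs with T = FashionProduct, and I assume MSRP is the existing nullable Money property while MsrpDecimal is the new decimal property; adjust names and types to your own model:

// Copy the amount of the Money value into the new decimal property.
// The currency part is dropped – which is exactly the data we accept losing here.
private void Transform(FashionProduct product)
{
    product.MsrpDecimal = product.MSRP?.Amount ?? 0m;
}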
The final step is to publish the change. For performance reasons, it is probably best to use the Publish extension method of IContentRepository and save multiple contents in one batch – maybe of size 50 or 100. That skips things like creating new versions, for optimal performance. You can read about it here: New simple batch saving API for Commerce | Optimizely Developer C
The remaining question is where to put it. In a perfect world, I’d say in a migration step (i.e. a class that implements IMigrationStep), so you ensure that your data is properly migrated before anything else runs – for example, new code that accesses the new property, or indexing of your content after migration. But if you have a sizeable catalog, this will take time, and it might not be a good idea to let your users wait for it to complete. In that case, it makes sense to do this in a scheduled job and, when it completes, make the switch.
Migrating properties is not an easy or quick task, but it can be done with relative ease. It also reminds us about modeling – try to get it right from the beginning so we don’t have to migrate. In the end, the fastest code is the code that does not need to run!
If you are using Optimizely Customized Commerce, the common wisdom is that you should be using the serializable cart mode. It’s not perfect (we discussed the drawbacks in, um, my book), but generally it has performance benefits. If, for any reason, you need to use the legacy cart mode, there is a switch you can use – IFeatureSwitch – to change between those modes.
It is important to remember that IFeatureSwitch only switches between the two modes; it does not migrate any carts between them. There is a one-time cart migration job, but that’s it.
To let IOrderRepository use the correct cart system, there is an ICartProvider, which will be either CartProvider or SerializableCartProvider. The problem is that this happens much earlier in the pipeline than IInitializationModule. In fact, it is determined in IConfigurableModule.ConfigureContainer, which means before any IInitializationModule.Initialize. Even if we call DisableFeatures in another ConfigureContainer, there is no guarantee that our code will be called before the framework code (the order of ConfigureContainer execution is non-deterministic).
Fortunately, we can do that inside Startup.Configure. Due to how the feature switch data is structured, it’s not as simple as adding a setting in appsettings.json, but it can be done easily in code:
services.Configure<ApplicationOptions>(x =>
{
    x.Features[SerializedCarts.FeatureSerializedCarts] = new ApplicationFeature
    {
        Feature = "SerializedCarts",
        State = FeatureState.Disabled,
        Type = "Mediachase.Commerce.Core.Features.SerializedCarts, Mediachase.Commerce"
    };
});
Of course, this is a workaround. The feature switch should work as documented, and it will be fixed in the future.
While we always want our database queries to be fast, that is not doable all the time. Sometimes we need to run slow queries, and we need to tell the underlying framework that this query can take some time to complete and that we’re fine with it. Otherwise, it will try to terminate the query after 30 seconds (the default timeout limit).
There is a difference between connection timeout and command timeout. Connection timeout is how long .NET will try to connect to the database before giving up. Command timeout is how long .NET will let a command execute before giving up.
Things are a bit complicated when it comes to command timeout with .NET 5 and up. With later versions of Microsoft.Data.SqlClient, it is possible to set the command timeout directly in the connection string. It is indeed a simple way to do it, but it comes with a caveat.
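To make the distinction between the two timeouts concrete, here is a small sketch using Microsoft.Data.SqlClient directly – Connect Timeout in the connection string controls connecting, while CommandTimeout on the command (or the newer Command Timeout connection string keyword) controls how long a single query may run. The query itself is just an example:

using Microsoft.Data.SqlClient;

public static class TimeoutExample
{
    public static int CountEntries(string server, string database)
    {
        // Connection timeout: how long to wait while establishing the connection (15 seconds here).
        var connectionString =
            $"Server={server};Database={database};Integrated Security=true;Connect Timeout=15";

        using var connection = new SqlConnection(connectionString);
        connection.Open();

        using var command = new SqlCommand("SELECT COUNT(*) FROM dbo.CatalogEntry", connection);
        // Command timeout: how long this particular query may run before being terminated (120 seconds here).
        command.CommandTimeout = 120;

        return (int)command.ExecuteScalar();
    }
}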
The command timeout setting in the connection string is not recognized by Entity Framework/Entity Framework Core, which will throw an exception if you try to use a connection string containing it. EF has its own way to set the command timeout per DbContext, but it does not accept the setting via the connection string – it throws “Keyword not supported: ‘command timeout'” if such a setting is present.
The workaround is to configure the command timeout for the EcfSqlConnection connection string, and use a different connection string without the command timeout just for Entity Framework.
However, that comes with another caveat: a command timeout in the connection string applies to all database queries. As we discussed in the previous post above, doing so is not without drawbacks – it hides slow queries rather than letting them fail. A failed query might not bring down your website, but an overloaded database likely will.
In Commerce 14.15.6, which should be released shortly, we introduce a new setting, SqlDataProviderOptions.CommandTimeout, which lets you set the command timeout for queries that use SqlDataProvider – most notably the MetaDataPlus system, like orders.
The important advantage of the new setting is that you can set it on the fly. If you know that some operation will be slow, you can raise the command timeout just for that operation, then set it back to the default value. In most cases, you can leave it at the default (30 seconds) and optimize at the application level (reduce batch sizes, for example) or at the database layer (rewrite the query, add or optimize indexes, etc.). But sometimes you know the query will be slow and you acknowledge that – this is the tool for those cases.
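A hedged sketch of how that on-the-fly usage might look – how you obtain SqlDataProviderOptions (constructor injection here) and whether CommandTimeout is a TimeSpan or a number of seconds should be checked against the version you are on; ExportAllOrders is a made-up slow operation:

// Temporarily raise the command timeout for a known-slow operation, then restore the original value.
// _sqlDataProviderOptions is assumed to be an injected SqlDataProviderOptions instance.
public void RunSlowOrderExport()
{
    var original = _sqlDataProviderOptions.CommandTimeout;
    try
    {
        // Assuming a TimeSpan-based setting; use the equivalent number of seconds if it is numeric.
        _sqlDataProviderOptions.CommandTimeout = TimeSpan.FromMinutes(5);
        ExportAllOrders();
    }
    finally
    {
        _sqlDataProviderOptions.CommandTimeout = original;
    }
}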
If you have been using Optimizely Customized Commerce, you probably know that, by default, a wish list is just a cart with a special name. Can you guess the name? Surprise, surprise, it’s “Wishlist”. It’s been there since forever, from the early days of Mediachase, and was then carried over to the new serializable cart. I have been “fine” with it – i.e. I accepted the approach unconsciously. But very recently I realized there are several problems with it.
So why is it not a very good idea?
First of all, it shares the same table as normal carts. To search for abandoned carts, you have to skip the carts named “Wishlist”. Because there are only a few cart names and they are not evenly distributed, you will have a hard time filtering carts by name.
But there is more. As most customers are using the serializable cart mode now, ever-growing wish lists pose another problem – each operation on the wish list, adding or removing an item, results in a big write to the SerializableCart table. If you have just a few items it might be fine, but a simple test on Commerce shows that with only 9 items in a wish list, the Data column is already more than 2700 characters. And wish lists are meant to be kept forever – they only grow in size.
And because wish lists are carts, they have to follow the same format, even though a lot of it is redundant or unnecessary for a wish list.
The biggest benefit of the default wish list implementation – and I think it trumps all the disadvantages listed above – is that it’s built in. You can start using it with almost no additional effort: get a cart with the predefined name and you are good to go. Building a different wish list definitely costs time and resources, a luxury not everyone can afford.
For that reason, I have started building a wish list service in my free time. I plan to make it open source when the time is right, but we’ll see about that.
Moral of the story
It is critical to take a step back from time to time and think about what you have done. Things might make less sense when you see them from a different perspective.
Hi again everybody. New day – new thing to write about. Today we will talk about memory allocation and the effect it has on your website performance. With .NET, memory allocations are usually overlooked because the CLR handles them for you. Except in the rare cases where you deal with unmanaged resources and have to be conscious about releasing them yourself, it’s usually a fire-and-forget approach.
The truth is, it is more complicated than that. The more objects you create, the more memory you need, and the more time the CLR needs to clean up after you. While you might have written code that executes blazingly fast in your benchmarks, in reality your website might still struggle to perform well in the long run – because of garbage collection. Occasional GC is not a concern – it’s the nature of the .NET CLR – but frequent GC, especially Gen 2 GC, is definitely something you should look into and fix, because it negatively affects your website performance.
The follow-up question: how do you fix that?
Of course, the first step is always to measure the memory allocations of your app. Locally you can use something like JetBrains dotMemory to profile your website, but that has a big caveat – you can’t really mimic the actual traffic to your website. Sure, it is very helpful for profiling something like a scheduled job, but it is less than optimal for seeing how your website performs in reality. For that we need another tool, and I’ve found nothing better than the Application Insights Profiler trace on Azure. It samples your website periodically, taking ETL (event trace log) files over 220 seconds. (Note: depending on your .NET version, you might download a .diagsession or a .netperf.zip file from Application Insights, but they are essentially the same inside – zipped .ETL.) Those files are extremely informative; they contain a whole load of information, which might be overwhelming if you’re new, but take small steps and you’ll get there.
Note that once extracted, an ETL file can be very big – often in the 1GB-or-more range. PerfView has to go through all that event log, so it is extremely memory hungry as well, especially if you open multiple ETL files at once. My PerfView kept crashing on my 16GB RAM machine (I had several Visual Studio instances open); that was solved when I switched to 32GB RAM.
The first step is to confirm the allocation problem with GCStats (this is one of the extreme ones, but it does happen):
Two main things to look at: Total Allocs, i.e. the total size of objects allocated, and the time spent in garbage collection. They are naturally closely related, but not always – total allocation might not be high while GC time is, for example in the case of large object allocations (we will talk about that in a later post). For the purpose of memory allocation analysis, this is where you should look:
What you find in there might surprise you. And that’s the purpose of this series: to point out possible unexpected allocations that are easy – or fairly easy – to fix.
In this first post, we will talk about a somewhat popular feature – Injected<T>.
We all know that in Optimizely Content/Commerce, the preferred way of dependency injection is constructor injection, i.e. if your class has a dependency on a certain type, that dependency should be declared as a parameter of the constructor. That’s nice and all, but not always possible. For example, you might have a static class (used for extension methods), so no constructor is available. Or, in some rare cases, you can’t add a new parameter to the constructor because it would be a breaking change.
Adding Injected<T> as a hidden dependency in your class at least works – so can you just forget about it?
Not quite!
This is how the use of Injected<T> results in allocations of StructureMap objects – yes, every time you call Injected<T>.Service, the whole dependency tree must be built again.
And that’s not everything: during that process, other objects need to be created as well. You can right click on a path and select “Include item”. The allocations below are for anything created by module episerver.framework episerver.framework!EPiServer.ServiceLocation.Injected`1[System.__Canon].get_Service(), i.e. all object allocations related to Injected<T>:
You can expand further to see which Injected<T>(s) have the most allocations, and are therefore the ones that should be fixed.
How can one fix an Injected<T> then? The best fix is to make it a constructor dependency, but that might not always be possible. An alternative fix is to use ServiceLocator.GetInstance, and to make the resulting field static if possible. That way you won’t have to call Injected<T>.Service every time you need the instance.
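A sketch of that alternative fix – the service and extension method names (IPromotionLoader, GetBestPrice) are made up; the point is resolving once, lazily, instead of going through Injected<T>.Service on every call:

using System;
using EPiServer.ServiceLocation;

// Placeholder service, standing in for whatever Injected<T> was wrapping.
public interface IPromotionLoader
{
    decimal ApplyBestPromotion(decimal price);
}

public static class PriceExtensions
{
    // Resolved once, on first use, and reused afterwards.
    private static readonly Lazy<IPromotionLoader> _promotionLoader =
        new Lazy<IPromotionLoader>(() => ServiceLocator.Current.GetInstance<IPromotionLoader>());

    public static decimal GetBestPrice(this decimal price)
        => _promotionLoader.Value.ApplyBestPromotion(price);
}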
There are cases where you indeed need a new instance every time; then the fix might be more complicated, and you might want to check whether you need the whole dependency tree or just a data object.
Moral of the story
Performance can’t be guessed, it must be measured
Injected<T> is not your good friend. You can use it if you have no other choice, but definitely avoid it in hot paths.
With the Find-backed IEntrySearchService from the previous post, we can now put SearchProvider to rest. There are, however, parts of the framework that still rely on SearchManager, and it expects a configured, working SearchProvider. The full search index job and the incremental search index job are two examples. To make sure we don’t break the system, we might want to give SearchManager something to chew on – a do-nothing SearchProvider, that is!
And we need a DoNothingSearchProvider
public class DoNothingSearchProvider : SearchProvider
{
    public override string QueryBuilderType => GetType().ToString();

    public override void Close(string applicationName, string scope) { }

    public override void Commit(string applicationName) { }

    public override void Index(string applicationName, string scope, ISearchDocument document) { }

    public override int Remove(string applicationName, string scope, string key, string value)
    {
        return 42;
    }

    public override void RemoveAll(string applicationName, string scope) { }

    public override ISearchResults Search(string applicationName, ISearchCriteria criteria)
    {
        return new SearchResults(new SearchDocuments(), new CatalogEntrySearchCriteria());
    }
}
And a DoNothingIndexBuilder
public class DoNothingIndexBuilder : ISearchIndexBuilder
{
    public SearchManager Manager { get; set; }
    public IndexBuilder Indexer { get; set; }

    public event SearchIndexHandler SearchIndexMessage;

    public void BuildIndex(bool rebuild) { }

    public bool UpdateIndex(IEnumerable<int> itemIds) { return true; }
}
What remains is simply to register them in your appsettings.json.
If you have been using Find, you might be surprised to learn that the CSR UI uses the SearchProvider internally. This is a bit unfortunate: you likely are using Find, and that creates unnecessary complexity. For starters, you need to configure a SearchProvider, then you need to index the entries, separately from the Find index. If you install EPiServer.CloudPlatform.Commerce, it will set up the DXPLuceneSearchProvider for you, which is basically a wrapper around LuceneSearchProvider that lets it work on DXP (i.e. Azure storage). But even with that, you have to index your entries anyway. You could use FindSearchProvider, but that just creates another problem – it uses a different index than Find, so you double your index count, and you still have to make sure your content is indexed. Is there a better way – to use the existing Find-indexed content?
Yes, there is
Searches for entries in CSR are done by IEntrySearchService, whose default implementation uses the configured SearchProvider internally. Fortunately for us, as with most things in Commerce, we can create our own implementation and inject it. That comes with a caveat – IEntrySearchService is marked with a BETA remark, so prepare for breaking changes without prior notice. However, it has not changed much since its inception (funny thing: when I checked its history, I was the one who created it 6 years ago, in 2017. Feeling old now), and if it does change, it would be quite easy to adapt.
IEntrySearchService is a simple interface with just one method:
It is a bit weird to return an IEnumerable<int> (what was I thinking?), but it was likely created as scaffolding around SearchManager.Search, which returns an IEnumerable<int>, and it was never updated. Anyway, an implementation using Find should look like this:
Note that I am not an expert on Find, especially on NestedFilterExpression, so my FilterPriceAvailableForCurrency might be wrong. Feel free to correct it; the code is not copyrighted and is provided as-is.
As always, you need to register this implementation for IEntrySearchService. You can add the registration anywhere you like, as long as it is after .AddCommerce.
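As a minimal sketch of that registration in Startup.ConfigureServices (FindEntrySearchService being whatever you named your Find-backed implementation; the lifetime is up to you):

services.AddCommerce();
// Registered after AddCommerce so our Find-backed implementation replaces the default one.
services.AddTransient<IEntrySearchService, FindEntrySearchService>();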
It’s not a secret: I love optimizing things. In a sense, I am both an Optimizer (literally) and an optimizer. And today we will go back to basics – optimizing a tricky SQL query.
The query in question is this particular stored procedure, ecf_CatalogNode_GetAllChildNodes, which is used to get all child nodes of specific nodes. It is used, among other things, to find all entries that are direct or indirect children of specific nodes. Why, you might ask? Because when you change the URL segment of a node, you want to make sure that all entries under that node have their indexed objects refreshed.
Let’s take a look at the stored procedure; this is what it looks like:
CREATE PROCEDURE [dbo].[ecf_CatalogNode_GetAllChildNodes]
@catalogNodeIds udttCatalogNodeList readonly
AS
BEGIN
WITH all_node_relations AS
(
SELECT ParentNodeId, CatalogNodeId AS ChildNodeId FROM CatalogNode
WHERE ParentNodeId > 0
UNION
SELECT ParentNodeId, ChildNodeId FROM CatalogNodeRelation
),
hierarchy AS
(
SELECT
n.CatalogNodeId,
'|' + CAST(n.CatalogNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
FROM @catalogNodeIds n
UNION ALL
SELECT
children.ChildNodeId AS CatalogNodeId,
parent.CyclePrevention + CAST(children.ChildNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
FROM hierarchy parent
JOIN all_node_relations children ON parent.CatalogNodeId = children.ParentNodeId
WHERE CHARINDEX('|' + CAST(children.ChildNodeId AS nvarchar(4000)) + '|', parent.CyclePrevention) = 0
)
SELECT CatalogNodeId FROM hierarchy
END
I previously wrote about the relations between entities in the Commerce catalog here: Commerce relation(ship), a story – Quan Mai’s blog (vimvq1987.com). Relations between nodes can be a bit complicated – a node has one true parent defined in the CatalogNode table, and then other “linked” nodes in CatalogNodeRelation. So to find all children – and grandchildren – of a node, you need to look at both.
Getting the children of a node from CatalogNode or CatalogNodeRelation is simple, but things become more complicated when you have to get grandchildren, then great-grandchildren, and so on. For that, a CTE needs to be used recursively. But then another problem arises – there is a chance, small but real, that the data was added in an incorrect way, so a circular reference is possible: A is a parent of B, which is a parent of C, which is itself a parent of A. To stop the SP from running forever, a check needs to be added to cut any circular reference short.
This brings back memories, as the first ever support case I worked on at Optimizely (then Episerver) was a circular reference one. The site would crash whenever someone visited the catalog management in Commerce Manager. That was around June 2012 (feeling old now?). My “boss” at the time involuntarily volunteered me for the case. See what you made me do, boss.
Now that you can grasp the basics of what the SP does, let’s get back to the original problem: it’s slow to run, especially with a big catalog and a complex node structure. As always, to optimize anything you first need to find the bottleneck – time to fire up SQL Server Management Studio and turn on the Actual Execution Plan.
I decided to go with 66, the “root” catalog node. This query yields around 18k rows.
Mind you, this is on my machine with a pretty powerful CPU (AMD Ryzen 7 5800X, 8 cores/16 threads) and a very fast NVMe PCIe SSD (Western Digital Black SN850 2TB). If this were executed on an Azure SQL database, for example, a timeout would be almost guaranteed. So the execution times should only be compared relative to each other.
If we look at the execution plan, it is quite obvious where the bottleneck is: a scan on the CatalogNode table is heavy (it read 79M rows in that operation). As suggested by Anders in Timeout when deleting CatalogNodes from a large catalog (optimizely.com), adding a non clustered index on the ParentNodeId column improves it quite a lot. And indeed it does – the execution time is reduced to 5 seconds.
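The index itself is nothing fancy – something along these lines (the index name is mine, pick whatever fits your naming convention):

-- Non clustered index on ParentNodeId, so the recursive CTE can seek instead of scanning CatalogNode
CREATE NONCLUSTERED INDEX IX_CatalogNode_ParentNodeId
    ON dbo.CatalogNode (ParentNodeId);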
And the number of rows read on CatalogNode is reduced to just 17k.
This is of course a very nice improvement. But the customer reported that it is not enough and the SP still times out, i.e. further optimization is needed.
Naturally, the next step is to see if we can skip the circular reference check. It was added as a safety measure against bad data. Arguably it should not be there – the check should be performed at data modification time – but it is there for historical reasons and we can’t just remove it, not trivially. Still, let’s try it out of curiosity.
The modified query looks like this (basically just commenting out any code related to CyclePrevention):
ALTER PROCEDURE [dbo].[ecf_CatalogNode_GetAllChildNodes]
@catalogNodeIds udttCatalogNodeList readonly
AS
BEGIN
WITH all_node_relations AS
(
SELECT ParentNodeId, CatalogNodeId AS ChildNodeId FROM CatalogNode
WHERE ParentNodeId > 0
UNION
SELECT ParentNodeId, ChildNodeId FROM CatalogNodeRelation
),
hierarchy AS
(
SELECT
n.CatalogNodeId
--, '|' + CAST(n.CatalogNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
FROM @catalogNodeIds n
UNION ALL
SELECT
children.ChildNodeId AS CatalogNodeId
--, parent.CyclePrevention + CAST(children.ChildNodeId AS nvarchar(4000)) + '|' AS CyclePrevention
FROM hierarchy parent
JOIN all_node_relations children ON parent.CatalogNodeId = children.ParentNodeId
--WHERE CHARINDEX('|' + CAST(children.ChildNodeId AS nvarchar(4000)) + '|', parent.CyclePrevention) = 0
)
SELECT CatalogNodeId FROM hierarchy
END
And the improvement is quite impressive (more than I expected): the query completes almost instantly (less than 1s), and the reads on CatalogNodeRelation are significantly reduced.
A word of warning here: execution plans can’t simply be compared as-is. If I run the two versions side by side, it gives a quite misleading comparison.
Even though the top one (without the circular reference check) is much faster than the original (the bottom one), SQL Server estimates that the first is slower (almost 2x slower than the second). So the execution plan should be used to see what has been done and what the likely bottleneck inside a query is; it should not be used to compare queries against each other. In most cases, comparing statistics using SET STATISTICS IO ON is the better way to compare.
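For completeness, that comparison looks like this – run each version of the stored procedure with statistics on and compare the logical reads reported in the Messages tab (66 being the “root” node used above):

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @nodes udttCatalogNodeList;
INSERT INTO @nodes (CatalogNodeId) VALUES (66);

EXEC dbo.ecf_CatalogNode_GetAllChildNodes @nodes;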
If not for the fact that we would be changing the behavior of the stored procedure, I would be happy with this approach. The chance of running into a circular reference is small, but it is not zero. As we said, we could in theory guard the relations during insert/update, but that would be too big a change to start with. This is one of the constraints of working at framework level – we have to step carefully to not break anything. A breaking change is bad, but data corruption is simply unacceptable. I spent a few hours (probably more than I should have) trying to optimize the circular reference check, but no better solution was found.
The next approach is – as you can guess – to get rid of the Clustered Index Scan on the CatalogNodeRelation table. The solution is quite simple: a non clustered index on ParentNodeId should be enough.
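Again, just a plain non clustered index (the name is up to you; you could also INCLUDE ChildNodeId to make it covering):

-- Non clustered index so child lookups by ParentNodeId no longer scan CatalogNodeRelation
CREATE NONCLUSTERED INDEX IX_CatalogNodeRelation_ParentNodeId
    ON dbo.CatalogNodeRelation (ParentNodeId);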
Great success. The performance is comparable with the “no circular reference check” approach.
And adding an index is a non-breaking change (albeit in some cases it can cause a performance regression, like in A curious case of SQL execution plan – Quan Mai’s blog (vimvq1987.com), but that is rare; also, in this case the cardinality of ParentNodeId is most likely quite well distributed).
That is all for today. Hopefully you learned a thing or two about optimizing queries in your daily work.