One of the most important aspect of data storage is integrity, meaning that the accuracy and consistency is maintained throughout the life cycle. However, there are cases when the integrity can be compromised in certain tables. It is difficult (if not impossible) to enforce a constraint between those columns technically, because of the schema design. The data integrity is enforced by other means (such as triggers, or carefully written stored procedures).
If you are using Episerver APIs, it’s guaranteed that no orphan rows are left behind. However, if you are doing data manipulation directly yourself (which we generally advise against), or if you are syncing databases between environments (for example, between production and development, or between development and UAT), there might be chances that the there are rows that meant to be deleted, but were left behind unintentionally.
Those will cause problem later when you try to load data, or when you upgrade to a later version where data schema is updated.
Some places to look into:
- CatalogItemAsset: There are chances that an entry or a node was deleted, but their assets were left behind.
Query to check for:
SELECT * FROM CatalogItemAsset WHERE CatalogEntryId = 0 AND CatalogNodeId NOT IN (SELECT CatalogNodeId FROM CatalogNode) SELECT * FROM CatalogItemAsset WHERE CatalogNodeId = 0 AND CatalogEntryId NOT IN (SELECT CatalogEntryId FROM CatalogEntry)
- ecfVersion: There are chances that an entry or a node was deleted, but their versions were left behind
Query to check for:
SELECT * FROM ecfVersion WHERE ObjectTypeId = 0 AND ObjectId NOT IN (SELECT CatalogEntryId FROM dbo.CatalogEntry) SELECT * FROM ecfVersion WHERE ObjectTypeId = 1 AND ObjectId NOT IN (SELECT CatalogNodeId FROM dbo.CatalogNode) SELECT * FROM ecfVersion WHERE ObjectTypeId = 2 AND ObjectId NOT IN (SELECT CatalogId FROM dbo.Catalog)
Query to check for:
SELECT * FROM CatalogContentProperty WHERE ObjectTypeId = 0 AND ObjectId NOT IN (SELECT CatalogEntryId FROM dbo.CatalogEntry) SELECT * FROM CatalogContentProperty WHERE ObjectTypeId = 1 AND ObjectId NOT IN (SELECT CatalogNodeId FROM dbo.CatalogNode) SELECT * FROM CatalogContentProperty WHERE ObjectTypeId = 2 AND ObjectId NOT IN (SELECT CatalogId FROM dbo.Catalog)
There might also be orphan rows left in ecfVersionProperty, but they will be taken care of when the orphan rows in ecfVersion are taken care of.
(You can see there is no way to constrain the integrity by SQL Server constraints, like in other tables)
If there are no orphan rows returned by both queries – which is most common – that’s good, nothing to worry about. However, if there are, you know what to do. As always, back up your databases first, and proceed with cautions.
To avoid such issues in future, make sure that you always favor Episerver APIs over direct database manipulation. And if you have to do such things, proceed with cautions, and beware of the references between tables, so no rows will be left behind.
6 thoughts on “Watch out for orphan rows in Episerver Commerce database”
A question regarding these tables:
We have a lot of rows in these tables. ecfVersionProperty as an example contains over 11 million rows. Is there a safe way to clean this up?
As a rule of thumb you should not try to clean up the database directly. You can set the max version count, then run the Trim version scheduled job to delete the unwanted versions. that will help keep those tables in check without risking the data integrity
We have uiMaxVersions=”10″ in Episerver.config and the “Trim Content Versions” is running every day. But since we have around 40 000 variants (with lots of properties) in 7 different languages I guess it can be quite a lot in that table.
It would be good if it was possible to set this for both CMS and Catalog content. CMS content would actually benefit with 10 versions but catalog content only needs 1 version since we are using PIM and never publish old version inside Episerver UI.
The question is still: Does a commerce site function with those tables being emptied or is there a strict dependency link between the published version and the data in the tables?
I see that there is a stored procedure called [ecfVersion_DeleteByObjectId], and I want to execute it for each row in the [ecfVersion] table.
If you are using PIM, consider using this https://world.episerver.com/blogs/Magnus-Rahl/2015/3/optional-performance-tweaks-in-episerver-commerce-8-9/
40k variants/7 languages is small set. I don’t think you have to worry about performance in that case if your indexes are well maintained