From time to time, I have to dig into customers’ profiler traces to figure out why their site is slow (yeah, if you follow me, you’d know that’s kind of my main job). There are multiple issues that can eat your website’s performance for breakfast, from loading too much content to unmaintained database indexes. While my blog does not cover everything, I think you can get a good grasp of what mistakes to avoid.
But sometimes the problem comes from a 3rd party library/framework. That’s nothing new: we have seen it before in A curious case of memory dump diagnostic: How Stackify can cause troubles to your site – Quan Mai’s blog (vimvq1987.com). The problem with these types of issues is that they are usually overlooked.
The library we’ll be investigating today is MaxMind.Db. To be honest, I’ve never used it myself, but it seems to be a very popular choice for geo-locating visitors. It’s commonly used by Optimizely sites for that purpose, via VisitorGroup (which is how it came onto my radar).
For several sites that use it, it seems that, more often than not, requests are stuck in this stack:
It’s reasonable to think that CreateActivator is doing something heavy here (evidently in the LambdaCompiler.Compile part). A peek at the decompiled code confirms that yes, it is heavy. I’m not quite sure I can post the decompiled code here without violating any agreement (I did, in fact, accept no agreement at this point), but it’s quite straightforward code: TypeActivatorCreator uses reflection to get the constructors of the Type passed to it, checks whether any constructor is decorated with the MaxMind.Db.Constructor attribute, prepares some parameters, and creates a LambdaExpression that creates an instance of that Type using the found constructor (which is a good thing, because a compiled expression performs much better than a plain reflection call).
(I’m using MaxMind.Db 2.0.0, for the record.)
The code is straightforward, but it is also slow, as any code involving reflection and lambda compilation would be. The essential step is to cache the result. This is actually a very good place to cache: the number of types is fixed during runtime (except for the very edge cases where you dynamically create new types), so you won’t have to worry about cache invalidation, and a cache would significantly improve the performance of the code above.
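To make the pattern concrete, here is a minimal sketch of what such an activator creator does. The names (CreateActivatorSketch, the Tuple demo) are mine for illustration, not MaxMind’s actual source; the real code looks for a constructor decorated with the MaxMind.Db.Constructor attribute, while this sketch just grabs the first public one:

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sketch of the technique described above: use reflection to
// find a constructor, then compile a LambdaExpression that invokes it.
// Compile() is the expensive step that shows up as LambdaCompiler.Compile
// in the profiler traces.
Func<object[], object> CreateActivatorSketch(Type type)
{
    // Illustration only: take the first public constructor.
    ConstructorInfo ctor = type.GetConstructors().First();

    // Build: (object[] args) => (object)new T((T0)args[0], (T1)args[1], ...)
    ParameterExpression argsParam = Expression.Parameter(typeof(object[]), "args");
    var ctorArgs = ctor.GetParameters()
        .Select((p, i) => (Expression)Expression.Convert(
            Expression.ArrayIndex(argsParam, Expression.Constant(i)),
            p.ParameterType));
    var body = Expression.Convert(Expression.New(ctor, ctorArgs), typeof(object));

    // The expensive part: compiling the expression tree into a delegate.
    return Expression.Lambda<Func<object[], object>>(body, argsParam).Compile();
}

// The compiled delegate itself is fast and reusable:
var makeTuple = CreateActivatorSketch(typeof(Tuple<int, int>));
var tuple = (Tuple<int, int>)makeTuple(new object[] { 1, 2 });
Console.WriteLine(tuple); // (1, 2)
```

The trade-off is exactly as described: paying the compilation cost once buys you near-constructor-speed instantiation afterwards, which is only a win if the delegate is reused.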
And in TypeActivatorCreator there is a cache for it. It is a simple ConcurrentDictionary<Type, TypeActivator>, which returns a TypeActivator if the Type has been requested before, or creates one and caches it if it hasn’t. As I said, this is a very good place to add a cache.
There is a cache for that, which is good. However, the very important tidbit here is that the dictionary is not static. That means the cache only works if the class is registered as a singleton (by itself, or by another class down the dependency chain), meaning only one instance is created and shared between threads (which is why the ConcurrentDictionary part is important).
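The difference is easy to demonstrate. This is a simplified, hypothetical simulation (not MaxMind’s code): the per-request dictionary stands in for a transient TypeActivatorCreator, and the shared one for a singleton registration:

```csharp
using System;
using System.Collections.Concurrent;

// Count how many times the "expensive path" runs, given how the cache
// dictionary is obtained for each request.
int Simulate(int requests, Func<ConcurrentDictionary<Type, Func<object>>> getCache)
{
    int compilations = 0;
    for (int i = 0; i < requests; i++)
    {
        var cache = getCache(); // resolving the creator for this request
        cache.GetOrAdd(typeof(object), _ =>
        {
            compilations++; // stand-in for the expensive CreateActivator call
            return () => new object();
        });
    }
    return compilations;
}

// Transient-style: a fresh instance, and therefore an empty cache, per request.
int transientCost = Simulate(100, () => new ConcurrentDictionary<Type, Func<object>>());

// Singleton-style (or a static dictionary): one cache shared by all requests.
var shared = new ConcurrentDictionary<Type, Func<object>>();
int singletonCost = Simulate(100, () => shared);

Console.WriteLine($"transient: {transientCost}, singleton: {singletonCost}");
// transient: 100, singleton: 1
```

With a per-instance cache and transient instances, the expensive path runs on every request; with one shared instance it runs exactly once per Type.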
Except it’s not.
When I looked at a memory dump collected for a customer that is using MaxMind.Db, this is what I got:
0:000> !dumpheap -stat -type TypeAcivatorCreator
Statistics:
MT Count TotalSize Class Name
00007ffa920f67e0 1 24 MaxMind.Db.TypeAcivatorCreator+<>c
00007ffa920f6500 147 3528 MaxMind.Db.TypeAcivatorCreator
Total 148 objects
So there were 147 instances of TypeActivatorCreator. Note that this is only the number of instances still alive; there might have been other instances that were already garbage-collected by the CLR.
Now it’s clear why it has been performing badly. A new instance of TypeActivatorCreator is created for (supposedly) every request, so its internal cache is empty (it was just created, too). Each request therefore goes through the expensive path of CreateActivator, and performance suffers.
The obvious fix here is to make the dictionary static, or to make the TypeActivatorCreator class a singleton. I don’t have the full source code of MaxMind.Db to determine which is better, but I’d lean toward the former.
Moral of the story:
- Caching is very, very important, especially when you are dealing with reflection and lambda compilation
- You can get it 99% right, but the remaining 1% can still destroy performance.
Update:
I reached out to MaxMind.Db regarding this issue on November 9th, 2021. About 6 hours later they replied with this:
I was at first confused, then somewhat disappointed. It is a small fix that would improve performance for everyone, rather than relying on (or expecting) customers to do what the documentation says. But well, let’s just say we have different opinions.