Debug .NET memory dump with WinDBG – crash course. Part 1
If you ask me what I have been doing for the last two weeks, the answer is: pulling my hair out. A customer had a problem with their site: memory usage climbed after catalog imports and stayed there “forever”, and in the end it slowed the site down. I jumped in and almost regretted that decision, because I had to spend days messing around with WinDBG and memory dumps. In the end, I found the problem and it was fixed. A lot of hair was lost in the process, but I learned something about WinDBG, and that’s what I’m sharing today.
WinDBG is probably the most famous tool for debugging on Windows. Out of the box, it only works with native applications (machine code, assembly and such), but luckily for us there are extensions that allow it to work with .NET applications: the “standard” SOS and the more advanced SOSEX. SOS is already on your machine (it ships with the .NET Framework, next to the CLR), while you can download SOSEX from here (for 64 bit) or here (for 32 bit). Download the zip file and extract the dll somewhere.
WinDBG comes with the Windows SDK, not the standard .NET Framework, so you’ll probably need to install it separately from here.
If you don’t have a memory dump at your disposal, it’s time to create one. The easiest way is to use Task Manager: fire it up, right-click the process you want to debug, choose Create dump file, and wait for Task Manager to finish writing that process’s memory to disk. Technically you can create a memory dump of any application, but we are only interested in .NET applications here, so I’m choosing an IIS application pool: the w3wp process.
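Task Manager is not the only option. The header of the dump used in this post shows it was written by ProcDump with the switches -64 -ma -o. Below is a minimal sketch of building the same command line from a script. It assumes procdump.exe (Sysinternals) is on PATH; the function name and the output path are my own, for illustration only.

```python
def procdump_command(pid, dump_path):
    """Build a ProcDump command line for a full memory dump of one process.

    -64 : write a 64-bit dump
    -ma : full dump, including all process memory
    -o  : overwrite an existing dump file with the same name
    """
    return ["procdump.exe", "-64", "-ma", "-o", str(pid), dump_path]


if __name__ == "__main__":
    # 3368 is the PID that appears in the dump header; the path is a placeholder.
    print(" ".join(procdump_command(3368, r"D:\memdump.dmp")))
```

The returned list can be passed to subprocess.run as-is, which avoids quoting problems when the dump path contains spaces.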
With the memory dump available, it’s time to analyse it. Open WinDBG, press Ctrl+D, point to the memory dump you created and load it:
Loading Dump File [D:\memdump.dmp]
User Mini Dump File with Full Memory: Only application data is available
*** procdump.exe -64 -ma -o 3368 hansapostee_memdmp.dmp
*** Manual dump'
Symbol search path is: srv*
Executable search path is:
Windows 8.1 Version 9600 MP (2 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS
Built by: 6.3.9600.18217 (winblue_ltsb.160124-0053)
Debug session time: Mon Sep 5 08:49:16.000 2016 (UTC + 2:00)
System Uptime: 25 days 17:11:20.437
Process Uptime: 2 days 17:33:17.000
Loading unloaded module list
00007fff`480206fa c3 ret
WinDBG is ready, but it’s almost useless for us at the moment. Now we need to load the extensions so we can use the CLR “exports” to analyse the memory dumps.
.loadby sos clr
.loadby loads an extension DLL from the same directory as a module that is already loaded, so we don’t have to specify the full path of the library as we do with .load. When you use .loadby, the debugger finds the module that the ModuleName parameter specifies (clr here), determines the path of that module, and then uses that path to load the extension DLL (sos here).
SOSEX does not live next to the CLR, so load it with .load followed by the full path to the sosex.dll you extracted earlier. When you load sosex.dll, it suggests running a command that builds a special “index” for the memory dump, so it can quickly “query” information from the heap. Run !bhi (build heap index) and wait until the command completes. A .bin file with the same name as your .dmp will appear in the same folder. This only needs to be done once: the next time you open the dump file and load sosex, it will know where to find its heap index.
Time to do some investigation. SOS and SOSEX provide a lot of commands, which you can use in conjunction with what WinDBG provides by default. The full list of commands can be found here.
However, let’s follow the flow of my investigation of the (supposed) memory leak I mentioned at the beginning:
Naturally, we need to check what is on the heap. We can call !dumpheap to get the information from the heap, object by object. But the memory dump I was investigating was 6GB and had ~42 million objects in it, so going through that list was not an option. Luckily, we can run it with the -stat parameter (!dumpheap -stat), which groups all objects of the same type into one row and orders the types from the least total memory consumption to the most:
With a big memory dump, it takes time to run commands such as !dumpheap -stat, so watch the status in the command bar. If it says *BUSY*, WinDBG is hard at work, so just wait.
Normally, we only need to pay attention to the types at the bottom of the list. A normal Episerver Commerce site easily has 500-600 types, and many of them have very few instances, or even just one, so they are not really interesting. Here we have plenty of types with more than 1,000,000 instances. System.String, the most commonly used type, has almost 10,000,000 instances. We also have more than 3 million entries of the Free type. These are not live objects: they are free blocks on the managed heap, gaps left behind by collected objects that the GC has not compacted away.
There is also a troubling report: a lot of fragmented blocks (there were more in the result, but I cut them from the picture to fit the post), blocks of memory which are free but which the GC couldn’t “compact”. This is an indication that the GC was not working as it should.
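With 500-600 types in the list, it can help to copy the !dumpheap -stat output into a text file and filter it outside WinDBG. The following is a minimal sketch that assumes the usual four-column layout (MT, Count, TotalSize, Class Name). The counts and sizes in SAMPLE are made up for illustration, as are all MethodTable values except the one for System.String, which is the one used in this post.

```python
from typing import NamedTuple


class TypeStat(NamedTuple):
    method_table: str
    count: int
    total_size: int
    class_name: str


def parse_dumpheap_stat(text):
    """Parse the per-type rows of `!dumpheap -stat` output, skipping other lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        # A data row is: <MT in hex> <count> <total size> <class name>
        if len(parts) != 4:
            continue
        mt, count, size, name = parts
        if not (count.isdigit() and size.isdigit()):
            continue  # header line or other text
        try:
            int(mt, 16)
        except ValueError:
            continue
        rows.append(TypeStat(mt, int(count), int(size), name.strip()))
    return rows


def biggest_types(rows, min_count=1_000_000):
    """Types with at least min_count instances, largest total size first."""
    hot = [r for r in rows if r.count >= min_count]
    return sorted(hot, key=lambda r: r.total_size, reverse=True)


# Illustrative numbers only, not taken from the real dump.
SAMPLE = """\
              MT    Count    TotalSize Class Name
00007fff3cbd1234        1           24 System.Object
000000df1e2a0b10  3100000    410000000      Free
00007fff3cbcda88  9900000    620000000 System.String
Total 13000001 objects
"""

if __name__ == "__main__":
    for row in biggest_types(parse_dumpheap_stat(SAMPLE)):
        print(row.method_table, row.count, row.total_size, row.class_name)
```

Sorting by total size rather than by count is a deliberate choice here: a handful of huge arrays can matter more than millions of tiny objects.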
At this point we are pretty clueless about what is wrong, so just pick something and see what is in there. System.String seems interesting, so let’s pick it.
!dumpheap -mt 00007fff3cbcda88
There are two ways to dump the objects of a type: either you specify the exact type with the -mt parameter (mt stands for MethodTable, not “machine type” as I first supposed), or you use the -type parameter. However, the -type parameter is not strict. If you specify System.Threading.Thread, for example, it will match all the types whose names contain that string, including “System.Threading.Thread”, “System.Threading.ThreadPool”, “System.Threading.ThreadStartException”,…
!dumpheap comes with plenty of parameters, which you can try yourself:
- -min <size>: only shows the objects which are at least <size> in bytes
- -max <size>: only shows the objects which are at most <size> in bytes
- -type: as above
- -mt: as above
- -gen (0,1,2,3): the generation which objects belong to. 3 is for LOH (Large Object Heap)
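To make the difference between the strict -mt filter and the loose -type filter concrete, here is a small model of how the filters described above combine. This is only a sketch of the matching rules, not how SOS is implemented. The addresses, MethodTable values and sizes are hypothetical, and details such as case sensitivity are not modelled.

```python
from typing import NamedTuple


class HeapObject(NamedTuple):
    address: str
    method_table: str
    type_name: str
    size: int


def dumpheap_filter(objects, mt=None, type_substr=None, min_size=None, max_size=None):
    """Keep the objects that pass every filter that was supplied."""
    result = []
    for obj in objects:
        if mt is not None and obj.method_table != mt:
            continue  # -mt: exact MethodTable match
        if type_substr is not None and type_substr not in obj.type_name:
            continue  # -type: substring match on the type name
        if min_size is not None and obj.size < min_size:
            continue  # -min: drop objects smaller than min_size bytes
        if max_size is not None and obj.size > max_size:
            continue  # -max: drop objects larger than max_size bytes
        result.append(obj)
    return result


# Hypothetical heap contents, for illustration only.
HEAP = [
    HeapObject("0x1000", "0xA1", "System.Threading.Thread", 96),
    HeapObject("0x2000", "0xA2", "System.Threading.ThreadPool", 24),
    HeapObject("0x3000", "0xA3", "System.Threading.ThreadStartException", 160),
    HeapObject("0x4000", "0xB1", "System.String", 2048),
]

if __name__ == "__main__":
    loose = dumpheap_filter(HEAP, type_substr="System.Threading.Thread")
    strict = dumpheap_filter(HEAP, mt="0xA1")
    print(len(loose), len(strict))  # 3 1
```

The takeaway is the same as in the text: when you want exactly one type, look up its MethodTable in the -stat output and filter on that.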
Back to our example, the only problem with dumping strings is … there are too many of them. For testing purposes we only need to examine a few of the instances. But the train is already running. To stop it, press Ctrl+Break.
Now we can examine an object with the !do (or !dumpobj) command. For arrays, there is the !da (or !dumparray) command. I usually find that the !mdt command from SOSEX works better:
A single object is not really meaningful on its own, but the important thing is that we can track down its “root”: the object which references it and keeps it from being garbage collected.
There are two ways to track down the root of an object. The first is the !gcroot command. It might work, but I find it slow for a big memory dump. The alternative (or supplementary) way is the !refs command from SOSEX. It shows the objects our object references, and the objects which reference it. Clicking “follow” on each suspicious object in turn will eventually lead you to the “root” object.
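Conceptually, both approaches walk the reference graph until they connect a root to the object in question. The toy sketch below shows that idea with a breadth-first search over a hand-written graph. It is not how !gcroot or !refs are implemented, and the object names are hypothetical.

```python
from collections import deque


def path_from_root(references, roots, target):
    """Breadth-first search for a chain of references from any root to target.

    references maps an object to the objects it references directly.
    Returns the shortest chain as a list, or None when target is unreachable
    from every root (which would make it eligible for collection).
    """
    queue = deque([root] for root in roots)
    seen = set(roots)
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == target:
            return path
        for child in references.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(path + [child])
    return None


# Hypothetical object graph: a static cache that keeps a string alive.
GRAPH = {
    "static cache": ["dictionary"],
    "dictionary": ["entries"],
    "entries": ["entry"],
    "entry": ["string"],
    "stack local": ["builder"],
}

if __name__ == "__main__":
    chain = path_from_root(GRAPH, ["stack local", "static cache"], "string")
    print(" -> ".join(chain))
```

Reading the chain from left to right answers the question that matters for a leak: which long-lived object is ultimately holding on to the memory.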
To trace down the leaking objects, it’s important to know the actual size of an object: how much memory it is holding on to. However, the size WinDBG gives you is, in most cases, only the size of the object itself, where each reference field counts as just a pointer, not as the object it points to. An empty object (when you call new object();) is 24 bytes in a 64-bit process.
Object size: to see how much memory an object really takes up, SOS provides the !objsize command, which takes the address of the object. Note that it also counts the memory of all referenced objects, so it might take a while to run on big, complex objects:
0:000> !mdt 000000df248f23f8
000000df248f23f8 (System.Collections.Generic.HashSet`1+Slot[[EPiServer.Core.LoaderOption, EPiServer]], Elements: 3, ElementMT=00007ffedffac1c0)
expand all 3 items
0:000> !objsize 000000df248f23f8
sizeof(000000df248f23f8) = 6848 (0x1ac0) bytes (System.Collections.Generic.HashSet`1+Slot[[EPiServer.Core.LoaderOption, EPiServer]])
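The gap between the size of an object itself and the total that !objsize reports comes from following references. Here is a toy sketch of that calculation under my own assumptions: the object names and byte sizes are invented, and each reachable object is counted once even if it is referenced several times.

```python
def total_size(own_sizes, references, start):
    """Sum the size of start and of everything reachable from it.

    own_sizes maps an object to the size of the object itself.
    references maps an object to the objects it references directly.
    """
    seen = set()
    stack = [start]
    total = 0
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue  # shared or cyclic reference: already counted
        seen.add(obj)
        total += own_sizes[obj]
        stack.extend(references.get(obj, ()))
    return total


# Invented sizes in bytes, for illustration only.
OWN_SIZES = {"slots": 96, "option_a": 32, "option_b": 32, "shared": 24}
REFERENCES = {
    "slots": ["option_a", "option_b"],
    "option_a": ["shared"],
    "option_b": ["shared"],
}

if __name__ == "__main__":
    print(OWN_SIZES["slots"])                            # 96
    print(total_size(OWN_SIZES, REFERENCES, "slots"))    # 184
```

This is also why the command slows down on complex objects: the whole reachable graph has to be visited before a total can be reported.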