Automated Crash Processor

I’ve written an automated crash processor about four times. Every time I’ve had to google exactly how to do it. Every time I’ve thought to myself “fuck me, I wish I’d written this down”.

We’re getting a bunch of crashes in Rust. When Unity crashes it drops a folder in the game root with a dump and a log.

Uploading the crash log

We have no control over the engine, so once it crashes we’re done and can’t upload anything from there. Instead, the next time the game starts up, the very first thing it does is look at all of the folders in the game root.

It looks inside each one and if it sees an error.log, it zips it up and sends it to our website, which then stores it in the database.
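Here’s a rough sketch of that startup scan. The game itself does this in engine code, so treat the Python as a stand-in. I’m assuming the whole crash folder (dump and log) gets zipped, and the function names and the upload URL you pass in are made up for illustration.

```python
import io
import urllib.request
import zipfile
from pathlib import Path


def find_crash_folders(game_root):
    """Return every folder directly under game_root that contains an error.log."""
    root = Path(game_root)
    return sorted(
        p for p in root.iterdir() if p.is_dir() and (p / "error.log").is_file()
    )


def zip_folder(folder):
    """Zip every file in a crash folder into memory and return the raw bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(folder).rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(folder).as_posix())
    return buffer.getvalue()


def upload_crash(zip_bytes, url):
    """POST the zip to the collection endpoint. The url is hypothetical."""
    request = urllib.request.Request(
        url,
        data=zip_bytes,
        headers={"Content-Type": "application/zip"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

On startup you would loop over find_crash_folders(game_root), zip each folder and hand the bytes to upload_crash.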

Processing the crash log

So we have the log. Something to be aware of is that the stack trace in the output log that Unity generates is probably bullshit, depending on whether it could find the right symbols when it generated it.

So we need to process the mdmp ourselves, which involves running cdb.exe, the command line version of WinDbg. The format I use is:

cdb.exe -z "crashlog.mdmp" -c "!analyze -v; q"

I’m calling this from a quick C# program I wrote, which captures the output and saves it to the database.
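The real tool is C#, but the idea is the same in any language: spawn cdb, wait for it to quit, keep whatever it printed. A minimal Python sketch, assuming cdb.exe is on the PATH. The function names and the ten minute timeout are my own choices, not anything cdb requires.

```python
import subprocess


def build_cdb_command(dump_path, cdb_exe="cdb.exe"):
    """Build the argument list for the cdb invocation shown above."""
    return [cdb_exe, "-z", str(dump_path), "-c", "!analyze -v; q"]


def analyze_dump(dump_path, cdb_exe="cdb.exe"):
    """Run cdb on a minidump and return everything it wrote to stdout."""
    result = subprocess.run(
        build_cdb_command(dump_path, cdb_exe),
        capture_output=True,
        text=True,
        errors="replace",  # dumps can produce odd bytes, don't die on them
        timeout=600,       # assumption: give up on a stuck analysis after 10 minutes
    )
    return result.stdout
```

The string that analyze_dump returns is what would get saved against the crash in the database.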

Symbols

Unity has a symbol server. Which is great.

http://symbolserver.unity3d.com/

So we can pass both Microsoft’s and Unity’s symbol servers to cdb on the command line.

-y "Srv*c:/symbols*http://msdl.microsoft.com/download/symbols;Srv*c:/symbols*http://symbolserver.unity3d.com/"
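If you end up with more than a couple of servers, it’s less error prone to build that string than to type it. A small Python sketch. The helper name is made up, and the cache folder default just mirrors the one used above.

```python
def build_symbol_path(servers, cache_dir="c:/symbols"):
    """Join symbol servers into one -y value, each cached in cache_dir."""
    return ";".join(f"Srv*{cache_dir}*{server}" for server in servers)


SYMBOL_SERVERS = [
    "http://msdl.microsoft.com/download/symbols",
    "http://symbolserver.unity3d.com/",
]

# The result is passed to cdb as: -y "<symbol path>"
symbol_path = build_symbol_path(SYMBOL_SERVERS)
```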

If you don’t have the symbols you won’t see shit. It would be cool if AMD/Nvidia had symbol servers too.

Grouping

The stack trace you get from cdb.exe will look a bit like this.

00000000`0a3be610 00007ff7`ea14aa8a : 00000000`08718e64 00000000`0a3be6c9 00000000`083e7a00 00000001`28a066d0 : d3d11!CContext::TID3D11DeviceContext_Map_+0x1a
00000000`0a3be660 00007ff7`ea16509b : 00000000`0000000c 00000000`083e7a00 00000000`548215a0 00000000`052ead20 : RustClient!StreamOutSkinPoseBuffer::Update+0x10a
00000000`0a3be730 00007ff7`ea165baf : 00000000`052ead20 00000000`083e7a00 00000000`0021fdd4 00000000`000000f8 : RustClient!GfxDeviceWorker::RunCommand+0x558b
00000000`0a3bf820 00007ff7`ea15f38a : 00000000`00001000 00007ff7`ea9e8c38 00000000`00001528 00000000`00000000 : RustClient!GfxDeviceWorker::Run+0x2f
00000000`0a3bf850 00007ff7`e9d38ec6 : 00000000`044ade40 00000000`00000000 00000000`044ade40 00000000`00000000 : RustClient!GfxDeviceWorker::RunGfxDeviceWorker+0x3a
00000000`0a3bf880 00007ff9`52972d92 : 00007ff7`e9d38e90 00000000`00000000 00000000`00000000 00000000`00000000 : RustClient!Thread::RunThreadWrapper+0x36
00000000`0a3bf8b0 00007ff9`53479f64 : 00007ff9`52972d70 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0x22
00000000`0a3bf8e0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x34

You could generate a hash from this to group similar stack traces together, but the memory addresses change from crash to crash, which creates too much fragmentation. So I trim my stacks to look more like this.

d3d11!CContext::TID3D11DeviceContext_Map_
RustClient!StreamOutSkinPoseBuffer::Update
RustClient!GfxDeviceWorker::RunCommand
RustClient!GfxDeviceWorker::Run
RustClient!GfxDeviceWorker::RunGfxDeviceWorker
RustClient!Thread::RunThreadWrapper
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStart

Then create a hash from that to group them together. So far it seems to work well where symbols are present.
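The trimming and hashing can be sketched like this in Python. It assumes the 64-bit frame layout shown above (two addresses, the arguments, then the symbol). The regex, the function names and the choice of SHA-1 are mine, and any stable hash would do.

```python
import hashlib
import re

# child stack pointer, return address, " : args : ", then module!symbol+0xoffset
FRAME = re.compile(
    r"^[0-9a-fA-F]{8}`[0-9a-fA-F]{8} [0-9a-fA-F]{8}`[0-9a-fA-F]{8} : .* : (.+)$"
)
OFFSET = re.compile(r"\+0x[0-9a-fA-F]+$")


def trim_stack(stack_text):
    """Return just the module!symbol part of every frame line."""
    frames = []
    for line in stack_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append(OFFSET.sub("", match.group(1).strip()))
    return frames


def stack_hash(stack_text):
    """Hash the trimmed frames so crashes with the same call path share a key."""
    trimmed = "\n".join(trim_stack(stack_text))
    return hashlib.sha1(trimmed.encode("utf-8")).hexdigest()
```

Two dumps with the same call path but different addresses then get the same stack_hash, which is the value to group on in the database.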

Visualization

So with all that done, all that’s left is a quick website to visualize the database and download zip files of interesting crashes.

Uses

The site shows that most of our crashes are caused by steam_api. I suspect we’re shipping the wrong version of this file to 32-bit Windows users, so I’ve updated them all to the latest to see if the crashes stop coming in after the next update.

This gives us the ability to prioritize fixes like this, and to raise them with Unity. It would be awesome from our point of view if Unity did this automatically and really made their engine bulletproof, but I guess it’s unfair to suggest that most of these crashes are Unity’s fault when they could be hardware specific.

Footnotes

cdb.exe doesn’t seem to download symbols when running in a service. So I had to make an app that just runs indefinitely, querying the database for new dumps to process. I remember something like this from the first couple of times I made this.
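The app that runs indefinitely is nothing clever, just a polling loop. Here’s a sketch in Python using sqlite3 so it’s self-contained. The crashes table, its columns and the 30 second poll interval are all invented for the example, and analyze stands for whatever function runs cdb on a dump.

```python
import sqlite3
import time


def fetch_unprocessed(db):
    """Return (id, dump_path) for every crash that has no analysis yet."""
    query = "SELECT id, dump_path FROM crashes WHERE analysis IS NULL ORDER BY id"
    return db.execute(query).fetchall()


def process_pending(db, analyze):
    """Analyze every waiting dump once and return how many were handled."""
    rows = fetch_unprocessed(db)
    for crash_id, dump_path in rows:
        db.execute(
            "UPDATE crashes SET analysis = ? WHERE id = ?",
            (analyze(dump_path), crash_id),
        )
    db.commit()
    return len(rows)


def run_forever(db, analyze, poll_seconds=30):
    """Keep polling; sleep only when there was nothing to do."""
    while True:
        if process_pending(db, analyze) == 0:
            time.sleep(poll_seconds)
```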

Is there a service that does this? I know we have things like Sentry for exceptions, but is there anywhere we can send minidumps, provide it with a list of symbol servers and then have everything processed and organised for us?

The closest thing is the error stuff Steamworks has built in, but that doesn’t let you provide symbols, doesn’t let you manually upload mdmps and hasn’t really changed in 10 years.