Tuesday, October 30, 2012
For the last few days I’ve been trying to squeeze more performance out of GMod. I’ve been looking for slow areas, bottlenecks, over-called hooks. To do this I recorded a demo, played it back using timedemo, and profiled it with the Source engine’s vprof tools and the very awesome Very Sleepy.
Here’s some stuff I discovered.
User data was stored in a kind of stupid way. To push the userdata I just used Lua’s userdata function to create a pointer-sized userdata, then stored the pointer to the actual object in that. This was cool, but when it came to determining the type of the userdata I had to do some wacky shit - which involved getting the metatable and looking up an integer on it. This is even worse than you’d imagine - because every call to an entity, such as ent:SetPos, checks that the type of userdata passed as the first argument is an entity. So pretty much every call to entity functions, or physics object functions, or vector functions was looking up a metatable and then looking up a number on that metatable. For nothing.
So now what I do is push a custom struct as the userdata, which holds the type and a pointer. So to check the type all you have to do is grab the userdata pointer, cast it to the struct and BAM.
This took the FPS from 127 to 135.
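Here's a rough sketch of the difference between the two type checks, modelled in Python rather than engine code. Everything in it is made up for illustration: the class names, the `type_id` key and the type id values are assumptions, and a plain dict stands in for the Lua metatable.

```python
from dataclasses import dataclass

# Hypothetical type ids; the real values are not given in this post.
TYPE_ENTITY = 1
TYPE_VECTOR = 2


class OldUserData:
    """Old layout: a bare pointer, with the type id living on the metatable."""

    def __init__(self, pointer, metatable):
        self.pointer = pointer
        self.metatable = metatable  # dict standing in for a Lua metatable


def old_is_type(ud, wanted):
    # Two lookups per check: fetch the metatable, then read the id from it.
    metatable = ud.metatable
    return metatable is not None and metatable.get("type_id") == wanted


@dataclass
class TaggedUserData:
    """New layout: the type id sits right next to the pointer."""

    type_id: int
    pointer: object


def new_is_type(ud, wanted):
    # One field read, no metatable involved.
    return ud.type_id == wanted
```

Both checks give the same answer. The second one just skips the metatable trip, which matters when the check runs on every single entity, physics object and vector call.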
I then moved the timers module to the engine. The timers module is really very simple, so it was easy to move. The performance increase wasn’t massive, but I was happy because it’s made the module feel less fragile, and any gain is a gain.
This took the FPS from 135 to 137.
Entity handles in GMod were over-engineered. This was to account for the fact that when you lag the entities clientside are deleted and re-created. At the time this didn’t make sense because I really wanted to keep the entity tables around (If I could do it again this wouldn’t be the case). So I had this whole system where entity references are stored by serial/ehandle numbers in a global table called _ent. Which has worked fine for about 8 years.
But it was stupid. Server entities don’t need to give a fuck about that shit. And neither do purely clientside only entities. And we only really need to give a fuck when the entity is deleted - not all the time.
So now entity and entity table references are stored on the entities themselves. When an entity is deleted clientside (and it has an entindex > 0) it keeps the reference around for a frame, then removes it. If an entity is created in that frame with a matching ehandle it uses those references and doesn’t call :Initialize. Clean as a bean.
This took FPS from 137 to 145.
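The hold-for-a-frame idea can be sketched like this. It's a toy model in Python, not the engine code: `EntityRefs`, `ClientEntityList` and their methods are hypothetical names, and exactly when the held references get dropped (here, after surviving one frame boundary) is my assumption about what "a frame" means.

```python
class EntityRefs:
    """Lua-side references for one entity (hypothetical shape)."""

    def __init__(self, ehandle):
        self.ehandle = ehandle
        self.table = {}       # stands in for the entity's Lua table
        self.init_calls = 0

    def initialize(self):
        self.init_calls += 1


class ClientEntityList:
    def __init__(self):
        self.live = {}        # ehandle -> EntityRefs
        self.pending = {}     # ehandle -> (EntityRefs, frame it was deleted in)
        self.frame = 0

    def create(self, ehandle):
        held = self.pending.pop(ehandle, None)
        if held is not None:
            refs = held[0]    # same ehandle came back: reuse, skip initialize
        else:
            refs = EntityRefs(ehandle)
            refs.initialize()
        self.live[ehandle] = refs
        return refs

    def delete(self, ehandle, entindex):
        refs = self.live.pop(ehandle)
        if entindex > 0:
            # Networked entity: hold on to the references for a frame.
            self.pending[ehandle] = (refs, self.frame)

    def end_frame(self):
        self.frame += 1
        # Keep only references deleted during the frame that just ended.
        self.pending = {
            handle: (refs, deleted_in)
            for handle, (refs, deleted_in) in self.pending.items()
            if deleted_in >= self.frame - 1
        }
```

The point is that the bookkeeping only happens on deletion, instead of every lookup going through a global table.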
Then I decided to tackle the biggy. The hook system is used by pretty much every system in GMod. I guessed it would be a lot faster if it was in the engine rather than in Lua. I guessed wrong. Very wrong.
Turns out that moving it to the engine actually made it slower, by about 15fps. The issue, I’m guessing, is the multiple pcalls from C to Lua (for each function, and each hook), coupled with having to re-push the arguments to the stack for each one.
So even though it took a few hours to do and the code was super awesome, I ended up having to revert it all. Sadface. It kind of opened my eyes too. I’m sure there are other places where it would be faster to have stuff running in Lua instead of in C. The __index functions on the entities come to mind.
While coding the new hook system I noticed that this function was being called about 50 times more often than it should be. Literally. So I cached the result, and invalidated the cache when rendering a new scene - or calling cam.Start.
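The caching pattern is nothing fancy. A minimal sketch in Python, with a hypothetical `SceneCache` class and a made-up computation standing in for the real function:

```python
class SceneCache:
    """Caches one computed value until someone invalidates it."""

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self._valid = False
        self.computations = 0   # counts how often the real work ran

    def get(self):
        if not self._valid:
            self._value = self._compute()
            self._valid = True
            self.computations += 1
        return self._value

    def invalidate(self):
        # Call this wherever the inputs can change, e.g. at the start of a
        # new scene or a camera change.
        self._valid = False
```

Call `get()` fifty times in a frame and the work happens once. The only thing to get right is calling `invalidate()` everywhere the inputs can change, otherwise you serve stale values.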
And the rest
While I was trying to make the hook system work at a decent rate I found quite a few other optimizations. Common sense optimizations that wouldn’t be worth doing if the functions weren’t being called thousands of times every frame. These all added up.
So in my timedemo I’ve managed to get it from 127fps to 157fps. By my math that’s a 23% gain. And that’s just running a very simple (client only) demo.
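For anyone checking the math, here it is as a few lines of Python. The first three steps use the figures quoted above; the last step (145 to 157) isn't measured separately in this post, it's just what's left over once the other three are accounted for.

```python
def percent_gain(before, after):
    """Relative FPS gain, in percent."""
    return (after - before) / before * 100.0


# (label, fps before, fps after)
steps = [
    ("userdata struct", 127, 135),
    ("timers in engine", 135, 137),
    ("entity references", 137, 145),
    ("everything else (inferred remainder)", 145, 157),
]

total_fps_gained = sum(after - before for _, before, after in steps)
overall = percent_gain(steps[0][1], steps[-1][2])
```

That works out to 30fps gained overall, or roughly 23.6% over the starting 127fps.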