It’s been a long time since I blogged here, but we recently stumbled upon a performance issue so interesting that it motivated me to share it with the entire world (well, the entire world of people aware of this blog).
The virtue of always testing (even the obvious)
We are about to ship Babylon.js v3.3 and while running our performance validation tests we found that a specific demo was running slower than in v3.2 whereas it should be faster (at least in our minds because we specifically optimized some code used by the demo).
So the scene we tested is a simple scene made of 8000 identical spheres: https://www.babylonjs-playground.com/debug.html#QQGCL6#5
This scene is mostly designed to make sure that the time spent traversing the scene graph is the same as (or smaller than) in previous versions.
Funnily (or sadly, if you ask my wife, who did not see me for a couple of days), in the latest Chrome release this scene runs at HALF the speed we expected, even though the code used was supposed to be faster.
You can try it by yourself:
- Actual version: http://www.babylonjs.com/demos/fatobjects/slim/
- Version with ONLY one additional variable (named I_AM_NOW_FAT) added at line #19666 of babylon.max.js: http://www.babylonjs.com/demos/fatobjects/fat/
The art of optimization
When something like that happens, the first thing to do is to run both versions side by side, using your browser profiler to see what is going on.
We quickly found that the culprit was the _evaluateActiveMesh function, whose job is to determine which meshes are in the camera frustum (in other words, which meshes can be seen by the camera).
The code of _evaluateActiveMesh had not changed much since v3.2, and the few changes we made were meant to make it faster.
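To make the idea of a frustum check concrete, here is a minimal sketch of a sphere-versus-frustum test. It is NOT the Babylon.js implementation: the Plane and BoundingSphere interfaces and the isSphereInFrustum and collectActive functions are hypothetical names invented for illustration, and the sketch assumes each plane normal points toward the inside of the frustum.

```typescript
// Hypothetical types for illustration only; these are not Babylon.js classes.
interface Plane {
    // Plane equation: nx*x + ny*y + nz*z + d = 0, normal assumed to point inward.
    nx: number; ny: number; nz: number; d: number;
}

interface BoundingSphere {
    x: number; y: number; z: number; radius: number;
}

// A sphere is rejected as soon as it lies entirely behind one plane.
function isSphereInFrustum(sphere: BoundingSphere, planes: Plane[]): boolean {
    for (let i = 0; i < planes.length; i++) {
        const p = planes[i];
        const signedDistance = p.nx * sphere.x + p.ny * sphere.y + p.nz * sphere.z + p.d;
        if (signedDistance < -sphere.radius) {
            return false;
        }
    }
    return true;
}

// Walks the whole list once per frame and keeps only the visible spheres.
function collectActive(spheres: BoundingSphere[], planes: Plane[]): BoundingSphere[] {
    const active: BoundingSphere[] = [];
    for (let i = 0; i < spheres.length; i++) {
        if (isSphereInFrustum(spheres[i], planes)) {
            active.push(spheres[i]);
        }
    }
    return active;
}
```

The point to notice is that this kind of loop touches every object in the scene on every frame, so anything that makes reading a property of those objects slower gets multiplied by the object count.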
When you face this kind of problem, the best way is to work step by step. So we decided to restore the function to its original version (the v3.2 version). And guess what? Nothing changed. The function was STILL slower.
Understanding what is going on under the hood
So the same code is now behaving differently. Cool.
In this case, the only explanation is that the browser cannot optimize the new version like it was able to previously. And this is not related to the code itself because, well, it is now the same as it was.
So, the only other reason is that the supporting class (the Mesh class) is less “optimizable”.
And as a proud member of the “I Run Only In World of Warcraft” association, I know that if you want to run fast, it is better to be thin than fat. So we slowly removed properties from the class itself until we reached a point where, suddenly, performance was the same.
The fix is not obvious as we will need to be able to add properties to our Mesh class. But beyond a certain point, going through an array of fat objects is just slower.
We are now in a position where adding this line…
public heyWantToSlowDown = true;
…will reduce the test scene performance from 60 fps to 40 fps. Just removing the line will restore performance.
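If you want to experiment with this kind of effect yourself, the following is a rough micro-benchmark sketch, not a reproduction of the Babylon.js issue. SlimMesh, FatMesh, buildMeshes, countActive and timeFrames are all hypothetical names, the number of extra properties is arbitrary, and the "visibility" test is a simple stand-in. Whether any difference shows up at all depends on the JavaScript engine, its version, and the exact shape of the objects.

```typescript
// Hypothetical stand-ins for a mesh class; none of these names exist in Babylon.js.
class SlimMesh {
    public isEnabled = true;
    public radius = 1;
    constructor(public x: number, public y: number, public z: number) {}
}

// Same class with extra properties; the count (16) is an arbitrary assumption.
class FatMesh extends SlimMesh {
    public p0 = 0; public p1 = 0; public p2 = 0; public p3 = 0;
    public p4 = 0; public p5 = 0; public p6 = 0; public p7 = 0;
    public p8 = 0; public p9 = 0; public p10 = 0; public p11 = 0;
    public p12 = 0; public p13 = 0; public p14 = 0; public p15 = 0;
}

function buildMeshes<T extends SlimMesh>(
    ctor: new (x: number, y: number, z: number) => T,
    count: number
): T[] {
    const meshes: T[] = [];
    for (let i = 0; i < count; i++) {
        meshes.push(new ctor(i % 100, 0, 0));
    }
    return meshes;
}

// Simplified stand-in for a visibility test: counts enabled meshes near the origin on x.
function countActive(meshes: SlimMesh[], maxDistance: number): number {
    let active = 0;
    for (let i = 0; i < meshes.length; i++) {
        const mesh = meshes[i];
        if (mesh.isEnabled && Math.abs(mesh.x) - mesh.radius <= maxDistance) {
            active++;
        }
    }
    return active;
}

// Returns elapsed milliseconds for `frames` full passes over the array.
function timeFrames(meshes: SlimMesh[], frames: number): number {
    const start = Date.now();
    let total = 0;
    for (let f = 0; f < frames; f++) {
        total += countActive(meshes, 50);
    }
    const elapsed = Date.now() - start;
    if (total < 0) { throw new Error("unreachable"); } // keeps `total` observable
    return elapsed;
}

const slimMeshes = buildMeshes(SlimMesh, 8000);
const fatMeshes = buildMeshes(FatMesh, 8000);
console.log("slim (ms):", timeFrames(slimMeshes, 200));
console.log("fat (ms):", timeFrames(fatMeshes, 200));
```

Timing both variants in the same run means the loop sees two different object shapes, which can itself change how the engine optimizes it. For a cleaner comparison, run each variant in a separate process and repeat the measurement several times.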
So now what?
I contacted Chrome’s team because this is something we will need their help with.
I’ll update this post as soon as I get more info.
[Edit – 9/7/2018 at 4:00]: The Chrome dev team replied to my cries on Twitter and kindly offered their help through a Chromium ticket. You can track it here: https://bugs.chromium.org/p/chromium/issues/detail?id=881977