Welcome to the latest installment in Friendly Fixer's Must Read series. We haven't spoken in a while. How are you? I'm great. This is a format that I've used internally for a while now, and I thought I'd open it up to the community as well. So, today I bring you an explanation of what happened to our World Map loading times. At some point on Wednesday or Thursday, transitioning to the world map started taking 45+ seconds instead of 1 or 2.
(Since you are new to this, I'll explain the format. There is a MUST READ section that is kept to the point, followed by a MAYBE READ section where I expand/explain/brag a bit. Enjoy.)
- Wednesday we deployed as usual.
- We discovered our throttle no longer worked as intended.
- We could let in everyone, or no one.
- We were experiencing greater-than-average load on our MySQL databases.
- We 'pulsed' the throttle to avoid crippling load.
- As a result the deploy was 'messy'.
- Players began complaining of world map loading times.
- We assumed this was from the DB load.
- Fast forward to Thursday. At this point the team started noticing that ALL our development environments were loading slowly.
After much investigation, we determined that the slowdown was caused by a change in the Flash Player plugin, in an update released on June 6th and 7th.
Adobe put out a security update to the Flash Player that bumped it up a major version number, from 29 to 30. This was to mitigate two things:
- A security flaw in Microsoft Office embedded flash content
- Security flaws due to Spectre/Meltdown
As part of this work, Adobe changed the 'getTimer()' function at a low level.
- This is a core Flash API function that we use for timing data. It is the closest thing to a high-resolution timer that Flash has, and it is EXTREMELY common for people to use.
- Its performance cost went up at least 100-fold.
We have a tight loop that iterates over all 250 thousand hex map cells and populates them based on a server-defined height map. This is to place mountains and other features into each hex map cell. This naturally takes a while to process, so the work is spread over several frames.
The exact mechanism to do this is a loop that has a counter into the server sent byte array. Each iteration through the loop processes exactly one hex map cell and reads one byte from the array. It then checks how long it has been since we started the loop. If that time goes over a defined threshold, which is 25 milliseconds, it breaks out of the loop. If we are not done loading yet, the next frame tick will go back into the loop and pick up where it left off, controlled by that index counter.
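For the curious, here is a minimal sketch of that kind of time-sliced loop. It is written in Python rather than ActionScript, and it is not our actual client code: the names `process_slice`, `load_map`, and `real_now_ms` are made up for illustration, and copying a byte into a list stands in for the real work of populating a cell. The clock is passed in as a function so it plays the role of getTimer().

```python
import time
from typing import Callable, List, Tuple

TIME_SLICE_MS = 25.0  # per-frame budget described in the post


def real_now_ms() -> float:
    """Stand-in for getTimer(): a millisecond clock."""
    return time.perf_counter() * 1000.0


def process_slice(height_map: bytes, cells: List[int], index: int,
                  now_ms: Callable[[], float],
                  budget_ms: float = TIME_SLICE_MS) -> int:
    """Process cells from `index` until the time budget runs out.

    Returns the index to resume from on the next frame tick.
    """
    start = now_ms()
    while index < len(height_map):
        cells[index] = height_map[index]  # placeholder for populating one hex cell
        index += 1
        if now_ms() - start > budget_ms:  # one timer call per cell
            break
    return index


def load_map(height_map: bytes,
             now_ms: Callable[[], float] = real_now_ms) -> Tuple[List[int], int]:
    """Simulate frame ticks until every cell is processed.

    Returns the populated cells and the number of ticks it took.
    """
    cells = [0] * len(height_map)
    index = 0
    frames = 0
    while index < len(height_map):
        index = process_slice(height_map, cells, index, now_ms)
        frames += 1
    return cells, frames
```

Calling `load_map(bytes(250_000))` runs the whole thing with the real clock. The point to notice is that the timer is read once per cell, so the cost of the timer call is paid 250 thousand times.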
The problem is that we are effectively calling getTimer() 250 thousand times, once per cell. In Flash Player version 29 the algorithm could handle 6 thousand or more iterations per 25 millisecond time slice. In the latest Flash Player this dropped to 300 or fewer. Understandably, the world map data took a LOT longer to process.
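To put rough numbers on that, here is a back-of-the-envelope calculation using only the figures above. It counts how many 25 ms slices are needed at each rate and how much time is spent inside the loop alone. The function name `slices_needed` is just for illustration, and since the rates are "6 thousand or more" and "300 or less", treat the results as bounds rather than measurements.

```python
import math

TOTAL_CELLS = 250_000  # hex map cells
SLICE_MS = 25          # time budget per frame tick


def slices_needed(cells_per_slice: int) -> int:
    """Number of frame ticks needed to get through every cell."""
    return math.ceil(TOTAL_CELLS / cells_per_slice)


for label, rate in (("Flash Player 29", 6_000), ("Flash Player 30", 300)):
    n = slices_needed(rate)
    loop_seconds = n * SLICE_MS / 1000
    print(f"{label}: {n} slices, about {loop_seconds:.2f} s inside the loop")
```

That works out to 42 slices (about 1 second) before the update and 834 slices (about 21 seconds) after it. Each frame also does rendering and other work on top of the 25 ms slice, so the real wall-clock time is longer than the loop time, which fits the jump from a second or two to 45+ seconds.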
I fixed it with a bit of a bodgy hack. Instead of calling getTimer EVERY iteration, I now call it only once every 1000 iterations. This means we only call it about 250 times and process roughly 5 thousand cells per 25 ms. Much better. Also, simple and fast to get out to you guys!
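A minimal sketch of that hack, again in Python rather than ActionScript and with made-up names (`process_slice_batched`, `check_every`). It assumes the check is keyed off the running cell index, which may not be exactly how the real client does it. The clock is passed in as a function so it plays the role of getTimer().

```python
from typing import Callable, List


def process_slice_batched(height_map: bytes, cells: List[int], index: int,
                          now_ms: Callable[[], float],
                          budget_ms: float = 25.0,
                          check_every: int = 1000) -> int:
    """Process cells from `index`, reading the clock only every `check_every` cells.

    Returns the index to resume from on the next frame tick.
    """
    start = now_ms()
    while index < len(height_map):
        cells[index] = height_map[index]  # placeholder for populating one hex cell
        index += 1
        # Only pay for the timer call on every check_every-th cell.
        if index % check_every == 0 and now_ms() - start > budget_ms:
            break
    return index
```

The trade-off is that a slice can now run past the 25 ms budget by however long it takes to process up to `check_every - 1` more cells. With the timer this expensive, that overshoot is a good deal.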
There are other places in our code where we call getTimer, and I will be evaluating our use of it. In the meantime we will be deploying this fix later today. It is a Flash client only deploy and does not require downtime. It might, however, require you to clear your cache, or at least delete the game's SWF from your cache. This shouldn't matter since we invalidate the game's SWF in our cloud storage, but I can't control browsers' caching strategies.