16 Critical Tips for Scaling Backend Web Services
The Definitive Checklist
This definitive list can be used as a tick list when you are designing and optimizing any backend web service. Follow it and you won't have to hand in your notice when your service goes to soft launch, let alone leave the country when it goes worldwide!
Yes! There will be feedback and improvements to this article.
Yes! I will make a few mistakes in this post (so please correct me politely below). But I hope sharing all this knowledge is seen as a benevolent act, so we can all enjoy apps and games without throwing our mobile phones out of the window when they lose the last 10 minutes of gameplay because the backend game server went down.
1. Do not put any resource locks on anything anywhere, ever
I know, I am stating the obvious, but locking a resource means other consumers will have to wait. By doing this you are creating a conduit exactly one molecule wide in your ninety GB/s data infrastructure. Don’t lock anything in the server side code.
2. Do not optimize until the day before the app goes live
It's an exaggeration, but only just! Yes, we've all been there: the code looks amazing and nobody really understands it.
Try to avoid complex, unreadable code, even if it is in this week's edition of "C# Zeitgeist Weekly", as this often means the code will not be easy to change without unknown side effects. A developer may have spent ages elbowing that code in, when really a simple switch statement might have done. This point also covers over-designing: OO patterns have their place, but simplicity is King.
3. Minimize Server Requests
Static data doesn't change, as opposed to dynamic data, which does. Understand the difference between the two types.
There are also different lifetimes for static data. If the data is the same forever, install it with the app/game and avoid any backend effort. If the static data changes at each login/game start-up, send it once from server to client. Design your system so it understands when data needs to be refreshed, and try to make as much as possible static. Don't keep getting the same static data from the server if you don't need to.
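One way to "understand when data needs to be refreshed" is to give each blob of static data a version number and only download it again when the version moves. This is a minimal sketch, not a definitive design: `StaticDataCache`, `fake_fetch` and the `item_stats` payload are all made-up names, and the version number is assumed to arrive cheaply (for example with the login response).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class StaticDataCache:
    """Client-side cache that re-downloads static data only when its version changes."""

    # fetch(name) -> (version, payload); a hypothetical stand-in for a real server call.
    fetch: Callable[[str], Tuple[int, dict]]
    _store: Dict[str, Tuple[int, dict]] = field(default_factory=dict)

    def get(self, name: str, server_version: int) -> dict:
        cached = self._store.get(name)
        if cached is not None and cached[0] == server_version:
            return cached[1]  # still current: no request needed
        version, payload = self.fetch(name)
        self._store[name] = (version, payload)
        return payload


# Usage: record every time the pretend server is actually hit.
calls = []


def fake_fetch(name: str) -> Tuple[int, dict]:
    calls.append(name)
    return 3, {"sword_damage": 12}


cache = StaticDataCache(fetch=fake_fetch)
cache.get("item_stats", server_version=3)  # first call downloads
cache.get("item_stats", server_version=3)  # second call is served locally
```

After both calls, `calls` holds a single entry, so the second lookup never touched the server.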
4. Reduce refreshes
Does all the data need to be up to date all the time? The latest global high scores are not critical, and letting some data go out of sync sometimes saves bandwidth. Don't be obsessed with showing the latest data. Also, don't set up the client to auto-refresh: if no one is watching the screen, why bother updating the data?
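A time-to-live cache is one simple way to let data go slightly stale on purpose. The sketch below is only an illustration: `TtlCache`, `load_high_scores` and the 60 second lifetime are assumptions of mine, and the fake clock exists purely so the example runs without waiting.

```python
import time
from typing import Callable, List, Optional


class TtlCache:
    """Serve a cached value until it is older than ttl_seconds."""

    def __init__(self, loader: Callable[[], List[int]], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[List[int]] = None
        self._loaded_at = 0.0

    def get(self) -> List[int]:
        now = self._clock()
        if self._value is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()  # the only place the backend is hit
            self._loaded_at = now
        return self._value


# Usage with a fake clock and a pretend backend call.
loads = []
fake_now = [0.0]


def load_high_scores() -> List[int]:
    loads.append(fake_now[0])
    return [9000, 8500, 7200]


scores = TtlCache(load_high_scores, ttl_seconds=60, clock=lambda: fake_now[0])
scores.get()
scores.get()          # still fresh: served from the cache
fake_now[0] = 61.0
scores.get()          # older than 60 seconds: reloaded
```

Three reads cost two backend calls here; with a real screen refreshing every second the saving is much larger.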
5. Use the right containers in your code
Do not use non-associative containers like lists when you are going to search through them looking for what really is a key. Use associative containers (maps/dictionaries) if there is a natural index in your data. Use containers that do not allow duplicates if that is what you want (non-multi containers, e.g. a set). If you don't know the number of elements you'll be dealing with, don't use a fixed-size container that has to grow when it gets too big. Use a list instead.
Consider what you will mostly be doing with the data in your containers. Will you mostly be inserting, retrieving or removing? There are easy ways to choose your containers if you know the most common use case.
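To make the list versus dictionary point concrete, here is a small sketch in Python. The player names and scores are invented for the example; the point is only that the same lookup is a linear scan in one container and a keyed lookup in the other.

```python
from typing import Dict, List, Optional, Tuple

players_list: List[Tuple[str, int]] = [("alice", 120), ("bob", 95), ("carol", 310)]


def score_from_list(name: str) -> Optional[int]:
    # Linear scan: the cost grows with the number of players.
    for player_name, score in players_list:
        if player_name == name:
            return score
    return None


# Same data keyed by name: lookups are constant time on average.
players_by_name: Dict[str, int] = dict(players_list)


def score_from_dict(name: str) -> Optional[int]:
    return players_by_name.get(name)


# A set silently ignores duplicates, so no "is it already there?" check is needed.
online = set()
online.add("alice")
online.add("alice")
```

Both functions return the same answer; only the amount of work behind it differs, and `online` ends up with one entry.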
6. Store time and date as Unix time
Date and time calculations should be done in Unix epoch time, i.e. seconds since 1 January 1970 UTC. That way any calculation is just math on two longs, and we all know two longs don't make a right!
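As a rough illustration, here is what "math on two longs" looks like in Python. The 15 minute boost is an invented example; the idea is to store and compare plain integers and only convert to a human-readable date at the edge, for display.

```python
import time
from datetime import datetime, timezone


def seconds_until(expiry_epoch: int, now_epoch: int) -> int:
    # Plain integer subtraction: no time zones, no calendar rules.
    return max(0, expiry_epoch - now_epoch)


now = int(time.time())
boost_expires = now + 15 * 60  # hypothetical 15 minute power-up
remaining = seconds_until(boost_expires, now)


def to_display(epoch: int) -> str:
    # Convert to a readable timestamp only when showing it to someone.
    return datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat()
```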
7. Denormalize DB table design
Have redundancy in your database; you need it to avoid joins. Depending on the database and isolation level, a join can hold locks on every table it touches, which contravenes rule 1 above. If the service can get all the data from a single table row then it's going to be super quick. This means replicating fields across tables.
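Here is a small sketch of what replicating a field can look like, using Python's built-in `sqlite3` module as a stand-in for a real server database. The table and column names are made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Normalized: showing a score needs a join to get the player's name.
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores  (player_id INTEGER, score INTEGER);
    -- Denormalized: the name is copied into every score row.
    CREATE TABLE scores_flat (player_id INTEGER, player_name TEXT, score INTEGER);
""")
db.execute("INSERT INTO players VALUES (1, 'alice')")
db.execute("INSERT INTO scores VALUES (1, 310)")
db.execute("INSERT INTO scores_flat VALUES (1, 'alice', 310)")

# Two tables touched to answer one question...
joined = db.execute(
    "SELECT p.name, s.score FROM scores s JOIN players p ON p.id = s.player_id"
).fetchall()
# ...versus one table, one row.
flat = db.execute("SELECT player_name, score FROM scores_flat").fetchall()
```

Both queries return the same rows. The price of the flat table is that a player rename now has to update every copy of the name.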
8. Split tables into
Writing to a single table is a bottleneck. Having multiple smaller tables with identical schemas allows for more writes. Simply hash the key and use the relevant table. In its simplest form, when storing player IDs, have one table per group of first letters: one for A to G, another for H to P and lastly one for Q to Z. Of course, eventually you will hit the transaction log limit of the DB (as there can only be 1 transaction log per DB).
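Both routing schemes described above fit in a few lines. This is a sketch under my own assumptions: the `players_*` table names and the count of three tables are invented, and player IDs are assumed to be non-empty strings.

```python
import zlib

TABLE_COUNT = 3  # assumption: three identically schemed tables


def table_for(player_id: str) -> str:
    # crc32 gives the same result in every process, unlike Python's
    # built-in hash() for strings, which is randomized per run.
    bucket = zlib.crc32(player_id.encode("utf-8")) % TABLE_COUNT
    return f"players_{bucket}"


def table_for_first_letter(player_id: str) -> str:
    # The simplest form: route on the first letter of the ID.
    first = player_id[0].upper()
    if "A" <= first <= "G":
        return "players_a_g"
    if "H" <= first <= "P":
        return "players_h_p"
    return "players_q_z"  # Q to Z, plus anything that is not a letter
```

The hashed version tends to spread IDs more evenly than the first-letter version, since real names are not evenly spread across the alphabet.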
9. Shard/Replicate Databases
Why not replicate the entire DB? After all, do you want players from Europe playing against players from North America? For non-turn-based games, ping times would not be good enough anyway.
10. Use Transient or In-Memory Tables
If you have data that becomes useless when the system goes down, you would not want it persisted, so don't put it in a persisted table. Normal DB tables get written to disk, and this is very slow. Store the tables for this data in memory, so there are no disk lookups. Memory can be something like 10,000 times quicker than reading from an external device, and even a PCIe 4 NVMe SSD will slow down as it gets fuller, so avoid persistent storage if you don't need it. Transient tables are your new friend (MySQL has the MEMORY storage engine for this). Just be careful of memory usage on your servers.
Don't allow the client to get 40,000 monster locations from the DB if it can only display 4 at a time. As the server is stateless, it can't hold the rest in memory in case the client wants the next 4, so return only the ones that are about to be displayed.
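A stateless way to do this is to have the client say where it is in the list and to cap the page size on the server. The sketch below fakes the monster table with an in-memory list; `monsters_page` and `MAX_PAGE_SIZE` are hypothetical names, and in a real service the slice would be a LIMIT/OFFSET (or keyed) query.

```python
from typing import List, Tuple

Monster = Tuple[int, float, float]  # (id, x, y)

# Stand-in for the DB table of 40,000 monster locations.
MONSTERS: List[Monster] = [(i, float(i), float(i * 2)) for i in range(40_000)]
MAX_PAGE_SIZE = 4  # the client can only display 4 at a time


def monsters_page(offset: int, limit: int) -> List[Monster]:
    # Clamp whatever the client asks for; never trust it to be reasonable.
    limit = max(0, min(limit, MAX_PAGE_SIZE))
    offset = max(0, offset)
    return MONSTERS[offset:offset + limit]
```

Because the offset travels with each request, the server keeps nothing in memory between calls: `monsters_page(4, 4)` is all it needs to serve the next 4.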
If a game client can legitimately send 10 requests per minute, then allow it 20. Throttle the service so it cannot be hit with a Denial of Service attack (normally traffic from multiple IPs, with each IP sending an unusual amount).
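One common way to throttle is a sliding window per client. The following is a minimal sketch rather than a production limiter: `RateLimiter` is a made-up class, the client is identified by IP purely for illustration, and the counters live in one process, so a fleet of servers would need to keep them in a shared store instead.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class RateLimiter:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False  # over the limit: reject without doing any work
        hits.append(now)
        return True


# A client expected to send 10 per minute is allowed 20.
limiter = RateLimiter(max_requests=20, window_seconds=60)
results = [limiter.allow("10.0.0.7", now=float(i)) for i in range(25)]
```

Of the 25 requests sent one second apart, the first 20 are allowed and the last 5 are rejected.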
13. Set a Maximum concurrent users
Set a soft cap and a hard cap. If your service can scale to 10,000 concurrent users, that is the hard cap: if it hits that number, stop taking new users. But before it gets that bad, set a soft cap, say 9,000, and only let in privileged users at that point. In game backends these are often players who have spent coin.
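The admission rule is small enough to write down directly. This sketch uses the numbers from the paragraph above; `may_join` and the `is_privileged` flag are names invented for the example.

```python
HARD_CAP = 10_000  # the most the service can handle
SOFT_CAP = 9_000   # from here on, privileged users only


def may_join(current_users: int, is_privileged: bool) -> bool:
    if current_users >= HARD_CAP:
        return False          # full: nobody gets in
    if current_users >= SOFT_CAP:
        return is_privileged  # nearly full: paying players only
    return True
```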
14. Profile Transactions
When optimizing, take the worst 5 transactions and profile these on the DB.
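Finding the worst 5 first needs some timings. Assuming you already log a (transaction name, duration) pair per call, a sketch like the one below picks the candidates; the function name, the sample data and the choice to rank by total time rather than by average are all my assumptions.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple


def worst_transactions(samples: List[Tuple[str, float]],
                       count: int = 5) -> List[Tuple[str, float]]:
    """Return the transactions with the highest total time, worst first."""
    totals: Dict[str, float] = defaultdict(float)
    for name, millis in samples:
        totals[name] += millis
    return heapq.nlargest(count, totals.items(), key=lambda item: item[1])


# Hypothetical timing log: (transaction name, milliseconds).
samples = [("save_game", 40.0), ("get_scores", 5.0),
           ("save_game", 60.0), ("login", 20.0)]
worst = worst_transactions(samples, count=2)
```

Ranking by total time favours transactions that are both slow and frequent, which is usually where profiling effort pays off.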
15. Index Tables Properly
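Seek vs. scan: a seek uses an index to jump straight to the matching rows, while a scan reads every row to find them. Read up on it here.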
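You can watch the difference happen with Python's built-in `sqlite3` module, used here only as a convenient stand-in for your real database. SQLite's query plan says SEARCH where other databases say seek. The `players` table and the index name are invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT, score INTEGER)")


def plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = db.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM players WHERE name = 'alice'"

before = plan(query)  # no index on name: every row is scanned
db.execute("CREATE INDEX idx_players_name ON players (name)")
after = plan(query)   # with the index: a search that uses idx_players_name
```

Printing `before` and `after` shows the plan changing from a SCAN to a SEARCH that names the index; the exact wording varies a little between SQLite versions.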
16. Push CPU intensive work to the client
Know your game client hardware. Sometimes you can send everything to the client and let it work things out for itself, e.g. sorting lists of data for display. Remember not to trust the client, though.