A lot has happened since, and we believe the previous decisions were correct at the time. Most of them still are, but we now have a more realistic deadline and more limited resources, which will change some decisions.
Here’s a quick look at the outlined architecture, circa June 2015, and how it changed.
We would follow an SOA approach (or microservices, if you prefer), with each core game service being independent. We knew that this was hard, but the ease of scaling things out would eventually pay off.
That remains true for the revised architecture.
Operations & Deployment
We made a bet on Docker and stateless servers. It would fit perfectly with our microservices approach. We stand with this choice.
However, we knew that Docker alone wouldn’t fix stuff like CI, CD, Canary releases, Blue-Green deployment, automatic failover, scalability and redundancy, among other common problems related to (dev)ops.
Back then, Renato was already working on generic software to handle all this complex stuff automatically. This software was known as "Reinventing the Wheel™", but a lot was learned from it.
Change: The first change, which was actually made several months ago, was to drop this custom software in favor of Google’s Kubernetes.
The game backend would be done mostly in Python and Elixir.
Major change: Programming in Elixir proved more fun and efficient than we expected. Seeing what was already done in Python, and how much work it would take to rewrite it in Elixir, we decided to move the backend completely to Elixir and Erlang/OTP.
This brings several benefits. Aside from the common advantages of using a functional language, we now have:
- a team that speaks the same language and uses the same libraries.
- higher efficiency when developing new features.
- higher reliability across all the code we develop.
Any choice is about trade-offs. Moving to Elixir is no different. Python was doing its job just fine, but given our limited funding, we feel we will be much more productive in Elixir.
We also dropped a Python framework for AsyncIO that we had been developing.
NOTE: We are aware of Erlang’s advantages and disadvantages. Programmers should always choose the right tool for the job. Whenever Erlang/Elixir is not the best candidate, we’ll make use of a different language, either as a completely independent service or as a port/NIF.
Messaging & inter-service communication
We had chosen RabbitMQ to send data between services. It is robust and reliable, and its routing features are amazing. RabbitMQ was doing the job perfectly well, but we’ll move to something else.
Major change: With the full migration to Elixir, we can now take advantage of the battle-tested Erlang/OTP environment. We were designing something that turned out to be similar to what José Valim is doing with GenRouter/GenBroker. This is our biggest bet so far. We plan to have all Elixir services communicate with each other using Erlang/OTP’s reliable framework.
We will have services written in languages other than Elixir/Erlang. In that case, we will create a "Translator" service that, er, translates specific internal GenBroker messages and publishes them to RabbitMQ topics.
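To make the "Translator" idea a bit more concrete, here is a minimal sketch in Python, the kind of language a non-Elixir service might use. Everything in it is an assumption made for illustration: the (service, event, payload) message shape, the "service.event" routing key convention, and the names to_topic_message and translate are hypothetical, not our actual design. The publisher is passed in as a plain function so the sketch runs without a broker; in a real setup it would wrap a RabbitMQ client call such as pika's basic_publish on a topic exchange.

```python
import json
from typing import Callable, Dict, Tuple

# Hypothetical shape of an internal message: (service, event, payload).
InternalMessage = Tuple[str, str, Dict[str, object]]


def to_topic_message(message: InternalMessage) -> Tuple[str, bytes]:
    """Turn an internal message into a (routing_key, body) pair."""
    service, event, payload = message
    # Assumed convention: topic routing keys look like "<service>.<event>".
    routing_key = f"{service}.{event}"
    # JSON is assumed as a language-neutral wire format for other services.
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return routing_key, body


def translate(message: InternalMessage,
              publish: Callable[[str, bytes], None]) -> None:
    """Translate one internal message and hand it to the injected publisher."""
    routing_key, body = to_topic_message(message)
    publish(routing_key, body)


if __name__ == "__main__":
    sent = []
    # Stand-in publisher that just records what would be sent to RabbitMQ.
    translate(("server", "created", {"id": 42}),
              lambda key, body: sent.append((key, body)))
    print(sent)
```

Keeping the translation step separate from the publishing step means the mapping can be tested on its own, and services on the RabbitMQ side only need to understand topic routing keys and JSON.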
With a team familiar with systems administration, we made the choice of renting dedicated servers. It is cheaper and has better performance. We’ve been using OVH for over 2 years on HE1, and have only had a couple of problems in the meantime.
We stand with this choice. We believe that having one or two spare servers will provide nearly as much redundancy as we’d have with AWS or GCE, at a fraction of the price. Plus, we can’t afford the possibility of being DDoSed and having to pay thousands of dollars for bandwidth.
One additional benefit, now with our choice of Kubernetes, is that we can deploy the game continuously on a couple of AWS or GCE servers, and then buy the dedicated hardware only when releasing the first version of the game.
We’ve chosen PostgreSQL as our main database and we stand with it. It will hold most of the game data.
The second most important datastore within the game is Elasticsearch. It holds most of the geographical data and (in-game) log data.
Finally, we make use of Mnesia to hold internal settings, as well as specific game/feature flags.
For the cache layer we use Aerospike.
We use AWS S3 and GCE Nearline for backups and archives, respectively.
GIS (Geographic Information System)
PostGIS and Elasticsearch do the trick for us.
However, none of us has experience with GIS systems, so we are still researching the best technologies out there. If you’ve got experience with or recommendations for Mapserver, Geoserver or Mapnik, please talk to us.