In the last 2 months the daily traffic at VersionEye doubled, and the traffic on the site keeps growing. We upgraded the EC2 instances with more RAM and more CPUs. Of course that is only a very short-term solution. To be able to scale better we did quite a bit of refactoring work in the last few weeks.
VersionEye started as a default Rails app, created with exactly this command:
rails new versioneye
That was more than 2 years ago. Over time the application grew. We added parsers and crawlers to the Rails app, because it was easy to reuse the models and it was a fast way to add new features. At one point we added Grape to the project to implement the API. For a long time the Rails monolith ran on 1 single machine together with the database and the crawlers. The server was lovingly handcrafted by myself 🙂
With growing traffic the infrastructure had to change as well. First of all the backend systems (MongoDB, Elasticsearch and Memcached) each got their own EC2 instance. The next step was to deploy the crawlers on a separate EC2 instance.
Ansible – a simple way to automate IT
2 servers are already reason enough to automate the IT infrastructure. I took a look at Chef and Puppet, but honestly I couldn’t warm to either of them. They are too complicated and a bit over-engineered for my current purpose. Marvin and Christian from Playtestcloud recommended that I take a look at Ansible. So I did, and I really liked it. Ansible requires no master and no agent needs to be installed on the remote servers. The only requirement is that you are able to log in via SSH to the remote server.
Currently the whole infrastructure for VersionEye is described in 17 Ansible roles. With 1 single command the whole infrastructure, or just parts of it, can be set up. The Ansible scripts are all managed in a separate Git repository. That way all changes are tracked and documented. IT infrastructure is just another code repository now! That’s pretty cool!
Small code repositories
In the past the API was deployed together with the Rails application. As the API became a little more popular, that caused some issues. If somebody used the API heavily, that slowed down the web application, and vice versa.
In the last weeks we split up the big monolithic Rails app. First of all we extracted the models, mailers, services and parsers into the ‘versioneye-core’ project. As the next step we pulled out the Grape API into a separate application. The web application and the API now use ‘versioneye-core’ as a regular dependency. That means the web application and the API can be deployed and scaled independently, without affecting each other.
Splitting up the code into smaller repositories was not the only thing. The code itself was refactored as well. Big classes have been split up into smaller classes and test coverage was increased for each repository.
Next steps
I think we are on the right track. By now the whole code base for VersionEye is distributed over 12 Git repositories. There is still room for improvement, and as the service grows we will continue to split it up into smaller services / Git repositories.
But the immediate next step will be to introduce RabbitMQ and EventMachine. Many tasks can be handed off to RabbitMQ and processed in parallel by workers, for example sending out emails, parsing files and importing metadata from external APIs.
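The idea of handing tasks to a queue and letting workers process them in parallel can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the VersionEye implementation: Ruby's built-in thread-safe Queue stands in for RabbitMQ, threads stand in for separate worker processes, and the task names, handlers and payloads are all hypothetical.

```ruby
# In-process stand-in for a message broker. Ruby's thread-safe Queue
# plays the role a RabbitMQ queue would play between the web app
# (producer) and the workers (consumers).

# Hypothetical task handlers; names and payloads are illustrative only.
HANDLERS = {
  send_email:  ->(payload) { "email sent to #{payload[:to]}" },
  parse_file:  ->(payload) { "parsed #{payload[:name]}" },
  import_meta: ->(payload) { "imported metadata for #{payload[:package]}" }
}.freeze

# Publishes all tasks to a queue, lets worker_count threads consume
# them concurrently, and returns the collected results sorted.
def run_workers(tasks, worker_count: 3)
  queue   = Queue.new
  results = Queue.new

  # Producer side: enqueue every task, then one stop marker per worker.
  tasks.each { |task| queue << task }
  worker_count.times { queue << :stop }

  # Consumer side: each worker pops tasks until it sees a stop marker.
  workers = Array.new(worker_count) do
    Thread.new do
      while (task = queue.pop) != :stop
        type, payload = task
        results << HANDLERS.fetch(type).call(payload)
      end
    end
  end
  workers.each(&:join)

  # Completion order is not deterministic, so sort before returning.
  Array.new(results.size) { results.pop }.sort
end

TASKS = [
  [:send_email,  { to: 'user@example.com' }],
  [:parse_file,  { name: 'Gemfile' }],
  [:import_meta, { package: 'rails' }]
].freeze

RESULTS = run_workers(TASKS)
puts RESULTS
```

With a real broker the producer and the workers would live in different processes or on different machines, so a slow task such as a large import would no longer block a web request.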
Interesting to read how you improved your performance. Thanks for sharing!
Thanks for sharing this note!
I am interested in Ansible now 🙂
Anyone using it also? (with Rails)
You are welcome 🙂 I know that ThoughtWorks is using Ansible in a project together with Vagrant.
Sounds like the Blinkist app architecture 😀 Good job. I've got to check out Ansible.
Ansible is awesome. If you want to move away from Heroku you should really check it out!