MongoDB is currently the primary database for VersionEye. Over the last couple of weeks we had some performance and scaling issues, which unfortunately caused some downtime. Here are the lessons learned from the last 3 weeks.
Mongoid
The Ruby code at VersionEye uses the Mongoid ODM to access MongoDB. All in all, Mongoid is a great piece of open source software. There is a very active community which offers great support.
In our case Mongoid somehow didn’t close the connections it opened. With each HTTP request a new connection to MongoDB is created. Once the HTTP response has been generated, the connection can be closed. Unfortunately this didn’t happen automatically. So the open connections piled up on the MongoDB Replica Set and the application became slower and slower over time. After a restart of the Replica Set the count started at 0 again and the application was fast again. At least for a couple of hours, until the open connections piled up into the hundreds again.
For now, that’s fixed with this filter in the ApplicationController.
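# Close the MongoDB connection after every request.
after_filter :disconnect_from_mongo

def disconnect_from_mongo
  Mongoid.default_session.disconnect
rescue => e
  # Log the failure instead of breaking the response.
  p e.message
  Rails.logger.error e.message
  Rails.logger.error e.backtrace.join("\n")
end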
I’m still not sure whether this is a bug in Mongoid or a misconfiguration on our side.
MongoDB Aggregation Framework
We have a cool feature at VersionEye which shows the references for software packages. These are the references for the Rails framework, for example.
This feature shows you which other software libraries use the selected software library as a dependency. Usually, many references are a good sign of quality software.
In the beginning this feature was implemented with the Aggregation Framework of MongoDB and it was fast enough. This is the aggregation code snippet we used for this feature.
deps = Dependency.collection.aggregate(
  { '$match' => { :language => language, :dep_prod_key => prod_key } },
  { '$group' => { :_id => '$prod_key' } },
  { '$skip' => skip },
  { '$limit' => per_page }
)
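To make the four pipeline stages easier to follow, here is a minimal in-memory sketch in plain Ruby. It is not how MongoDB executes the pipeline; the sample records and the `references` helper are made up for illustration, and only the field names (`language`, `prod_key`, `dep_prod_key`) come from the snippet above. Note that MongoDB does not guarantee the order of `$group` results, while this sketch keeps first-seen order.

```ruby
# Hypothetical sample records standing in for the dependency collection.
DEPENDENCIES = [
  { language: "Ruby", prod_key: "devise",        dep_prod_key: "rails" },
  { language: "Ruby", prod_key: "devise",        dep_prod_key: "rails" }, # second version of the same package
  { language: "Ruby", prod_key: "activeadmin",   dep_prod_key: "rails" },
  { language: "Ruby", prod_key: "spree",         dep_prod_key: "rails" },
  { language: "Java", prod_key: "some-java-lib", dep_prod_key: "junit" }
].freeze

# In-memory equivalent of the pipeline: match, group, skip, limit.
def references(records, language, dep_prod_key, skip, per_page)
  records
    .select { |d| d[:language] == language && d[:dep_prod_key] == dep_prod_key } # $match
    .map { |d| d[:prod_key] }                                                    # keep only the grouping key
    .uniq                                                                        # $group by prod_key
    .drop(skip)                                                                  # $skip
    .first(per_page)                                                             # $limit
end

p references(DEPENDENCIES, "Ruby", "rails", 0, 2) # => ["devise", "activeadmin"]
p references(DEPENDENCIES, "Ruby", "rails", 2, 2) # => ["spree"]
```

The first two steps are the expensive part on a large collection: every matching record has to be found and deduplicated before a single page can be returned.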
At the time this was implemented we had fewer than 4 million dependency records in the collection. Over time the collection grew. Right now there are more than 9 million records in the collection, and the aggregation code snippet above is just terribly slow. And it slows down everything else too. If multiple HTTP requests trigger this code, the whole database gets super slow! I already wrote a blog post about that here.
One thing I learned is that the Aggregation Framework doesn’t take advantage of Indexes. The same is true for the Map & Reduce feature in MongoDB. Originally Map & Reduce was created to crunch data in parallel, super fast. On MongoDB, Map & Reduce runs on a single thread, without indexes.
Wrong Indexes
Instead of calculating the references in real time with MongoDB’s Aggregation Framework, we wanted to precalculate the references with a simple query. This one:
prod_keys = Dependency.where(:language => product.language, :dep_prod_key => product.prod_key).distinct(:prod_key)
The advantage of this distinct query over the Aggregation Framework is that it can take advantage of indexes. And there is an index specifically for that query!
index({ language: 1, dep_prod_key: 1 }, { name: "language_dep_prod_key_index" , background: true })
On localhost the query ran quite fast. Still too slow for real time, but fast enough to precalculate all values overnight. On production it ran super slow! Each query took 17 seconds. Calculating the references for all 400K software libraries in our database would take 78 days.
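The 78 days follow directly from the two numbers above. A quick back-of-the-envelope check in Ruby, assuming the queries run one after another with no parallelism:

```ruby
LIBRARIES = 400_000          # software libraries in the database
SECONDS_PER_QUERY = 17       # observed time per distinct query on production
SECONDS_PER_DAY = 24 * 60 * 60

total_seconds = LIBRARIES * SECONDS_PER_QUERY        # 6_800_000
total_days = total_seconds / SECONDS_PER_DAY.to_f

puts total_days.round(1) # prints 78.7
```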
Finally Asya gave the right hint. Asya recommended double-checking the query in the mongo console with “.explain()”, to see which indexes are used. And indeed MongoDB was using the wrong index on production! Only God and the core committers know why. For me that’s a bug!
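As a rough illustration of that check: in the MongoDB 2.x releases of that era, the document returned by “.explain()” had a "cursor" field such as "BtreeCursor <index name>" for an index scan or "BasicCursor" for a full collection scan. The sketch below pulls the index name out of such a document. The `index_used` helper and both sample hashes are hypothetical, not real production output, and newer MongoDB versions report the plan in a different structure, so treat this as a sketch for the legacy format only.

```ruby
# Returns the index name from a legacy (MongoDB 2.x style) explain document,
# or nil if the query did not use an index (e.g. "BasicCursor").
def index_used(explain)
  match = explain["cursor"].to_s.match(/\ABtreeCursor (\S+)/)
  match && match[1]
end

EXPECTED_INDEX = "language_dep_prod_key_index" # index name from the post

# Hypothetical explain documents, trimmed to the one field we inspect.
good_plan  = { "cursor" => "BtreeCursor language_dep_prod_key_index" }
wrong_plan = { "cursor" => "BtreeCursor some_other_index" } # made-up index name
full_scan  = { "cursor" => "BasicCursor" }

[good_plan, wrong_plan, full_scan].each do |plan|
  used = index_used(plan)
  status = used == EXPECTED_INDEX ? "ok" : "WRONG INDEX"
  puts "#{used.inspect}: #{status}"
end
```

Running a check like this against production, not just localhost, matters here because the two environments picked different indexes for the same query.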
This is what happens if you run a couple distinct queries which use the wrong index.
I deleted 5 indexes on the collection until MongoDB had no other choice but to use the damn right index! And now it’s running fast enough. Finally!
Conclusion
Here are the conclusions for working with MongoDB:
- Regularly check the logs on the MongoDB Replica Set to spot anything odd.
- Close open connections.
- Avoid the Aggregation Framework if you can do the same with a simple query.
- Ensure that MongoDB is using the right Index for your query.
So far so good.
Just a note for those who find this later, “One thing I learned is that the Aggregation Framework doesn’t take advantage of Indexes.” is not correct. Aggregation Framework can and does use indexes – in your case though it was using the same wrong index as a regular query. More discussion why is in my reply in mongodb-user google group.
Hi Asya. Thanks for the correction. You are right. The Aggregation Framework does take advantage of indexes, but in my case it was using the wrong index and that’s why it was terribly slow. As slow as without indexes.
There is a related discussion to this blog post on Google Groups. For more insights read this thread:
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/mongodb-user/FKYeTgn6rqI