Another lesson learned with MongoDB

MongoDB is the primary database at VersionEye. Currently VersionEye crawls more than 600K open source projects on a daily basis. Some of the crawlers are implemented in Java, others in Ruby. You can follow a library at VersionEye, and as soon as the next version comes out you get an email notification. Today I got this email from VersionEye.

Screen Shot 2015-06-25 at 18.20.59

As you can see, the version information is missing for the Java libraries. The email template was not touched in the last couple of days. Obviously the crawlers for Maven repositories are implemented in Java :-) and they get updated more frequently, so the error must be somewhere in the Java crawlers.

The version object is an embedded document in the product object. Every time a crawler finds a new version, it adds a new version object to the corresponding product object. The code for that looks like this.

BasicDBObject productMatch = new BasicDBObject();
productMatch.put(Product.LANGUAGE, language);
productMatch.put(Product.PROD_KEY, prodKey);

BasicDBObject versionObj = version.getDBObject();
versionObj.put(Version.VERSION, version.getVersion());

BasicDBObject versionsUpdate = new BasicDBObject();
versionsUpdate.put("$push", new BasicDBObject(Version.VERSIONS, versionObj));

So far so good. In the next lines the product object is updated with the current time.

DBObject verUpdate = getDBObjectByKey(language, prodKey);
verUpdate.put(Product.UPDATED_AT, new Date());
getCollection().update(productMatch, verUpdate);

And of course there is a unit test case for this code, and the test case is always green. On production, however, the new version sometimes just disappears. Not always! Just sometimes. At first I thought I had found a bug in MongoDB, but this only happened to the Java crawlers, never to the Ruby crawlers. So the root of all evil must be the implementation. I needed a whole day to figure it out!

On production MongoDB is running in a Replica Set on multiple hosts, and 2 days ago I changed the read preference of the MongoDB driver to “secondary”. That means that read operations are served by the secondary nodes of the Replica Set, which replicate from the primary asynchronously and can lag behind it for a short time. And this is what happened.

The first code snippet always runs through and adds a new version to the product. But then the 2nd code snippet is reloading the product object from the db and executing an update.

DBObject verUpdate = getDBObjectByKey(language, prodKey);
verUpdate.put(Product.UPDATED_AT, new Date());
getCollection().update(productMatch, verUpdate);

If the changes are not yet distributed across the whole Replica Set and the read operation goes to a node which doesn’t have the new version yet, a product object is loaded without the new version. On this object the “updated_at” field is updated, and the object is stored back to the database. But the “update” method of the Java driver, when it is given a plain document, doesn’t update only the changed field; it replaces the whole object. And so the object gets stored without the new version.

There are different solutions to this. First of all I could change the read preference back to “primary” again. But there is a better solution. Actually there is a way to only update single properties in a document in MongoDB. That works like this.

BasicDBObject newValues = new BasicDBObject(); // only the fields to change, not the reloaded product
newValues.put(Product.UPDATED_AT, new Date());
BasicDBObject set = new BasicDBObject("$set", newValues);
getCollection().update(productMatch, set);

The big difference is in line 3. The “$set” operator tells MongoDB to update only the fields contained in newValues and to leave the rest of the document untouched. One day of headache for a one-liner! I hope I can save somebody else a day of headache with this blog post.
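The difference between replacing a whole document and setting a single field can be illustrated without a database. The following is only a minimal sketch in plain Java: the maps stand in for the copies of a product on the primary and on a lagging secondary, and the class name, method names and the "example/lib" key are made up for this illustration; none of it is VersionEye or driver code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StaleReadSketch {

    // Copies a document, including its nested "versions" list.
    static Map<String, Object> copyOf(Map<String, Object> doc) {
        Map<String, Object> copy = new HashMap<>(doc);
        @SuppressWarnings("unchecked")
        List<String> versions = (List<String>) doc.get("versions");
        copy.put("versions", new ArrayList<>(versions));
        return copy;
    }

    // Simulates "push a version, then touch updated_at" and returns the
    // versions that end up stored on the primary.
    static List<String> versionsAfterUpdate(boolean useSet) {
        Map<String, Object> primary = new HashMap<>();
        primary.put("prod_key", "example/lib"); // hypothetical product
        primary.put("versions", new ArrayList<>(Arrays.asList("1.0.0")));

        // The secondary has replicated everything up to this point.
        Map<String, Object> secondary = copyOf(primary);

        // Step 1: the crawler pushes a new version. Only the primary has it so far.
        @SuppressWarnings("unchecked")
        List<String> primaryVersions = (List<String>) primary.get("versions");
        primaryVersions.add("1.1.0");

        if (useSet) {
            // Step 2, field-level update: only the listed field is written.
            Map<String, Object> fields = new HashMap<>();
            fields.put("updated_at", new Date());
            primary.putAll(fields);
        } else {
            // Step 2, whole-document replace: the stale copy read from the
            // secondary is written back and overwrites the primary's document.
            Map<String, Object> stale = copyOf(secondary);
            stale.put("updated_at", new Date());
            primary = stale;
        }

        @SuppressWarnings("unchecked")
        List<String> result = (List<String>) primary.get("versions");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(versionsAfterUpdate(false)); // [1.0.0]
        System.out.println(versionsAfterUpdate(true));  // [1.0.0, 1.1.0]
    }
}
```

With the replace variant the freshly pushed version is lost; with the field-level variant it survives, because the stale copy never gets written back.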

VersionEye Maven Plugin 3.5.0

Version 3.5.0 of the VersionEye Maven Plugin is out. It offers another configuration option to skip certain dependencies with a defined scope.

By default the VersionEye Maven Plugin is resolving ALL dependencies and sending ALL dependencies to the VersionEye API. The VersionEye Server is then checking the dependencies and is sending out emails. But sometimes you get notifications for a dependency which is under test scope or provided scope. Something you don’t really ship with your application. Ideally you want VersionEye to ignore those dependencies. Now you can configure which scopes should be skipped by the plugin. Simply add this line to your plugin configuration.

<skipScopes>test,provided</skipScopes>

The line above will ignore all dependencies which have test or provided scope.
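To make the effect of that setting concrete, here is a rough sketch of scope-based filtering in plain Java. It is not the plugin's actual implementation; the Dependency class, the method names and the sample dependencies are assumptions made for this example. The only Maven fact it relies on is that a dependency without an explicit scope has the scope "compile".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SkipScopesSketch {

    // Hypothetical, minimal dependency model for this illustration.
    static final class Dependency {
        final String name;
        final String scope; // may be null when no scope is declared

        Dependency(String name, String scope) {
            this.name = name;
            this.scope = scope;
        }
    }

    // Turns a value like "test,provided" into a set of scope names.
    static Set<String> parseSkipScopes(String value) {
        Set<String> scopes = new HashSet<>();
        if (value == null) {
            return scopes;
        }
        for (String part : value.split(",")) {
            String scope = part.trim().toLowerCase();
            if (!scope.isEmpty()) {
                scopes.add(scope);
            }
        }
        return scopes;
    }

    // Returns the names of all dependencies whose scope is not skipped.
    static List<String> namesToSend(List<Dependency> dependencies, String skipScopes) {
        Set<String> skipped = parseSkipScopes(skipScopes);
        List<String> names = new ArrayList<>();
        for (Dependency dep : dependencies) {
            // Maven treats a missing scope as "compile".
            String scope = dep.scope == null ? "compile" : dep.scope.toLowerCase();
            if (!skipped.contains(scope)) {
                names.add(dep.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Dependency> deps = new ArrayList<>();
        deps.add(new Dependency("commons-lang3", null));
        deps.add(new Dependency("junit", "test"));
        deps.add(new Dependency("servlet-api", "provided"));
        System.out.println(namesToSend(deps, "test,provided")); // [commons-lang3]
    }
}
```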

VersionEye Update

VersionEye can monitor your project on GitHub/Bitbucket and notify you about out-dated dependencies and license violations. The integration via the GitHub/Bitbucket API works very well and is very convenient.

However, not everybody is using GitHub/Bitbucket. Through the VersionEye API you can create/update projects as well and take advantage of the VersionEye infrastructure. Assume you already created a project through file upload or via the URL feature. Now you want to update the project automatically every day with your current project file to get the notifications. For that purpose there is a new open source project, versioneye_update at GitHub. It’s a very simple shell script which uses curl to upload a project file to the VersionEye API. In the first lines of the script the variables need to be configured.

#!/bin/bash 

VERSIONEYE_SERVER=https://www.versioneye.com
API_KEY=<YOUR_SECRET_API_KEY> 
PROJECT_ID=<YOUR_PROJECT_ID>

If that is done, it’s dead easy to update an existing project. Simply run:

./update.sh <PROJECT_FILE>

For example:

./update.sh composer.lock

This updates an existing project with your current composer.lock file. The script will output the number of dependencies, the number of out-dated dependencies and, in case you have a license whitelist assigned to the project at VersionEye, the number of license violations.

update-sh

This project is meant to be executed on a Continuous Integration Server. Ideally you update your VersionEye project with the current project file on each build.

Let me know how this works for you. You are also welcome to open a ticket on the repo or to send a pull request with improvements ;-)

Introducing pessimistic mode for license whitelist

With VersionEye you can very easily set up a license whitelist. Simply put the software licenses you want, or are allowed, to use in your software project on the license whitelist. In the edit mask you even get suggestions via autocomplete. The suggestions are from the SPDX license list.

03-license-whitelist

The cool thing here is that VersionEye does the normalisation of the licenses. Even if there is no exact match, VersionEye will recognise the licenses in your project and will be able to apply the rules.

Some software libraries have more than 1 license. Some have a dual license and some software libraries even offer 3 licenses. Assume you are using a software library which has 2 licenses, a GPL-2.0 license and a Ruby license. The Ruby license is on your license whitelist, but GPL-2.0 is not. Does this software library violate your license whitelist? Should this dependency be counted as a violation or not? It depends. By default VersionEye is optimistic: as long as the dependency has at least 1 license which is on the license whitelist, the dependency doesn’t count as a violation.

Here in this example we can see that several rows are marked red. But in the project head we can see that there are only 2 license violations. The libraries kgio and raindrops both have just 1 single license (LGPL-2.1+), and it is not on the license whitelist. These 2 dependencies are violations of the license whitelist. The other dependencies have at least 1 license which is on the license whitelist.

Screen Shot 2015-06-22 at 10.55.45

Now you can configure this behaviour. In the detail view of the license whitelist there is now a checkbox for “pessimistic mode”.

Screen Shot 2015-06-22 at 11.37.45

If the pessimistic mode is turned on, VersionEye will count every dependency which has at least 1 license not on the license whitelist as a violation of the license whitelist. With pessimistic mode turned on, the same project looks like this.

Screen Shot 2015-06-22 at 11.39.08

Instead of 2 violations of the license whitelist we have 4 now, because 4 unique dependencies have at least 1 license which is not on the license whitelist.
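The two modes boil down to "at least one license is whitelisted" versus "every license is whitelisted". The following sketch expresses that rule in plain Java. It is only an illustration of the rule as described here, not VersionEye's implementation; the class and method names are invented, and treating a dependency without any known license as a violation is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class LicenseWhitelistSketch {

    // Returns true if a dependency with the given licenses violates the whitelist.
    static boolean violates(Set<String> licenses, Set<String> whitelist, boolean pessimistic) {
        if (licenses.isEmpty()) {
            // Assumption of this sketch: no known license counts as a violation.
            return true;
        }
        if (pessimistic) {
            // Pessimistic: a single license outside the whitelist is enough.
            return !whitelist.containsAll(licenses);
        }
        // Optimistic: a single license on the whitelist is enough to pass.
        for (String license : licenses) {
            if (whitelist.contains(license)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> whitelist = new HashSet<>(Arrays.asList("Ruby", "MIT"));
        Set<String> dual = new HashSet<>(Arrays.asList("GPL-2.0", "Ruby"));
        Set<String> lgplOnly = new HashSet<>(Arrays.asList("LGPL-2.1+"));

        System.out.println(violates(dual, whitelist, false));     // false
        System.out.println(violates(dual, whitelist, true));      // true
        System.out.println(violates(lgplOnly, whitelist, false)); // true
        System.out.println(violates(lgplOnly, whitelist, true));  // true
    }
}
```

A dual-licensed library like the GPL-2.0/Ruby example passes in optimistic mode and fails in pessimistic mode, while a library with only LGPL-2.1+ fails in both.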

This is a very new feature, please try it out and give feedback. If you are not sure how to use it you should talk to your compliance department.

Improved mute feature

VersionEye shows you which of your project dependencies are out-dated. Sometimes you have good reasons not to update. Sometimes you want to stick to an out-dated version because the newest version is buggy or insecure. On the other hand, you don’t want to get the notification emails from VersionEye every day for that library. You know that it’s out-dated and you have your reasons to stick to it. In that case you can “mute” that specific artefact. Simply click on the “mute” icon in the left column of the dependency table.

Screen Shot 2015-06-22 at 11.46.02

Here in this example we muted “jbuilder” version 2.3.0 and the dependency turns “green”.

Screen Shot 2015-06-22 at 11.46.37

In the example above jbuilder is marked as “green” even though it is out-dated, because version “2.3.0” is muted. If the next version of jbuilder comes out, let’s say 2.3.1, then the dependency turns red again. In that case you can take a look and try out the new version. If it is still buggy or insecure, you can mute that version as well.
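One way to read this behaviour is that a mute is bound to one specific newest version and stops applying as soon as a newer one appears. The sketch below expresses that reading in plain Java. It is an interpretation of the description above and not VersionEye's code; the class name, the method name and the used version "2.2.0" are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MuteSketch {

    // A dependency is shown red when it is out-dated and the newest
    // available version has not been muted.
    static boolean isRed(String usedVersion, String newestVersion, Set<String> mutedVersions) {
        boolean outdated = !usedVersion.equals(newestVersion);
        return outdated && !mutedVersions.contains(newestVersion);
    }

    public static void main(String[] args) {
        Set<String> muted = new HashSet<>(Arrays.asList("2.3.0"));

        // Newest version 2.3.0 is muted: the dependency is green.
        System.out.println(isRed("2.2.0", "2.3.0", muted)); // false

        // 2.3.1 comes out and is not muted: the dependency turns red again.
        System.out.println(isRed("2.2.0", "2.3.1", muted)); // true
    }
}
```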

This mute feature affects the overall project numbers and the project dependency badge. If you mute all your out-dated dependencies the dependency badge turns green.

This feature worked very well for single file projects. It was buggy for multi file projects in the past. That bug is fixed now. There are several unit tests for that feature for multi file projects. Now you can use the mute feature in child projects as well.

Improved Security Feature

For a couple of days now VersionEye has been showing security vulnerabilities for PHP. If VersionEye is monitoring a composer.json/composer.lock file for you, you will see a security tab in your project detail view, where all the known security vulnerabilities are displayed. The problem with that is that you still have to go into the project and into the security tab to see them. If you have many projects, that can be time consuming. It would be great to see directly in the project overview which of your projects are affected. And now it works like that: the vulnerable projects are marked completely red in the project overview.

Screen Shot 2015-06-12 at 12.02.14

That way you can see immediately which of your projects are affected and how many known security vulnerabilities are assigned to your project dependencies.

Project Summary Report

Probably you are using VersionEye for a multi file project. Most projects nowadays are using multiple languages/package managers. But even if you stick to 1 language you can end up with multiple files describing your dependencies.

If you have a big Java project on VersionEye, created through the VersionEye Maven Plugin, you will have a lot of dependencies. In the project view header you can see the summed up numbers over all your project files. Probably you will see that X dependencies are violating your license whitelist. But in order to see the affected dependencies you had to click through the project files in the license tab. All the ‘green’ dependencies are in the way and hiding the actual information you are looking for. That’s why the project view got a big improvement now!

Now every project view has a “Summary”. In the “Versions” tab it shows you all out-dated and unknown dependencies across all your project files. That way you can see immediately what’s going wrong, and you don’t have to click through the project files anymore to find your out-dated dependencies. Here is an example of how it looks.

versioneye-summary-report

In the “License” tab you get a similar summary report where you can see all dependencies which violate your license whitelist.

You can click on the sub projects header to dig deeper and to see all the dependencies. The old project views are still available.

This is a very new feature. Please give feedback and tell us how it works for you.