Shazam for software libraries

Nowadays its’ common to use package managers to describe, fetch and install 3rd party open source dependencies. But that was not always the case and there are many legacy projects out there in the Enterprises of this world there 3rd party libraries are stored in a “lib” directory without any documentation. It’s not a rare case that there are dependencies like a “beanutils.jar”. But which version is it? Which license does it have? Are there any known security vulnerabilities? Nobody knows!

The VersionEye API can identify such components by their SHA values. If you generate the SHA value for your JAR file and send it to the VersionEye API then you will get back the coordinates of that JAR file in Maven. That way you will know the exact GroupId, ArtifactID and Version of the component and based on that you can find out the license and known security vulnerabilities!

Of course it’s a lot of work to do that by hand, especially if you have a couple hundred files in your lib directory. That’s why we developed a command line tool to automate that. The veye_checker is implemented in Rust and we have binaries for Windows, Linux and Mac OS X which run out of the box! Simply download the binary from here, make the binary executable and run this command:

./veye_checker resolve ~/lib -a "YOUR_API_KEY" -o inventory.csv

With “-a” you specify your API key for the VersionEye API. With “-o” you specify the output file. The default the veye_checker outputs everything as CSV to the console. The results is a CSV which shows you exactly the files, their coordinates in the corresponding package manager, their license, the number of security vulnerabilities and a link to their VersionEye page. Here a screenshot:

os_inventory

Currently this works for Maven (Java), Nuget (C#), NPM (Node.JS) and PyPi (Python). Right now we have almost 10 Million SHA values in our database.

This is still a very early version of the “veye_checker”, but please try it out and give us feedback. We would love to hear from you!

Identifying components by SHA values

The public VersionEye API is exposing a lot of information about open source projects. If you know the coordinates of a component (software library) in a certain package manager, you can use the VersionEye API fetch all kind of meta information about the component. Meta information like available versions, licenses, security vulnerabilities and many more.

But what if you don’t know the coordinates of a component? What if you have somewhere on your disk a “beanutils.jar” and you don’t know which version it is or to which Maven GroupId it belongs to? You don’t know which license it has and you don’t know if it is vulnerable or not! For this problem the VersionEye API has a solution now. Simply create the SHA1 value for the “beanutils.jar” file. In Ruby it’s a one liner:

sha1 = Digest::SHA1.file "beanutils.jar"

Now take the SHA1 value and use the new SHA API Endpoint at VersionEye.

screen-shot-2017-02-08-at-08-54-36

With Curl it would be this command:

curl -X GET --header 'Accept: application/json' 'https://www.versioneye.com/api/v2/products/sha/427662b038bd8f52097f783f6ea163e45851b2a1?api_key=YOUR_API_KEY'

As response you get back the coordinates of the component. Now you know the Maven GroupId & ArtifactId and the version of your JAR file. That information you can use to gather even more information about the component from the VersionEye API. Simply use the products API Endpoint:

Screen Shot 2017-02-08 at 08.55.35.png

With Curl it would be this command:

curl -X GET --header 'Accept: application/json' 'https://www.versioneye.com/api/v2/products/Java/commons-beanutils%3Acommons-beanutils?prod_version=1.9.0&api_key=YOUR_API_KEY'

With that request you get back the dependencies, licenses and security vulnerabilities of the component. Now you know that your “beanutils.jar” is licensed under the Apache-2.0 license and it has at least 1 security vulnerability.

As you know the Maven coordinates of your “beanutils.jar” you could also simply visit the corresponding VersionEye page in the browser.

Screen Shot 2017-02-08 at 08.56.58.png

Here you see as well the license and the security vulnerability. And in the right upper corner you can see that there is a newer version of the component available, version 1.9.3. By clicking on that number you will see that the newest version of the component is not vulnerable.

The VersionEye database has SHA values for more than 90K .NET projects, more than 170K Java projects, almost 400K Node.JS projects and more than 88K Python projects. Altogether we have more than 9 Million SHA values.

screen-shot-2017-02-22-at-08-48-40

Every version of a component has at least 1 SHA value. In Maven a version of a component contains multiple files, like for example “*.jar”, “-javadoc.jar”, “-sources.jar” and “.pom”. In this case every file has his own SHA value, that means to 1 artefact we have multiple SHA values in our database.

The SHA method for Maven components and Node.JS modules is always “SHA1”. For Python projects we store the MD5 hash value. For .NET components we either have a base64 encoded SHA256 or SHA512 value. If you have a .NET Nuget package simply create a base64 encoded SHA256 value for it and fire it against our API. If the result is empty repeat the process with a SHA512 value. Here an example how it would look like in Ruby:

Base64.encode64 Digest::SHA512.digest File.read('Newtonsoft.Json.9.0.1.nupkg')

For more than 99% of all Nuget packages we have a SHA value.

Currently we have only SHA values for Java components from Maven central. That’s the official central repository for Java components. VersionEye is crawling a couple other Maven repositories as well, but for those we don’t have SHA values right now. But we are working on it.

Right now we do NOT have SHA values for Ruby but we are working on it!

Try out the new API and let us know how you like it.