B2B vs. Cloud

If you read news sites such as TechCrunch or Mashable, you get the impression that the B2B (on-premise) market is dying and everybody is moving into the Cloud. Even SAP has some offerings in the Cloud. Amazon in particular is pushing the message that companies that are not in the Cloud will lose market share in the future because they will not be competitive. But there is a difference between reality and what the media sites publish!

I’m not saying that the Cloud is a bad thing. VersionEye is running on AWS and I’m very satisfied with the Amazon Cloud! But there are many companies with very sensitive data that care a lot about privacy, and they will never move their data into the Cloud. Besides that, a Cloud infrastructure is not necessarily cheaper. It always depends.

If you build a new product you have to make a decision: do you build it as SaaS (Software as a Service), or do you go for the B2B (Business to Business) market? VCs and investors prefer to invest in SaaS companies these days. But selling a B2B product to big companies can make your company profitable much faster. And the successful vendors of developer tools (GitHub & Atlassian) still make most of their money by selling licenses to big companies.

B2B is not hyped as much as the Cloud, but it’s a huge market nonetheless!

AWS EC2 t2.small against m3.medium

The web frontend for VersionEye is a Ruby on Rails application which runs on Puma. For a long time it was running on a couple of EC2 m3.medium instances, each with 1 vCPU, 3.75 GB RAM and SSD instance storage. The Puma server was starting 4 worker processes, and each of them could have a maximum of 16 threads.

At Christmas time the traffic usually goes down because everybody is spending time with family. That’s why I started a small experiment. I downgraded the EC2 instances from m3.medium to t2.small. They have 1 vCPU and only 2 GB RAM, no SSD instance storage, and network performance classified as “Low to Moderate”.

VersionEye was running on these instances for a couple of days, and performance was a real issue. The application was much slower, and in Google Webmaster Tools the crawling errors increased dramatically because of timeouts. The network performance was sometimes so bad that I was not even able to SSH into the machines.

So I upgraded the instances again to m3.medium. Now the performance is back to normal 🙂

Intro to Ansible

Ansible is a tool for managing your IT infrastructure.

If you have only a single server to manage, you probably log in via SSH and execute a couple of shell commands. If you have 2 servers with the same setup, you lose a lot of time if you do everything by hand. 2 servers are already a reason to think about automation.

How do you handle the setup for 10, 100 or even 1000 servers? Assume you have to install Ruby 2.1.1 and Oracle Java 7 on 100 Linux servers. And by the way, neither package is available via “apt-get install”. Good luck doing it manually 😀

That’s what I asked myself at the time I moved away from Heroku. I took a look at Chef and Puppet, but honestly I couldn’t warm to either of them. Both are very complex and, for my purposes, totally over-engineered. Finally, a friend of mine recommended Ansible.

I had never heard of it and I was skeptical in the beginning. But after I finished watching these videos, I was convinced! It’s simple and it works as expected!

Key Features

Here are some facts:

  • Ansible doesn’t need a master server!
  • You don’t need to install anything on your servers for Ansible!
  • Ansible works via SSH.
  • Just tell Ansible the IP addresses of your servers and run the script!
  • With Ansible your IT infrastructure is just another code repository.
  • Configuration lives in YAML files.

Sounds like magic? No, it’s not. It’s Python 😉 Ansible is implemented in Python and works via the SSH protocol. If you have configured passwordless login on your servers with SSH public keys, then Ansible only needs the IP addresses of the servers.

Installation

You don’t need to install Ansible on your servers, only on your workstation. There are different ways to install it. If you are a Python developer anyway, I assume you have pip installed, the Python package manager that installs packages from PyPI. In that case you can install Ansible like this:

sudo pip install ansible

On Mac OS X you can install it via the Homebrew package manager.

$ brew update
$ brew install ansible

And there are many more ways to install it. Read here: http://docs.ansible.com/intro_installation.html.

Getting started

With Ansible your IT infrastructure is just another code repository. You can keep everything in one directory and put it under version control with git.

Let’s create a new directory for Ansible.

$ mkdir infrastructure
$ cd infrastructure

Ansible has to know where your servers are and how they are grouped together. We will keep that information in the hosts file in the root of the “infrastructure” directory. Here is an example:

[dev_servers]
192.168.0.30
192.168.0.31

[www_servers]
192.168.0.33

As you can see, there are 3 servers defined. 2 of them are in the “dev_servers” group and one is in the “www_servers” group. You can define as many groups with as many IP addresses as you want. We will use the group names in the playbook to assign roles to them.
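
As a quick sanity check of the inventory format, the following sketch writes the example hosts file into a temporary directory and counts the hosts per group with awk. It does not need Ansible installed. The temporary directory and the count_hosts helper are assumptions made for illustration, and the helper only understands plain host lines like the ones above.

```shell
#!/bin/sh
# Write the example inventory into a scratch directory (hypothetical location).
workdir="$(mktemp -d)"
cat > "$workdir/hosts" <<'EOF'
[dev_servers]
192.168.0.30
192.168.0.31

[www_servers]
192.168.0.33
EOF

# count_hosts FILE: print "group count" for every [group] section.
# Illustrative helper only; it handles plain host lines, nothing more.
count_hosts() {
  awk '
    /^\[.*\]$/ { group = substr($0, 2, length($0) - 2); order[++n] = group; next }
    NF > 0 && group != "" { count[group]++ }
    END { for (i = 1; i <= n; i++) print order[i], count[order[i]] + 0 }
  ' "$1"
}

count_hosts "$workdir/hosts"
```

For the inventory above this prints “dev_servers 2” and “www_servers 1”. With Ansible installed and the servers reachable, “ansible all -i hosts -m ping” runs the ping module against every host in the inventory.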

Playbooks

A playbook assigns roles (software) to server groups. It defines which role should be installed on which server group. Playbooks are stored in YAML files. Let’s create a site.yml file in the root of the “infrastructure” directory, with this content:

---
- hosts: dev_servers
  user: ubuntu
  sudo: true
  roles:
  - java
  - memcached

- hosts: www_servers
  user: ubuntu
  sudo: true
  roles:
  - nginx

In the above example we defined that the 2 roles “java” and “memcached” should be installed on all dev_servers, and that the role “nginx” should be installed on all web servers (www_servers). The “hosts” values in site.yml have to match the group names in the hosts file.

In addition, for each host group we define the user (ubuntu) that should be used to log in to the server via SSH, and that “sudo” should be used in front of each command. As you can see, there is no password defined. I assume that you can log in to your servers without a password because of SSH key authentication. If you use AWS, that is the default anyway.

Roles

All right. We defined the IP addresses in the hosts file and we assigned roles to the servers in the site.yml playbook. Now we have to create the roles. A role describes exactly what to install and how to install it. A role can be defined in a single YAML file, but also in a directory with subdirectories. I prefer a directory per role. Let’s create the first role:

$ mkdir java
$ cd java

A role can contain “tasks”, “files”, “vars” and “handlers” as subdirectories, but it needs at least the “tasks” directory.

$ mkdir tasks 
$ cd tasks

The “tasks”, “vars” and “handlers” subdirectories each use a main.yml file as their entry point (the “files” directory does not need one). This is how tasks/main.yml looks for the java role:

---
- name: update debian packages
  apt: update_cache=true

- name: install Java JDK
  apt: name=openjdk-7-jdk state=present

Ansible is organized in modules. Currently there are more than 100 modules out there. The “apt” module, for example, wraps “apt-get” on Debian machines. In the above example you can see that each task consists of 2 lines. The first line is the name of the task. This is what you will see on the command line when you run Ansible. The 2nd line is a module with its arguments. For example:

apt: name=openjdk-7-jdk state=present

This is the “apt” module, and we basically tell it that the Debian package “openjdk-7-jdk” has to be installed on the server. You can find the full documentation of the apt module in the Ansible documentation.

Let’s create another role for memcached. Go back to the root of the “infrastructure” directory first:

$ mkdir memcached
$ cd memcached
$ mkdir tasks
$ cd tasks

And add a main.yml file with this content:

---
- name: update debian packages
  apt: update_cache=true

- name: install memcached
  apt: name=memcached state=present

Easy, right? Now let’s create a more complex role, again starting from the root of the “infrastructure” directory.

$ mkdir nginx
$ cd nginx
$ mkdir tasks
$ mkdir files
$ mkdir handlers

Besides the tasks, this role also has files and handlers. In the files directory you can put files that should be copied to the server. This is especially useful for configuration files. In this case we put the nginx.conf into the files directory. The main.yml file in the tasks directory looks like this:

---
- name: update debian packages
  apt: update_cache=true

- name: install NGinx
  apt: name=nginx state=present

- name: copy nginx.conf to the server
  copy: src=nginx.conf dest=/etc/nginx/nginx.conf
  notify: restart nginx

The first task updates the Debian apt cache, similar to “apt-get update”. The 2nd task installs nginx. And the 3rd task copies nginx.conf from the “files” subdirectory to “/etc/nginx/nginx.conf” on the server.

The last line notifies the handler “restart nginx”. Handlers are defined in the “handlers” subdirectory and are usually used to restart a service. A notified handler runs only when the task actually changed something. The main.yml in the “handlers” subdirectory looks like this:

---
- name: restart nginx
  service: name=nginx state=restarted

It uses the “service” module to restart the nginx web server. That is necessary because we installed a new nginx.conf configuration file.
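
The directory layout built up in this section can also be created in one go. The sketch below assumes the roles live in a “roles” directory next to site.yml, which is one of the locations where Ansible looks for roles. The scratch directory is a stand-in for the “infrastructure” repository, and the empty files are placeholders for the content shown above.

```shell
#!/bin/sh
# Scratch directory standing in for the "infrastructure" repository (assumption).
infra="$(mktemp -d)/infrastructure"
mkdir -p "$infra"

# Simple roles only need a tasks directory.
mkdir -p "$infra/roles/java/tasks" "$infra/roles/memcached/tasks"

# The nginx role also ships a config file and a restart handler.
mkdir -p "$infra/roles/nginx/tasks" "$infra/roles/nginx/files" "$infra/roles/nginx/handlers"

# Placeholders; fill them with the YAML and config content from this tutorial.
touch "$infra/hosts" "$infra/site.yml"
for role in java memcached nginx; do
  touch "$infra/roles/$role/tasks/main.yml"
done
touch "$infra/roles/nginx/files/nginx.conf" "$infra/roles/nginx/handlers/main.yml"

# List everything that was created, relative to the repository root.
(cd "$infra" && find . -type f | sort)
```

Using “mkdir -p” with full paths avoids the back and forth with “cd” and makes the script safe to run more than once.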

Run

All right. We defined 3 roles and 3 servers now. Let’s set up our infrastructure. Execute this command in the root of the infrastructure directory:

ansible-playbook -i hosts site.yml

The “ansible-playbook” command takes a playbook file as a parameter and executes it, in this case the “site.yml” file. In addition to the playbook file, we let the command know where our servers are with “-i hosts”.
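
Before changing real servers, it can help to run a few read-only checks first. The sketch below wraps some standard ansible-playbook options (--syntax-check, --list-hosts, --check and --limit) in a small helper. The run and preflight functions and the DRY_RUN variable are names invented for this example. With DRY_RUN=1 (the default here) the commands are only printed, so the snippet can be tried without Ansible or servers.

```shell
#!/bin/sh
# DRY_RUN=1 only prints the commands; set DRY_RUN=0 to really execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# preflight: checks that are not meant to change anything on the servers.
preflight() {
  run ansible-playbook -i hosts site.yml --syntax-check  # parse the playbook only
  run ansible-playbook -i hosts site.yml --list-hosts    # show which hosts match
  run ansible-playbook -i hosts site.yml --check         # predict changes without applying them
}

preflight
# Apply the playbook to a single group instead of all servers.
run ansible-playbook -i hosts site.yml --limit www_servers
```

The --limit option is handy when only part of the infrastructure should be touched, for example just the web servers.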

Conclusion

This was a very simple example as an intro. But you can easily do much more complex things, for example manipulating values in existing configuration files on servers, checking out private git repositories, or executing shell commands with the shell module. Ansible is very powerful and you can do amazing things with it!

I’m using Ansible to manage the whole infrastructure for VersionEye. Currently I have 36 roles and 15 playbooks defined for VersionEye. I can set up the whole infrastructure with a single command! Or just parts of it. I even use Ansible for deployments. Deploying the VersionEye crawlers into the Amazon Cloud is a single command for me. And I even rebuilt the Capistrano deployment process for Rails apps with Ansible.

Let me know if you found this tutorial helpful or if you have additional questions, either here in the comments or on Twitter.

Don’t miss tomorrow’s Tech Talk: “Continuous updating of software libraries in the cloud”

VersionEye founder & CEO, Robert Reiz, will talk about “Continuous updating of software libraries in the cloud” on August 6th in Berlin Mitte. The forum discussion is presented by SIBB Forum SaaS & Cloud Computing.

All software developers who want to know how they can keep their software projects up to date, and want to find out quickly and easily which dependencies are outdated, should learn more about our convenient SaaS solution. VersionEye informs you about new versions of your software libraries, so you are always up to date.

Computer scientist Robert Reiz founded his third start-up “VersionEye” in Silicon Valley in 2012. Previously, he was Chief Architect and Managing Director for “WildGigs” in San Francisco and founder & CEO of the Java consulting house “PLOIN” in Mannheim. Today VersionEye has two locations in Berlin and San Francisco and a team around the globe.

We are looking forward to your forum participation! Snacks and beer are available for free, while supplies last. Please register here: http://www.amiando.com/COYTZWA.html

date: August 6, 2013
time: 5 pm to 7 pm
location: VersionEye GmbH, Brunnenstr. 181, 1. HH. 1. OG, 10119 Berlin

 

VersionEye at “meet the cloud” – meet & greet with open it berlin

VersionEye founder & CEO, Robert Reiz, will talk about “Continuous Updating” on August 22nd at the “meet the cloud” meet & greet, presented by open it berlin.

All software developers who want to know how they can keep their software projects up to date, and want to find out quickly and easily which dependencies are outdated, should learn more about our convenient SaaS solution. VersionEye informs you about new versions of your open source software libraries, so you are always up to date.

We are looking forward to seeing you there! Participation is free of charge, but please send an email to Michael Stamm for your registration: stamm@tsb-berlin.de or register here: http://www.open-it-berlin.de/veranstaltungen/meet-greet-mit-open-it-berlin-meet-cloud

date: August 22, 2013
time: 3 pm to 6 pm
location: TSB Innovationsagentur Berlin GmbH, Ludwig Erhard Haus, 5. OG, Fasanenstr. 85, 10623 Berlin

New date for event: “Continuous updating of software libraries in the cloud”

VersionEye founder & CEO, Robert Reiz, will talk about “Continuous updating of software libraries in the cloud” on August 6th. The forum discussion is presented by SIBB Forum SaaS & Cloud Computing.

We are looking forward to your forum participation! Snacks and beer are available for free, while supplies last. Please send an email to Astrid Vieth for your registration: astrid.vieth@sibb.de

date: August 6, 2013
time: 5 pm to 7 pm
location: VersionEye GmbH, Brunnenstr. 181, 10119 Berlin