
Practical Matters When Building Servers in the Cloud - Configuration Management

For some time now I’ve been thinking about and reading about tools like Chef and Puppet.  A couple of years ago I got a couple of small projects off the ground with Puppet for a job I was working at the time.  But the cloud is developing quickly, and my general belief is that if you are really deploying a cloud computing application and you find yourself logging into a server command prompt for some reason during normal operations, then something has either gone wrong or you are doing something wrong.

The issue of scripted server builds and configuration management is hardly new.  There are numerous resources you can search out to learn the history of configuration management tools, both commercial and open source.  For my part, I’ve been doing a number of experiments and have chosen to work with the Opscode Chef platform.  What follows in this article are a few of the things I’ve learned along the way.

Knowing some Ruby helps a LOT!  Opscode Chef is going to be challenging to get the hang of if you do not know the first thing about the Ruby programming language.  If you are in that camp, you might want to invest a bit of time w/ a good basic Ruby tutorial first.  A great deal of the flexibility and power of the Chef tools comes from being able to use Ruby effectively.  This is not a major barrier because you do not need to be a l33t hax0r by any means.  But it will help a great deal if you know how variables work, how to write a loop, case statements, and other basic constructs of the language.
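
To give a flavor of how far those basics get you, here is a hypothetical recipe fragment (the package names and platform logic are placeholders, not from any real cookbook):

%w{git curl ntp}.each do |pkg|      # a plain Ruby array and loop
  package pkg                       # Chef's package resource, once per item
end

case node[:platform]                # a plain Ruby case statement
when "ubuntu", "debian"
  package "apache2"                 # the Debian-family name for Apache
else
  package "httpd"                   # the Red Hat-family name
end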

I have been used to building and deploying things with push methods a lot in the past.  With a tool like Chef, things are turned around the other way.  You need to put yourself in the shoes of the server you are trying to configure and think more about pulling data to yourself.  This is essentially what happens after you bootstrap a Chef-configured server: it registers itself with a Chef server somewhere, pulls down a lot of data, and then uses this data (recipes and roles) to turn itself into the server it has been told it should be, more or less.
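
As a concrete example, the roles a node pulls down are themselves just small pieces of Ruby stored on the Chef server.  A minimal sketch (the role and recipe names here are placeholders):

name "serverrole"
description "an example role -- the recipes listed here are placeholders"
run_list "recipe[apache2]", "recipe[myapp]"

When a node registers and is told it holds role[serverrole], it pulls down that run list and converges itself to match.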

Why would I bother with all this, you might be thinking!?  Well, assuming I have set up my environment properly, defined my roles, and created the proper recipes assigned to those roles, then with a command that looks a bit like the following:

knife rackspace server create 'role[serverrole]' --server-name myawesomeserver --image 49 --flavor 1

NOTE: This example uses knife to interact w/ Chef.  Knife is a command-line utility that talks to a Chef server directly through its RESTful API.

I can start a server (a node) running the software I want, with the code I want on it, on the Rackspace Cloud in about 4-5 minutes.  That’s 4-5 minutes from the time I hit the <enter> key on my keyboard!  Race ya!

Now, if I’m building one server, this might not seem very worthwhile.  But, if I am building 100 or 1000... or if I’m going to be building them and tearing them down constantly by the dozens or hundreds per day then yes, this makes ALL THE SENSE IN THE WORLD!  But WAIT! It gets better.

With the following command I can launch THE SAME server on a different cloud in 4-5 minutes (AWS us-east-1c in this case):

knife ec2 server create -G WebServers,default --flavor m1.small -i ami-2d4aa444 -I /Users/me/Downloads/mykey.pem -S servername -x ubuntu 'role[serverrole]' -Z us-east-1c

Just think about this for a moment.  From my laptop (a MacBook Pro in this case) I can launch a server that is exactly the same on two different cloud computing providers in well under 10 minutes, without ever touching a cloud GUI like the AWS console or the Rackspace Cloud console (which would only slow me down).

Now, it wasn’t exactly trivial to set all this up so that it works.  But, the fact is, it wasn’t that bad either and I learned a TON along the way.

So, this was just a little intro article.  There are LOADS of great resources about Chef out there.  I will warn you about one other thing I learned: it’s a bit hard to search for information about this software, because it’s called “chef” and it has “recipes,” which means a lot of the time you’ll end up with search results from the Food Network.  I like food, so I don’t mind sometimes, but it can be annoying.
I've worked with Puppet in the past, love it.  I'm working with Chef now, love it.  I'll almost certainly be using both in the future, each for the projects where it fits best.

Happy configurating!  I’m sure I’ll be writing more about this in the near future.

 

Dynamic DNS Rocks, More Sites Should Use It!

 

I was doing some thinking about DNS today and in particular, Dynamic DNS.  I'm still surprised how many people haven't heard of this type of service, let alone used it.  DNS is one of those things that, in my opinion and if at all possible, you should outsource to people who can and will do it better than you.  Yes, that includes internal and external DNS.
In short, Dynamic DNS services allow you to provide and programmatically control things like multiple load-balanced A records or CNAMEs for a single domain or web service.  This can be especially important in the context of an elastic cloud computing hosted service, where certain things can be ephemeral or come and go very quickly (like an IP address or a compute node).  Just like every other part of your infrastructure, your DNS needs to be elastic and programmable too.
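
What "programmable" means in practice: most of these services expose an HTTP API you can hit from a deploy script.  Here is a hypothetical sketch in Ruby (the endpoint, token, and record values are all placeholders; every provider's real API is different, so check their docs):

require 'net/http'
require 'uri'

# swap the A record for www to point at a freshly launched node
uri = URI("https://api.dns-provider.example.com/zones/mydomain.com/records/www")
request = Net::HTTP::Put.new(uri.request_uri)
request['Authorization'] = "Bearer #{ENV['DNS_API_TOKEN']}"
request.set_form_data('type' => 'A', 'value' => '203.0.113.42', 'ttl' => '60')

Net::HTTP.start(uri.host, uri.port, :use_ssl => true) do |http|
  response = http.request(request)
  puts response.code  # 2xx means www now resolves to the new node
end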
Some of the reasons you might use Dynamic DNS:
  • Load Balancing - a smarter version of round robin, more or less
  • CDN Management
  • Site Migrations
  • Disaster Recovery
  • It'll make you all the rage at parties
I made a short list of some of the Dynamic DNS services I know about. Here they are:
  • Used this one extensively over the years and have met the team.  It's a great service run by an excellent team.  I highly recommend it.
  • Used this one a couple of times and it worked out well.  Their interface was a bit odd, but I haven't used it for a couple of years.
  • Have not personally used this one, so I can't provide much more information at the moment.  Will update in the future if that changes.
For further reading, Wikipedia says...
"Dynamic DNS providers provide a software client program that automates the discovery and registration of the client's public IP addresses. The client program is executed on a computer or device in the private network. It connects to the service provider's systems and causes those systems to link the discovered public IP address of the home network with a hostname in the domain name system. Depending on the provider, the hostname is registered within a domain owned by the provider or the customer's own domain name. These services can function by a number of mechanisms. Often they use an HTTP service request since even restrictive environments usually allow HTTP service. This group of services is commonly also referred to by the term Dynamic DNS, although it is not the standards-based DNS Update method. However, the latter might be involved in the provider's systems."
So, while you are thinking about DNS I'll leave you with the following related tip...
Your DNS registrar is not necessarily the same as your Dynamic DNS provider, and it most definitely should NEVER be your ISP/hosting provider.  (I have, though, used www.dyndns.com and Dynect together for various reasons.)  This is serious business if things go south w/ your hosting provider.  I have actually seen companies held hostage pending litigation over trivial matters because the wrong provider had registrar control.  Your domains are an asset.  Control them yourself, and delegate control of them securely to someone you trust when you need help getting the work done.

 

Monitoring – Updated and Revisited

I’ve written several times on this blog about various monitoring tools and services.  But, in light of a recent project I’ve been working on for the last few months, I have some updates.

The Overall Monitoring Architecture

There are a few areas of monitoring that I usually like to pay close attention to with a live web application.

  1. Process Monitoring - This makes sure things are running and stay running within certain tolerances. Examples are God, Monit, and SMF.  Your choice will depend on your operating system and your preferences with scripting languages (see the God sketch just after this list).
  2. Resource Monitoring - This is fine-grained monitoring of CPU, memory, disk space, disk I/O, networking, application server threads, and much more. Examples are Nagios, Ganglia, and Munin. Choosing correctly depends on your specific situation.  There is a worthy newcomer on the block called Reconnoiter that also looks very promising.
  3. Uptime Monitoring - This is usually the only monitoring people do, if they do any monitoring at all. It should be done by a disinterested 3rd party to provide accountability and what I call a 3rd-party eye in the sky, should any dispute about uptime arise.  I like Pingdom, and there are free services as well.  I’ve also been using CloudKick for this purpose in some situations.
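
To make the process monitoring item concrete: God configs are plain Ruby, which is part of the appeal.  A minimal sketch (the app name, pid file, and init script paths are placeholders):

God.watch do |w|
  w.name     = "myapp"
  w.interval = 30.seconds                  # how often god polls
  w.start    = "/etc/init.d/myapp start"
  w.stop     = "/etc/init.d/myapp stop"
  w.pid_file = "/var/run/myapp.pid"

  w.behavior(:clean_pid_file)              # clear stale pids before starting

  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.running = false                    # start it whenever it is not running
    end
  end
end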

Those three above are from a post I wrote some time ago.  Today, I’m adding a 4th item to that list, because there is finally a choice that makes it easy enough and reasonably affordable:

4. Synthetic Transaction Monitors – These actually perform tests of the processes a user might go through in your application and report back any anomalies that occur, along with an error report, screen shot, and other data as appropriate.  I’ve been using a tool called BrowserMob together with Selenium IDE for this.  You create scripts w/ Selenium, upload them to BrowserMob, and then set up a monitor script.  That’s a simplified overview, of course, but it’s really quite effective and relatively affordable compared to historical solutions, which were prohibitively expensive.
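
To give a feel for what one of these scripts does, here is a hypothetical login check written against the selenium-client Ruby gem (the host, page paths, locators, and credentials are all placeholders, and BrowserMob has its own flow for uploading and scheduling scripts):

require "rubygems"
require "selenium/client"

browser = Selenium::Client::Driver.new(
  :host    => "localhost",
  :port    => 4444,
  :browser => "*firefox",
  :url     => "http://www.example.com",
  :timeout_in_second => 60)

browser.start_new_browser_session
browser.open "/login"
browser.type "id=email", "monitor@example.com"
browser.type "id=password", "not-a-real-password"
browser.click "id=submit", :wait_for => :page
raise "login flow is broken!" unless browser.text?("Welcome")
browser.close_current_browser_session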

The Monitoring Tools I use

What follows are some of my personal current favorites to meet the above goals.

Munin > http://munin.projects.linpro.no/

Enabling Munin across your systems is one of those things worth putting on your list to get done.  I use it a lot, and successfully.  There is a demo linked from the project site.
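
On Ubuntu, for example, the stock packages make setup quick (a sketch; how you split the grapher and the node agents across machines is up to you):

apt-get install munin        (the collector/grapher, on one box)
apt-get install munin-node   (the agent, on each box you want graphed)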

Monit > http://mmonit.com/monit/

Pingdom > http://www.pingdom.com

 

Things I’m testing and have high hopes for: Reconnoiter, BrowserMob monitoring, and CloudKick.

 

Sphinx Install: Ubuntu 9.04 Rackspace Cloud Server

Today, for a project, I needed to install Sphinx on an Ubuntu 9.04 Rackspace Cloud server.  This will get it done and might come in handy for folks.

Launch a shiny new cloud server w/ the Ubuntu 9.04 server template provided by Rackspace.


apt-get update
apt-get install libmysql++-dev make gcc g++

Go to a directory of your choosing

Download the Sphinx source tarball (latest as of this writing):
http://sphinxsearch.com/downloads/sphinx-0.9.8.1.tar.gz
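
From the shell, wget will fetch it:

wget http://sphinxsearch.com/downloads/sphinx-0.9.8.1.tar.gz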

tar zvxf sphinx-0.9.8.1.tar.gz

cd sphinx-0.9.8.1

./configure

make

make install (you will need superuser rights for this last step)

The binaries will be installed in /usr/local/bin:

/usr/bin/install -c 'indexer' '/usr/local/bin/indexer'
/usr/bin/install -c 'searchd' '/usr/local/bin/searchd'
/usr/bin/install -c 'search' '/usr/local/bin/search'
/usr/bin/install -c 'spelldump' '/usr/local/bin/spelldump'

Some config files will drop into /usr/local/etc:

/usr/bin/install -c -m 644 'sphinx.conf.dist' '/usr/local/etc/sphinx.conf.dist'
/usr/bin/install -c -m 644 'sphinx-min.conf.dist' '/usr/local/etc/sphinx-min.conf.dist'
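
From here, the usual next steps (not part of the install itself, and your settings will vary): copy the sample config, point it at your MySQL database, then build your indexes and start the daemon.

cp /usr/local/etc/sphinx-min.conf.dist /usr/local/etc/sphinx.conf
(edit sphinx.conf w/ your MySQL credentials and index definitions)
indexer --all
searchd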

Enjoy!