Cloud computing and being lazy
The need to create template images in our cloud environment is obvious, especially with Amazon EC2 offering an amazing API and spot instances at ridiculously low prices.
In the following post I’ll show what I do in order to prepare a “puppet-ready” image.
Puppet to the rescue
In my environment I have puppet configured and provisioning all of my machines. With puppet I can deploy anything I need – “if it’s not in puppet, it doesn’t exist”.
Coupled with Puppet Dashboard, the interface for manually adding nodes is rather simple. But doing stuff manually is slow. I assume that, given the right base image, I (and you) can deploy and configure that machine with puppet.
In other words, the ability to convert a bare machine to a usable machine is taken for granted (although it is heaps of work on its own).
Handling the “bare” image
Most cloud computing providers today give you an interface for starting/stopping/provisioning machines on their cloud.
The images the cloud providers usually supply are bare, such as a plain CentOS 6.3 with nothing on it. Configuring an image like that requires some manual labour, as you can’t even log in to it automatically without some random password or similar.
Creating a “puppet-ready” image
So if I boot up a simple CentOS 6.x image, these are the steps I take in order to make it “puppet-ready” (and I only have to do this once per cloud computing provider):
# install EPEL, because it's really useful
rpm -q epel-release-6-8 || rpm -Uvh http://download.fedoraproject.org/pub/epel/6/`uname -i`/epel-release-6-8.noarch.rpm
# install puppet labs repository
rpm -q puppetlabs-release-6-6 || rpm -ivh http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-6.noarch.rpm
# i usually disable selinux, because it's mostly a pain
setenforce 0
sed -i -e 's!^SELINUX=.*!SELINUX=disabled!' /etc/selinux/config
# install puppet
yum -y install puppet
# basic puppet configuration
echo '[agent]' > /etc/puppet/puppet.conf
echo ' pluginsync = true' >> /etc/puppet/puppet.conf
echo ' report = true' >> /etc/puppet/puppet.conf
echo ' server = YOUR_PUPPETMASTER_ADDRESS' >> /etc/puppet/puppet.conf
echo ' rundir = /var/run/puppet' >> /etc/puppet/puppet.conf
# run an update
yum update -y
# highly recommended is to install any package you might deploy later on
# the reason behind it is that it will save a lot of precious time if you
# install 'httpd' just once, instead of 300 times, if you deploy 300 machines
# also recommended is to run any 'baseline' configuration you have for your nodes here
# such as changing SSH port or applying common firewall configuration for instance
yum install -y MANY_PACKAGES_YOU_MIGHT_USE
# and now comes the cleanup phase, where we actually make the machine "bare", removing
# any identity it could have
# set machine hostname to 'changeme'
hostname changeme
sed -i -e "s/^HOSTNAME=.*/HOSTNAME=changeme" /etc/sysconfig/network
# remove puppet generated certificates (they should be recreated)
rm -rf /etc/puppet/ssl
# stop puppet, as you should change the hostname before it will be permitted to run again
service puppet stop; chkconfig puppet off
# remove SSH keys - they should be recreated with the new machine identity
rm -f /etc/ssh/ssh_host_*
# finally add your key to authorized_keys
mkdir -p /root/.ssh; echo "YOUR_SSH_PUBLIC_KEY" > /root/.ssh/authorized_keys
Power off the machine and create an image. This is your “puppet-ready” image.
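The imaging step itself can be scripted too – a minimal sketch using the AWS CLI (the instance ID and image name below are placeholders):
# placeholders: i-12345678 is the instance you just prepared, the image name is up to you
aws ec2 stop-instances --instance-ids i-12345678
aws ec2 wait instance-stopped --instance-ids i-12345678
aws ec2 create-image --instance-id i-12345678 --name "centos6-puppet-ready" --description "CentOS 6 puppet-ready base image"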
Using the image
Now you’re good to go: any machine you create in the future should be based on that image.
When creating a new machine, the steps you should follow are:
- Start the machine with the “puppet-ready” image
- Set the machine’s hostname
hostname=uga.bait.com
hostname $hostname
sed -i -e "s/^HOSTNAME=.*/HOSTNAME=$hostname/" /etc/sysconfig/network
- Run ‘puppet agent --test’ to generate a new certificate request
- Add the puppet configuration for the machine; for Puppet Dashboard it’ll be something similar to:
hostname=uga.bait.com
sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:add name=$hostname
sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:groups name=$hostname groups=group1,group2
sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:parameters name=$hostname parameters=parameter1=value1,parameter2=value2
- Authorize the machine on the puppetmaster (if autosign is disabled)
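If autosign is disabled, signing the new certificate on the puppetmaster looks roughly like this (assuming the Puppet 2.7/3.x ‘puppet cert’ commands; the hostname is the example one from above):
# on the puppetmaster: list pending requests and sign the new node
puppet cert list
puppet cert sign uga.bait.com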
- Run puppet:
# initial run, might actually change stuff
puppet agent --test
service puppet start; chkconfig puppet on
This is 90% of the work if you want to quickly create usable machines on the fly. It shortens the process significantly and can easily be adapted to virtually any cloud computing provider!
I personally have it all scripted, and a new instance on EC2 takes me 2-3 minutes to load and configure. It even notifies me politely via email when it’s done.
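A rough sketch of what such a wrapper could look like – assuming the AWS CLI, root SSH access with your key baked into the image, and a working local ‘mail’ command; the AMI ID, key name and addresses are placeholders:
#!/bin/bash
# rough sketch: boot an instance from the "puppet-ready" image, set its hostname,
# kick off the first puppet run and send a notification - all IDs/names are placeholders
ami="ami-12345678"
hostname="uga.bait.com"

instance_id=$(aws ec2 run-instances --image-id "$ami" --instance-type t1.micro \
  --key-name my-key --query 'Instances[0].InstanceId' --output text)
aws ec2 wait instance-running --instance-ids "$instance_id"
ip=$(aws ec2 describe-instances --instance-ids "$instance_id" \
  --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)

# set the hostname and request a certificate (the certificate still has to be signed
# and the node registered on the puppetmaster, as described above)
ssh -o StrictHostKeyChecking=no root@"$ip" \
  "hostname $hostname; sed -i -e 's/^HOSTNAME=.*/HOSTNAME=$hostname/' /etc/sysconfig/network; puppet agent --test"

echo "$hostname ($ip) is up" | mail -s "new instance ready" me@example.com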
I’m such a lazy bastard.
After several conversations with my good friend (and now also boss) I decided to start a small blog describing my Linux/SysAdmin/C++/Bash adventures.
I’m Dan; you can learn a lot about me from my website at http://nevela.com. My expertise is Unix/Linux system administration and C++ development. Linux is my home and Bash is my mother tongue. In this blog I hope to share interesting topics, stories and decisions in those fields.
I’ll start with something fresh and new for me – Puppet (http://reductivelabs.com/trac/puppet).
I believe system administration is an art, just like programming. From what I’ve seen, SysAdmins divide into two types:
- The fireman – most likely the mediocre SysAdmin type you are very familiar with. This type of SysAdmin knows how to troubleshoot and apply a specific fix where it’s needed. They get things done, but no more than that. Gradually their workload grows (especially if they do a bad job) and consumes all of their time. This is the bad type of SysAdmin. Words like “planning” are usually not part of their jargon (although they might argue with you).
- The developer – this type of SysAdmin plans before doing anything. I hope I can belong to this group. He thinks like a developer – whenever a problem arises in a system he manages, he treats it as a bug rather than a problem. This type of SysAdmin will always have his configuration stored in a source control repository. He will always solve bugs he encounters by fixing the script or program that caused the so-called problem.
If you’re a fireman – then this post is not for you. However, if you’re the second type, then puppet is exactly for you.
As a SysAdmin in a small startup company, I got to a position where I have to manage two types of machines:
- Production machines
- In house machines for internal company usage
Luckily, managing dozens of production machines is easy for us (for me :)) as the configuration is mostly the same on all of them. Ironically, managing the in-house machines became the more time-consuming task, as each machine has many different and diverse jobs. This is where puppet kicks in.
Up until today, we had a few main servers doing most of the simple tasks, among them:
- LDAP
- Samba/NFS file sharing
- SVN
- Apache
- In house DNS and DNS caching
- Buildbot master
- Nagios
- And many other things (you get the point)
Obviously I have all of my configuration stored in our SVN and backed up. If one of these servers died, I could generate a new one within hours. Yet most of the configuration there was done manually. Who is the crazy SysAdmin that would write a script to meta-generate named.conf? Who would auto-create SVN repositories (especially if you have just one repository to manage)?
Yes, this is where puppet comes in. Puppet forces you to work correctly. Working with puppet, you realize that the only way you’re going to work is by writing puppet recipes and automating everything that makes your servers what they are. Puppet is revolutionary in many other ways as well, but I’ll focus on the way it forces the SysAdmin to work in a clean manner.
One thing I have to warn you about: puppet has a slightly steep learning curve, but if you’re the serious Linux SysAdmin type, you won’t find it too steep. You’ll struggle with it. You must.
So I said that (most likely) no one would write a script to generate their DNS configuration, but I saved mine in SVN and wrote a recipe for DNS servers to use:
$dns_username = 'named'
$dns_group = 'named'
$dns_configuration_file = '/etc/named.conf'
class dns-server {
  # puppet takes care of installing the latest bind package, and it's cross-platform!
  package { bind:
    ensure => latest
  }
  # configure named, subscribing it to /etc/named.conf, which means that every time this
  # file changes, named will refresh
  service { named:
    ensure    => running,
    enable    => true,
    subscribe => File[$dns_configuration_file],
    require   => Package["bind"]
  }
  # this is the /etc/named.conf file. in subclasses we'll specify the source of the file
  file { $dns_configuration_file:
    owner   => $dns_username,
    group   => $dns_group,
    replace => true,
    mode    => 640
  }
}
Alright, this puppet code snippet only configures /etc/named.conf for use with bind – but we never told puppet where to take the configuration from. I’ll share my DNS master server configuration:
class dns-master inherits dns-server {
  # if it's a DNS master - let's take the named-master.conf file (obviously stored in SVN as well)
  File[$dns_configuration_file] {
    source => "$puppet_fileserver_base/etc/named/named-master.conf",
  }
  # copy all the zone files
  file { "/var/named":
    source       => "$puppet_fileserver_base/etc/named/zones",
    ignore       => ".svn*",
    checksum     => "md5",
    sourceselect => all,
    recurse      => true,
    purge        => false,
    owner        => $dns_username,
    group        => $dns_group,
    replace      => true,
    mode         => 640
  }
  # subscribe named to all the zone files
  Service[named] {
    subscribe +> File["/var/named"]
  }
}
Great, so now every time I change a zone on my puppet master, named will automatically restart. I believe this is great. Needless to say, I do not use dynamic DNS updates, so I don’t care about the zone files being overwritten.
Yes, so I did it once for DNS. If I ever need to change the configuration now, I’ll do it via puppet and let the configuration propagate. Maybe with a small phase of checking the new configuration first, but right afterwards comes the svn commit – and it’s in. Now I know I’m tidy and clean forever.
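For zone changes, that check-then-commit phase can be as short as something like this (just a sketch – the zone name and working-copy path are made up, and it assumes named-checkzone from the bind package):
# validate the edited zone file, then commit it so puppet can propagate it
named-checkzone bait.com etc/named/zones/bait.com.zone \
  && svn commit -m "update bait.com zone" etc/named/zones/bait.com.zone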
Needless to say, if I ever need to configure another DNS server, I’ll use this recipe or extend it as needed – puppet lets you inherit from base classes and add properties and behavior (I also have a class for a dns-slave, as you could have imagined).
OK, I should sum it up. To be honest – I’m glad I’m writing this part of the post after several discussions with a few of my wise colleagues.
Why should I use puppet?!
The answer to this one is not easy – and I’m not the one who should answer it. It is you who should answer it.
Adopting puppet reflects one main thing about you as a SysAdmin:
You are a lazy, strict and tidy SysAdmin, an expert.