Bashing Linux

Bash RPC 2 comments

Auto configuring complex cluster architectures is a task many of you might probably skip, using the excuse that it’s a one time task and it’ll never repeat itself. WRONG!
Being lazy as I could and the need to quickly deploy systems for hungry customers, I started out with a small little infrastructure that helped me along the way to auto configure clusters of two nodes or more.
My use cases were:
1. RedHat cluster
2. LVS
3. Heartbeat
4. Oracle RAC

Auto configuring an Oracle RAC DB is not an easy task at all. However, with the proper infrastructure, it can become noticeably easier.
The common denominators for all cluster configurations I had to carry were:
1. They had to run on more than one node
2. Except from entering a password once for all nodes, I didn’t want any interaction
3. They all consisted from steps that should either run on all nodes, or on just one node
4. Sometimes you could logically split the task into a few phases of configuration, making it easier to comprehend the tasks you have to achieve

Even though it is not the most tidy piece of Bash code I’m going to post here, I’m very proud of it as it saved me countless hours. I give you the skeleton of code, which is the essence of what I nicknamed Bash RPC. On top of this you should be able to easily auto configure various configurations involving more than one computer.

Using it

The sample attached is a simple script that should be able to bake /home/cake on all relevant nodes.
In order to use the script properly, edit it using your favorite editor and stroll through the configuration_vars() function. Populate the array HOST_IP with your relevant hosts.
Now you can simply run the script.
I’m aware to the slight disadvantage that you can’t have your configuration come from command line, on the other hand – when dealing with big, complex, do you really think your configuration can be defined in a single line of arguments?

Taming it

OK, this is the interesting part. Obviously no one needs to bake cakes on his nodes, it is truly pointless and was merely given as an example.
So how would you go about customizing this skeleton to your needs?
First and foremost, we must plan. Planning and designing is the key for every tech related activity you carry. Be it a SysAdmin task or a pure development task. While configuring your cluster for the first time, keep notes of the steps (and commands) you have to go through. Later on, try to logically separate the steps into phases. When you have them all, we can start hacking Bash RPC.
Start drawing your phases and steps, using functions in the form of PHASEx_STEPx.
Fill up your functions with your ideas and start testing! and that’s it!

How does it work?

Simplicity is the key for everything.
Bash RPC can be ran in 2 ways – either running all phases (full run) or running just one step.
If you give Bash RPC just one argument, it assumes it is a function you have to run. If no arguments are given it will run the whole script.
Have a look at run_function_on_node(). This function receives a node and functions it should run on. It will copy the script to the destination node and initiate it with the arguments it received.
And this is more or less the essence of Bash RPC. REALLY!

Oh, there’s one small thing. Sometimes you have to run things on just one host, in that case you can add a suffix of ___SINGLE_HOST for your steps. This will make sure the step will run on just one host (the first one you defined).

I’m more than aware that this skeleton of Bash RPC can be polished some more and indeed I have a list of TODOs for this skeleton. But all in all – I really think this one is a big time saver.

Real world use cases

I consider this script a success mainly because of 2 real world use cases.
The first one is the act of configuring from A to Z Oracle RAC. Those of you who had to go through this nightmare can testify that configuring Oracle RAC takes 2-3 days (modest estimation) of both a DBA and SysAdmin working closely together. How about an unattended script running in the background, notifying you 45 minutes later you have an Oracle RAC ready for service?

The second use case is my good friend and former colleague, Oren Held. Oren could easily take this skeleton and use it for auto configuring a totally different cluster using LVS over Heartbeat. He was even satisfied while using it. Oren never consulted me while performing this task – and this is the great achievement.

I hope you could use this code snippet and customize it for your own needs, continuing with the YOU CAN CREATE A SCRIPT FOR EVERYTHING attitude!!

Have a look at cluster-config-cake.sh to get an idea about how it’s being done.

Posted October 10, 2009 by malkodan in Bash, Linux, System Administration

Tagged with Bash, cluster, configure, lazy, Linux, oracle, Oren Held, RAC, rpc, setup, ssh

Buying time 1 comment

Wearing both hats of developer and SysAdmin, I believe you shouldn’t work hard as a SysAdmin. It is for a reason that sometimes developers look at SysAdmins as inferior. It’s not because SysAdmins are really inferior, or has an easier job. I think it is mainly because a large portion of their workload could be automated. And if it can’t be automated (very little things), you should at least be able to do it quickly.

Obviously, we cannot buy time, but we can at least be efficient. In the following post I’ll introduce a few of my most useful aliases (or functions) I’d written and used in the last few years. Some of them are development related while the others could make you a happy SysAdmin.
You may find them childish – but trust me, sometimes the most childish alias is your best friend.

Jumping on the tree

Our subversion trunk really reminds me of a tree sometimes. The trunk is very thick, but it has many branches and eventually many leaves. While dealing with the leaves is rare, jumping on the branches is very common. Many times i have found myself typing a lot of ‘cd’ commands, where some are longer than others, repeatedly, just to get to a certain place. Here my stupid aliases come to help me.

lib takes me straight away to our libraries sub directory, where sim takes me to the simulators sub directory. Not to mention tr (shortcut for trunk), which takes me exactly to the sub directory where the sources are checked out. Oh, and pup which takes me to my puppet root, on my puppet master. Yes, I’m aware to the fact that you probably wonder now “Hey, is this guy going to teach me something new today? – Aliases are for babies!!”, I can identify. I didn’t come to teach you how to write aliases, I’m here to preach you to start using aliases. Ask yourself how many useful day-to-day aliases you really have defined. Do you have any at all? – Don’t be shy to answer no.

*nix jobs are diverse and heterogeneous, but let’s see if I can encourage you to write some useful aliases after all.
In case you need some ideas for aliases, run the following:

$ history | tr -s ' ' | cut -d' ' -f3- | sort | uniq -c | sort -n

Yes, this will show the count of the most recent commands you have used. Still not getting it? – OK, I’ll give you a hint, it should be similar to this:

$ alias recently_used_commands="history | tr -s ' ' | cut -d' ' -f3- | sort | uniq -c | sort -n"

If you did it – you’ve just kick-started your way to liberation. Enjoy.
As a dessert – my last childish alias:

$ alias rsrc='source ~/.bashrc'

Always useful if you want to re-source your .bashrc while working on some new aliases.

Two more things I must mention though:

Enlarge your history size, I guess you can figure out alone how to do it.
If you’re feeling generous – periodically collect the history files from your fellow team members (automatically of course, with another alias) and create aliases that will suit them too.

The serial SSHer

Our network is on 192.168.8.0/24. Many times I’ve found myself issuing commands like:

$ ssh root@192.168.8.1
$ ssh root@192.168.8.2
$ ssh root@192.168.8.3

Dozens of these in a single day. It was really frustrating. One day I decided to make an end to it:

# function for easier ssh
# $1 - network
# $2 - subnet
# $3 - host in subnet
_ssh_specific_network() {
	local network=$1; shift
	local subnet=$1; shift
	local host=$1; shift
	ssh root@$network.$subnet.$host
}

# easy ssh to 192.168.8.0/24
# $1 - host
_ssh_net_192_168_8() {
	local host=$1; shift
	_ssh_specific_network 192.168 8 $host
}
alias ssh8='_ssh_net_192_168_8'

Splendid, now I can run the following:

$ ssh8 1

Which is equal to:

$ ssh root@192.168.8.1

Childish, but extremely efficient. Do it also for other commands you would like to use, such as ping, telnet, rdesktop and many others.

The cop

Using KDE’s konsole? – I really like DCOP, let’s add some spice to the above function, we’ll rename the session name to the host we’ll ssh to, and then restore the session name back after logging out:

# returns session name
_konsole_get_session_name() {
	# using dcop - obtain session name
	dcop $KONSOLE_DCOP_SESSION sessionName
}

# renames a konsole session
# $1 - session name
_konsole_rename_session_name() {
	local session_name=$1; shift
	_konsole_store_session_name `_konsole_get_session_name`
	dcop $KONSOLE_DCOP_SESSION renameSession "$session_name"
}

# store the current session name
_konsole_store_session_name() {
	STORED_SESSION_NAME=`_konsole_get_session_name`
}

# restores session name
_konsole_restore_session_name() {
	if [ x"$STORED_SESSION_NAME" != x ]; then
		_konsole_rename_session_name "$STORED_SESSION_NAME"
	fi
}

# function for easier ssh
# $1 - network
# $2 - subnet
# $3 - host in subnet
_ssh_specific_network() {
	local network=$1; shift
	local subnet=$1; shift
	local host=$1; shift
	# rename the konsole session name
	_konsole_rename_session_name .$subnet.$host
	ssh root@$network.$subnet.$host
	# restore the konsole session name
	_konsole_restore_session_name
}

Extend it as needed, this is only the tip of the iceberg! I can assure you that my aliases are much more complex than these.

Finders keepers

For the next one I’m not taking full credit – this one belongs to Uri Sivan – obviously one of the better developers I’ve met along the way.
Grepping cpp files is essential, many times I’ve found myself looking for a function reference on all of our cpp files.
The following usually does it:

$ find . -name "*.cpp" | xargs grep -H -r 'HomeCake'

But seriously, do I look like someone that likes to work hard?

# $* - string to grep
grepcpp() {
	local grep_string="$*";
	local filename=""
	find . -name "*.cpp" -exec grep -l "$grep_string" "{}" ";" | while read filename; do
		echo "=== $filename"
		grep -C 3 --color=AUTO "$grep_string" "$filename"
		echo ""
	done
}

OK, let’s generalize it:

# grepping made easy, taken from the suri
# grepext - grep by extension
# $1 - extension of file
# $* - string to grep
_grepext() {
	local extension=$1; shift
	local grep_string="$*"
	local filename=""
	find . -name "*.${extension}" -exec grep -l "$grep_string" "{}" ";" | while read filename; do
		echo "=== $filename"
		grep -C 3 --color=AUTO "$grep_string" "$filename"
		echo ""
	done
}

# meta generate the grepext functions
declare -r GREPEXT_EXTENSIONS="h c cpp spec vpj sh php html js"
_meta_generate_grepext_functions() {
	local tmp_grepext_functions=`mktemp`
	local extension=$1; shift
	for extension in $GREPEXT_EXTENSIONS; do
		echo "grep$extension() {" >> $tmp_grepext_functions
		echo '  local grep_string=$*' >> $tmp_grepext_functions
		echo '  _grepext '"$extension"' "$grep_string"' >> $tmp_grepext_functions
		echo "}" >> $tmp_grepext_functions
	done
	source $tmp_grepext_functions
	rm -f $tmp_grepext_functions
}

After this, you have all of your C++/Bash/PHP/etc developers happy!

Time to showoff

My development environment is my theme park, here is my proof:

$ (env; declare -f; alias) | wc -l
2791

I encourage you to run this as well, if my line count is small and I’m bragging about something I shouldn’t – let me know!

Posted August 21, 2009 by malkodan in Bash, Linux, System Administration

Tagged with /bin/bash, alias, Bash, DCOP, easy, find, grep, kde, lazy, ssh

Feeling at /home… 3 comments

The following took place more than a year ago, but it is still fresh in my mind. After a few colleagues urged me to write about it, I decided to finally do it. If the output of the commands does not match exactly whatever I had while dealing with it – bear with it. It’s far from being the point.

The horror

I took a break from work last year and decided to go and have some fun in NZ. Oh, did I have fun there!
There’s nothing more frustrating than returning to work, turning on your dusty computer and witnessing the following:

*** An error ocfcurred during the filesystem check.
Give root password for maintenance (or type Control-D to continue):

Investigating it just a bit more I got to a conclusion that my /home is not mounting. OMG!!! all of my personal customizations and some private data is at /home!
I must admit, it’s nothing that couldn’t be reproduced at a reasonable amount of time, but having my neat KDE customizations I didn’t want to start the process from the beginning. Think about yourself losing your /home, it’s no fun. I decided I wanted it back.
OK, so I’m running e2fsck:

# e2fsck /dev/sda5
e2fsck 1.39 (29-May-2006)
e2fsck: No such file or directory while trying to open /dev/sda5

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 

#

Oh man, how frustrating, e2fsck can’t read my superblock!
Something I did notice during boot up, is that the HD is very noisy, in addition to very slow boot process. It wasn’t a new HD and it has worked hard (I tested on it numerous times our FS infrastructure of video recording). Probably it’s time has come.

I wanted to feel at /home again.
I decided the smartest thing would be to first try and copy this whole partition aside as I knew there is a hardware problem with the HD. After I’ll solve that one, I could hopefully handle the missing superblock problem much better.

Getting physical

So I quickly inserted a fresh new HD to my machine, disconnected the old faulty HD (it caused the computer to boot so slow because of it’s defects) and issued a network install.
15 minutes later I’m again at linux, with a bare new /home, and the faulty HD connected to it, slowing the computer as hell.
I was sure dd would come for the rescue:

# dd if=/dev/sda5 of=home-sweet-home.ext3 bs=4M

After a few minutes of anticipation and cranky HD noises, I’m with:

dd: reading `/dev/sda5': Input/output error
0+0 records in
0+0 records out
# ls -l /home/home-sweet-home.ext3
-rw-r--r-- 1 root root 0 Jul  10 08:26 home-sweet-home.ext3

Great :(. I’m searching the net, searching for an aggressive dd program, something that instead of giving up on bad sectors, would fill them with zeros and continue on (hoping the defects on the HD are at a very specific place). I must admit I have almost written something by myself, but finally I’ve found dd_rescue.
And off we go:

# dd_rescue -e 1 -A /dev/sda5 /home/home-sweet-home.ext3

It ran for hours! It was 65GB that dd_rescue had to tackle. With a dying HD that could take a lot of time. After more or less 8 hours I was back at my desktop, looking at my old home:

# ls -lh /home/home-sweet-home.ext3
-rw-r--r--   1 root   root        61G Jul  10 20:43 home-sweet-home.ext3
#

Being logical

OK, that’s it, I have my data. Time to dump the old HD and deal with the logical errors I still have with this partition dump. Mounting the partition gave me the same result as I pasted above: no superblock – no fun!
Oh! but ext3 always creates a few backup superblocks, maybe this is my lucky day where I will finally be able to use one of these backups. You are probably familiar with the following output:

# e2fsck /tmp/e2fsck-test
...
27 block groups
8192 blocks per group, 8192 fragments per group
1992 inodes per group
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345, 73729, 204801

Writing inode tables: done
Writing superblocks and filesystem accounting information: done
...

Now go figure out where your backup superblocks are. Trying the obvious of 8193 and 32768 did not work for me. I knew there should be more backup superblocks. Google comes for the rescue again. I was quite close at this time as well to writing a small C program that would search the partition dump for ext3 superblock signatures and tell me where the backup superblocks are. But then again, I thought I’m probably not the first one who needs such a utility, here TestDisk came for the rescue.
I simply ran TestDisk which reveled the remaining trustworthy superblocks on my damaged filesystem.
Later on I discovered that it is also possible to run e2fsck on a partition with the same size and see where the superblocks get written. However, I think that probing the superblocks is much cleaner altogether.

# mkdir /home2
# mount -o loop,ro,sb=884736 /home/home-sweet-home.ext3 /home2
#

Silence.

Did it work?

# ls -l /home2
drwxr-xr-x  41 dan    dan    8192 Feb  5 13:39 dan
drwx------   7 distcc distcc   88 Jul 26 15:24 distcc
drwxrwxrwx   2 nobody nobody    1 Feb  5 05:39 public
#

Wow, it did!
So how much of it was eventually damaged? – less than 1%!
So I’ve found a few garbled files I didn’t need anyway, but I was more than 99% back at /home.

Epilogue

Needless to say that ever since I’m backing up my /home in a strict manner. But this is obviously not the point.
The simplicity of Linux in general and ext2/3 in specific is something we should adore. I wouldn’t want to imagine what would have happened on a different OS or a different filesystem (and please don’t start a flame war about it now…).

Posted August 12, 2009 by malkodan in Linux, System Administration

Tagged with backup, dd, dd_rescue, e2fsck, ext2, ext3, Harddisk, Hardisk, HD, Linux, maintenance, mount, recovery, superblock, TestDisk

Conventions 11 comments

I like Bash, I really like Bash. It’s simple, it’s neat, it’s quick, it’s many things. No one will convince me otherwise. It’s not for big things though, I know, but for most system administration tasks it will suffice.
The problem with Bash begins where incompetent SysAdmins start to use it. They will most likely treat it poorly and lamely.
Conventions are truly needed when writing in Bash. In our subversion we have ~30,000 lines of bash. And HELL NO! our product is definitely not a lame conglomerate of Bash scripts gluing many pieces of binaries together. Bash is there for everything that C++ (in our case) shouldn’t do, things like:

Packaging and anything related to packaging (post scripts of packages for instance)
SysV service infrastructure
Backups
Customization of the development environment
Deployment infrastructure

Yes, so where have we been? – Oh, how do we manage ~30,000 lines of Bash without getting everything messed out. For this, we need 3 main things:

Version control system (CVS, SVN, git, pick your kick)
Competent people
Coding conventions

Configuring a version control system is easy, espcially if you are a competent developer, I’ll skip to the 3rd one.

Conventions. Show me just one competent C/C++/Java/C# developer who would skip on using coding conventions and program like an ape. OK, I know, there might be some, but we both think the same thing about them. Scripting in Bash shouldn’t be an exception in that case. We should use strict scripting conventions on Bash in particular and on any other scripting language in general.
There’s nothing uglier and messier than a Bash script that is written without conventions (well, maybe Perl without conventions).

I’ll try in the following post to introduce my Bash scripting conventions and if you find them neat – feel free to adopt them. Up until today, sadly, I havn’t seen anyone suggesting any Bash scripting conventions.

Before starting any Bash script, I have the following skeleton :

#!/bin/bash

main() {
}

main "$@"

Call me crazy, but without my main() function I ain’t going anywhere. Now we can start writing some bash. Once you have your main() it means you’re going to write code ONLY inside functions. I don’t care it’s a scripting language – let’s be strict.

#!/bin/bash

# $1 - hostname
# $2 - destination directory
main() {
	local hostname=$1; shift
	local destination_directory=$1; shift
}

main "$@"

Need to receive any arguments in a function? – Do the following:

Document the variables above the function
Use shift after receiving each variable, and always use $1, it’s easier to later move the order of the variables

Notice that i’m using the local keyword to make sure the scope of the variable is only within its function. Be strict with variables. If for some reason you decide you need a global variable, it’s OK, but make sure it’s in capital letters and declared as needed:

# a read only variable
declare -r CONFIGURATION_FILE="/etc/named.conf"
# a read only integer variable
declare -i -r TIMEOUT=600

# inside a function
sample_function() {
	local -i retries=0
}

You are getting the point – I’m not going to make it any easier for you. Adopt whatever you like and remember that being strict in Bash yields code in higher quality, not to mention better readability. Feeling like writing a sloppy script today? – Go home and come back tomorrow.

Following is a script written by the conventions I’m suggesting. This is a very simple script that simply displays a dialog and asks the user which host he would like to ping, then displays the result in a tailbox dialog. This is more or less how many of my scripts look like. I admit it took me a while to find a script which is not too long (under 100 lines) and can still represent some of the ideas I was mentioning.

I urge you to question my way and conventions and suggest some of your own, anyway, here it is:

#!/bin/bash

# dialog options (notice it's read only)
# notice as well it's in capital letters
declare -r DIALOG_OPTS="0 0 0"
declare -i -r OUTPUT_DIALOG_WIDTH=60

# ping timeout in seconds (using declare -i because it's an integer and -r because it's read only)
declare -i -r PING_TIMEOUT=5
# size of pings
declare -i -r DEFAULT_SIZE_OF_PINGS=56
# number of pings to perform
declare -i -r DEFAULT_NUMBER_OF_PINGS=1

# neatly pipes a command to a tailbox dialog
# here we can specify the parameters the function expects
# this is how i like to do it, perhaps there are smarter ways that may integrate better
# with doxygen or some other tools
# $1 - tailbox dialog height
# $2 - title
# $3 - command
pipe_command_to_tailbox_dialog() {
	# parameters extraction, always using $1, with `shift` right afterwards
	# in case you want to play with the order of parameters, just move the lines around
	local -i height=5+$1; shift
	local title="$1"; shift
	local command="$1"; shift

	# need a temporary file? - always use `mktemp`, please spare me the `/tmp/$$` stuff
	# or other insecure alternative to temporary file creations, use only `mktemp`
	local output_file=`mktemp`
	# run in a subshell and with eval, so we can pass pipes and stuff...
	# eval is a favorite of mine, it means you truely understand the Bash shell
	# nevertheless - it is really needed in this case
	(eval $command) >& $output_file &
	# need to obtain a pid? - it's surely an integer, so use 'local -i'
	local -i bg_job=$!

	# ok, show the user the dialog
	dialog --title "$title" --tailbox $output_file $height $OUTPUT_DIALOG_WIDTH

	# TODO my lame way of checking if a process is running on linux, anyone has a better way??
	# my way of specifying a 'TODO' is in the above line
	if [ -d /proc/$bg_job ]; then
		# if the process is stubborn, use 'kill -9'
		kill $bg_job || kill -9 $bg_job >& /dev/null
	fi
	# wait for process to end itself
	wait $bg_job

	# not cleaning up your temporary files is similar to a memory leak in C++
	rm -f $output_file
}

# pings a host with a nice dialog
ping_host_dialog() {
	local ping_params_tmp=`mktemp`
	# slice lines nicely, i have long commands, i'm not going even to mention
	# line indentation - that goes without saying
	# i like to use dialogs, it makes more people use your scripts
	if dialog --ok-label "Submit" \
		--form "Ping host" \
		$DIALOG_OPTS \
			"Address:" 1 1 "" 1 30 40 0 \
			"Size of pings:" 2 1 "$DEFAULT_SIZE_OF_PINGS" 2 30 40 0 \
			"Number of pings:" 3 1 "$DEFAULT_NUMBER_OF_PINGS" 3 30 40 0 2> $ping_params_tmp; then
		# ping_params_tmp will be empty if the user aborted the dialog...
		local address=`head -1 $ping_params_tmp | tail -1`
		# yet again if you expect an integer, use 'local -i'
		local -i size_of_pings=`head -2 $ping_params_tmp | tail -1`
		local -i number_of_pings=`head -3 $ping_params_tmp | tail -1`
	fi
	rm -f $ping_params_tmp

	# this is my standard way of checking if a variable is empty
	# may not be the prettiest way, but it surely catches the eye...
	if [ x"$address" != x ]; then
		pipe_command_to_tailbox_dialog 15 "Pinging host \"$address\"" "ping -c $number_of_pings -s $size_of_pings -W $PING_TIMEOUT \"$address\""
	fi
}

# main function, can't live without it
main() {
	ping_host_dialog
}

# although there are no parameters passed in this script, i still pass $* to main, as a habit
#main $*
# after Uri's comment, I'm fixing the following and calling "$@" instead
main "$@"

I don’t want this post to be too long (it is already quite long), but I think you got the idea. I still have some conventions I did not introduce here. In case you liked what you’ve seen – do not hesitate to contact me so I can provide you with a document describing all of my Bash scripting conventions.

Posted August 1, 2009 by malkodan in Bash, Linux, System Administration

Tagged with /bin/bash, Bash, Coding, Conventions, declare, dialog, eval, Linux, local, ping, Scripting, strict

The master of puppets 10 comments

After several conversations with my good friend and now also boss I decided to open a small little Blog which will describe my Linux/SysAdmin/C++/Bash adventures.

I’m Dan, you can learn a lot about me from my website at http://nevela.com . My expertise is Unix/Linux System administration and C++ development. Linux is my home and Bash is my mother’s tongue. In this Blog I hope to share interesting topics, stories and decisions in the mentioned fields.

I’ll start with something fresh and new for me – Puppet (http://reductivelabs.com/trac/puppet).

I believe System administration is an art, just like programming. SysAdmins divide into 2 types, from what I’ve seen:

The fireman – most likely the mediocre SysAdmin type you are very familiar with. This type of SysAdmin knows how to troubleshoot and apply the specific fix where’s it’s needed. They get things done, but not more than that. Gradually their workload grows (especially if they do a bad job) and consume all of their time. This is the bad type of SysAdmin. Words like planning are usually not among their jargon (although they might argue with you).
The developer – this type of SysAdmin plans before doing anything. I hope I can belong to this group. He thinks as a developer – whenever a problem arises in a system he manages – he treats it as a bug rather then a problem. This type of SysAdmin will always have his configuration stored in a source control repository. He will always solve bugs he encounter by fixing the script or program that caused the so called problem.

If you’re a fireman – then this post is not for you. However, if you’re the second type, then puppet is exactly for you.

As a SysAdmin in a small startup company, I got to a position where I have to manage 2 types of machines:

Production machines
In house machines for internal company usage

Luckily managing dozens of production machines is easy for us (for me :)) as the configuration is most likely to be the same on all of them. On the other hand, ironically, managing the in house machines became a more time consuming task, as each machine has many different and diverse jobs. This is where puppet kicks in.

Up until today, we had a few main servers doing most of the simple tasks, among them:

LDAP
Samba/NFS file sharing
SVN
Apache
In house DNS and DNS caching
Buildbot master
Nagios
And many more other things (you got the point)

Obviously I have all of my configuration stored in our SVN and backed up. Given that one of these servers would die, I could generate a new one within hours. But yet, most of the configuration there was done manually. Who is the crazy SysAdmin that would write a script to meta-generate named.conf? Who would auto create SVN repositories (especially if you have just one repository to manage)?

Yes, this is where puppet goes in. Puppet forces you to work correctly. By working with puppet, you realize that the way you’re going to work is only by writing puppet recipes and automating everything that makes your servers what they are. Puppet is also revolutionary in many other ways, but I’ll focus on the way that it forces the SysAdmin to work in a clean manner.

One thing I have to warn from, puppet has a slightly steep learning curve, but if you’re the serious Linux SysAdmin type – you wouldn’t find that one steep. You’ll struggle with it. You must.

So I said that no one (most likely) would write a script to generate its DNS configuration, but I saved mine in SVN and wrote a recipe for DNS servers to use:

$dns_username = 'named'
$dns_group = 'named'
$dns_configuration_file = '/etc/named.conf'
class dns-server {
	# puppet takes care to install the latest package bind and it's cross platform!
	package { bind:
		ensure => latest
	}
	# configuring named. subscribing it on /etc/named.conf, which means that every time this
	# file would change, named will refresh
	service { named:
		ensure    => running,
		enable    => true,
		subscribe => File[$dns_configuration_file],
		require   => Package["bind"]
	}
	# this is the /etc/named.conf file. in subclasses we'll specify the source of the file
	file { $dns_configuration_file:
		owner   => $dns_username,
		group   => $dns_group,
		replace => true,
		mode    => 640
	}
}

Alright, this puppet code snippet only configures /etc/named.conf for usage with bind – but we never told puppet where to take the configuration from. I’ll share with you my DNS master server configuration:

class dns-master inherits dns-server {
	# if it's a DNS master - lets take the named-master.conf file (obviously stored in SVN as well)
	File[$dns_configuration_file] {
		source => "$puppet_fileserver_base/etc/named/named-master.conf",
	}

	# copy all the zone files
	file { "/var/named":
		source       => "$puppet_fileserver_base/etc/named/zones",
		ignore       => ".svn*",
		checksum     => "md5",
		sourceselect => all,
		recurse      => true,
		purge        => false,
		owner        => $dns_username,
		group        => $dns_group,
		replace      => true,
		mode         => 640
	}

	# subscribe all the zone files in named
	Service[named] {
		subscribe +> File["/var/named"]
	}
}

Great, so now every time I’ll change a zone in my puppet master, named will automatically restart. I believe this is great. Needless to say I do not use dynamic DNS updates, so I care not about zone files being updated.

Yes, so I did it once for DNS. If I ever needed now to change the configuration, I’d do it via puppet and let the configuration propagate. Maybe a small little phase of checking the new configuration, but right afterwards would come the svn commit – and it’s inside. Now I know I’m tidy and clean forever.

Needless to say that if I’ll ever need to configure another DNS server, I’ll use this recipe, or extend it as needed – puppet lets you inherit from base classes and add properties and behavior as needed (I have also a class for a dns-slave as you could have imagined).

OK, I should sum it up. To be honest – I’m glad I’m writing this part of the post after several discussions with a few of my wise colleagues.

Why should I use puppet?!

The answer to this one if not easy – and I’m not the one who should answer it. It is you who should answer it.

Adopting puppet reflects one main thing about you as a SysAdmin:

You are a lazy, strict and tidy SysAdmin, an expert.

Posted July 27, 2009 by malkodan in Linux, System Administration

Tagged with Bash, dns, lazy, Linux, master, puppet, slave, svn, tidy

Bashing Linux Linux system administration, programming and everything that goes in between…