Archive for the ‘Linux’ Tag
Imagine you are in Tasmania and need to move 35TB (1 million files) to S3 in the Sydney region. The link between Tasmania and continental Australia will undergo maintenance in the next month, which means one or both of the following:
- You cannot use network links to transfer the data
- Tasmania might be drifting further away from the mainland now that it is untethered
In short, I’m going to be presented with a bunch of HDs, and I need to copy the data onto them, fly to Sydney and upload the data to S3. If I were given a single 35TB HD I could just copy the data and be done with it – no dramas. More likely though, the HDs will each be smaller than 35TB, so I need to look at a few options for doing that.
Things to consider are:
- Files should be present on the HDs in their original form – so they can be uploaded to S3 directly, without needing a staging space for unzipping etc.
- HDs should be accessible independently, so that if a HD is faulty I can easily identify which files need copying again
- The copy operation should be reproducible, so the previous point can be satisfied if anything goes wrong in the copying process
- Copying should be done in parallel (it’s 35TB, it’ll take a while)
- It has to be simple to debug if things go wrong
LVM/ZFS over a few HDs
Building a larger volume over a few HDs requires connecting all of the HDs to one machine at the same time, and if any of them fails I lose all the data. I decided not to do that – too risky. It would also be difficult to debug if anything went wrong.
tar | split
Not a bad option on its own. An archive can be built and split into parts, then the parts can be copied onto the destination HDs. But the loss of a single HD would prevent me from extracting the files that continue onto the next one.
tar also supports -L (tape length) and can potentially split the backup on its own, without the use of split. Still, it would take a very long time to spool it to multiple HDs, as it cannot do so in parallel. In addition, I’d have to improvise something for untarring and uploading to S3, as I will have no staging area to untar those 35TB. I’d need something along the lines of tar -O -xf ... | s3cmd.
Can’t say I am super keen on trusting -L to split a volume across a few tapes (or HDs, in this case) – it has to work the first time.
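For what it’s worth, the streaming upload could look something like this – a rough sketch, assuming an s3cmd recent enough to upload from stdin, with the archive and bucket names as placeholders:

# stream every file out of the archive straight into S3, no staging area
# (painfully slow at 35TB, since tar rescans the archive for every file -
# one more reason this option lost)
tar -tf backup.tar | grep -v '/$' | while read -r f; do
    tar -O -xf backup.tar "$f" | s3cmd put - "s3://my-bucket/$f"
done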
Span Files
I decided to write a utility that’ll do what I need since there’s only one chance of getting it right – it’s called span-files.sh. It operates in three phases:
- index – lists all files to be copied and their sizes
- span – given a maximum size of a HD, iterate on the index and generate a list of files to be copied per HD
- copy – produces rsync --files-from=list.X commands to run per HD. They can all be run in parallel if needed
The utility is available here:
https://github.com/danfruehauf/Scripts/tree/master/span-files
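For the curious, the span phase is essentially greedy first-fit over the index. Here is a toy version of the idea – not the real thing (see the repository for that), and the index format of one “SIZE PATH” pair per line is an assumption:

#!/bin/bash
# toy span phase: read an index of "SIZE PATH" lines and emit list.1,
# list.2, ... so that each list fits on a HD of MAX_SIZE bytes
MAX_SIZE=$((4 * 1024 ** 4)) # 4TB HDs, adjust to taste
hd=1; used=0
> list.$hd
while read -r size path; do
    # current HD is full - start a new list
    if (( used + size > MAX_SIZE )); then
        hd=$((hd + 1)); used=0
        > list.$hd
    fi
    echo "$path" >> list.$hd
    used=$((used + size))
done < index.txt
echo "spanned into $hd lists"

Each list.X can then be handed to its own rsync --files-from=list.X run, one per HD, all in parallel.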
I’ll let you know how it all went after I do the actual copy. I still wonder whether I forgot some things…
I’ve been searching for a while for a solution to “how to build a fault tolerant Nagios installation” or “how to build a Nagios cluster”. Nada.
The concept is very simple, but the available implementations seem to lack a bit, so I’ve decided to write a post about how I am doing it.
Cross Site Monitoring
The concept of cross site monitoring is very simple. Say you have nagios01 and nagios02; all you have to set up is 2 checks:
- nagios01 monitors nagios02
- nagios02 monitors nagios01
Assuming you have puppet or chef managing the show, just make nagios01 and nagios02 (or even more nagiosXX servers) identical, meaning all of them have the same configuration and can monitor all of your systems. Clones of each other, if you’d like to call it that.
Let’s check the common use cases:
- If nagios01 goes down you get an alert from nagios02.
- If nagios02 goes down you get an alert from nagios01.
Great, I didn’t invent any wheel over here.
The main problem with this configuration is that if there is a problem (any problem), you are going to get X alerts – X being the number of nagios servers you have.
Avoiding Duplicate Alerts
For the sake of simplicity, we’ll assume again we have just 2 nagios servers, but this would obviously scale for more.
What we actually want to do is prevent both servers from sending duplicate alerts as they are both configured the same way and will monitor the exact same thing.
One obvious solution is an active/passive type of cluster and all sorts of complicated shenanigans; my solution is simpler than that.
We’ll “chain” nagios02 behind nagios01, making nagios02 fire alerts only if nagios01 is down.
Login to nagios02 and change /etc/nagios/private/resource.cfg, adding the line:
$USER2$="/usr/lib64/nagios/plugins/check_nrpe -H nagios01 -c check_nagios"
$USER2$ will be the condition of whether or not nagios is up on nagios01.
Still on nagios02, edit /etc/nagios/objects/commands.cfg, changing your current alerting command to depend on that condition. Here is an example for the default one:
define command{
    command_name    notify-host-by-email
    command_line    /usr/bin/printf "%b" ...
}
Change to:
define command{
    command_name    notify-host-by-email
    command_line    eval $USER2$ || /usr/bin/printf "%b" ...
}
What we have done here is simply configure nagios02 to query nagios01’s Nagios status before firing an alert. Easy as. No more duplicate emails.
For the sake of robustness, if you would like to also configure nagios01 with a $USER2$ variable, simply log in to nagios01, change the alerting command as on nagios02 and put this in /etc/nagios/private/resource.cfg:
$USER2$="/bin/false"
Assuming you have puppet or chef configuring all that, you can just assign a master ($USER2$=/bin/false) and multiple slaves that query the servers before them in a chain.
For example:
- nagios01 – $USER2$="/bin/false"
- nagios02 – $USER2$="/usr/lib64/nagios/plugins/check_nrpe -H nagios01 -c check_nagios"
- nagios03 – $USER2$="/usr/lib64/nagios/plugins/check_nrpe -H nagios01 -c check_nagios && /usr/lib64/nagios/plugins/check_nrpe -H nagios02 -c check_nagios"
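And if you’d rather generate that chain than type it, a small sketch can spit the lines out – the server list and plugin path below are assumptions:

#!/bin/bash
# print the $USER2$ resource line for each nagios server in the chain:
# the first is the master (/bin/false), each subsequent server checks
# all the servers before it
NRPE=/usr/lib64/nagios/plugins/check_nrpe
SERVERS=(nagios01 nagios02 nagios03)
for ((i = 0; i < ${#SERVERS[@]}; i++)); do
    if (( i == 0 )); then
        cond="/bin/false"
    else
        cond=""
        for ((j = 0; j < i; j++)); do
            cond="${cond:+$cond && }$NRPE -H ${SERVERS[j]} -c check_nagios"
        done
    fi
    echo "${SERVERS[i]}: \$USER2\$=\"$cond\""
done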
Enjoy!
Since I got this question from way too many people, I wanted to share my “cross distribution” and “cross desktop environment” way of doing that very simple thing: enabling a Hebrew keyboard layout under Linux.
Easy As
After logging into your desktop environment, type this:
setxkbmap -option grp:switch,grp:alt_shift_toggle,grp_led:scroll us,il
Alt+Shift will get you between Hebrew and English. Easy as.
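To confirm the layout actually took effect, setxkbmap can report its own state:

setxkbmap -query | grep layout

You should see us,il in the output.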
Sustainability
Making it permanent is just as easy:
mkdir -p ~/.config/autostart && cat <<EOF > ~/.config/autostart/hebrew.desktop
[Desktop Entry]
Type=Application
Encoding=UTF-8
Name=Hebrew
Comment=Enable a Hebrew keyboard layout
Exec=setxkbmap -option grp:switch,grp:alt_shift_toggle,grp_led:scroll us,il
EOF
This should survive logout/login, reboots, reinstalls (as long as you keep /home on a separate partition), distribution changes and switching to a different desktop environment (KDE, GNOME, LXDE, etc.).
Cloud computing and being lazy
The need to create template images in our cloud environment is obvious, especially with Amazon EC2 offering an amazing API and spot instances at ridiculously low prices.
In the following post I’ll show what I am doing in order to prepare a “puppet-ready” image.
Puppet to the rescue
In my environment I have puppet configured and provisioning any of my machines. With puppet I can deploy anything I need – “if it’s not in puppet – it doesn’t exist”.
Coupled with Puppet Dashboard, the interface for manually adding nodes is rather simple. But doing stuff manually is slow. I assume that, given the right base image, I (and you) can deploy and configure that machine with puppet.
In other words, the ability to convert a bare machine to a usable machine is taken for granted (although it is heaps of work on its own).
Handling the “bare” image
Most cloud computing providers today give you an interface for starting/stopping/provisioning machines on their cloud.
The images they supply are usually bare, such as CentOS 6.3 with nothing on it. Configuring an image like that requires some manual labour, as you can’t even auto-login to it without some random password or something similar.
Create a “puppet ready” image
So if I boot up a simple CentOS 6.x image, these are the steps I’m taking in order to configure it to be “puppet ready” (and I’ll do it only once per cloud computing provider):
# install EPEL, because it's really useful
rpm -q epel-release-6-8 || rpm -Uvh http://download.fedoraproject.org/pub/epel/6/`uname -i`/epel-release-6-8.noarch.rpm
# install puppet labs repository
rpm -q puppetlabs-release-6-6 || rpm -ivh http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-6.noarch.rpm
# i usually disable selinux, because it's mostly a pain
setenforce 0
sed -i -e 's!^SELINUX=.*!SELINUX=disabled!' /etc/selinux/config
# install puppet
yum -y install puppet
# basic puppet configuration
echo '[agent]' > /etc/puppet/puppet.conf
echo ' pluginsync = true' >> /etc/puppet/puppet.conf
echo ' report = true' >> /etc/puppet/puppet.conf
echo ' server = YOUR_PUPPETMASTER_ADDRESS' >> /etc/puppet/puppet.conf
echo ' rundir = /var/run/puppet' >> /etc/puppet/puppet.conf
# run an update
yum update -y
# highly recommended is to install any package you might deploy later on
# the reason behind it is that it will save a lot of precious time if you
# install 'httpd' just once, instead of 300 times, if you deploy 300 machines
# also recommended is to run any 'baseline' configuration you have for your nodes here
# such as changing SSH port or applying common firewall configuration for instance
yum install -y MANY_PACKAGES_YOU_MIGHT_USE
# and now comes the cleanup phase, where we actually make the machine "bare", removing
# any identity it could have
# set machine hostname to 'changeme'
hostname changeme
sed -i -e "s/^HOSTNAME=.*/HOSTNAME=changeme" /etc/sysconfig/network
# remove puppet generated certificates (they should be recreated)
rm -rf /etc/puppet/ssl
# stop puppet - the hostname should be changed before it is allowed to run again
service puppet stop; chkconfig puppet off
# remove SSH keys - they should be recreated with the new machine identity
rm -f /etc/ssh/ssh_host_*
# finally add your key to authorized_keys
mkdir -p /root/.ssh; echo "YOUR_SSH_PUBLIC_KEY" > /root/.ssh/authorized_keys
Power off the machine and create an image. This is your “puppet-ready” image.
Using the image
Now you’re good to go – any machine you create in the future should be based on that image.
When creating a new machine the steps you should follow are:
- Start the machine with the “puppet-ready” image
- Set the machine’s hostname
hostname=uga.bait.com
hostname $hostname
sed -i -e "s/^HOSTNAME=.*/HOSTNAME=$hostname/" /etc/sysconfig/network
- Run ‘puppet agent --test’ to generate a new certificate request
- Add the puppet configuration for the machine; for puppet dashboard it’ll be something similar to:
hostname=uga.bait.com
sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:add name=$hostname
sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:groups name=$hostname groups=group1,group2
sudo -u puppet-dashboard RAILS_ENV=production rake -f /usr/share/puppet-dashboard/Rakefile node:parameters name=$hostname parameters=parameter1=value1,parameter2=value2
- Authorize the machine in puppetmaster (if autosign is disabled)
- Run puppet:
# initial run, might actually change stuff
puppet agent --test
service puppet start; chkconfig puppet on
This is 90% of the work if you want to quickly create usable machines on the fly. It shortens the process significantly and can easily be implemented to support virtually any cloud computing provider!
I personally have it all scripted and a new instance on EC2 takes me 2-3 minutes to load + configure. It even notifies me politely via email when it’s done.
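For the curious, such a wrapper might look roughly like the sketch below. This is not my actual script – the AMI ID, keypair, instance type and the awk field positions for the old ec2-api-tools output are placeholders/assumptions:

#!/bin/bash
# new-node.sh <hostname> - glue the steps above together
hostname=$1
# boot an instance from the "puppet-ready" AMI
instance=`ec2-run-instances ami-XXXXXXXX -k my-keypair -t m1.small | awk '/^INSTANCE/ {print $2}'`
echo "started $instance, waiting for it to come up..."
sleep 120 # crude - polling ec2-describe-instances would be nicer
host=`ec2-describe-instances $instance | awk '/^INSTANCE/ {print $4}'`
# set the machine identity and generate a certificate request
ssh root@$host "hostname $hostname && \
    sed -i -e 's/^HOSTNAME=.*/HOSTNAME=$hostname/' /etc/sysconfig/network && \
    puppet agent --test"
# register the node in puppet dashboard (runs on the puppetmaster)
ssh puppetmaster "sudo -u puppet-dashboard RAILS_ENV=production \
    rake -f /usr/share/puppet-dashboard/Rakefile node:add name=$hostname"
# and let puppet loose
ssh root@$host "puppet agent --test && service puppet start && chkconfig puppet on"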
I’m such a lazy bastard.
Have decided to publish the infamous Bash scripting conventions.
Here they are:
https://github.com/danfruehauf/Scripts/tree/master/bash_scripting_conventions
Please, comment, challenge and help me modify it. I’m very open for feedback.
It’s been a long time since I’ve played any interactive computer game. Unfortunately, a work colleague introduced me to EVE Online.
I’m usually playing EVE on Microsoft Windows, which I believe is the best platform for PC gaming.
It’s been a while since I dealt with WINE. In the old days WINE was very complicated to deal with.
I thought I should give it a try – EVE Online on CentOS.
This is a short, semi-tutorial post about how to run EVE Online on CentOS.
It’s fairly childish so even very young Linux users will be able to understand it easily.
Let’s go (as root):
# cat > /tmp/epel.conf <<EOF
[epel]
name=\$releasever - \$basearch - epel
baseurl=http://download.fedora.redhat.com/pub/epel/5/x86_64/
enabled=1
EOF
# yum -y -c /tmp/epel.conf install wine
Let’s get EVE Online (from now on there’s no need for root user access):
$ cd /tmp
$ wget http://content.eveonline.com/EVE_Premium_Setup_XXXXXX_m.exe
XXXXXX is obviously the version number, which is subject to change.
Let’s install EVE:
$ wine /tmp/EVE_Premium_Setup_XXXXXX_m.exe
OK, here’s the tricky part: if you run it now, the EULA page will not display properly and you won’t be able to accept it. This is because it needs TrueType fonts.
We’ll need to install the msttcorefonts package; a quick look at Google suggests you can follow the instructions found here.
Let’s configure the fonts in wine:
$ for font_file in `rpm -ql msttcorefonts`; do ln -s "$font_file" /home/dan/.wine/drive_c/windows/Fonts/; done
Run EVE:
$ wine "/home/dan/.wine/drive_c/Program Files/CCP/EVE/eve.exe"
It’ll also most likely add a desktop icon for you, in case you didn’t notice.
EVE works nicely with WINE – evidence that WINE has come a very long way since the last time I used it!!
I believe these instructions can be generalized quite easily for recent Fedora distros as well.
\o/
Feel free to contact me on this issue in case you encounter any problems.
I still remember my Linux nightmares of the previous century – trying to install Linux and wiping my whole HD while attempting to multi-boot RedHat 5.0.
It was not without reason that they said you have to be a rocket scientist in order to install Linux properly.
Times have changed; Linux is easy to install. Perhaps two things are different: one is that Linux objectively became much easier to handle, and the second is probably the fact that I gained much more experience.
In my opinion, one of the reasons Linux became easier over the years is the improving support for various device drivers. For home users it is excellent news. However, for the SysAdmin, who deals mostly with servers and some high-end devices, the headache, I believe, still exists.
If you thought that having a NIC without a driver is a problem, I can assure you that having a RAID controller without a driver is ten times the headache.
I bring you the story of the RocketRAID device: how to remaster initrd images and driver disks and, of course, how to become a rocket scientist!
WTF
With CentOS 5.4 you get an ugly error in the middle of the installation saying you have no devices you can partition.
DOH!!! Because it discovered no HDs.
So now you’re asking yourself, where am I going? – Google of course.
RocketRAID 3530 driver page
And you discover you have drivers only for RHEL/CentOS 5.3. Oh! but there’s also source code!
It means we can do either of the two:
- Remaster initrd and insert the RocketRAID drivers where needed
- Create a new driver disk and use it
I’ll show how we do them both.
I’ll assume you have the RocketRAID driver compiled for the installation kernel.
In addition, I’m also going to assume you have a network installation that’s easy to remaster.
Remastering the initrd
What do we have?
# file initrd.img
initrd.img: gzip compressed data, from Unix, last modified: Sun Jul 26 17:39:09 2009, max compression
I’ll make it quicker for you. It’s a gzipped cpio archive.
Let’s open it:
# mkdir initrd; gunzip -c initrd.img | (cd initrd && cpio -idm)
12113 blocks
It’s open, let’s modify what’s needed.
- modules/modules.alias – Contains a list of PCI device IDs and the module to load
- modules/pci.ids – Common names for PCI devices
- modules/modules.dep – Dependency tree for modules (loading order of modules)
- modules/modules.cgz – The actual modules inside this initrd
Most of the work was done for us already in the official driver package from HighPoint.
Edit modules.alias and add there the relevant new IDs:
alias pci:v00001103d00003220sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003320sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003410sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003510sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003511sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003520sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003521sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003522sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003530sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003540sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00003560sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004210sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004211sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004310sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004311sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004320sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004321sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004322sv*sd*bc*sc*i* hptiop
alias pci:v00001103d00004400sv*sd*bc*sc*i* hptiop
This was taken from the RHEL5.3 package on the HighPoint website.
So now the installer (anaconda) knows it should load hptiop for our relevant devices. But it needs the module itself!
Download the source package and do the usual configure/make/make install – I’m not planning to go into it. I assume you now have your hptiop.ko compiled against the kernel version the installation is going to use.
OK, so the real deal is in modules.cgz, let’s open it:
# file modules/modules.cgz
modules/modules.cgz: gzip compressed data, from Unix, last modified: Sat Mar 21 15:13:43 2009, max compression
# mkdir /tmp/modules; gunzip -c modules/modules.cgz | (cd /tmp/modules && cpio -idm)
41082 blocks
# cp /home/dan/hptiop.ko /tmp/modules/2.6.18-164.el5/x86_64
Now we need to repackage both modules.cgz and initrd.img:
# (cd /tmp/modules && find . -print | cpio -c -o | gzip -c9 > /tmp/initrd/modules/modules.cgz)
41083 blocks
# (cd /tmp/initrd && find . -print | cpio -c -o | gzip -c9 > /tmp/initrd-with-rr.img)
Great – use initrd-with-rr.img for your installation now; it should load your RocketRAID device!
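Before betting a remote installation on it, a quick sanity check doesn’t hurt – unpack the new image again and make sure the module really made it in:

# mkdir /tmp/verify; gunzip -c /tmp/initrd-with-rr.img | (cd /tmp/verify && cpio -idm)
# gunzip -c /tmp/verify/modules/modules.cgz | cpio -t | grep hptiop

If grep comes back empty, something went wrong with the repackaging.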
A driver disk
Creating a driver disk is much cleaner in my opinion. You do not remaster a stock initrd just for a stupid driver.
So you ask, what is a driver disk? Without going into the bits and bytes, I’ll just say that it’s a brilliant way of incorporating a custom modules.cgz and modules.alias without touching the installation initrd at all!
I knew I couldn’t live quietly with the initrd remaster, so choosing the driver disk (dd for short) option was inevitable.
As I noted before, HighPoint provided only a RHEL/CentOS 5.3 driver disk (and binary), but they also provided the source. I knew it was a matter of some adjustments to get it to work for 5.4 as well.
It is much easier to approach the driver disk now that we are much more familiar with how the installation initrd works.
I’m lazy; I already created a script that takes the 5.3 driver package and creates a dd:
#!/bin/bash

# $1 - driver_package
# $2 - destination of driver disk
make_rocketraid_driverdisk() {
    local driver_package=$1; shift
    local destination=$1; shift
    local tmp_image=`mktemp`
    local tmp_mount_dir=`mktemp -d`
    dd if=/dev/zero of=$tmp_image count=1 bs=1M && \
        mkdosfs $tmp_image && \
        mount -o loop $tmp_image $tmp_mount_dir && \
        tar -xf $driver_package -C $tmp_mount_dir && \
        umount $tmp_mount_dir
    # capture the result unconditionally - if retval were declared at the
    # end of the && chain, an early failure would leave it unset
    local -i retval=$?
    if [ $retval -eq 0 ]; then
        cp -aL $tmp_image $destination
        chmod 644 $destination
        echo "Driver disk created at: $destination"
    fi
    rm -f $tmp_image
    rmdir $tmp_mount_dir
    return $retval
}

make_rocketraid_driverdisk rr3xxx_4xxx-rhel_centos-5u3-x86_64-v1.6.09.0702.tgz /tmp/rr.img
Want it for 5.4? Easy. Just remaster the modules.cgz inside rr3xxx_4xxx-rhel_centos-5u3-x86_64-v1.6.09.0702.tgz, replacing the hptiop.ko in it with one built for the 5.4 installation kernel 🙂
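Something along these lines should do it – a sketch, assuming modules.cgz sits at the top level of the package (adjust paths if it’s nested) and that your hptiop.ko was built against the 2.6.18-164.el5 installation kernel:

# unpack the 5.3 driver disk package and its modules.cgz
mkdir /tmp/dd && tar -xf rr3xxx_4xxx-rhel_centos-5u3-x86_64-v1.6.09.0702.tgz -C /tmp/dd
mkdir /tmp/dd-modules && gunzip -c /tmp/dd/modules.cgz | (cd /tmp/dd-modules && cpio -idm)
# the modules live under a kernel-version directory - rename the 5.3 one
# to the kernel the 5.4 installer uses, then swap in the freshly built module
mv /tmp/dd-modules/2.6.18-*.el5 /tmp/dd-modules/2.6.18-164.el5
cp /home/dan/hptiop.ko /tmp/dd-modules/2.6.18-164.el5/x86_64/
(cd /tmp/dd-modules && find . -print | cpio -c -o | gzip -c9 > /tmp/dd/modules.cgz)
# repackage and feed it to make_rocketraid_driverdisk from before
(cd /tmp/dd && tar -czf /tmp/rr-5u4.tgz .)
make_rocketraid_driverdisk /tmp/rr-5u4.tgz /tmp/rr3xxx-4xxx-2.6.18-164.el5.img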
Edit your kickstart to load the driver disk:
driverdisk --source=http://UGAT/HA/BAIT/INC/HighPoint/RocketRAID/3xxx-4xxx/rr3xxx-4xxx-2.6.18-164.el5.img
Make sure you have this line in the main section and not meta-generated in your %pre section, as the driverdisk directive is processed before the %pre section runs.
The OS doesn’t boot after installation
You moron! This is because the installation kernel/initrd and the one that boots afterwards are not the same!
You can fix it in one of the following three ways:
- Recompile the CentOS/RHEL kernel and repackage it with the RocketRAID driver – pretty ugly, not to mention time consuming.
- Build a module RPM for the specific kernel version you’re going to use – very clean but also very time consuming!
- Just build the module for the relevant kernel in the %post section – my way.
In the %post section of your kickstart, add the following:
(cd /tmp && \
wget http://UGAT/HA/BAIT/INC/rr3xxx_4xxx-linux-src-v1.6-072009-1131.tar.gz && \
tar -xf rr3xxx_4xxx-linux-src-v1.6-072009-1131.tar.gz && \
cd rr3xxx_4xxx-linux-src-v1.6 && \
make install)
The next boot obviously uses a different initrd image. Generally speaking, initrd creation is done after the %post section, so you should not worry about it too much…
Server should boot now. Go play with your 12x2TB RAID array.
I hope I could teach you something in this post. It was a hell of a war discovering how to do all of this properly.
Now if you’ll excuse me – I’ll be going to play with spaceships and shoot rockets!
Being the one responsible for the backups at work, I never compromised on anything.
As a SysAdmin you must:
- Backup
- Backup some more
- Test your backups
Shoot to maim
At first we mainly needed to back up our subversion repository. A pretty easy task for any SysAdmin.
What I would do is simply dump the repository at night and scp it to the workstations of two developers in the company (I didn’t really have much of a choice in terms of other computers on the network).
It worked.
The golden goose is on the loose
After a while I managed to convince our R&D manager it was time for detachable backups. Detachable backups can save you in case the building is on fire, or if someone decides to shoot a rocket at your building (unlikely even in Israel, but as a SysAdmin – never take any chances).
With the virtual threat of a virtual rocket that might incinerate all of our important information, we decided that the cheapest and most effective course of action was to purchase a tape drive and a few tapes. Mind you, the year is 2006 and portable HDs are expensive, uncommon and small.
Backing up to tape had always been an item on my TODO list waiting to be ticked off.
During one of my previous jobs we had a tape archive that took care of it transparently; it was managed by a different team. Ever since, I had yearned for that totally sequential /dev/tape.
Very soon I discovered that these tapes are a plain headache:
- It is a mess in general to deal with the tapes as the access is sequential
- It’s slow!!!
- The only reasonable way to deal with the backups is with dump – an archaic tool that works only on the extX filesystem family!
- It takes a while to eject a tape; I can still remember the minutes of waiting in the server room for the tape to eject, so I could deposit it at home
- The tapes tend to break! Like any tape, the film tends to run off the bearings, rendering the tape useless
Too bad our backup was far from being able to fit on DVD media.
The glamour, the fortune, the pain
The tape backup served us well for more than 2 years. I was so happy the solution was robust enough to keep us running for that long without the need for any major changes.
But portable USB HDs became cheaper and larger, and it was time for a change. I was excited to receive two brand new and shiny 500GB HDs. I diligently worked on a new backup script. A backup script that would not be dependent on the filesystem type (hell! I wanted to use JFS!), a backup script that would keep snapshots going weeks back! A backup script that would rule them all!
This backup script will hopefully be published in one of my next posts.
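Until then, here is a minimal sketch of the general idea behind it – the classic rsync plus cp -al hardlink rotation trick, not the actual script; /data and /media/backup are placeholders:

#!/bin/bash
BACKUP=/media/backup
# shift the snapshots: the oldest drops off, the rest move up by one
rm -rf $BACKUP/daily.7
for i in 6 5 4 3 2 1; do
    [ -d $BACKUP/daily.$i ] && mv $BACKUP/daily.$i $BACKUP/daily.$((i + 1))
done
# hardlink yesterday's snapshot into daily.1, then let rsync replace only
# the files that actually changed - unchanged files cost no extra space
[ -d $BACKUP/daily.2 ] && cp -al $BACKUP/daily.2 $BACKUP/daily.1
rsync -a --delete /data/ $BACKUP/daily.1/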
I felt like the king of the world. Backups became easy, and I was much more confident with the new backup, as the files could easily be seen on the mounted HD – in contrast to the sequential tape and the binary filesystem dump.
I ran the backups manually during the day, inspected them carefully and was pleased.
It was time for the backup to take place at night. And so it was.
From time to time I would get this in the backup log:
Input/output error
At first I didn’t pay much attention.
WTF?! Are my HDs broken?! No way – they are brand new, and it happened on both of them. But dmesg also showed some nasty messages while accessing the HDs.
I started to trigger the backups manually during the day. Not a single error.
Backups went back to night time.
In the morning I would issue an ls:
# ls /media/backup
Input/output error
# ls /media/backup
daily.1 daily.2 daily.3 daily.4 daily.5 daily.6 daily.7
What the hell is going on around here?! The first command fails but the second succeeds?
The first command would also lag for a while, whereas the second breezed through. Only later did I discover this was the key hint.
My backup creates many links using “cp -al” (in order to preserve snapshots), and I speculated I might be messing up the filesystem structure with too many links to the same inode – unlikely, but I was shooting in all directions; I was clueless.
So there I went, easing up the backups and eliminating the snapshot functionality. Guess what? Still errors on the backup.
What do I do next? Do I stay up at night just to witness the problem in real time?! Don’t laugh – a friend of mine actually had to do that once, on another occasion.
By this time I had already presented the issue to all of my fellow SysAdmin friends. None of them had any idea. I can’t blame them.
I was frustrated; even the archaic tape backups worked better than the HDs. Is newer always better? Perhaps not.
I recreated the filesystem on the portable HDs as ext3 instead of JFS – maybe JFS was buggy?
I’ll save you the trouble: JFS is far from buggy, and the problem had nothing to do with crond either.
We’ll show the unbelievers
For days I’d watch the nightly email the backup would produce, notice the failure and rerun it manually during the day. Until one day.
It struck me like lightning on a sunny day.
The second command would always succeed on the device. What if this HD was a little tired?
What if the portable HD goes to sleep and has problems waking up?
It’s worth trying.
# sdparm --set=STANDBY=0 /dev/sdb
# sdparm --save /dev/sdb
What do you say? – It worked!
It appears that some USB HDs go to sleep and don’t wake up nicely when they should.
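If you want to double check that the page actually stuck, sdparm can read it back:

# sdparm --get=STANDBY /dev/sdb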
Should I file a bug about it? Was it the hardware that malfunctioned?
I was so happy this issue was solved – I never got around to either.
Maybe after crafting this post – it is time to care a little more though.
As the madmen play on words and make us all dance to their song…
I’m sitting at my desk, receiving the nightly email informing me the backup was successful. The portable HDs now also utilize an encrypted filesystem. The backup never fails.
I look at my watch, drink a glass of wine and rejoice.
Introduction
Landing in a new startup company has its pros and cons.
The pros being:
- You can do almost whatever you want
The cons:
- You have to do it from scratch!
The Developers
Linux developers are not dumb. They can’t be. If they were dumb, they couldn’t have developed anything on Linux. They might have been called developers on some other platforms.
I was confronted quite early with the question:
“Am I, as a SysAdmin, going to give those Linux developers root access on their machines?”
Why not:
- They can cause a mess and break their system in a second.
Take, for example, a fellow developer (the chowner) who ran:
# chown -R his_username:his_group *
He came to me saying “My Linux workstation stopped working well!!!”
Later on I also discovered he was at / when performing this command! 🙂
In his defence he added: “But I stopped the command quickly, after I saw the mistake!”
- And there’s no reason #2; I think this is the only real reason, given that these are actually people I generally trust.
Why yes:
- They’ll bother me less with small things such as mounting/unmounting media.
- If they need to perform any other administrative action – they’ll learn from it.
- Heck, it’s their own workstation, if they really want, they’ll get root access, so who am I to play god with them?
Having chosen to let the developers rejoice with root access on their machines, I had to take some proactive measures in order to avoid the unwanted situations I might encounter.
Installation
Your flavor of installation should be idempotent: the user may well destroy his workstation, but you should always be able to reinstall and get back to the same position.
Let’s take for example the chowner developer. His workstation was ruined. I never even thought of starting to change permissions back to their originals – in the long run it would cause much more trouble than good.
We reinstalled his workstation and after 15 minutes he was happy again to continue development.
Automatic network installations are very easy to implement today on Linux. If you don’t have one, you must be living in medieval times or so.
I can give you one suggestion though about partitioning – make sure your developers have a /home on a different partition. It’ll be easier when reinstalling to preserve /home and remove all the rest.
Consolidating software
I consider installing non-packaged software on Linux a very dirty action.
The reasons for that are:
- You can’t uninstall it using standard ways
- You can’t upgrade it using standard ways
- You can’t keep track of it
In addition to installing packaged software, you must also have all your workstations and servers synchronized against the same software repositories.
If user A installs software from repository A and user B from repository B, they might run into different behavior on their software.
Have you ever heard: “How come it works on my computer and doesn’t work on yours??”
As a SysAdmin, you must reduce the chances of this happening to zero.
How do you do it?
Well, on CentOS – use a YUM repository and cache whatever packages you need from the various internet repositories out there.
Debian? – just the same – just with apt.
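On CentOS it can be as simple as the sketch below – assuming reposync/createrepo are available and /var/www/html/repos is served by a local web server; repo ids and paths are placeholders:

#!/bin/bash
# mirror the upstream repositories locally (e.g. from a nightly cron job),
# then point every workstation and server at this copy only
for repo in base updates epel; do
    reposync --repoid=$repo --download_path=/var/www/html/repos/
    createrepo /var/www/html/repos/$repo/
done

Each client then carries a single .repo file pointing at your server, with the upstream repositories disabled.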
Remember – if you have any software on workstations that is not well packaged or not well controlled – you’ll run into awkward situations very soon.
Today
Up until today, Linux developers in my company still possess their root access, but they barely use it. To be honest I don’t think they even really need it. However, they have it. It is also about educating the developers that they are given root access because they are trusted. If they blow it, it’s mostly their fault, not yours.
I’ll continue to let them be root when needed. They have proved worthy so far.
And I’ll ask you another question – do you really think that someone who can’t handle his own workstation can be a good developer? Think again!
Auto configuring complex cluster architectures is a task many of you will probably skip automating, using the excuse that it’s a one time task and it’ll never repeat itself. WRONG!
Being as lazy as I could, and needing to quickly deploy systems for hungry customers, I started out with a small little infrastructure that helped me along the way to auto configure clusters of two nodes or more.
My use cases were:
1. RedHat cluster
2. LVS
3. Heartbeat
4. Oracle RAC
Auto configuring an Oracle RAC DB is not an easy task at all. However, with the proper infrastructure, it can become noticeably easier.
The common denominators for all the cluster configurations I had to carry out were:
1. They had to run on more than one node
2. Apart from entering a password once for all nodes, I didn’t want any interaction
3. They all consisted of steps that should run either on all nodes, or on just one node
4. Sometimes you could logically split the task into a few phases of configuration, making it easier to comprehend the tasks you have to achieve
Even though it is not the tidiest piece of Bash code I’m going to post here, I’m very proud of it, as it saved me countless hours. I give you the skeleton of the code, which is the essence of what I nicknamed Bash RPC. On top of this you should be able to easily auto configure various setups involving more than one computer.
Using it
The sample attached is a simple script that should be able to bake /home/cake on all relevant nodes.
In order to use the script properly, edit it using your favorite editor and stroll through the configuration_vars() function. Populate the array HOST_IP with your relevant hosts.
Now you can simply run the script.
I’m aware of the slight disadvantage that you can’t have your configuration come from the command line; on the other hand, when dealing with big, complex setups, do you really think your configuration can be defined in a single line of arguments?
Taming it
OK, this is the interesting part. Obviously no one needs to bake cakes on his nodes; it is truly pointless and was merely given as an example.
So how would you go about customizing this skeleton to your needs?
First and foremost, we must plan. Planning and designing are the key to every tech related activity you carry out, be it a SysAdmin task or a pure development task. While configuring your cluster for the first time, keep notes of the steps (and commands) you have to go through. Later on, try to logically separate the steps into phases. When you have them all, we can start hacking Bash RPC.
Start drawing your phases and steps, using functions in the form of PHASEx_STEPx.
Fill up your functions with your ideas and start testing! And that’s it!
How does it work?
Simplicity is the key for everything.
Bash RPC can be run in 2 ways – either running all phases (a full run) or running just one step.
If you give Bash RPC just one argument, it assumes it is a function you have to run. If no arguments are given it will run the whole script.
Have a look at run_function_on_node(). This function receives a node and the functions it should run on it. It will copy the script to the destination node and initiate it with the arguments it received.
And this is more or less the essence of Bash RPC. REALLY!
Oh, there’s one small thing. Sometimes you have to run things on just one host; in that case you can add a ___SINGLE_HOST suffix to your steps. This will make sure the step runs on just one host (the first one you defined).
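To make it concrete, the heart of it looks roughly like this – a stripped-down paraphrase of the skeleton, not the file itself:

# $1 - node, $2 - function to run on it
run_function_on_node() {
    local node=$1; shift
    local func=$1; shift
    # copy the script to the destination node and run the requested function
    scp -q $0 root@$node:/tmp/`basename $0` && \
        ssh root@$node /bin/bash /tmp/`basename $0` $func
}

# a full run walks all PHASEx_STEPx functions in order; a ___SINGLE_HOST
# suffix sends the step only to the first host in HOST_IP
for step in `declare -F | awk '{print $3}' | grep '^PHASE' | sort`; do
    if [[ $step == *___SINGLE_HOST ]]; then
        run_function_on_node ${HOST_IP[0]} $step
    else
        for node in ${HOST_IP[@]}; do
            run_function_on_node $node $step
        done
    fi
done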
I’m more than aware that this skeleton of Bash RPC could be polished some more, and indeed I have a list of TODOs for it. But all in all – I really think this one is a big time saver.
Real world use cases
I consider this script a success mainly because of 2 real world use cases.
The first one is the act of configuring Oracle RAC from A to Z. Those of you who have had to go through this nightmare can testify that configuring Oracle RAC takes 2-3 days (a modest estimate) of both a DBA and a SysAdmin working closely together. How about an unattended script running in the background, notifying you 45 minutes later that you have an Oracle RAC ready for service?
The second use case is my good friend and former colleague, Oren Held. Oren could easily take this skeleton and use it for auto configuring a totally different cluster using LVS over Heartbeat. He was even satisfied while using it. Oren never consulted me while performing this task – and that is the great achievement.
I hope you can take this code snippet and customize it for your own needs, continuing with the YOU CAN CREATE A SCRIPT FOR EVERYTHING attitude!!
Have a look at cluster-config-cake.sh to get an idea about how it’s being done.