Thursday, December 31, 2009

Bottled the 80/- today...

and for the first time ever, I didn't get to taste it.  Had exactly enough bottles, and the beer ran out as I was filling the last one.  That never happens!  Well, it happens once, apparently.

Beer tip #1. Remember that bottling session from hell?  The one where the bottle fell over, and the filler broke, and beer was going everywhere, and you were desperately trying to hold the hose closed with one hand while reaching for the rolling bottle full of beer with the other...

Don't remember that one?  Then you haven't been brewing long enough. You will.  Anyway, go buy a hose clamp right now, and put it on your bottling hose.  Then when you have your disaster, just close it.

Tip #2 (also from experience).  Put it on your hose below the lowest level your beer will reach, and you won't have to re-start the siphon.




Wednesday, December 30, 2009

Time to clean up those scripts...

I've been spending the past few days cleaning up scripts - mostly because I don't want to start on any new big projects during a short work week.

Some of my scripts have been hanging around since 1996, and are just plain bad.  It's interesting to look back on code you wrote 15 or 20 years ago, and wonder exactly what the hell you were thinking when you wrote that piece of... but I digress.

A great place to start is David Pashley's 'Writing Robust Shell Scripts' article: http://www.davidpashley.com/articles/writing-robust-shell-scripts.html

Here's my quick list of things to do to clean up scripts.

First, a meta-script thing.  I assume nobody reading this is crazy enough to have '.' in their PATH.  If you do, get rid of it now.

I use the full pathname with every single command in a script.  This eliminates the possibility of someone messing around with your PATH.  If I use a command a lot, I just define it as a variable.  Here's the cron job I use to make sure disk space is OK on the servers:


#!/bin/bash
# check_disk_space - mail root (and the pager) when a filesystem runs low

/bin/rm -f /tmp/disk_space 2>/dev/null
/bin/df -P >/tmp/disk_space
HOST=`/bin/hostname -s`
ECHO="/bin/echo -e"
CUT="/bin/cut"
MAIL="/bin/mail"

{
# skip the df header line...
  read fs
  while read fs
  do
    blocks=`$ECHO $fs|$CUT -f2 -s -d" "`
    if [ "$blocks" != "-" ]; then
      avail=`$ECHO $fs|$CUT -f4 -s -d" "`
      # percentage of the filesystem still free
      let valu='avail * 100 / blocks'
      if [ $valu -gt 0 ]; then
        if [ $valu -lt 8 ]; then
          $ECHO "CRITICAL $HOST disk space!"|$MAIL root pager
        fi
        if [ $valu -lt 13 ]; then
          $ECHO "Check $HOST disk space!"|$MAIL root
        fi
      fi
    fi
  done
} </tmp/disk_space   # feed the saved df output to the loop
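To run it, drop something like this into cron. A sketch - the install path and the every-30-minutes schedule are just placeholders:

# /etc/cron.d/check_disk_space (hypothetical)
*/30 * * * * root /usr/local/sbin/check_disk_space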


I error-check everything.  My one disagreement with Mr. Pashley is over the 'set -e' and 'set -u' constructs - I avoid them, and do explicit error checking instead.  The advantage of the 'set' constructs is that they make it easy to error out of scripts without writing a whole lot of code.  For instance, to use an example from his site:

chroot=$1
..
rm -rf $chroot/usr/share/doc

If you don't pass an argument to the script, $chroot is empty and the rm runs on /usr/share/doc itself - you'll wipe out your documentation.  With 'set -u' at the top of the script, it'll fail instead with

./scriptname: line 15: $1: unbound variable

Great!  But... what if that's a script we really, really need to run correctly?  Like, say, a cron job?  It'll fail, all right, and if you're faithful about reading logs you might catch it.  And if not, it could be not working for a long, long time before it's caught.  Hopefully, not doing anything too important - like backups.

Nope; I want to get hit over the head with a 2x4 if one of my scripts is failing:

MAIL="/bin/mailx"
ECHO="/bin/echo -e"

if [ "$#" -lt 1 ]; then {
  $ECHO "$0: Error: too few arguments.  Exiting..."|$MAIL root pager; exit 1; }
fi

Note the "$0".  Be nice to yourself; let yourself know which script you're getting the error message from.

But if nothing else, use this construct - it'll at least keep you out of serious trouble:

cd /nosuchdirectory || exit 1
rm -rf *
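Or, in keeping with the 2x4 philosophy, mail yourself on the way out.  A variant sketch, reusing the ECHO and MAIL variables from above:

cd /nosuchdirectory || { $ECHO "$0: cd /nosuchdirectory failed!"|$MAIL root; exit 1; }
rm -rf *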


I always use mkdir with the -p switch.  With it, the worst case is that you create a directory you didn't want.  Without it... well:

mkdir /doesntexist/dir   # fails - /doesntexist isn't there
cd /doesntexist/dir      # fails too; you're still in your original directory
{process stuff}
rm -f *                  # ...and this wipes whatever directory you were in

or something similar.  There's not much downside with the -p.
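If you want both the -p and the 2x4, check the result anyway.  A sketch, again using the variables from above:

/bin/mkdir -p /doesntexist/dir || { $ECHO "$0: mkdir failed!"|$MAIL root; exit 1; }
cd /doesntexist/dir || exit 1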

More tomorrow.  Well, next week - the plant's closed the rest of the week.  Happy New Year!

Monday, December 28, 2009

Hints from Lynis

Michael Boelen's Lynis, at http://www.rootkit.nl/projects/lynis.html, is a massively useful security and auditing tool.  It scans through your system and points out the things you should have caught, but that are so easy to miss: bad permissions on /etc/snort.conf, expired SSL certificates, loggers that should be running but aren't, and so on.

Lynis helps by not only telling you what's wrong, but how it found out what's wrong.  For instance:

Warning: pwck found one or more errors/warnings in the password file [test:AUTH-9228] [impact:M]

My early background is Data General Unix, not Linux, and even though I've been running Linux for a decade I learn new tricks every day. I never knew the pwck command existed.  Handy little thing:

[root@roosevelt master]# pwck
user adm: directory /var/adm does not exist
user news: directory /etc/news does not exist
user uucp: directory /var/spool/uucp does not exist
user gopher: directory /var/gopher does not exist

uucp??  gopher??  Yeesh, those must be leftovers from the mid-90s. Let's userdel them right away.
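Getting rid of them is a one-liner apiece (add -r if you want their home directories and mail spools gone too):

/usr/sbin/userdel uucp
/usr/sbin/userdel gopher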

Suggestion: Audit daemon is enabled with an empty ruleset. Disable the daemon or define rules [test:ACCT-9630]

Yeah, it is kinda stupid to have the audit daemon running, with nothing to audit.  Time to read up on my audit.rules syntax.
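For the record, here's the kind of thing that might go into /etc/audit/audit.rules - my example, not Lynis's suggestion:

# watch the password files; -p wa = write and attribute changes,
# -k tags the records so you can find them with 'ausearch -k'
-w /etc/passwd -p wa -k passwd_changes
-w /etc/shadow -p wa -k shadow_changes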

Suggestion: Check file permissions of /etc/squid/squid.conf to limit access [test:SQD-3613]

Whoops, world writable - wonder how that happened?  Easy fix.
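Something like this does it (the group name may differ on your system):

/bin/chmod 640 /etc/squid/squid.conf
/bin/chown root:squid /etc/squid/squid.conf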

Suggestion: Add legal banner to /etc/motd, to warn unauthorized users [test:BANN-7122]

Nah.  I've always thought that the legal notice thing was stupid.  It's not a binding contract - does anyone think that putting 'Anyone sending me unsolicited email owes me $10,000!' at the bottom of your web page will actually let you collect anything?  It's a waste of electrons.  But it's easy to make Lynis stop flagging it - just skip the test in the default.prf file, like so:

config:test_skip_always:BANN-7122:

Anyway, you get the idea.  It's easy to configure, easy to run once a week or so, and it catches the things that fall through the cracks.  Nice tool to have.  Thanks, Michael!

Tuesday, December 22, 2009

RHEL, KVM virtualization, and migrating from Xen

My Christmas project is to convert the half-dozen or so Xen virtual guests we have over to KVM.

Why KVM?  It's the future for Red Hat; it's lightweight; and it's included in the mainline kernel, so you don't need to run a special Xen kernel.

My first project was to take my Dell server that's now running Xen and a few Windows instances, add a RHEL image or two, and see how much work it would be to get everything switched over.  The official line is that while it's possible to migrate a Xen host to KVM, it's not possible to migrate a guest.  We'll see.

The place to start is the brand new RHEL Virtualization Guide, here:

http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Virtualization_Guide/index.html

Well, no.  The place to start was very, very good backups.  I'm running these on loop disks; I backed up the loopback files first.

On the host, I did a 'yum install kvm' to get those utilities. Then I did a 'yum install kernel kernel-devel' on both the host and the guest, changed grub.conf to boot the new non-Xen kernel, and rebooted the host.
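To recap, the same commands in one place:

# on the host:
yum install kvm
# on both host and guest:
yum install kernel kernel-devel
# then point the 'default' line in grub.conf at the non-Xen kernel entry, and reboot the host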

Probably the biggest change on the host is networking.  You'll have to use either NAT or bridging, and since I didn't want to add Yet Another Layer of NAT to my network, I chose bridging.

/etc/sysconfig/network-scripts/ifcfg-eth1 looks like this originally:

DEVICE=eth1
BROADCAST=192.168.1.255
IPADDR=192.168.1.140
NETMASK=255.255.255.0
NETWORK=192.168.1.0
ONBOOT=yes
GATEWAY=192.168.1.250
TYPE=Ethernet

I had to comment out the IPADDR line, and add to the bottom:


BRIDGE=br1

Then add ifcfg-br1:

DEVICE=br1
BROADCAST=192.168.1.255
IPADDR=192.168.1.140  
NETMASK=255.255.255.0
NETWORK=192.168.1.0 
ONBOOT=yes
GATEWAY=192.168.1.250
TYPE=Bridge

(NOTE: Case is important here.  It must be a capital 'B', lowercase 'ridge'.)

And restart the network.  The routing table will now look like this:



Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 br1
default         saratoga1.denma 0.0.0.0         UG    0      0        0 br1

Add this to /etc/sysctl.conf:

net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

and then

sysctl -p /etc/sysctl.conf

This is so you don't have to put rules into iptables to forward the bridge traffic.


OK, new kernel installed, and networking fixed.  Now it's time to build a guest xml file. As I said, I've got a loopback file for the Xen image already, sitting in /xmdata1.  So let's import that file:


virt-install --name=zimbra --ram=2048 --vcpus=2 --check-cpu --accelerate \
    --file=/xmdata1/xen-zimbra --vnc --import

The '--import' tells virt-install to use the existing disk file instead of creating a new one.  You've now got a file in /etc/libvirt/qemu called zimbra.xml.

The '--accelerate' switch is important.  Without it, the guest gets created as domain type='qemu', which caused each guest to use 100% of the host's CPU and run bog slow.  If you forget the switch, edit the xml file in /etc/libvirt/qemu and change the type to 'kvm' - but when you change it by hand, remember to restart libvirtd ('/etc/init.d/libvirtd restart').
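For reference, the line in question sits right at the top of the domain file.  What zimbra.xml should start with after the fix:

<domain type='kvm'>  <!-- 'qemu' here is what makes it crawl -->
  <name>zimbra</name>
  ...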

You control the guests using the 'virsh' command.  So after I'd done all this, I started up zimbra by doing

virsh start zimbra

And that was it.  Didn't have to do a thing on the guest side, except add a normal kernel.  It's been running two days now without a hitch.  To make sure the guest starts up when your host starts, simply do a

virsh autostart kantech
Domain kantech marked as autostarted

and I'm done.

But that was the easy part, right? I had two more guests to get going - a Windows 2000 Server loop disk, which isn't even officially supported, and a 2008R2 64-bit server on a real disk. Those were going to be the fun ones.  I budgeted two days, since I'm an optimist.

For the Win2K server, I did the virt-install exactly as above, started it from the console, and... it worked.  The stupid thing booted, installed all new drivers, rebooted, and it's been running ever since.

The 2008R2 server was slightly more tricky.  Because of the 'real' disk drives, I had to do this:

virt-install --name=antivirus --ram=2048 --vcpus=2 --check-cpu --accelerate --disk path=/dev/sdb1 --disk path=/dev/sdb2 --disk path=/dev/sde1 --import

Did that; did the start, and... this one didn't even bother to install new drivers.  It just ran.

So the week-long installation and debugging process has taken about three hours.  I'm going to spend the rest of the week catching up on BOFH episodes.



Friday, December 18, 2009

Problems with RHEL Xen and rhn update

I'm running a Dell 2950 with RHEL 5.3 Xen as my virtualization host, with a half-dozen guests.  There's a plan in place to move them all to kvm - but that's another post.

Yesterday, I got this ominous message when doing a redhat-update:

Error Message:
    Abuse of Service detected for server defiant.denmantire.com
Error Class Code: 49
Error Class Info:
     You are getting this error because RHN has detected an abuse of
     service from this system and account. This error is triggered when
     your system makes too many connections to Red Hat Network. This
     error can not be triggered under a normal use of the Red Hat Network
     service as configured by default on Red Hat Linux.

     The Red Hat Network services for this system will remain disabled
     until you will reduce the RHN network traffic from your system to
     acceptable limits.

 Abuse of Service?? Oh, no!  What have I done!

Well... nothing.  Red Hat did.

There's a cron job installed in /etc/cron.d when you're a Xen host, called rhn-virtualization.cron.  It monitors the hypervisor and notifies Red Hat Network if there's been a change in status in any of the virtual guests, so that RHN can make sure your systems are up to date.  Sounds logical.

In a recent update, this was changed.  It now reports back to the mothership any time any of your guests flips from blocked to running, and vice versa.  In other words, lots.  Really lots - any time a guest is waiting for keyboard input, or disk IO, or CPU, or...

So if your machine checks in more than 100 times per day after the first 1500 checkins, you get flagged as abusive.  And with a half-dozen machines running full out, I'm certain that my machines are checking in more often than that.  Boom, error 49, and it's a real pain to get your machines re-registered.

The suggested fix is to change the cron job to run once an hour or so.

My suggested fix is to nuke the cron job, and run

/usr/bin/yum update --security -y

as a daily cron job.
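A sketch of what that might look like as a cron.d entry (the file name and the 4:30 AM run time are just placeholders):

# /etc/cron.d/security-updates (hypothetical)
30 4 * * * root /usr/bin/yum update --security -y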

Thursday, December 17, 2009

OK, now for the beer part...

I've been homebrewing for close to twenty years now.  I've bought all-grain kits from a number of stores, but the only one that I've seen 'open source' their kits is Northern Brewer (http://www.northernbrewer.com).  Click on any of their all-grain kits, and you'll see a 'Kit Inventory Sheet', which tells you precisely what's in the kit.  For instance, last week I did their Scottish 80.   If you go here:


http://legacy.northernbrewer.com/docs/kis-html/1153.html

you get exact ingredients, recommended yeast, times and temperatures, etc.

Talk about 'free, as in beer'!

I added 2 oz. of peated malt I had hanging around, just to give it a slight smoky flavor.

Note the extremely high mash temperature - 158 degrees.  Remember, 'MALT' - More Alcohol, Lower Temperature.  This should produce a beer with a huge malty flavor, exactly what you want in a Scottish.  I used the Wyeast 1728, and we'll see what happens.

Wednesday, December 16, 2009

BIND 9 - remember to disable dynamic updates!

The nixCraft newsletter, http://www.cyberciti.biz, has some really handy tips. Today's was a BIND 9 'feature' that I didn't have disabled, but should have. BIND 9 allows you to update master zones on a nameserver with the allow-update statement. Bad idea for many reasons, even if you specify allowed addresses. So for security's sake, put

allow-update { none; };

into each zone statement in your named.conf.
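A minimal sketch of where it lands ('example.com' and the file name are placeholders for your own zones):

zone "example.com" IN {
    type master;
    file "example.com.zone";
    allow-update { none; };
};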

When I'd done this, and done a 'rndc reload', I noticed the following in my log file:

Dec 15 09:02:05 challenger named[3127]: the working directory is not writable

Hmm. A little googling told me that named's working directory had to be group writable, and mine wasn't. So:

chmod g+w /var/named/chroot/var/named/

Of course, we're running chroot'ed. If you're not, you should be.

Scripting FTP, with error checking

FTP is great for sending data back and forth. It's easy to automate, and there are many sites out there that will tell you how to do so. But it's a real pain to get any error checking going.

Allow me to demonstrate. Here's a quick script:

#!/bin/bash
# send_ftp_copies

HOST='www.google.com'
USER='myusername'
PASSWD='mypassword'

if [ "$1" = "" ]
then
    echo "$0: Sorry, must be specify filename. Exiting..."|mail root
    exit 1
else
    FILENAME=$1
fi

if test ! -f "/tmp/$FILENAME"
then
   echo "$0: File $FILENAME does not exist. Exiting..."|mail root
   exit 1
fi

ftp -n $HOST << EOF
    user $USER $PASSWD
    cd /myinbox
    lcd /tmp
    put $FILENAME
    quit
EOF

You'd invoke this like

send_ftp_copies FILENAME

It would check that the file exists, execute the commands up to the EOF, and quit.

Ah, but what if you wanted to check for an error? Something simple, like not being able to connect.

The usual way would be to do something like this:

if [ "$?" -ne 0 ]; then
    echo "Command failed: $?. Exiting..."|mail root
    exit 1
fi

Unfortunately, this doesn't work. FTP, as far as I can tell, never returns an error to the shell.

So let's redirect the output. Let's do something like this:

ERROR_FILE="/tmp/ftp_error$$"
ftp -n $HOST 2> "$ERROR_FILE" << EOF
...
EOF

if [ -s "$ERROR_FILE" ] ; then
    echo "ftp transfer failed! Error is in $ERROR_FILE"
    exit 1
fi

and see what happens when I connect to a site (mine) that has no FTP server running. Sure enough:

ftp transfer failed! Error is in /tmp/ftp_error17237

cat /tmp/ftp_error17237
ftp: connect: Connection timed out

Hah! Worked. So, I put it into production and... got an error message each and every time I had a connection, even a valid one.

It seems that most Linux systems try Kerberos authentication first. So if the target site doesn't do Kerberos, you'll get the message

KERBEROS_V4 rejected as an authentication type

and then it'll try password authentication, which will work. But in the meantime, your error file now has that message in it, and your script will fail.

grep to the rescue:

grep -v KERBEROS_V4 "$ERROR_FILE" > "$ERROR_FILE.1"

if [ -s "$ERROR_FILE.1" ] ; then
    echo "ftp transfer failed! Error is in $ERROR_FILE.1"
    exit 1
fi

which works just fine.
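One more nicety worth adding - a quick sketch using bash's trap, so /tmp doesn't slowly fill up with ftp_error files:

# remove the scratch files no matter how the script exits
trap '/bin/rm -f "$ERROR_FILE" "$ERROR_FILE.1"' EXIT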