Friday, January 29, 2010

Hard vs. symbolic links

I don't know why I don't use many hard links.  Habit, I guess. I'll try to break it.

A symbolic link is a special file that points to another file, e.g.

lrwxrwxrwx 1 root root 4 Jun 23  2007 awk -> gawk*

Delete gawk, and awk points nowhere - and won't work.  Delete awk, and nothing at all happens to gawk.

A hard link is simply a file that points to the same inode as another.  In effect, it's just another way of referring to the same file.  You can't tell by looking at it that it's a hard link, and if you delete it - or the file it's linked to - nothing happens to the other file.  The only clue you've got is by looking at the link count, that column in 'ls -l' that nobody knows what it is:

-rwxr-xr-x 3 root root 62872 Jan 14 14:06 gunzip
-rwxr-xr-x 3 root root 62872 Jan 14 14:06 gzip
-rwxr-xr-x 3 root root 62872 Jan 14 14:06 zcat

Each has a link count of 3.  That means that there are three different names sharing the same inode.  And, sure enough:

[root@dg bin]# ls -i *z*
319701 gunzip  319701 gzip  319701 zcat
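The difference between the two kinds of link is easy to check in a scratch directory. This is only a minimal sketch; the file names (original, hard, soft) are made up for the illustration.

```shell
#!/bin/bash
# Scratch directory so nothing real is touched; all names here are illustrative.
workdir=$(mktemp -d)

echo "hello" > "$workdir/original"
ln "$workdir/original" "$workdir/hard"   # hard link: a second name for the same inode
ln -s original "$workdir/soft"           # symbolic link: a tiny file holding the name "original"

# Both hard-linked names report the same inode number, and a link count of 2.
inode_original=$(ls -i "$workdir/original" | awk '{print $1}')
inode_hard=$(ls -i "$workdir/hard" | awk '{print $1}')
link_count=$(ls -l "$workdir/hard" | awk '{print $2}')
echo "inodes: $inode_original $inode_hard, link count: $link_count"

rm "$workdir/original"                   # remove one of the names

# The hard link still reads fine; the symbolic link now points nowhere.
hard_contents=$(cat "$workdir/hard")
if [ -e "$workdir/soft" ]; then soft_state="ok"; else soft_state="dangling"; fi
echo "hard link says: $hard_contents; symlink is $soft_state"
```

The data only goes away when the last name pointing at the inode is removed.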

So why have three 'identical' files with different names?  Well, here's what I do with one of my scripts.

[root@dg scripts]# ls -lai start*
1357990 -rwxr-xr-x 3 root root 446 Jan 28 16:44 start
1357990 -rwxr-xr-x 3 root root 446 Jan 28 16:44 start_this
1357990 -rwxr-xr-x 3 root root 446 Jan 28 16:44 start_that

Same file, hard linked.



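 # $ICDIR, $ECHO and $MAIL are assumed to be set in the environment
 cd $ICDIR||{ $ECHO "$0 failed chdir"|$MAIL tim;exit 1; }
 CALLED_NAME="`basename $0`"
 case $CALLED_NAME in
   start_this) : ;;   # steps for start_this go here
   start_that) : ;;   # steps for start_that go here
   *)          : ;;   # plain start
 esac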


So all I do is see how it's called, and operate accordingly.
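Here is a self-contained sketch of that name-based dispatch. It builds a throwaway script and two hard links in a temp directory, then calls it under each name. The messages and branch bodies are invented for the demo and are not what my real script does.

```shell
#!/bin/bash
# Build a throwaway script plus two hard links, then call it by each name.
# The names and messages are illustrative only.
demo_dir=$(mktemp -d)

cat > "$demo_dir/start" <<'EOF'
#!/bin/bash
# Decide what to do from the name we were invoked as.
case "$(basename "$0")" in
  start_this) echo "starting this" ;;
  start_that) echo "starting that" ;;
  *)          echo "starting everything" ;;
esac
EOF
ln "$demo_dir/start" "$demo_dir/start_this"
ln "$demo_dir/start" "$demo_dir/start_that"

# Invoked through bash so the demo also works where /tmp is mounted noexec.
out_plain=$(bash "$demo_dir/start")
out_this=$(bash "$demo_dir/start_this")
out_that=$(bash "$demo_dir/start_that")
printf '%s\n' "$out_plain" "$out_this" "$out_that"
```

One file to maintain, three behaviors, and a fix to the shared code shows up under every name at once.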

I believe that in dg/ux, cp was a hard link to mv.

Thursday, January 28, 2010

Wrong files in the /etc/rcx.d directories

I was looking for something in /etc the other day, and didn't know if it was there or in a subdirectory.  So I did a grep -r:

[root@dg etc]# fgrep -ir "testing 1 2 3" *
fgrep: rc0.d/K88syslog: No such file or directory
fgrep: rc1.d/K88syslog: No such file or directory
fgrep: rc2.d/K88syslog: No such file or directory
fgrep: rc2.d/S40snortd: No such file or directory
fgrep: rc2.d/S12phone_log: No such file or directory
fgrep: rc2.d/S55zabbix_agentd: No such file or directory
fgrep: rc3.d/K88syslog: No such file or directory

Whoops.  That's not good.

Linux uses AT&T Sys V-type directories for startup and shutdown, and links those files to /etc/init.d.  That means that under /etc/rc.d, you've got a bunch of directories corresponding to the various runlevels:

[root@dg ~]# ls -lad /etc/rc.d/rc*.d
drwxr-xr-x 2 root root 4096 Dec  8 15:32 /etc/rc.d/rc0.d
drwxr-xr-x 2 root root 4096 Dec  8 15:32 /etc/rc.d/rc1.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc2.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc3.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc4.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc5.d
drwxr-xr-x 2 root root 4096 Dec  8 15:32 /etc/rc.d/rc6.d

and in each of those directories, you've got a link to init.d for the various start and stop scripts:

[root@dg ~]# ls -la /etc/rc.d/rc5.d/S*|more
lrwxrwxrwx 1 root root   22 Dec  8 15:32 /etc/rc.d/rc5.d/S02lvm2-monitor -> ../i
lrwxrwxrwx 1 root root   17 Jun 24  2007 /etc/rc.d/rc5.d/S03sysstat -> ../init.d
lrwxrwxrwx 1 root root   18 Jun 23  2007 /etc/rc.d/rc5.d/S08iptables -> ../init.

That means that at runlevel 5,  lvm2-monitor will start first, followed by sysstat, iptables, etc.  The down scripts are prefixed with a 'K', and work the same way:

[root@dg ~]# ls -la /etc/rc.d/rc5.d/K*|more
lrwxrwxrwx 1 root root 20 Feb  7  2008 /etc/rc.d/rc5.d/K00xendomains -> ../init.
lrwxrwxrwx 1 root root 17 Jan 31  2009 /etc/rc.d/rc5.d/K01dnsmasq -> ../init.d/d
lrwxrwxrwx 1 root root 24 Sep  6 17:26 /etc/rc.d/rc5.d/K01setroubleshoot -> ../i

So anyway.  What did those error messages in grep tell me?  They say that I have links in the various rcx.d directories that point to a nonexistent file in /etc/init.d.  Not a real problem, because the link will just fail to do anything - but something that really should be cleaned up.

However, while I was checking things out, I spotted something that potentially could be a real problem.

[root@dg init.d]# ls -la /etc/rc2.d/S99ossec
-r-xr-xr-x 1 root root 1087 May  9  2006 /etc/rc2.d/S99ossec

That's not a link - it's really a file!

One of two things can happen in this case, both of them bad.  If you've made a change to the file in /etc/init.d, it won't be reflected in the level 2 startup.  Or, worse - if you've removed the app and deleted the file in /etc/init.d, it could be running something you don't want to run.
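Both conditions (links that point nowhere, and real files sitting where a link should be) can be reported without changing anything. This is a rough sketch run against a mock tree; audit_rc is a hypothetical helper name, and the mock file names are invented.

```shell
#!/bin/bash
# Audit sketch: report entries in rc*.d directories that are either regular
# files (should be links) or links whose target is missing. Read-only.
# audit_rc is a hypothetical helper; pass it the directory that holds rc*.d.
audit_rc() {
  local base=$1 dir
  for dir in "$base"/rc*.d; do
    [ -d "$dir" ] || continue
    # Regular files where a symlink is expected.
    find "$dir" -type f -print | sed 's/^/regular file: /'
    # With -L, find follows links, so anything still reported as type l is broken.
    find -L "$dir" -type l -print | sed 's/^/dangling link: /'
  done
}

# Try it on a mock tree rather than the real /etc/rc.d.
mock=$(mktemp -d)
mkdir -p "$mock/init.d" "$mock/rc2.d"
echo "#!/bin/sh" > "$mock/init.d/good"
ln -s ../init.d/good "$mock/rc2.d/S10good"    # healthy link
ln -s ../init.d/gone "$mock/rc2.d/K88gone"    # target does not exist
echo "#!/bin/sh" > "$mock/rc2.d/S99copy"      # a real file, not a link

report=$(audit_rc "$mock")
echo "$report"
```

Running the report first makes it easier to decide whether an automatic cleanup is safe.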

My cleanup script looks like this:

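#!/bin/bash
# cleanup_rc
# $ECHO and $MAIL are assumed to be set in the environment

 cd /etc/rc.d||{ $ECHO "$0 failed chdir"|$MAIL tim;exit 1; }
 DIRS="`find . -name "rc*.d" -type d`"
 for i in $DIRS
 do
   cd /etc/rc.d
   cd $i||continue
   # Replace regular files with links to init.d.
   # ${FNAME:3} strips the three-character prefix (S99, K88, ...).
   FILES="`find . -type f`"
   for j in $FILES
   do
      FNAME="`basename $j`"
      rm -f $FNAME
      ln -s ../init.d/${FNAME:3} $FNAME
   done
   # With -follow, only links with a missing target still show up as type l.
   FILES="`find . -type l -follow`"
   for j in $FILES
   do
      rm -f $j
   done
 done
 exit 0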

It finds all regular files in /etc/rc.d/rc*.d, deletes each one, and creates a link to init.d in its place.  Then, it finds all links whose target doesn't exist in init.d, and deletes them.

Works like a charm:

[root@dg init.d]# cleanup_rc
[root@dg init.d]# ls -la /etc/rc2.d/S99ossec
lrwxrwxrwx 1 root root 15 Jan 28 11:29 /etc/rc2.d/S99ossec -> ../init.d/ossec
[root@dg init.d]# cd /etc
[root@dg etc]# fgrep -r "testing 1 2 3" *
[root@dg etc]#

Tuesday, January 26, 2010

bash numeric comparison

Ah, just ran into the dreaded bash comparison error again.  And I know better.  But in my defense, I'm cleaning up some scripts circa 1996.

OK, there is no good defense.  This is a miserable excuse for a loop.


  until test -z "`ps -e|grep icexec`"
  do
    {some stuff you want to do here}
    if [[ $loops > 10 ]]
    then
      exit 0
    fi
    let loops='loops + 1'
  done

This is just bad in so many ways, it's hard to decide where to start.  Let's go with the math functions first.

if [[ $loops > 10 ]] just doesn't work.  Inside [[ ]], > is a string comparison, so the loop exits as soon as loops reaches 2, since '2' sorts after '1'.  Change it to if [[ $loops -gt 10 ]].  Or, better still, use two parentheses, and make it if ((loops > 10)).  Much cleaner.
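The three forms are easy to compare side by side. A small sketch, with loops fixed at 2 purely for the demonstration:

```shell
#!/bin/bash
# Compare the three tests with loops=2 against a limit of 10.
loops=2

# Inside [[ ]], > compares strings: "2" sorts after "10", so this is true.
if [[ $loops > 10 ]]; then string_cmp="true"; else string_cmp="false"; fi

# -gt compares integers: 2 is not greater than 10.
if [[ $loops -gt 10 ]]; then gt_cmp="true"; else gt_cmp="false"; fi

# (( )) is an arithmetic context, so > is numeric here too.
if ((loops > 10)); then arith_cmp="true"; else arith_cmp="false"; fi

echo "[[ \$loops > 10 ]]   -> $string_cmp"
echo "[[ \$loops -gt 10 ]] -> $gt_cmp"
echo "((loops > 10))      -> $arith_cmp"
```

Only the first form gets it wrong, and it does so silently.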

Incrementing the loop.  This works, but I hate the 'let' command.  It's so damn picky about whitespace.  Change it to parentheses again, ((loops = loops+1)).  Or, again better, ((loops++)).

So your code is now

  until test -z "`ps -e|grep icexec`"
  do
    {some stuff you want to do here}
    if ((loops > 10))
    then
      exit 0
    fi
    ((loops++))
  done

Better.  But why bother with the inner 'if...then'?

  until test -z "`ps -e|grep icexec`"||((loops++ > 10))
  do
    {some stuff you want to do here}
  done

Wednesday, January 20, 2010

bash error checking

Just had an interesting one bite me.

I like to do a lot of error checking in bash.  There are so many easy ways to destroy a system; something like this:

cd /nonexistentdirectory
rm -f *

can just ruin your whole day.  So I usually do something simple, like:

cd /nonexistentdirectory||exit 1
rm -f *

That works.  But if you've got it in the middle of a script, you'll never know about it.  The obvious thing to do is to mail yourself a message.

cd /nonexistentdirectory||(echo "$0 failed chdir"|mailx tim;exit 1)

This works fine.  That is, it emails me a nice error message - and then proceeds to destroy all of my files.  See, the parentheses toss up a subshell - a completely separate instance of bash.  Here's a little experiment.

  TESTVAL="testing"
  echo "testval before=$TESTVAL"

  cd /nonexistentdirectory||(TESTVAL="not a test";echo "testval=$TESTVAL")

  echo "testval after=$TESTVAL"

Running it shows:

testval before=testing
/usr/local/scripts/verify_full: line 25: cd: /nonexistentdirectory: No such file or directory
testval=not a test
testval after=testing

So my variable gets changed to what I want it to - and then as soon as I drop out of that subshell, it changes back.  My 'exit 1' had no effect at all - it dropped me out of the subshell (which I was about to exit anyway) back to the main shell, which then proceeded to destroy all of my files.

Curly braces to the rescue.

cd /nonexistentdirectory||{ TESTVAL="not a test";echo "testval=$TESTVAL"; }

testval before=testing
/usr/local/scripts/verify_full: line 25: cd: /nonexistentdirectory: No such file or directory
testval=not a test
testval after=not a test

Now, the 'exit 1' really will drop you out of the whole script.  Your files are saved!

One gotcha - note the semicolon before the closing brace.  If you forget to put it there, you will search for a long time trying to find out why the script is failing.
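One way to sidestep both the subshell trap and the forgotten semicolon is to move the error handling into a function, since a function runs in the current shell. This is a sketch only; die is a hypothetical helper name, and the command substitution is there just so the demo can capture what happens.

```shell
#!/bin/bash
# die is a hypothetical helper. A function body runs in the calling shell,
# so its exit ends the script, just like the curly-brace version.
die() {
  echo "$0: $*" >&2      # this could just as well be piped to mailx
  exit 1
}

# The $( ) wrapper exists only to capture the output and exit status for the
# demo; in a real script the two lines inside would sit at the top level.
status=0
result=$(
  cd /nonexistentdirectory 2>/dev/null || die "failed chdir"
  echo "rm would have run here"
) || status=$?

echo "exit status: $status, output: '$result'"
```

The line after the failed cd never runs, so there is nothing in the captured output.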

Tuesday, January 19, 2010

Quick perl script to convert .csv to Excel .xls

I've got a number of programs that write out data in .csv format, for those customers who want to import it into a spreadsheet. It would be much more convenient if I could actually write it out as an Excel spreadsheet, for a number of reasons. There are a couple of Perl/Python programs out there, but they're usually way too complicated - they're designed to take something you have no control over and fix it. This is a quick and dirty perl script designed to take a well-formed .csv file, and convert it.


use strict;
use warnings;
use Text::CSV_XS;
use Spreadsheet::WriteExcel;
my $row = 0;

my $infile = $ARGV[0] or die "usage: $0 file.csv\n";

open (my $fh, '<', $infile) or die "$infile: $!";

# foo.csv becomes foo.xls; anything else just gets .xls appended
my $outfile = "$infile";
if ($infile =~ /\.csv$/) {
    $outfile =~ s/\.csv$/.xls/;
}
else {
   $outfile = $infile . ".xls";
}

my $workbook = Spreadsheet::WriteExcel->new($outfile);
my $worksheet = $workbook->add_worksheet("sheet1");

my $csv = Text::CSV_XS->new ({ binary => 1, allow_whitespace => 1 });

while (my $rowdata = $csv->getline ($fh)) {
    my @rowdata = @$rowdata;
    foreach my $col (0 .. $#rowdata) {
        # test for defined, so a literal 0 is not turned into an empty cell
        my $field = defined $rowdata[$col] ? $rowdata[$col] : "";
        $worksheet->write($row, $col, $field);
    }
    $row++;
}

close $fh or die "$infile:  $!";
$workbook->close();

Note the 'allow_whitespace' parameter on the Text::CSV_XS constructor.  This is one that's driven a lot of folks nuts.  Strict CSV grammar says that there is no whitespace in between delimiters, i.e.


However, if you're writing out numbers with a trailing sign (did I mention I'm a COBOL programmer?), you're going to get a space there with positive numbers:

"testing",1 ,2 , etc.

Excel will see these as 'Number stored as text' and give you an error, and refuse to format them nicely.  The 'allow_whitespace' parameter eliminates this.

Friday, January 15, 2010

Generating ssl keys and certificates

This is something I only do every two years, so it's a learning experience each time.  And GoDaddy has plenty of instruction help for Plesk, Apache, etc. - but none for what I use it for, secure email.  So let's get it documented.  This isn't for self-signed certificates; there are plenty of tutorials around for that.

Go to /etc/ssl and run

openssl genrsa -des3 -out

Number of bits has gone from 1024 minimum in the last few years to 2048 minimum.  And 4096 is probably better.  RSA-129 (129 digits) was once thought unbreakable - remember Squeamish Ossifrage?  But RSA-768 was cracked this year.  For a really good site on this, see

Anyway.  It’ll ask for a passphrase – go ahead and pick anything; we’ll remove it in a minute.

After the key has been generated, do this:

openssl rsa -in -out your.key

It’ll ask for the passphrase once, and then write out a new key without it.  If you do not strip the passphrase, you’ll be required to re-enter that passphrase every time you re-start a program that uses the certs (i.e., every reboot for every system).  However, this is insecure – if this key gets compromised, you’ll have to revoke the certificate.  That’s why I make all the files in the directory chmod 400.

Then, you’ll use that private key to generate a Certificate Request:

openssl req -new -key your.key -out your.csr

You then submit that csr file to the certificate provider.  GoDaddy sends back a zip file with the .crt file inside it.

To make a .pem file, simply use a text editor to put the .key file on top, a blank line, and the .crt on the bottom.  So, for instance, stunnel uses the ipop3d.pem file to allow me to pop in to my mailbox.  The file looks like this:
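The layout is just the key, a blank line, and the certificate, so the same file can also be assembled with cat instead of a text editor. This is a minimal sketch: make_pem is a hypothetical helper, and the demo uses stand-in files where the real key and certificate would go.

```shell
#!/bin/bash
# Assemble key + blank line + certificate into one .pem, readable by owner only.
# make_pem is a hypothetical helper; the file names are whatever you pass in.
make_pem() {
  local key=$1 crt=$2 pem=$3
  # umask in a subshell so the new file is never group/world readable,
  # without changing the umask of the calling shell.
  ( umask 077
    { cat "$key"; echo; cat "$crt"; } > "$pem" ) || return 1
  chmod 400 "$pem"
}

# Demo with stand-in files; real ones come from openssl and the certificate provider.
pem_dir=$(mktemp -d)
echo "KEY PLACEHOLDER"  > "$pem_dir/your.key"
echo "CERT PLACEHOLDER" > "$pem_dir/your.crt"
make_pem "$pem_dir/your.key" "$pem_dir/your.crt" "$pem_dir/ipop3d.pem"

pem_lines=$(wc -l < "$pem_dir/ipop3d.pem")
first_line=$(head -n 1 "$pem_dir/ipop3d.pem")
last_line=$(tail -n 1 "$pem_dir/ipop3d.pem")
pem_perms=$(ls -l "$pem_dir/ipop3d.pem" | cut -c1-10)
echo "$pem_lines lines, permissions $pem_perms"
```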



Thursday, January 14, 2010

Random disk corruption during reboot

This one's been driving me crazy for more than a year.  Randomly, during a clean reboot, about half my systems will fail to come back up.  I'll do a 'shutdown -r'; the system will start again, but then I'll be told that /tmp - and it's always /tmp - needs checked manually:

and I get the dreaded

Give root password for maintenance (or type Control-D to continue)

So I'll go to the console, type 'fsck -y' because I really don't care what the errors are - nobody but Ted Ts'o could fix them by hand anyway - it finds two or three errors, and I'm back in business.

Not a big deal if I'm on site, but a real pain if I need to reboot remotely and the system won't come up by itself.  It's not reproducible, and darn near impossible to troubleshoot.

I've decided to try two things.  #1 is a suggestion from Mr. Ts'o in another context - add the 'sync' parameter to the fstab entry.  Ext3 syncs data to disk every 5 seconds; the sync parameter makes it write to disk immediately.  Since it's only /tmp, I don't think it'll have a horrible effect on the system.  We'll see.

The other is to always remember to run the 'sync' command before shutting down - it'll flush the buffers to the disk.  Since I'll never remember that, I'll add the command to the /etc/init.d/halt file.

Wednesday, January 13, 2010

More SpamAssassin

Just noticed this on the SARE (SpamAssassin Rules Emporium) at

Seems like good advice:

IMPORTANT: Due to Ninjas being busy with lives, wives & hockey matches, SARE rules aren't being updated.

There is no need to run automated update tools as all they will produce is useless load on everybody's servers.

Any updates will be announced on the SpamAssassin Users Mailing List.

Tuesday, January 12, 2010

Today's date

Among the things that I rarely admit in public is the fact that although the platform I'm using is RHEL5.4 Linux, the code I'm writing is... COBOL.  I brought Denman's first computer online in 1980, and there just weren't a lot of other choices.  And in the intervening years, there really hasn't been a compelling reason to switch languages, so I've kept writing.

Anyway, I was in the debugger this morning, and had to manipulate the date in YYMMDD form.  It came out


which is right - but I had to look at it a whole bunch of times.  It still looks like 47 octal to me.

Friday, January 8, 2010

kvm disk comparison

I used bonnie++ to get some disk benchmarks.  The virtual io is supposed to be much faster than the ide emulation, so let's take a look.

virtio (loop):

Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
buran.denmantire 3G  8321  27 16886  13 13128   9 30720  46 15277   2 538.5   4
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 14308  94 +++++ +++ +++++ +++  7738  58 +++++ +++ +++++ +++
,3G,8321,27,16886,13,13128,9,30720,46,15277,2,538.5,4,16,14308,94,+++++,+++,+++++,+++,7738,58,+++++,+++,+++++,+++

ide (loop):

Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
buran.denmantire 3G  8164  12 16475   3 19955   7 45174  68 71586   6 992.4  13
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 15543  90 +++++ +++ +++++ +++ 15183  84 +++++ +++ +++++ +++
,3G,8164,12,16475,3,19955,7,45174,68,71586,6,992.4,13,16,15543,90,+++++,+++,+++++,+++,15183,84,+++++,+++,+++++,+++

Hmmm.  Slightly faster, yes, but not the huge increase I was expecting.  Of course, that's with loop disks.  I'll try it sometime next week with a physical partition.

Thursday, January 7, 2010

Making chkrootkit a little more readable

I use cron to run a couple of rootkit checkers daily, rkhunter and chkrootkit.  chkrootkit is nice, but it's a bit paranoid about weird and hidden files.  On the other hand, I probably want my rootkit checker to be a bit paranoid.

It's got a line of code to look for hidden files:

files=`${find} ${DIR} -name ".[A-Za-z]*" -o -name "...*" -o -name ".. *"` 

and then it just does an echo to display them.  Well, that's fine if you've got one or two, but if you've got a couple of dozen this is almost unreadable.  See, displaying like that will display all of the filenames with a space between them, and no newline, like so:
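The run-together output comes from echoing the variable without quotes, which is general shell behavior and not specific to chkrootkit. A small sketch with three made-up paths standing in for find's output:

```shell
#!/bin/bash
# Three made-up paths, one per line, standing in for find's output.
files=$(printf '%s\n' /tmp/.a /tmp/.b /tmp/.c)

unquoted=$(echo $files)      # word splitting: the newlines become single spaces
quoted=$(echo "$files")      # quoting keeps the newlines intact

echo "unquoted: $unquoted"
echo "quoted:"
echo "$quoted"
```

Quoting alone would put each name on its own line; looping over the names with ls adds the permissions and dates as well.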

/usr/lib/firefox-3.0.16/.autoreg /usr/lib/gtk-2.0/immodules/.relocation-tag /usr/lib/ /usr/lib/ /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadLine/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/YAML/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/GD/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/GD/Text/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/GD/Graph/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/File/Which/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/File/Tail/.packlist /usr/lib/perl5/site_perl/5.

and so on, and so on.

I made a simple change:

      if test -n "$files"; then
        echo "Suspicious files = "
        for i in ${files}; do ls -la $i; done
      fi

and it prints the files out one per line.  Much nicer:

Suspicious files =
-rw-r--r-- 1 root root 0 Dec  3 11:05 /usr/lib/firefox-3.0.16/.autoreg
-rw-r--r-- 1 root root 4622 Dec 11  2007 /usr/lib/gtk-2.0/immodules/.relocation-tag
lrwxrwxrwx 1 root root 27 Sep 14 07:58 /usr/lib/ ->
-rw-r--r-- 1 root root 65 Apr  7  2009 /usr/lib/
-rw-r--r-- 1 root root 110 Apr  7  2009 /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadLine/.packlist
-rw-r--r-- 1 root root 363 Apr  7  2009 /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/.packlist

Wednesday, January 6, 2010

kvm virtio disk mystery solved

I suppose this is what I get for depending on a GUI - something I normally never do.

kvm stores its guest definitions in an xml file in /etc/libvirt/qemu.  So to see what adding a new drive would look like in that file, I went to virt-manager, fired up the server, clicked the 'hardware' tab, and installed a virtio disk.  Then, I went and looked at the file, and...

No difference.  I even saved it and did a diff.  No difference. OK, it must need a reboot - makes sense, kinda.  Rebooted the machine; no vda disk, no change to the .xml.  All right, I shut the guest down completely, get out of virt-manager, get back in... nada.  The disk is showing up in the hardware tab, but not in the .xml.  Nothing in the logs.

Feeling mightily confused, I fired off a support request to Red Hat and started messing around.  I removed the disk from the guest, shut the guest down, added the disk - AHA!  That did it!  Started up the guest; the disk shows up.  All is well.

This sure doesn't sound like the behavior I'd expect.  I mean, if you're going to have a GUI, why not an error message if you try to do this?  I think I'll submit a bug report.

Tuesday, January 5, 2010

Converting from Xen to kvm

So my Saturday/Sunday project, after all the good football games were over, was to convert from Xen to kvm here.  I've got six virtual RHEL5 servers running on this host.  Five were no problem at all, and took less than an hour total to convert and reboot.  One more small hint:  comment out

co:2345:respawn:/sbin/agetty xvc0 9600 vt100-nav

in /etc/inittab.  It started respawning rapidly.

The sixth server caused all kinds of problems.  It is, of course, our main production server.

First problem is that using the virt-install --import command imports all of the disks as hdx, on the ide bus.  Fine - unless you have more than four, the maximum number that the ide bus can support.  I'm working on getting these switched over to vdx-type virtio disks, but it's not as simple as I thought it should be, i.e., adding

    <disk device="disk" type="file">
      <target bus="virtio" dev="vda"/>
    </disk>

to the .xml file.  Ah, well, I'll keep working on it.  Luckily, the disks that I don't have up are historical files; I've got at least a week.

The other problem was networking.  The way I've got Denman set up is that all traffic goes through a central gateway machine. Only the inside virtual servers and the host are on the network, and can talk to each other directly.  But one of those six new kvm boxes was on the 0.0 network.  In order to get that network up and running, I had to define it on the host.  But if I defined it on the host, then the host could get to the 0.0 network directly, foiling my nice firewall scheme.

The solution took some head-scratching, but I finally came up with this:


and the secret is the netmask.  It creates the bridge:

[root@defiant ~]# ifconfig
br0       Link encap:Ethernet  HWaddr 00:19:B9:B8:95:F5
          inet addr:  Bcast:  Mask:
but doesn't route it anywhere:

[root@defiant ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface     *        U     0      0        0 br1
default         saratoga1.denma         UG    0      0        0 br1

so the traffic to the 0.0 network still has to go through the gateway.

Friday, January 1, 2010

Happy New Year - check your SpamAssassin setup!

Just noticed this in all messages coming in:

3.2 FH_DATE_PAST_20XX The date is grossly in the future.

3.2 is huge in SpamAssassin.  Consider that the default threshold is 5.0 - this alone gets you most of the way to marking the message as spam.  The rule definition shows:

FH_DATE_PAST_20XX Date =~ /20[1-9][0-9]/ [if-unset: 2006]

It should be [2-9][0-9].  At least, it should be that now.
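The effect of the one-character change is easy to check with bash's own regex match. This is a sketch: the two patterns are the ones above, while in_future and the sample Date header are made up for the demo.

```shell
#!/bin/bash
# Does a Date header trip the pattern? in_future is a hypothetical helper.
in_future() {
  local pattern=$1 date_header=$2
  [[ $date_header =~ $pattern ]]
}

old_rule='20[1-9][0-9]'    # as shipped: matches 2010 through 2099
new_rule='20[2-9][0-9]'    # corrected: matches 2020 through 2099
today='Fri, 1 Jan 2010 09:00:00 -0600'   # illustrative header value

if in_future "$old_rule" "$today"; then old_hit="yes"; else old_hit="no"; fi
if in_future "$new_rule" "$today"; then new_hit="yes"; else new_hit="no"; fi

echo "old rule fires on 2010: $old_hit"
echo "new rule fires on 2010: $new_hit"
```

Of course, the corrected pattern has the same problem waiting for it in 2020.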

Happy new year!