Wednesday, March 17, 2010

RIP Denman Tire

I began working for Denman Rubber Manufacturing in March 1975. I came in as a lab technician, then moved to Timekeeper.

In 1976, I began to work on an associate degree with one of those computer thingies. We had a Burroughs medium system at Kent Trumbull, but the big thing was the keypunch stations. Most folks had the IBM 026, where you had to punch three keys at once to get a left or right parenthesis. We had a few 029s. One key for a parenthesis - COBOL heaven! And it was COBOL, of course. That, FORTRAN, and Assembler.

Meanwhile, Denman had just been informed that IBM would no longer maintain their unit record equipment. Mostly because there was no way to make the dates go past 1979. So they knew they needed someone to set up a computer system. I was the only person they knew who knew what one looked like.

I made the Correct Technical Decision, and picked a Data General mini. I have a bad habit of making the C T D. I've owned a Betamax, and spent a decade with OS/2.

I wrote apps. In COBOL, of course - what else in 1979? Accounts Payable and Receivable, General Ledger, Order Entry, Inventory, ERP. Our payroll processor went out of business - our CFO spent a year trying to get a certain payroll company that begins with an A and ends with a DP to get our payroll on line. Didn't work. I wrote a payroll system in three months.

I changed operating systems. I started off with ICOS, the Interactive COBOL Operating System. Threaded p-code! Then to RDOS; to AOS; to AOS/VS; to dg/ux. I wrote articles for the Data General User's Group magazine. I talked about the Y2K problem in my second article - in 1986. I became president of the group in 1994; I saw it go away.

I made the decision to, for once in my life, go with the popular choice: Red Hat. For once, the popular choice was the Correct Technical Decision; I'm a RHCE running RHEL5.4 on all of our systems.

I got a BA from Hiram, a MS from Kent, an RHCT and a RHCE. I raised a son who also has a BA from Hiram, an RHCT, and an RHCE.

I've had a good career. Denman has been as good a place to work for as I could have ever asked.

Yesterday, they filed for a Chapter 7 liquidation.

I'll be OK - I already have a new job. But I've got friends that I've known for more than 30 years that are in serious trouble. Where does a tire builder with a 10th grade education go to get $27 an hour?

I'm running off at the fingers now, and will go to bed.

Thursday, March 11, 2010

Setting up IMAP and SquirrelMail

I’ve always been a POP3 guy, but my circumstances are about to change – more on that in a later post.  But I’ve always known the advantages that IMAP brings to the party – keeping messages on a central server for multiple-location access and archiving – and if I had to set up a new email system, I wanted to use IMAP.

SquirrelMail – the most popular Linux web email interface – runs exclusively over IMAP, rather than POP3, so that would become a necessity anyway.

IMAP installation was trivial – yum install cyrus-imapd. That’s it. It adds the cyrus user and writes out a .conf file.  Just remember to set a password for cyrus.  The only gotcha I found was that /etc/logrotate.d/cyrus-imapd contains a /var/log/auth.log entry. If you already have one elsewhere, logrotate will fail with a duplicate-file error.
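That gotcha is easy to script a check for: flag any log file named in more than one logrotate config. A self-contained sketch using a throwaway directory - point the grep at /etc/logrotate.d on a real box:

```shell
# Two throwaway configs both claim /var/log/auth.log, which is
# exactly the duplicate that makes logrotate bail.
d=$(mktemp -d)
printf '/var/log/auth.log {\n  weekly\n}\n' > "$d/cyrus-imapd"
printf '/var/log/auth.log {\n  daily\n}\n'  > "$d/syslog"
# -l lists every config file that mentions the log; more than one
# filename here means logrotate will fail with a duplicate entry
grep -l '^/var/log/auth.log' "$d"/*
```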

Sendmail wasn’t much harder.

R$=N            $: $#local $: $1
R$=N < @ $=w . >        $: $#local $: $1
Rbb + $+ < @ $=w . >        $#cyrusbb $: $1

and remember that the separators before the $: above are tabs, not spaces.  Then

rebuild sendmail.cf from sendmail.mc with m4, and restart sendmail.

Start up /etc/init.d/cyrus-imapd and saslauthd, and make sure both are configured to come up when you reboot.

/etc/init.d/cyrus-imapd start
chkconfig cyrus-imapd on
/etc/init.d/saslauthd start
chkconfig saslauthd on

Next, set up a sasl password for cyrus and any users by issuing this command as root:

/usr/local/sbin/saslpasswd cyrus
/usr/local/sbin/saslpasswd tim

imtest tests your connection:

[root@kyushu init.d]# su cyrus
bash-3.2$ imtest -m login -p imap localhost
S: * OK [CAPABILITY IMAP4 IMAP4rev1 LITERAL+ ID STARTTLS] Cyrus IMAP4 v2.3.7-Invoca-RPM-2.3.7-7.el5_4.3 server ready
S: C01 OK Completed
Please enter your password:
C: L01 LOGIN cyrus {8}
S: + go ahead
Security strength factor: 0
. logout
* BYE LOGOUT received
. OK Completed
Connection closed.

and set up your users:

bash-3.2$ cyradm localhost
IMAP Password:
localhost.localdomain> cm user.tim

And that's it! I did a quick test by setting up an IMAP email account in Outlook - worked great. Next week, I'll set up SquirrelMail.

Thursday, March 4, 2010

New Spamhaus block list

Spamhaus has just announced a new block list, the Domain Block List (DBL).

This is a great concept - it's designed to foil 'Snowshoe' spammers.  Those are the folks who use a wide array of addresses to spam from, thus spreading out the spam load over hundreds or thousands of addresses - the same way a snowshoe spreads your weight out over the snow.  It's very difficult to block, because they're constantly registering/spamming from/losing addresses.

But the spam payload is usually linked to a domain, and that's what the DBL is designed to block.  The DBL is not an IP blocklist - it lists actual domains; it's not a normal RBL.  For this reason, you'll need SpamAssassin 3.3.1.
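When SpamAssassin support lands, a local rule would presumably look something like this - a sketch only; the rule name, DNS zone, return code, and score are my assumptions, not official Spamhaus or SpamAssassin settings:

```
# sketch of a local .cf rule querying the Spamhaus DBL via the
# URIDNSBL plugin (names, zone, return code, and score assumed)
urirhssub  DBL_SPAM
body       DBL_SPAM  eval:check_uridnsbl('DBL_SPAM')
describe   DBL_SPAM  Contains a domain listed on the Spamhaus DBL
score      DBL_SPAM  2.5
```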

Now, the bad news - 3.3.1 isn't out yet.  The minute it releases, I'll start testing this.

Spamhaus is the most solid, reliable set of blocklists out there, with an extremely low false positive rate.  I can't wait to try this out.

Thursday, February 25, 2010

bash vs. -bash

Spent the last two days running around in circles, trying to figure out why this stupid script wouldn't work.  It's one line, and I was testing it from the command line.  Worked fine in a script:

[root@dg scripts]# cat foo
ERR="Error from $(basename $0)";echo $ERR
[root@dg scripts]# foo
Error from foo
Failed miserably by itself:

[root@dg scripts]# ERR="Error from $(basename $0)";echo $ERR
basename: invalid option -- b
Try `basename --help' for more information.
Error from

The good folks online were a big help.  Seems that if the file is executed, it'll behave as you think it should:

[root@tolstoy scripts]# cat foo
#! /bin/bash
  echo $0

[root@tolstoy scripts]# foo

but if it's ever sourced - as your shell might be, after passing through /etc/profile, ~/.bash_profile, etc.:

[root@tolstoy scripts]# . foo
-bash
In fact:

[root@tolstoy scripts]# echo $0
-bash
so my basename $0 became basename -bash, and it failed because it didn't know what the -b option is.
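A quick way to see both behaviors side by side (a throwaway sketch using mktemp):

```shell
# $0 is the script path when a file is executed, but the parent
# shell's own name when it's sourced -- which is how basename
# ended up being handed "-bash".
tmp=$(mktemp)
echo 'echo "$0"' > "$tmp"
executed=$(bash "$tmp")     # runs as its own process: $0 is the path
sourced=$(. "$tmp")         # runs in the current shell: $0 is inherited
echo "executed: $executed"
echo "sourced:  $sourced"
rm -f "$tmp"
```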

Wednesday, February 24, 2010

Cfengine 2 to 3 conversion tool

I've been using Cfengine 2 for a number of years now, and it's a great configuration tool.  They've recently put out version 3, and the changes are... extensive.  As in, Java vs. C++ extensive.

I'm very tentatively trying to move over to 3.  Yesterday, they put a big help up on their website:

Now on line at our Technical Corner is a conversion sampler, that
enables you to perform a limited conversion of portions of configuration
from Cfengine 2 to Cfengine 3. To convert larger samples more
completely, you can arrange professional services with Cfengine AS/Inc.
Mark Burgess

Wednesday, February 17, 2010

Stupid bash tricks: when is a null string not a null string?

I've done this more than once, and it still seems to me there ought to be an error.

The 'test' command says:

      -n STRING
              the length of STRING is nonzero

       STRING equivalent to -n STRING

       -z STRING
              the length of STRING is zero

 So something like this should work:

$ /home/tim>TEST=""
$ /home/tim>if test -z $TEST; then echo "null"; else echo "not null";fi
null
and it does.   But this should not work:

$ /home/tim>if test -n $TEST; then echo "not null"; else echo "null";fi  
not null

Hang on.  It can't be both!  Let's check out the length:

$ /home/tim>expr length $TEST
expr: syntax error

Huh?  How about feeding it a real string:

$ /home/tim>expr length ""
0
Here's the problem.   Absent quotes around the $TEST, expr sees this as a command missing a parameter.  So this works just fine:

$ /home/tim>expr length "$TEST"
0
as does this:

$ /home/tim>if test -n "$TEST"; then echo "not null"; else echo "null";fi
null

So sometimes putting quotes around a string variable does make a difference...
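Here's what's actually going on with test: when TEST is empty and unquoted, the expansion vanishes entirely, so test -n $TEST becomes test -n - a one-argument test of the literal string "-n", which is always true. A quick demonstration:

```shell
# With TEST empty, the word $TEST disappears before test ever runs.
TEST=""
set -- -n $TEST              # how many words survive expansion?
echo "words: $#"             # only one: -n itself
test -n $TEST   && echo "unquoted: looks non-null"
test -n "$TEST" || echo "quoted: correctly null"
```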

Tuesday, February 16, 2010

bad interpreter: Permission denied

So I'm doing the Symantec live update installation on the servers, and they all work - except one.  The only one I really care about, our main server.

It goes about halfway through the install script, and then dies:

[root@dg tmp]# /opt/Symantec/symantec_antivirus/sav liveupdate -u
Command failed: Problem with LiveUpdate.
Check that java directory is in PATH
Unable to perform update 


The log's pretty good, though, and it showed which file it failed on:

16-Feb-10 10:24:58 AM Making /tmp/1266333784179/1266333839866/navuphub.dis executable ...
16-Feb-10 10:24:58 AM Running /tmp/1266333784179/1266333839866/navuphub.dis ...
16-Feb-10 10:24:58 AM Error running /tmp/1266333784179/1266333839866/navuphub.dis with reason: Permission denied. 

Huh?  Permission denied?  I'm running as root! I trace the file down in /tmp, and try to execute it.  The top line says #!/bin/sh like it should, so there shouldn't be any problem:

[root@dg 1266333839866]# ./navuphub.dis
-bash: ./navuphub.dis: /bin/sh: bad interpreter: Permission denied

Uh-oh.  So I try executing /bin/sh - it works.  I unlink and link back to it - it's a symbolic link to bash in RHEL.  It still works.  Now I'm getting nervous.

I copy that little .dis file over to ~, and... the file executes.

I slap myself upside the head.  I do a mount -l, and sure enough:

/dev/hda5 on /tmp type ext3 (rw,nodev,noexec) [/tmp]

In a fit of paranoia a few years ago, I made /tmp unable to execute programs.  That's what that 'noexec' parameter in /etc/fstab does - no programs, no scripts. A remount of /tmp fixed it.
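For the record, spotting the culprit can be scripted. This sketch feeds a canned mount -l line through a filter so it's self-contained; on a live box you'd pipe the real mount -l output in, and then fix it with mount -o remount,exec /tmp until fstab is edited:

```shell
# Print the mount point of any filesystem mounted noexec.
noexec_mounts() {
  awk '/noexec/ {print $3}'    # field 3 of "device on /mnt type ..." lines
}
echo "/dev/hda5 on /tmp type ext3 (rw,nodev,noexec) [/tmp]" | noexec_mounts
```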

Friday, February 12, 2010

Configuring a local Liveupdate repository with Symantec Endpoint and Linux

Seven or eight years ago, I decided to finally dump all the individual antivirus packages around here and go with a centralized system. I wasn't sure what I wanted, but I knew what I didn't want - Symantec. I knew they were the evil empire.

After going through two different vendors, this year I finally decided to try Symantec Endpoint Protection Small Business Edition, v. 12. I'm a convert. Incredibly lightweight, responsive, and... cost effective. As in, under $15 a seat cost effective. I love it.

Never bothered with their Linux version, even though it was included with my purchase. ClamAV has been doing just fine for us for years, thank you.

Until last month. For some reason - I assume the definition file - ClamAV is now taking forever to make a scan:

Data scanned: 803.16 MB
Data read: 240.92 MB (ratio 3.33:1)
Time: 8035.814 sec (133 m 55 s)

100K a second?  Let's see, to scan our home directory would take... about a week.

So, easy solution - I've got plenty of SEP licences left; it runs on Linux; it's very easy to administer on Windows - I'll just install SEP on my Linux boxes!

Well, as easy as the Windows install/administration was, the Linux install is that complex. First of all, the Linux download is hard to find. Once I had it I wanted to set up a local LiveUpdate server on the Windows box. Easy enough. Then, I had to point the Linux server towards the LiveUpdate server. That's when things got crazy.

The docs say that you should change the server name in /etc/liveupdate.conf, but there are at least two documented ways of doing so. In addition, they changed the port from 8080 to 7070 in the most recent LiveUpdate Administration release. So I got all of that sorted out in the file, ran liveupdate... and nothing happened.

Well, something happened, actually:

Command failed: Failure in pre processing of micro definitions before update.
Unable to perform update

and the /etc/liveupdate.conf file was nulled out.

After a couple of days of back-and-forth with Symantec (Linux? What's that?), I finally found someone who gave me this link:

Once executed, JLU reads the contents of the unencrypted liveupdate.conf file, runs LiveUpdate, and then encrypts the liveupdate.conf file to prevent tampering. The encryption level used is above the maximum encryption level allowed by default in Sun Java. JLU requires the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files be applied to the version of Java being used to execute JLU.

To allow JLU to properly encrypt the liveupdate.conf file, you must apply the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files to the installation of Java being used to execute JLU.

Ah.  I need super-secret crypto, available only to true believers.  So I went to Sun, eventually found the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files - two files, local_policy.jar and US_export_policy.jar - downloaded them, replaced the ones on my system, made a mental note not to ever let these machines leave the country, and...

Still didn't work.  Crud.

OK, one more tech request.  They sent me to this link on 'Configuring JavaLiveUpdate':

which was wrong in an interesting way.  The table it's got is off by one row - the 'Parameter' column lines up with the 'Description' of the row above it.

But it did give me a hint - it pointed me to a Liveupdt.hst file.  I'd seen this before on the Windows side; the LiveUpdate Administration docs told me how to generate this file, but not what to do with it.  This seemed to imply that I pointed 'hostfile=' to it.  So I generated it on the Windows side (Configure/Client Settings/Export Java Settings), moved it over, and put it into my /etc/liveupdate.conf file, which now reads:


Hah!  That worked!

And it's slightly faster.  That 133-minute clamav scan finished in just under 8 minutes.

Sheesh.  But worth it, I think.

Tuesday, February 9, 2010

Samba 0-day

Samba has announced a 0-day exploit that's caused the computing world to sit up and... yawn. Details here:

I think SANS says it best:

"When is a 0day not a 0day? When the exploit ends up being just a poor default configuration issue. It can lead to files being read, that the user has permission to read."

So what we have here is an exploit that allows users to read files that... they have permission to read. No privilege escalation.

Yawn. If this really bothers you, add

wide links = no

to your smb.conf file, and restart.  Of course, this will break following symlinks that point outside an exported share.

Monday, February 8, 2010

Windows 2008 NTP problem?

I've got a Windows 2008 R2 server sitting as a guest on our RHEL 5.4 server.  I just happened to notice that the time was off.  Way off, like about six minutes.

The w32tm.exe utility has some interesting switches.  I used one to stripchart against the main time server:

C:\Users\Administrator>w32tm /stripchart /computer:buran
Tracking buran [].
The current time is 2/8/2010 12:12:44 PM
12:12:44 d:+00.0077832s o:+416.5745527s
12:12:46 d:+00.0029019s o:+417.0718254s
12:12:48 d:+00.0019183s o:+417.5564322s
12:12:51 d:+00.0019141s o:+418.0448887s
12:12:53 d:+00.0019232s o:+418.5206214s
12:12:55 d:+00.0019154s o:+419.0251265s
12:12:57 d:+00.0019209s o:+419.5642993s
12:12:59 d:+00.0019262s o:+420.0780252s
12:13:01 d:+00.0019133s o:+420.5655424s
12:13:03 d:+00.0019161s o:+421.0448077s
12:13:05 d:+00.0028745s o:+421.5292839s
12:13:07 d:+00.0019101s o:+422.0110611s

Wow.  I'm losing a second every four seconds.  Not good.

I've got w32time set up in NTP mode on the Windows server, so I'm not quite sure what's going on here.  Until I figure it out, I'm going to set the system up to resync time once a minute.  That'll at least keep me within 15 seconds of reality.  Change


to 60, and start/stop w32time

C:\Users\Administrator>w32tm /stripchart /computer:buran
Tracking buran [].
The current time is 2/8/2010 12:26:40
12:26:40 d:+00.0048590s o:+09.2796603s
12:26:42 d:+00.0009300s o:+09.8433852s
12:26:44 d:+00.0019293s o:+10.4054516s
12:26:46 d:+00.0019263s o:+10.9600696s
12:26:48 d:+00.0019241s o:+11.4967021s
12:26:50 d:+00.0019256s o:+12.0280363s
12:26:52 d:+00.0019269s o:+12.5750087s
12:26:54 d:+00.0019231s o:+13.1260536s
12:26:56 d:+00.0019243s o:+13.6663052s
12:26:58 d:+00.0018872s o:+14.2191757s
12:27:15 d:+00.0009443s o:+00.1400540s
12:27:17 d:+00.0019228s o:+00.6581759s

Better.  Not perfect, but better, and the network will be able to handle an NTP request every minute.  Now, all I have to do is find out what's wrong...

Friday, January 29, 2010

Hard vs. symbolic links

I don't know why I don't use many hard links.  Habit, I guess. I'll try to break it.

A symbolic link is a special file that points to another file, i.e.

lrwxrwxrwx 1 root root 4 Jun 23  2007 awk -> gawk*

Delete gawk, and awk points nowhere - and won't work.  Delete awk, and nothing at all happens to gawk.

A hard link is simply a file that points to the same inode as another.  In effect, it's just another name for the same file.  You can't tell by looking at it that it's a hard link, and if you delete it - or the file it's linked to - nothing happens to the other one.  The only clue you've got is the link count - that column in 'ls -l' that nobody knows what it is:

-rwxr-xr-x 3 root root 62872 Jan 14 14:06 gunzip
-rwxr-xr-x 3 root root 62872 Jan 14 14:06 gzip
-rwxr-xr-x 3 root root 62872 Jan 14 14:06 zcat

Each has a link count of 3.  That means that there are three different files sharing the same inode number.  And, sure enough:

[root@dg bin]# ls -i *z*
319701 gunzip  319701 gzip  319701 zcat

So why have three 'identical' files with different names?  Well, here's what I do with one of my scripts.

[root@dg scripts]# ls -lai start*
1357990 -rwxr-xr-x 3 root root 446 Jan 28 16:44 start
1357990 -rwxr-xr-x 3 root root 446 Jan 28 16:44 start_this
1357990 -rwxr-xr-x 3 root root 446 Jan 28 16:44 start_that

Same file, hard linked.



 CALLED_NAME="`basename $0`"
 case $CALLED_NAME in

 cd $ICDIR||{ $ECHO "$0 failed chdir"|$MAIL tim;exit 1; }


So all I do is see how it's called, and operate accordingly.
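Since the case statement above is only a fragment, here's a fuller, self-contained sketch of the trick - the echo lines are stand-ins for the real branch bodies:

```shell
# One script body, three hard-linked names, behavior keyed off basename $0.
d=$(mktemp -d)
cat > "$d/start" <<'EOF'
#!/bin/bash
case "$(basename "$0")" in
  start_this) echo "starting this" ;;
  start_that) echo "starting that" ;;
  *)          echo "starting everything" ;;
esac
EOF
chmod +x "$d/start"
ln "$d/start" "$d/start_this"    # hard links: one inode, three names
ln "$d/start" "$d/start_that"
"$d/start_this"                  # prints: starting this
"$d/start"                       # prints: starting everything
```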

I believe that in dg/ux, cp was a hard link to mv.

Thursday, January 28, 2010

Wrong files in the /etc/rcx.d directories

I was looking for something in /etc the other day, and didn't know if it was there or in a subdirectory.  So I did a grep -r:

[root@dg etc]# fgrep -ir "testing 1 2 3" *
fgrep: rc0.d/K88syslog: No such file or directory
fgrep: rc1.d/K88syslog: No such file or directory
fgrep: rc2.d/K88syslog: No such file or directory
fgrep: rc2.d/S40snortd: No such file or directory
fgrep: rc2.d/S12phone_log: No such file or directory
fgrep: rc2.d/S55zabbix_agentd: No such file or directory
fgrep: rc3.d/K88syslog: No such file or directory

Whoops.  That's not good.

Linux uses AT&T Sys V-style directories for startup and shutdown, with the scripts in them linked back to /etc/init.d.  That means that under /etc/rc.d, you've got a bunch of directories corresponding to the various runlevels:

[root@dg ~]# ls -lad /etc/rc.d/rc*.d
drwxr-xr-x 2 root root 4096 Dec  8 15:32 /etc/rc.d/rc0.d
drwxr-xr-x 2 root root 4096 Dec  8 15:32 /etc/rc.d/rc1.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc2.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc3.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc4.d
drwxr-xr-x 2 root root 4096 Jan 28 02:14 /etc/rc.d/rc5.d
drwxr-xr-x 2 root root 4096 Dec  8 15:32 /etc/rc.d/rc6.d

and in each of those directories, you've got a link to init.d for the various start and stop scripts:

[root@dg ~]# ls -la /etc/rc.d/rc5.d/S*|more
lrwxrwxrwx 1 root root   22 Dec  8 15:32 /etc/rc.d/rc5.d/S02lvm2-monitor -> ../i
lrwxrwxrwx 1 root root   17 Jun 24  2007 /etc/rc.d/rc5.d/S03sysstat -> ../init.d
lrwxrwxrwx 1 root root   18 Jun 23  2007 /etc/rc.d/rc5.d/S08iptables -> ../init.

That means that at runlevel 5,  lvm2-monitor will start first, followed by sysstat, iptables, etc.  The down scripts are prefixed with a 'K', and work the same way:

[root@dg ~]# ls -la /etc/rc.d/rc5.d/K*|more
lrwxrwxrwx 1 root root 20 Feb  7  2008 /etc/rc.d/rc5.d/K00xendomains -> ../init.
lrwxrwxrwx 1 root root 17 Jan 31  2009 /etc/rc.d/rc5.d/K01dnsmasq -> ../init.d/d
lrwxrwxrwx 1 root root 24 Sep  6 17:26 /etc/rc.d/rc5.d/K01setroubleshoot -> ../i

So anyway.  What did those error messages from grep tell me?  That I have links in the various rcx.d directories pointing to nonexistent files in /etc/init.d.  Not a real problem, because the link will just fail to do anything - but something that really should be cleaned up.

However, while I was checking things out, I spotted something that potentially could be a real problem.

[root@dg init.d]# ls -la /etc/rc2.d/S99ossec
-r-xr-xr-x 1 root root 1087 May  9  2006 /etc/rc2.d/S99ossec

That's not a link - it's really a file!

One of two things can happen in this case, both of them bad.  If you've made a change to the file in /etc/init.d, it won't be reflected in the level 2 startup.  Or, worse - if you've removed the app and deleted the file in /etc/init.d, it could be running something you don't want to run.

My cleanup script looks like this:

#!/bin/bash
# cleanup_rc

ECHO="echo"
MAIL="mail"

 cd /etc/rc.d||{ $ECHO "$0 failed chdir"|$MAIL tim;exit 1; }
 DIRS="`find . -name "rc*.d" -type d`"
 for i in $DIRS
 do
   cd /etc/rc.d
   cd $i
   # replace every regular file with a symlink to its init.d script;
   # ${FNAME:3} strips the S99/K00 prefix to get the init.d name
   FILES="`find . -type f`"
   for j in $FILES
   do
      FNAME="`basename $j`"
      rm -f $FNAME
      ln -s ../init.d/${FNAME:3} $FNAME
   done
   # then delete any link whose init.d target no longer exists
   FILES="`find . -type l -follow`"
   for j in $FILES
   do
      rm -f $j
   done
 done
 exit 0

It finds all regular files in /etc/rc.d/rc*.d, deletes each one, and creates a link to the matching script in init.d.  Then it finds all links whose targets no longer exist, and deletes those too.
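That second find is the subtle part. With -follow, a symlink that resolves is classified as its target's type, so -type l ends up matching only the dangling links:

```shell
# One good link, one broken one -- only the broken one survives -type l.
d=$(mktemp -d)
cd "$d"
touch real
ln -s real    good_link
ln -s missing broken_link
find . -type l -follow           # prints only ./broken_link
```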

Works like a charm:

[root@dg init.d]# cleanup_rc
[root@dg init.d]# ls -la /etc/rc2.d/S99ossec
lrwxrwxrwx 1 root root 15 Jan 28 11:29 /etc/rc2.d/S99ossec -> ../init.d/ossec
[root@dg init.d]# cd /etc
[root@dg etc]# fgrep -r "testing 1 2 3" *
[root@dg etc]#

Tuesday, January 26, 2010

bash numeric comparison

Ah, just ran into the dreaded bash comparison error again.  And I know better.  But in my defense, I'm cleaning up some scripts circa 1996.

OK, there is no good defense.  This is a miserable excuse for a loop.


  until test -z "`ps -e|grep icexec`"
  do
    {some stuff you want to do here}
    if [[ $loops > 10 ]]
    then
      exit 0
    fi
    let loops='loops + 1'
  done

This is just bad in so many ways, it's hard to decide where to start.  Let's go with the math functions first.

if [[ $loops > 10 ]] just doesn't work.  It does an ASCII comparison, so the second time through it exits out, since '2' sorts after '1'.  Change it to if [[ $loops -gt 10 ]].  Or, better still, use double parentheses, and make it if ((loops > 10)).  Much cleaner.

Incrementing the loop counter.  The 'let' works, but I hate the 'let' command.  It's so damn picky about whitespace.  Change it to parentheses again, ((loops = loops + 1)).  Or, again better, ((loops++)).
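The whole failure mode fits in a screenful:

```shell
# Inside [[ ]], > is a string comparison; -gt and (( )) are numeric.
loops=2
[[ $loops > 10 ]]   && echo "string compare: '2' sorts after '10'"
[[ $loops -gt 10 ]] || echo "numeric -gt: 2 is not greater than 10"
(( loops > 10 ))    || echo "arithmetic (( )): 2 is not greater than 10"
```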

So your code is now

  until test -z "`ps -e|grep icexec`"
  do
    {some stuff you want to do here}
    if ((loops > 10))
    then
      exit 0
    fi
    ((loops++))
  done

Better.  But why bother with the inner 'if...then'?

  until test -z "`ps -e|grep icexec`"||((loops > 10))
  do
    {some stuff you want to do here}
    ((loops++))
  done

Wednesday, January 20, 2010

bash error checking

Just had an interesting one bite me.

I like to do a lot of error checking in bash.  There are so many easy ways to destroy a system; something like this:

cd /nonexistentdirectory
rm -f *

can just ruin your whole day.  So I usually do something simple, like:

cd /nonexistentdirectory||exit 1
rm -f *

That works.  But if you've got it in the middle of a script, you'll never know about it.  The obvious thing to do is to mail yourself a message.

cd /nonexistentdirectory||(echo "$0 failed chdir"|mailx tim;exit 1)

This works fine.  That is, it emails me a nice error message - and then proceeds to destroy all of my files.  See, the parentheses toss up a subshell - a completely separate instance of bash.  Here's a little experiment.

  TESTVAL="testing"
  echo "testval before=$TESTVAL"

  cd /nonexistentdirectory||(TESTVAL="not a test";echo "testval=$TESTVAL")

  echo "testval after=$TESTVAL"

Running it shows:

testval before=testing
/usr/local/scripts/verify_full: line 25: cd: /nonexistentdirectory: No such file or directory
testval=not a test
testval after=testing

 So my variable gets changed to what I want it to - and then as soon as I drop out of that subshell, it changes back.  My 'exit 1' had no effect at all - it dropped me out of the subshell (which I was about to leave anyway) back to the main shell, which then proceeded to destroy all of my files.

Curly braces to the rescue.

cd /nonexistentdirectory||{ TESTVAL="not a test";echo "testval=$TESTVAL"; }

testval before=testing
/usr/local/scripts/verify_full: line 25: cd: /nonexistentdirectory: No such file or directory
testval=not a test
testval after=not a test

Now, the 'exit 1' really will drop you out of the whole script.  Your files are saved!

One gotcha - note the semicolon before the closing brace.  If you forget to put it there, you will search for a long time trying to find out why the script is failing.
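The difference is easy to demonstrate with two throwaway shells - exit inside ( ) ends only the subshell, while inside { } it ends the whole script:

```shell
# Each bash -c stands in for "your script": does the final echo survive?
parens=$(bash -c 'false || ( exit 1 ); echo survived')
braces=$(bash -c 'false || { exit 1; }; echo survived') || true
echo "after ( ) exit: ${parens:-script died}"
echo "after { } exit: ${braces:-script died}"
```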

Tuesday, January 19, 2010

Quick perl script to convert .csv to Excel .xls

I've got a number of programs that write out data in .csv format, for those customers who want to import it into a spreadsheet. It would be much more convenient if I could actually write it out as an Excel spreadsheet, for a number of reasons. There are a couple of Perl/Python programs out there, but they're usually way too complicated - they're designed to take something you have no control over and fix it. This is a quick and dirty perl script designed to take a well-formed .csv file and convert it.


#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;
use Spreadsheet::WriteExcel;

my $row = 0;
my $infile = $ARGV[0];

open (my $fh, $infile) or die "$infile: $!";

# name the output after the input, swapping (or adding) the extension
my $outfile = "$infile";
if ($infile =~ /\.csv/) {
    $outfile =~ s/\.csv/.xls/;
} else {
    $outfile = $infile . ".xls";
}

my $workbook = Spreadsheet::WriteExcel->new($outfile);
my $worksheet = $workbook->add_worksheet("sheet1");

my $csv = Text::CSV_XS->new ({ binary => 1, allow_whitespace => 1 });

while (my $rowdata = $csv->getline ($fh)) {
    my @rowdata = @$rowdata;
    foreach my $col (0 .. $#rowdata) {
        my $field = $rowdata[$col] || "";
        $worksheet->write($row, $col, $field);
    }
    $row++;    # without this, every row lands on top of row 0
}

close $fh or die "$fh:  $!";

Note the 'allow_whitespace' parameter on the Text::CSV_XS constructor.  This is one that's driven a lot of folks nuts.  Strict CSV grammar says that there is no whitespace in between delimiters, i.e.

"testing",1,2, etc.
However, if you're writing out numbers with a trailing sign (did I mention I'm a COBOL programmer?), you're going to get a space there with positive numbers:

"testing",1 ,2 , etc.

Excel will see these as 'Number stored as text' and give you an error, and refuse to format them nicely.  The 'allow_whitespace' parameter eliminates this.

Friday, January 15, 2010

Generating ssl keys and certificates

This is something I only do every two years, so it's a learning experience each time.  And GoDaddy has plenty of instruction help for Plesk, Apache, etc. - but none for what I use it for, secure email.  So let's get it documented.  This isn't for self-signed certificates; there are plenty of tutorials around for that.

Go to /etc/ssl and run

openssl genrsa -des3 -out your.key.secure 2048

The minimum number of bits has gone from 1024 in the last few years to 2048.  And 4096 is probably better.  RSA-129 - 129 decimal digits, about 426 bits - was once thought unbreakable; remember Squeamish Ossifrage?  But RSA-768 was cracked this year.  For a really good site on this, see

Anyway.  It’ll ask for a passphrase – go ahead and pick anything; we’ll remove it in a minute.

After the key has been generated, do this:

openssl rsa -in your.key.secure -out your.key

It’ll ask for the passphrase once, and then write out a new key without it.  If you do not strip the passphrase, you’ll be required to re-enter it every time you re-start a program that uses the certs (i.e., every reboot, for every system).  Stripping it is less secure, though – if the key file gets compromised, you’ll have to revoke the certificate.  That’s why I make all the files in the directory chmod 400.

Then, you’ll use that key to generate a Certificate Request:

openssl req -new -key your.key -out your.csr

You then submit that csr file to the certificate provider.  GoDaddy sends back a zip file with the .crt file inside it.

To make a .pem file, simply use a text editor to put the .key file on top, a blank line, and the .crt on the bottom.  So, for instance, stunnel uses the ipop3d.pem file to allow me to pop in to my mailbox.  The file looks like this:


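The assembly step itself can be scripted - a minimal sketch, with dummy stand-ins for the real key and cert:

```shell
# Build the .pem: key on top, one blank line, cert on the bottom.
d=$(mktemp -d)
cd "$d"
echo "-----BEGIN RSA PRIVATE KEY-----" > your.key   # dummy key
echo "-----BEGIN CERTIFICATE-----"     > your.crt   # dummy cert
cat your.key  > ipop3d.pem
echo ""      >> ipop3d.pem
cat your.crt >> ipop3d.pem
chmod 400 ipop3d.pem       # owner-read only, like the rest of /etc/ssl
```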

Thursday, January 14, 2010

Random disk corruption during reboot

This one's been driving me crazy for more than a year.  Randomly, during a clean reboot, about half my systems will fail to come back up.  I'll do a 'shutdown -r'; the system will start again, but then I'll be told that /tmp - and it's always /tmp - needs checked manually:

and I get the dreaded

Give root password for maintenance (or type Control-D to continue)

So I'll go to the console, type 'fsck -y' because I really don't care what the errors are - nobody but Ted Ts'o could fix them by hand anyway - it finds two or three errors, and I'm back in business.

Not a big deal if I'm on site, but a real pain if I need to reboot remotely and the system won't come up by itself.  It's not reproducible, and darn near impossible to troubleshoot.

I've decided to try two things.  #1 is a suggestion from Mr. Ts'o in another context - add the 'sync' parameter to the fstab entry.  Ext3 syncs data to disk every 5 seconds; the sync parameter makes it write to disk immediately.  Since it's only /tmp, I don't think it'll have a horrible effect on the system.  We'll see.
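Assuming the same /dev/hda5-on-/tmp layout that turned up in the 'bad interpreter' post, the amended fstab entry would look something like this (the trailing dump/pass fields are a guess):

```
/dev/hda5   /tmp   ext3   rw,nodev,noexec,sync   0 0
```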

The other is to always remember to run the 'sync' command before shutting down - it'll flush the buffers to the disk.  Since I'll never remember that, I'll add a 'sync' to the /etc/init.d/halt file.

Wednesday, January 13, 2010

More SpamAssassin

Just noticed this on the SARE (SpamAssassin Rules Emporium) site.

Seems like good advice:

IMPORTANT: Due to Ninjas being busy with lives, wives & hockey matches, SARE rules aren't being updated.

There is no need to run automated update tools as all they will produce is useless load on everybody's servers.

Any updates will be announced on the SpamAssassin Users Mailing List.

Tuesday, January 12, 2010

Today's date

Among the things that I rarely mention in public is the fact that although the platform I'm using is RHEL5.4 Linux, the code I'm writing is... COBOL.  I brought Denman's first computer online in 1980, and there just weren't a lot of other choices.  And in the intervening years, there really hasn't been a compelling reason to switch languages, so I've kept writing it.

Anyway, I was in the debugger this morning, and had to manipulate the date in YYMMDD form.  It came out

100111

which is right - but I had to look at it a whole bunch of times.  It still looks like 47 octal to me.

Friday, January 8, 2010

kvm disk comparison

I used bonnie++ to get some disk benchmarks.  The virtual io is supposed to be much faster than the ide emulation, so let's take a look.

virtio (loop):

Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
buran.denmantire 3G  8321  27 16886  13 13128   9 30720  46 15277   2 538.5   4
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 14308  94 +++++ +++ +++++ +++  7738  58 +++++ +++ +++++ +++
buran.denmantire,3G,8321,27,16886,13,13128,9,30720,46,15277,2,538.5,4,16,14308,94,+++++,+++,+++++,+++,7738,58,+++++,+++,+++++,+++

ide (loop):

Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
buran.denmantire 3G  8164  12 16475   3 19955   7 45174  68 71586   6 992.4  13
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 15543  90 +++++ +++ +++++ +++ 15183  84 +++++ +++ +++++ +++
buran.denmantire,3G,8164,12,16475,3,19955,7,45174,68,71586,6,992.4,13,16,15543,90,+++++,+++,+++++,+++,15183,84,+++++,+++,+++++,+++

Hmmm.  Not the huge increase I was expecting - if anything, the ide run posts the better read numbers.  Of course, that's with loop disks.  I'll try it again sometime next week with a physical partition.
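For what it's worth, that comma-separated line bonnie++ tacks onto the end of its report is handy for pulling out individual figures.  Assuming the 1.03 CSV layout (machine name, size, then K/sec and %CP pairs), the sequential block-read rate is field 11:

```shell
# Extract the sequential block-read rate (field 11) from the
# bonnie++ 1.03 CSV summary line of each run.
virtio='buran.denmantire,3G,8321,27,16886,13,13128,9,30720,46,15277,2,538.5,4'
ide='buran.denmantire,3G,8164,12,16475,3,19955,7,45174,68,71586,6,992.4,13'
echo "$virtio" | awk -F, '{print "virtio block read:", $11, "K/sec"}'
echo "$ide"    | awk -F, '{print "ide block read:   ", $11, "K/sec"}'
```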

Thursday, January 7, 2010

Making chkrootkit a little more readable

I use cron to run a couple of rootkit checkers daily: rkhunter and chkrootkit.  chkrootkit is nice, but it's a bit paranoid about weird and hidden files.  On the other hand, I probably want my rootkit checker to be a bit paranoid.

It's got a line of code to look for hidden files:

files=`${find} ${DIR} -name ".[A-Za-z]*" -o -name "...*" -o -name ".. *"` 

and then it just does an echo to display them.  Well, that's fine if you've got one or two, but if you've got a couple of dozen this is almost unreadable.  See, displaying like that will display all of the filenames with a space between them, and no newline, like so:

/usr/lib/firefox-3.0.16/.autoreg /usr/lib/gtk-2.0/immodules/.relocation-tag /usr/lib/ /usr/lib/ /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadLine/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/YAML/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/GD/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/GD/Text/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/GD/Graph/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/File/Which/.packlist /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/File/Tail/.packlist /usr/lib/perl5/site_perl/5.

and so on, and so on.

I made a simple change:

      if test -n "$files"; then
        echo "Suspicious files = "
        for i in ${files}; do ls -la "$i"; done
      fi

and it prints the file out one line at a time.  Much nicer:

Suspicious files =
-rw-r--r-- 1 root root 0 Dec  3 11:05 /usr/lib/firefox-3.0.16/.autoreg
-rw-r--r-- 1 root root 4622 Dec 11  2007 /usr/lib/gtk-2.0/immodules/.relocation-tag
lrwxrwxrwx 1 root root 27 Sep 14 07:58 /usr/lib/ ->
-rw-r--r-- 1 root root 65 Apr  7  2009 /usr/lib/
-rw-r--r-- 1 root root 110 Apr  7  2009 /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadLine/.packlist
-rw-r--r-- 1 root root 363 Apr  7  2009 /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Term/ReadKey/.packlist
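One caveat: the for loop splits on whitespace, so a hidden filename containing a space would come out mangled.  A variant that sidesteps that, using the same name patterns chkrootkit does (a sketch - the directory here is just illustrative):

```shell
# List hidden/suspicious files one per line, NUL-delimited so
# filenames containing spaces survive intact.
DIR=/usr/lib   # illustrative; chkrootkit walks several directories
find "$DIR" \( -name ".[A-Za-z]*" -o -name "...*" -o -name ".. *" \) -print0 |
  xargs -0 -r ls -la
```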

Wednesday, January 6, 2010

kvm virtio disk mystery solved

I suppose this is what I get for depending on a GUI - something I normally never do.

kvm stores its guest definitions in an xml file in /etc/libvirt/qemu.  So to see what adding a new drive would look like in that file, I went to virt-manager, fired up the server, clicked the 'hardware' tab, and installed a virtio disk.  Then I went and looked at the file, and...

No difference.  I even saved it and did a diff.  No difference. OK, it must need a reboot - makes sense, kinda.  Rebooted the machine; no vda disk, no change to the .xml.  All right, I shut the guest down completely, get out of virt-manager, get back in... nada.  The disk is showing up in the hardware tab, but not in the .xml.  Nothing in the logs.

Feeling mightily confused, I fired off a support request to Red Hat and started messing around.  I removed the disk from the guest, shut the guest down, added the disk - AHA!  That did it!  Started up the guest; the disk shows up.  All is well.

This sure doesn't sound like the behavior I'd expect.  I mean, if you're going to have a GUI, why not an error message if you try to do this?  I think I'll submit a bug report.

Tuesday, January 5, 2010

Converting from Xen to kvm

So my Saturday/Sunday project, after all the good football games were over, was to convert from Xen to kvm here.  I've got six virtual RHEL5 servers running on this host.  Five were no problem at all, and took less than an hour total to convert and reboot.  One more small hint: comment out

co:2345:respawn:/sbin/agetty xvc0 9600 vt100-nav

in /etc/inittab.  With the Xen console device gone, agetty started respawning rapidly.

The sixth server caused all kinds of problems.  It is, of course, our main production server.

The first problem is that using the virt-install --import command imports all of the disks as hdx, on the ide bus.  Fine - unless you have more than four, the maximum number the ide bus can support.  I'm working on getting these switched over to vdx-type virtio disks, but it's not as simple as I thought it should be, i.e., adding

    <disk device="disk" type="file">
      <target bus="virtio" dev="vda">

to the .xml file.  Ah, well, I'll keep working on it.  Luckily, the disks that I don't have up are historical files; I've got at least a week.
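For reference, a complete virtio disk element in the libvirt domain XML generally looks something like this (the image path here is illustrative, not my actual file):

```xml
<disk type='file' device='disk'>
  <!-- path is illustrative -->
  <source file='/var/lib/libvirt/images/guest.img'/>
  <target dev='vda' bus='virtio'/>
</disk>
```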

The other problem was networking.  The way I've got Denman set up is that all traffic goes through a central gateway machine. Only the inside virtual servers and the host are on the network, and can talk to each other directly.  But one of those six new kvm boxes was on the 0.0 network.  In order to get that network up and running, I had to define it on the host.  But if I defined it on the host, then the host could get to the 0.0 network directly, foiling my nice firewall scheme.

The solution took some head-scratching, but I finally came up with a bridge definition where the secret is the netmask.  It creates the bridge:

[root@defiant ~]# ifconfig
br0       Link encap:Ethernet  HWaddr 00:19:B9:B8:95:F5
          inet addr:  Bcast:  Mask:
but doesn't route it anywhere:

[root@defiant ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface     *        U     0      0        0 br1
default         saratoga1.denma         UG    0      0        0 br1

so the traffic to the 0.0 network still has to go through the gateway.
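I haven't reproduced the exact file here, but the idea is a bridge config along these lines - the key being a host-only (/32) netmask, so the kernel brings the bridge up without installing a network route for it.  This is a sketch; the address and even the config style (ifcfg vs. libvirt network XML) are assumptions on my part:

```
# /etc/sysconfig/network-scripts/ifcfg-br0 (sketch - values illustrative)
DEVICE=br0
TYPE=Bridge
BOOTPROTO=static
IPADDR=192.168.0.1      # an address on the guests' network
NETMASK=  # /32: the bridge comes up, but no network route is added
ONBOOT=yes
```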

Friday, January 1, 2010

Happy New Year - check your SpamAssassin setup!

Just noticed this in all messages coming in:

3.2 FH_DATE_PAST_20XX The date is grossly in the future.

3.2 is huge in SpamAssassin.  Consider that the default threshold is 5.0 - this alone gets you most of the way to marking the message as spam.  The rule definition shows:

FH_DATE_PAST_20XX Date =~ /20[1-9][0-9]/ [if-unset: 2006]

It should be [2-9][0-9].  At least, it should be that now.
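Until a rule update lands, one stopgap is to zero out the rule's score locally.  The score directive is standard SpamAssassin configuration, though the local.cf path may differ on your box:

```
# /etc/mail/spamassassin/local.cf
# Neutralize the buggy rule until the regex is fixed upstream.
score FH_DATE_PAST_20XX 0
```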

Happy new year!