December 2004
posix-sh Bits
Recently I performed the system administration portion of a large migration from a Hewlett Packard UNIX (HP-UX) system to Red Hat Linux (RH Linux). The migration involved the creation and copying of roughly thirty accounts and 80GB of data. The data portion (since it was stored in group-accessible common areas) turned out to be trivial relative to the user accounts.
The original plan was simply to add the users manually and transfer
the appropriate parts of their former $HOME over. As usual,
something went wrong: another problem occurred that cost valuable time,
time that would otherwise have been spent creating accounts and secure-copying files.
So, I wrote some scripts . . .
On-the-spot scripting is a common practice among UNIX users. In this case, circumstances were somewhat different from the norm. Most of the time, a shell script for something as trivial as moving bits is not a tall order; here, however, correctness and timeliness were the order of the day.
Some quick decisions about what to do and how to do it had to be scribbled onto note paper. Within two minutes they had become the quick list below:
Steps
1. Create the accounts
2. Create the initial passwords
3. Run chage -d 0 -M 999 on each account (aging policy was
still in the air at the time and in committee)
4. Copy the directories
5. Fix up permissions
Rules
- no file deletion, mistakes must be fixed by hand
- it has to work quickly, with little time for testing
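Step 3 is the only step in the list with an exact command given, so it can be sketched directly. The function below is a minimal sketch, not one of the scripts written that day: the function name age_accounts, the DRY_RUN switch, and the input file users.txt (one login name per line) are all assumptions made for illustration. chage -d 0 marks the password as expired so it must be changed at the next login, and -M 999 sets the maximum password age in days.

```shell
# age_accounts LIST
# Apply the aging settings from step 3 to every login name in LIST
# (one name per line). With DRY_RUN=1 the commands are only printed;
# DRY_RUN is a hypothetical convenience, not part of the original scripts.
age_accounts() {
    while read -r user
    do
        # Skip blank lines in the list
        [ -n "$user" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]
        then
            echo "chage -d 0 -M 999 $user"
        else
            chage -d 0 -M 999 "$user"
        fi
    done < "$1"
}
```

Setting DRY_RUN=1 before calling age_accounts users.txt prints the commands that would run, which suits the "little testing" rule: the list can be checked by eye before anything touches the password database.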
The realization was simple: this was not the time or place to write some glorious all-in-one program or script to take care of the problem now and forever (that project was added the day after ...).
There were some known items as well: each user's $HOME directory name
matched their login name (which made things very easy). Additionally,
no profiles were being copied over.
The main problem with the migration was that even though the accounts
all resided under /home, they were in three different
groups. The solution was simple. Create a list of users
per group:
ls -al /home | grep groupname | awk '{print $3}' > groupname.out
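One caveat with the pipeline above is that grep matches the group name anywhere on the line, including inside a login name. A slightly tighter variant, sketched here under the assumption that owner and group names contain no spaces, compares the group column of ls -l exactly. The function name list_group_homes is made up for this example.

```shell
# list_group_homes DIR GROUP
# Print the owner of every entry in DIR whose group is exactly GROUP.
# Assumes the usual "ls -l" layout: field 3 is the owner, field 4 the group.
list_group_homes() {
    ls -l "$1" | awk -v grp="$2" '$4 == grp { print $3 }'
}
```

Used as list_group_homes /home groupname > groupname.out, it produces the same kind of list as before, without the dot entries that ls -a adds.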
Next, time to create the accounts:
cat groupname.out | xargs add_accounts.sh

[add_accounts.sh]
# Create one account per login name passed as an argument
for i in "$@"
do
    useradd -g groupname -mk /usr/local/etc/skel \
        -d /home/$i -c "$i User Account" $i
done
Note that the GECOS field is a bit dry, on purpose: it is not required on this particular system, since an LDAP server already holds that information. At most, it was decided to fill out the field for each user later, as side work in spare time.
Root public keys had already been set up just for the occasion, so now it was a matter of getting the files. That turned out to be slightly trickier than first guessed. Due to scp's nature, wildcards could not be used, so temporary space was used instead:
mkdir /tmp/homes
cat groupname.out | xargs scopyhomes.sh

[scopyhomes.sh]
#!/bin/sh
# Pull each user's home directory from the old host into temporary space
for i in "$@"
do
    scp -r host:/home/$i /tmp/homes
done
With the directories now across, it was time to copy in just the regular files:
cd /tmp/homes
for i in *
do
    cd $i
    cp -R * /home/$i
    cd ..
done
The last step, fixing up permissions:
cat groupname.out | xargs fixperms.sh

[fixperms.sh]
# Give each user ownership of their own home directory
for i in "$@"
do
    chown -R $i:groupname /home/$i
done
So... what's wrong with this picture?
The logic of the material above is ... questionable. Any seasoned programmer would toss the bits shown so far out the window, and rightly so. They are riddled with potential problems. For instance, what if one of the secure copies hung? A keyboard signal would mean the loop simply keeps processing, potentially missing an account. Why the constant catting of a file? What happens if the filesystem has mystery disappearing file problems? What if an essential part of one script just does not work?
The easiest one to answer is file tests. In any operation involving
directories and files, a simple if [ -d "$i" ]; then or
if [ -f "$i" ]; then would have avoided any potential
file disappearance issues.
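As a rough illustration, the copy loop could apply those tests before touching anything. This is a sketch only: the function name copy_homes and its three arguments are assumptions, not part of the scripts used in the migration. It also reads the user list directly with a redirect rather than through cat, which is one possible answer to the catting question.

```shell
# copy_homes STAGING DEST LIST
# For each login name in LIST, copy the regular (non-dot) files from
# STAGING/name into DEST/name, but only when both directories exist.
copy_homes() {
    staging=$1
    dest=$2
    list=$3
    while read -r user
    do
        # Skip blank lines in the list
        [ -n "$user" ] || continue
        if [ -d "$staging/$user" ] && [ -d "$dest/$user" ]
        then
            cp -R "$staging/$user"/* "$dest/$user"/
        else
            # Report and move on instead of copying into a void
            echo "skipping $user: missing directory" >&2
        fi
    done < "$list"
}
```

A call such as copy_homes /tmp/homes /home groupname.out would mirror the loop shown earlier, with a visible warning for any account whose directory has gone missing.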
Another issue is trapping and dying, which can be cured with just a few lines of shell code:
toppid=$$
trap "exit 1" 1 2 3 15
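To see what the trap buys, here is a small self-contained demonstration. It is a sketch under assumptions: the function name trap_demo and the account names alice and bob are invented, and SIGTERM (15) sent with kill stands in for an operator interrupt because it is easy to send from inside a script. The child shell installs the same trap, then signals itself in the middle of a loop.

```shell
# trap_demo: run a child shell that installs the trap shown above,
# then signals itself partway through a loop over made-up account names.
trap_demo() {
    rc=0
    sh -c '
        trap "exit 1" 1 2 3 15   # HUP, INT, QUIT, TERM
        for i in alice bob
        do
            echo "processing $i"
            kill -TERM $$        # stand-in for an operator interrupt
        done
    ' || rc=$?
    echo "child exit status: $rc"
}
```

With the trap in place the child stops at the first signal and exits with status 1, so the loop never moves on to the second name with the first one half done.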
Finally, what to do about outright explosions? What if a file is not where it should be? The answer, a generic bomb routine:
progname=${0##*/}
...
bomb()
{
cat >&2 <<ERRORMESSAGE
ERROR: $@
${progname} ABORTED
ERRORMESSAGE
kill ${toppid}
exit 1
}
To see the bomb routine in use, look at the cp
command:
cp -R * /home/$i || bomb "Cannot copy to $i ... exiting"
Simple enough.
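Put together, the pieces might look like the throwaway script below. This is a sketch, not the script used during the migration: the file name bombdemo.sh and the /nonexistent paths are assumptions chosen so that cp is guaranteed to fail and the bomb routine fires. Note that the here-document needs its closing ERRORMESSAGE line before the kill, otherwise kill and exit would be printed as text instead of being run.

```shell
# Write a throwaway demo script (hypothetical name: bombdemo.sh) and run it.
demo=$(mktemp -d)/bombdemo.sh
cat > "$demo" <<'EOF'
#!/bin/sh
progname=${0##*/}
toppid=$$
trap "exit 1" 1 2 3 15

bomb()
{
    cat >&2 <<ERRORMESSAGE
ERROR: $@
${progname} ABORTED
ERRORMESSAGE
    kill ${toppid}
    exit 1
}

# The /nonexistent paths are assumed not to exist, so cp fails here.
cp -R /nonexistent/source /nonexistent/dest 2>/dev/null ||
    bomb "Cannot copy to /nonexistent/dest ... exiting"
echo "not reached"
EOF

# Run the demo; report its exit status if it fails (it is expected to).
sh "$demo" || echo "demo exited with status $?"
```

The kill of ${toppid} matters when bomb is called from a subshell or pipeline: exit alone would only leave that subshell, while the signal reaches the top-level script and its trap.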
Fortunately, nothing went wrong. It all worked, but in retrospect, having on hand the canned routines that should be used with every script would have averted potential disaster, or at least mitigated it to a manageable point. If those simple few-line routines (and others, to be sure) had been lying around, a few more safety nets could have been cast.