next up previous contents index
Next: Bacula Copyright, Trademark, and Up: Bacula User's Guide Previous: The Bacula internal database   Contents   Index


Disaster Recovery Using a Bacula Rescue Floppy

General

Please note that the Bacula Rescue Floppy is now deprecated and is replaced by the Bacula Rescue CDROM described in another chapter of this manual.

When disaster strikes, you must have a plan, and you must have prepared in advance; otherwise the work of recovering your system and your files will be considerably greater. For example, if you have not previously saved the partitioning information for your hard disk, how can you properly rebuild it if the disk must be replaced?

Unfortunately, many of the steps one must take before and immediately after a disaster are very operating system dependent. As a consequence, this chapter will discuss in detail disaster recovery (also called Bare Metal Recovery) for Linux and Solaris. For Solaris, the procedures are still quite manual. For FreeBSD the same procedures may be used but they are not yet developed. For Win32, no luck. Apparently an "emergency boot" disk allowing access to the full system API without interference does not exist.

Important Considerations

Here are a few important considerations concerning disaster recovery that you should take into account before a disaster strikes.

Steps to Take Before Disaster Strikes

Bare Metal Floppy Recovery on Linux with a Bacula Floppy Rescue Disk

Since floppies are being used less and less, the Bacula Floppy rescue disk is deprecated, which means that it is no longer really supported. For those of you who have or need floppy rescue, we include the recovery instructions here for your reference.

The remainder of this section concerns recovering a Linux computer using a floppy, and parts of it relate to the Red Hat version of Linux.

A so-called "Bare Metal" recovery is one where you start with an empty hard disk and you restore your machine. There are also cases where you may lose a file or a directory and want it restored. Please see the previous chapter for more details on those cases.

Bare Metal Recovery assumes that you have the following four items for your system:

Restrictions

In addition to the above assumptions, the following conditions or restrictions apply:

Directories

If you are building a self-contained Bacula Rescue CDROM, you will find the necessary scripts in the rescue/linux/cdrom subdirectory of the Bacula source code.

If you wish to build the Bacula Rescue floppy disk, the scripts discussed below can be found in the rescue/linux/floppy subdirectory of the Bacula source code.

Preparation for a Bare Metal Recovery

There are two things you should do immediately on all (Linux) systems for which you wish to do a bare metal recovery:

  1. Create a system emergency boot disk or alternatively a system installation boot floppy. This step can be skipped if you have an Installation CDROM and your machine will boot from CDROM (most modern computers will).
  2. Create a Bacula Rescue floppy, which captures the current working state of your computer and creates scripts to restore it. In addition, it creates a statically linked version of the Bacula File daemon (Client) program, which is key to successfully restoring from scratch.

Creating an Emergency Boot Disk

Here you have several choices:

tomsrtbt:

If you have created a Bacula Rescue CDROM, you can skip this section.

If you *must* use a boot floppy, my preference is to create and use a tomsrtbt emergency boot disk because it gives you a very clean Linux environment (with a 2.2 kernel) and the most utilities. See http://www.toms.net/rb/ for more details on this. It is very easy to do and well worth the effort. However, I recommend that you create both a tomsrtbt disk and a standard Linux emergency boot disk, especially if you have non-standard hardware. You may find that tomsrtbt will not work with your network driver (he surely has one, but you must explicitly put it on the disk), whereas the Linux rescue is more likely to work.

Emergency Boot Disk:

If you have created a Bacula Rescue CDROM, you can skip this section.

To create a standard Linux emergency boot disk you must first know the name of the kernel, which you can find with:

  ls -l /boot

and looking at the vmlinux-... line, or alternatively by running

 uname -a

then become root and with a blank floppy in the drive, enter the following command:

  mkbootdisk --device /dev/fd0 2.4.18-18

where you replace "2.4.18-18" with the kernel version of your system.
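As a small sketch of the step above, the kernel version can be taken from uname -r instead of being read off by eye. The helper name bootdisk_cmd is made up for illustration; it only prints the mkbootdisk command so that you can inspect it before running it as root.

```shell
# Hypothetical helper: print (do not run) the mkbootdisk command
# for a given kernel version, defaulting to the running kernel.
bootdisk_cmd() {
    kver="${1:-$(uname -r)}"
    echo "mkbootdisk --device /dev/fd0 $kver"
}

bootdisk_cmd
```

Printing the command first is a deliberate choice: it lets you confirm the version string before anything is written to the floppy.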

This disk can then be booted and you will be in an environment with a number of important tools available. Some disadvantages of this environment as opposed to tomsrtbt are that you must enter linux rescue at the boot prompt or the boot will fail without a hard disk, and that it requires a disk boot image or a CDROM to be mounted; if the CDROM is released, you will lose a large number of the tools.

Red Hat Installation Disk:

If you have created a Bacula Rescue CDROM, you can skip this section.

Specific to Red Hat Linux, you can create an Installation floppy, which can also be used as an emergency boot disk. The advantage of this method is that it works in conjunction with the installation CDROM and hence during the first part of restoring the system, you have a much larger number of tools available (on the CDROM). This can be extremely useful if you are not sure what really happened and you need to examine your system in detail.

To make a Red Hat Linux installation disk, do the following:

mount the Installation CDROM (/mnt/cdrom)
cd /mnt/cdrom/images
dd if=boot.img of=/dev/fd0 bs=1440k

Now that you have either an emergency boot disk or an installation floppy, you will be able to reboot your system in the absence of your hard disk or with a damaged hard disk. The installation floppy has the same disadvantages compared to the tomsrtbt disk as mentioned above for the Emergency Boot Disk.

Creating a Bacula Rescue Disk

If you have created a Bacula Rescue CDROM, this step will be automatically done for you.

Simply having a boot disk is not sufficient to re-create things as they were. To solve this problem, we will create a Bacula Rescue disk. Everything that will be written to this disk will first be placed into the <bacula-src>/rescue/linux directory.

While your system is up and running normally, you use a Bacula script called getdiskinfo to capture certain important information about your hard disk configuration (partitioning, formatting, mount points, ...). Using the information found, getdiskinfo will also create a number of scripts that can be used in an emergency to repartition your disks, reformat them, and restore a statically linked version of the Bacula file daemon so that your disk can be restored from within a minimal boot environment.

The first step is to run getdiskinfo as follows:

   su
   cd <bacula-src>/rescue/linux
   ./getdiskinfo

getdiskinfo works for either IDE or SCSI drives and recognizes both ext2 and ext3 file systems. If you wish to restore other file systems, you will need to modify the code. This script can be run multiple times, but really only needs to be run once unless you change your hard disk configuration.

Assuming you have a single hard disk on device /dev/hda, getdiskinfo will create the following files:

partition.hda
This file contains the shell commands to repartition your hard disk drive /dev/hda to the current state. If you have additional drives (e.g. /dev/hdc), you will find one of these files for each drive. DO NOT EXECUTE THIS SCRIPT UNLESS YOU WANT YOUR HARD DISK REPARTITIONED.

format.hda
This file contains the shell commands that will format each of the partitions on your hard drive. It knows about ext2, ext3, and swap partitions. You must manually format all other partitions. It is recommended that any Microsoft partitions be formatted with Microsoft's format command rather than using Unix tools. DO NOT EXECUTE THIS SCRIPT UNLESS YOU WANT YOUR HARD DISK REFORMATTED.

mount_drives
This script will mount all ext2 and ext3 drives that were previously mounted. They will be mounted on /mnt/drive/. This is used just before running the statically linked Bacula so that it can access your drives for the restore.

restore_bacula
This script will restore the File daemon from the Bacula Rescue disk. Building the Bacula Rescue disk will be described later. This will provide your emergency boot environment with a Bacula file daemon. Note, this is a special statically linked version of the file daemon (i.e. it does not need or use shared libraries).

start_network
This script will start your network using the simplest possible commands. You will need to verify that the IP address used in this script is correct. In addition, if you have several ethernet cards, you may need to make other modifications to this script.

sfdisk
This is the program that will repartition your hard disk, and it is normally found in /sbin/sfdisk. It is placed in this directory so that it will be included on the rescue disk as it is not normally available with all emergency boot environments.

sfdisk.gz
This is the version of sfdisk that works with tomsrtbt. The standard sfdisk described above will not run under tomsrtbt.

The getdiskinfo program (actually a shell script) will also create a subdirectory named diskinfo, which contains the following files:

df.bsi
disks.bsi
fstab.bsi
ifconfig.bsi
mount.bsi
mount.ext2.bsi
mount.ext3.bsi
mtab.bsi
route.bsi
sfdisk.disks.bsi
sfdisk.hda.bsi
sfdisk.make.hda.bsi

Each of these files contains some important piece of information (sometimes redundant) about your hard disk setup or your network. Normally, you will not need this information, but it will be written to the Bacula Rescue disk just in case. Since it is normally not used, we will leave it to you to examine those files at your leisure.
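Before writing the rescue disk, it can be worth confirming that the diskinfo directory is complete. The following is only a sketch: check_diskinfo is a made-up helper name, the file names are taken from the list above, and whether getdiskinfo always produces every one of them on every system is an assumption.

```shell
# Hypothetical helper: report which of the files listed above are
# missing from a diskinfo directory (default: ./diskinfo).
# The disk name (hda) is an assumption; pass another as $2.
check_diskinfo() {
    dir="${1:-diskinfo}"
    disk="${2:-hda}"
    missing=0
    for f in df disks fstab ifconfig mount mount.ext2 mount.ext3 mtab \
             route sfdisk.disks "sfdisk.$disk" "sfdisk.make.$disk"; do
        if [ ! -f "$dir/$f.bsi" ]; then
            echo "missing: $dir/$f.bsi"
            missing=1
        fi
    done
    return "$missing"
}
```

From <bacula-src>/rescue/linux you would call it as check_diskinfo diskinfo hda; it prints nothing and returns 0 when all twelve files are present.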

Building a Static File Daemon:

If you have created a Bacula Rescue CDROM, this step will be automatically done for you.

The second of the three steps in creating your Bacula Rescue disk is to build a static version of the File daemon. Do so either by configuring Bacula as follows or by allowing the make_rescue_disk script described below to make it for you:

cd <bacula-src>
./configure <normal-options>
make
cd src/filed
make static-bacula-fd
strip static-bacula-fd
cp static-bacula-fd ../../rescue/linux/bacula-fd
cp bacula-fd.conf ../../rescue/linux

Note, above, we built static-bacula-fd and changed its name to bacula-fd when copying it to the rescue/linux directory.

Finally, in <bacula-src>/rescue/linux, edit bacula-fd.conf and ensure that the WorkingDirectory and PIDDirectory both point to reasonable locations on a stripped down system. If you are using tomsrtbt you will also want to replace machine names with IP addresses since there is no resolver running. With the Linux Rescue disk, network address mapping seems to work. Don't forget that at the time this version of the Bacula File daemon runs, your file system will not be restored. In my bacula-fd.conf, I use /var/working.
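To review the two directory settings quickly, something like the following can be used. It is a sketch only: show_fd_dirs is a made-up name, and it relies on Bacula ignoring case and spaces in directive names, so it matches spellings such as WorkingDirectory and Pid Directory alike.

```shell
# Hypothetical helper: print the directory directives from a
# bacula-fd.conf so they can be checked by eye. Case and spaces in
# the directive name are ignored, so "WorkingDirectory" and
# "Working Directory" are both matched.
show_fd_dirs() {
    grep -iE '^[[:space:]]*(working|pid)[[:space:]]*directory[[:space:]]*=' "$1"
}
```

Running show_fd_dirs bacula-fd.conf in <bacula-src>/rescue/linux should print two lines; check that both paths will exist in the minimal boot environment.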

Writing the Bacula Rescue Floppy:

When you have everything you need (output of getdiskinfo, Bacula File daemon, ...), you create your rescue floppy by putting a blank floppy into your floppy disk drive and entering:

su
./make_rescue_disk

This script will reformat the floppy and write everything in the current directory and all files in the diskinfo directory to the floppy. If you supply the appropriate command line options, it will also build a static version of the Bacula file daemon and copy it along with the configuration file to the disk. Also using a command line option, you can make it write a compressed tar file containing all the files whose names are in backup.etc.list to the floppy. The list as provided contains names of files in /etc that you might need in a disaster situation. It is not needed, but in some cases such as a complex network setup, you may find it useful.

Options for make_rescue_disk:

The following command line options are available for the make_rescue_disk script:

Usage: make_rescue_disk
  -h, --help             print this message
  --make-static-bacula   make static File daemon and add to diskette
  --copy-static-bacula   copy static File daemon to diskette
  --copy-etc-files       copy files in etc list to diskette

Briefly the options are:

--make-static-bacula
If this option is specified, the script will assume that you have already configured and built Bacula. It will then proceed to build a statically linked version and copy it along with bacula-fd.conf to the current directory, then write it to the rescue disk.

--copy-static-bacula
If this option is given, the script will assume that you already have a copy of the statically linked Bacula in the current directory named bacula-fd as well as the configuration script. They will then be written to the rescue disk.

--copy-etc-files
If this option is specified, the script will tar the files in backup.etc.list and write them to the rescue disk.

Please examine the contents of the rescue floppy to ensure that it has everything you want and need. If not, modify the scripts as necessary and re-run make_rescue_disk until it is correct.

Now that you have both a system boot floppy and a Bacula Rescue floppy, assuming you have a full backup of your system made by Bacula, you are ready to handle nearly any kind of emergency restoration situation.

Restoring Your Linux Client with a Floppy

Now, let's assume that your hard disk has just died and that you have replaced it with a new identical drive. In addition, we assume that you have:

  1. A recent Bacula backup (Full plus Incrementals)
  2. An emergency boot floppy (preferably tomsrtbt)
  3. A Bacula Rescue Floppy Disk
  4. Your Bacula Director, Catalog, and Storage daemon running on another machine on your local network.

This is a relatively simple case, and later in this chapter, as time permits, we will discuss how you might recover from a situation where the machine that crashes is your main Bacula server (i.e. has the Director, the Catalog, and the Storage daemon).

You will take the following steps to get your system back up and running:

  1. Boot with your Emergency Floppy
  2. Mount your Bacula Rescue floppy
  3. Start the Network (local network)
  4. Re-partition your hard disk(s) as it was before
  5. Re-format your partitions
  6. Restore the Bacula File daemon (static version)
  7. Perform a Bacula restore of all your files
  8. Re-install your boot loader
  9. Reboot
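As a rough overview, the commands used in the detailed sections that follow can be printed in order as a checklist. This is a dry run under stated assumptions: rescue_plan is a made-up name, nothing is executed, the chroot line assumes the File daemon was restored into /mnt/disk/tmp, and the last line assumes lilo (use run_grub for grub). Booting, the restore itself (started from the Director's Console), and the reboot are manual steps and are not shown.

```shell
# Hypothetical dry-run: print the commands from this chapter in
# order for one disk (default hda). Nothing is executed.
rescue_plan() {
    disk="${1:-hda}"
    cat <<EOF
mount /dev/fd0 /mnt/floppy
cd /mnt/floppy
./start_network
./partition.$disk
./format.$disk
./mount_drives
/mnt/floppy/restore_bacula
chroot /mnt/disk /tmp/bacula-fd -c /tmp/bacula-fd.conf
run_lilo
EOF
}

rescue_plan hda
```

For a SCSI disk, rescue_plan sda prints the same list with partition.sda and format.sda.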

Now for the details ...

Boot with your Emergency Floppy

First you will boot with your emergency floppy. If you use the Installation floppy described above, when you get to the boot prompt:

boot:

you enter linux rescue.

If you are booting from tomsrtbt simply enter the default responses.

When your machine finishes booting, you should be at the command prompt, possibly with your hard disk mounted on /mnt/sysimage (Linux emergency only). To see what is actually mounted, use:

df

Mount your Bacula Rescue Floppy:

Make sure that the mount point /mnt/floppy exists. If not, enter:

mkdir -p /mnt/floppy

then mount your Bacula Rescue disk and cd to it with:

mount /dev/fd0 /mnt/floppy
cd /mnt/floppy

To simplify running the scripts, make sure the current directory is on your path by entering:

PATH=$PATH:.

Start the Network:

At this point, you should bring up your network. Normally, this is quite simple and requires just a few commands. If you have booted from your Bacula Rescue CDROM, please cd into the /bacula-hostname directory before continuing. To simplify your task, we have created a script that should work in most cases by typing:

./start_network

You can test it by pinging another machine, or by pinging your broken machine from another machine. Do not proceed until your network is up.
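The ping test can be wrapped in a one-line check. This is a sketch: net_up is a made-up name, the PING variable exists only so that the command can be swapped out, and the -c option (packet count) is assumed to be supported by the ping in your boot environment.

```shell
# Hypothetical helper: succeed if a host answers a single ping.
# PING can be overridden; by default the system ping is used.
net_up() {
    ${PING:-ping} -c 1 "$1" >/dev/null 2>&1
}
```

For example, net_up 192.168.1.1 && echo "network is up", where 192.168.1.1 stands in for the address of a machine on your local network such as the Director.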

Unmount Your Hard Disk (if mounted):

Normally, if your disk was damaged or if you are using tomsrtbt, your hard disk will not be mounted. However, if it is mounted and you are sure you want to repartition it, you must first unmount it so that it is not in use. Do so by entering df to see what is mounted and then entering the correct commands to unmount the disks. For example:

umount /mnt/sysimage/boot
umount /mnt/sysimage/usr
umount /mnt/sysimage/proc
umount /mnt/sysimage/

where you explicitly unmount (umount) each sysimage partition, with the root being the last one. Do another df command to be sure you successfully unmounted all the sysimage partitions.

This is necessary because sfdisk will refuse to partition a disk that is currently mounted. As mentioned, this should never be necessary with tomsrtbt.
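The unmount order matters: nested mount points have to go before the root. A sketch of how the list could be derived is shown below; umount_plan is a made-up name, it only prints the commands, and it assumes that a mount table in the format of /proc/mounts is available in your boot environment.

```shell
# Hypothetical helper: read a mount table (format of /proc/mounts)
# and print umount commands for everything at or below a root,
# deepest paths first so that the root itself comes last.
umount_plan() {
    root="${1:-/mnt/sysimage}"
    table="${2:-/proc/mounts}"
    awk -v r="$root" '$2 == r || index($2, r "/") == 1 { print $2 }' "$table" |
        LC_ALL=C sort -r |
        sed 's/^/umount /'
}
```

A reverse sort is enough here because a parent path is always a prefix of its children and therefore sorts before them; reversing the order puts every child ahead of its parent.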

Partition Your Hard Disk(s):

If you are using tomsrtbt, you will need to do the following steps to get the correct sfdisk:

rm -f sfdisk
bzip2 -d sfdisk.bz2

Do not do the above steps if you are using a standard Linux boot disk or the Bacula Rescue CDROM.

Then proceed with partitioning your hard disk by:

./partition.hda

If you have multiple disks, do the same for each of them. For SCSI disks, the repartition script will be named: partition.sda. If the script complains about the disk being in use, simply go back and redo the df command and umount commands until you no longer have your hard disk mounted. Note, in many cases, if your hard disk was seriously damaged or a new one installed, it will not automatically be mounted. If it is mounted, it is because the emergency kernel found one or more possibly valid partitions.

If for some reason this procedure does not work, you can use the information in partition.hda to re-partition your disks by hand using fdisk.

Format Your Hard Disk(s):

After partitioning your disk, you must format it appropriately. The formatting script will put back swap partitions, normal Unix partitions (ext2) and journaled partitions (ext3). Do so by entering for each disk:

./format.hda

The format script will ask you if you want a block check done. We recommend answering yes, but realize that for very large disks this can take hours.

Mount the Newly Formatted Disks:

Once the disks are partitioned and formatted, you can remount them with the mount_drives script. All your drives must be mounted for Bacula to be able to access them. Run the script as follows:

./mount_drives
df

The df will tell you if the drives are mounted. If not, re-run the script. It isn't always easy to figure out and create the mount points and the mounts in the proper order, so repeating the ./mount_drives command will not cause any harm and will most likely work the second time. If not, correct it by hand before continuing.

Unmount the CDROM:

Next, if you are using the Red Hat installation disk, unmount the CDROM drive by doing:

umount /mnt/cdrom

This is not necessary if you are running tomsrtbt. In doing this, I find it is always busy, and I haven't figured out how to unmount it (Linux boot only).

Restore and Start the File Daemon:

If you have booted with a Bacula Rescue CDROM, your statically linked Bacula File daemon and the bacula-fd.conf file will be in the /bacula-hostname/bin directory. Please skip the following paragraph and continue with editing the Bacula configuration file.

If you have not used a Bacula Rescue CDROM, now change (cd) to some directory where you want to put the image of the Bacula File daemon. I use the tmp directory on my hard disk (mounted as /mnt/disk/tmp) because it is easy. Then install Bacula into the current directory by running the restore_bacula script from the floppy drive. For example:

mkdir -p /mnt/disk/tmp
mkdir -p /mnt/disk/tmp/working
cd /mnt/disk/tmp
/mnt/floppy/restore_bacula
ls -l

Make sure bacula-fd and bacula-fd.conf are both there.

Edit the Bacula configuration file, create the working/pid/subsys directory if you haven't already done so above, and start Bacula by entering:

chroot /mnt/disk /tmp/bacula-fd -c /tmp/bacula-fd.conf

The above command starts the Bacula File daemon with /mnt/disk as its root, using the copy of the daemon in /mnt/disk/tmp. If Bacula does not start, correct the problem and start it again. You can check if it is running by entering:

ps fax

You can kill Bacula by entering:

kill -TERM <pid>

where pid is the first number printed in front of the first occurrence of bacula-fd in the ps fax command.
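Picking the PID out of the ps listing can also be done with awk. The sketch below uses a made-up helper name, first_fd_pid, and assumes that awk is present in your boot environment.

```shell
# Hypothetical helper: read "ps fax" output on stdin and print the
# PID (first column) of the first line that mentions bacula-fd.
# The [f] keeps awk's own command line from matching the pattern.
first_fd_pid() {
    awk '/bacula-[f]d/ { print $1; exit }'
}
```

Used as ps fax | first_fd_pid, it prints the number you would pass to kill -TERM.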

Now, you should be able to use another computer with Bacula installed to check the status by entering:

status client=xxxx

into the Console program, where xxxx is the name of the client you are restoring.

One common problem is that your bacula-dir.conf may contain machine addresses that are not properly resolved on the stripped down system to be restored because it is not running DNS. This is particularly true for the address in the Storage resource of the Director, which may be very well resolved on the Director's machine, but not on the machine being restored and running the File daemon. In that case, be prepared to edit bacula-dir.conf to replace the Storage daemon's domain name with its IP address.
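To find the entries that depend on name resolution, a filter along these lines may help. It is a sketch: named_addresses is a made-up name, and it only looks at Address, FDAddress, SDAddress, and DirAddress directives that each sit on a line of their own and hold a single value.

```shell
# Hypothetical helper: print Address directives whose value is not
# a dotted-quad IPv4 address, i.e. names that need a resolver.
named_addresses() {
    grep -iE '^[[:space:]]*(fd|sd|dir)?[[:space:]]*address[[:space:]]*=' "$1" |
        grep -vE '=[[:space:]]*"?[0-9]+(\.[0-9]+){3}"?[[:space:]]*$'
}
```

Running named_addresses bacula-dir.conf on the Director's machine lists the lines you may need to change; it prints nothing when every address is already numeric.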

Restoring using the RedHat Installation Disk:

Suppose your system was damaged for one reason or another, so that the hard disk and the partitioning and much of the filesystems are intact, but you want to do a full restore. If you have booted into your system with the RedHat Installation Disk by specifying linux rescue at the boot: prompt, you will find yourself at a shell prompt with your disks already mounted (if it was possible) in /mnt/sysimage. In this case, you can do much as you did above to restore your system:

cd /mnt/sysimage/tmp
mkdir -p /mnt/sysimage/tmp/working
/mnt/floppy/restore_bacula
ls -l

Make sure that bacula-fd and bacula-fd.conf are both in the current directory and that the directory names in the bacula-fd.conf correctly point to the appropriate directories. Then start Bacula with:

chroot /mnt/sysimage /tmp/bacula-fd -c /tmp/bacula-fd.conf

Restore Your Files:

On the computer that is running the Director, you now run a restore command and select the files to be restored (normally everything). Before starting the restore, there is one final change you must make: use the mod option just before running the job, select Where, and change the Where directory to be the root. Set it to:

/

then run the restore.

You might be tempted to avoid using chroot, run Bacula directly, and then use a Where to specify a destination of /mnt/disk. This is possible; however, the current version of Bacula always restores files to the new location, and thus any soft links that have been specified with absolute paths will end up with /mnt/disk prefixed to them. In general this is not fatal to getting your system running, but be aware that you will have to fix these links if you do not use chroot.
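If you did restore with a Where of /mnt/disk, the affected links can be listed afterwards. The following is a sketch under assumptions: prefixed_links is a made-up name, and it needs find and readlink, which may not be present on a minimal boot disk.

```shell
# Hypothetical helper: list symbolic links under a restored tree
# whose target starts with the restore prefix (default /mnt/disk).
prefixed_links() {
    root="${1:-/mnt/disk}"
    prefix="${2:-$root}"
    find "$root" -type l | while IFS= read -r link; do
        target=$(readlink "$link")
        case "$target" in
            "$prefix"/*) echo "$link -> $target" ;;
        esac
    done
}
```

After rebooting into the restored system, the tree is at / while the unwanted prefix is still /mnt/disk, so the call would be prefixed_links / /mnt/disk.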

Final Step:

At this point, the restore should have finished with no errors, and all your files will be restored. One last task remains and that is to write a new boot sector so that your machine will boot. For lilo, you enter the following command:

run_lilo

If you are using grub instead of lilo, you must enter the following:

run_grub

Note, I've had quite a number of problems with grub because it is rather complicated and not designed to install easily under a simplified system. So, if you experience errors or end up unexpectedly in a chroot shell, simply exit back to the normal shell and type in the appropriate commands from the run_grub script by hand until you get it to install.

Reboot:

Reboot your machine by entering exit until you get to the main prompt, then enter Ctrl-D.

If everything went well, you should now be back up and running. If not, re-insert the emergency boot floppy, boot, and figure out what is wrong.

At this point, you will probably want to remove the temporary copy of Bacula that you installed. Do so with:

rm -f /tmp/bacula-fd /tmp/bacula-fd.conf
rm -rf /tmp/working

Linux Problems or Bugs

Since every flavor and every release of Linux is different, there are likely to be some small difficulties with the scripts, so please be prepared to edit them in a minimal environment. A rudimentary knowledge of vi is very useful. Also, these scripts do not do everything. You will need to reformat Windows partitions by hand, for example.

Getting the boot loader back can be a problem if you are using grub because it is so complicated. If all else fails, reboot your system from your floppy but using the restored disk image, then proceed to a reinstallation of grub (looking at the run_grub script can help). By contrast, lilo is a piece of cake.

Bugs

When performing the bare metal recovery using the Red Hat emergency boot disk (actually the installation boot disk), I was never able to release the cdrom, and when the system came up /mnt/cdrom was soft linked to /mnt/disk/dev/hdd, which is not correct. I fixed this in each case by deleting and simply remaking it with mkdir -p /mnt/cdrom.

tomsrtbt

This is a single floppy (1.722Meg) that really has A LOT of software. For example, by default (version 2.0.103) you get:

AHA152X AHA1542 AIC7XXX BUSLOGIC DAC960 DEC_ELCP(TULIP) EATA EEXPRESS/PRO/PRO100 EL2 EL3 EXT2 EXT3 FAT FD IDE-CD/DISK/TAPE IMM INITRD ISO9660 JOLIET LOOP MATH_EMULATION MINIX MSDOS NCR53C8XX NE2000 NFS NTFS PARPORT PCINE2K PCNET32 PLIP PPA RTL8139 SD SERIAL/_CONSOLE SLIP SMC_ULTRA SR ST VFAT VID_SELECT VORTEX WD80x3 .exrc 3c589_cs agetty ash badblocks basename boot.b buildit.s busybox bz2bzImage bzip2 cardmgr cardmgr.pid cat chain.b chattr chgrp chmod chown chroot clear clone.s cmp common config cp cpio cs cut date dd dd-lfs debugfs ddate df dhcpcd-- dirname dmesg domainname ds du dumpe2fs e2fsck echo egrep elvis ex false fdflush fdformat fdisk filesize find findsuper fmt fstab grep group gunzip gzip halt head hexdump hexedit host.conf hostname hosts httpd i82365 ifconfig ile init inittab insmod install.s issue kernel key.lst kill killall killall5 ld ld-linux length less libc libcom_err libe2p libext2fs libtermcap libuuid lilo lilo.conf ln loadkmap login ls lsattr lsmod lua luasocket man map md5sum miterm mkdir mkdosfs mke2fs mkfifo mkfs.minix mknod mkswap more more.help mount mt mtab mv nc necho network networks nmclan_cs nslookup passwd pax pcmcia_core pcnet_cs pidof ping poweroff printf profile protocols ps pwd rc.0 rc.S rc.custom rc.custom.gz rc.pcmcia reboot rescuept reset resolv.conf rm rmdir rmmod route rsh rshd script sed serial serial_cs services setserial settings.s sh shared slattach sleep sln sort split stab strings swapoff swapon sync tail tar tcic tee telnet telnetd termcap test tomshexd tomsrtbt.FAQ touch traceroute true tune2fs umount undeb-- unpack.s unrpm-- update utmp vi vi.help view watch wc wget which xargs xirc2ps_cs yecho yes zcat

In addition, at Tom's Web Site, you can find a lot of additional kernel drivers and other software (such as sfdisk, which is used by Bacula).

Building his floppy is a piece of cake. Simply download his .tar.gz file then:

- detar the .tar.gz archive
- become root
- cd to the tomsrtbt-<version> directory
- load a blank floppy with no bad sectors
- ./install.s


2005-06-01