Without proper setup and maintenance, your Catalog may continue to grow indefinitely as you run Jobs and back up Files. How fast your Catalog grows depends on the number of Jobs you run and how many files they back up. By deleting records within the database, you can make space available for the new records that will be added during the next Job. By regularly deleting expired records (those older than the Retention period), you can keep your database size roughly constant.
If you started with the default configuration files, they already contain reasonable defaults for a small number of machines (fewer than 5), so if you fall into that case, catalog maintenance will not be urgent if you have a few hundred megabytes of disk space free. Whatever the case may be, some knowledge of retention periods will be useful.
Bacula uses three Retention periods: the File Retention period, the Job Retention period, and the Volume Retention period. Of these three, the File Retention period is by far the most important in determining how large your database will become.
The File Retention and the Job Retention are specified in each Client resource as is shown below. The Volume Retention period is specified in the Pool resource, and the details are given in the next chapter of this manual.
The File Retention period is specified in seconds, but as a convenience, there are a number of modifiers that permit easy specification in terms of minutes, hours, days, weeks, months, quarters, or years. See the Configuration chapter of this manual for additional details of modifier specification.
The default File Retention period is 60 days.
The Job Retention period is also specified in seconds, and the same modifiers (minutes, hours, days, weeks, months, quarters, or years) may be used. See the Configuration chapter of this manual for additional details of modifier specification.
The default Job Retention period is 180 days.
Automatic pruning is controlled by the AutoPrune directive. If you turn it off by setting it to no, your Catalog will grow each time you run a Job.
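To make the relationship between the modifiers and the underlying seconds concrete, here is a minimal Python sketch that converts a specification such as "60 days" into seconds. The helper name retention_seconds is hypothetical, and the lengths used for months, quarters, and years (30, 91, and 365 days) are assumptions; the Configuration chapter is the authority on the exact values and accepted syntax.

```python
# Seconds per modifier. The month, quarter, and year lengths (30, 91, and
# 365 days) are assumptions; check the Configuration chapter for the
# values your Bacula version actually uses.
SECONDS_PER_UNIT = {
    "seconds": 1,
    "minutes": 60,
    "hours": 60 * 60,
    "days": 24 * 60 * 60,
    "weeks": 7 * 24 * 60 * 60,
    "months": 30 * 24 * 60 * 60,
    "quarters": 91 * 24 * 60 * 60,
    "years": 365 * 24 * 60 * 60,
}


def retention_seconds(spec):
    """Convert a specification such as '60 days' to seconds (hypothetical helper)."""
    parts = spec.split()
    if len(parts) == 1:
        # A bare number is taken to be a count of seconds.
        return int(parts[0])
    if len(parts) != 2:
        raise ValueError("expected '<number> <modifier>', got %r" % spec)
    count, unit = parts
    unit = unit.lower()
    if not unit.endswith("s"):
        unit += "s"  # accept singular forms such as '1 week'
    return int(count) * SECONDS_PER_UNIT[unit]


print(retention_seconds("60 days"))   # the 60-day default
print(retention_seconds("180 days"))  # the 180-day default
```

This only illustrates the arithmetic; Bacula does its own parsing of the configuration file.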
Over time, as noted above, your database will tend to grow. I've noticed that even though Bacula regularly prunes files, MySQL does not effectively reuse the freed space, and the database files instead continue growing. To avoid this, from time to time, you must compact your database. Normally, large commercial databases such as Oracle have commands that will compact a database to reclaim wasted file space. MySQL has the OPTIMIZE TABLE command that you can use, and SQLite version 2.8.4 and greater has the VACUUM command. We leave it to you to explore the utility of the OPTIMIZE TABLE command in MySQL.
All database programs have some means of writing the database out in ASCII format and then reloading it. Doing so will re-create the database from scratch producing a compacted result, so below, we show you how you can do this for both MySQL and SQLite.
For a MySQL database, you could write the Bacula database as an ASCII file (bacula.sql) then reload it by doing the following:
mysqldump -f --opt bacula > bacula.sql
mysql bacula < bacula.sql
rm -f bacula.sql
Depending on the size of your database, this will take more or less time and a fair amount of disk space. For example, if I cd to the location of the MySQL Bacula database (typically /opt/mysql/var or something similar) and enter:
du bacula
I get 620,644, which means there are that many blocks of 1024 bytes each, or approximately 635 MB of data. After doing the mysqldump, I had a bacula.sql file that had 174,356 blocks, and after doing the mysql command to recreate the database, I ended up with a total of 210,464 blocks rather than the original 620,644. In other words, the compressed version of the database took approximately one third of the space of the database that had been in use for about a year.
As a consequence, I suggest you monitor the size of your database and from time to time (once every 6 months or year), compress it.
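The block counts quoted above can be checked with a few lines of arithmetic. The following Python sketch assumes, as stated in the text, that du reports blocks of 1024 bytes, and it takes 620,644 blocks as the size before compaction; the function name blocks_to_megabytes is only illustrative.

```python
BLOCK_SIZE = 1024  # bytes per block reported by du, as stated in the text


def blocks_to_megabytes(blocks):
    """Convert a du block count to megabytes (1 MB taken as 1,000,000 bytes)."""
    return blocks * BLOCK_SIZE / 1000000


blocks_before = 620644  # database directory before the dump and reload
blocks_dump = 174356    # the ASCII bacula.sql file
blocks_after = 210464   # database directory after the reload

print("before: %.1f MB" % blocks_to_megabytes(blocks_before))
print("dump:   %.1f MB" % blocks_to_megabytes(blocks_dump))
print("after:  %.1f MB" % blocks_to_megabytes(blocks_after))
print("after / before = %.2f" % (blocks_after / blocks_before))
```

The final ratio comes out at roughly one third, which matches the observation above.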
If you find that you are getting errors writing to your MySQL database, or Bacula hangs each time it tries to access the database, you should consider running MySQL's database check and repair routines. The program you need to run depends on the type of database indexing you are using. If you are using the default, you will probably want to use myisamchk. For more details on how to do this, please consult the MySQL document at: http://www.mysql.com/doc/en/Repair.html.
If the errors you are getting are simply SQL warnings, then you might try running dbcheck before (or possibly after) using the MySQL database repair program. It can clean up many of the orphaned record problems, and certain other inconsistencies in the Bacula database.
The same considerations apply that are indicated above for MySQL. That is, consult the PostgreSQL documentation for how to repair the database, and also consider using Bacula's dbcheck program if the conditions are appropriate for its use (see above).
Over time, as noted above, your database will tend to grow, even though Bacula regularly prunes files. PostgreSQL has a VACUUM command that will compact your database for you. Alternatively you may want to use the vacuumdb command, which can be run from a cron job.
All database programs have some means of writing the database out in ASCII format and then reloading it. Doing so will re-create the database from scratch producing a compacted result, so below, we show you how you can do this for PostgreSQL.
For a PostgreSQL database, you could write the Bacula database as an ASCII file (bacula.sql) then reload it by doing the following:
pg_dump bacula > bacula.sql
cat bacula.sql | psql bacula
rm -f bacula.sql
Depending on the size of your database, this will take more or less time and a fair amount of disk space. For example, you can cd to the location of the Bacula database (typically /usr/local/pgsql/data or possibly /var/lib/pgsql/data) and check the size.
First, please read the previous section that explains why it is necessary to compress a database. SQLite version 2.8.4 and greater has the VACUUM command for compacting the database.
cd working-directory
echo 'vacuum;' | sqlite bacula.db
As an alternative, you can use the following commands, adapted to your system:
cd working-directory
echo '.dump' | sqlite bacula.db > bacula.sql
rm -f bacula.db
sqlite bacula.db < bacula.sql
rm -f bacula.sql
Where working-directory is the directory that you specified in the Director's configuration file. Note, in the case of SQLite, it is necessary to completely delete (rm) the old database before creating a new compressed version.
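To see the effect of VACUUM without touching a real catalog, the following Python sketch builds a throwaway SQLite database, deletes its rows, and compares the file size before and after compaction. It is only an illustration under stated assumptions: it uses Python's standard sqlite3 module (SQLite version 3, not the older sqlite command shown above), and the file name demo.db, the File table layout, and both function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def build_demo_database(path, rows=20000):
    """Create a throwaway database, fill one table, then delete every row."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE File (FileId INTEGER PRIMARY KEY, Name TEXT)")
        conn.executemany(
            "INSERT INTO File (Name) VALUES (?)",
            (("/some/path/file%d" % i,) for i in range(rows)),
        )
        conn.commit()
        # Deleting rows frees pages inside the file but does not shrink it.
        conn.execute("DELETE FROM File")
        conn.commit()
    finally:
        conn.close()


def vacuum_database(path):
    """Run VACUUM on the file at path; return (bytes_before, bytes_after)."""
    size_before = os.path.getsize(path)
    conn = sqlite3.connect(path)
    try:
        conn.execute("VACUUM")  # rebuilds the file without the free pages
    finally:
        conn.close()
    return size_before, os.path.getsize(path)


with tempfile.TemporaryDirectory() as workdir:
    db_path = os.path.join(workdir, "demo.db")
    build_demo_database(db_path)
    before, after = vacuum_database(db_path)
    print("before VACUUM: %d bytes, after VACUUM: %d bytes" % (before, after))
```

Stop Bacula before compacting a real catalog so that nothing is writing to the database while it is being rebuilt.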
You may begin using Bacula with SQLite then later find that you want to switch to MySQL for any of a number of reasons: SQLite tends to use more disk than MySQL, SQLite apparently does not handle database sizes greater than 2 GBytes, and so on. Several users have done so by first producing an ASCII "dump" of the SQLite database, then creating the MySQL tables with the create_mysql_tables script that comes with Bacula, and finally feeding the SQLite dump into MySQL using the -f command line option to continue past the errors that are generated by the DDL statements that SQLite's dump creates. Of course, you could edit the dump and remove the offending statements. Otherwise, MySQL accepts the SQL produced by SQLite.
If the machine on which your Bacula database resides ever crashes, and you need to restore from backup tapes, one of your first priorities will probably be to recover the database. Although Bacula will happily back up your catalog database if it is specified in the FileSet, this is not a very good way to do it because the database will be saved while Bacula is modifying it. Thus the database may be in an unstable state. Worse yet, you will back up the database before all the Bacula updates have been applied.
To resolve these problems, you need to back up the database after all the backup jobs have been run. In addition, you will want to make a copy while Bacula is not modifying it. To do so, you can use two scripts provided in the release: make_catalog_backup and delete_catalog_backup. These files will be automatically generated along with all the other Bacula scripts. The first script will make an ASCII copy of your Bacula database into bacula.sql in the working directory you specified in your configuration, and the second will delete the bacula.sql file.
The basic sequence of events to make this work correctly is as follows:
Assuming that you start all your nightly backup jobs at 1:05 am (and that they run one after another), you can do the catalog backup with the following additional Director configuration statements:
# Backup the catalog database (after the nightly save)
Job {
  Name = "BackupCatalog"
  Type = Backup
  Client=rufus-fd
  FileSet="Catalog"
  Schedule = "WeeklyCycleAfterBackup"
  Storage = DLTDrive
  Messages = Standard
  Pool = Default
  RunBeforeJob = "/home/kern/bacula/bin/make_catalog_backup"
  RunAfterJob  = "/home/kern/bacula/bin/delete_catalog_backup"
}
# This schedule does the catalog. It starts after the WeeklyCycle
Schedule {
  Name = "WeeklyCycleAfterBackup"
  Run = Full sun-sat at 1:10
}
# This is the backup of the catalog
FileSet {
  Name = "Catalog"
  Include = signature=MD5 {
     @working_directory@/bacula.sql
  }
}
If you are running a database in production mode on your machine, Bacula will happily back up the files, but if the database is in use while Bacula is reading it, you may back it up in an unstable state.
The best solution is to shut down your database before backing it up, or use some tool specific to your database to make a valid live copy, perhaps by dumping the database in ASCII format. I am not a database expert, so I cannot provide you advice on how to do this, but if you are unsure about how to back up your database, you might try visiting the Backup Central site, which has been renamed Storage Mountain (www.backupcentral.com). In particular, their Free Backup and Recovery Software page has links to scripts that show you how to shut down and back up most major databases.
As mentioned above, if you do not do automatic pruning, your Catalog will grow each time you run a Job. Normally, you should decide how long you want File records to be maintained in the Catalog and set the File Retention period to that time. Then you can either wait and see how big your Catalog gets or make a calculation assuming approximately 154 bytes for each File saved and knowing the number of Files that are saved during each backup and the number of Clients you back up.
For example, suppose you do a backup of two systems, each with 100,000 files. Suppose further that you do a Full backup weekly and an Incremental every day, and that the Incremental backup typically saves 10,000 files. The size of your database after a month can roughly be calculated as:
Size = 154 * No. Systems * (100,000 * 4 + 10,000 * 26)
where we have assumed 4 weeks in a month and 26 incremental backups per month. This would give the following:
Size = 154 * 2 * (100,000 * 4 + 10,000 * 26)
or
Size = 308 * (400,000 + 260,000)
or
Size = 203,280,000 bytes
So for the above two systems, we should expect to have a database size of approximately 200 Megabytes. Of course, this will vary according to how many files are actually backed up.
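If you want to try the estimate with your own numbers, the formula above can be written as a small Python function. This is a sketch only: the function and parameter names are illustrative, and the figure of 154 bytes per File record is the approximation used in this chapter, not a guaranteed value.

```python
def estimate_catalog_bytes(systems, files_per_full, fulls_per_month,
                           files_per_incremental, incrementals_per_month,
                           bytes_per_file=154):
    """Estimate one month of Catalog growth in bytes, following the formula above."""
    file_records_per_system = (files_per_full * fulls_per_month
                               + files_per_incremental * incrementals_per_month)
    return bytes_per_file * systems * file_records_per_system


# The worked example: two systems, 4 Full backups of 100,000 files each,
# and 26 Incremental backups of 10,000 files each.
size = estimate_catalog_bytes(
    systems=2,
    files_per_full=100000,
    fulls_per_month=4,
    files_per_incremental=10000,
    incrementals_per_month=26,
)
print(size)  # 203280000
```

Changing bytes_per_file lets you redo the estimate with a per-record size measured on your own Catalog.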
Below are some statistics for a MySQL database containing Job records for five Clients beginning September 2001 through May 2002 (8.5 months) and File records for the last 80 days. (Older File records have been pruned). For these systems, only the user files and system files that change are backed up. The core part of the system is assumed to be easily reloaded from the RedHat rpms.
In the list below, the files (corresponding to Bacula Tables) with the extension .MYD contain the data records whereas files with the extension .MYI contain indexes.
You will note that the File records (containing the file attributes) make up the bulk of the number of records as well as the space used (459 Megabytes including the indexes). As a consequence, the most important Retention period will be the File Retention period. A quick calculation shows that for each File that is saved, the database grows by approximately 150 bytes.
Size in Bytes     Records   File
=============   =========   ============
          168           5   Client.MYD
        3,072               Client.MYI
  344,394,684   3,080,191   File.MYD
  115,280,896               File.MYI
    2,590,316     106,902   Filename.MYD
    3,026,944               Filename.MYI
          184           4   FileSet.MYD
        2,048               FileSet.MYI
       49,062       1,326   JobMedia.MYD
       30,720               JobMedia.MYI
      141,752       1,378   Job.MYD
       13,312               Job.MYI
        1,004          11   Media.MYD
        3,072               Media.MYI
    1,299,512      22,233   Path.MYD
      581,632               Path.MYI
           36           1   Pool.MYD
        3,072               Pool.MYI
            5           1   Version.MYD
        1,024               Version.MYI
This database has a total size of approximately 450 Megabytes.
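The figure of roughly 150 bytes per saved File can be reproduced from the File.MYD and File.MYI sizes and the File record count listed above. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
file_data_bytes = 344394684   # File.MYD (data records)
file_index_bytes = 115280896  # File.MYI (indexes)
file_records = 3080191        # number of File records

file_table_bytes = file_data_bytes + file_index_bytes
bytes_per_file = file_table_bytes / file_records

print("File table, data plus indexes: %d bytes" % file_table_bytes)
print("Bytes per File record: %.1f" % bytes_per_file)
```

The result is close to the 154 bytes per File used in the estimate earlier in this chapter; that estimate is only an approximation.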
If we were using SQLite, the determination of the total database size would be much easier since it is a single file, but we would have less insight into the size of the individual tables than we have in this case.
Note, SQLite databases may be as much as 50% larger than MySQL databases because all data is stored as ASCII strings. That is, even binary integers are stored as ASCII strings, and this seems to increase the space needed.