Not too many years ago, US government constraints such as those imposed by HIPAA (www.hhs.gov/ocr/hipaa) were not in place. Companies such as Enron were still in business, and another US government requirement for Sarbanes-Oxley compliance did not exist. Back then, auditing was considered something that “we might do someday, maybe.” Today, however, auditing is at the forefront, and many DBAs are challenged to retain online up to seven years of audit trail information for their financial, business, and health-care databases.
Audit trail information is the one piece of data in your database that you might well insert but never retrieve during the normal course of operation. It is there predominantly as a forensic, after-the-fact trail of evidence. We need to have it, but from many perspectives, it is just something that sits on our disks and consumes space—lots and lots of space.
And then every month or year or some other time interval, we have to purge or archive it. Auditing is something that if not properly designed from the beginning can kill you at the end. Seven years from now when you are faced with your first purge or archive of the old data is not when you want to be thinking about how to accomplish it. Unless you designed for it, getting that old information out is going to be painful.
Enter two technologies that make auditing not only bearable but also pretty easy to manage and consume less space. These technologies are partitioning and segment space compression, as we discussed in Chapter 10. That second one might not be as obvious since basic segment space compression only works with large bulk operations like a direct path load (OLTP compression is a feature of the Advanced Compression Option—not available with all database editions), and audit trails are typically inserted into a row at a time, as events happen. The trick is to combine sliding window partitions with segment space compression.
Suppose we decide to partition the audit trail by month. During the first month of business, we just insert into the partitioned table; these inserts go in using the conventional path, not a direct path, and hence are not compressed.
Now, before the month ends, we’ll add a new partition to the table to accommodate next month’s auditing activity. Shortly after the beginning of next month, we will perform a large bulk operation on last month’s audit trail—specifically, we’ll use the ALTER TABLE command to move last month’s partition, which will have the effect of compressing the data as well.
If we, in fact, take this a step further, we could move this partition from a read-write tablespace, which it must have been in, into a tablespace that is normally read-only (and contains other partitions for this audit trail). In that fashion, we can back up that tablespace once a month, after we move the partition in there; ensure we have a good, clean, current readable copy of the tablespace; and then not back it up anymore that month.
We might have the following tablespaces for our audit trail:
•\ A current online, read-write tablespace that gets backed up like every other normal tablespace in our system: The audit trail information in this tablespace is not compressed, and it is constantly inserted into.
•\ A read-only tablespace containing “this year to date” audit trail partitions in a compressed format: At the beginning of each month, we make this tablespace read-write, move and compress last month’s audit information into this tablespace, make it read-only again, and back it up.
•\ A series of tablespaces for last year, the year before, and so on: These are all read-only and might even be on slow, cheap media. In the event of a media failure, we just need to restore from backup. We would occasionally pick a year at random from our backup sets to ensure they are still restorable (tapes go bad sometimes).
In this fashion, we have made purging easy (i.e., drop a partition). We have made archiving easy, too—we could just transport a tablespace off and restore it later. We have reduced our space utilization by implementing compression. We have reduced our backup volumes, as in many systems, the single largest set of data is audit trail data. If you can remove some or all of that from your day-to-day backups, the difference will be measurable.
In short, audit trail requirements and partitioning are two things that go hand in hand, regardless of the underlying system type, be it data warehouse or OLTP.
Tip Consider using Oracle’s Flashback Data Archive feature for auditing requirements. When enabled for a table, the Flashback Data Archive will automatically create an underlying partitioned table to record transactional information.