Summary-Partitioning

Partitioning is extremely useful for scaling up large database objects. This scaling is visible from three perspectives: performance, availability, and administration. All three are extremely important to different people.

The DBA is concerned with administrative scaling. The owners of the system are concerned with availability, because downtime is lost money, and anything that reduces downtime—or reduces the impact of downtime—boosts the payback for a system.

The end users of the system are concerned with performance scaling. No one likes to use a slow system, after all.

We also looked at the fact that in an OLTP system, partitions may not increase performance, especially if applied improperly. Partitions can increase the performance of certain classes of queries, but those classes of queries are generally not found in an OLTP system.

This point is important to understand, as many people associate partitioning with “free performance increase.” This does not mean that partitions should not be used in OLTP systems—they do provide many other salient benefits in this environment—just don’t expect a massive increase in throughput.

Expect reduced downtime. Expect the same good performance (partitioning will not slow you down when applied appropriately). Expect easier manageability, which may itself lead to increased performance, because DBAs will perform some maintenance operations more frequently when partitioning makes them practical.

We investigated the various table partitioning schemes offered by Oracle—range, hash, list, interval, reference, interval reference, virtual column, and composite—and talked about when they are most appropriately used.

We spent the bulk of our time looking at partitioned indexes and examining the differences between prefixed and nonprefixed and local and global indexes. We investigated partition operations in data warehouses combined with global indexes, and the trade-off between resource consumption and availability.

We also looked at the ease of maintenance features such as the ability to perform maintenance operations on multiple partitions at a time, cascade truncate, and cascade exchange. Oracle continues to update and improve partitioning with each new release.

Over time, I see this feature becoming more relevant to a broader audience as the size and scale of database applications grow.

The Internet and its database-hungry nature, along with legislation requiring longer retention of audit data, are leading to more and more extremely large collections of data, and partitioning is a natural tool for managing that problem.

Auditing and Segment Space Compression-Partitioning

Not too many years ago, US government constraints such as those imposed by HIPAA (www.hhs.gov/ocr/hipaa) were not in place. Companies such as Enron were still in business, and another US government requirement for Sarbanes-Oxley compliance did not exist. Back then, auditing was considered something that “we might do someday, maybe.” Today, however, auditing is at the forefront, and many DBAs are challenged to retain online up to seven years of audit trail information for their financial, business, and health-care databases.

Audit trail information is the one piece of data in your database that you might well insert but never retrieve during the normal course of operation. It is there predominantly as a forensic, after-the-fact trail of evidence. We need to have it, but from many perspectives, it is just something that sits on our disks and consumes space—lots and lots of space.
And then every month or year or some other time interval, we have to purge or archive it. Auditing is something that if not properly designed from the beginning can kill you at the end. Seven years from now when you are faced with your first purge or archive of the old data is not when you want to be thinking about how to accomplish it. Unless you designed for it, getting that old information out is going to be painful.

Enter two technologies that make auditing not only bearable but also pretty easy to manage and less of a space consumer. These technologies are partitioning and segment space compression, as we discussed in Chapter 10. That second one might not be as obvious, since basic segment space compression only works with large bulk operations like a direct path load (OLTP compression is a feature of the Advanced Compression Option—not available with all database editions), and audit trail records are typically inserted one row at a time, as events happen. The trick is to combine sliding window partitions with segment space compression.

Suppose we decide to partition the audit trail by month. During the first month of business, we just insert into the partitioned table; these inserts go in using the conventional path, not a direct path, and hence are not compressed.

Now, before the month ends, we’ll add a new partition to the table to accommodate next month’s auditing activity. Shortly after the beginning of next month, we will perform a large bulk operation on last month’s audit trail—specifically, we’ll use the ALTER TABLE command to move last month’s partition, which will have the effect of compressing the data as well.
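
A minimal sketch of those two steps follows (the AUDIT_TRAIL table and partition names here are hypothetical; COMPRESS invokes basic table compression):

SQL> -- before month end: add a partition for next month's records
SQL> alter table audit_trail add partition aud_2022_07
     values less than (to_date('01-aug-2022','dd-mon-yyyy'));
SQL> -- shortly after month end: compress last month's data via a move
SQL> alter table audit_trail move partition aud_2022_06 compress;

Note that moving a partition will mark affected index partitions unusable, so a rebuild (or the UPDATE INDEXES clause on the move) may be needed afterward.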

If we, in fact, take this a step further, we could move this partition from a read-write tablespace, which it must have been in, into a tablespace that is normally read-only (and contains other partitions for this audit trail). In that fashion, we can back up that tablespace once a month, after we move the partition in there; ensure we have a good, clean, current readable copy of the tablespace; and then not back it up anymore that month.
We might have the following tablespaces for our audit trail:

•\ A current online, read-write tablespace that gets backed up like every other normal tablespace in our system: The audit trail information in this tablespace is not compressed, and it is constantly inserted into.
•\ A read-only tablespace containing “this year to date” audit trail partitions in a compressed format: At the beginning of each month, we make this tablespace read-write, move and compress last month’s audit information into this tablespace, make it read-only again, and back it up (see the sketch following this list).
•\ A series of tablespaces for last year, the year before, and so on: These are all read-only and might even be on slow, cheap media. In the event of a media failure, we just need to restore from backup. We would occasionally pick a year at random from our backup sets to ensure they are still restorable (tapes go bad sometimes).
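
At the start of each month, the routine for that read-only tablespace might look like the following sketch (the tablespace, table, and partition names are hypothetical):

SQL> alter tablespace audit_ytd read write;
SQL> alter table audit_trail move partition aud_last_month
     tablespace audit_ytd compress update indexes;
SQL> alter tablespace audit_ytd read only;

We would then back up AUDIT_YTD once and exclude it from the daily backups for the rest of the month.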

In this fashion, we have made purging easy (i.e., drop a partition). We have made archiving easy, too—we could just transport a tablespace off and restore it later. We have reduced our space utilization by implementing compression. We have reduced our backup volumes because, in many systems, the single largest set of data is audit trail data. If you can remove some or all of that from your day-to-day backups, the difference will be measurable.

In short, audit trail requirements and partitioning are two things that go hand in hand, regardless of the underlying system type, be it data warehouse or OLTP.

Tip Consider using Oracle’s Flashback Data Archive feature for auditing requirements. When enabled for a table, the Flashback Data Archive will automatically create an underlying partitioned table to record transactional information.
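
Enabling a Flashback Data Archive is a short exercise; here is a sketch, with a hypothetical archive name, tablespace, and retention period:

SQL> create flashback archive fda_audit tablespace fda_ts retention 7 year;
SQL> alter table audit_trail flashback archive fda_audit;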

Partial Indexes-Partitioning

Oracle allows you to create either local or global indexes on a subset of partitions in a table. You may want to do this if you’ve pre-created partitions and don’t yet have data for range partitions that map to future dates—the idea being that you’ll build the index after the partitions have been loaded (at some future date).

You set up the use of a partial index by first specifying INDEXING ON|OFF for each partition in the table. In this next example, PART_1 has indexing turned on and PART_2 has indexing turned off:

$ sqlplus eoda/foo@PDB1
SQL> CREATE TABLE p_table (a int)
     PARTITION BY RANGE (a)
     (PARTITION part_1 VALUES LESS THAN (1000) INDEXING ON,
      PARTITION part_2 VALUES LESS THAN (2000) INDEXING OFF);
Table created.

Next, a partial local index is created:
SQL> create index pi1 on p_table(a) local indexing partial; Index created.

In this scenario, the INDEXING PARTIAL clause instructs Oracle to only build and make usable local index partitions that point to partitions in the table that were defined with INDEXING ON. In this case, one usable index partition will be created with index entries pointing to data in the PART_1 table partition:

SQL> select a.index_name, a.partition_name, a.status
     from user_ind_partitions a, user_indexes b
     where b.table_name = 'P_TABLE'
     and a.index_name = b.index_name;
INDEX_NAME PARTITION_NAME STATUS
PI1 PART_2 UNUSABLE
PI1 PART_1 USABLE

Next, we’ll insert some test data, generate statistics, set autotrace on, and run a query that should locate data in the PART_1 partition:
SQL> insert into p_table select rownum from dual connect by level < 2000;
1999 rows created.
SQL> exec dbms_stats.gather_table_stats(user,'P_TABLE');
PL/SQL procedure successfully completed.
SQL> explain plan for select * from p_table where a = 20;
Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

0 SELECT STATEMENT
1 PARTITION RANGE SINGLE
2 INDEX RANGE SCAN

As expected, the optimizer was able to generate an execution plan utilizing the index. Next, a query is issued that selects data from the partition defined with INDEXING OFF:
SQL> explain plan for select * from p_table where a = 1500;
Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

0 SELECT STATEMENT
1 PARTITION RANGE SINGLE
2 TABLE ACCESS FULL

The output shows a full table scan of PART_2 was required, as there is no usable index with entries pointing at data in PART_2. We can instruct Oracle to create index entries pointing to data in PART_2 by rebuilding the index partition associated with the PART_2 partition:
SQL> alter index pi1 rebuild partition part_2; Index altered.

Rerunning the previous select query shows that the optimizer is now utilizing the local partitioned index pointing to the PART_2 table partition:
0 SELECT STATEMENT
1 PARTITION RANGE SINGLE
2 INDEX RANGE SCAN

In this way, partial indexes allow you to disable the index while the table partition is being loaded (increasing the loading speed), and then later you can rebuild the partial index to make it available.
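
Putting the pattern together, here is a sketch using the example table (the third partition and its bound are hypothetical):

SQL> alter table p_table add partition part_3
     values less than (3000) indexing off;
SQL> -- ...bulk load the new partition here...
SQL> alter index pi1 rebuild partition part_3;

The load proceeds without index maintenance overhead, and the rebuild afterward makes the index partition usable.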

OLTP and Global Indexes-Partitioning-2

This shows the distribution of data, by location, into the individual partitions. We can now review some query plans to see what we could expect performance-wise:

SQL> variable x varchar2(30);
SQL> begin
       dbms_stats.set_table_stats( user, 'EMP', numrows=>100000, numblks=>10000 );
     end;
     /
PL/SQL procedure successfully completed.
SQL> explain plan for select empno, job, loc from emp where empno = :x; Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

The plan here shows an INDEX UNIQUE SCAN of the nonpartitioned index EMP_PK that was created in support of our primary key. Then there is a TABLE ACCESS BY GLOBAL INDEX ROWID, with a PSTART and PSTOP of ROWID/ROWID, meaning that when we get the ROWID from the index, it will tell us precisely which table partition to read to get this row. This index access will be as effective as on a nonpartitioned table and will perform the same amount of I/O. It is just a simple, single index unique scan followed by “get this row by rowid.” Now, let’s look at one of the other global indexes, the one on JOB:

SQL> explain plan for select empno, job, loc from emp where job = :x; Explained.
SQL> select * from table(dbms_xplan.display);

Sure enough, we see a similar effect for the INDEX RANGE SCAN. Our indexes are used and can provide high-speed OLTP access to the underlying data. If they were partitioned, they would have to be prefixed and enforce index partition elimination; hence, they are scalable as well, meaning we can partition them and observe the same behavior. In a moment, we’ll look at what would happen if we used LOCAL indexes only.

Lastly, let’s look at the area of availability. The Oracle documentation claims that globally partitioned indexes make for less available data than locally partitioned indexes. I don’t fully agree with this blanket characterization. I believe that in an OLTP system they are as highly available as a locally partitioned index. Consider the following:

SQL> alter tablespace p1 offline;
SQL> alter tablespace p2 offline;
SQL> alter tablespace p3 offline;
SQL> select empno, job, loc from emp where empno = 7782;
EMPNO JOB LOC
7782 MANAGER NEW YORK

Here, even though most of the underlying data is unavailable in the table, we can still gain access to any bit of data available via that index. As long as the EMPNO we want is in a tablespace that is available, and our GLOBAL index is available, our GLOBAL index works for us. On the other hand, if we had been using the highly available local index in the preceding case, we might have been prevented from accessing the data! This is a side effect of the fact that we partitioned on LOC but needed to query by EMPNO. We would have had to probe each local index partition and would have failed on the index partitions that were not available.

Other types of queries, however, will not (and cannot) function at this point in time:
SQL> select empno, job, loc from emp where job = 'CLERK';
ERROR at line 1:
ORA-00376: file 38 cannot be read at this time
ORA-01110: data file 38: '/opt/oracle/oradata/CDB/C217E68DF48779E1E0530101007F73B9/datafile/o1_mf_p2_jcbnhfh2_.dbf'

The CLERK data is in all of the partitions, and the fact that three of the tablespaces are offline does affect us. This is unavoidable unless we had partitioned on JOB, but then we would have had the same issues with queries that needed data by LOC. Anytime you need to access the data from many different keys, you will have this issue. Oracle will give you the data whenever it can.

Note, however, that if the query can be answered from the index, avoiding the TABLE ACCESS BY ROWID, the fact that the data is unavailable is not as meaningful:
SQL> select count(*) from emp where job = 'CLERK';
  COUNT(*)

Since Oracle didn’t need the table in this case, the fact that most of the partitions were offline doesn’t affect this query (assuming the index isn’t in one of the offline tablespaces of course). As this type of optimization (i.e., answer the query using just the index) is common in an OLTP system, there will be many applications that are not affected by the data that is offline. All we need to do now is make the offline data available as fast as possible (restore it and recover it).
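
In this example, where the tablespaces were simply taken offline, making the data available again is just a matter of bringing them back online (after a restore and recover, in the case of a true media failure):

SQL> alter tablespace p1 online;
SQL> alter tablespace p2 online;
SQL> alter tablespace p3 online;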

OLTP and Global Indexes-Partitioning-1

An OLTP system is characterized by the frequent occurrence of many small read and write transactions. In general, fast access to the row (or rows) you need is paramount. Data integrity is vital. Availability is also very important. Global indexes make sense in many cases in OLTP systems. Table data can be partitioned by only one key—one set of columns. However, you may need to access the data in many different ways. You might partition EMPLOYEE data by LOCATION in the table, but you still need fast access to EMPLOYEE data by

•\ DEPARTMENT: Departments are geographically dispersed. There is no relationship between a department and a location.
•\ EMPLOYEE_ID: While an employee ID will determine a location, you don’t want to have to search by EMPLOYEE_ID and LOCATION; hence, partition elimination cannot take place on the index partitions. Also, EMPLOYEE_ID by itself must be unique.
•\ JOB_TITLE: There is no relationship between JOB_TITLE and LOCATION. All JOB_TITLE values may appear in any LOCATION.

There is a need to access the EMPLOYEE data by many different keys in different places in the application, and speed is paramount. In a data warehouse, we might just use locally partitioned indexes on these keys and use parallel index range scans to collect a large amount of data fast. In these cases, we don’t necessarily need to use index partition elimination. In an OLTP system, however, we do need to use it. Parallel query is not appropriate for these systems; we need to provide the right indexes instead. Therefore, we will need to make use of global indexes on certain fields.

The following are the goals we need to meet:
•\ Fast access
•\ Data integrity
•\ Availability

Global indexes can help us accomplish these goals in an OLTP system. We will probably not be doing sliding windows, auditing aside for a moment. We will not be splitting partitions (unless we have a scheduled downtime), we will not be moving data, and so on. The operations we perform in a data warehouse are not done on a live OLTP system in general.

Here is a small example that shows how we can achieve the three goals just listed with global indexes. I am going to use simple, single-partition global indexes, but the results would not be different with global indexes in multiple partitions (except that availability and manageability would increase as we added index partitions). We start by creating tablespaces P1, P2, P3, and P4, then create a table that is range partitioned by location, LOC, according to our rules, which place all LOC values less than 'C' into partition P1, those less than 'D' into partition P2, and so on:

$ sqlplus eoda/foo@PDB1
SQL> create tablespace p1 datafile size 1m autoextend on next 1m; Tablespace created.
SQL> create tablespace p2 datafile size 1m autoextend on next 1m; Tablespace created.
SQL> create tablespace p3 datafile size 1m autoextend on next 1m; Tablespace created.
SQL> create tablespace p4 datafile size 1m autoextend on next 1m; Tablespace created.
SQL> create table emp
     (EMPNO NUMBER(4) NOT NULL,
      ENAME VARCHAR2(10),
      JOB VARCHAR2(9),
      MGR NUMBER(4),
      HIREDATE DATE,
      SAL NUMBER(7,2),
      COMM NUMBER(7,2),
      DEPTNO NUMBER(2) NOT NULL,
      LOC VARCHAR2(13) NOT NULL)
     partition by range(loc)
     (partition p1 values less than('C') tablespace p1,
      partition p2 values less than('D') tablespace p2,
      partition p3 values less than('N') tablespace p3,
      partition p4 values less than('Z') tablespace p4);
Table created.

We alter the table to add a constraint on the primary key column:
SQL> alter table emp add constraint emp_pk primary key(empno); Table altered.

A side effect of this is that there exists a unique index on the EMPNO column. This shows we can support and enforce data integrity, one of our goals. Finally, we create two more global indexes on DEPTNO and JOB to facilitate accessing records quickly by those attributes:

SQL> create index emp_job_idx on emp(job) GLOBAL; Index created.
SQL> create index emp_dept_idx on emp(deptno) GLOBAL; Index created.
SQL> insert into emp
     select e.*, d.loc
     from scott.emp e, scott.dept d
     where e.deptno = d.deptno;
14 rows created.

Let’s see what is in each partition:
SQL> break on pname skip 1
SQL> select 'p1' pname, empno, job, loc from emp partition(p1) union all
     select 'p2' pname, empno, job, loc from emp partition(p2) union all
     select 'p3' pname, empno, job, loc from emp partition(p3) union all
     select 'p4' pname, empno, job, loc from emp partition(p4);

Asynchronous Global Index Maintenance-Partitioning

As shown in the prior section, you can maintain global indexes while dropping or truncating partitions via the UPDATE GLOBAL INDEXES clause. However, such operations come at a cost in terms of time and resource consumption.

When dropping or truncating table partitions, Oracle postpones the removal of the global index entries associated with the dropped or truncated partitions. This is known as asynchronous global index maintenance. Oracle postpones the maintenance of the global index to a future time while keeping the global index usable.

The idea is that this improves the performance of dropping/truncating partitions while keeping any global indexes in a usable state. The actual cleanup of the index entries is done later (asynchronously), either by the DBA or by an automatically scheduled Oracle job. It’s not that less work is being done; rather, the cleanup of the index entries is decoupled from the DROP/TRUNCATE statement.

A small example will demonstrate asynchronous global index maintenance. To set this up, we create a table, populate it with test data, and create a global index:
$ sqlplus eoda/foo@PDB1
SQL> drop table partitioned;
SQL> CREATE TABLE partitioned
     ( timestamp date,
       id int )
     PARTITION BY RANGE (timestamp)
     (PARTITION fy_2020 VALUES LESS THAN (to_date('01-jan-2021','dd-mon-yyyy')),
      PARTITION fy_2021 VALUES LESS THAN (to_date('01-jan-2022','dd-mon-yyyy')));
SQL> insert into partitioned partition(fy_2020)
     select to_date('31-dec-2020','dd-mon-yyyy')-mod(rownum,364), rownum
     from dual connect by level < 100000;
99999 rows created.
SQL> insert into partitioned partition(fy_2021)
     select to_date('31-dec-2021','dd-mon-yyyy')-mod(rownum,364), rownum
     from dual connect by level < 100000;
99999 rows created.
SQL> create index partitioned_idx_global on partitioned(timestamp) GLOBAL;
Index created.

Next, we’ll run a query to retrieve the current values of redo size and db block gets statistics for the current session:
SQL> col r1 new_value r2
SQL> col b1 new_value b2
SQL> select * from
     (select b.value r1
      from v$statname a, v$mystat b
      where a.statistic# = b.statistic#
      and a.name = 'redo size'),
     (select b.value b1
      from v$statname a, v$mystat b
      where a.statistic# = b.statistic#
      and a.name = 'db block gets');
R1 B1
56928036 80829

Next, a partition is dropped with the UPDATE GLOBAL INDEXES clause specified:
SQL> alter table partitioned drop partition fy_2020 update global indexes; Table altered.

Now we’ll calculate the amount of redo generated and the number of current blocks accessed:
SQL> select * from
     (select b.value - &r2 redo_gen
      from v$statname a, v$mystat b
      where a.statistic# = b.statistic#
      and a.name = 'redo size'),
     (select b.value - &b2 db_block_gets
      from v$statname a, v$mystat b
      where a.statistic# = b.statistic#
      and a.name = 'db block gets');
old 2: (select b.value - &r2 redo_gen
new 2: (select b.value - 4816712 redo_gen
old 6: (select b.value - &b2 db_block_gets
new 6: (select b.value - 4512 db_block_gets
REDO_GEN DB_BLOCK_GETS
16864 103

Only a small amount of redo is generated, and a small number of blocks are accessed. The reason is that Oracle doesn’t immediately perform the index maintenance of removing the index entries for the dropped partition. Rather, these entries are marked as orphaned and will be cleaned up by Oracle later. The existence of orphaned entries can be verified via the following:
SQL> select index_name, orphaned_entries, status from user_indexes where table_name='PARTITIONED';
INDEX_NAME ORP STATUS
PARTITIONED_IDX_GLOBAL YES VALID

How do the orphaned entries get cleaned up? Oracle has an automatically scheduled
PMO_DEFERRED_GIDX_MAINT_JOB, which runs in a nightly maintenance window:
SQL> select job_name from dba_scheduler_jobs where job_name like 'PMO%';
JOB_NAME
PMO_DEFERRED_GIDX_MAINT_JOB

If you don’t want to wait for that job, you can manually clean up the entries yourself:
SQL> exec dbms_part.cleanup_gidx; PL/SQL procedure successfully completed.

Now checking for orphaned rows shows there are none:
SQL> select index_name, orphaned_entries, status from user_indexes where table_name='PARTITIONED';
INDEX_NAME ORP STATUS
PARTITIONED_IDX_GLOBAL NO VALID

In this way, you can perform operations such as dropping and truncating partitions and still leave your global indexes in a usable state without the immediate overhead of cleaning up the index entries as part of the drop/truncate operation.
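
The same deferred maintenance applies to truncating a partition; for example, against the table above (a sketch):

SQL> alter table partitioned truncate partition fy_2021 update global indexes;
Table truncated.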

Tip See Oracle Support note 1482264.1 for further details on asynchronous global index maintenance.

Sliding Windows and Indexes-Partitioning-1

The following example implements a classic sliding window of data. In many implementations, data is added to a warehouse over time and the oldest data is aged out. Many times, this data is range partitioned by a date attribute, so that the oldest data is stored together in a single partition, and the newly loaded data is likewise stored together in a new partition. The monthly load process involves the following:

•\ Detaching the old data: The oldest partition is either dropped or exchanged with an empty table (turning the oldest partition into a table) to permit archiving of the old data.
•\ Loading and indexing of the new data: The new data is loaded into a work table and indexed and validated.
•\ Attaching the new data: Once the new data is loaded and processed, the table it is in is exchanged with an empty partition in the partitioned table, turning this newly loaded data in a table into a partition of the larger partitioned table.

This process is repeated every month, or however often the load process is performed; it could be every day or every week. We will implement this very typical process in this section to show the impact of global partitioned indexes and demonstrate the options we have during partition operations to increase availability, allowing us to implement a sliding window of data and maintain continuous availability of data.

We’ll process yearly data in this example and have fiscal years 2020 and 2021 loaded up. The table will be partitioned by the TIMESTAMP column, and it will have two indexes created on it—one is a locally partitioned index on the ID column, and the other is a global index (nonpartitioned, in this case) on the TIMESTAMP column:

$ sqlplus eoda/foo@PDB1
SQL> drop table partitioned;
SQL> CREATE TABLE partitioned
     ( timestamp date,
       id int )
     PARTITION BY RANGE (timestamp)
     (PARTITION fy_2020 VALUES LESS THAN ( to_date('01-jan-2021','dd-mon-yyyy') ),
      PARTITION fy_2021 VALUES LESS THAN ( to_date('01-jan-2022','dd-mon-yyyy') ));
Table created.
SQL> insert into partitioned partition(fy_2020)
     select to_date('31-dec-2020','dd-mon-yyyy')-mod(rownum,360), rownum
     from dual connect by level <= 70000;
70000 rows created.
SQL> insert into partitioned partition(fy_2021)
     select to_date('31-dec-2021','dd-mon-yyyy')-mod(rownum,360), rownum
     from dual connect by level <= 70000;
70000 rows created.
SQL> create index partitioned_idx_local on partitioned(id) LOCAL;
Index created.
SQL> create index partitioned_idx_global on partitioned(timestamp) GLOBAL; Index created.

This sets up our warehouse table. The data is partitioned by fiscal year, and we have the last two years’ worth of data online. This table has two indexes: one is LOCAL and the other is GLOBAL. Now it’s the end of the year and we would like to do the following:

\ 1.\ Remove the oldest fiscal year data: We do not want to lose this data forever; we just want to age it out and archive it.
\ 2.\ Add the newest fiscal year data: It will take a while to load it, transform it, index it, and so on. We would like to do this work without impacting the availability of the current data, if at all possible.

The first step is to set up an empty table for fiscal year 2020 that looks just like the partitioned table. We’ll use this table to exchange with the FY_2020 partition in the partitioned table, turning that partition into a table and in turn emptying out the partition in the partitioned table. The net effect is that the oldest data in the partitioned table will have been in effect removed after the exchange:
SQL> create table fy_2020 ( timestamp date, id int ); Table created.
SQL> create index fy_2020_idx on fy_2020(id); Index created.

We’ll do the same to the new data to be loaded. We’ll create and load a table that structurally looks like the existing partitioned table (but that is not itself partitioned):

SQL> create table fy_2022 ( timestamp date, id int ); Table created.
SQL> insert into fy_2022
     select to_date('31-dec-2022','dd-mon-yyyy')-mod(rownum,360), rownum
     from dual connect by level <= 70000;
70000 rows created.
SQL> create index fy_2022_idx on fy_2022(id) nologging;
Index created.

Sliding Windows and Indexes-Partitioning-2

We’ll turn the current full partition into an empty partition and create a full table with the FY_2020 data in it. Also, we’ve completed all of the work necessary to have the FY_2022 data ready to go. This would have involved verifying the data, transforming it—whatever complex tasks we need to undertake to get it ready. Now we’re ready to update the live data using an exchange partition:

SQL> alter table partitioned
     exchange partition fy_2020
     with table fy_2020
     including indexes
     without validation;
Table altered.
SQL> alter table partitioned drop partition fy_2020; Table altered.

This is all we need to do to age the old data out. We turned the partition into a full table and the empty table into a partition. This was a simple data dictionary update. No large amount of I/O took place—it just happened. We can now export that FY_2020 table (perhaps using a transportable tablespace) out of our database for archival purposes. We could reattach it quickly if we ever needed to.
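
For example, a table-mode Data Pump export of the detached table might look like this (a sketch; the directory object and dump file name are hypothetical):

$ expdp eoda/foo@PDB1 directory=archive_dir dumpfile=fy_2020.dmp tables=fy_2020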

Next, we want to slide in the new data:
SQL> alter table partitioned add partition fy_2022
     values less than ( to_date('01-jan-2023','dd-mon-yyyy') );
Table altered.
SQL> alter table partitioned
     exchange partition fy_2022
     with table fy_2022
     including indexes
     without validation;
Table altered.

Again, this was instantaneous; it was accomplished via simple data dictionary updates—the WITHOUT VALIDATION clause allowed us to accomplish that. When you use that clause, the database will trust that the data you are placing into that partition is, in fact, valid for that partition. Adding the empty partition took very little time to process.

Then, we exchange the newly created empty partition with the full table, and the full table with the empty partition, and that operation is performed quickly as well. The new data is online. Looking at our indexes, however, we’ll find the following:
SQL> select index_name, status from user_indexes;
INDEX_NAME STATUS
PARTITIONED_IDX_LOCAL N/A
PARTITIONED_IDX_GLOBAL UNUSABLE
FY_2020_IDX VALID
FY_2022_IDX VALID

The global index is, of course, unusable after this operation. Since each index partition can point to any table partition, and we just took away a partition and added a partition, that index is invalid. It has entries that point into the partition we dropped. It has no entries that point into the partition we just added. Any query that would make use of this index would fail and not execute, or if we skip unusable indexes, the query’s performance would be negatively impacted by not being able to use the index:

SQL> select /*+ index( partitioned PARTITIONED_IDX_GLOBAL ) */ count(*)
     from partitioned
     where timestamp between to_date( '01-mar-2022', 'dd-mon-yyyy' )
     and to_date( '31-mar-2022', 'dd-mon-yyyy' );
select /*+ index( partitioned PARTITIONED_IDX_GLOBAL ) */ count(*)
*
ERROR at line 1:
ORA-01502: index 'EODA.PARTITIONED_IDX_GLOBAL' or partition of such index is in unusable state
SQL> explain plan for select count(*)
     from partitioned
     where timestamp between to_date( '01-mar-2022', 'dd-mon-yyyy' )
     and to_date( '31-mar-2022', 'dd-mon-yyyy' );

So, our choices after performing this partition operation with global indexes are
•\ Ensure that SKIP_UNUSABLE_INDEXES=TRUE (the default is TRUE), and then Oracle will not use this unusable index. But then we lose the performance the index was giving us.
•\ Rebuild this index to make it usable again.
•\ Use the UPDATE GLOBAL INDEXES clause when performing partition maintenance (the focus of the next section in this chapter).

The sliding window process, which so far has resulted in virtually no downtime, will now take a very long time to complete while we rebuild the global index. Runtime query performance of queries that relied on these indexes will be negatively affected during this time—either they will not run at all or they will run without the benefit of the index. All of the data must be scanned and the entire index reconstructed from the table data. If the table is many hundreds of gigabytes in size, this will take considerable resources.
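
The rebuild itself is a single command in this example, but it must read the entire table and re-sort the index data:

SQL> alter index partitioned_idx_global rebuild;
Index altered.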

“Live” Global Index Maintenance-Partitioning

Oracle has the ability to maintain the global indexes during the partition operation using the UPDATE GLOBAL INDEXES clause. This means that as you drop a partition, split a partition, perform whatever operation necessary on a partition, Oracle will perform the necessary modifications to the global index to keep it up to date.

Since most partition operations will cause this global index invalidation to occur, this feature can be a boon to systems that need to provide continual access to the data. You’ll find that you trade the raw speed of the partition operation, with its associated window of unavailability immediately afterward while you rebuild indexes, for a slower partition operation coupled with 100 percent data availability.

In short, if you have a data warehouse that cannot have downtime, but must support these common data warehouse techniques of sliding data in and out, then this feature is for you—but you must understand the implications.

Revisiting our previous example, if our partition operations had used the UPDATE GLOBAL INDEXES clause when relevant (in this example, it would not be needed on the ADD PARTITION statement since the newly added partition would not have any rows in it), we would have discovered the indexes to be perfectly valid and usable both during and after the operation:

SQL> alter table partitioned exchange partition fy_2020
     with table fy_2020
     including indexes
     without validation
     UPDATE GLOBAL INDEXES;
Table altered.

Note in the following output that the N/A status observed for the PARTITIONED_IDX_LOCAL index simply means that status is tracked on the index partitions belonging to that index, not on the index itself. It doesn’t make sense to say the locally partitioned index is valid or not; it is just a container that logically holds the index partitions themselves:

SQL> select index_name, status from user_indexes;
INDEX_NAME STATUS
PARTITIONED_IDX_LOCAL N/A
PARTITIONED_IDX_GLOBAL VALID
FY_2020_IDX VALID
FY_2022_IDX VALID

But there is a trade-off: we are performing the logical equivalent of DELETE and INSERT operations on the global index structures. When we drop a partition, we have to delete all of the global index entries that might be pointing to that partition.

When we did the exchange of a table with a partition, we had to delete all of the global index entries pointing to the original data and then insert all of the new ones that we just slid in there. So the amount of work performed by the ALTER commands was significantly increased.

You should expect that the approach without index maintenance will consume fewer database resources and therefore perform faster, but it will incur a measurable period of downtime.

The second approach, involving maintaining the indexes, will consume more resources and perhaps take longer overall, but will not incur downtime. As far as the end users are concerned, their ability to work never ceased. They might have been processing a bit slower (since we were competing with them for resources), but they were still processing, and they never stopped.

The index rebuild approach will almost certainly run faster, considering both the elapsed time and the CPU time. This fact has caused many a DBA to pause and say, “Hey, I don’t want to use UPDATE GLOBAL INDEXES—it’s slower.”

That is too simplistic of a view, however. What you need to remember is that while the operations overall took longer, processing on your system was not necessarily interrupted. Sure, you as the DBA might be looking at your screen for a longer period of time, but the really important work that takes place on your system was still taking place.

You need to see if this trade-off makes sense for you. If you have an eight-hour maintenance window overnight in which to load new data, then by all means, use the rebuild approach if that makes sense. However, if you have a mandate to be available continuously, then the ability to maintain the global indexes will be crucial.

One more thing to consider is the redo generated by each approach. You will find that the UPDATE GLOBAL INDEXES generates considerably more redo (due to the index maintenance), and you should expect that to only go up as you add more and more global indexes to the table.

The redo generated by the UPDATE GLOBAL INDEXES processing is unavoidable and cannot be turned off via NOLOGGING, since the maintenance of the global indexes is not a complete rebuild of their structure but more of an incremental maintenance.

Additionally, since you are maintaining the live index structure, you must generate undo for that—in the event the partition operation fails, you must be prepared to put the index back the way it was. And remember, undo is protected by redo itself, so some of the redo you see generated is from the index updates and some is from the rollback. Add another global index or two and you would reasonably expect these numbers to increase.

So, UPDATE GLOBAL INDEXES is an option that allows you to trade off availability for resource consumption. If you need to provide continuous availability, it’s the option for you. But you have to understand the ramifications and size other components of your system appropriately. Specifically, many data warehouses have been crafted over time to use bulk direct path operations, bypassing undo generation and, when permitted, redo generation as well. Using UPDATE GLOBAL INDEXES cannot bypass either of those two elements. You need to examine the rules you use to size your redo and undo needs before using this feature, so you can assure yourself it can work on your system.

Local Indexes and Unique Constraints-Partitioning

To enforce uniqueness—and that includes a UNIQUE constraint or PRIMARY KEY constraints—your partitioning key must be included in the constraint itself if you want to use a local index to enforce the constraint.
This is the largest limitation of a local index, in my opinion. Oracle enforces uniqueness only within an index partition—never across partitions. What this implies, for example, is that you cannot range partition on a TIMESTAMP field and have a primary key on the ID that is enforced using a locally partitioned index. Oracle will instead utilize a global index to enforce uniqueness.
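
For contrast, a local index can enforce uniqueness when the partition key is included in the constraint; here is a minimal sketch with hypothetical names:

SQL> create table t ( timestamp date, id int )
     partition by range (timestamp)
     (partition p1 values less than (to_date('01-jan-2022','dd-mon-yyyy')),
      partition p2 values less than (to_date('01-jan-2023','dd-mon-yyyy')));
SQL> alter table t add constraint t_pk
     primary key (id, timestamp) using index local;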

In the next example, we will create a range partitioned table that is partitioned by a column named TIMESTAMP but has a primary key on the ID column. We can do that by executing the following CREATE TABLE statement in a schema that owns no other objects, so we can easily see exactly what objects are created by looking at every segment this user owns:
SQL> CREATE TABLE partitioned
     ( timestamp date,
       id int,
       constraint partitioned_pk primary key(id) )
     PARTITION BY RANGE (timestamp)
     (PARTITION part_1 VALUES LESS THAN ( to_date('01/01/2021','dd/mm/yyyy') ),
      PARTITION part_2 VALUES LESS THAN ( to_date('01/01/2022','dd/mm/yyyy') ));
Table created.

And inserting some data so that we get segments created:
SQL> insert into partitioned values(to_date('01/01/2020','dd/mm/yyyy'),1);
1 row created.
SQL> insert into partitioned values(to_date('01/01/2021','dd/mm/yyyy'),2);
1 row created.

Assuming we run this in a schema with no other objects created, we’ll see the following:
SQL> select segment_name, partition_name, segment_type from user_segments;
SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE
PARTITIONED PART_1 TABLE PARTITION
PARTITIONED PART_2 TABLE PARTITION
PARTITIONED_PK INDEX

The PARTITIONED_PK index is not even partitioned, let alone locally partitioned, and as we’ll see, it cannot be locally partitioned. Even if we try to trick Oracle by realizing that a primary key can be enforced by a nonunique index as well as a unique index, we’ll find that this approach will not work either:
SQL> drop table partitioned;
SQL> CREATE TABLE partitioned
     ( timestamp date,
       id int )
     PARTITION BY RANGE (timestamp)
     (PARTITION part_1 VALUES LESS THAN ( to_date('01-jan-2021','dd-mon-yyyy') ),
      PARTITION part_2 VALUES LESS THAN ( to_date('01-jan-2022','dd-mon-yyyy') ));
Table created.
SQL> create index partitioned_idx on partitioned(id) local; Index created.

And inserting some data so that we get segments created:
SQL> insert into partitioned values(to_date('01/01/2020','dd/mm/yyyy'),1);
1 row created.
SQL> insert into partitioned values(to_date('01/01/2021','dd/mm/yyyy'),2);
1 row created.
SQL> select segment_name, partition_name, segment_type from user_segments;
SEGMENT_NAME PARTITION_NAME SEGMENT_TYPE
PARTITIONED PART_1 TABLE PARTITION
PARTITIONED PART_2 TABLE PARTITION
PARTITIONED_IDX PART_1 INDEX PARTITION
PARTITIONED_IDX PART_2 INDEX PARTITION
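
Now, when we attempt to add the primary key constraint, Oracle cannot create the unique index it needs (a sketch of the expected failure):

SQL> alter table partitioned add constraint partitioned_pk primary key(id);
alter table partitioned add constraint partitioned_pk primary key(id)
*
ERROR at line 1:
ORA-01408: such column list already indexed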

Here, Oracle attempts to create a global index on ID, but finds that it cannot since an index already exists. The preceding statements would work if the index we created was not partitioned, as Oracle would have used that index to enforce the constraint.

The reasons why uniqueness cannot be enforced, unless the partition key is part of the constraint, are twofold. First, if Oracle allowed this, it would void most of the advantages of partitions. Availability and scalability would be lost, as each and every partition would always have to be available and scanned to do any inserts and updates.
The more partitions you had, the less available the data would be. The more partitions you had, the more index partitions you would have to scan, and the less scalable partitions would become. Instead of providing availability and scalability, doing this would actually decrease both.

Additionally, Oracle would have to effectively serialize inserts and updates to this table at the transaction level. This is because if we add ID=1 to PART_1, Oracle would have to somehow prevent anyone else from adding ID=1 to PART_2. The only way to do this would be to prevent others from modifying index partition PART_2, since there isn’t anything to really lock in that partition.

In an OLTP system, unique constraints must be system enforced (i.e., enforced by Oracle) to ensure the integrity of data. This implies that the logical model of your application will have an impact on the physical design. Uniqueness constraints will either drive the underlying table partitioning scheme, driving the choice of the partition keys, or point you toward the use of global indexes instead. We’ll take a look at global indexes in more depth next.