Summary

Partitioning is extremely useful for scaling up large database objects. This scaling is visible from the perspective of performance scaling, availability scaling, and administrative scaling. All three are extremely important to different people.

The DBA is concerned with administrative scaling. The owners of the system are concerned with availability, because downtime is lost money, and anything that reduces downtime—or reduces the impact of downtime—boosts the payback for a system.

The end users of the system are concerned with performance scaling. No one likes to use a slow system, after all.

We also looked at the fact that in an OLTP system, partitions may not increase performance, especially if applied improperly. Partitions can increase the performance of certain classes of queries, but those queries are generally not found in an OLTP system.

This point is important to understand, as many people associate partitioning with “free performance increase.” This does not mean that partitions should not be used in OLTP systems—they do provide many other salient benefits in this environment— just don’t expect a massive increase in throughput.

Expect reduced downtime. Expect the same good performance (partitioning will not slow you down when applied appropriately). Expect easier manageability, which may in turn lead to increased performance, because some maintenance operations are performed by the DBAs more frequently simply because they can be.

We investigated the various table partitioning schemes offered by Oracle—range, hash, list, interval, reference, interval reference, virtual column, and composite—and talked about when they are most appropriately used.

We spent the bulk of our time looking at partitioned indexes and examining the differences between prefixed and nonprefixed and local and global indexes. We investigated partition operations in data warehouses combined with global indexes, and the trade-off between resource consumption and availability.

We also looked at the ease of maintenance features such as the ability to perform maintenance operations on multiple partitions at a time, cascade truncate, and cascade exchange. Oracle continues to update and improve partitioning with each new release.

Over time, I see this feature becoming more relevant to a broader audience as the size and scale of database applications grow.

The Internet and its database-hungry nature along with legislation requiring longer retention of audit data are leading to more and more extremely large collections of data, and partitioning is a natural tool to help manage that problem.

Partitioning and Performance, Revisited

Much like the hash partitioned tables we investigated earlier, Oracle will take the OWNER value, hash it to a partition between 1 and 16, and place the index entry in there.
Now when we review the output from autotrace, we are much closer to the work performed by the nonpartitioned table earlier—that is, we have not negatively impacted the work performed by our queries:

It should be noted, however, that a hash partitioned index cannot be range scanned; in general, it is most suitable for exact equality (equals or in-lists). If you were to query "WHERE OWNER > :X" using the preceding index, it would not be able to perform a simple range scan using partition elimination. You would be back to inspecting all 16 hash partitions.
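To make the point concrete, here is a sketch (the index DDL and bind variable names are assumptions, not the chapter's actual schema):

```sql
-- A 16-way hash partitioned global index on OWNER (hypothetical names)
create index t_owner_idx on t(owner)
  global partition by hash(owner) partitions 16;

-- Equality: OWNER hashes to exactly one index partition;
-- only that single partition is probed
select count(*) from t where owner = :x;

-- Range: the matching OWNER values are scattered across all 16
-- partitions by the hash, so every partition must be inspected
select count(*) from t where owner > :x;
```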

USING ORDER BY

This example brought to mind an unrelated but very important fact. When looking at hash partitioned indexes, we are faced with another case where the use of an index to retrieve data would not automatically retrieve the data sorted.

Many people assume that if the query plan shows an index is used to retrieve the data, the data will be retrieved sorted. This has never been true.
The only way we can retrieve data in a given sorted order is to use an ORDER BY clause on the query. If your query does not contain an ORDER BY clause, you cannot make any assumptions about the sorted order of the data.

A quick example demonstrates this. We create a small table as a copy of ALL_USERS and create a hash partitioned index with four partitions on the USER_ID column:
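The demonstration code is elided here; it would look something like the following (a sketch only; the table and index names are assumptions):

```sql
-- Copy ALL_USERS into a small heap table (hypothetical name T)
SQL> create table t as select * from all_users;
Table created.

-- A hash partitioned index with four partitions on USER_ID
SQL> create index t_idx on t(user_id)
     global partition by hash(user_id) partitions 4;
Index created.

-- Retrieve via the index, with no ORDER BY on the query
SQL> set autotrace on explain
SQL> select user_id from t where user_id > 0;
```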

So, even though Oracle used the index in a range scan, the data is obviously not sorted. In fact, you might observe a pattern in this data.
There are four sorted result sets here: the … replaces values that were increasing; between the rows with USER_ID = 13 and 97, the values were increasing in the output.
Then the row with USER_ID = 22 appeared. What we are observing is Oracle returning "sorted data" from each of the four hash partitions, one after the other.

This is just a warning that unless your query has an ORDER BY, you have no reason to anticipate the data being returned to you in any kind of sorted order whatsoever. (And no, GROUP BY doesn’t have to sort either! There is no substitute for ORDER BY.)
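The rule of thumb, as a sketch (T here is a hypothetical table with an index on USER_ID):

```sql
-- No ORDER BY: rows may arrive partition by partition,
-- in no overall order, even when an index is used
select user_id from t where user_id > 0;

-- The only construct that guarantees sorted output
select user_id from t where user_id > 0 order by user_id;
```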

Does that mean partitioning won’t affect OLTP performance at all in a positive sense? No, not entirely—you just have to look in a different place. In general, it will not positively impact the performance of your data retrieval in OLTP; rather, care has to be taken to ensure data retrieval isn’t affected negatively. But on data modification, partitioning may provide salient benefits in highly concurrent environments.

Consider the preceding rather simple example of a single table with a single index, and add into the mix a primary key. Without partitioning, there is a single table: all insertions go into this single table. There is contention, perhaps, for the freelists on this table.

Additionally, the primary key index on the OBJECT_ID column would be a heavy right-hand-side index, as we discussed in Chapter 11. Presumably, it would be populated by a sequence; hence, all inserts would go into the rightmost block, leading to buffer busy waits.
Also, there would be a single index structure T_IDX that people would be contending for. So far, a lot of single items.

Enter partitioning. You hash partition the table by OBJECT_ID into 16 partitions.
There are now, in effect, 16 tables to contend for, and each has one-sixteenth the number of users hitting it simultaneously.

You locally partition the primary key index on OBJECT_ID into 16 partitions. You now have 16 right-hand sides, and each index structure will receive one-sixteenth the workload it had before. And so on.
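A sketch of what that layout might look like (all names and the two-column shape are assumptions):

```sql
-- Hash partition the table into 16 pieces, spreading inserts
-- across 16 segments (16 sets of freelists)
create table t
( object_id   number,
  object_name varchar2(30)
)
partition by hash(object_id) partitions 16;

-- A locally partitioned unique index: 16 separate index
-- structures, hence 16 "right-hand sides" for the
-- sequence-populated key (the partition key OBJECT_ID is in
-- the index key, as a unique local index requires)
create unique index t_pk on t(object_id) local;

alter table t
  add constraint t_pk primary key(object_id)
  using index t_pk;
```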

That is, you can use partitioning in a highly concurrent environment to reduce contention, much like we used a reverse key index in Chapter 11 to reduce the buffer busy waits. However, you must be aware that the very process of partitioning out the data consumes more CPU itself than not having partitioning.
That is, it takes more CPU to figure out where to put the data than it would if the data had but one place to go.

So, as with everything, before applying partitioning to a system to increase performance, make sure you understand what that system needs. If your system is currently CPU bound, but that CPU usage is not due to contention and latch waits, introducing partitioning could make the problem worse, not better!

OLTP and Global Indexes

An OLTP system is characterized by the frequent occurrence of many small read and write transactions. In general, fast access to the row (or rows) you need is paramount. Data integrity is vital. Availability is also very important. Global indexes make sense in many cases in OLTP systems. Table data can be partitioned by only one key—one set of columns. However, you may need to access the data in many different ways. You might partition EMPLOYEE data by LOCATION in the table, but you still need fast access to EMPLOYEE data by

•\ DEPARTMENT: Departments are geographically dispersed. There is no relationship between a department and a location.
•\ EMPLOYEE_ID: While an employee ID will determine a location, you don’t want to have to search by EMPLOYEE_ID and LOCATION; hence, partition elimination cannot take place on the index partitions. Also, EMPLOYEE_ID by itself must be unique.
•\ JOB_TITLE: There is no relationship between JOB_TITLE and LOCATION. All JOB_TITLE values may appear in any LOCATION.

There is a need to access the EMPLOYEE data by many different keys in different places in the application, and speed is paramount. In a data warehouse, we might just use locally partitioned indexes on these keys and use parallel index range scans to collect a large amount of data fast. In these cases, we don’t necessarily need to use index partition elimination. In an OLTP system, however, we do need to use it. Parallel query is not appropriate for these systems; we need to provide the indexes appropriately. Therefore, we will need to make use of global indexes on certain fields.

The following are the goals we need to meet:
•\ Fast access
•\ Data integrity
•\ Availability

Global indexes can help us accomplish these goals in an OLTP system. We will probably not be doing sliding windows, auditing aside for a moment. We will not be splitting partitions (unless we have a scheduled downtime), we will not be moving data, and so on. The operations we perform in a data warehouse are not done on a live OLTP system in general.

Here is a small example that shows how we can achieve the three goals just listed with global indexes. I am going to use simple, single partition global indexes, but the results would not be different with global indexes in multiple partitions (except for the fact that availability and manageability would increase as we added index partitions). We start by creating tablespaces P1, P2, P3, and P4, then create a table that is range partitioned by location, LOC, according to our rules, which place all LOC values less than ‘C’ into partition P1, those less than ‘D’ into partition P2, and so on:

$ sqlplus eoda/foo@PDB1
SQL> create tablespace p1 datafile size 1m autoextend on next 1m;
Tablespace created.
SQL> create tablespace p2 datafile size 1m autoextend on next 1m;
Tablespace created.
SQL> create tablespace p3 datafile size 1m autoextend on next 1m;
Tablespace created.
SQL> create tablespace p4 datafile size 1m autoextend on next 1m;
Tablespace created.
SQL> create table emp
     ( EMPNO    NUMBER(4) NOT NULL,
       ENAME    VARCHAR2(10),
       JOB      VARCHAR2(9),
       MGR      NUMBER(4),
       HIREDATE DATE,
       SAL      NUMBER(7,2),
       COMM     NUMBER(7,2),
       DEPTNO   NUMBER(2) NOT NULL,
       LOC      VARCHAR2(13) NOT NULL
     )
     partition by range(loc)
     ( partition p1 values less than('C') tablespace p1,
       partition p2 values less than('D') tablespace p2,
       partition p3 values less than('N') tablespace p3,
       partition p4 values less than('Z') tablespace p4
     );
Table created.

We alter the table to add a constraint on the primary key column:
SQL> alter table emp add constraint emp_pk primary key(empno);
Table altered.

A side effect of this is that there exists a unique index on the EMPNO column. This shows we can support and enforce data integrity, one of our goals. Finally, we create two more global indexes on DEPTNO and JOB to facilitate accessing records quickly by those attributes:

SQL> create index emp_job_idx on emp(job) GLOBAL;
Index created.
SQL> create index emp_dept_idx on emp(deptno) GLOBAL;
Index created.
SQL> insert into emp
     select e.*, d.loc
       from scott.emp e, scott.dept d
      where e.deptno = d.deptno;
14 rows created.

Let’s see what is in each partition:
SQL> break on pname skip 1
SQL> select 'p1' pname, empno, job, loc from emp partition(p1)
     union all
     select 'p2' pname, empno, job, loc from emp partition(p2)
     union all
     select 'p3' pname, empno, job, loc from emp partition(p3)
     union all
     select 'p4' pname, empno, job, loc from emp partition(p4);

Sliding Windows and Indexes

We’ll turn the current full partition into an empty partition and create a full table with the FY_2020 data in it. Also, we’ve completed all of the work necessary to have the FY_2022 data ready to go. This would have involved verifying the data, transforming it— whatever complex tasks we need to undertake to get it ready. Now we’re ready to update the live data using an exchange partition:

SQL> alter table partitioned
     exchange partition fy_2020
     with table fy_2020
     including indexes
     without validation;
Table altered.

SQL> alter table partitioned drop partition fy_2020;
Table altered.

This is all we need to do to age the old data out. We turned the partition into a full table and the empty table into a partition. This was a simple data dictionary update. No large amount of I/O took place—it just happened. We can now export that FY_2020 table (perhaps using a transportable tablespace) out of our database for archival purposes. We could reattach it quickly if we ever needed to.

Next, we want to slide in the new data:
SQL> alter table partitioned
     add partition fy_2022
     values less than
     ( to_date('01-jan-2023','dd-mon-yyyy') );
Table altered.

SQL> alter table partitioned
     exchange partition fy_2022
     with table fy_2022
     including indexes
     without validation;
Table altered.

Again, this was instantaneous; it was accomplished via simple data dictionary updates—the WITHOUT VALIDATION clause allowed us to accomplish that. When you use that clause, the database will trust that the data you are placing into that partition is, in fact, valid for that partition. Adding the empty partition took very little time to process.

Then, we exchange the newly created empty partition with the full table, and the full table with the empty partition, and that operation is performed quickly as well. The new data is online. Looking at our indexes, however, we’ll find the following:
SQL> select index_name, status from user_indexes;
INDEX_NAME               STATUS
PARTITIONED_IDX_LOCAL    N/A
PARTITIONED_IDX_GLOBAL   UNUSABLE
FY_2020_IDX              VALID
FY_2022_IDX              VALID

The global index is, of course, unusable after this operation. Since each index partition can point to any table partition, and we just took away a partition and added a partition, that index is invalid. It has entries that point into the partition we dropped. It has no entries that point into the partition we just added. Any query that would make use of this index would fail and not execute, or if we skip unusable indexes, the query’s performance would be negatively impacted by not being able to use the index:

SQL> select /*+ index( partitioned PARTITIONED_IDX_GLOBAL ) */ count(*)
       from partitioned
      where timestamp between to_date( '01-mar-2022', 'dd-mon-yyyy' )
                          and to_date( '31-mar-2022', 'dd-mon-yyyy' );
select /*+ index( partitioned PARTITIONED_IDX_GLOBAL ) */ count(*)
ERROR at line 1:
ORA-01502: index 'EODA.PARTITIONED_IDX_GLOBAL' or partition of such index is in unusable state
SQL> explain plan for
     select count(*)
       from partitioned
      where timestamp between to_date( '01-mar-2022', 'dd-mon-yyyy' )
                          and to_date( '31-mar-2022', 'dd-mon-yyyy' );

So, our choices after performing this partition operation with global indexes are
•\ Ensure that SKIP_UNUSABLE_INDEXES=TRUE (the default is TRUE), and then Oracle will not use this unusable index. But then we lose the performance the index was giving us.
•\ Rebuild this index to make it usable again.
•\ Use the UPDATE GLOBAL INDEXES clause when performing partition maintenance (the focus of the next section in this chapter).
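The second option is a straightforward, though potentially very expensive, DDL statement; a sketch, using the index name from the example above:

```sql
-- Rebuild the unusable global index (it is a nonpartitioned
-- global index in this example)
alter index partitioned_idx_global rebuild;

-- Had the global index itself been partitioned, each unusable
-- index partition would be rebuilt individually instead, e.g.:
-- alter index partitioned_idx_global rebuild partition <partition_name>;
```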

The sliding window process, which so far has resulted in virtually no downtime, will now take a very long time to complete while we rebuild the global index. Runtime query performance of queries that relied on these indexes will be negatively affected during this time—either they will not run at all or they will run without the benefit of the index. All of the data must be scanned and the entire index reconstructed from the table data. If the table is many hundreds of gigabytes in size, this will take considerable resources.

Hash Partition Using Powers of Two

I mentioned earlier that the number of partitions should be a power of two. This is easily observed to be true. To demonstrate, we’ll set up a stored procedure to automate the creation of a hash partitioned table with N partitions (N will be a parameter). This procedure will construct a dynamic query to retrieve the counts of rows by partition and then display the counts and a simple histogram of the counts by partition. Lastly, it will open this query and let us see the results. This procedure starts with the hash table creation. We will use a table named T:

SQL> create or replace
     procedure hash_proc
     ( p_nhash  in  number,
       p_cursor out sys_refcursor )
     authid current_user
     as
         l_text     long;
         l_template long :=
             'select $POS$ oc, ''p$POS$'' pname, count(*) cnt ' ||
             'from t partition ( $PNAME$ ) union all ';
         table_or_view_does_not_exist exception;
         pragma exception_init( table_or_view_does_not_exist, -942 );
     begin
         begin
             execute immediate 'drop table t';
         exception
             when table_or_view_does_not_exist then null;
         end;

         execute immediate '
             CREATE TABLE t ( id )
             partition by hash(id)
             partitions ' || p_nhash || '
             as
             select rownum
               from all_objects';

Next, we will dynamically construct a query to retrieve the count of rows by partition. It does this using the template query defined earlier. For each partition, we’ll gather the count using the partition-extended table name and union all of the counts together:
         for x in ( select partition_name pname, partition_position pos
                      from user_tab_partitions
                     where table_name = 'T'
                     order by partition_position )
         loop
             l_text := l_text ||
                 replace(
                 replace( l_template,
                          '$POS$', x.pos ),
                          '$PNAME$', x.pname );
         end loop;

Now, we’ll take that query and select out the partition name (PNAME) and the count of rows in that partition (CNT). Using RPAD, we’ll construct a rather rudimentary but effective histogram:

         open p_cursor for
             'select pname, cnt,
                     substr( rpad(''*'', 30*round( cnt/max(cnt)over(), 2 ), ''*''), 1, 30 ) hg
                from (' || substr( l_text, 1, length(l_text)-11 ) || ')
               order by oc';
     end;
     /

If we run this with an input of 4, for four hash partitions, we would expect to see output similar to the following:
SQL> variable x refcursor
SQL> set autoprint on
SQL> exec hash_proc( 4, :x );
PL/SQL procedure successfully completed.

The simple histogram depicted shows a nice, even distribution of data over each of the four partitions. Each has close to the same number of rows in it. However, if we simply go from four to five hash partitions, we’ll see the following:
SQL> exec hash_proc( 5, :x );
PL/SQL procedure successfully completed.
PN CNT HG

This histogram points out that the first and last partitions have just half as many rows as the interior partitions. The data is not very evenly distributed at all. We’ll see the trend continue for six and seven hash partitions:
SQL> exec hash_proc( 6, :x );
PL/SQL procedure successfully completed.
PN CNT HG

SQL> exec hash_proc( 7, :x );
PL/SQL procedure successfully completed.
PN CNT HG

As soon as we get back to a number of hash partitions that is a power of two, we achieve the goal of even distribution once again:
SQL> exec hash_proc( 8, :x );
PL/SQL procedure successfully completed.
PN CNT HG

If we continue this experiment up to 16 partitions, we would see the same effects for the 9th through the 15th partitions—a skewing of the data to the interior partitions, away from the edges—and then upon hitting the 16th partition, you would see a flattening out again. The same would be true again up to 32 partitions, and then 64, and so on. This example just points out the importance of using a power of two as the number of hash partitions.
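The skew follows from the linear hashing scheme Oracle uses: rows are hashed into the next power of two's worth of buckets, and surplus buckets (those with no corresponding partition) are folded back by clearing the high-order bit. The following query is only a sketch of that mapping for five partitions, an assumption about the algorithm's shape that is consistent with the observed skew, not output from the database itself:

```sql
-- With 5 partitions, Oracle hashes into 8 buckets (the next
-- power of two); buckets 5, 6, and 7 have no partition of
-- their own and fold back onto partitions 1, 2, and 3
select bucket,
       case when bucket >= 5
            then bucket - 4     -- fold: clear the high bit
            else bucket
       end target_partition
  from ( select level - 1 bucket
           from dual
        connect by level <= 8 );
```

Partitions 1, 2, and 3 each receive two buckets' worth of rows, while partitions 0 and 4 receive one, which is exactly the halved edge counts seen in the histogram.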

Partitioning Indexes

Indexes, like tables, may be partitioned. There are two possible methods to partition indexes:

•\ Equipartition the index with the table: This is also known as a local index. For every table partition, there will be an index partition that indexes just that table partition. All of the entries in a given index partition point to a single table partition, and all of the rows in a single table partition are represented in a single index partition.

•\ Partition the index by range or hash: This is also known as a globally partitioned index. Here, the index is partitioned by range, or optionally by hash, and a single index partition may point to any (and all) table partitions.

Figure 13-5 demonstrates the difference between a local and a global index.

Chapter 13   Partitioning

Figure 13-5.  Local and global index partitions

In the case of a globally partitioned index, note that the number of index partitions may be different from the number of table partitions.

Since global indexes may be partitioned by range or hash only, you must use local indexes if you wish to have a list or composite partitioned index. The local index will be partitioned using the same scheme as the underlying table.
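For example (a sketch with assumed names), a list partitioned table can carry a local index, which is implicitly list partitioned along with the table, but not a list partitioned global index:

```sql
create table t
( region varchar2(2),
  data   varchar2(30)
)
partition by list(region)
( partition p_east values ('NY','NJ'),
  partition p_west values ('CA','WA')
);

-- Local: one index partition per table partition, list
-- partitioned by inheritance from the table
create index t_region_idx on t(region) local;

-- A global index on T could only be range or hash partitioned
-- (or left nonpartitioned); "global partition by list" does not exist
```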

Local Indexes vs. Global Indexes

In my experience, most partition implementations in data warehouse systems use local indexes. In an OLTP system, global indexes are much more common, and we’ll see why shortly. It has to do with the need to perform partition elimination on the index structures to maintain the same query response times after partitioning as before.

Local indexes have certain properties that make them the best choice for most data warehouse implementations. They support a more available environment (less downtime), since problems will be isolated to one range or hash of data. On the other hand, since it can point to many table partitions, a global index may become a point of failure, rendering all partitions inaccessible to certain queries.

Local indexes are more flexible when it comes to partition maintenance operations. If the DBA decides to move a table partition, only the associated local index partition needs to be rebuilt or maintained. With a global index, all index partitions must be rebuilt or maintained in real time.

The same is true with sliding window implementations, where old data is aged out of the partition and new data is aged in. No local indexes will be in need of a rebuild, but all global indexes will be either rebuilt or maintained during the partition operation. In some cases, Oracle can take advantage of the fact that the index is locally partitioned with the table and will develop optimized query plans based on that. With global indexes, there is no such relationship between the index and table partitions.

Local indexes also facilitate a partition point-in-time recovery operation. If a single partition needs to be recovered to an earlier point in time than the rest of the table for some reason, all locally partitioned indexes can be recovered to that same point in time. All global indexes would need to be rebuilt on this object. This does not mean “avoid global indexes”—in fact, they are vitally important for performance reasons, as you’ll learn shortly—you just need to be aware of the implications of using them.

Row Movement

You might wonder what would happen if the value of the column used to determine the partition is modified in any of the preceding partitioning schemes. There are two cases to consider:
•\ The modification would not cause a different partition to be used; the row would still belong in this partition. This is supported in all cases.
•\ The modification would cause the row to move across partitions. This is supported if row movement is enabled for the table; otherwise, an error will be raised.

We can observe these behaviors easily. In the previous example in the “Range Partitioning” section, we inserted a pair of rows into PART_1 of the RANGE_EXAMPLE table:
$ sqlplus eoda/foo@PDB1
SQL> CREATE TABLE range_example
     ( range_key_column date,
       data             varchar2(20)
     )
     PARTITION BY RANGE (range_key_column)
     ( PARTITION part_1 VALUES LESS THAN
          (to_date('01/01/2021','dd/mm/yyyy')),
       PARTITION part_2 VALUES LESS THAN
          (to_date('01/01/2022','dd/mm/yyyy'))
     );
Table created.

SQL> insert into range_example
     ( range_key_column, data )
     values
     ( to_date( '15-dec-2020 00:00:00', 'dd-mon-yyyy hh24:mi:ss' ),
       'application data...' );
1 row created.

SQL> insert into range_example
     ( range_key_column, data )
     values
     ( to_date( '01-jan-2021 00:00:00', 'dd-mon-yyyy hh24:mi:ss' )-1/24/60/60,
       'application data...' );
1 row created.

SQL> select * from range_example partition(part_1);
RANGE_KEY DATA
15-DEC-20 application data…
31-DEC-20 application data…

We take one of the rows and update the value in its RANGE_KEY_COLUMN such that it can remain in PART_1:
SQL> update range_example
        set range_key_column = trunc(range_key_column)
      where range_key_column =
            to_date( '31-dec-2020 23:59:59', 'dd-mon-yyyy hh24:mi:ss' );
1 row updated.
As expected, this succeeds: the row remains in partition PART_1. Next, we update the
RANGE_KEY_COLUMN to a value that would cause it to belong in PART_2:
SQL> update range_example
        set range_key_column = to_date('01-jan-2021','dd-mon-yyyy')
      where range_key_column = to_date('31-dec-2020','dd-mon-yyyy');
update range_example
ERROR at line 1:
ORA-14402: updating partition key column would cause a partition change

This immediately raises an error since we did not explicitly enable row movement. We can enable row movement on this table to allow the row to move from partition to partition.
You should be aware of a subtle side effect of doing this, however, namely, that the ROWID of a row will change as the result of the update:
SQL> select rowid from range_example
      where range_key_column = to_date('31-dec-2020','dd-mon-yyyy');

SQL> alter table range_example enable row movement;
Table altered.

SQL> update range_example
        set range_key_column = to_date('01-jan-2021','dd-mon-yyyy')
      where range_key_column = to_date('31-dec-2020','dd-mon-yyyy');
1 row updated.

As long as you understand that the ROWID of the row will change on this update, enabling row movement will allow you to update partition keys.

Note There are other cases where a ROWID can change as a result of an update. It can happen as a result of an update to the primary key of an IOT. The universal ROWID will change for that row, too. The FLASHBACK TABLE command may also change the ROWID of rows, as might the ALTER TABLE SHRINK command.

You need to understand that, internally, row movement is done as if you had, in fact, deleted the row and reinserted it. It will update every single index on this table and delete the old entry and insert a new one. It will do the physical work of a DELETE plus an INSERT.

However, it is considered an update by Oracle even though it physically deletes and inserts the row—therefore, it won’t cause INSERT and DELETE triggers to fire, just the UPDATE triggers. Additionally, child tables that might otherwise prevent a DELETE due to a foreign key constraint won’t interfere with this update.
You do have to be prepared, however, for the extra work that will be performed; it is much more expensive than a normal UPDATE. Therefore, it would be a bad design decision to construct a system whereby the partition key was modified frequently and that modification would cause a partition movement.

Interval Partitioning

Interval partitioning is very similar to range partitioning described previously—in fact, it starts with a range partitioned table but adds a rule (the interval) to the definition so the database knows how to add partitions in the future.

The goal of interval partitioning is to create new partitions for data if, and only if, data exists for a given partition, and only when that data arrives in the database. In other words, to remove the need to pre-create partitions for data, allowing the data itself to create the partition as it is inserted.

To use interval partitioning, you start with a range partitioned table without a MAXVALUE partition and specify an interval to add to the upper bound, the highest value of that partitioned table to create a new range.

You need to have a table that is range partitioned on a single column that permits adding a NUMBER or INTERVAL type to it (e.g., a table partitioned by a VARCHAR2 field cannot be interval partitioned; there is nothing you can add to a VARCHAR2).

You can use interval partitioning with any suitable existing range partitioned table; that is, you can ALTER an existing range partitioned table to be interval partitioned, or you can create one with the CREATE TABLE command.
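Converting an existing range partitioned table is a one-line ALTER; a sketch (assuming a table T range partitioned on a DATE column):

```sql
-- Turn on interval partitioning: one new partition per month
alter table t set interval ( numtoyminterval(1,'month') );

-- Turn it back off, reverting to plain range partitioning
-- (partitions already created are retained)
alter table t set interval ();
```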

For example, suppose you had a range partitioned table that said “anything strictly less than 01-JAN-2021” (data in the year 2020 and before) goes into partition P1—and that was it.

So it had one partition for all data in the year 2020 and before. If you attempted to insert data for the year 2022 into the table, the insert would fail as demonstrated previously in the section on range partitioning.

With interval partitioning, you can create a table and specify both a range (strictly less than 01-JAN-2021) and an interval—say, one month in duration—and the database would create monthly partitions (each capable of holding exactly one month’s worth of data) as the data arrived.

The database would not pre-create all possible partitions because that would not be practical. But, as each row arrived, the database would see whether the partition for the month in question existed. The database would create the partition if needed. Here is an example of the syntax:

$ sqlplus eoda/foo@PDB1
SQL> create table audit_trail
     ( ts   timestamp,
       data varchar2(30)
     )
     partition by range(ts)
     interval (numtoyminterval(1,'month'))
     store in (users, example)
     (partition p0 values less than
        (to_date('01-01-1900','dd-mm-yyyy')));
Table created.

Note You might have a question in your mind, especially if you just finished reading the previous chapter on datatypes. You can see we are partitioning by a TIMESTAMP, and we are adding an INTERVAL of one month to it. In Chapter 12, we saw how adding an INTERVAL of one month to a TIMESTAMP that fell on January 31 would raise an error, since there is no February 31. Will the same issue happen with interval partitioning? The answer is yes; if you attempt to use a date such as '29-01-1990' (any day of the month after 28 would suffice), you will receive an error "ORA-14767: Cannot specify this interval with existing high bounds". The database will not permit you to use a boundary value that is not safe to add the interval to.

On lines 8 and 9, you see the range partitioning scheme for this table; it starts with a single empty partition that would contain any data prior to 01-JAN-1900. Presumably, since the table holds an audit trail, this partition will remain small and empty forever. It is a mandatory partition and is referred to as the transitional partition.

All data that is strictly less than this current high value partition will be range partitioned, using traditional range partitioning. Only data that is created above the transitional partition high value will use interval partitioning. If we query the data dictionary, we can see what has been created so far:

SQL> select a.partition_name, a.tablespace_name, a.high_value,
            decode( a.interval, 'YES', b.interval ) interval
       from user_tab_partitions a, user_part_tables b
      where a.table_name = 'AUDIT_TRAIL'
        and a.table_name = b.table_name
      order by a.partition_position;

So far, we have just the single partition, and it is not an INTERVAL partition, as shown by the empty INTERVAL column. Rather, it is just a regular RANGE partition right now; it will hold anything strictly less than 01-JAN-1900.

Looking at the CREATE TABLE statement again, we can see the new interval partitioning–specific information on lines 6 and 7:

interval (numtoyminterval(1,'month'))
store in (users, example)

On line 6, we have the actual interval specification of NUMTOYMINTERVAL(1,'MONTH'). Our goal was to store monthly partitions—a new partition for each month’s worth of data—a very common goal. By using a date that is safe to add a month to—the first of the month—we can have the database create monthly partitions on the fly, as data arrives, for us (refer to Chapter 12 for why adding a month to a timestamp can be error-prone in some cases).

On line 7, we have specifics: store in (users,example). This allows us to tell the database where to create these new partitions—what tablespaces to use. As the database figures out what partitions it wants to create, it uses this list to decide what tablespace to create each partition in. This allows the DBA to control the maximum desired tablespace size: they might not want a single 500GB tablespace, but they would be comfortable with ten 50GB tablespaces. In that case, they would set up ten tablespaces and allow the database to use all ten to create partitions. Let’s insert a row of data now and see what happens:

SQL> insert into audit_trail (ts,data) values
       ( to_timestamp('27-feb-2020','dd-mon-yyyy'), 'xx' );
1 row created.
SQL> select a.partition_name, a.tablespace_name, a.high_value,
            decode( a.interval, 'YES', b.interval ) interval
       from user_tab_partitions a, user_part_tables b
      where a.table_name = 'AUDIT_TRAIL'
        and a.table_name = b.table_name
      order by a.partition_position;

If you recall from the Range Partitioning section, you would expect that INSERT to fail. However, since we are using interval partitioning, it succeeds and, in fact, creates a new partition, SYS_P1623. The HIGH_VALUE for this partition is 01-MAR-2020. If we were using range partitioning, that would imply anything strictly less than 01-MAR-2020 and greater than or equal to 01-JAN-1900 would go into this partition, but since we have an interval, the rules are different: once the interval is in effect, the range for a partition is anything greater than or equal to HIGH_VALUE - INTERVAL and strictly less than HIGH_VALUE. So, this partition would have the range of

SQL> select TIMESTAMP' 2020-03-01 00:00:00'-NUMTOYMINTERVAL(1,'MONTH') greater_than_eq_to,
            TIMESTAMP' 2020-03-01 00:00:00' strictly_less_than
       from dual;

Local Indexes

Oracle makes a distinction between the following two types of local indexes:

• Local prefixed indexes: These are indexes whereby the partition keys are on the leading edge of the index definition. For example, if a table is range partitioned on a column named LOAD_DATE, a local prefixed index on that table would have LOAD_DATE as the first column in its column list.

• Local nonprefixed indexes: These indexes do not have the partition key on the leading edge of their column list. The index may or may not contain the partition key columns.

Both types of indexes are able to take advantage of partition elimination, both can support uniqueness (as long as the nonprefixed index includes the partition key), and so on. The fact is that a query that uses a local prefixed index will always allow for index partition elimination, whereas a query that uses a local nonprefixed index might not. This is why local nonprefixed indexes are said to be slower by some people—they do not enforce partition elimination (but they do support it).

There is nothing inherently better about a local prefixed index as opposed to a local nonprefixed index when that index is used as the initial path to the table in a query. What I mean is that if the query can start with “scan an index” as the first step, there isn’t much difference between a prefixed and a nonprefixed index.

Partition Elimination Behavior

For a query that starts with an index access, whether or not it can eliminate partitions from consideration depends entirely on the predicate in your query. A small example will help demonstrate this. The following code creates a table, PARTITIONED_TABLE, that is range partitioned on a numeric column A such that values strictly less than 2 will be in partition PART_1 and values greater than or equal to 2 but strictly less than 3 will be in partition PART_2:
$ sqlplus eoda/foo@PDB1
SQL> CREATE TABLE partitioned_table
     ( a int,
       b int,
       data char(20)
     )
     PARTITION BY RANGE (a)
     ( PARTITION part_1 VALUES LESS THAN(2) tablespace p1,
       PARTITION part_2 VALUES LESS THAN(3) tablespace p2
     );
Table created.

We then create both a local prefixed index, LOCAL_PREFIXED, and a local nonprefixed index, LOCAL_NONPREFIXED. Note that the nonprefixed index does not have A on the leading edge of its definition, which is what makes it a nonprefixed index:
SQL> create index local_prefixed on partitioned_table (a,b) local; Index created.
SQL> create index local_nonprefixed on partitioned_table (b) local; Index created.

Next, we'll insert some data into the table (note that the generated values of A span both partitions) and gather statistics:
SQL> insert into partitioned_table
     select mod(rownum-1,2)+1, rownum, 'x'
       from dual connect by level <= 70000;
70000 rows created.
SQL> exec dbms_stats.gather_table_stats( user, 'PARTITIONED_TABLE' )
PL/SQL procedure successfully completed.

We take tablespace P2 offline; it contains the PART_2 partition for both the table and its indexes:
SQL> alter tablespace p2 offline;
Tablespace altered.

Taking tablespace P2 offline will prevent Oracle from accessing those specific index partitions. It will be as if we had suffered media failure, causing them to become unavailable. Now we’ll query the table to see what index partitions are needed by different queries. This first query is written to permit the use of the local prefixed index:
SQL> select * from partitioned_table where a = 1 and b = 1;

This query succeeded, and we can see why by reviewing the explain plan. We’ll use the built-in package DBMS_XPLAN to see what partitions this query accesses. The PSTART (partition start) and PSTOP (partition stop) columns in the output show us exactly what partitions this query needs to have online and available in order to succeed:
SQL> explain plan for select * from partitioned_table where a = 1 and b = 1; Explained.

Now access DBMS_XPLAN.DISPLAY and instruct it to show the basic explain plan details plus partitioning information:
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

So, the query that uses LOCAL_PREFIXED succeeds. The optimizer was able to exclude PART_2 of LOCAL_PREFIXED from consideration because we specified A=1 in the query, and we can see that clearly in the plan: PSTART and PSTOP are both equal to 1. Partition elimination kicked in for us. The second query fails, however:
SQL> select * from partitioned_table where b = 1;
ERROR:
ORA-00376: file 23 cannot be read at this time
ORA-01110: data file 23: '/opt/oracle/oradata/CDB/C217E68DF48779E1E0530101007F73B9/datafile/o1_mf_p2_jc8bg9py_.dbf'

And using the same technique, we can see why:
SQL> explain plan for select * from partitioned_table where b = 1; Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

Here, the optimizer was not able to remove PART_2 of LOCAL_NONPREFIXED from consideration—it needed to look in both the PART_1 and PART_2 partitions of the index to see if B=1 was in there. Herein lies a performance issue with local nonprefixed indexes: they do not make you use the partition key in the predicate as a prefixed index does. It is not that prefixed indexes are better; it’s just that in order to use them, you must use a query that allows for partition elimination.

If we drop the LOCAL_PREFIXED index and rerun the original successful query, as follows:
SQL> drop index local_prefixed;
Index dropped.
SQL> select * from partitioned_table where a = 1 and b = 1;

It succeeds, but as we’ll see, it used the same index that just a moment ago failed us. The plan shows that Oracle was able to employ partition elimination here—the predicate A=1 was enough information for the database to eliminate index partition PART_2 from consideration:
SQL> explain plan for select * from partitioned_table where a = 1 and b = 1; Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

Note the PSTART and PSTOP column values of 1 and 1. This proves that the optimizer is able to perform partition elimination even for nonprefixed local indexes.

If you frequently query the preceding table with the following queries, then you might consider using a local nonprefixed index on (b,a):
select … from partitioned_table where a = :a and b = :b;
select … from partitioned_table where b = :b;

That index would be useful for both of the preceding queries. The local prefixed index on (a,b) would be useful only for the first query.
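Creating such an index on the PARTITIONED_TABLE from the earlier example would look like the following (the index name here is our own):

```sql
create index local_nonprefixed_ba
    on partitioned_table (b,a) local;
```

Because the partition key A appears in the index, just not on the leading edge, queries that supply A in the predicate can still benefit from index partition elimination, while queries on B alone can use the index by probing every index partition.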

The bottom line here is that you should not be afraid of nonprefixed indexes or consider them as major performance inhibitors. If you have many queries that could benefit from a nonprefixed index as outlined previously, then you should consider using one. The main concern is to ensure that your queries contain predicates that allow for index partition elimination whenever possible. The use of prefixed local indexes enforces that consideration. The use of nonprefixed indexes does not. Consider also how the index will be used. If it will be used as the first step in a query plan, there are not many differences between the two types of indexes.

Virtual Column Partitioning

Virtual column partitioning allows you to partition based on a SQL expression. This type of partitioning is useful when a table column is overloaded with multiple business values and you want to partition on a portion of that column. For example, let’s say you have a RESERVATION_CODE column in a table:
$ sqlplus eoda/foo@PDB1
SQL> create table res(reservation_code varchar2(30)); Table created.

And the first character in the RESERVATION_CODE column defines a region from which the reservation originated. For the purposes of this example, let’s say a first character of an A or C maps to the NE region, values of B map to the SW region, and values of D map to the NW region.

In this scenario, we know that the first character represents the region, and we want to be able to list partition by region. With the data as it is, it’s not practical to list partition by the RESERVATION_CODE column, whereas virtual partitioning allows us to apply a SQL function to the column and list partition by the first character. Here’s what the table definition looks like with virtual column partitioning:
SQL> drop table res;
SQL> create table res
     ( reservation_code varchar2(30),
       region as
       ( decode(substr(reservation_code,1,1),
                'A','NE','C','NE','B','SW','D','NW') )
     )
     partition by list (region)
     ( partition NE values('NE'),
       partition SW values('SW'),
       partition NW values('NW')
     );
Table created.

We can view the partition information via this query:
SQL> select a.table_name, a.partition_name, a.high_value
       from user_tab_partitions a, user_part_tables b
      where a.table_name = 'RES'
        and a.table_name = b.table_name
      order by a.table_name;

TABLE_NAME   PARTITION_   HIGH_VALUE
RES          NE           'NE'
RES          NW           'NW'
RES          SW           'SW'

Next, some random test data is inserted into the table (your random results will be different from this example):
SQL> insert into res (reservation_code)
     select chr(64+(round(dbms_random.value(1,4)))) || level
       from dual connect by level < 10;
9 rows created.

Now let’s see how the data was partitioned:
SQL> select 'NE', reservation_code, region from res partition(NE) union all
     select 'SW', reservation_code, region from res partition(SW) union all
     select 'NW', reservation_code, region from res partition(NW);

'N RESERVATION_CODE RE

NE C3 NE
NE C5 NE
NE A6 NE
NE C8 NE
SW B2 SW
SW B7 SW
SW B9 SW
NW D1 NW
NW D4 NW

In this way, virtual column partitioning is often appropriate when there is a business requirement to partition on portions of data in a column or combinations of data from different columns (especially when there might not be an obvious way to list or range partition). The expression behind a virtual column can be a complex calculation, return a subset of a column string, combine column values, and so on.
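As a final sketch of that flexibility, a virtual column can combine values from more than one real column and serve as the list partition key. Everything in this example is hypothetical (the table, columns, and region codes are our own):

```sql
create table sales
( country_code  varchar2(2),
  channel       varchar2(10),
  amount        number,
  -- virtual column derived from two real columns
  sales_region  as ( country_code || '-' || upper(substr(channel,1,3)) )
)
partition by list (sales_region)
( partition web_us values ('US-WEB'),
  partition web_eu values ('DE-WEB','FR-WEB'),
  partition other  values (default)
);
```

Rows are routed by the computed SALES_REGION value just as they were routed by REGION in the RES example.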