Summary

Partitioning is extremely useful for scaling up large database objects. This scaling is visible from the perspectives of performance, availability, and administration. All three are extremely important to different people.

The DBA is concerned with administrative scaling. The owners of the system are concerned with availability, because downtime is lost money, and anything that reduces downtime—or reduces the impact of downtime—boosts the payback for a system.

The end users of the system are concerned with performance scaling. No one likes to use a slow system, after all.

We also looked at the fact that in an OLTP system, partitions may not increase performance, especially if applied improperly. Partitions can increase the performance of certain classes of queries, but those classes of queries are generally not found in an OLTP system.

This point is important to understand, as many people associate partitioning with “free performance increase.” This does not mean that partitions should not be used in OLTP systems—they do provide many other salient benefits in this environment— just don’t expect a massive increase in throughput.

Expect reduced downtime. Expect the same good performance (partitioning will not slow you down when applied appropriately). Expect easier manageability, which may in turn lead to increased performance, because DBAs can afford to perform some maintenance operations more frequently.

We investigated the various table partitioning schemes offered by Oracle—range, hash, list, interval, reference, interval reference, virtual column, and composite—and talked about when they are most appropriately used.

We spent the bulk of our time looking at partitioned indexes and examining the differences between prefixed and nonprefixed and local and global indexes. We investigated partition operations in data warehouses combined with global indexes, and the trade-off between resource consumption and availability.

We also looked at the ease of maintenance features such as the ability to perform maintenance operations on multiple partitions at a time, cascade truncate, and cascade exchange. Oracle continues to update and improve partitioning with each new release.

Over time, I see this feature becoming more relevant to a broader audience as the size and scale of database applications grow.

The Internet and its database-hungry nature along with legislation requiring longer retention of audit data are leading to more and more extremely large collections of data, and partitioning is a natural tool to help manage that problem.

Ease of Maintenance Features

At the beginning of this chapter, I stated the goal was to provide a practical guide to implement applications with partitioning, and that I wouldn’t be focusing so much on administration. However, there are a few new administrative features available that deserve some discussion, namely:

• Multiple partition maintenance operations
• Cascade exchange
• Cascade truncate

These features have a positive impact in terms of ease of maintenance, data integrity, and performance. Therefore, it’s important to be aware of these features when implementing partitioning.

Multiple Partition Maintenance Operations
This feature eases the administration of partitioning and in some scenarios reduces the database resources required to perform maintenance operations. Oracle allows you to combine more than one partition maintenance operation in one DDL statement. Consider the following example:

$ sqlplus eoda/foo@PDB1
SQL> create table p_table( a int )
partition by range (a)
(partition p1 values less than (1000),
 partition p2 values less than (2000));
Table created.

Now say you want to add multiple partitions to the table that was just created. Oracle allows you to perform multiple partition operations in one statement:

SQL> alter table p_table add
partition p3 values less than (3000),
partition p4 values less than (4000);
Table altered.

Note In addition to adding partitions, multiple partition maintenance operations can be applied to dropping, merging, splitting, and truncating.
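The same combined-DDL form applies to the other operations mentioned in the note. As a sketch (assuming Oracle 12c or above, and using the P_TABLE partitions created earlier), dropping or truncating several partitions in one statement looks like this:

```sql
SQL> alter table p_table drop partitions p3, p4;
Table altered.
SQL> alter table p_table truncate partitions p1, p2;
Table truncated.
```

Each statement is a single DDL operation, so the table is left in a consistent state even if it fails partway through.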

Performing multiple maintenance partition operations in one DDL statement is particularly advantageous for splitting partitions and thus deserves more discussion. A small example will illustrate this. Let’s set this up by creating a table and loading it with data:

SQL> CREATE TABLE sales( sales_id int, s_date date )
PARTITION BY RANGE (s_date)
(PARTITION P2021 VALUES LESS THAN (to_date('01-jan-2022','dd-mon-yyyy')));
Table created.
SQL> insert into sales
select level, to_date('01-jan-2021','dd-mon-yyyy') + ceil(dbms_random.value(1,364))
from dual connect by level < 100000;
99999 rows created.

Next, we create a small utility function to help us measure the resources consumed while performing an operation:

SQL> create or replace function get_stat_val( p_name in varchar2 ) return number
as
    l_val number;
begin
    select b.value
      into l_val
      from v$statname a, v$mystat b
     where a.statistic# = b.statistic#
       and a.name = p_name;
    return l_val;
end;
/
Function created.

Next, we’ll split the P2021 partition into four partitions in one DDL statement and measure the resources consumed:

SQL> var r1 number
SQL> exec :r1 := get_stat_val('redo size');
PL/SQL procedure successfully completed.
SQL> var c1 number
SQL> exec :c1 := dbms_utility.get_cpu_time;
PL/SQL procedure successfully completed.
SQL> alter table sales split partition P2021
into (partition Q1 values less than (to_date('01-apr-2021','dd-mon-yyyy')),
      partition Q2 values less than (to_date('01-jul-2021','dd-mon-yyyy')),
      partition Q3 values less than (to_date('01-oct-2021','dd-mon-yyyy')),
      partition Q4);
Table altered.
SQL> set serverout on
SQL> exec dbms_output.put_line(get_stat_val('redo size') - :r1);
47068
SQL> exec dbms_output.put_line(dbms_utility.get_cpu_time - :c1);
12

The amount of redo generated via the single DDL statement is relatively low. Depending on the number of partitions being split and whether you’re updating indexes at the same time, the amount of redo generated and CPU consumed can be considerably less than when splitting the maintenance operations into multiple statements.
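For contrast, reaching the same four partitions with single-operation DDL would take three successive splits, each of which must scan and rewrite the rows remaining in P2021. A sketch against the same SALES table:

```sql
SQL> alter table sales split partition P2021
     at (to_date('01-apr-2021','dd-mon-yyyy'))
     into (partition Q1, partition P2021);
SQL> alter table sales split partition P2021
     at (to_date('01-jul-2021','dd-mon-yyyy'))
     into (partition Q2, partition P2021);
SQL> alter table sales split partition P2021
     at (to_date('01-oct-2021','dd-mon-yyyy'))
     into (partition Q3, partition Q4);
```

Wrapping these three statements with the same redo and CPU measurements would typically show noticeably higher resource consumption than the single combined split.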

OLTP and Global Indexes

This shows the distribution of data, by location, into the individual partitions. We can now review some query plans to see what we could expect performance-wise:

SQL> variable x varchar2(30);
SQL> begin
   dbms_stats.set_table_stats( user, 'EMP', numrows => 100000, numblks => 10000 );
end;
/
PL/SQL procedure successfully completed.
SQL> explain plan for select empno, job, loc from emp where empno = :x;
Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

The plan here shows an INDEX UNIQUE SCAN of the nonpartitioned index EMP_PK that was created in support of our primary key. Then there is a TABLE ACCESS BY GLOBAL INDEX ROWID, with a PSTART and PSTOP of ROWID/ROWID, meaning that when we get the ROWID from the index, it will tell us precisely which table partition to read to get this row. This index access will be as effective as on a nonpartitioned table and perform the same amount of I/O to do so. It is just a simple, single index unique scan followed by “get this row by rowid.” Now, let’s look at one of the other global indexes, the one on JOB:

SQL> explain plan for select empno, job, loc from emp where job = :x; Explained.
SQL> select * from table(dbms_xplan.display);

Sure enough, we see a similar effect for the INDEX RANGE SCAN. Our indexes are used and can provide high-speed OLTP access to the underlying data. If they were partitioned, they would have to be prefixed and enforce index partition elimination; hence, they are scalable as well, meaning we can partition them and observe the same behavior. In a moment, we’ll look at what would happen if we used LOCAL indexes only.

Lastly, let’s look at the area of availability. The Oracle documentation claims that globally partitioned indexes make for less available data than locally partitioned indexes. I don’t fully agree with this blanket characterization. I believe that in an OLTP system they are as highly available as a locally partitioned index. Consider the following:

SQL> alter tablespace p1 offline;
SQL> alter tablespace p2 offline;
SQL> alter tablespace p3 offline;
SQL> select empno, job, loc from emp where empno = 7782;
EMPNO JOB LOC
7782 MANAGER NEW YORK

Here, even though most of the underlying data is unavailable in the table, we can still gain access to any bit of data available via that index. As long as the EMPNO we want is in a tablespace that is available, and our GLOBAL index is available, our GLOBAL index works for us. On the other hand, if we had been using the highly available local index in the preceding case, we might have been prevented from accessing the data! This is a side effect of the fact that we partitioned on LOC but needed to query by EMPNO. We would have had to probe each local index partition and would have failed on the index partitions that were not available.

Other types of queries, however, will not (and cannot) function at this point in time:
SQL> select empno, job, loc from emp where job = 'CLERK';
ERROR at line 1:
ORA-00376: file 38 cannot be read at this time
ORA-01110: data file 38: '/opt/oracle/oradata/CDB/
C217E68DF48779E1E0530101007F73B9/datafile/o1_mf_p2_jcbnhfh2_.dbf'

The CLERK data is in all of the partitions, and the fact that three of the tablespaces are offline does affect us. This is unavoidable unless we had partitioned on JOB, but then we would have had the same issues with queries that needed data by LOC. Anytime you need to access the data from many different keys, you will have this issue. Oracle will give you the data whenever it can.

Note, however, that if the query can be answered from the index, avoiding the TABLE ACCESS BY ROWID, the fact that the data is unavailable is not as meaningful:
SQL> select count(*) from emp where job = 'CLERK';
COUNT(*)

Since Oracle didn’t need the table in this case, the fact that most of the partitions were offline doesn’t affect this query (assuming the index isn’t in one of the offline tablespaces of course). As this type of optimization (i.e., answer the query using just the index) is common in an OLTP system, there will be many applications that are not affected by the data that is offline. All we need to do now is make the offline data available as fast as possible (restore it and recover it).

Sliding Windows and Indexes

The following example implements a classic sliding window of data. In many implementations, data is added to a warehouse over time and the oldest data is aged out. Many times, this data is range partitioned by a date attribute, so that the oldest data is stored together in a single partition, and the newly loaded data is likewise stored together in a new partition. The monthly load process involves the following:

•\ Detaching the old data: The oldest partition is either dropped or exchanged with an empty table (turning the oldest partition into a table) to permit archiving of the old data.
•\ Loading and indexing of the new data: The new data is loaded into a work table and indexed and validated.
•\ Attaching the new data: Once the new data is loaded and processed, the table it is in is exchanged with an empty partition in the partitioned table, turning this newly loaded data in a table into a partition of the larger partitioned table.

This process is repeated every month, or however often the load process is performed; it could be every day or every week. We will implement this very typical process in this section to show the impact of global partitioned indexes and demonstrate the options we have during partition operations to increase availability, allowing us to implement a sliding window of data and maintain continuous availability of data.

We’ll process yearly data in this example and have fiscal years 2020 and 2021 loaded up. The table will be partitioned by the TIMESTAMP column, and it will have two indexes created on it—one is a locally partitioned index on the ID column, and the other is a global index (nonpartitioned, in this case) on the TIMESTAMP column:

$ sqlplus eoda/foo@PDB1
SQL> drop table partitioned;
SQL> CREATE TABLE partitioned( timestamp date, id int )
PARTITION BY RANGE (timestamp)
(PARTITION fy_2020 VALUES LESS THAN
( to_date('01-jan-2021','dd-mon-yyyy') ) ,
 PARTITION fy_2021 VALUES LESS THAN
( to_date('01-jan-2022','dd-mon-yyyy') ) );
Table created.
SQL> insert into partitioned partition(fy_2020)
select to_date('31-dec-2020','dd-mon-yyyy')-mod(rownum,360), rownum
from dual connect by level <= 70000;
70000 rows created.
SQL> insert into partitioned partition(fy_2021)
select to_date('31-dec-2021','dd-mon-yyyy')-mod(rownum,360), rownum
from dual connect by level <= 70000;
70000 rows created.
SQL> create index partitioned_idx_local on partitioned(id) LOCAL;
Index created.
SQL> create index partitioned_idx_global on partitioned(timestamp) GLOBAL;
Index created.

This sets up our warehouse table. The data is partitioned by fiscal year, and we have the last two years’ worth of data online. This table has two indexes: one is LOCAL and the other is GLOBAL. Now it’s the end of the year and we would like to do the following:

1. Remove the oldest fiscal year data: We do not want to lose this data forever; we just want to age it out and archive it.
2. Add the newest fiscal year data: It will take a while to load it, transform it, index it, and so on. We would like to do this work without impacting the availability of the current data, if at all possible.

The first step is to set up an empty table for fiscal year 2020 that looks just like the partitioned table. We’ll use this table to exchange with the FY_2020 partition in the partitioned table, turning that partition into a table and in turn emptying out the partition in the partitioned table. The net effect is that the oldest data in the partitioned table will, in effect, have been removed after the exchange:
SQL> create table fy_2020 ( timestamp date, id int );
Table created.
SQL> create index fy_2020_idx on fy_2020(id);
Index created.

We’ll do the same to the new data to be loaded. We’ll create and load a table that structurally looks like the existing partitioned table (but that is not itself partitioned):

SQL> create table fy_2022 ( timestamp date, id int );
Table created.
SQL> insert into fy_2022
select to_date('31-dec-2022','dd-mon-yyyy')-mod(rownum,360), rownum
from dual connect by level <= 70000;
70000 rows created.
SQL> create index fy_2022_idx on fy_2022(id) nologging;
Index created.

Global Indexes

Global indexes are partitioned using a scheme that is different from that used in the underlying table. The table might be partitioned by a TIMESTAMP column into ten partitions, and a global index on that table could be partitioned into five partitions by the REGION column. Unlike local indexes, there is only one class of global index, and that is a prefixed global index. There is no support for a global index whose index key does not begin with the partitioning key for that index. That implies that whatever attribute(s) you use to partition the index will be on the leading edge of the index key itself.

Building on our previous example, here is a quick example of the use of a global index. It shows that a global partitioned index can be used to enforce uniqueness for a primary key, so you can have partitioned indexes that enforce uniqueness, but do not include the partition key of the table. The following example creates a table partitioned by TIMESTAMP that has an index partitioned by ID:
$ sqlplus eoda/foo@PDB1
SQL> drop table partitioned;
SQL> CREATE TABLE partitioned( timestamp date, id int )
PARTITION BY RANGE (timestamp)
(PARTITION part_1 VALUES LESS THAN
( to_date('01-jan-2021','dd-mon-yyyy') ) ,
 PARTITION part_2 VALUES LESS THAN
( to_date('01-jan-2022','dd-mon-yyyy') ));
Table created.
SQL> create index partitioned_index on partitioned(id) GLOBAL
partition by range(id)
(partition part_1 values less than(1000),
 partition part_2 values less than (MAXVALUE) );
Index created.

Note the use of MAXVALUE in this index. MAXVALUE can be used in any range partitioned table as well as in the index. It represents an infinite upper bound on the range. In our examples so far, we’ve used hard upper bounds on the ranges (VALUES LESS THAN a specified value). However, a global index has a requirement that the highest partition (the last partition) must have a partition bound whose value is MAXVALUE. This ensures that all rows in the underlying table can be placed in the index.

Now, completing this example, we’ll add our primary key to the table:
SQL> alter table partitioned add constraint partitioned_pk primary key(id);
Table altered.

It is not evident from this code that Oracle is using the index we created to enforce the primary key (it is to me because I know that Oracle is using it), so we can prove it by simply trying to drop that index:
SQL> drop index partitioned_index;
drop index partitioned_index
ERROR at line 1:
ORA-02429: cannot drop index used for enforcement of unique/primary key

To show that Oracle will not allow us to create a nonprefixed global index, we only need try the following:
SQL> create index partitioned_index2 on partitioned(timestamp,id) GLOBAL
partition by range(id)
(partition part_1 values less than(1000),
 partition part_2 values less than (MAXVALUE) );
partition by range(id)
ERROR at line 4:
ORA-14038: GLOBAL partitioned index must be prefixed

The error message is pretty clear. The global index must be prefixed. So, when would you use a global index? We’ll take a look at two system types, data warehouse and OLTP, and see when they might apply.

Data Warehousing and Global Indexes

Many data warehouses implement a sliding window approach to managing data—that is, drop the oldest partition of a table and add a new partition for the newly loaded data. In the following sections, we’ll take a look at what is meant by a sliding window of data and the potential impact of a global index on it.

Row Movement

You might wonder what would happen if the value of the column used to determine the partition is modified in any of the preceding partitioning schemes. There are two cases to consider:
• The modification would not cause a different partition to be used; the row would still belong in this partition. This is supported in all cases.
• The modification would cause the row to move across partitions. This is supported if row movement is enabled for the table; otherwise, an error will be raised.

We can observe these behaviors easily. In the previous example in the “Range Partitioning” section, we inserted a pair of rows into PART_1 of the RANGE_EXAMPLE table:
$ sqlplus eoda/foo@PDB1
SQL> CREATE TABLE range_example
( range_key_column date,
  data varchar2(20) )
PARTITION BY RANGE (range_key_column)
( PARTITION part_1 VALUES LESS THAN
  (to_date('01/01/2021','dd/mm/yyyy')),
  PARTITION part_2 VALUES LESS THAN
  (to_date('01/01/2022','dd/mm/yyyy')));
Table created.

SQL> insert into range_example( range_key_column, data )
values( to_date( '15-dec-2020 00:00:00', 'dd-mon-yyyy hh24:mi:ss' ),
        'application data...' );
1 row created.

SQL> insert into range_example( range_key_column, data )
values( to_date( '01-jan-2021 00:00:00', 'dd-mon-yyyy hh24:mi:ss' )-1/24/60/60,
        'application data...' );
1 row created.

SQL> select * from range_example partition(part_1);
RANGE_KEY DATA
15-DEC-20 application data...
31-DEC-20 application data...

We take one of the rows and update the value in its RANGE_KEY_COLUMN such that it can remain in PART_1:
SQL> update range_example set range_key_column = trunc(range_key_column)
where range_key_column = to_date( '31-dec-2020 23:59:59', 'dd-mon-yyyy hh24:mi:ss' );
1 row updated.
As expected, this succeeds: the row remains in partition PART_1. Next, we update the
RANGE_KEY_COLUMN to a value that would cause it to belong in PART_2:
SQL> update range_example
set range_key_column = to_date('01-jan-2021','dd-mon-yyyy')
where range_key_column = to_date('31-dec-2020','dd-mon-yyyy');
update range_example
ERROR at line 1:
ORA-14402: updating partition key column would cause a partition change

This immediately raises an error since we did not explicitly enable row movement. We can enable row movement on this table to allow the row to move from partition to partition.
You should be aware of a subtle side effect of doing this, however, namely, that the ROWID of a row will change as the result of the update:
SQL> select rowid from range_example
where range_key_column = to_date('31-dec-2020','dd-mon-yyyy');
SQL> alter table range_example enable row movement;
Table altered.
SQL> update range_example
set range_key_column = to_date('01-jan-2021','dd-mon-yyyy')
where range_key_column = to_date('31-dec-2020','dd-mon-yyyy');
1 row updated.

As long as you understand that the ROWID of the row will change on this update, enabling row movement will allow you to update partition keys.

Note There are other cases where a ROWID can change as a result of an update. It can happen as a result of an update to the primary key of an IOT. The universal ROWID will change for that row, too. The FLASHBACK TABLE command may also change the ROWID of rows, as might the ALTER TABLE SHRINK command.

You need to understand that, internally, row movement is done as if you had, in fact, deleted the row and reinserted it. It will update every single index on this table and delete the old entry and insert a new one. It will do the physical work of a DELETE plus an INSERT.

However, it is considered an update by Oracle even though it physically deletes and inserts the row—therefore, it won’t cause INSERT and DELETE triggers to fire, just the UPDATE triggers. Additionally, child tables that might otherwise prevent a DELETE due to a foreign key constraint won’t prevent this update.
You do have to be prepared, however, for the extra work that will be performed; it is much more expensive than a normal UPDATE. Therefore, it would be a bad design decision to construct a system whereby the partition key was modified frequently and that modification would cause a partition movement.

Table Partitioning Schemes Wrap-Up

In general, range partitioning is useful when you have data that is logically segregated by some value(s). Time-based data immediately comes to the forefront as a classic example—partition by “Sales Quarter,” “Fiscal Year,” or “Month.” Range partitioning is able to take advantage of partition elimination in many cases, including the use of exact equality and ranges (less than, greater than, between, and so on).

Hash partitioning is suitable for data that has no natural ranges by which you can partition. For example, if you had to load a table full of census-related data, there might not be an attribute by which it would make sense to range partition. However, you would still like to take advantage of the administrative, performance, and availability enhancements offered by partitioning. Here, you would simply pick a unique or almost unique set of columns to hash on. This would achieve an even distribution of data across as many partitions as you like. Hash partitioned objects can take advantage of partition elimination when exact equality or IN ( value, value, … ) is used, but not when ranges of data are used.
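As a sketch of the census example (the table and column names here are hypothetical), a hash partitioned table might look like the following. Note that hash partitioning distributes data most evenly when the number of partitions is a power of two:

```sql
SQL> create table census_data
     ( person_id number, payload varchar2(100) )
     partition by hash (person_id) partitions 8;
```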

List partitioning is suitable for data that has a column with a discrete set of values, and partitioning by the column makes sense based on the way your application uses it (e.g., it easily permits partition elimination in queries). Classic examples would be a state or region code—or, in fact, many code type attributes in general.
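A minimal sketch of a list partitioned table keyed on a region code (hypothetical names); a DEFAULT partition can be added to catch any value not named in the other lists:

```sql
SQL> create table sales_by_region
     ( region_code varchar2(2), amount number )
     partition by list (region_code)
     ( partition p_east  values ('NE','SE'),
       partition p_west  values ('NW','SW'),
       partition p_other values (DEFAULT) );
```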

Interval partitioning extends the range partitioning feature and allows partitions to automatically be added when data inserted into the table doesn’t fit into an existing partition. This feature greatly enhances range partitioning in that there is less maintenance involved (because the DBA doesn’t have to necessarily monitor the ranges and manually add partitions).
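A sketch of the syntax (hypothetical table): range partitioning with an INTERVAL clause, so the database creates a new monthly partition on demand for any row above the highest defined bound:

```sql
SQL> create table audit_log
     ( ts date, data varchar2(30) )
     partition by range (ts)
     interval (numtoyminterval(1,'MONTH'))
     ( partition p0 values less than
       (to_date('01-jan-2021','dd-mon-yyyy')) );
```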

Reference partitioning eases the implementation of partitioned tables that are related through referential integrity constraints. This allows the child table to be logically partitioned in the same manner as the parent table without having to duplicate parent table columns to the child table.
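A sketch with hypothetical ORDERS/ORDER_ITEMS tables: the child declares a NOT NULL foreign key (a requirement for reference partitioning) and is partitioned by that constraint, inheriting the parent’s partitioning without carrying an ORDER_DATE column of its own:

```sql
SQL> create table orders
     ( order_id number primary key, order_date date )
     partition by range (order_date)
     ( partition p_2021 values less than
       (to_date('01-jan-2022','dd-mon-yyyy')) );
SQL> create table order_items
     ( item_id  number,
       order_id number not null,
       constraint fk_order foreign key (order_id) references orders )
     partition by reference (fk_order);
```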

Interval reference partitioning allows you to combine the interval and reference partitioning features. This ability is useful when you need to use the interval and reference partitioning features in tandem.

Virtual column partitioning allows you to partition using a virtual column as the key. This feature provides you the flexibility to partition on a substring of a regular column value (or any other SQL expression). This is useful when it’s not feasible to use an existing column as the partition key, but you can partition on a subset of the value contained in an existing column.

Composite partitioning is useful when you have something logical by which you can range partition, but the resulting range partitions are still too large to manage effectively. You can apply range, list, or hash partitioning at the top level and then further subdivide each partition by hash, list, or even range.

This will allow you to spread I/O requests out across many devices in any given large partition. Additionally, you may achieve partition elimination at three levels now. If you query on the partition key, Oracle is able to eliminate any partitions that do not meet your criteria. If you add the subpartition key to your query, Oracle can eliminate the other subpartitions within that partition. If you just query on the subpartition key (not using the partition key), Oracle will query only those hash or list subpartitions that apply from each partition.
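A sketch of a composite range-hash table (hypothetical names), in which each yearly range partition is subdivided into four hash subpartitions on the customer key:

```sql
SQL> create table sales_composite
     ( s_date date, cust_id number, amount number )
     partition by range (s_date)
     subpartition by hash (cust_id) subpartitions 4
     ( partition p_2020 values less than
       (to_date('01-jan-2021','dd-mon-yyyy')),
       partition p_2021 values less than
       (to_date('01-jan-2022','dd-mon-yyyy')) );
```

A query filtering on both S_DATE and CUST_ID could then be pruned first to one range partition and then to one hash subpartition within it.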

It is recommended that if there is something by which it makes sense to range partition your data, you should use that over hash or list partitioning. Hash and list partitioning add many of the salient benefits of partitioning, but they are not as useful as range partitioning when it comes to partition elimination. Using hash or list partitions within range partitions is advisable when the resulting range partitions are too large to manage or when you want to use all PDML capabilities or parallel index scanning against a single range partition.

Interval Partitioning

That is—all of the data for the month of February 2020. If we insert another row in some other month, as follows, we can see that another partition, SYS_P1624, is added that contains all of the data for the month of June 2020:

SQL> insert into audit_trail (ts,data) values
( to_date('25-jun-2020','dd-mon-yyyy'), 'xx' );
1 row created.
SQL> select a.partition_name, a.tablespace_name, a.high_value,
       decode( a.interval, 'YES', b.interval ) interval
from user_tab_partitions a, user_part_tables b
where a.table_name = 'AUDIT_TRAIL'
and a.table_name = b.table_name
order by a.partition_position;

You might be looking at this output and asking why everything is in the USERS tablespace. We clearly asked for the data to be spread out over the USERS tablespace and the EXAMPLE tablespace, so why is everything in a single tablespace?

It has to do with the fact that when the database is figuring out what partition the data goes into, it is also computing which tablespace it would go into. Since each of our partitions is an even number of months away from each other and we are using just two tablespaces, we end up using the same tablespace over and over.

If we only loaded “every other month” into this table, we would end up using only a single tablespace. We can see that the EXAMPLE tablespace can be used by adding some row that is an “odd” number of months away from our existing data:

SQL> insert into audit_trail (ts,data) values
( to_date('15-mar-2020','dd-mon-yyyy'), 'xx' );
1 row created.
SQL> select a.partition_name, a.tablespace_name, a.high_value,
       decode( a.interval, 'YES', b.interval ) interval
from user_tab_partitions a, user_part_tables b
where a.table_name = 'AUDIT_TRAIL'
and a.table_name = b.table_name
order by a.partition_position;

Now we have used the EXAMPLE tablespace. This new partition was slid in between the two existing partitions and will contain all of our March 2020 data.

You might be asking, “What happens if I roll back at this point?” If we were to roll back, it should be obvious that the AUDIT_TRAIL rows we just inserted would go away:
SQL> select * from audit_trail;
SQL> rollback;
Rollback complete.
SQL> select * from audit_trail; no rows selected

But what isn’t immediately clear is what would happen to the partitions we added: do they stay, or will they go away as well? A quick query will verify that they will stay:

SQL> select a.partition_name, a.tablespace_name, a.high_value,
       decode( a.interval, 'YES', b.interval ) interval
from user_tab_partitions a, user_part_tables b
where a.table_name = 'AUDIT_TRAIL'
and a.table_name = b.table_name
order by a.partition_position;
PARTITION_ TABLESPACE HIGH_VALUE INTERVAL

As soon as they are created, they are committed and visible. These partitions are created using a recursive transaction, a transaction executed separate and distinct from any transaction you might already be performing.

When we went to insert the row and the database discovered that the partition we needed did not exist, the database immediately started a new transaction, updated the data dictionary to reflect the new partition’s existence, and committed its work.

It must do this, or there would be severe contention (serialization) on many inserts as other transactions would have to wait for us to commit to be able to see this new partition. Therefore, this DDL is done outside of your existing transaction, and the partitions will persist.

You might have noticed that the database names the partition for us; SYS_P1625 is the name of the newest partition. The names are neither sortable nor very meaningful in the sense most people would be used to.

They show the order in which the partitions were added to the table (although you cannot rely on that always being true; it is subject to change) but not much else. Normally, in a range partitioned table, the DBA would have named the partition using some naming scheme and in most cases would have made the partition names sortable.

For example, the February data would be in a partition named PART_2020_02 (using a format of PART_yyyy_mm), March would be in PART_2020_03, and so on. With interval partitioning, you have no control over the partition names as they are created, but you can easily rename them afterward if you like.
For example, we could query out the HIGH_VALUE string and, using dynamic SQL, convert that into nicely formatted, meaningful names.

We can do this because we understand how we’d like the names formatted; the database does not. For example:

SQL> declare
    l_str varchar2(4000);
begin
    for x in ( select a.partition_name, a.tablespace_name, a.high_value
                 from user_tab_partitions a
                where a.table_name = 'AUDIT_TRAIL'
                  and a.interval = 'YES'
                  and a.partition_name like 'SYS\_P%' escape '\' )
    loop
        execute immediate
          'select to_char( ' || x.high_value ||
          '-numtodsinterval(1,''second''), ''"PART_"yyyy_mm'' ) from dual'
          into l_str;
        execute immediate
          'alter table audit_trail rename partition "' ||
          x.partition_name || '" to "' || l_str || '"';
    end loop;
end;
/
PL/SQL procedure successfully completed.

So, what we’ve done is take the HIGH_VALUE and subtract one second from it. We know that the HIGH_VALUE represents the strictly less than value, so one second before its value would be a value in the range. Once we have that, we applied the format "PART_"yyyy_mm to the resulting date and get a string such as PART_2020_03 for March 2020. We use that string in a rename command, and now our data dictionary looks like this:
SQL> select a.partition_name, a.tablespace_name, a.high_value,
       decode( a.interval, 'YES', b.interval ) interval
from user_tab_partitions a, user_part_tables b
where a.table_name = 'AUDIT_TRAIL'
and a.table_name = b.table_name
order by a.partition_position;

We would just run that script every now and then to rename any newly added partitions to keep the nice naming convention in place. Bear in mind, to avoid any SQL injection issues (we are using string concatenation, not bind variables; we cannot use bind variables in DDL), we would want to keep this script as an anonymous block or as an invoker’s rights routine if we decide to make it a stored procedure. That will prevent others from running SQL in our schema as if they were us, which could be a disaster.
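If we did decide to package it, a sketch of an invoker's rights version could look like the following. The procedure name RENAME_INTERVAL_PARTS is made up for illustration; AUTHID CURRENT_USER is what makes it invoker's rights, and the call to DBMS_ASSERT.SIMPLE_SQL_NAME adds a sanity check on the concatenated table name:

SQL> create or replace procedure rename_interval_parts( p_table in varchar2 )
     authid current_user  -- runs with the privileges of the caller
     as
        l_str varchar2(4000);
     begin
        -- same loop as the anonymous block, but against whatever
        -- interval partitioned table name the caller passes in
        for x in ( select a.partition_name, a.high_value
                     from user_tab_partitions a
                    where a.table_name = p_table
                      and a.interval = 'YES'
                      and a.partition_name like 'SYS\_P%' escape '\' )
        loop
           execute immediate
             'select to_char( ' || x.high_value ||
             '-numtodsinterval(1,''second''), ''"PART_"yyyy_mm'' ) from dual'
             into l_str;
           execute immediate
             'alter table ' || dbms_assert.simple_sql_name( p_table ) ||
             ' rename partition "' || x.partition_name ||
             '" to "' || l_str || '"';
        end loop;
     end;
     /

Because it runs with the invoker's privileges and reads the invoker's own USER_TAB_PARTITIONS view, anyone executing it can rename only partitions in their own schema.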

Virtual Column Partitioning

Virtual column partitioning allows you to partition based on a SQL expression. This type of partitioning is useful when a table column is overloaded with multiple business values and you want to partition on a portion of that column. For example, let’s say you have a RESERVATION_CODE column in a table:
$ sqlplus eoda/foo@PDB1
SQL> create table res(reservation_code varchar2(30));
Table created.

And the first character in the RESERVATION_CODE column defines a region from which the reservation originated. For the purposes of this example, let’s say a first character of an A or C maps to the NE region, values of B map to the SW region, and values of D map to the NW region.

In this scenario, we know that the first character represents the region, and we want to be able to list partition by region. With the data as it is, it's not practical to list partition by the RESERVATION_CODE column, whereas virtual column partitioning allows us to apply a SQL function to the column and list partition by the first character. Here's what the table definition looks like with virtual column partitioning:
SQL> drop table res;
SQL> create table res
     ( reservation_code varchar2(30),
       region as
       ( decode( substr(reservation_code,1,1),
                 'A', 'NE',
                 'C', 'NE',
                 'B', 'SW',
                 'D', 'NW' ) )
     )
     partition by list (region)
     ( partition NE values('NE'),
       partition SW values('SW'),
       partition NW values('NW') );
Table created.

We can view the partition information via this query:
SQL> select a.table_name, a.partition_name, a.high_value
       from user_tab_partitions a, user_part_tables b
      where a.table_name = 'RES'
        and a.table_name = b.table_name
      order by a.table_name;

TABLE_NAME   PARTITION_   HIGH_VALUE
RES          NE           'NE'
RES          NW           'NW'
RES          SW           'SW'

Next, some random test data is inserted into the table (your random results will be different from this example):
SQL> insert into res (reservation_code)
     select chr( 64+(round(dbms_random.value(1,4))) ) || level
       from dual connect by level < 10;
9 rows created.

Now let’s see how the data was partitioned:
SQL> select ‘NE’, reservation_code, region from res partition(NE) union all
select ‘SW’, reservation_code, region from res partition(SW) union all
select ‘NW’, reservation_code, region from res partition(NW);

'N RESERVATION_CODE RE

NE C3 NE
NE C5 NE
NE A6 NE
NE C8 NE
SW B2 SW
SW B7 SW
SW B9 SW
NW D1 NW
NW D4 NW

In this way, virtual column partitioning is often appropriate when there is a business requirement to partition on portions of data in a column or combinations of data from different columns (especially when there might not be an obvious way to list or range partition). The expression behind a virtual column can be a complex calculation, return a subset of a column string, combine column values, and so on.
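As a hedged illustration of combining column values (the SALES table and its columns here are invented for this example; they are not part of the running RES example), a virtual column can concatenate two business columns and serve as the list partition key:

SQL> create table sales
     ( country  varchar2(2),
       channel  varchar2(10),
       amount   number,
       -- virtual column combining two real columns into one partition key
       bucket   as ( country || '/' || channel )
     )
     partition by list (bucket)
     ( partition us_web   values ('US/WEB'),
       partition us_store values ('US/STORE'),
       partition other    values (default) );

A row inserted with COUNTRY='US' and CHANNEL='WEB' would land in the US_WEB partition, with no need to store the combined value a second time or denormalize it into its own column.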

Reference Partitioning

The simple syntax to reimplement our previous example could be as follows. We’ll reuse the existing parent table ORDERS and just truncate that table:
SQL> drop table order_line_items cascade constraints;
Table dropped.
SQL> truncate table orders;
Table truncated.
SQL> insert into orders values
     ( 1, to_date( '01-jun-2020', 'dd-mon-yyyy' ), 'xxx' );
1 row created.
SQL> insert into orders values
     ( 2, to_date( '01-jun-2021', 'dd-mon-yyyy' ), 'xxx' );
1 row created.

And create a new child table:
SQL> create table order_line_items
     ( order#  number,
       line#   number,
       data    varchar2(30),
       constraint c1_pk primary key(order#,line#),
       constraint c1_fk_p foreign key(order#) references orders
     )
     enable row movement
     partition by reference(c1_fk_p);
Table created.
SQL> insert into order_line_items values ( 1, 1, 'yyy' );
1 row created.
SQL> insert into order_line_items values ( 2, 1, 'yyy' );
1 row created.

The magic is in the PARTITION BY REFERENCE clause at the end of the CREATE TABLE statement. Here, we replaced the range partitioning clause with PARTITION BY REFERENCE, naming the foreign key constraint the database should use to discover what our partitioning scheme will be.

Here, we see the foreign key is to the ORDERS table—the database read the structure of the ORDERS table and determined that it had two partitions—therefore, our child table will have two partitions.

In fact, if we query the data dictionary right now, we can see that the two tables have the same exact partitioning structure:
SQL> select table_name, partition_name
       from user_tab_partitions
      where table_name in ( 'ORDERS', 'ORDER_LINE_ITEMS' )
      order by table_name, partition_name;
TABLE_NAME PARTITION_NAME
ORDERS PART_2020
ORDERS PART_2021
ORDER_LINE_ITEMS PART_2020
ORDER_LINE_ITEMS PART_2021

Further, since the database understands these two tables are related, we can drop the parent table partition and have it automatically clean up the related child table partitions (since the child inherits from the parent, any alteration of the parent’s partition structure cascades down):
SQL> alter table orders drop partition part_2020 update global indexes;
Table altered.
SQL> select table_name, partition_name
       from user_tab_partitions
      where table_name in ( 'ORDERS', 'ORDER_LINE_ITEMS' )
      order by table_name, partition_name;
TABLE_NAME PARTITION_NAME
ORDERS PART_2021
ORDER_LINE_ITEMS PART_2021

So, the DROP we were prevented from performing before is now permitted, and it cascades to the child table automatically.

Further, if we ADD a partition, as follows, we can see that that operation is cascaded as well; there will be a one-to-one parity between the parent and the child:
SQL> alter table orders add partition
     part_2022 values less than
     (to_date( '01-01-2023', 'dd-mm-yyyy' ));
Table altered.
SQL> select table_name, partition_name
       from user_tab_partitions
      where table_name in ( 'ORDERS', 'ORDER_LINE_ITEMS' )
      order by table_name, partition_name;
TABLE_NAME PARTITION_NAME
ORDERS PART_2021
ORDERS PART_2022
ORDER_LINE_ITEMS PART_2021
ORDER_LINE_ITEMS PART_2022

A part of the preceding CREATE TABLE statement that we did not discuss is ENABLE ROW MOVEMENT. In short, this clause allows an UPDATE to modify the partition key value in such a way that the row must move from its current partition into some other partition.
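Without that clause, such an UPDATE is rejected outright. As a quick sketch (T is a throwaway table created just for this demonstration, and the exact error text may vary by release), updating a partition key in a table that does not permit row movement fails along these lines:

SQL> create table t ( x number )
     partition by range (x)
     ( partition p1 values less than (10),
       partition p2 values less than (20) );
Table created.
SQL> insert into t values ( 5 );
1 row created.
SQL> update t set x = 15;
update t set x = 15
       *
ERROR at line 1:
ORA-14402: updating partition key column would cause a partition change
SQL> alter table t enable row movement;
Table altered.
SQL> update t set x = 15;
1 row updated.

Enabling row movement lets the same UPDATE succeed, with the database physically relocating the row from partition P1 to P2 behind the scenes.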

Now, since we defined our parent table originally as permitting row movement, we were forced to define all of our child tables (and their children and so on) as having that capability as well, for if the parent row moves and we are using reference partitioning, we know the child row(s) must move as well. For example:
SQL> select '2021', count(*) from order_line_items partition(part_2021)
     union all
     select '2022', count(*) from order_line_items partition(part_2022);

'202   COUNT(*)
2021          1
2022          0

We can see that right now our data in the child table ORDER_LINE_ITEMS is in the 2021 partition. By performing a simple update against the parent ORDERS table, as follows, we can see our data move in the child table as well:
SQL> update orders set order_date = add_months(order_date,12);
1 row updated.
SQL> select '2021', count(*) from order_line_items partition(part_2021)
     union all
     select '2022', count(*) from order_line_items partition(part_2022);

'202   COUNT(*)
2021          0
2022          1

An update against the parent was cascaded down to the child table and caused the child table to move a row (or rows as needed).

To summarize, reference partitioning removes the need to denormalize data when partitioning parent and child tables. Furthermore, when dropping a parent partition, it will automatically drop the referenced child partition. These features are very useful in data warehousing environments.