Local Indexes

Oracle makes a distinction between the following two types of local indexes:

•\ Local prefixed indexes: These are indexes whereby the partition keys are on the leading edge of the index definition. For example, if a table is range partitioned on a column named LOAD_DATE, a local prefixed index on that table would have LOAD_DATE as the first column in its column list.

•\ Local nonprefixed indexes: These indexes do not have the partition key on the leading edge of their column list. The index may or may not contain the partition key columns.

Both types of indexes are able to take advantage of partition elimination, both can support uniqueness (as long as the nonprefixed index includes the partition key), and so on. The fact is that a query that uses a local prefixed index will always allow for index partition elimination, whereas a query that uses a local nonprefixed index might not. This is why local nonprefixed indexes are said to be slower by some people—they do not enforce partition elimination (but they do support it).

There is nothing inherently better about a local prefixed index as opposed to a local nonprefixed index when that index is used as the initial path to the table in a query. What I mean is that if the query can start with “scan an index” as the first step, there isn’t much difference between a prefixed and a nonprefixed index.

Partition Elimination Behavior

For the query that starts with an index access, whether or not it can eliminate partitions from consideration all really depends on the predicate in your query. A small example will help demonstrate this. The following code creates a table, PARTITIONED_TABLE, that is range partitioned on a numeric column A such that values less than two will be in partition PART_1 and values from two up to (but not including) three will be in partition PART_2:
$ sqlplus eoda/foo@PDB1
SQL> CREATE TABLE partitioned_table
     ( a    int,
       b    int,
       data char(20)
     )
     PARTITION BY RANGE (a)
     ( PARTITION part_1 VALUES LESS THAN(2) tablespace p1,
       PARTITION part_2 VALUES LESS THAN(3) tablespace p2
     );
Table created.

We then create both a local prefixed index, LOCAL_PREFIXED, and a local nonprefixed index, LOCAL_NONPREFIXED. Note that the nonprefixed index does not have A on the leading edge of its definition, which is what makes it a nonprefixed index:
SQL> create index local_prefixed on partitioned_table (a,b) local;
Index created.
SQL> create index local_nonprefixed on partitioned_table (b) local;
Index created.

Next, we'll insert some data into the table (the rows are spread across both partitions) and gather statistics:
SQL> insert into partitioned_table
     select mod(rownum-1,2)+1, rownum, 'x'
     from dual
     connect by level <= 70000;
70000 rows created.
SQL> exec dbms_stats.gather_table_stats( user, 'PARTITIONED_TABLE' );
PL/SQL procedure successfully completed.

We take tablespace P2 offline; it contains the PART_2 partitions for both the table and the indexes:
SQL> alter tablespace p2 offline;
Tablespace altered.

Taking tablespace P2 offline will prevent Oracle from accessing those specific index partitions. It will be as if we had suffered media failure, causing them to become unavailable. Now we’ll query the table to see what index partitions are needed by different queries. This first query is written to permit the use of the local prefixed index:
SQL> select * from partitioned_table where a = 1 and b = 1;

This query succeeded, and we can see why by reviewing the explain plan. We’ll use the built-in package DBMS_XPLAN to see what partitions this query accesses. The PSTART (partition start) and PSTOP (partition stop) columns in the output show us exactly what partitions this query needs to have online and available in order to succeed:
SQL> explain plan for select * from partitioned_table where a = 1 and b = 1;
Explained.

Now access DBMS_XPLAN.DISPLAY and instruct it to show the basic explain plan details plus partitioning information:
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

So, the query that uses LOCAL_PREFIXED succeeds. The optimizer was able to exclude PART_2 of LOCAL_PREFIXED from consideration because we specified A=1 in the query, and we can see that clearly in the plan: PSTART and PSTOP are both equal to 1. Partition elimination kicked in for us. The second query fails, however:
SQL> select * from partitioned_table where b = 1;
ERROR:
ORA-00376: file 23 cannot be read at this time
ORA-01110: data file 23: '/opt/oracle/oradata/CDB/C217E68DF48779E1E0530101007F73B9/datafile/o1_mf_p2_jc8bg9py_.dbf'

And using the same technique, we can see why:
SQL> explain plan for select * from partitioned_table where b = 1;
Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

Here, the optimizer was not able to remove PART_2 of LOCAL_NONPREFIXED from consideration—it needed to look in both the PART_1 and PART_2 partitions of the index to see if B=1 was in there. Herein lies a performance issue with local nonprefixed indexes: they do not make you use the partition key in the predicate as a prefixed index does. It is not that prefixed indexes are better; it’s just that in order to use them, you must use a query that allows for partition elimination.

If we drop the LOCAL_PREFIXED index and rerun the original successful query, as follows:
SQL> drop index local_prefixed;
Index dropped.
SQL> select * from partitioned_table where a = 1 and b = 1;

It succeeds, but as we’ll see, it used the same index that just a moment ago failed us. The plan shows that Oracle was able to employ partition elimination here—the predicate A=1 was enough information for the database to eliminate index partition PART_2 from consideration:
SQL> explain plan for select * from partitioned_table where a = 1 and b = 1;
Explained.
SQL> select * from table(dbms_xplan.display(null,null,'BASIC +PARTITION'));

Note the PSTART and PSTOP column values of 1 and 1. This proves that the optimizer is able to perform partition elimination even for nonprefixed local indexes.

If you frequently query the preceding table with the following queries, then you might consider using a local nonprefixed index on (b,a):
select … from partitioned_table where a = :a and b = :b;
select … from partitioned_table where b = :b;

That index would be useful for both of the preceding queries. The local prefixed index on (a,b) would be useful only for the first query.
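As a sketch (the index name here is made up for illustration), that nonprefixed index would be created as follows:

SQL> create index local_nonprefixed_ba on partitioned_table (b,a) local;

Because B is on the leading edge, both queries can use the index; and because A is also present in the index, the first query can evaluate both predicates against the index entries.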

The bottom line here is that you should not be afraid of nonprefixed indexes or consider them as major performance inhibitors. If you have many queries that could benefit from a nonprefixed index as outlined previously, then you should consider using one. The main concern is to ensure that your queries contain predicates that allow for index partition elimination whenever possible. The use of prefixed local indexes enforces that consideration. The use of nonprefixed indexes does not. Consider also how the index will be used. If it will be used as the first step in a query plan, there are not many differences between the two types of indexes.

Table Partitioning Schemes Wrap-Up

In general, range partitioning is useful when you have data that is logically segregated by some value(s). Time-based data immediately comes to the forefront as a classic example—partition by “Sales Quarter,” “Fiscal Year,” or “Month.” Range partitioning is able to take advantage of partition elimination in many cases, including the use of exact equality and ranges (less than, greater than, between, and so on).

Hash partitioning is suitable for data that has no natural ranges by which you can partition. For example, if you had to load a table full of census-related data, there might not be an attribute by which it would make sense to range partition. However, you would still like to take advantage of the administrative, performance, and availability enhancements offered by partitioning. Here, you would simply pick a unique or almost unique set of columns to hash on. This would achieve an even distribution of data across as many partitions as you like. Hash partitioned objects can take advantage of partition elimination when exact equality or IN ( value, value, … ) is used, but not when ranges of data are used.
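As a sketch (the table and column names here are hypothetical), the first two queries below permit partition elimination against a hash partitioned table, while the range query does not:

SQL> create table census_data
     ( ssn  varchar2(11),
       data varchar2(20)
     )
     partition by hash (ssn) partitions 8;

SQL> select * from census_data where ssn = :x;
SQL> select * from census_data where ssn in ( :x, :y );
SQL> select * from census_data where ssn > :x;  -- must scan all eight partitions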

List partitioning is suitable for data that has a column with a discrete set of values, and partitioning by the column makes sense based on the way your application uses it (e.g., it easily permits partition elimination in queries). Classic examples would be a state or region code—or, in fact, many code type attributes in general.

Interval partitioning extends the range partitioning feature and allows partitions to be added automatically when data inserted into the table doesn't fit into an existing partition. This feature greatly enhances range partitioning in that there is less maintenance involved (because the DBA doesn't necessarily have to monitor the ranges and manually add partitions).

Reference partitioning eases the implementation of partitioned tables that are related through referential integrity constraints. This allows the child table to be logically partitioned in the same manner as the parent table without having to duplicate parent table columns to the child table.

Interval reference partitioning allows you to combine the interval and reference partitioning features. This ability is useful when you need to use the interval and reference partitioning features in tandem.

Virtual column partitioning allows you to partition using a virtual column as the key. This feature provides you the flexibility to partition on a substring of a regular column value (or any other SQL expression). This is useful when it’s not feasible to use an existing column as the partition key, but you can partition on a subset of the value contained in an existing column.

Composite partitioning is useful when you have something logical by which you can range partition, but the resulting range partitions are still too large to manage effectively. You can apply range, list, or hash partitioning and then further divide each partition by a hash function, by lists, or even by ranges.

This will allow you to spread I/O requests out across many devices in any given large partition. Additionally, you may achieve partition elimination at two levels now. If you query on the partition key, Oracle is able to eliminate any partitions that do not meet your criteria. If you add the subpartition key to your query, Oracle can eliminate the other subpartitions within that partition. If you just query on the subpartition key (not using the partition key), Oracle will query only those hash or list subpartitions that apply from each partition.
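For example, a range-hash composite table (the names here are made up) might be created as follows; each range partition is further divided into four hash subpartitions:

SQL> create table composite_example
     ( load_date date,
       cust_id   number,
       data      varchar2(20)
     )
     partition by range (load_date)
     subpartition by hash (cust_id) subpartitions 4
     ( partition part_2020 values less than
         (to_date('01-01-2021','dd-mm-yyyy')),
       partition part_2021 values less than
         (to_date('01-01-2022','dd-mm-yyyy'))
     );

A predicate on LOAD_DATE eliminates range partitions; adding CUST_ID to the predicate further eliminates hash subpartitions within the remaining partitions.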

It is recommended that if there is something by which it makes sense to range partition your data, you should use that over hash or list partitioning. Hash and list partitioning add many of the salient benefits of partitioning, but they are not as useful as range partitioning when it comes to partition elimination. Using hash or list partitions within range partitions is advisable when the resulting range partitions are too large to manage or when you want to use all PDML capabilities or parallel index scanning against a single range partition.

Interval Partitioning (Continued)
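This section continues an interval partitioning example begun earlier. The original CREATE TABLE is not repeated here, but a definition consistent with the behavior that follows (monthly interval partitions spread over the USERS and EXAMPLE tablespaces) would look something like this sketch:

SQL> create table audit_trail
     ( ts   timestamp,
       data varchar2(30)
     )
     partition by range(ts)
     interval (numtoyminterval(1,'month'))
     store in (users, example)
     ( partition p0 values less than
         (to_date('01-01-2020','dd-mm-yyyy'))
     );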

That is—all of the data for the month of February 2020. If we insert another row in some other month, as follows, we can see that another partition, SYS_P1624, is added that contains all of the data for the month of June 2020:

SQL> insert into audit_trail (ts,data)
     values ( to_date('25-jun-2020', 'dd-mon-yyyy'), 'xx' );
1 row created.
SQL> select a.partition_name, a.tablespace_name, a.high_value,
            decode( a.interval, 'YES', b.interval ) interval
     from user_tab_partitions a, user_part_tables b
     where a.table_name = 'AUDIT_TRAIL'
     and a.table_name = b.table_name
     order by a.partition_position;

You might be looking at this output and asking why everything is in the USERS tablespace. We clearly asked for the data to be spread out over the USERS tablespace and the EXAMPLE tablespace, so why is everything in a single tablespace?

It has to do with the fact that when the database is figuring out what partition the data goes into, it is also computing which tablespace it would go into. Since each of our partitions is an even number of months away from each other and we are using just two tablespaces, we end up using the same tablespace over and over.

If we only loaded “every other month” into this table, we would end up using only a single tablespace. We can see that the EXAMPLE tablespace can be used by adding some row that is an “odd” number of months away from our existing data:

SQL> insert into audit_trail (ts,data)
     values ( to_date('15-mar-2020', 'dd-mon-yyyy'), 'xx' );
1 row created.
SQL> select a.partition_name, a.tablespace_name, a.high_value,
            decode( a.interval, 'YES', b.interval ) interval
     from user_tab_partitions a, user_part_tables b
     where a.table_name = 'AUDIT_TRAIL'
     and a.table_name = b.table_name
     order by a.partition_position;

Now we have used the EXAMPLE tablespace. This new partition was slid in between the two existing partitions and will contain all of our March 2020 data.

You might be asking, “What happens if I roll back at this point?” If we were to roll back, it should be obvious that the AUDIT_TRAIL rows we just inserted would go away:
SQL> select * from audit_trail;
SQL> rollback;
Rollback complete.
SQL> select * from audit_trail;
no rows selected

But what isn't immediately clear is what would happen to the partitions we added: do they stay, or will they go away as well? A quick query verifies that they will stay:

SQL> select a.partition_name, a.tablespace_name, a.high_value,
            decode( a.interval, 'YES', b.interval ) interval
     from user_tab_partitions a, user_part_tables b
     where a.table_name = 'AUDIT_TRAIL'
     and a.table_name = b.table_name
     order by a.partition_position;

PARTITION_ TABLESPACE HIGH_VALUE INTERVAL

As soon as they are created, they are committed and visible. These partitions are created using a recursive transaction, a transaction executed separate and distinct from any transaction you might already be performing.

When we went to insert the row and the database discovered that the partition we needed did not exist, the database immediately started a new transaction, updated the data dictionary to reflect the new partition’s existence, and committed its work.

It must do this, or there would be severe contention (serialization) on many inserts as other transactions would have to wait for us to commit to be able to see this new partition. Therefore, this DDL is done outside of your existing transaction, and the partitions will persist.

You might have noticed that the database names the partition for us; SYS_P1625 is the name of the newest partition. The names are neither sortable nor very meaningful in the sense most people would be used to.

They show the order in which the partitions were added to the table (although you cannot rely on that always being true; it is subject to change) but not much else. Normally, in a range partitioned table, the DBA would have named the partition using some naming scheme and in most cases would have made the partition names sortable.

For example, the February data would be in a partition named PART_2020_02 (using a format of PART_yyyy_mm), March would be in PART_2020_03, and so on. With interval partitioning, you have no control over the partition names as they are created, but you can easily rename them afterward if you like.
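A single rename is straightforward; for example, using the system-generated SYS_P1624 partition mentioned earlier (which holds the June 2020 data):

SQL> alter table audit_trail rename partition sys_p1624 to part_2020_06;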
For example, we could query out the HIGH_VALUE string and, using dynamic SQL, convert it into nicely formatted, meaningful names.

We can do this because we understand how we’d like the names formatted; the database does not. For example:

SQL> declare
       l_str varchar2(4000);
     begin
       for x in ( select a.partition_name, a.tablespace_name, a.high_value
                  from user_tab_partitions a
                  where a.table_name = 'AUDIT_TRAIL'
                  and a.interval = 'YES'
                  and a.partition_name like 'SYS\_P%' escape '\' )
       loop
         execute immediate
           'select to_char( ' || x.high_value ||
           '-numtodsinterval(1,''second''), ''"PART_"yyyy_mm'' ) from dual'
           into l_str;
         execute immediate
           'alter table audit_trail rename partition "' ||
           x.partition_name || '" to "' || l_str || '"';
       end loop;
     end;
     /
PL/SQL procedure successfully completed.

So, what we've done is take the HIGH_VALUE and subtract one second from it. We know that the HIGH_VALUE represents the strictly less than value, so one second before its value would be a value in the range. Once we have that, we apply the format "PART_"yyyy_mm to the resulting TIMESTAMP and get a string such as PART_2020_03 for March 2020. We use that string in a rename command, and now our data dictionary looks like this:
SQL> select a.partition_name, a.tablespace_name, a.high_value,
            decode( a.interval, 'YES', b.interval ) interval
     from user_tab_partitions a, user_part_tables b
     where a.table_name = 'AUDIT_TRAIL'
     and a.table_name = b.table_name
     order by a.partition_position;
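As a standalone check of the conversion logic: for a HIGH_VALUE of April 1, 2020 (the upper bound of the March partition), subtracting one second and applying the format yields the March name:

SQL> select to_char( to_date('01-04-2020','dd-mm-yyyy')
                     - numtodsinterval(1,'second'),
                     '"PART_"yyyy_mm' ) nm
     from dual;

NM
------------
PART_2020_03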

We would just run that script every now and then to rename any newly added partitions to keep the nice naming convention in place. Bear in mind, to avoid any SQL injection issues (we are using string concatenation, not bind variables; we cannot use bind variables in DDL), we would want to keep this script as an anonymous block or as an invoker’s rights routine if we decide to make it a stored procedure. That will prevent others from running SQL in our schema as if they were us, which could be a disaster.

Virtual Column Partitioning

Virtual column partitioning allows you to partition based on a SQL expression. This type of partitioning is useful when a table column is overloaded with multiple business values and you want to partition on a portion of that column. For example, let’s say you have a RESERVATION_CODE column in a table:
$ sqlplus eoda/foo@PDB1
SQL> create table res(reservation_code varchar2(30));
Table created.

And the first character in the RESERVATION_CODE column defines a region from which the reservation originated. For the purposes of this example, let’s say a first character of an A or C maps to the NE region, values of B map to the SW region, and values of D map to the NW region.

In this scenario, we know that the first character represents the region, and we want to be able to list partition by region. With the data as it is, it’s not practical to list partition by the RESERVATION_CODE column, whereas virtual partitioning allows us to apply a SQL function to the column and list partition by the first character. Here’s what the table definition looks like with virtual column partitioning:
SQL> drop table res;
Table dropped.
SQL> create table res
     ( reservation_code varchar2(30),
       region as
       ( decode( substr(reservation_code,1,1),
                 'A', 'NE',
                 'C', 'NE',
                 'B', 'SW',
                 'D', 'NW' ) )
     )
     partition by list (region)
     ( partition NE values('NE'),
       partition SW values('SW'),
       partition NW values('NW')
     );
Table created.

We can view the partition information via this query:
SQL> select a.table_name, a.partition_name, a.high_value
     from user_tab_partitions a, user_part_tables b
     where a.table_name = 'RES'
     and a.table_name = b.table_name
     order by a.table_name;

TABLE_NAME PARTITION_ HIGH_VALUE
RES        NE         'NE'
RES        NW         'NW'
RES        SW         'SW'

Next, some random test data is inserted into the table (your random results will be different from this example):
SQL> insert into res (reservation_code)
     select chr(64+(round(dbms_random.value(1,4)))) || level
     from dual
     connect by level < 10;
9 rows created.

Now let’s see how the data was partitioned:
SQL> select 'NE', reservation_code, region from res partition(NE) union all
     select 'SW', reservation_code, region from res partition(SW) union all
     select 'NW', reservation_code, region from res partition(NW);

‘N RESERVATION_CODE RE

NE C3 NE
NE C5 NE
NE A6 NE
NE C8 NE
SW B2 SW
SW B7 SW
SW B9 SW
NW D1 NW
NW D4 NW

In this way, virtual column partitioning is often appropriate when there is a business requirement to partition on portions of data in a column or combinations of data from different columns (especially when there might not be an obvious way to list or range partition). The expression behind a virtual column can be a complex calculation, return a subset of a column string, combine column values, and so on.
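For example (a hypothetical sketch, separate from the RES example above), a virtual column could combine two columns and serve as the list partition key:

SQL> create table sales
     ( country varchar2(2),
       channel varchar2(10),
       cc as ( country || '_' || channel )
     )
     partition by list (cc)
     ( partition us_web   values ('US_WEB'),
       partition us_store values ('US_STORE'),
       partition other    values (default)
     );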

Reference Partitioning (Continued)

The simple syntax to reimplement our previous example could be as follows. We’ll reuse the existing parent table ORDERS and just truncate that table:
SQL> drop table order_line_items cascade constraints;
Table dropped.
SQL> truncate table orders;
Table truncated.
SQL> insert into orders values
     ( 1, to_date( '01-jun-2020', 'dd-mon-yyyy' ), 'xxx' );
1 row created.
SQL> insert into orders values
     ( 2, to_date( '01-jun-2021', 'dd-mon-yyyy' ), 'xxx' );
1 row created.

And create a new child table:
SQL> create table order_line_items
     ( order# number,
       line#  number,
       data   varchar2(30),
       constraint c1_pk primary key(order#,line#),
       constraint c1_fk_p foreign key(order#) references orders
     )
     enable row movement
     partition by reference(c1_fk_p);
Table created.
SQL> insert into order_line_items values ( 1, 1, 'yyy' );
1 row created.
SQL> insert into order_line_items values ( 2, 1, 'yyy' );
1 row created.

The magic is in the last line of the CREATE TABLE statement: we replaced the range partitioning clause with PARTITION BY REFERENCE. This allows us to name the foreign key constraint to use to discover what our partitioning scheme will be.

Here, we see the foreign key is to the ORDERS table—the database read the structure of the ORDERS table and determined that it had two partitions—therefore, our child table will have two partitions.

In fact, if we query the data dictionary right now, we can see that the two tables have the same exact partitioning structure:
SQL> select table_name, partition_name
     from user_tab_partitions
     where table_name in ( 'ORDERS', 'ORDER_LINE_ITEMS' )
     order by table_name, partition_name;
TABLE_NAME PARTITION_NAME
ORDERS PART_2020
ORDERS PART_2021
ORDER_LINE_ITEMS PART_2020
ORDER_LINE_ITEMS PART_2021

Further, since the database understands these two tables are related, we can drop the parent table partition and have it automatically clean up the related child table partitions (since the child inherits from the parent, any alteration of the parent’s partition structure cascades down):
SQL> alter table orders drop partition part_2020 update global indexes;
Table altered.
SQL> select table_name, partition_name
     from user_tab_partitions
     where table_name in ( 'ORDERS', 'ORDER_LINE_ITEMS' )
     order by table_name, partition_name;
TABLE_NAME PARTITION_NAME
ORDERS PART_2021
ORDER_LINE_ITEMS PART_2021

So, the DROP we were prevented from performing before is now permitted, and it cascades to the child table automatically.

Further, if we ADD a partition, as follows, we can see that that operation is cascaded as well; there will be a one-to-one parity between the parent and the child:
SQL> alter table orders add partition part_2022
     values less than ( to_date( '01-01-2023', 'dd-mm-yyyy' ) );
Table altered.
SQL> select table_name, partition_name
     from user_tab_partitions
     where table_name in ( 'ORDERS', 'ORDER_LINE_ITEMS' )
     order by table_name, partition_name;
TABLE_NAME PARTITION_NAME
ORDERS PART_2021
ORDERS PART_2022
ORDER_LINE_ITEMS PART_2021
ORDER_LINE_ITEMS PART_2022

A part of the preceding CREATE TABLE statement that we did not discuss is the ENABLE ROW MOVEMENT. In short, the syntax allows an UPDATE to take place such that the UPDATE modifies the partition key value and modifies it in such a way as to cause the row to move from its current partition into some other partition.

Now, since we defined our parent table originally as permitting row movement, we were forced to define all of our child tables (and their children and so on) as having that capability as well, for if the parent row moves and we are using reference partitioning, we know the child row(s) must move as well. For example:
SQL> select '2021', count(*) from order_line_items partition(part_2021)
     union all
     select '2022', count(*) from order_line_items partition(part_2022);

'202   COUNT(*)

We can see that right now our data in the child table ORDER_LINE_ITEMS is in the 2021 partition. By performing a simple update against the parent ORDERS table, as follows, we can see that our data moves in the child table as well:
SQL> update orders set order_date = add_months(order_date,12);
1 row updated.
SQL> select '2021', count(*) from order_line_items partition(part_2021)
     union all
     select '2022', count(*) from order_line_items partition(part_2022);

'202   COUNT(*)

An update against the parent was cascaded down to the child table and caused the child table to move a row (or rows as needed).

To summarize, reference partitioning removes the need to denormalize data when partitioning parent and child tables. Furthermore, when dropping a parent partition, it will automatically drop the referenced child partition. These features are very useful in data warehousing environments.

List Partitioning

List partitioning provides the ability to specify in which partition a row will reside, based on discrete lists of values. It is often useful to be able to partition by some code, such as a state or region code. For example, we might want to pull together in a single partition all records for people in the states of Maine (ME), New Hampshire (NH), Vermont (VT), and Massachusetts (MA), since those states are located next to or near each other, and our application queries data by geographic region. Similarly, we might want to group together Connecticut (CT), Rhode Island (RI), and New York (NY).

We can’t use a range partition, since the range for the first partition would be ME through VT, and the second range would be CT through RI. Those ranges overlap. We can’t use hash partitioning since we can’t control which partition any given row goes into; the built-in hash function provided by Oracle does that. With list partitioning, we can accomplish this custom partitioning scheme easily:
$ sqlplus eoda/foo@PDB1
SQL> create table list_example
     ( state_cd varchar2(2),
       data     varchar2(20)
     )
     partition by list(state_cd)
     ( partition part_1 values ( 'ME', 'NH', 'VT', 'MA' ),
       partition part_2 values ( 'CT', 'RI', 'NY' )
     );
Table created.

Figure 13-3 shows that Oracle will inspect the STATE_CD column and, based on its value, place the row into the correct partition.

Figure 13-3.  List partition insert example

As we saw for range partitioning, if we try to insert a value that isn’t specified in the list partition, Oracle will raise an appropriate error back to the client application. In other words, a list partitioned table without a DEFAULT partition will implicitly impose a constraint much like a check constraint on the table:
SQL> insert into list_example values ( 'VA', 'data' );
insert into list_example values ( 'VA', 'data' )
ERROR at line 1:
ORA-14400: inserted partition key does not map to any partition

If we want to segregate these seven states into their separate partitions, as we have, but have all remaining state codes (or, in fact, any other row that happens to be inserted that doesn’t have one of these seven codes) go into a third partition, then we can use the VALUES ( DEFAULT ) clause. Here, we’ll alter the table to add this partition (we could use this in the CREATE TABLE statement as well):
SQL> alter table list_example add partition part_3 values ( DEFAULT );
Table altered.
SQL> insert into list_example values ( 'VA', 'data' );
1 row created.

All values that are not explicitly in our list of values will go here. A word of caution on the use of DEFAULT: Once a list partitioned table has a DEFAULT partition, you cannot add any more partitions to it, so
SQL> alter table list_example add partition part_4 values( 'CA', 'NM' );
alter table list_example
ERROR at line 1:
ORA-14323: cannot add partition when DEFAULT partition exists

We would have to remove the DEFAULT partition, then add PART_4, and then put the DEFAULT partition back. The reason behind this is that the DEFAULT partition could have had rows with the list partition key value of CA or NM—they would not belong in the DEFAULT partition after adding PART_4.
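One way to avoid the remove-and-re-add dance (and the loss of the DEFAULT partition's rows that dropping it would entail) is to split the DEFAULT partition instead; a sketch:

SQL> alter table list_example split partition part_3 values ( 'CA', 'NM' )
     into ( partition part_4, partition part_3 );

Rows with CA or NM move into PART_4, and everything else remains in the DEFAULT partition PART_3.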

How Hash Partitioning Works

Hash partitioning is designed to achieve a good spread of data across many different devices (disks) or just to segregate data out into more manageable chunks.

The hash key chosen for a table should be a column or set of columns that are unique or at least have as many distinct values as possible to provide for a good spread of the rows across partitions.

If you choose a column that has only four values, and you use two partitions, then all the rows could quite easily end up hashing to the same partition, obviating the goal of partitioning in the first place!

We will create a hash partitioned table with two partitions, using a column named HASH_KEY_COLUMN as our partition key. Oracle will take the value in this column and determine the partition this row will be stored in by hashing that value:
$ sqlplus eoda/foo@PDB1
SQL> CREATE TABLE hash_example
     ( hash_key_column date,
       data            varchar2(20)
     )
     PARTITION BY HASH (hash_key_column)
     ( partition part_1 tablespace p1,
       partition part_2 tablespace p2
     );
Table created.

Figure 13-2 shows that Oracle will inspect the value in the HASH_KEY_COLUMN, hash it, and determine which of the two partitions a given row will appear in.

Figure 13-2.  Hash partition insert example

As noted earlier, hash partitioning gives you no control over which partition a row ends up in. Oracle applies the hash function, and the outcome of that hash determines where the row goes. If you want a specific row to go into partition PART_1 for whatever reason, you should not—in fact, you cannot—use hash partitioning.

The row will go into whatever partition the hash function says to put it in. If you change the number of hash partitions, the data will be redistributed over all of the partitions (adding or removing a partition to a hash partitioned table will cause all of the data to be rewritten, as every row may now belong in a different partition).
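For example, adding a third partition to the HASH_EXAMPLE table created above triggers that redistribution:

SQL> alter table hash_example add partition part_3 tablespace p1;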

Hash partitioning is most useful when you have a large table, such as the one shown in the “Reduced Administrative Burden” section, and you would like to divide and conquer it. Rather than manage one large table, you would like to have 8 or 16 smaller tables to manage.

Hash partitioning is also useful to increase availability to some degree, as demonstrated in the “Increased Availability” section; the temporary loss of a single hash partition permits access to all of the remaining partitions. Some users may be affected, but there is a good chance that many will not be.

Additionally, the unit of recovery is much smaller now. You do not have a single large table to restore and recover; you have a fraction of that table to recover. Lastly, hash partitioning is useful in high update contention environments, as mentioned in the “Reduced Contention in an OLTP System” section. Instead of having a single hot segment, we can hash partition a segment into 16 pieces, each of which is now receiving modifications.

Reference Partitioning

Reference partitioning addresses the issue of parent/child equipartitioning; that is, when you need the child table to be partitioned in such a manner that each child table partition has a one-to-one relationship with a parent table partition.

This is important in situations such as a data warehouse where you want to keep a specific amount of data online (say, the last five years' worth of ORDER information) and need to ensure the related child data (the ORDER_LINE_ITEMS data) is online as well. In this classic example, the ORDERS table would typically have a column ORDER_DATE, making it easy to partition by month and thus facilitate keeping the last five years of data online easily.

As time advances, you would just have next month's partition available for loading, and you would drop the oldest partition. However, when you consider the ORDER_LINE_ITEMS table, you can see you would have a problem. It does not have the ORDER_DATE column, and there is nothing in the ORDER_LINE_ITEMS table to partition it by; therefore, there is nothing to facilitate the purging of old information or the loading of new information.

In the past, prior to reference partitioning, developers would have to denormalize the data, in effect copying the ORDER_DATE attribute from the parent table ORDERS into the child ORDER_LINE_ITEMS table.

This presented the typical problems of data redundancy: increased storage overhead, increased data loading resources, cascading update issues (if you modify the parent, you have to ensure you update all copies of the parent data), and so on.

Additionally, if you enabled foreign key constraints in the database (as you should), you would discover that you lost the ability to truncate or drop old partitions in the parent table.

For example, let’s set up the conventional ORDERS and ORDER_LINE_ITEMS tables starting with the ORDERS table:
$ sqlplus eoda/foo@PDB1
SQL> create table orders
( order# number primary key,
  order_date date,
  data varchar2(30)
)
enable row movement
PARTITION BY RANGE (order_date)
( PARTITION part_2020 VALUES LESS THAN (to_date('01-01-2021','dd-mm-yyyy')),
  PARTITION part_2021 VALUES LESS THAN (to_date('01-01-2022','dd-mm-yyyy'))
);

Table created.
SQL> insert into orders values (1, to_date('01-jun-2020','dd-mon-yyyy'), 'xxx');

1 row created.

SQL> insert into orders values (2, to_date('01-jun-2021','dd-mon-yyyy'), 'xxx');

1 row created.

And now we’ll create the ORDER_LINE_ITEMS table, with a bit of data pointing to the ORDERS table:
SQL> create table order_line_items
( order# number,
  line# number,
  order_date date,  -- manually copied from ORDERS!
  data varchar2(30),
  constraint c1_pk primary key (order#, line#),
  constraint c1_fk_p foreign key (order#) references orders
)
enable row movement
PARTITION BY RANGE (order_date)
( PARTITION part_2020 VALUES LESS THAN (to_date('01-01-2021','dd-mm-yyyy')),
  PARTITION part_2021 VALUES LESS THAN (to_date('01-01-2022','dd-mm-yyyy'))
);

Table created.
SQL> insert into order_line_items values ( 1, 1, to_date('01-jun-2020','dd-mon-yyyy'), 'yyy' );

1 row created.

SQL> insert into order_line_items values ( 2, 1, to_date('01-jun-2021','dd-mon-yyyy'), 'yyy' );

1 row created.

Now, if we were to drop the ORDER_LINE_ITEMS partition containing 2020 data, you know and I know that the corresponding ORDERS partition for 2020 could be dropped as well, without violating the referential integrity constraint. You and I know it, but the database is not aware of that fact:
SQL> alter table order_line_items drop partition part_2020;

Table altered.

SQL> alter table orders drop partition part_2020;
alter table orders drop partition part_2020
*
ERROR at line 1:
ORA-02266: unique/primary keys in table referenced by enabled foreign keys

So, not only is the approach of denormalizing the data cumbersome, resource intensive, and potentially damaging to our data integrity, but it also prevents us from doing something we frequently need to do when administering partitioned tables: purging old information.

Enter reference partitioning. With reference partitioning, a child table inherits the partitioning scheme of its parent table without having to denormalize the partitioning key, and the database understands that the child table is equipartitioned with the parent table. That is, we’ll be able to drop or truncate a parent table partition and have the corresponding child table partition maintained right along with it.
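As a sketch of the syntax (assuming the ORDERS table from the earlier example, and that the denormalized ORDER_LINE_ITEMS table has been dropped), the child simply inherits the parent’s partitioning through its foreign key:

```sql
-- Sketch: no ORDER_DATE column needed in the child anymore.
-- The foreign key column must be NOT NULL for reference partitioning.
create table order_line_items
( order# number not null,
  line#  number,
  data   varchar2(30),
  constraint c1_pk primary key (order#, line#),
  constraint c1_fk_p foreign key (order#) references orders
)
partition by reference (c1_fk_p);

-- Partition maintenance on the parent now cascades to the child:
alter table orders drop partition part_2020;
```

Dropping the parent partition removes the corresponding child partition as well; there is no ORA-02266 error, because the database knows the tables are equipartitioned.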

Interval Reference Partitioning

Oracle flexibly allows you to use a combination of interval and reference partitioning.
For example, if you create an interval range partitioned parent table, as follows:
SQL> create table orders
( order# number primary key,
  order_date timestamp,
  data varchar2(30)
)
PARTITION BY RANGE (order_date)
INTERVAL (numtoyminterval(1,'year'))
( PARTITION part_2020 VALUES LESS THAN (to_date('01-01-2021','dd-mm-yyyy')),
  PARTITION part_2021 VALUES LESS THAN (to_date('01-01-2022','dd-mm-yyyy'))
);

Table created.

Next is the code to create a reference partitioned child table:
SQL> create table order_line_items
( order# number,
  line# number,
  data varchar2(30),
  constraint c1_pk primary key (order#, line#),
  constraint c1_fk_p foreign key (order#) references orders
)
partition by reference (c1_fk_p);

Table created.

To see interval reference partitioning in action, let’s insert some data. First, we insert rows that will fit within existing range partitions:
SQL> insert into orders values (1, to_date('01-jun-2020','dd-mon-yyyy'), 'xxx');

1 row created.

SQL> insert into orders values (2, to_date('01-jun-2021','dd-mon-yyyy'), 'xxx');

1 row created.

SQL> insert into order_line_items values ( 1, 1, 'yyy' );

1 row created.

SQL> insert into order_line_items values ( 2, 1, 'yyy' );

1 row created.

All of the prior rows fit into the partitions specified when creating the tables. The following query displays the current partitions:

SQL> select table_name, partition_name
       from user_tab_partitions
      where table_name in ('ORDERS', 'ORDER_LINE_ITEMS')
      order by table_name, partition_name;

TABLE_NAME         PARTITION_NAME
------------------ ---------------
ORDERS             PART_2020
ORDERS             PART_2021
ORDER_LINE_ITEMS   PART_2020
ORDER_LINE_ITEMS   PART_2021

Next, rows are inserted that don’t fit into an existing range partition; therefore, Oracle automatically creates partitions to hold the newly inserted rows:
SQL> insert into orders values (3, to_date('01-jun-2022','dd-mon-yyyy'), 'xxx');

1 row created.

SQL> insert into order_line_items values (3, 1, 'zzz' );

1 row created.

The following query shows that two interval partitions were automatically created, one for the parent table and one for the child table:

SQL> select a.table_name, a.partition_name, a.high_value,
            decode( a.interval, 'YES', b.interval ) interval
       from user_tab_partitions a, user_part_tables b
      where a.table_name in ('ORDERS', 'ORDER_LINE_ITEMS')
        and a.table_name = b.table_name
      order by a.table_name;

TABLE_NAME         PARTITION_NAME  HIGH_VALUE                       INTERVAL
------------------ --------------  -------------------------------  -------------------------
ORDERS             PART_2020       TIMESTAMP' 2021-01-01 00:00:00'
ORDERS             PART_2021       TIMESTAMP' 2022-01-01 00:00:00'
ORDERS             SYS_P1640       TIMESTAMP' 2023-01-01 00:00:00'  NUMTOYMINTERVAL(1,'YEAR')
ORDER_LINE_ITEMS   PART_2020
ORDER_LINE_ITEMS   PART_2021
ORDER_LINE_ITEMS   SYS_P1640                                        YES

Two partitions named SYS_P1640 were created, with the parent table partition having a high value of 2023-01-01. If desired, you can rename the partitions via the ALTER TABLE command:
SQL> alter table orders rename partition sys_p1640 to part_2022;

Table altered.

SQL> alter table order_line_items rename partition sys_p1640 to part_2022;

Table altered.
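For instance, the renaming could be automated with a short PL/SQL block along these lines (a hypothetical sketch, not the book’s code; it assumes yearly intervals and the HIGH_VALUE format shown in the previous query output):

```sql
declare
  l_year varchar2(4);
begin
  for x in ( select partition_name, high_value
               from user_tab_partitions
              where table_name = 'ORDERS'
                and interval = 'YES' )
  loop
    -- HIGH_VALUE looks like: TIMESTAMP' 2023-01-01 00:00:00'
    -- The partition holds data for the year *before* its high value.
    l_year := to_char( to_number( substr( x.high_value, 12, 4 ) ) - 1 );
    execute immediate 'alter table orders rename partition ' ||
      x.partition_name || ' to part_' || l_year;
  end loop;
end;
/
```

The INTERVAL column of USER_TAB_PARTITIONS restricts the loop to system-generated interval partitions, so manually named partitions are left alone.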

Tip See the “Interval Partitioning” section of this chapter for an example of automating the renaming of partitions via PL/SQL.