Skip to content

Oracle Index Rebuild - A Corner Case

Regular index rebuilds in Oracle are a vital necessity - this old myth has been proven wrong long ago. But here is the story about an index where at first we did not understand why a rebuild reduced the size so heavily.

After a blooper involving a DROP TABLE command (it wasn't me) we had to restore a table from backup. For recovery we created a new database and then copied the table to the original database using CTAS over a database link. We were astonished to find out that one of the recreated indexes on the table would occupy only about 10% of its old space. This was totally unexpected as the usage pattern should lead to a dense index:

  • it is a single column index on a DATE column
  • the column contains the time of the insert job
  • the column is never updated
  • no rows are ever deleted from the table

As the column is set to the date of the insert job we expected the inserts to hit the same index leaf block. When the block is filled we should see a 90-10 block split and subsequent inserts should use a new leaf block. That should leave the previous block completely filled. So why would the recreated index be so much smaller than before?

The mystery became clearer when I looked at the table definition and thought about the usage pattern. The table is partitioned by range and it has two partitions: current rows and archived rows. The import job triggers a subsequent job that groups all rows in the current partition by a business criteria. If the grouping returns more than one row we know that new data has been imported and the old row is moved into the archive partition. This has been implemented with the help of row movement by simply updating the partition key.

To analyze this further I used the following objects:

  id          NUMBER(10) NOT NULL PRIMARY KEY,
  insert_at   DATE       NOT NULL,
  archived    NUMBER(1)  NOT NULL

CREATE INDEX i ON t(insert_at);

I looked up the object_id for the index to be able to create tree dumps and saved it in a sqlplus variable:

COLUMN object_id new_value object_id

SELECT object_id FROM user_objects WHERE object_name = 'I';

Finally, some data is inserted into the table:

SELECT level, sysdate, 0
  FROM dual CONNECT BY level <= 694;


I used 694 rows here for educational reasons as in my setup (Oracle on Linux with 8K block size) the effect is clearly visible with exactly that number of rows. Then I looked at the index tree dump to examine the index:

ALTER SESSION SET events 'immediate trace name treedump level &object_id';

The tree dump shows that the index consists of two leaf blocks each containing exactly 347 rows (rrows: 347):

----- begin tree dump
branch: 0x1000093 16777363 (0: nrow: 2, level: 1)
   leaf: 0x1000094 16777364 (-1: nrow: 347 rrow: 347)
   leaf: 0x1000095 16777365 (0: nrow: 347 rrow: 347)
----- end tree dump

So after the initial data load the index exactly matches the expected structure: perfectly packed. Next I looked at what happens to the index when all rows are moved into the archive partition:

UPDATE t SET archived = 1 WHERE archived = 0;


ALTER SESSION SET events 'immediate trace name treedump level &object_id';

Now the index suddenly has four leaf blocks. The previously used blocks 0x1000094 and 0x1000095 still have a row directory containing 347 entries (nrow: 347) but do not contain any actual rows (rrow: 0). All the index entries have moved to two completely new blocks:

----- begin tree dump
branch: 0x1000093 16777363 (0: nrow: 4, level: 1)
   leaf: 0x1000094 16777364 (-1: nrow: 347 rrow: 0)
   leaf: 0x1000095 16777365 (0: nrow: 347 rrow: 0)
   leaf: 0x1000096 16777366 (1: nrow: 347 rrow: 347)
   leaf: 0x1000097 16777367 (2: nrow: 347 rrow: 347)
----- end tree dump

There was the key to the mystery. Updating the archive flag caused row movement which lead to the index updates as every row got a new ROWID. Remember that for a non-unique index the ROWID is treated as if it were part of the indexed columns and therefore a change of the ROWID translates into a change of the relative position of the index entry compared to other entries with the same INSERT_AT.

Some business items were never updated so we had a certain number of rows that had never moved. Because of these rows most of the index blocks never went completely empty. Analysis showed that on the left hand side of the index we had blocks with sometimes only one or two rows while on the right hand side the index was packed with hundreds of rows per block.

I finished the test by showing that completely empty index blocks would indeed be reused:

INSERT INTO t VALUES (1000, sysdate, 0);


ALTER SESSION SET events 'immediate trace name treedump level &object_id';

The tree dump clearly shows that the previously empty block 0x1000094 has been reused and contains the newly inserted row:

----- begin tree dump
branch: 0x1000093 16777363 (0: nrow: 4, level: 1)
   leaf: 0x1000095 16777365 (-1: nrow: 0 rrow: 0)
   leaf: 0x1000096 16777366 (0: nrow: 347 rrow: 347)
   leaf: 0x1000097 16777367 (1: nrow: 347 rrow: 347)
   leaf: 0x1000094 16777364 (2: nrow: 1 rrow: 1)
----- end tree dump

As a side effect this busts the myth that empty blocks are never re-used.

If you want to know more about index rebuilds you should check out the excellent presentation Index Internals - Rebuilding the Truth by Richard Foote.


No Trackbacks


Display comments as Linear | Threaded

No comments

Add Comment

Standard emoticons like :-) and ;-) are converted to images.
Textile-formatting allowed
E-Mail addresses will not be displayed and will only be used for E-Mail notifications.

To prevent automated Bots from commentspamming, please enter the string you see in the image below in the appropriate input box. Your comment will only be submitted if the strings match. Please ensure that your browser supports and accepts cookies, or your comment cannot be verified correctly.

Form options

Submitted comments will be subject to moderation before being displayed.