Translated by AI
May the B-tree Index Be with You
Whether it is a relational SQL database or a non-relational system, creating the correct index is the best way to shorten query response times.
I have never performed performance tuning. While I had heard that using RDB indexes speeds up queries, I had never thought about how indexes are actually used for real SQL or how much performance improvement is achieved.
At that time, I read the online book about RDB indexes and tuning, "USE THE INDEX, LUKE!," and found it very helpful. This post is a summary of that online book for my own understanding.
Chapter 1 covers the structure of indexes and their impact on performance, and Chapter 2 covers how indexes are used in actual SQL. I have summarized Chapter 1 to review the overview of indexes and Chapter 2 as a reference for how to perform tuning in real-world SQL.
Index Structure and Performance
Index Structure
Indexes exist to speed up query execution and are a combination of data structures: doubly linked lists and search trees. Doubly linked lists are used for efficient access to continuous data, while search trees are used for high-speed searching.
Indexes are maintained as metadata, separate from the actual table data. Making the actual table data continuous would lead to poor performance because a large amount of data would need to be shifted with every update. As a countermeasure, logical ordering is created using a doubly linked list, independent of the physical arrangement of the table data.
An index consists of leaf nodes connected by a doubly linked list, where each node contains pairs of keys (the indexed columns) and references to table data rows. To retrieve a row from an index, the system finds the leaf node using the key and then accesses the actual table data using the reference to the table data row.
A doubly linked list alone would require, in the worst case, accessing as many elements as there are leaf nodes to find a key. Therefore, a search tree is used to enable efficient searching. A search tree consists of branch nodes formed by taking a certain number of keys selected from each leaf node. Furthermore, a root node is formed by taking a certain number of keys selected from each branch node. These nodes reference each other, allowing for efficient searching from the root node based on the magnitude of the keys.

Search trees have the characteristic that the tree depth increases slowly relative to the increase in the number of leaf nodes, so performance is unlikely to degrade even as data grows. Since searching a search tree involves traversing from parent nodes to child nodes, the number of traversals is determined by the depth of the tree; a shallower tree tends to result in better performance.
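This slow growth of depth can be sketched numerically. The fanout of 100 entries per node below is a hypothetical round number; the real value depends on the block size and the width of the indexed keys.

```python
def btree_depth(entries, fanout):
    # number of levels in a balanced tree where every node holds `fanout` entries
    depth, reachable = 1, fanout
    while reachable < entries:
        depth += 1
        reachable *= fanout
    return depth

# with a fanout of 100, 100 million rows fit in a 4-level tree,
# and multiplying the data by 100 adds only one more level
print(btree_depth(100_000_000, 100))     # 4
print(btree_depth(10_000_000_000, 100))  # 5
```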
Table data and each node of the index are stored in the smallest storage units handled by the database, called database blocks or pages. Units such as 16KB or 8KB are typically used.
Cases where Index Searches are Slow
There are cases where using an index does not provide the expected speed. This is often blamed on index degradation, but rebuilding an index merely packs the leaf nodes more densely; it rarely reduces the depth of the tree, so in most cases performance does not improve significantly.
There are actually two main reasons why searching becomes slow:
- Leaf node chains
- Access to the table
The first issue, leaf node chains, becomes a problem when a large number of leaf nodes connected by a doubly linked list must be traversed. Examples include cases using indexes with poor selectivity or cases where the order of columns in a composite index is inappropriate. An index with poor selectivity is one created on a column where many rows have the same value; even if you narrow down values using this, the large number of rows makes it necessary to traverse many leaf nodes.
The second issue, access to the table, becomes a problem when a large number of table accesses occur for each search. Even a single leaf node can hold hundreds of entries, and the corresponding table rows are usually stored dispersed across many table blocks. While the database can retrieve data stored in the same table block in a single read, it needs to read many blocks if the data is dispersed. This becomes a problem, for example, when the data cannot be narrowed down by the index alone, or when there are many rows in the result even after narrowing down with the index.
Searching using an index is performed in the following steps:
- Tree traversal
- Following the leaf node chain
- Reading data from the table
While tree traversal can be done efficiently with an index, the other two steps require accessing many blocks, which becomes the cause of slow searches.
Access and Filter Predicates
Access predicates are conditions that determine the start and end points of an index operation. In other words, they represent the conditions for starting and stopping the traversal of leaf nodes. For example, in an Oracle execution plan, an access method called INDEX UNIQUE SCAN might be shown with an access predicate such as access("EMPLOYEE_ID"=123). This indicates that the search is retrieving only the entry in the index tree where EMPLOYEE_ID is 123.
There are two types of filter predicates: index filter predicates and table filter predicates. Index filter predicates are applied during the traversal of leaf nodes, while table filter predicates are applied after loading the table data. In an Oracle execution plan, these are displayed as something like filter("SUBSIDIARY_ID"=30). Unlike index filter predicates, table filter predicates require reading the actual table data. Therefore, if the number of targeted leaf nodes is large, it can lead to a significant amount of unnecessary table access, causing performance to degrade.
To search effectively using an index, it is ideal to minimize filter predicates and rely primarily on access predicates. If both an access predicate and an index filter predicate are used, it suggests that a large number of leaf nodes might be traversed. If both an access predicate and a table filter predicate are used, it indicates that a large number of table accesses might be occurring.
Clustering in Indexes
Indexes can cluster data, which can reduce the large number of table accesses that cause slowness. Data clustering refers to storing data that is accessed consecutively close together so that it can be accessed with fewer I/O operations. Here, the term is used in the sense of a data cluster, not a computer cluster representing a collection of databases.
Methods for improving performance through clustering in indexes include the following. Each of these methods can be considered a way to reduce high-load table accesses by providing additional information in the index.
- Intentional use of filter predicates
- Covering indexes
- Clustered indexes
Intentional use of index filter predicates improves performance by turning a table filter into an index filter. If a condition in the WHERE clause cannot use the index for the actual lookup, it is typically applied as a table filter. Since a table filter requires loading table data, it becomes inefficient when many rows fail the condition. By adding the relevant columns to the index so that the condition can be evaluated as an index filter, the filtering happens within the index itself.
Covering indexes refer to a situation where all the information required for a query exists within the index, allowing table access to be completely eliminated. A query results in a covering index when all columns present in the SELECT and WHERE clauses are included in the index.
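The covering-index effect is easy to observe with an execution plan. The sketch below uses SQLite (whose planner and plan format differ from Oracle's) with hypothetical table and column names; when every referenced column lives in the index, the plan reports a covering index and table access disappears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, product_id INTEGER, quantity INTEGER)")
conn.execute("CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)")

# all referenced columns (sale_date, product_id) are in the index -> no table access
covering = conn.execute(
    "EXPLAIN QUERY PLAN SELECT product_id FROM sales WHERE sale_date = '2024-01-01'"
).fetchall()
print(covering)       # ... USING COVERING INDEX sales_dt_pr ...

# quantity is not in the index -> the table row must still be fetched
not_covering = conn.execute(
    "EXPLAIN QUERY PLAN SELECT quantity FROM sales WHERE sale_date = '2024-01-01'"
).fetchall()
print(not_covering)   # ... USING INDEX sales_dt_pr ... (no COVERING)
```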
Clustered indexes store the entire table data inside the index itself. They are well suited to tables with only one index, but performance drops when multiple indexes exist. This is because any additional index, called a secondary index, stores the clustered index key instead of a direct row reference; to reach the table data, the clustered index must then be traversed as well.
While indexes allow for efficient lookups, they often have a negative impact on update-type queries. Therefore, it is extremely important to keep the number of indexes small. When data is inserted, it must also be added to the index, and maintenance to keep the index tree balanced must be performed. The cost of this increases as the number of indexes grows.
The Three Powers of an Index
The powerful features of an index are the following three. Thanks to these features, using indexes correctly can improve query performance.
- Data structure suitable for searching
- Clustered data
- Sorted data
Data structure suitable for searching refers to the B-tree, which, as mentioned above, allows for high-speed searches. Because of its balanced tree structure, all elements can be accessed in the same number of steps, allowing even huge datasets to be processed almost instantly. Another reason is that the tree depth increases very slowly compared to the increase in the number of leaf nodes.
Clustered data refers to the characteristic that an index can hold various types of data. Ideally, the data included in an index should primarily be used as access predicates to narrow down the range of index leaf nodes. On the other hand, indexes can also include other data; performance improves in some cases by including data for filter predicates or data that the query needs to return.
Sorted data refers to the characteristic that an index is sorted by the values of the columns it contains. Because of this feature, there are cases where sort processing for queries that require ordering becomes unnecessary. As a result, there is no need to read all rows for sorting, creating room for optimization where reading is stopped once the required rows are obtained. A mechanism that returns partial results without waiting for all data to be processed is called pipelining.
How Indexes are Used
In this chapter, I will describe how indexes are used in actual SQL. Since index usage varies depending on the actual stored data and statistics, what is written here is just one example.
WHERE
With a composite index like the following:
CREATE UNIQUE INDEX employees_pk
ON employees (employee_id, subsidiary_id)
Even if you execute a SQL query with only an equality operator like the one below, the index will not be used, and a full table scan will be performed.
SELECT first_name, last_name
FROM employees
WHERE subsidiary_id = 20
In the index above, data is sorted first by employee_id and then by subsidiary_id. The entries are therefore not ordered by subsidiary_id alone, so the index cannot be searched efficiently. To let the database use an index here, the order of the columns must be changed (or a separate index created).
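This behavior can be reproduced with a small experiment. The sketch below uses SQLite rather than Oracle (the plan wording differs, but the planner makes the same choice): filtering on the second index column alone forces a full scan, while filtering on the leading column allows an index search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employees
                (employee_id INTEGER, subsidiary_id INTEGER,
                 first_name TEXT, last_name TEXT)""")
conn.execute("CREATE UNIQUE INDEX employees_pk ON employees (employee_id, subsidiary_id)")

def plan(sql):
    return str(conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall())

# second index column only: the index cannot be used -> full table scan
print(plan("SELECT first_name, last_name FROM employees WHERE subsidiary_id = 20"))

# leading index column: an index search is possible
print(plan("SELECT first_name, last_name FROM employees WHERE employee_id = 123"))
```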
Next, let's look at SQL that includes both range searches and equality operations.
SELECT first_name, last_name, date_of_birth
FROM employees
WHERE date_of_birth >= TO_DATE(?, 'YYYY-MM-DD')
AND date_of_birth <= TO_DATE(?, 'YYYY-MM-DD')
AND subsidiary_id = ?
What happens if an index is ordered by date_of_birth and then subsidiary_id for such a SQL query? The Oracle execution plan, assuming that index is used, would look like the following:
CREATE UNIQUE INDEX emp_test
ON employees (date_of_birth, subsidiary_id)
--------------------------------------------------------------
|Id | Operation | Name | Rows | Cost |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 |
|*1 | FILTER | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 4 |
|*3 | INDEX RANGE SCAN | EMP_TEST | 2 | 2 |
--------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(:END_DT >= :START_DT)
3 - access(DATE_OF_BIRTH >= :START_DT
AND DATE_OF_BIRTH <= :END_DT)
filter(SUBSIDIARY_ID = :SUBS_ID)
Looking at the predicates used in the INDEX RANGE SCAN entry with Id 3, you can see that the condition on date_of_birth is used as an access predicate, while the condition on subsidiary_id is used as an index filter predicate. This causes unnecessary leaf node traversal: if many entries fall within the date_of_birth range but only one of them matches the specified subsidiary_id, all of those leaf nodes must still be traversed.
On the other hand, if an index ordered by subsidiary_id and then date_of_birth exists, the process terminates once the single matching subsidiary_id is found, eliminating the need to traverse unnecessary leaf nodes. The Oracle execution plan might look like this:
CREATE UNIQUE INDEX emp_test2
ON employees (subsidiary_id, date_of_birth)
---------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
---------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 |
|* 1 | FILTER | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 3 |
|* 3 | INDEX RANGE SCAN | EMP_TEST2 | 1 | 2 |
---------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(:END_DT >= :START_DT)
3 - access(SUBSIDIARY_ID = :SUBS_ID
AND DATE_OF_BIRTH >= :START_DT
AND DATE_OF_BIRTH <= :END_DT)
Since both the subsidiary_id and date_of_birth conditions are used as access predicates, there is no unnecessary leaf node traversal.
This makes sense when you consider the mechanism of composite indexes. In a composite index, since data is sorted by the first column and then the next, the second column is not necessarily arranged continuously.
Roughly speaking, a composite index should place the columns checked for equality first, followed by the columns used in range conditions.
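The difference between the two column orders can be checked with a quick experiment. The sketch below uses SQLite's INDEXED BY clause to force each index in turn (SQLite's plan output differs from Oracle's, and the table is a hypothetical stand-in): with the equality column first, both conditions delimit the scanned index range; with the range column first, only date_of_birth appears as an access condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (subsidiary_id INTEGER, date_of_birth TEXT)")
conn.execute("CREATE INDEX emp_test ON employees (date_of_birth, subsidiary_id)")   # range column first
conn.execute("CREATE INDEX emp_test2 ON employees (subsidiary_id, date_of_birth)")  # equality column first

query = ("SELECT * FROM employees INDEXED BY {} "
         "WHERE subsidiary_id = 30 "
         "AND date_of_birth BETWEEN '1970-01-01' AND '1980-01-01'")

def plan(index_name):
    return str(conn.execute("EXPLAIN QUERY PLAN " + query.format(index_name)).fetchall())

# range first: subsidiary_id is checked only while traversing the leaf nodes
print(plan("emp_test"))

# equality first: both columns narrow the scanned range of the index
print(plan("emp_test2"))
```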
There are also cases where executing a full table scan is faster than using an index. For example, consider the following index and SQL:
CREATE UNIQUE INDEX EMPLOYEES_PK
ON EMPLOYEES (SUBSIDIARY_ID, EMPLOYEE_ID)
SELECT first_name, last_name, subsidiary_id, phone_number
FROM employees
WHERE last_name = 'WINAND'
AND subsidiary_id = 30
In this case, if there are a large number of rows where subsidiary_id is 30, the database would have to read data for each row individually, potentially making the query slow and making a full table scan faster. Of course, adding an index on last_name would allow for a more effective search.
As these examples show, adding appropriate indexes requires understanding how your actual application accesses the database rather than following generalities. There is an urban legend that the most selective column should go on the left of an index, but such an index is useless if queries do not actually filter on that column.
JOIN
Databases use different algorithms for joining, such as nested loops, hash joins, and sort-merge joins, depending on the situation.
When using a nested loop, given the following indexes and SQL:
CREATE INDEX emp_up_name ON employees (UPPER(last_name))
CREATE INDEX sales_emp ON sales (subsidiary_id, employee_id)
SELECT e0_.employee_id AS employee_id0
-- MORE COLUMNS
FROM employees e0_
LEFT JOIN sales s1_
ON e0_.subsidiary_id = s1_.subsidiary_id
AND e0_.employee_id = s1_.employee_id
WHERE UPPER(e0_.last_name) LIKE ?
The following execution plan is displayed:
---------------------------------------------------------------
|Id |Operation | Name | Rows | Cost |
---------------------------------------------------------------
| 0 |SELECT STATEMENT | | 822 | 38 |
| 1 | NESTED LOOPS OUTER | | 822 | 38 |
| 2 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 4 |
|*3 | INDEX RANGE SCAN | EMP_UP_NAME | 1 | |
| 4 | TABLE ACCESS BY INDEX ROWID| SALES | 821 | 34 |
|*5 | INDEX RANGE SCAN | SALES_EMP | 31 | |
---------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access(UPPER("LAST_NAME") LIKE 'WIN%')
filter(UPPER("LAST_NAME") LIKE 'WIN%')
5 - access("E0_"."SUBSIDIARY_ID"="S1_"."SUBSIDIARY_ID"(+)
AND "E0_"."EMPLOYEE_ID" ="S1_"."EMPLOYEE_ID"(+))
The database retrieves results from the EMPLOYEES table using EMP_UP_NAME and then extracts records corresponding to each employee from the SALES table. As the name "nested loop" suggests, it takes data from the searched EMPLOYEES table row by row and joins them with rows from the SALES table. It performs well when the number of rows on the outside (in this case, the filtered EMPLOYEES table) is small. Therefore, in tuning, it is necessary to ensure that the outer query returns as few rows as possible.
On the other hand, the hash join algorithm reads records from one side of the join into a hash table all at once. In a nested loop join, if the number of rows in the outer query is large, it has the disadvantage of causing a massive number of B-tree traversals in the inner query due to the nature of joining in a loop. In the example above, the operation of taking one row from the EMPLOYEES table and joining it with a row from the SALES table is repeated for the number of rows in the EMPLOYEES table. Consequently, it becomes necessary to traverse the index tree of the SALES table many times. Therefore, by loading all target records from one side of the join into a hash table beforehand, high-speed matching is made possible.
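The contrast between the two algorithms can be sketched in a few lines of Python (a toy model, not how a real engine implements them; rows are plain tuples and the join key is the first element):

```python
def nested_loop_join(outer, inner, key):
    # for every outer row, probe the inner side; in a database the inner probe
    # is a B-tree traversal that is repeated once per outer row
    return [(o, i) for o in outer for i in inner if key(o) == key(i)]

def hash_join(outer, inner, key):
    # build phase: load one side into a hash table in a single pass
    table = {}
    for row in inner:
        table.setdefault(key(row), []).append(row)
    # probe phase: one constant-time lookup per outer row, no repeated traversal
    return [(o, i) for o in outer for i in table.get(key(o), [])]

employees = [(1, 'WINAND'), (2, 'SMITH')]
sales = [(1, 100), (1, 250), (2, 80)]
join_key = lambda row: row[0]

# both algorithms produce the same pairs; only the access pattern differs
assert nested_loop_join(employees, sales, join_key) == hash_join(employees, sales, join_key)
```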
With the following index and SQL:
CREATE INDEX sales_date ON sales (sale_date)
SELECT *
FROM sales s
JOIN employees e ON (s.subsidiary_id = e.subsidiary_id
AND s.employee_id = e.employee_id )
WHERE s.sale_date > trunc(sysdate) - INTERVAL '6' MONTH
The following execution plan is displayed:
--------------------------------------------------------------
| Id | Operation | Name | Bytes| Cost|
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 59M| 3252|
|* 1 | HASH JOIN | | 59M| 3252|
| 2 | TABLE ACCESS FULL | EMPLOYEES | 9M| 478|
| 3 | TABLE ACCESS BY INDEX ROWID| SALES | 10M| 1724|
|* 4 | INDEX RANGE SCAN | SALES_DATE| | |
--------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("S"."SUBSIDIARY_ID"="E"."SUBSIDIARY_ID"
AND "S"."EMPLOYEE_ID" ="E"."EMPLOYEE_ID" )
4 - access("S"."SALE_DATE" > TRUNC(SYSDATE@!)
-INTERVAL'+00-06' YEAR(2) TO MONTH)
While the nested loop join uses an index for the join condition, here the join results in a full table access. During the join, all the table data is read once, stored in a hash table, and that hash table is used for matching. Although no index is used for the join condition itself, Id 4 shows that an index is used for sale_date, the condition that restricts a single table independently of the join.
Tuning for hash joins requires a completely different approach than for nested loop joins: reducing the size of the hash table. For example, you can reduce the size by explicitly specifying necessary columns instead of using * in the SELECT statement.
Sort-merge join combines two tables that are sorted by the join attribute, and it can evaluate left and right outer joins simultaneously, enabling a full outer join in a single pass. Performance is good when the inputs are already sorted, but otherwise it is rarely used because the sorting step is expensive.
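The core of the merge step can be sketched as follows (a toy model assuming both inputs are lists of tuples already sorted on the join key):

```python
def merge_join(left, right, key):
    # both inputs must already be sorted on the join key
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # find the group of equal keys on the right side...
            j_end = j
            while j_end < len(right) and key(right[j_end]) == kl:
                j_end += 1
            # ...and pair it with every matching row on the left
            while i < len(left) and key(left[i]) == kl:
                result.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return result
```

Because both cursors only ever move forward, each input is read exactly once, which is why the method is cheap when the sort order already exists (for example, from an index).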
ORDER BY
Because indexes have an order, for SQL with an ORDER BY clause, if the relevant index has already sorted the rows, there is no need to perform a sort operation on the results, and performance improves.
In the following index and SQL example:
CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)
SELECT sale_date, product_id, quantity
FROM sales
WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY
ORDER BY sale_date, product_id
The following execution plan is displayed:
---------------------------------------------------------------
|Id | Operation | Name | Rows | Cost |
---------------------------------------------------------------
| 0 | SELECT STATEMENT | | 320 | 300 |
| 1 | TABLE ACCESS BY INDEX ROWID| SALES | 320 | 300 |
|*2 | INDEX RANGE SCAN | SALES_DT_PR | 320 | 4 |
---------------------------------------------------------------
Although there is an ORDER BY in the query, SORT ORDER BY does not appear in the execution plan. The database uses the ordering provided by the index to skip the sorting process. Here, we call this type of ORDER BY a pipelined ORDER BY.
A pipelined ORDER BY works in both directions, because the doubly linked leaf nodes can be scanned forwards or backwards. For example, in the following index and SQL, the sort is skipped because the ORDER BY matches a backwards scan of the index.
CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)
SELECT sale_date, product_id, quantity
FROM sales
WHERE sale_date >= TRUNC(sysdate) - INTERVAL '1' DAY
ORDER BY sale_date DESC, product_id DESC
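Whether the sort step disappears can be checked in the execution plan. The sketch below uses SQLite as a stand-in (Oracle would show or omit SORT ORDER BY instead; SQLite prints USE TEMP B-TREE FOR ORDER BY when it must sort):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, product_id INTEGER, quantity INTEGER)")
conn.execute("CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)")

def plan(sql):
    return str(conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall())

# ORDER BY matches the index order (scanned backwards for DESC) -> no sort step
print(plan("SELECT sale_date, product_id FROM sales "
           "ORDER BY sale_date DESC, product_id DESC"))

# ORDER BY on a column the index does not contain -> an explicit sort is required
print(plan("SELECT sale_date, product_id FROM sales ORDER BY quantity"))
```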
GROUP BY
Databases use either a hash algorithm or a sort-group algorithm for GROUP BY. The hash algorithm processes all records together on a temporary hash table before returning the results, while the sort-group algorithm sorts by the group key so that each group can be processed in order.
In the sort-group algorithm, using an index eliminates the need for sorting, which improves performance. Here, we call this a pipelined GROUP BY.
In the following index and SQL example:
CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)
SELECT product_id, sum(eur_value)
FROM sales
WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY
GROUP BY product_id
The following execution plan is displayed:
---------------------------------------------------------------
|Id |Operation | Name | Rows | Cost |
---------------------------------------------------------------
| 0 |SELECT STATEMENT | | 17 | 192 |
| 1 | SORT GROUP BY NOSORT | | 17 | 192 |
| 2 | TABLE ACCESS BY INDEX ROWID| SALES | 321 | 192 |
|*3 | INDEX RANGE SCAN | SALES_DT_PR | 321 | 3 |
---------------------------------------------------------------
Since sale_date is restricted to a single value and, within that value, the index entries are already sorted by product_id, the index order can be used, and the NOSORT qualifier is appended to SORT GROUP BY.
On the other hand, in cases where sale_date cannot be narrowed down to a single value and a pipelined GROUP BY cannot be used, as in the following SQL:
SELECT product_id, sum(eur_value)
FROM sales
WHERE sale_date >= TRUNC(sysdate) - INTERVAL '1' DAY
GROUP BY product_id
The following execution plan is displayed:
---------------------------------------------------------------
|Id |Operation | Name | Rows | Cost |
---------------------------------------------------------------
| 0 |SELECT STATEMENT | | 24 | 356 |
| 1 | HASH GROUP BY | | 24 | 356 |
| 2 | TABLE ACCESS BY INDEX ROWID| SALES | 596 | 355 |
|*3 | INDEX RANGE SCAN | SALES_DT_PR | 596 | 4 |
---------------------------------------------------------------
Here, a hash algorithm is used. Unlike the sort-group algorithm, the hash algorithm only needs to buffer the aggregation results, which requires less memory.
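The two GROUP BY strategies can be sketched in plain Python (a toy model of a SUM aggregate, not a real engine's implementation):

```python
from itertools import groupby

def hash_group_sum(rows, key, value):
    # hash algorithm: keep one running total per group; input order is irrelevant,
    # but nothing can be returned until every row has been seen
    totals = {}
    for row in rows:
        totals[key(row)] = totals.get(key(row), 0) + value(row)
    return totals

def sort_group_sum(sorted_rows, key, value):
    # sort-group algorithm: rows arrive ordered by the group key (e.g. from an
    # index), so each group is complete, and can be emitted, as soon as the key changes
    return {k: sum(value(r) for r in grp) for k, grp in groupby(sorted_rows, key=key)}

rows = [('a', 1), ('b', 2), ('a', 3)]
print(hash_group_sum(rows, lambda r: r[0], lambda r: r[1]))          # {'a': 4, 'b': 2}
print(sort_group_sum(sorted(rows), lambda r: r[0], lambda r: r[1]))  # {'a': 4, 'b': 2}
```

The results are identical; the difference is that the sort-group variant can stream groups out one at a time (pipelined), while the hash variant buffers only the per-group totals.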
LIMIT
When using a pipelined ORDER BY, you can use LIMIT or FETCH FIRST to inform the database that you do not need all rows, allowing it to stop execution before retrieving everything.
For example, in the following index and SQL:
CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)
SELECT *
FROM sales
ORDER BY sale_date DESC
FETCH FIRST 10 ROWS ONLY
The following execution plan is displayed:
-------------------------------------------------------------
| Operation | Name | Rows | Cost |
-------------------------------------------------------------
| SELECT STATEMENT | | 10 | 9 |
| COUNT STOPKEY | | | |
| VIEW | | 10 | 9 |
| TABLE ACCESS BY INDEX ROWID| SALES | 1004K| 9 |
| INDEX FULL SCAN DESCENDING| SALES_DT_PR | 10 | 3 |
-------------------------------------------------------------
You can see that an execution stop using COUNT STOPKEY is planned. Selecting the first N items using a pipelined ORDER BY is scalable because the performance remains almost unchanged even as the table size increases. Conversely, if an index does not exist and a pipelined ORDER BY cannot be used, the execution plan would look like this:
--------------------------------------------------
| Operation | Name | Rows | Cost |
--------------------------------------------------
| SELECT STATEMENT | | 10 | 59558 |
| COUNT STOPKEY | | | |
| VIEW | | 1004K| 59558 |
| SORT ORDER BY STOPKEY| | 1004K| 59558 |
| TABLE ACCESS FULL | SALES | 1004K| 9246 |
--------------------------------------------------
Because no sorting has been performed beforehand, the database must first read all rows and perform a sort before the SORT ORDER BY STOPKEY selects the first N items. Although COUNT STOPKEY is present, it must process 1004K rows before reaching that step. Since it accesses all data, performance is highly likely to degrade as the table size grows.
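The scalability difference comes down to how many rows are touched, which can be sketched with a row counter (a toy model: the generator stands in for a pipelined, index-ordered scan):

```python
from itertools import islice

ROWS_READ = 0

def index_ordered_scan(n):
    # stand-in for a pipelined, index-ordered read; counts every row it touches
    global ROWS_READ
    for i in range(n):
        ROWS_READ += 1
        yield i

# pipelined: the consumer stops after 10 rows, so only 10 rows are ever read
first_ten = list(islice(index_ordered_scan(1_000_000), 10))
print(ROWS_READ)  # 10

# no usable index: every row must be read and sorted before the first 10 exist
ROWS_READ = 0
first_ten_sorted = sorted(index_ordered_scan(1_000_000))[:10]
print(ROWS_READ)  # 1000000
```

Both approaches return the same 10 rows, but the first touches 10 rows regardless of table size, while the second touches all of them.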
Conclusion
I have summarized USE THE INDEX, LUKE! in my own way from the perspective of "indexes that make things slow and indexes that make things fast."
I realized that indexes do not improve speed in every scenario, and it is crucial to understand the cases where they cause slowness. To that end, I believe the concepts of "Cases where Index Searches are Slow" and "Access and Filter Predicates" are very useful to remember.
In this post, I introduced how indexes are used in some common SQL queries, but in real-world applications, a much wider variety of SQL is used. I hope that referring to the examples in this post and in USE THE INDEX, LUKE! will be helpful when you encounter performance problems.