Why is the optimizer not choosing INDEX FULL/RANGE SCAN (MIN/MAX) with KEEP function

In your first example, the primary key index only contains the column a. So in order to execute:

 SELECT MAX(b) KEEP(DENSE_RANK LAST ORDER BY a) FROM t

Oracle Database has to access the table to get the values for column b. So if you use the index it has to read both the table and the index. You have no where clause so it can’t do any filtering in the index itself. So it has to full scan it:

EXPLAIN PLAN FOR SELECT /*+ index (t PK_T) */MAX(b) KEEP(DENSE_RANK LAST ORDER BY a) FROM t / SELECT * FROM TABLE(Dbms_Xplan.Display(format => 'basic +rows')) / ------------------------------------------------------------- | Id | Operation | Name | Rows | ------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 | SORT AGGREGATE | | 1 | | 2 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | | 3 | INDEX FULL SCAN | PK_T | 1 | ------------------------------------------------------------- 

So you’ve swapped a full table scan for a full index scan + table access by rowid. In most cases this will be more work, as we can see if we load the table up with data and get the execution plan. Pay attention to the buffers column:

insert into t with rws as ( select level a, level b from dual connect by level <= 10000 ) select a, b from rws; commit; exec dbms_stats.gather_table_stats(user, 't'); SELECT /*+ gather_plan_statistics index (t PK_T) */MAX(b) KEEP(DENSE_RANK LAST ORDER BY a) FROM t / select * from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST')); / ------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 37 | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 37 | | 2 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 10000 | 10000 |00:00:00.04 | 37 | | 3 | INDEX FULL SCAN | PK_T | 1 | 10000 | 10000 |00:00:00.01 | 19 | ------------------------------------------------------------------------------------------------------- SELECT /*+ gather_plan_statistics */MAX(b) KEEP(DENSE_RANK LAST ORDER BY a) FROM t / select * from table(dbms_xplan.display_cursor(null, null, 'LAST BASIC IOSTATS')); / ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 23 | 1 | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 23 | 1 | | 2 | TABLE ACCESS FULL| T | 1 | 10000 | 10000 |00:00:00.01 | 23 | 1 | ---------------------------------------------------------------------------------------------- 

Notice that the index + table access uses 37 buffers. But the full table scan accesses just 23. So it does 14 more IO operations.

In your second example:

 SELECT b FROM t WHERE a = (SELECT MAX(a) FROM t)

the table t appears twice. So the database will access it twice in the plan! It has to read the index first to find the maximum value for A. Then access the table (using the index) to find the value for b.

In your third example the optimizer choosing a full table scan is likely an due to the fact your table is empty. If there’s no rows it doesn’t make any difference whether you scan the index or the table: there’s nothing to read!

Keeping the table loaded with data from my previous test and the optimizer does choose an index full scan:

EXPLAIN PLAN FOR SELECT b FROM t WHERE ROWID = (SELECT MAX(ROWID) KEEP(DENSE_RANK LAST ORDER BY a) FROM t) / SELECT * FROM TABLE(Dbms_Xplan.Display(format => 'basic +rows')) / --------------------------------------------------- | Id | Operation | Name | Rows | --------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 | TABLE ACCESS BY USER ROWID| T | 1 | | 2 | SORT AGGREGATE | | 1 | | 3 | INDEX FAST FULL SCAN | PK_T | 10000 | --------------------------------------------------- SELECT /*+ gather_plan_statistics */b FROM t WHERE ROWID = (SELECT MAX(ROWID) KEEP(DENSE_RANK LAST ORDER BY a) FROM t); select * from table(dbms_xplan.display_cursor(null, null, 'LAST BASIC IOSTATS')); --------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 25 | | 1 | TABLE ACCESS BY USER ROWID| T | 1 | 1 | 1 |00:00:00.01 | 25 | | 2 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 24 | | 3 | INDEX FAST FULL SCAN | PK_T | 1 | 10000 | 10000 |00:00:00.01 | 24 | ---------------------------------------------------------------------------------------------

You’ve got the same problems in your second set of queries. Yes, you’ve added another column to the PK. But you’ve also added another column to the table that’s not part of the index! And you’ve still got no data, which is likely to lead to “usual” plans.

Also note that in general your queries aren’t equivalent. The function:

MAX(b) KEEP(DENSE_RANK LAST ORDER BY a)

Sorts the rows by A and finds the greatest value. It then inspects all the rows with this value and returns the largest value for B among them.

When you do:

 SELECT b FROM t WHERE a = (SELECT MAX(a) FROM t)

You find the largest value for A, then return B for all the rows that have that value.

And with:

 SELECT b FROM t WHERE ROWID = (SELECT MAX(ROWID) KEEP(DENSE_RANK LAST ORDER BY a) FROM t)

You find all the rows which have the largest value for A. Then return B for whichever of these has the highest rowid.

In cases where A is not unique, these can all return different results:

CREATE TABLE t(a NUMBER , b NUMBER) / insert into t values (1, 2); insert into t values (1, 3); insert into t values (1, 1); commit; SELECT MAX(b) KEEP(DENSE_RANK LAST ORDER BY a) FROM t / MAX(B)KEEP(DENSE_RANKLASTORDERBYA) 3 SELECT b FROM t WHERE a = (SELECT MAX(a) FROM t) / B 2 3 1 SELECT b FROM t WHERE ROWID = (SELECT MAX(ROWID) KEEP(DENSE_RANK LAST ORDER BY a) FROM t) / B 1 

Related:

Leave a Reply