PURPOSE
This document discusses the use of large IN lists and multiple OR statements in queries.
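There is no strict definition of a 'large' inlist, but as soon as a query becomes unmanageable (either at parse or execution time) because of the size of the list, it is time to question whether the list is appropriate. Anything above 10 values tends to imply that an object is missing from the database (see below). (Note: a table should be created to join against instead; the plan may then change from a nested loop (NL if an index is used, FILTER if no index is used) to a hash join.)
INLIST: this is a good case study for understanding the FILTER and NL (nested loop) operations.
The premise is that every IN list is converted into OR predicates.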
QUESTIONS AND ANSWERS
Very Long Inlists
Very long inlists can cause problems for the Cost Based Optimizer (CBO) because of the extended comparisons required. Parse performance was a particular issue in earlier versions, especially when the inlist was expanded into a large number of UNION ALLed statements: the CBO had to determine the cost of the expanded statement, which was time consuming because of the large number of branches. More recently, with different implementation methods (see below), the main issue is likely to be CPU consumption from the large number of comparisons at run time rather than parse time. In these cases the only real solution is to consider whether the long list indicates a missing object in the database that should be storing this data. The queries can then be re-coded so that the values are stored in a lookup table and the query joins to that table instead of using the inlist. If a permanent table is unavailable, a temporary table can offer a suitable solution. See:
Oracle Database Online Documentation 12c Release 1 (12.1)
Database Administrator's Guide
Creating a Temporary Table
Managing Tables
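As a rough sketch of the lookup-table approach described above, the inlist values can be loaded into a temporary table and the query joined to it. The table name dept_filter and its column are hypothetical, chosen only for illustration; emp and deptno are borrowed from the example later in this document. This is one possible shape under those assumptions, not a prescribed implementation.

```sql
-- Hypothetical lookup table holding the values that would otherwise
-- appear in the inlist. ON COMMIT PRESERVE ROWS keeps the rows for the
-- duration of the session rather than the transaction.
CREATE GLOBAL TEMPORARY TABLE dept_filter (
  deptno NUMBER PRIMARY KEY
) ON COMMIT PRESERVE ROWS;

-- Load the values once per session (three shown here for brevity;
-- in practice this would be the long list).
INSERT INTO dept_filter (deptno) VALUES (10);
INSERT INTO dept_filter (deptno) VALUES (20);
INSERT INTO dept_filter (deptno) VALUES (30);

-- Join to the lookup table instead of using a long inlist.
SELECT e.empno
FROM   emp e
JOIN   dept_filter f ON f.deptno = e.deptno;
```

Because the primary key keeps the values in dept_filter unique, the join returns the same rows as the inlist form, and the optimizer is free to cost a nested loops join against a hash join.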
How the CBO processes an IN list or multiple OR statements
When the Cost Based optimizer encounters an IN list it has a couple of options available to it:
- Use the inlist iterator functionality to process the list as a filter. Inlist iterator functionality has been available since Oracle 8i. In early versions (i.e. Oracle7 and below) the inlist iterator functionality was unavailable and this option could not use indexes, making it very undesirable.
- Break the statement into a series of statements UNION ALLed together, as the optimizer considered doing in earlier versions. In modern versions this option is far less prevalent and the concatenation option has generally been phased out, because it almost never results in a better plan: the cost of the concatenation is usually higher than that of the other options. The example below is provided for historical reference.
Example of the concatenation expansion:
SELECT empno FROM emp WHERE deptno IN (10,20,30)
can be rewritten as:
SELECT empno FROM emp WHERE deptno = 10
UNION ALL
SELECT empno FROM emp WHERE deptno = 20
UNION ALL
SELECT empno FROM emp WHERE deptno = 30
In this example, if the deptno column is indexed then an index can be used for the lookup on each branch. If the Cost Based Optimizer does not perform the concatenation automatically (because of cost comparisons), in some cases it can be forced with the USE_CONCAT hint (see Note:17214.1), although in later versions this is subject to numerous restrictions and may not result in a concatenation. If the expansion to a UNION ALL is not desirable, the CBO can be prevented from expanding with the NO_EXPAND hint.
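To see which option the optimizer actually picked, the plan can be inspected with and without the hints. The sketch below assumes the emp table from the example above, an index on deptno, and a release that provides DBMS_XPLAN (9i Release 2 onwards). The resulting plans depend on version and statistics, so none are shown here.

```sql
-- Default behaviour: let the CBO choose.
EXPLAIN PLAN FOR
SELECT empno FROM emp WHERE deptno IN (10, 20, 30);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Request expansion into branches (may be ignored in later versions).
EXPLAIN PLAN FOR
SELECT /*+ USE_CONCAT */ empno FROM emp WHERE deptno IN (10, 20, 30);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Ask the CBO not to expand the inlist.
EXPLAIN PLAN FOR
SELECT /*+ NO_EXPAND */ empno FROM emp WHERE deptno IN (10, 20, 30);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

An INLIST ITERATOR operation in the plan indicates that the iterator was used. In releases where the concatenation option is still chosen, it appears as a CONCATENATION operation with one child branch per value.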
Historical Considerations
Parse performance related to long inlists was not an issue for the (now obsolete) Rule Based optimizer because it did not do cost evaluation.
Previously, if there was a parsing issue with long inlists, the workarounds were as follows:
- Use the NO_EXPAND hint. This will NOT use an index with Oracle7 but can in Oracle8. Remember that the use of hints forces the use of the CBO.
- Pre 10g: use the RBO.
Explain the use of the USE_CONCAT hint.
Note: This Document is a legacy document written in the Oracle7 time frame. Although the general basis of the article may still be valid, some changes may have occurred.
SOLUTION
The use of the USE_CONCAT hint forces the optimizer to fully expand each OR predicate in the query into a separate query block. When you have multiple in-lists this can cause a single query to be expanded into many query blocks.
Using USE_CONCAT hint to override a full table scan with IN/OR statements.
With the cost based optimizer prior to 7.2.2 / 7.3, it is not possible to force the optimizer to use an index on a list of OR predicates or an IN clause with multiple values if the total cost of the index scans is greater than the cost of a Full Table Scan. This is not a problem with small numbers of rows, but a suboptimal path may be chosen when large numbers of rows are involved. In 7.2.2 a new hint for the Cost Based Optimizer, USE_CONCAT, is introduced which allows the user to suggest the use of indexes over a Full Table Scan.
Consider the following:
Table "table1" has been analyzed with the compute option and the OPTIMIZER_GOAL parameter has been set to CHOOSE in each case.
Full Table Scan cost = 3
Index Scan Cost = 1 (each index access costs 1)
Table "TABLE1" has an index "IND1" on column "COL1".
Remember that the optimizer converts IN predicates into a list of OR's.
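A setup along the following lines would match the description above. It is a sketch only: the column list for table1 is assumed (the original does not give it), and the statements use Oracle7-era syntax to match the time frame of this note. Later releases set the optimizer goal through OPTIMIZER_MODE instead.

```sql
-- Assumed table definition; only COL1 and the index IND1 are named in
-- the text, the second column is hypothetical.
CREATE TABLE table1 (
  col1 NUMBER,
  col2 VARCHAR2(30)
);

CREATE INDEX ind1 ON table1 (col1);

-- "Analyzed with the compute option".
ANALYZE TABLE table1 COMPUTE STATISTICS;

-- Optimizer goal as stated in the text (session level).
ALTER SESSION SET OPTIMIZER_GOAL = CHOOSE;

-- The optimizer treats these two statements alike: the IN predicate
-- is converted into a list of ORs.
SELECT * FROM table1 WHERE col1 IN (1, 2, 3);
SELECT * FROM table1 WHERE col1 = 1 OR col1 = 2 OR col1 = 3;
```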
Examples:
SELECT * FROM table1 WHERE col1 IN (1,2,3)
Query Plan
------------------------------------------------------------------------
1.1 SELECT STATEMENT Cost = 3
2.1 CONCATENATION
3.1 INDEX RANGE SCAN IND1
3.2 INDEX RANGE SCAN IND1
3.3 INDEX RANGE SCAN IND1
Explanation:
The optimizer chooses 3 index scans to retrieve the data when the cost is the same as a Full Table Scan.
SELECT * FROM table1 WHERE col1 IN (1,2,3,4)
(Note: this becomes 4 OR predicates, so the total index cost is 4.)
Query Plan
-----------------------------------------------------------------------
1.1 SELECT STATEMENT Cost = 3
2.1 TABLE ACCESS FULL TABLE1
Explanation: The optimizer chooses a Full Table Scan in preference to index scans as the cost is lower (FTS cost = 3, cost of 4 Index scans = 4).
SELECT /*+ INDEX(table1 ind1) */ * FROM table1 WHERE col1 IN (1,2,3,4)
Query Plan
------------------------------------------------------------------------
1.1 SELECT STATEMENT Cost = 3
2.1 TABLE ACCESS FULL TABLE1
Explanation: The optimizer chooses a Full Table Scan over index scans of multiple OR'd values (or an extended IN list) if the total cost of the multiple OR predicates is greater than the Full Table Scan cost. This behavior is acceptable unless large numbers of records are involved, which causes the cost of a Full Table Scan to become prohibitive. The index hint (/*+ INDEX(table1 ind1) */) does not help in this case because it is applied to each of the OR predicates rather than to the query as a whole. When the optimizer adds together the cost of all the OR predicates, it finds that the total (4) is greater than the cost of a Full Table Scan (3) and chooses the Full Table Scan instead, as in query 2. The hint is not being ignored.
The desired use of indexes can be forced in 7.2.2 using the USE_CONCAT hint which is enabled using event 10078. This event can be enabled at session or instance level as follows:
Session level:
ALTER SESSION SET EVENTS '10078 trace name context forever, level 99';
Instance Level (in the initialization parameter file):
EVENT="10078 trace name context forever, level 1"
With this event enabled the USE_CONCAT hint may be used, which results in the following explain plan:
SELECT /*+ USE_CONCAT */ * FROM table1 WHERE col1 IN (1,2,3,4);
Query Plan
------------------------------------------------------------------------
1.4 SELECT STATEMENT Cost = 4
2.1 CONCATENATION
3.1 INDEX RANGE SCAN IND1
3.2 INDEX RANGE SCAN IND1
3.3 INDEX RANGE SCAN IND1
3.4 INDEX RANGE SCAN IND1
The hint will be enabled permanently in 7.3 - it will not require that you set the 10078 event.
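The Query Plan listings in this note do not say how they were produced. One common way to obtain a similar indented listing, assuming a PLAN_TABLE exists in the current schema (typically created with the utlxplan.sql script), is sketched below. The statement ID 'inlist_demo' is an arbitrary label, and the output layout will not match the listings above exactly.

```sql
-- Remove any earlier rows for this (arbitrary) statement ID.
DELETE FROM plan_table WHERE statement_id = 'inlist_demo';

EXPLAIN PLAN SET STATEMENT_ID = 'inlist_demo' FOR
SELECT /*+ USE_CONCAT */ * FROM table1 WHERE col1 IN (1, 2, 3, 4);

-- Walk the plan tree from the root row (id = 0), indenting by depth.
SELECT LPAD(' ', 2 * (LEVEL - 1)) || operation || ' ' ||
       options || ' ' || object_name AS query_plan
FROM   plan_table
START  WITH id = 0 AND statement_id = 'inlist_demo'
CONNECT BY PRIOR id = parent_id AND statement_id = 'inlist_demo';
```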