USE_CONCAT, IN lists, OR predicates, and FILTER vs. Nested Loops

PURPOSE

This document discusses the use of large IN lists and multiple OR statements in queries.

There is no strict definition of a 'large' inlist, but once a query becomes unmanageable (at either parse or execution time) because of the size of the list, it is time to question whether the list is appropriate. Anything above 10 values tends to suggest that an object is missing from the database (see below). (Annotation: a table should be created to join against; the plan may then change from a nested loop (NL if an index is used, FILTER if no index is used) to a hash join.)

INLIST: this is a good case for understanding FILTER and NL (nested loops).

The premise is that every IN is converted into ORs.

QUESTIONS AND ANSWERS

Very Long Inlists

Very long inlists can cause problems for the Cost Based Optimizer (CBO) because of the extended comparisons required. Parse performance was a particular issue in earlier versions, especially when the inlist was expanded into a large number of statements combined with UNION ALL: the CBO had to determine the cost of the expanded statement, which was time consuming because of the large number of branches. More recently, with different implementation methods (see below), the main issue is likely to be CPU consumption from the large number of comparisons at run time rather than parse time.

In these cases the only real solution is to consider whether the long list indicates a missing object in the database that should be storing this data. The queries can then be re-coded so that the inlist values are stored in a lookup table and the query joins to that table instead of using the inlist. If a permanent table is unavailable, a temporary table can offer a suitable solution. See:

Oracle Database Online Documentation 12c Release 1 (12.1)

Database Administrator's Guide

Creating a Temporary Table
Managing Tables
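The lookup-table approach can be sketched as follows. This is a minimal illustration, not Oracle code: it uses Python's built-in sqlite3 module as a stand-in database, the emp table borrows its name from the example later in this note, and the wanted_deptno table and the sample rows are hypothetical. In Oracle the lookup table would typically be a permanent table or a global temporary table, as described in the guide referenced above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up emp table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?)",
        [(7369, 20), (7499, 30), (7782, 10), (7900, 30), (7934, 40)],
    )
    return conn


def fetch_empnos_via_lookup(conn, deptnos):
    """Load the values into a temporary lookup table and join to it."""
    cur = conn.cursor()
    # wanted_deptno is a hypothetical name; it replaces the long IN list.
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS wanted_deptno (deptno INTEGER PRIMARY KEY)"
    )
    cur.execute("DELETE FROM wanted_deptno")
    cur.executemany(
        "INSERT OR IGNORE INTO wanted_deptno (deptno) VALUES (?)",
        [(d,) for d in deptnos],
    )
    cur.execute(
        "SELECT e.empno FROM emp e "
        "JOIN wanted_deptno w ON w.deptno = e.deptno "
        "ORDER BY e.empno"
    )
    return [row[0] for row in cur.fetchall()]


conn = build_demo_db()
lookup_result = fetch_empnos_via_lookup(conn, [10, 20, 30])
inlist_result = [
    row[0]
    for row in conn.execute(
        "SELECT empno FROM emp WHERE deptno IN (10, 20, 30) ORDER BY empno"
    )
]
print(lookup_result == inlist_result)
```

Both queries return the same rows; the difference is that the join version keeps the statement text constant regardless of how many values are loaded.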

How the CBO processes an IN list or multiple OR statements

When the Cost Based optimizer encounters an IN list it has a couple of options available to it:

  • Use the inlist iterator functionality to process the list as a filter.
    Inlist iterator functionality has been available since Oracle 8i.
    In early versions (i.e. Oracle7 and below) the inlist iterator functionality was unavailable and this option could not use indexes, making it very undesirable.

  • Break the statement into a series of statements combined with UNION ALL. Earlier versions of the optimizer considered this option. In modern versions it is far less prevalent and the concatenation option is generally being phased out, because it almost never results in a better plan: the cost of the concatenation is usually higher than that of the other options. The example below is provided for historical reference. Example of the concatenation expansion:

    SELECT empno FROM emp WHERE deptno IN (10,20,30)

    can be rewritten as:

    SELECT empno FROM emp WHERE deptno = 10
    UNION ALL
    SELECT empno FROM emp WHERE deptno = 20
    UNION ALL
    SELECT empno FROM emp WHERE deptno = 30

    In this example, if the deptno column is indexed then an index can be used for the lookup on each branch. If the Cost Based Optimizer does not perform the concatenation automatically (because of cost comparisons), it can in some cases be forced with the USE_CONCAT hint (see Note:17214.1), although in later versions this is subject to numerous restrictions and may not result in a concatenation. If the expansion to a UNION ALL is not desirable, the CBO can be prevented from expanding by using the NO_EXPAND hint.
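The textual rewrite described in the second option can be sketched with a small helper. This only illustrates the shape of the expansion; it is not how the optimizer implements it internally, and the function name expand_in_list is hypothetical. The sketch handles integer values only and drops duplicate values, since a repeated value would otherwise return the same rows from two UNION ALL branches.

```python
def expand_in_list(select_prefix, column, values):
    """Rewrite '<prefix> WHERE <column> IN (...)' as one UNION ALL branch per value."""
    # dict.fromkeys removes duplicates while keeping the original order.
    distinct_values = list(dict.fromkeys(values))
    branches = [
        f"{select_prefix} WHERE {column} = {int(value)}"
        for value in distinct_values
    ]
    return "\nUNION ALL\n".join(branches)


expanded_sql = expand_in_list("SELECT empno FROM emp", "deptno", [10, 20, 30])
print(expanded_sql)
```

Each branch contains a single equality predicate, which is why an index on deptno can be used separately for every branch.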

Historical Considerations

Parse performance related to long inlists was not an issue for the (now obsolete) Rule Based Optimizer (RBO) because it did not do cost evaluation.

Previously, if there was a parsing issue with long inlists, the workarounds were as follows:

  • Use the NO_EXPAND hint. This will NOT use an index with Oracle7 but can in Oracle8. Remember that the use of hints forces the use of the CBO.
  • Pre-10g: use the RBO.

Explain the use of the USE_CONCAT hint.

Note: This Document is a legacy document written in the Oracle7 time frame. Although the general basis of the article may still be valid, some changes may have occurred.

SOLUTION

The use of the USE_CONCAT hint forces the optimizer to fully expand each OR predicate in the query into a separate query block. When you have multiple in-lists this can cause a single query to be expanded into many query blocks.
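To get a feel for how quickly the number of query blocks can grow, the sketch below assumes that full expansion of AND-ed in-lists behaves like a conversion to disjunctive normal form, giving one block per combination of values. The note itself does not spell out the exact rule, so treat this as an assumption. The column col2 and the function name are hypothetical.

```python
from itertools import product


def expand_to_query_blocks(table, columns, in_lists):
    """Return one SELECT per combination of values (assumed full OR expansion)."""
    blocks = []
    for combination in product(*in_lists):
        predicate = " AND ".join(
            f"{column} = {value}" for column, value in zip(columns, combination)
        )
        blocks.append(f"SELECT * FROM {table} WHERE {predicate}")
    return blocks


# col1 IN (1,2,3) AND col2 IN (10,20): 3 * 2 combinations
blocks = expand_to_query_blocks("table1", ["col1", "col2"], [[1, 2, 3], [10, 20]])
for block in blocks:
    print(block)
print(len(blocks))
```

Under this assumption the block count is the product of the list lengths, so two in-lists of 20 values each would already produce 400 blocks.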

Using USE_CONCAT hint to override a full table scan with IN/OR statements.

With the cost based optimizer prior to 7.2.2 / 7.3, it is not possible to force the optimizer to use an index on a list of OR predicates or an IN clause with multiple values if the total cost of the Index Scans is greater than the cost of a Full Table Scan. This is not a problem with small numbers of rows, but a suboptimal path may be chosen when large numbers of rows are involved. In 7.2.2 a new hint for the Cost Based Optimizer, USE_CONCAT, is introduced, which allows the user to suggest the use of indexes over a Full Table Scan.

Consider the following:

Table "table1" has been analyzed with the compute option and the OPTIMIZER_GOAL parameter has been set to CHOOSE in each case.

Full Table Scan cost = 3

Index Scan Cost = 1 (1 for each index access)

Table "TABLE1" has an index "IND1" on column "COL1".

Remember that the optimizer converts IN predicates into a list of ORs.
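The equivalence of the two forms can be checked with a short script. It uses Python's built-in sqlite3 module as a stand-in database rather than Oracle, the helper in_list_as_ors is hypothetical, and the six sample rows are made up; it demonstrates only that the IN form and the OR form select the same rows.

```python
import sqlite3


def in_list_as_ors(column, values):
    """Spell out 'column IN (v1, v2, ...)' as a chain of OR'd equalities."""
    return " OR ".join(f"{column} = {int(value)}" for value in values)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(n,) for n in range(1, 7)])

or_predicate = in_list_as_ors("col1", [1, 2, 3])
rows_in = conn.execute(
    "SELECT col1 FROM table1 WHERE col1 IN (1,2,3) ORDER BY col1"
).fetchall()
rows_or = conn.execute(
    f"SELECT col1 FROM table1 WHERE {or_predicate} ORDER BY col1"
).fetchall()

print(or_predicate)
print(rows_in == rows_or)
```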

Examples:

SELECT * FROM table1 WHERE col1 IN (1,2,3)

Query Plan
------------------------------------------------------------------------
1.1 SELECT STATEMENT   Cost = 3
  2.1 CONCATENATION
    3.1 INDEX RANGE SCAN IND1
    3.2 INDEX RANGE SCAN IND1
    3.3 INDEX RANGE SCAN IND1

Explanation:

The optimizer chooses 3 index scans to retrieve the data when the cost is the same as a Full Table Scan.

SELECT * FROM table1 WHERE col1 IN (1,2,3,4)   -- becomes 4 ORs; cost is 4

Query Plan
-----------------------------------------------------------------------
1.1 SELECT STATEMENT   Cost = 3
  2.1 TABLE ACCESS FULL TABLE1

Explanation: The optimizer chooses a Full Table Scan in preference to index scans as the cost is lower (FTS cost = 3, cost of 4 Index scans = 4).

SELECT /*+ INDEX(table1 ind1) */ * FROM table1 WHERE col1 IN (1,2,3,4)

Query Plan
------------------------------------------------------------------------
1.1 SELECT STATEMENT   Cost = 3
  2.1 TABLE ACCESS FULL TABLE1

Explanation: The optimizer chooses a Full Table Scan over index scans of multiple OR'd values (or an extended IN list) if the total cost of the multiple OR predicates is greater than the Full Table Scan cost. This behavior is acceptable unless large numbers of records are involved, which can make the cost of a Full Table Scan prohibitive. The index hint (/*+ INDEX(table1 ind1) */) does not help in this case because it is applied to each of the OR predicates rather than to the query as a whole. When the optimizer adds together the costs of all the OR predicates, it finds that the total is greater than the cost of a Full Table Scan and chooses the Full Table Scan instead, as in query 2. The hint is not being ignored.

The desired use of indexes can be forced in 7.2.2 using the USE_CONCAT hint which is enabled using event 10078. This event can be enabled at session or instance level as follows:

Session level:

ALTER SESSION SET EVENTS '10078 trace name context forever, level 99';

Instance Level (in the initialization parameter file):

EVENT="10078 trace name context forever, level 1"

With this event enabled the USE_CONCAT hint may be used, which results in the following explain plan:

SELECT /*+ USE_CONCAT */ * FROM table1 WHERE col1 IN (1,2,3,4);

Query Plan
------------------------------------------------------------------------
1.4 SELECT STATEMENT   Cost = 4
  2.1 CONCATENATION
    3.1 INDEX RANGE SCAN IND1
    3.2 INDEX RANGE SCAN IND1
    3.3 INDEX RANGE SCAN IND1
    3.4 INDEX RANGE SCAN IND1

The hint will be enabled permanently in 7.3 - it will not require that you set the 10078 event.
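The plan choices in the examples above can be reproduced with a toy cost model. This is a deliberately simplified sketch, not the optimizer's real costing: it takes the Full Table Scan cost of 3 and the per-value index scan cost of 1 from the example, assumes that a tie goes to the index scans (as in the first query), and models USE_CONCAT as simply bypassing the comparison. The function name choose_plan is hypothetical.

```python
FTS_COST = 3         # Full Table Scan cost from the example
INDEX_SCAN_COST = 1  # cost of one index range scan, one per IN-list value


def choose_plan(n_values, use_concat=False):
    """Return (operation, cost) for an IN list of n_values under the toy model."""
    index_total = n_values * INDEX_SCAN_COST
    # Assumption: on a tie the index scans win, as in the first example query.
    if use_concat or index_total <= FTS_COST:
        return ("CONCATENATION", index_total)
    return ("TABLE ACCESS FULL", FTS_COST)


print(choose_plan(3))                   # IN (1,2,3)
print(choose_plan(4))                   # IN (1,2,3,4), with or without the INDEX hint
print(choose_plan(4, use_concat=True))  # IN (1,2,3,4) with USE_CONCAT
```

The INDEX hint does not appear as a parameter because, as explained above, it applies to each OR predicate and does not change the comparison of the summed index cost against the Full Table Scan cost.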
