01.06 Oracle面對“數據傾斜列使用綁定變量”場景的解決方案

原文鏈接:https://mp.weixin.qq.com/s/4eXFVWBXynAQvAdtvr-B9w (複製至瀏覽器,即可查看)


1. 背景知識介紹

我們知道,Oracle在傳統的OLTP(在線事務處理)類系統中,強烈推薦使用綁定變量,這樣可以有效的減少硬解析從而增加系統的併發處理能力。甚至在有些老舊系統,由於在開始開發階段缺乏認識沒有使用到綁定變量,後期併發量增長且無法改造程序時,運維DBA還會不得已去設置cursor_sharing=force來強制使用系統的綁定變量(這是一個萬不得已的方案,並不是最佳實踐)。

雖然使用綁定變量給OLTP系統帶來了巨大的好處,但也同時帶來一些棘手的問題,最典型的就是由於SQL文本中包含綁定變量,優化器無法知道綁定變量代表的具體值,只能使用默認的可選擇率,這就可能導致由於無法準確判斷值的可選擇率而造成選擇錯誤的執行計劃。Oracle在9i時代就有了針對這個問題的解決方案,即綁定變量窺探(bind peeking)特性。開啟該特性的情況下,當遇到有綁定變量的SQL,在其第一次硬解析時,優化器會窺探真實的值從而準確判斷可選擇率(selectivity),最終選擇正確的執行計劃。可是該特性同時又引入另一個棘手的問題,因為在第一次硬解析之後就都是軟/軟軟解析,所以也就不會再次窺探綁定變量的真實值,而如果該值所在字段本身數值比例就分佈不均,就極可能導致性能問題(尤其是如果第一次窺探的值代表了少數情況,那問題就會更加嚴重),所以一直以來,雖然Oracle默認是開啟這個特性的,但很多的客戶生產環境最佳實踐都將這個特性給關閉了。

直到Oracle 11g的時代,才推出了acs(adaptive_cursor_sharing)特性,配合bind peeking才算真正意義上解決了這個問題。不過也不夠完美,因為acs特性本身也的確會增加額外的硬解析,且會導致child cursor增多,從而軟解析掃描chain的時間變長,同時對shared pool空間需求也增加,且早期bug較多,即使Oracle默認也是開啟這個特性的,很多客戶生產環境也是將其關閉的。

在這種背景下,諮詢了公司SQL優化專家趙勇,建議是當遇到在數據傾斜的列上使用綁定變量的情況,應該及時與開發溝通,能否在這類數據分佈嚴重傾斜的列上不用綁定變量,若該列上的值很多,不用綁定變量可能導致大量的硬解析的話,還可在應用發出SQL前,先判斷其傳入的值,是否是非典型值,若不是的話,使用非綁定變量的SQL;若是典型值,則使用綁定變量的語句。

如果是不能改應用的情況呢?我目前能想到的是要麼犧牲非典型值的執行效率(防止非典型值先被窺探導致更嚴重的性能後果,可以按典型值的執行計劃綁定);要麼是乾脆嘗試同時打開bind peeking和acs特性,實際測試驗證能否解決問題同時不引起其他性能問題(如果是已經關閉這些特性的生產系統,開啟還是要慎重測試後決定)。


2.構造測試用例

下面構造一個簡單的測試用例來說明Oracle在這種場景下提供的解決方案(bind peeking + acs):

--建表T_SKEW,構造出嚴重的數據傾斜:

create table jingyu.t_skew as select * from dba_objects;

create index jingyu.idx_t_skew on jingyu.t_skew(object_id);

update jingyu.t_skew set object_id=3 where object_id>3;

commit;

--查看數據列OBJECT_ID的傾斜程度:

select object_id, count(*) from jingyu.t_skew group by object_id;

OBJECT_ID COUNT(*)

---------- ----------

2 1

3 86412

--收集統計信息:

exec dbms_stats.gather_table_stats('JINGYU','T_SKEW');

--查看列OBJECT_ID的直方圖信息:

select owner, table_name, column_name, histogram from dba_tab_col_statistics where table_name = 'T_SKEW' and column_name = 'OBJECT_ID';

OWNER Name Name HISTOGRAM

------------------------------------------------------------ --------------- ------------------------- ------------------------------

JINGYU T_SKEW OBJECT_ID FREQUENCY

使用MOS:SCRIPT - Select to show Optimizer Statistics for CBO (文檔 ID 31412.1) 提供的腳本查詢信息:

SQL> @sosi

SQL> set echo off

Please enter Name of Table Owner (Null = SYS): jingyu

Please enter Table Name to show Statistics for: t_skew

***********

Table Level

***********

Table Number Empty Average Chain Average Global User Sample Date

Name of Rows Blocks Blocks Space Count Row Len Stats Stats Size MM-DD-YYYY

--------------- ------------------ -------- ------------ ------- -------- ------- ------ ------ ------------------ ----------

T_SKEW 86,413 1,262 0 0 0 96 YES NO 86,413 08-26-2019

Column Column Distinct Number Number Global User Sample Date

Name Details Values Density Buckets Nulls Stats Stats Size MM-DD-YYYY

------------------------- ------------------------ ------------ ------- ------- ---------- ------ ------ ------------------ ----------

OWNER VARCHAR2(30) 27 0 1 0 YES NO 86,413 08-26-2019

OBJECT_NAME VARCHAR2(128) 51,864 0 1 0 YES NO 86,413 08-26-2019

SUBOBJECT_NAME VARCHAR2(30) 87 0 1 86,152 YES NO 261 08-26-2019

OBJECT_ID NUMBER(22) 2 0 2 0 YES NO 5,389 08-26-2019

DATA_OBJECT_ID NUMBER(22) 8,670 0 1 77,703 YES NO 8,710 08-26-2019

OBJECT_TYPE VARCHAR2(19) 44 0 1 0 YES NO 86,413 08-26-2019

CREATED DATE 904 0 1 0 YES NO 86,413 08-26-2019

LAST_DDL_TIME DATE 995 0 1 0 YES NO 86,413 08-26-2019

TIMESTAMP VARCHAR2(19) 1,036 0 1 0 YES NO 86,413 08-26-2019

STATUS VARCHAR2(7) 2 1 1 0 YES NO 86,413 08-26-2019

TEMPORARY VARCHAR2(1) 2 1 1 0 YES NO 86,413 08-26-2019

GENERATED VARCHAR2(1) 2 1 1 0 YES NO 86,413 08-26-2019

SECONDARY VARCHAR2(1) 2 1 1 0 YES NO 86,413 08-26-2019

NAMESPACE NUMBER(22) 20 0 1 0 YES NO 86,413 08-26-2019

EDITION_NAME VARCHAR2(30) 0 0 0 86,413 YES NO 08-26-2019

B Average Average

Index Tree Leaf Distinct Number Leaf Blocks Data Blocks Cluster Global User Sample Date

Name Unique Level Blks Keys of Rows Per Key Per Key Factor Stats Stats Size MM-DD-YYYY

--------------- --------- ----- ---- -------------- ------------------ ----------- ----------- ------------ ------ ------ ------------------ ----------

IDX_T_SKEW NONUNIQUE 1 298 2 86,413 149 617 1,234 YES NO 86,413 08-26-2019

Index Column Col Column

Name Name Pos Details

--------------- ------------------------- ---- ------------------------

IDX_T_SKEW OBJECT_ID 1 NUMBER(22)

***************

Partition Level

***************

***************

SubPartition Level

***************

SQL>


3. 場景測試

3.1 首先確認bind_peeking和acs都是開啟狀態

--查詢隱藏參數:

set linesize 333

col name for a35

col description for a66

col value for a30

SELECT i.ksppinm name,

i.ksppdesc description,

CV.ksppstvl VALUE

FROM sys.x$ksppi i, sys.x$ksppcv CV

WHERE i.inst_id = USERENV ('Instance')

AND CV.inst_id = USERENV ('Instance')

AND i.indx = CV.indx

AND i.ksppinm LIKE '%¶m%'

ORDER BY REPLACE (i.ksppinm, '_', '');

--相關隱藏參數的默認值(表示bind_peeking和acs都是開啟的):

NAME DESCRIPTION VALUE

----------------------------------- ------------------------------------------------------------------ ------------------------------

_optim_peek_user_binds enable peeking of user binds TRUE

_optimizer_adaptive_cursor_sharing optimizer adaptive cursor sharing TRUE

_optimizer_extended_cursor_sharing optimizer extended cursor sharing UDO

_optimizer_extended_cursor_sharing_ optimizer extended cursor sharing for relational operators SIMPLE

rel

3.2 場景測試用例和測試結果

--1)場景測試用例

alter session set current_schema=jingyu;

alter session set statistics_level=all;

set lines 200 pages 200

var v1 number;

exec :v1 := 2;

select count(*) from t_skew where object_id = :v1;

select * from table(dbms_xplan.display_cursor(null,null,'allstats'));

exec :v1 := 3;

select count(*) from t_skew where object_id = :v1;

select * from table(dbms_xplan.display_cursor(null,null,'allstats'));

select count(*) from t_skew where object_id = :v1;

select * from table(dbms_xplan.display_cursor(null,null,'allstats'));

--2)場景測試結果

SQL> alter system flush shared_pool;

SQL> alter session set current_schema=jingyu;

SQL> alter session set statistics_level=all;

SQL> set lines 200 pages 200

SQL>

--綁定變量值為2,第一次執行,採用INDEX RANGE SCAN的執行計劃,Plan hash value: 3167530345:

SQL> var v1 number;

SQL> exec :v1 := 2;

SQL> select count(*) from t_skew where object_id = :v1;

COUNT(*)

----------

1

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats'));

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID 7mz2mhz0nq92n, child number 0

-------------------------------------

select count(*) from t_skew where object_id = :v1

Plan hash value: 3167530345

------------------------------------------------------------------------------------------

| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |

------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 2 |

| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 2 |

|* 2 | INDEX RANGE SCAN| IDX_T_SKEW | 1 | 16 | 1 |00:00:00.01 | 2 |

------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_ID"=:V1)

--綁定變量值為3,第一次執行,沿用INDEX RANGE SCAN的執行計劃,Plan hash value: 3167530345:

SQL> exec :v1 := 3;

SQL> select count(*) from t_skew where object_id = :v1;

COUNT(*)

----------

86412

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats'));

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID 7mz2mhz0nq92n, child number 0

-------------------------------------

select count(*) from t_skew where object_id = :v1

Plan hash value: 3167530345

------------------------------------------------------------------------------------------

| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |

------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 2 | | 2 |00:00:00.10 | 301 |

| 1 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:00.10 | 301 |

|* 2 | INDEX RANGE SCAN| IDX_T_SKEW | 2 | 16 | 86413 |00:00:00.06 | 301 |

------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("OBJECT_ID"=:V1)

--綁定變量值為3,第二次執行,變為INDEX FAST FULL SCAN的執行計劃,Plan hash value: 2333720604:

SQL> select count(*) from t_skew where object_id = :v1;

COUNT(*)

----------

86412

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats'));

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID 7mz2mhz0nq92n, child number 1

-------------------------------------

select count(*) from t_skew where object_id = :v1

Plan hash value: 2333720604

----------------------------------------------------------------------------------------------

| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |

----------------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.07 | 502 |

| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.07 | 502 |

|* 2 | INDEX FAST FULL SCAN| IDX_T_SKEW | 1 | 86389 | 86412 |00:00:00.04 | 502 |

----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("OBJECT_ID"=:V1)

SQL>

可以看到,當第二次執行綁定變量值為3的SQL時,執行計劃自適應調整了。

3.3 場景測試深入分析

You can use the V$ views for adaptive cursor sharing to see selectivity ranges, cursor information (such as whether a cursor is bind-aware or bind-sensitive), and execution statistics:

V$SQL shows whether a cursor is bind-sensitive or bind-aware

V$SQL_CS_HISTOGRAM shows the distribution of the execution count across a three-bucket execution history histogram

V$SQL_CS_SELECTIVITY shows the selectivity ranges stored for every predicate containing a bind variable if the selectivity was used to check cursor sharing

V$SQL_CS_STATISTICS summarizes the information that the optimizer uses to determine whether to mark a cursor bind-aware.

通過v$sql查看SQL(SQL_ID = '7mz2mhz0nq92n')的child_number, executions, buffer_gets, bind-sensitive, bind-aware, is_shareable信息:

SQL> SELECT CHILD_NUMBER, EXECUTIONS, BUFFER_GETS, IS_BIND_SENSITIVE AS "BS",

2 IS_BIND_AWARE AS "BA", IS_SHAREABLE AS "SH", PLAN_HASH_VALUE

3 FROM V$SQL

4 WHERE SQL_ID = '7mz2mhz0nq92n';

CHILD_NUMBER EXECUTIONS BUFFER_GETS BS BA SH PLAN_HASH_VALUE

------------ ---------- ----------- -- -- -- ---------------

0 2 348 Y N N 3167530345

1 1 502 Y Y Y 2333720604

--再次分別執行綁定變量值為3和2的SQL:

SQL> select count(*) from t_skew where object_id = :v1;

COUNT(*)

----------

86412

SQL> exec :v1 := 2;

SQL> select count(*) from t_skew where object_id = :v1;

COUNT(*)

----------

1

--再次查詢v$sql

CHILD_NUMBER EXECUTIONS BUFFER_GETS BS BA SH PLAN_HASH_VALUE

------------ ---------- ----------- -- -- -- ---------------

0 2 348 Y N N 3167530345

1 2 1004 Y Y Y 2333720604

2 1 2 Y Y Y 3167530345

可以看到目前該SQL的parent cursor下掛了3個child_number(0和1和2,其中1和2的SH值為Y,意思為可共享;0的SH值為N,意思為不可共享)。

通過v$sql_cs_*查詢acs的相關信息:

--V$SQL_CS_HISTOGRAM

SQL> select * from V$SQL_CS_HISTOGRAM where sql_id = '7mz2mhz0nq92n';

ADDRESS HASH_VALUE SQL_ID CHILD_NUMBER BUCKET_ID COUNT

---------------- ---------- -------------------------- ------------ ---------- ----------

0000000087F34700 3242927188 7mz2mhz0nq92n 2 0 1

0000000087F34700 3242927188 7mz2mhz0nq92n 2 1 0

0000000087F34700 3242927188 7mz2mhz0nq92n 2 2 0

0000000087F34700 3242927188 7mz2mhz0nq92n 1 0 0

0000000087F34700 3242927188 7mz2mhz0nq92n 1 1 2

0000000087F34700 3242927188 7mz2mhz0nq92n 1 2 0

0000000087F34700 3242927188 7mz2mhz0nq92n 0 0 1

0000000087F34700 3242927188 7mz2mhz0nq92n 0 1 1

0000000087F34700 3242927188 7mz2mhz0nq92n 0 2 0

--V$SQL_CS_SELECTIVITY

SQL> col PREDICATE for a30

SQL> select * from V$SQL_CS_SELECTIVITY where sql_id = '7mz2mhz0nq92n';

ADDRESS HASH_VALUE SQL_ID CHILD_NUMBER PREDICATE RANGE_ID LOW HIGH

---------------- ---------- -------------------------- ------------ ------------------------------ ---------- -------------------- --------------------

0000000087F34700 3242927188 7mz2mhz0nq92n 2 =V1 0 0.000167 0.000204

0000000087F34700 3242927188 7mz2mhz0nq92n 1 =V1 0 0.899749 1.099694

SQL>

--V$SQL_CS_STATISTICS

SQL> select * from V$SQL_CS_STATISTICS where sql_id = '7mz2mhz0nq92n';

ADDRESS HASH_VALUE SQL_ID CHILD_NUMBER BIND_SET_HASH_VALUE PE EXECUTIONS ROWS_PROCESSED BUFFER_GETS CPU_TIME

---------------- ---------- -------------------------- ------------ ------------------- -- ---------- -------------- ----------- ----------

0000000087F34700 3242927188 7mz2mhz0nq92n 2 2064090006 Y 1 4 2 0

0000000087F34700 3242927188 7mz2mhz0nq92n 1 2706503459 Y 1 172826 502 0

0000000087F34700 3242927188 7mz2mhz0nq92n 0 2064090006 Y 1 4 49 0

SQL>


4.總結

4.1 清理某條SQL的執行計劃

--查詢SQL的ADDRESS和HASH_VALUE

SQL> select sql_id, ADDRESS, HASH_VALUE from v$sqlarea where sql_id = '7mz2mhz0nq92n';

SQL_ID ADDRESS HASH_VALUE

-------------------------- ---------------- ----------

7mz2mhz0nq92n 0000000087F34700 3242927188

--清理SQL的執行計劃

SQL> exec sys.DBMS_SHARED_POOL.PURGE('0000000087F34700,3242927188','C');

4.2 bind peeking和acs特性的關閉

--均為動態參數

--bind peeking(綁定變量窺探)

alter system set "_optim_peek_user_binds"=false;

--acs(adaptive cursor sharing)

alter system set "_optimizer_extended_cursor_sharing_rel"=NONE;

alter system set "_optimizer_extended_cursor_sharing"=NONE;

alter system set "_optimizer_adaptive_cursor_sharing"=false;

特別注意:如果bind peeking是關閉的,實際上acs也就不會起作用,比如我這裡只將_optim_peek_user_binds參數設置為false,再次按照3.2步驟重複同樣實驗,查詢結果如下,不會用到acs特性,即使我沒有顯示禁用掉acs對應的參數:

SQL> SELECT CHILD_NUMBER, EXECUTIONS, BUFFER_GETS, IS_BIND_SENSITIVE AS "BS",

2 IS_BIND_AWARE AS "BA", IS_SHAREABLE AS "SH", PLAN_HASH_VALUE

3 FROM V$SQL

4 WHERE SQL_ID = '7mz2mhz0nq92n';

CHILD_NUMBER EXECUTIONS BUFFER_GETS BS BA SH PLAN_HASH_VALUE

------------ ---------- ----------- -- -- -- ---------------

0 3 1506 N N Y 2333720604

--可以看到這3次執行執行計劃都是一樣的,因為受到OPT_PARAM('_optim_peek_user_binds' 'false')影響,採用了INDEX FAST FULL SCAN的執行計劃,Plan hash value: 2333720604:

SQL> select * from table(dbms_xplan.display_cursor('7mz2mhz0nq92n',0,'advanced'));

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID 7mz2mhz0nq92n, child number 0

-------------------------------------

select count(*) from t_skew where object_id = :v1

Plan hash value: 2333720604

------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

------------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | 82 (100)| |

| 1 | SORT AGGREGATE | | 1 | 3 | | |

|* 2 | INDEX FAST FULL SCAN| IDX_T_SKEW | 43207 | 126K| 82 (0)| 00:00:01 |

------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):

-------------------------------------------------------------

1 - SEL$1

2 - SEL$1 / T_SKEW@SEL$1

Outline Data

-------------

/*+

BEGIN_OUTLINE_DATA

IGNORE_OPTIM_EMBEDDED_HINTS

OPTIMIZER_FEATURES_ENABLE('11.2.0.4')

DB_VERSION('11.2.0.4')

OPT_PARAM('_optim_peek_user_binds' 'false')

ALL_ROWS

OUTLINE_LEAF(@"SEL$1")

INDEX_FFS(@"SEL$1" "T_SKEW"@"SEL$1" ("T_SKEW"."OBJECT_ID"))

END_OUTLINE_DATA

*/

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("OBJECT_ID"=:V1)

Column Projection Information (identified by operation id):

-----------------------------------------------------------

1 - (#keys=0) COUNT(*)[22]

所以在確認acs特性是否開啟時,同時也要查詢bind peek的設置情況。

原文:https://www.cnblogs.com/jyzhao/p/11415820.html


分享到:


相關文章: