Performance troubleshooting using Active Session History OUG Ireland Conference 2012 Marcin Przepiórowski



Intro: About me

• Oracle DBA / consultant / trainer since 2000

• Oracle ACE – since 2010

• Blogger - http://oracleprof.blogspot.com/


Agenda

• Performance tuning problems and goals

• DB Time is money

• Average Active Session

• Active Session History

• Case study


Performance tuning

• Define the metrics to measure: the best metrics relate to time and business activity

• Compare performance with the system baseline and business goals

[Figure: diagram labelled IT and USER]


Tuning goals

• 99% of SQL/business transaction response times must be at most X seconds; the rest must be at most Y seconds

• For 99% of executions, X rows/business operations must be processed within Y seconds
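A goal of this shape can be checked mechanically against measured response times. The following is a minimal sketch, not part of the original talk: the function name `meets_goal`, its parameters and the sample data are all hypothetical.

```python
def meets_goal(response_times, x_seconds, y_seconds, fraction=0.99):
    """Check a goal of the form: `fraction` of calls finish within x_seconds,
    and no call takes longer than y_seconds. All names here are hypothetical."""
    if not response_times:
        raise ValueError("no measurements")
    within_x = sum(1 for t in response_times if t <= x_seconds)
    enough_fast_calls = within_x >= fraction * len(response_times)
    return enough_fast_calls and max(response_times) <= y_seconds


# Hypothetical measurements: 99 fast calls and one slow call.
times = [0.2] * 99 + [1.5]
print(meets_goal(times, x_seconds=0.5, y_seconds=2.0))  # True
print(meets_goal(times, x_seconds=0.5, y_seconds=1.0))  # False
```

The second call fails only because the single slow execution exceeds the Y limit, which is the part of the goal that a plain average would hide.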


DB Time is money

DB Time

– Oracle doc “Database time represents the total time spent in database calls, and is an indicator of the total instance workload”

– Sum of CPU time and non-idle wait time (IO and others)

– ASH sample count ≈ DB Time in seconds, for 1-second samples (proof in “DB Time Oracle Performance Tuning: Theory and Practice” by Graham Wood, John Beresniewicz)


DB Time

• DB Time increases when system load increases

– number of active sessions increases

– number of calls increases

• DB Time increases when system throughput/performance decreases

– IO time increases

– non-idle event time increases (locks, network problems)


Average Active Session

Average Active Session

AAS = ∆DB Time / Elapsed clock time

Average Active Session using ASH samples

AAS = ∑ ASH samples (∆t = 1 s) / Elapsed clock time

For more detail, see Appendix A of “Average Active Sessions: The Magic Metric?” by John Beresniewicz


Average Active Session

AAS = (DB Time(t1) - DB Time(t0)) / (t1 - t0)

[Figure: time axis with the interval from t0 to t1 marked]


Average Active Session

AAS = ∑ ASH samples / (t1 - t0)

No. of samples at each sampling point between t0 and t1: 2 2 0 3 3 2 2 2
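The two AAS formulas can be worked through with the sample counts from this slide. This is a minimal sketch that assumes one sampling point per second; the function names and the DB Time values (100 s and 116 s) are hypothetical and only chosen so that both formulas give the same answer.

```python
def aas_from_db_time(db_time_t0, db_time_t1, t0, t1):
    """AAS = (DB Time(t1) - DB Time(t0)) / (t1 - t0), all values in seconds."""
    return (db_time_t1 - db_time_t0) / (t1 - t0)


def aas_from_samples(samples_per_point):
    """AAS = sum of ASH samples / elapsed seconds, assuming one sampling point per second."""
    return sum(samples_per_point) / len(samples_per_point)


# Sample counts shown on the slide: 16 samples over 8 seconds.
samples = [2, 2, 0, 3, 3, 2, 2, 2]
print(aas_from_samples(samples))             # 2.0

# Hypothetical DB Time readings: 16 s of DB Time accumulated over the same 8 s.
print(aas_from_db_time(100.0, 116.0, 0, 8))  # 2.0
```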


AAS using ASH

Beware of using a small number of samples: the results are poor and misleading. More samples give better results.

[Figure: DB Time compared with ASH samples on a time axis from t0 to t1]


Average Active Session

• The upper boundary of total AAS equals the number of sessions connected to the instance

• AAS ≈ 0: the database is idle

• AAS < number of CPUs: no system bottleneck

• AAS >> number of CPUs: database bottleneck
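The rules of thumb above can be expressed as a small classifier. This is only a sketch: the slide gives no numeric meaning for "≈ 0" or ">>", so `idle_threshold` and `overload_factor` are hypothetical cut-offs, as is the 8-CPU host in the example.

```python
def interpret_aas(aas, cpu_count, idle_threshold=0.1, overload_factor=2.0):
    """Map an AAS value to the slide's rules of thumb.
    idle_threshold and overload_factor are hypothetical cut-offs, not Oracle values."""
    if aas <= idle_threshold:
        return "idle"
    if aas < cpu_count:
        return "no system bottleneck"
    if aas >= overload_factor * cpu_count:
        return "database bottleneck"
    return "at or above CPU count: investigate"


# Hypothetical 8-CPU host.
for value in (0.02, 5.25, 10.0, 47.42):
    print(value, "->", interpret_aas(value, cpu_count=8))
```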


Average Active Session - CPU

• Upper boundary for the CPU class: the number of CPU cores

• If the CPU class is close to its upper boundary, there is a CPU bottleneck in the system


Average Active Session - IO

• Upper boundary for IO is derived from the maximum system IO/s

• Single IO operation time

– single block read time (sbrt)

– multiple block read time (mbrt)

• Number of IO requests in one ASH sample:

#IO requests / sample ≈ 1 s / max(sbrt, mbrt)


Average Active Session - IO

System max IO/s = 8000 IO/s

Avg IO time = 5 ms (0.005 s)

Max #IO per ASH sample = 1 / 0.005 = 200

Upper boundary for User/System IO class is 8000 / 200 = 40
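The arithmetic on this slide can be reproduced in a few lines. This is a sketch only; the function name `io_aas_upper_bound` is hypothetical, and the inputs are the slide's own figures.

```python
def io_aas_upper_bound(max_iops, avg_io_time_s, sample_interval_s=1.0):
    """Upper AAS boundary for the IO wait classes.
    One session can complete about sample_interval_s / avg_io_time_s IOs per sample."""
    ios_per_sample = sample_interval_s / avg_io_time_s
    return max_iops / ios_per_sample


# Figures from the slide: 8000 IO/s and 5 ms per IO.
print(round(io_aas_upper_bound(8000, 0.005), 2))  # 40.0
```

If the IO wait classes approach this value, the storage is probably close to its maximum throughput.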


Active Session History

• Active Session History is a system-wide activity trace

• Active sessions are sampled and kept in memory

– Active means not idle: waiting on a non-idle event or on CPU

– Sampled every second

– Inserted into a circular buffer in the SGA

– The MMON Lite (MMNL) process takes care of it


Active Session History

• Introduced in the first release of Oracle 10g

• Enhanced in every subsequent version: the number of columns/metrics has increased

• Together with AWR it can provide historical information: 1 in 10 samples is stored in persistent tables


Active Session History

• 45 columns in 10.2.0.4

• 66 columns in 11.1.0.7

• 93 columns in 11.2.0.1

Important new columns:

- sql_plan_line_id – 11g

- machine – 11gR2

- delta columns – 11gR2


Active Session History

V$ACTIVE_SESSION_HISTORY

Main view, based on the in-memory X$ASH table. The circular buffer size ranges from 1 to 128 MB, about 2 MB per CPU.

Flushed to disk:

- every hour, with each AWR snapshot

- when the buffer is 2/3 full


Active Session History

DBA_HIST_ACTIVE_SESS_HISTORY

Flushed history of ASH – persistent table

– 1/10 of sampled data

– Partitioned for easier purging

– Part of AWR system


Using ASH – ON CPU

Column name                        Value
---------------------------------  ------------------------------------------
SESSION_STATE                      ON CPU
WAIT_TIME                          Non-zero value
EVENT, P1, P2, P3, CURRENT_OBJ#,   May not be cleared from the previous event
CURRENT_FILE#, CURRENT_BLOCK#


Using ASH - WAITING

Column name     Value
--------------  -----------------------------------------------------------
SESSION_STATE   WAITING
WAIT_TIME       0
TIME_WAITED     Only the last sample is updated/fixed after the event ends;
                DO NOT use in ASH calculations


Using ASH - WAITING

SQL> select sample_time, event, time_waited
       from v$active_session_history
      where session_id = 170
        and sql_id = '6548zp29zsqgj'
      order by 1;

SAMPLE_TIME      EVENT                          TIME_WAITED
---------------  -----------------------------  -----------
08.39.59.483 AM  enq: TX - row lock contention            0
08.40.00.483 AM  enq: TX - row lock contention            0
08.40.01.483 AM  enq: TX - row lock contention            0
08.40.02.483 AM  enq: TX - row lock contention            0
. . .
08.40.18.503 AM  enq: TX - row lock contention            0
08.40.19.503 AM  enq: TX - row lock contention     21295426
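The output above shows why counting samples is safer than aggregating TIME_WAITED. The sketch below rebuilds that lock wait under one assumption that the slide does not show in full: one sample per second from 08:39:59 to 08:40:19, i.e. 21 samples. TIME_WAITED is in microseconds.

```python
# Assumed reconstruction of the elided rows: 21 one-second samples of the same wait,
# where only the last sample carries TIME_WAITED (microseconds).
samples = [{"event": "enq: TX - row lock contention", "time_waited": 0} for _ in range(20)]
samples.append({"event": "enq: TX - row lock contention", "time_waited": 21295426})

# count(*) times the 1-second sample interval estimates the time spent waiting.
estimated_wait_s = len(samples) * 1
actual_wait_s = samples[-1]["time_waited"] / 1e6
# avg(time_waited) is dominated by the zero-valued samples and is meaningless.
avg_time_waited_s = sum(s["time_waited"] for s in samples) / len(samples) / 1e6

print(estimated_wait_s)             # 21
print(round(actual_wait_s, 2))      # 21.3
print(round(avg_time_waited_s, 2))  # 1.01
```

The sample count lands within a second of the real wait, while the average of TIME_WAITED suggests a wait of about one second.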


Using ASH - WAITING

• EVENTS – 1142 in 11.2.0.2

• WAIT CLASS

Concurrency      Cluster
User I/O         Application
System I/O       Queuing
Administrative   Scheduler
Other            Network
Configuration    Commit


ASH - Math

• Use count(*) to calculate wait or ON CPU time

• Group by session_id, sql_id, client_id, etc.

• Because ASH is sampled, do not use the time_waited column in calculations:

– sum(time_waited)

– avg(time_waited)

– etc.


ASH - Math

Example query: profile of SQL statements from ASH

select sql_id,
       event,
       count(*) cnt
  from v$active_session_history
 group by sql_id, event;
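The same count-based profile can be sketched outside the database. The rows below are hypothetical stand-ins for ASH samples; sessions that are ON CPU are labelled as such instead of using their EVENT value, which mirrors the decode() used in the load profile query later in the deck.

```python
from collections import Counter

# Hypothetical ASH rows; each row is one 1-second sample of one active session.
ash_rows = [
    {"sql_id": "a1", "session_state": "WAITING", "event": "db file sequential read"},
    {"sql_id": "a1", "session_state": "WAITING", "event": "db file sequential read"},
    {"sql_id": "a1", "session_state": "ON CPU", "event": None},
    {"sql_id": "b2", "session_state": "WAITING", "event": "enq: TX - row lock contention"},
]


def activity(row):
    """Label a sample with its wait event, or 'ON CPU' when it was not waiting."""
    return row["event"] if row["session_state"] == "WAITING" else "ON CPU"


# Equivalent of: select sql_id, event, count(*) ... group by sql_id, event
profile = Counter((row["sql_id"], activity(row)) for row in ash_rows)
for (sql_id, event), cnt in profile.most_common():
    print(sql_id, event, cnt)
```

Each count is roughly the number of seconds the statement spent in that state, because every in-memory sample stands for one second.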


Using ASH - time

• sample_time: when the sample was taken

• Using samples across time

– in-memory: 1-second samples

select count(*)
  from v$active_session_history ...

– on-disk: 10-second samples

select count(*) * 10
  from dba_hist_active_sess_history ...
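The scaling rule can be written down once and reused. This is a minimal sketch; the dictionary and function names are hypothetical, while the intervals are the ones given on the slide.

```python
# Seconds of DB Time represented by one sample in each source (from the slide).
SAMPLE_INTERVAL_S = {
    "v$active_session_history": 1,       # in-memory: every sample is kept
    "dba_hist_active_sess_history": 10,  # on-disk: 1 in 10 samples is kept
}


def db_time_seconds(sample_count, source):
    """Estimate DB Time in seconds from a count(*) over the given ASH source."""
    return sample_count * SAMPLE_INTERVAL_S[source]


print(db_time_seconds(120, "v$active_session_history"))      # 120
print(db_time_seconds(120, "dba_hist_active_sess_history"))  # 1200
```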


Using ASH - Drilldown

• session_id

• user_id

• program

• module

• action

• machine

• client_id

• sql_id

• pl/sql

• blocking_session

• object

• file

• etc.


Top SQL

select sql_id,
       round((count(*) / sum(count(*)) over ()) * 100, 2) ActPct
  from v$active_session_history
 where sql_id is not null
   and sample_time > sysdate - 5/1440   -- last 5 minutes
 group by sql_id
 order by ActPct desc;


Top Session

select session_id,
       round((count(*) / sum(count(*)) over ()) * 100, 2) ActPct
  from v$active_session_history
 where sample_time > sysdate - 5/1440   -- last 5 minutes
 group by session_id
 order by ActPct desc;


Top program

select program,
       round((count(*) / sum(count(*)) over ()) * 100, 2) ActPct
  from v$active_session_history
 where sample_time > sysdate - 5/1440   -- last 5 minutes
 group by program
 order by ActPct desc;
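The three "Top ..." queries share one pattern: count samples per dimension and divide by the total. The sketch below shows that pattern on hypothetical rows; `top_activity` is an illustrative name, and unlike the session and program queries it skips rows where the chosen key is missing.

```python
from collections import Counter


def top_activity(rows, key, limit=5):
    """Return (value, percent of samples) pairs for one drilldown dimension,
    highest share first. Rows without a value for `key` are skipped."""
    counts = Counter(row[key] for row in rows if row.get(key) is not None)
    total = sum(counts.values())
    return [(value, round(cnt / total * 100, 2)) for value, cnt in counts.most_common(limit)]


# Hypothetical samples from the last five minutes.
rows = [
    {"session_id": 170, "program": "sqlplus", "sql_id": "a1"},
    {"session_id": 170, "program": "sqlplus", "sql_id": "a1"},
    {"session_id": 170, "program": "sqlplus", "sql_id": None},
    {"session_id": 42, "program": "batch", "sql_id": "b2"},
]
print(top_activity(rows, "session_id"))  # [(170, 75.0), (42, 25.0)]
print(top_activity(rows, "sql_id"))      # [('a1', 66.67), ('b2', 33.33)]
```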


Blocking session

select session_id, event, blocking_session
  from v$active_session_history
 where blocking_session is not null
   and sample_time > sysdate - 5/1440;


OEM 11g


OEM 12c – ASH Analytics


ASH availability

• Oracle 10g onwards

• Oracle Enterprise Edition only

• Diagnostics Pack and Tuning Pack required

• Accessibility:

– Oracle Enterprise Manager – Performance tab

– Text / HTML reports – ashrpt.sql

– V$ and DBA_HIST views


Other options

• 3rd party products – DB Optimizer, I3, Spotlight

• Free solution – Simulating-ASH

http://ashmasters.com/

https://sourceforge.net/projects/orasash

https://github.com/pioro/orasash

• Snapper v3.0 by Tanel Poder – an interactive ASH tool with statistics


Case study

• Average Active Sessions vs. system throughput

• Is “big stuff” always a problem?

• Do I need to investigate the “big stuff”?

• Compare it with throughput first: DB Time increases when system load increases


Case study

[Figure: two charts captioned “Average transactions/s: 80.23” and “Average transactions/s: 98.18”]


Case Study

• Average Active Session – system load

• Load profile

• SQL plan flip

• SQL query “profile”


AAS – system load

-- Hourly AAS from AWR. Each on-disk sample stands for 10 s,
-- so count(*) / 360 = count(*) * 10 s / 3600 s.
select mtime,
       round(sum(c1),2) AAS_WAIT,
       round(sum(c2),2) AAS_CPU,
       round(sum(cnt),2) AAS
  from (select to_char(sample_time,'YYYY-MM-DD HH24') mtime,
               decode(session_state,'WAITING',count(*),0)/360 c1,
               decode(session_state,'ON CPU',count(*),0)/360 c2,
               count(*)/360 cnt
          from dba_hist_active_sess_history
         group by to_char(sample_time,'YYYY-MM-DD HH24'),
                  session_state)
 group by mtime
 order by mtime;
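The divisor 360 in the query follows from the 10-second on-disk sample interval. A minimal sketch of that arithmetic, with a hypothetical function name and hypothetical sample counts:

```python
def hourly_aas(sample_count, sample_interval_s=10, window_s=3600):
    """AAS for one hour of dba_hist_active_sess_history samples.
    Each sample stands for sample_interval_s seconds of DB Time."""
    return sample_count * sample_interval_s / window_s


# Hypothetical counts: 360 samples in an hour mean one session was active on average.
print(hourly_aas(360))   # 1.0
print(hourly_aas(3600))  # 10.0
```

With the defaults this is exactly count(*) / 360; passing a different `window_s` adapts the same formula to other grouping intervals.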


AAS – system load

MTIME          AAS_WAIT  AAS_CPU     AAS
-------------  --------  -------  ------
2012-03-03 08      6.36     4.23   10.58
2012-03-03 09     17.91     6.64   24.54
2012-03-03 10     33.10     8.63   41.73
2012-03-03 11     29.34     8.90   38.24
2012-03-03 12     29.76     8.76   38.52
2012-03-03 13     33.30     9.59   42.89
2012-03-03 14     38.02     9.40   47.42
2012-03-03 15     21.05     6.11   27.16
2012-03-03 16      3.81     1.63    5.44
2012-03-03 17      3.84     1.41    5.25


Load profile

select * from (
  select decode(session_state,'WAITING',event,'ON CPU') event,
         count(*) cnt,
         count(*) / (sum(count(*)) over ()) * 100 pct
    from dba_hist_active_sess_history
   where sample_time between X and Y
   group by decode(session_state,'WAITING',event,'ON CPU')
   order by cnt desc
) where rownum <= 5;


Load profile

EVENT                           CNT     PCT
------------------------------  ----  -----
enq: US - contention            8059  53.65
ON CPU                          3107  20.68
db file sequential read          645   4.29
latch: undo global data          536   3.57
enq: TX - row lock contention    446   2.97

EVENT                           CNT     PCT
------------------------------  ----  -----
ON CPU                          1347  38.38
db file sequential read          524  14.93
enq: US - contention             365  10.40
LNS wait on SENDREQ              215   6.13
LGWR-LNS wait on channel         201   5.73


SQL plan flip

select sql_id, count(*)
  from (select distinct sql_id, sql_plan_hash_value
          from dba_hist_active_sess_history
         where sql_opname <> 'INSERT'
           and user_id = X
           and sql_plan_hash_value <> 0)
 group by sql_id
having count(*) > 1
 order by 2;
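The query's logic is: collect the distinct plan hash values seen for each sql_id, ignore plan hash 0, and report statements with more than one plan. A minimal sketch of that logic on hypothetical rows (the sql_id and plan hash values for the first statement are the ones shown later in the deck, the second statement is made up):

```python
from collections import defaultdict


def find_plan_flips(rows):
    """rows: iterable of (sql_id, sql_plan_hash_value) pairs taken from ASH samples.
    Returns {sql_id: number of distinct plans} for statements with more than one plan."""
    plans = defaultdict(set)
    for sql_id, plan_hash in rows:
        if plan_hash != 0:  # plan hash 0 means no plan was recorded for the sample
            plans[sql_id].add(plan_hash)
    return {sql_id: len(hashes) for sql_id, hashes in plans.items() if len(hashes) > 1}


rows = [
    ("dyuz1k3bdr36s", 760792886),
    ("dyuz1k3bdr36s", 2913971644),
    ("dyuz1k3bdr36s", 760792886),
    ("hypothetical1", 111),
    ("hypothetical1", 0),
]
print(find_plan_flips(rows))  # {'dyuz1k3bdr36s': 2}
```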


SQL plan flip

SQL_ID           COUNT(*)
-------------  ----------
a7rf2v00g2y9t           2
8w2904am00nbq           2
dyuz1k3bdr36s           2
7upuf7wbk7nbf           2
da6wsxtsz2ftk           3
gur9jrbxvbvur           6

No sql_opname filter


SQL plan flip ?

select distinct sql_id, sql_plan_hash_value
  from dba_hist_active_sess_history
 where sql_id = 'gur9jrbxvbvur';

SQL_ID         SQL_PLAN_HASH_VALUE
-------------  -------------------
gur9jrbxvbvur           2263242137
gur9jrbxvbvur           3926490670

select sql_id
  from v$sql
 where plan_hash_value = 3926490670;

SQL_ID
-------------
62fbdqwqht3x1


Lazy clean up

select *
  from table(dbms_xplan.display_cursor('62fbdqwqht3x1', null));

SQL_ID 62fbdqwqht3x1, child number 0
-------------------------------------
select KVAL_SEQUENCE.nextval from dual

Plan hash value: 3926490670

----------------------
| 0 | SELECT STATEMENT |
| 1 | SEQUENCE |
| 2 | FAST DUAL |


SQL plan flip

select distinct sql_id, sql_plan_hash_value, sql_opname
  from dba_hist_active_sess_history
 where sql_id = 'dyuz1k3bdr36s';

SQL_ID         SQL_PLAN_HASH_VALUE  SQL_OPNAME
-------------  -------------------  ----------
dyuz1k3bdr36s            760792886  SELECT
dyuz1k3bdr36s           2913971644  SELECT


SQL plan flip

select to_char(sample_time, 'YYYY-MM-DD HH24') sample_time,
       sql_id, sql_plan_hash_value
  from dba_hist_active_sess_history
 where sql_id = 'dyuz1k3bdr36s'
 group by to_char(sample_time, 'YYYY-MM-DD HH24'),
          sql_id, sql_plan_hash_value
 order by sample_time;


SQL plan flip

SAMPLE_TIME    SQL_ID         SQL_PLAN_HASH_VALUE
-------------  -------------  -------------------
2012-03-01 08  dyuz1k3bdr36s            760792886
2012-03-01 09  dyuz1k3bdr36s            760792886
2012-03-01 10  dyuz1k3bdr36s            760792886
2012-03-01 10  dyuz1k3bdr36s           2913971644
2012-03-01 11  dyuz1k3bdr36s           2913971644
. . .
2012-03-02 11  dyuz1k3bdr36s           2913971644
2012-03-02 12  dyuz1k3bdr36s            760792886
2012-03-02 12  dyuz1k3bdr36s           2913971644
2012-03-02 13  dyuz1k3bdr36s            760792886
2012-03-02 14  dyuz1k3bdr36s            760792886
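In this listing, the hours in which two plans appear are the hours in which the plan changed. A short sketch that finds them, using only the rows visible above (the elided rows are left out, not guessed):

```python
from collections import defaultdict

# (hour, sql_plan_hash_value) pairs copied from the visible rows of the output.
timeline = [
    ("2012-03-01 08", 760792886),
    ("2012-03-01 09", 760792886),
    ("2012-03-01 10", 760792886),
    ("2012-03-01 10", 2913971644),
    ("2012-03-01 11", 2913971644),
    ("2012-03-02 11", 2913971644),
    ("2012-03-02 12", 760792886),
    ("2012-03-02 12", 2913971644),
    ("2012-03-02 13", 760792886),
    ("2012-03-02 14", 760792886),
]

plans_by_hour = defaultdict(set)
for hour, plan_hash in timeline:
    plans_by_hour[hour].add(plan_hash)

# Hours with more than one plan are the hours in which the plan flipped.
flip_hours = sorted(hour for hour, plans in plans_by_hour.items() if len(plans) > 1)
print(flip_hours)  # ['2012-03-01 10', '2012-03-02 12']
```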


SQL query “profile”

select sql_plan_hash_value, sql_exec_id,
       count(*) cnt
  from dba_hist_active_sess_history
 where sample_time between X and Y
   and sql_id = 'dyuz1k3bdr36s'
 group by sql_plan_hash_value, sql_exec_id
 order by sql_plan_hash_value, cnt;

SQL_PLAN_HASH_VALUE  SQL_EXEC_ID         CNT
-------------------  -----------  ----------
          760792886     17324679           1
         2913971644     17324251          51


SQL query “profile”

select decode(session_state,'WAITING',event,'ON CPU') event,
       count(*) cnt
  from dba_hist_active_sess_history
 where sql_id = 'dyuz1k3bdr36s'
   and sql_exec_id = 999999999
 group by decode(session_state,'WAITING',event,'ON CPU')
 order by cnt;


SQL query “profile”

SQL EXEC ID - 17324255

EVENT                                 CNT
-----------------------------  ----------
db file sequential read                 3
ON CPU                                  6
db file scattered read                 41

SQL EXEC ID - 17324679

EVENT                                 CNT
-----------------------------  ----------
db file scattered read                  1


Q & A