RAC Internals

Julian Dyke
Independent Consultant

Web Version

juliandyke.com
© 2007 Julian Dyke

Agenda

- Transactions in RAC
- Cross Instance Consistent Reads

Introduction

System Change Number

- In RAC clusters, the SCN must be maintained across all nodes in the cluster
- SCN propagation scheme differs according to version
- In Oracle 9.2 and below, defaults to the Lamport algorithm
    - Reported as Lamport or SCN Scheme 2 in alert.log
    - SCN piggy-backed on GCS/GES messages
    - Recorded in redo log
    - Default delay of 7 seconds
- In Oracle 10.1 and above, uses a new algorithm
    - Reported as SCN Scheme 3 in alert.log
    - Broadcast on commit
    - Apparently no delay
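
The piggy-backing idea behind the Lamport scheme can be illustrated with a minimal Python sketch. It is not Oracle's implementation: the `Node` class and its `commit`, `send` and `receive` methods are hypothetical names, and the only behaviour modelled is that every message carries the sender's SCN and the receiver never stays behind an SCN it has seen.

```python
class Node:
    """Toy cluster node holding a local SCN (hypothetical model, not Oracle code)."""

    def __init__(self, name, scn=0):
        self.name = name
        self.scn = scn

    def commit(self):
        # A local commit advances the local SCN.
        self.scn += 1
        return self.scn

    def send(self):
        # Every outgoing message piggy-backs the sender's current SCN.
        return {"from": self.name, "scn": self.scn}

    def receive(self, message):
        # Simplified Lamport rule: never fall behind an SCN seen in a message.
        self.scn = max(self.scn, message["scn"])


node1, node2 = Node("node1"), Node("node2")
for _ in range(3):
    node1.commit()           # node1 advances to SCN 3, node2 is still at 0
lag_before = node1.scn - node2.scn
node2.receive(node1.send())  # any message brings node2 up to date
print(lag_before, node1.scn, node2.scn)
```

In this toy model node2 lags behind node1 until some message arrives, which is the kind of window a commit propagation delay has to bound. Broadcast on commit would correspond to sending a message to every other node as part of each `commit`.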

MAX_COMMIT_PROPAGATION_DELAY

- Prior to Oracle 10.2
    - Default value is 700 centiseconds (7 seconds)
    - Specifies maximum time taken for a COMMIT on one node to be reflected on other nodes in the cluster
    - For some applications, value must be set to 0 (Broadcast on commit), including:
        - E-Business Suite
        - SAP
- In Oracle 10.2 and above, default value is 0

LMS Background Processes

- LMS background processes:
    - Implement cache fusion
    - Serve both consistent and current versions of blocks in cache of local instance to other instances
    - Maintain local part of Global Resource Directory
- Minimum of 1 LMS process per instance
- Maximum is version dependent

    Version      Maximum LMS processes
    Oracle 9.2   10
    Oracle 10.1  20
    Oracle 10.2  36

- Prior to Oracle 10.1, could be configured using _lm_lms parameter
- In Oracle 10.1 and above, initial number of LMS processes specified by gcs_server_processes parameter

LMS Background Processes

- Each LMS background process manages a set of blocks
    - Determined by hash function based on number of LMS background processes
    - Consequently a block will always be handled by the same LMS process
- Number of blocks served recorded in
    - Session / System statistics
    - V$CR_BLOCK_SERVER
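
As a rough illustration of why a block always lands on the same LMS process, the sketch below maps block numbers to LMS indexes. The slides do not say which hash function is used, so the simple modulo in the hypothetical `lms_for_block` function is purely an assumption.

```python
def lms_for_block(block_number, lms_count):
    """Map a block to an LMS process index (assumed hash: simple modulo)."""
    if lms_count < 1:
        raise ValueError("an instance has at least one LMS process")
    return block_number % lms_count


lms_count = 2
assignments = {block: lms_for_block(block, lms_count) for block in (42, 43, 44, 45)}
print(assignments)
```

Because the mapping depends only on the block number and the number of LMS processes, repeated requests for the same block reach the same process, while changing the number of processes changes the mapping.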

Cross Instance Consistent Read

[Animated diagram, summarized. Labels: Instance 1, Instance 2, Session 15, Session 27, LMS0, Undo Header, Data Block 42, Undo Block 800777, Data Block 42 (copy).]

Data Block 42 before the updates:

    slot  col1  col2  col3
    0     ENG   340   1
    1     AUS   99    10

Session 27 updates the ENG row three times without committing:

    UPDATE score SET runs = runs + 4 WHERE team = 'ENG';
    UPDATE score SET runs = runs + 6 WHERE team = 'ENG';
    UPDATE score SET runs = runs + 2 WHERE team = 'ENG';

Undo Header:

    xid: 0005.018.4E7
    segment 5 slot 18: state: 10 wrap#: 4E7 dba: 00800777

ITL1 in Data Block 42 (xid: 0005.018.4E7) after each update:

    update    col2  uba
    runs + 4  344   800777.530.12
    runs + 6  350   800777.530.13
    runs + 2  352   800777.530.14

Undo Block 800777 (seq: 530 irb 12), one record per update, each labelled block 42 slot 0 and 5.1:

    record  before image  previous uba
    12      340           -
    13      344           800777.530.12
    14      350           800777.530.13

Session 15 then queries the row:

    SELECT runs, wickets FROM score WHERE team = 'ENG';

Build read consistent version of block 42: a copy of the block is rolled back by applying the undo records in reverse order.

    undo record applied  col2  uba
    14                   350   800777.530.13
    13                   344   800777.530.12
    12                   340   -
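
The rollback shown in the diagram can be sketched in Python using the values from the slide. This is a simplified model, not Oracle's block format: the dictionary layout and the `build_cr_copy` function are hypothetical, the column names `runs` and `wickets` are inferred from the SELECT statement, and the sketch rolls back every change because none of the three updates has been committed.

```python
import copy

# Undo records from the diagram: record number -> before image and previous record.
undo_block = {
    12: {"runs": 340, "prev": None},
    13: {"runs": 344, "prev": 12},
    14: {"runs": 350, "prev": 13},
}

# Current version of block 42 after the three uncommitted updates.
current_block = {
    "itl_uba": 14,  # the ITL entry points at the newest undo record
    "rows": {
        "ENG": {"runs": 352, "wickets": 1},
        "AUS": {"runs": 99, "wickets": 10},
    },
}


def build_cr_copy(block, undo, team="ENG"):
    """Roll a copy of the block back through the undo chain; the original is untouched."""
    cr = copy.deepcopy(block)
    while cr["itl_uba"] is not None:
        record = undo[cr["itl_uba"]]
        cr["rows"][team]["runs"] = record["runs"]  # restore the before image
        cr["itl_uba"] = record["prev"]             # step to the previous undo record
    return cr


cr_block = build_cr_copy(current_block, undo_block)
print(cr_block["rows"]["ENG"]["runs"], current_block["rows"]["ENG"]["runs"])
```

Working on a copy matters: the current version of the block, with `runs` at 352, stays in place for the updating session while the reader is given the rolled-back copy.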

V$CR_BLOCK_SERVER

    Column Name             Data Type  Description
    CR_REQUESTS             NUMBER     # CR blocks served to other instances
    CURRENT_REQUESTS        NUMBER     # current blocks served to other instances
    DATA_REQUESTS           NUMBER     # data blocks served to other instances
    UNDO_REQUESTS           NUMBER     # undo blocks served to other instances
    TX_REQUESTS             NUMBER     # undo segment headers served to other instances
    CURRENT_RESULTS         NUMBER     # requests requiring no changes to blocks served
    PRIVATE_RESULTS         NUMBER     # requests requiring changes for requesting transaction only
    ZERO_RESULTS            NUMBER     # requests requiring changes for zero-XID transactions only
    DISK_READ_RESULTS       NUMBER     # requests requiring requesting instance to read block from disk
    FAIL_RESULTS            NUMBER     # requests failing - requesting instance must reissue request
    FAIRNESS_DOWN_CONVERTS  NUMBER     # times receiving instance has downgraded an X lock
    FAIRNESS_CLEARS         NUMBER     # times fairness counter was cleared
    FREE_GC_ELEMENTS        NUMBER     # times request received and X-lock had no buffers
    FLUSHES                 NUMBER     # log flushes by LMS process(es)
    FLUSHES_QUEUED          NUMBER     # flushes queued by LMS process(es)
    FLUSH_QUEUE_FULL        NUMBER     # times flush queue was full
    FLUSH_MAX_TIME          NUMBER     maximum time for flush
    LIGHT_WORKS             NUMBER     # times light works rule was invoked
    ERRORS                  NUMBER     # times error signalled by LMS process

Light Works Rule

- In theory, once a block has been written to disk, the LMS process will not attempt to read it again when responding to a consistent read request
- Light Works Rule
    - Prevents LMS processes from going to disk when responding to CR requests for data, undo or undo segment blocks
    - Can prevent LMS process from completing its response to a CR request
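
A minimal sketch of the rule as described on this slide: a request is answered only from blocks already in the buffer cache, and the responder gives up rather than reading from disk. The `serve_cr_request` function, the block identifiers and the dictionary-based cache are all hypothetical. What the LMS process actually returns when the rule is invoked is not stated in the slides, so the sketch simply returns `None`; the `LIGHT_WORKS` counter borrows its name from the V$CR_BLOCK_SERVER column.

```python
def serve_cr_request(needed_blocks, buffer_cache, stats):
    """Try to answer a CR request using only blocks already in the buffer cache."""
    found = []
    for block_id in needed_blocks:
        if block_id not in buffer_cache:
            # Assumed behaviour: no disk read is attempted, so the response
            # cannot be completed and the rule is counted as invoked.
            stats["LIGHT_WORKS"] += 1
            return None
        found.append(buffer_cache[block_id])
    return found


stats = {"LIGHT_WORKS": 0}
buffer_cache = {"data 42": "AUS 99, ENG 205"}  # the undo block is no longer cached

complete = serve_cr_request(["data 42"], buffer_cache, stats)
incomplete = serve_cr_request(["data 42", "undo 800777"], buffer_cache, stats)
print(complete, incomplete, stats)
```

The second request needs an undo block that is not cached, so in this model it is left incomplete and the counter is incremented once.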

GC Read Committed Block

Committed Block - Data Block on disk

[Animated diagram, summarized. Labels: Instance 1, Instance 2, Session 15, Session 27, LMS0, Block 42, Undo Block. Block 42 initially contains AUS 99 and ENG 199.]

Session 27:

    UPDATE score SET runs = 200 WHERE team = 'ENG';
    UPDATE score SET runs = 204 WHERE team = 'ENG';
    UPDATE score SET runs = 205 WHERE team = 'ENG';
    COMMIT;

Session 15:

    SELECT runs FROM score WHERE team = 'ENG';

The undo block holds the before images ENG 199, ENG 200 and ENG 204. The remaining frames show the committed version of the block (AUS 99, ENG 205).

GC Read Committed Block

Committed Block - Data Block in buffer cache

[Animated diagram, summarized. Labels: Instance 1, Instance 2, Session 15, Session 27, LMS0, Block 42, Undo Block. Block 42 initially contains AUS 99 and ENG 199.]

Session 27:

    UPDATE score SET runs = 200 WHERE team = 'ENG';
    UPDATE score SET runs = 204 WHERE team = 'ENG';
    UPDATE score SET runs = 205 WHERE team = 'ENG';
    COMMIT;

Session 15:

    SELECT runs FROM score WHERE team = 'ENG';

The undo block holds the before images ENG 199, ENG 200 and ENG 204. The remaining frames show the committed version of the block (AUS 99, ENG 205).

GC Read Uncommitted Block

- Uncommitted changes MUST be flushed to the redo log before the LMS process can ship a consistent block to another instance
- Reading process must wait until redo log changes have been written to redo log by LMS process
- Bad for standard RAC databases
    - Reads must wait for redo log writes
- Worse for extended / stretch RAC clusters
    - Increased latency of cross site disk communications

GC Read Uncommitted Block

- For each block on which a consistent read is performed, a redo log flush must first be performed
- Number of redo log flushes is recorded in the FLUSHES column of V$CR_BLOCK_SERVER
- Redo log flush time is recorded in the gc cr block flush time statistic for the LMS process
    - Increases time taken to serve consistent block
    - Increases time taken to perform consistent read
- If LMS processes become very busy, consistent reads will experience high wait times
    - e.g. for a full table scan: gc cr multi block request
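
One rough way to gauge how often consistent reads are paying for a redo log flush is to compare two snapshots of V$CR_BLOCK_SERVER. The sketch below assumes the snapshots have already been fetched into dictionaries keyed by column name. The `flush_ratio` function is hypothetical and the numbers are made up for illustration, not measurements.

```python
def flush_ratio(before, after):
    """Redo log flushes per CR request between two snapshots of V$CR_BLOCK_SERVER."""
    requests = after["CR_REQUESTS"] - before["CR_REQUESTS"]
    flushes = after["FLUSHES"] - before["FLUSHES"]
    if requests <= 0:
        return 0.0  # no CR requests in the interval, nothing to report
    return flushes / requests


# Illustrative values only.
snapshot_1 = {"CR_REQUESTS": 1000, "FLUSHES": 100}
snapshot_2 = {"CR_REQUESTS": 1500, "FLUSHES": 225}
ratio = flush_ratio(snapshot_1, snapshot_2)
print(ratio)
```

Using the difference between snapshots rather than the raw counters keeps the figure tied to a known interval instead of the whole life of the instance.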

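GC Read Uncommitted Block

Uncommitted Block - Data Block in buffer cache

[Animated diagram, summarized. Labels: Instance 1, Instance 2, Session 15, Session 27, LMS0, Block 42, Undo Block, Block 42 Copy. Block 42 initially contains AUS 99 and ENG 199.]

Session 27 (no COMMIT):

    UPDATE score SET runs = 200 WHERE team = 'ENG';
    UPDATE score SET runs = 204 WHERE team = 'ENG';
    UPDATE score SET runs = 205 WHERE team = 'ENG';

Session 15:

    SELECT runs FROM score WHERE team = 'ENG';

The undo block holds the before images ENG 199, ENG 200 and ENG 204, and the current block contains AUS 99 and ENG 205. In the Block 42 Copy the ENG value is rolled back from 205 through 204 and 200 to 199, and the remaining frames show the consistent version of the block (AUS 99, ENG 199).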

GC Read Uncommitted Block

Uncommitted Block - Data Block on disk

[Animated diagram, summarized. Labels: Instance 1, Instance 2, Session 15, Session 27, LMS0, Block 42, Undo Block. Block 42 initially contains AUS 99 and ENG 199.]

Session 27 (no COMMIT):

    UPDATE score SET runs = 200 WHERE team = 'ENG';
    UPDATE score SET runs = 204 WHERE team = 'ENG';
    UPDATE score SET runs = 205 WHERE team = 'ENG';

Session 15:

    SELECT runs FROM score WHERE team = 'ENG';

The frames show block 42 (AUS 99, ENG 205) and the undo records (ENG 199, ENG 200, ENG 204) in more than one location, followed by the ENG value being rolled back through 204 and 200 to 199.

SEE SLIDE NOTES FOR ADDITIONAL INFORMATION

Consistent Reads in RAC

- If possible, blocks will always be read from the cache of another instance
- Undo blocks will be flushed to disk more frequently when:
    - All columns are updated
    - Indexed columns are updated
    - Single rows inserted as opposed to using array inserts
    - Transactions are regularly rolled back
    - Rows locked using SELECT FOR UPDATE
- Data blocks will be flushed to disk more frequently when:
    - Most transactions are read-only

Consistent Reads in RAC

- Consistent read response times in RAC can be reduced by:
    - Avoiding reading uncommitted blocks on remote nodes
        - Partitioning
        - Limiting number of rows per block
        - Specifying SCN
    - Minimizing size of transactions on remote nodes
        - Must retain ACID properties
        - May be possible to use application logic to synchronize writes and reads
    - Increasing number of LMS processes on remote node
        - Should be added dynamically by kernel
    - Also by obvious hardware changes such as
        - reducing latency of interconnect
        - increasing disk speed

Thank you for your interest

For more information and to provide feedback please contact me

My e-mail address is: [email protected]

My website address is: www.juliandyke.com