29
CERN IT Department CH-1211 Geneva 23 Switzerland www.cern.ch/it Ideas for evolution of replication technology @ CERN Distributed Databases Operations Workshop November 16 th , 2010 Zbigniew Baranowski, IT-DB

Ideas for evolution of replication technology @ CERN

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

Ideas for evolution of

replication technology @

CERN

Distributed Databases Operations Workshop

November 16th, 2010

Zbigniew Baranowski, IT-DB

Page 2: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Replication use cases at CERN

• Motivation for evolution of replication

• Oracle replication technologies

• Possible future replication solutions for LCG

• Summary

Ideas for evolution of replication technology @ CERN

Outline

2

Page 3: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• ATLAS

– CONDITIONS (4M LCRs/day)

– PVSS (60M LCRs/day)

• CMS

– CONDITIONS (6M LCRs/day)

– PVSS (20M LCRs/day)

• LHCb

– CONDITIONS (6K LCRs/day)

• ALICE

– PVSS (4M LCRs/day)

• COMPASS

– PVSS (4M LCRs/day)

Ideas for evolution of replication technology @ CERN

Replication use cases:

ONLINE - OFFLINE

3

CONDITIONS

PVSS

Page 4: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• LHCb ( in addition to ONLINE-OFFLINE)

– CONDITIONS (8K LCRs/day)

Ideas for evolution of replication technology @ CERN

Replication use cases:

OFFLINE - ONLINE

4

Page 5: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it Ideas for evolution of replication technology @ CERN

Replication use cases:

OFFLINE – T1s

5

– ATLAS

• CONDITIONS (4M LCRs/day)

– LHCb

• LFC (235K LCRs/day)

• CONDITIONS (15K LCRs/day)

Page 6: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• ATLAS

– AMI (800K LCRs/day)

– Muon (700K LCRs/day)

Ideas for evolution of replication technology @ CERN

Replication use cases:

T1 - OFFLINE

6

Page 7: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Need of stable and reliable replication

service

• Streams 10g require frequent interventions

(at least once per week)

– Consistency problems

– Blocking sessions

– Memory pools shortage

– Logminer crashes

– Users unsupported changes

• Streams administration is time consuming

and requires expert knowledge

• Migration to 11gR2 in 2012Ideas for evolution of replication technology @ CERN

Motivation for evolution of replication

solutions

7

Page 8: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Is there a solution which can simplify

maintenance of replication?

– Satisfies physics data workload

– Requires minimum maintenance effort

– Is resilient to user’s unsupported operations

– Ensures replicated data consistency

– Utilizes minimum amount of resources

Ideas for evolution of replication technology @ CERN

Motivation for other replication solutions

8

Page 9: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Logical (SQL based) replication

– Streams11gR2

– GoldenGate

• Physical (block-level) replication

– Active DataGuard11gR2

• Combinations of physical and logical

replication

Ideas for evolution of replication technology @ CERN

Possible replication solutions

9

Page 10: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it Ideas for evolution of replication technology @ CERN

Streams 11gR2

10

Page 11: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Technology features

– Considerable maintenance effort

• but in 11g should be less than in 10g

– No additional license required

– Many improvements

• stability, management, monitoring, verification of data

consistency

– Very good performance (30K-40K LCRs/s)

– Best practices identified – a lot of experience

– Source and destination database fully

accessible for reads and writes

Ideas for evolution of replication technology @ CERN

Streams11gR2 solution

11

Page 12: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• As ONLINE – OFFLINE replication

– Users and data content can abort the

replication

– streams processes may affect performance of

online database

– no extra hardware needed

– bi-directional replication

Ideas for evolution of replication technology @ CERN

Streams11gR2 solution

12

SQLsSQLs

SQLsSQLs

Page 13: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• As OFFLINE – T1s

– Recovery of replica requires

• coordination between T1 and other T1, T0

• expert knowledge of procedures

– Downstream capture

• additional hardware required

• complete isolation from OFFLINE database

• standby database can be source of replication

– T1s databases is read/write accessible

– Good monitoring for distributed streams

deployment (strmmon, EM)

Streams11gR2

13

Redo Transport

Page 14: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it Ideas for evolution of replication technology @ CERN

GoldenGate

14

Source: Oracle.com

Page 15: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Technology features

– source and destination database fully accessible

for reads and writes

– good quality of software (very stable, free of

locks, almost transparent for databases)

– good performance 7 – 12K LCRs /s

– additional license required

– standby database cannot be used as source

– no in-house experience

– additional dedicated disk space required for trail

files

– additional software to be installed and

maintained on database’s machines

GoldenGate

15

Page 16: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• As ONLINE-OFFLINE replication

– no extra hardware needed

– possible loops back in replication

– minor impact on source database

– users and data content can abort the

replication

Ideas for evolution of replication technology @ CERN

GoldenGate solution

16

SQLsSQLs

SQLsSQLs

GG GG

GGGG

Page 17: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• As OFFLINE – T1s

– easier maintenance

• No side effects on source when target is down

• No split of replication required

• Trail files can be used for T1 recovery

– no remote administration - access to nodes

required

– no monitoring for distributed environment

– cannot use standby database (i.e. Active

Dataguard) as a source of replication

Ideas for evolution of replication technology @ CERN

GoldenGate solution

17

Page 18: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it Ideas for evolution of replication technology @ CERN

Active DataGuard 11gR2

18

Source: Oracle.com

Page 19: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Technology features

– Physical replication

• identical copy

– Minimum maintenance effort

– Outperforms other replication technologies

• Oracle claims 200 MB/s of redo processing

– Improved data reliability of primary database

• failover

• automatic recovery of corrupted blocks

– Fast recovery with RMAN

– Additional license required

– Target/standby database is read only

Ideas for evolution of replication technology @ CERN

Active DataGuard 11gR2

19

Page 20: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• As ONLINE – OFFLINE replication

– additional database installations needed for

no replicated data (split of OFFLINE)

– same version of software required

(installation, upgrades)

– online database is protected with another

standby database

– further replication to T1s is possible in

sequential standbys configuration

Active DataGuard 11gR2

20

Redo Transport Redo Transport

Page 21: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• As OFFLINE – T1s

– same version required on all T1s DBs

• Coordination of interventions becomes critical

– T1 database is read only

– additional database installations needed for no

replicated data (split of OFFLINE)

– Physical replication: lower maintenance effort

– No downstream needed

Active DataGuard 11gR2

21

Redo Transport

Page 22: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Streams11gR2 replication at all Tiers

– Same setup as current production

• No additional installations needed

Ideas for evolution of replication technology @ CERN

Possible solutions

22

Redo Transport

PROPAGATIONPROPAGATION

Page 23: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

GG

GG

GG

• GoldenGate replication at all Tiers

• New software has to be deployed

• Additional port needs to be opened

• Do we need downstream database?

Ideas for evolution of replication technology @ CERN

Possible solutions

23

GG FILESGGGGFILES GGFILES GG ?

Page 24: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• ONLINE –> OFFLINE: Active DataGuard

• OFFLINE –> T1s: Streams11g

Possible solutions

24

PROPAGATION

Redo Transport Redo Transport

transport directions

Possible redo

transport directionsAdditional standby Additional standby

database for ONLINE-

OFFLINE model

protection

Page 25: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

Online database failover and

recovery with ADG11gR2

25

PROPAGATION

Redo Transport Redo TransportX

Red

o T

ran

sp

ort

ONLINE-OFFLINE model is broken !!!

Page 26: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

Offline database failover and

recovery with ADG11gR2

26

PROPAGATION

Redo Transport Redo Transport

ONLINE-OFFLINE model is broken !!!

Re

co

ve

ry

XRedo Transport

Page 27: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Migration to the new database versions

(2012) gives an opportunity to re-design and

improve the replication service

• Three candidate technologies are being

investigated

– Streams11gR 2

– GoldenGate

– Active DataGuard

– Combined solutions

• Other proposals/ideas?

• Experiments requirements?

Ideas for evolution of replication technology @ CERN

Summary

27

Page 28: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it

• Many thanks to all Physics DBAs,

especially:

– Luca

– Jacek

– Dawid

• Consultancy

– Gancho

– Stephen Balousek (Oracle)

– Jagdev Dhillon (Oracle)

Ideas for evolution of replication technology @ CERN

Acknowledgements

28

Page 29: Ideas for evolution of replication technology @ CERN

CERN IT DepartmentCH-1211 Geneva 23

Switzerland

www.cern.ch/it Ideas for evolution of replication technology @ CERN

Questions?

29