Experience with NetApp at CERN IT/DB
Giacomo Tenaglia, on behalf of Eric Grancher and Ruben Gaspar Aparicio
CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it


Page 1: Experience with NetApp at CERN IT/DB

CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it

Experience with NetApp at CERN IT/DB

Giacomo Tenaglia
on behalf of Eric Grancher and Ruben Gaspar Aparicio

Page 2: Experience with NetApp at CERN IT/DB


Outline

• NAS-based usage at CERN
• Key features
• Future plans

Experience with NetApp at CERN IT/DB - 2

Page 3: Experience with NetApp at CERN IT/DB

Storage for Oracle at CERN

• 1982: Oracle at CERN; PDP-11, mainframe, VAX VMS, Solaris SPARC 32- and 64-bit
• 1996: Solaris SPARC with OPS, then RAC
• 2000: Linux x86 on a single node, DAS
• 2005: Linux x86_64 / RAC / SAN
  – Experiments and part of WLCG stayed on SAN until 2012
• 2006: Linux x86_64 / RAC / NFS (IBM/NetApp)
• 2012: all production primary Oracle databases (*) on NFS

(*) apart from ALICE and LHCb online

Page 4: Experience with NetApp at CERN IT/DB

Network topology

• All 10 Gb/s Ethernet
• Same network for storage and cluster interconnect

[Diagram: servers A–E and filers 1–4 share two private Ethernet switches (Private 1 and Private 2) carrying both CRS and storage traffic; each filer HA pair has an internal interconnect; a separate Ethernet switch provides the public network.]

Page 5: Experience with NetApp at CERN IT/DB

Domains: space/filers

Domain     Total size (TB)   Used for backup (TB)   # of filers
des-nas         47.4                 62.6                10
shosts         204                                        4
gen3            97                                        4
rac10           59                                        6
rac11           59                                        6
castor         154                                       18
acc            281                                        8
db disk                            1000                   2
TOTAL          901.4               1062.6                58

Page 6: Experience with NetApp at CERN IT/DB

Typical setup

Page 7: Experience with NetApp at CERN IT/DB

Impact of storage architecture on Oracle stability at CERN


Page 8: Experience with NetApp at CERN IT/DB

Key features

• Flash cache
• RAID-DP
• Snapshots
• Compression

Page 9: Experience with NetApp at CERN IT/DB

Flash cache

• Helps increase random IOPS on disks
  – Very good for OLTP-like workloads
• Cache contents are not wiped when servers reboot
• For databases, decide which volumes to cache:

  fas3240> priority on
  fas3240> priority set volume volname cache=[reuse|keep]

• 512 GB modules, 1 per controller

Page 10: Experience with NetApp at CERN IT/DB

IOPs and Flash cache


Page 11: Experience with NetApp at CERN IT/DB

IOPs and Flash cache


Page 12: Experience with NetApp at CERN IT/DB

Key features

• Flash cache
• RAID-DP
• Snapshots
• Compression

Page 13: Experience with NetApp at CERN IT/DB

Disk and redundancy (1/2)

• Disks get larger and larger
  – speed stays roughly constant → performance issue
  – bit error rate stays constant (10^-14 to 10^-16), an increasing availability issue
• Data loss probability can be estimated with x as the size read and α as the bit error rate
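The formula itself did not survive extraction; a reconstruction consistent with the numbers on the next slide: when x bits are read with bit error rate α, the probability of hitting at least one unrecoverable error is

```latex
P_{\text{err}} = 1 - (1-\alpha)^{x} \approx \alpha x \qquad (\alpha x \ll 1)
```

During a RAID 5 rebuild all n surviving data disks must be read, so the failure probability grows to roughly 1 - (1-α)^(n·x). With 1 TB disks (x ≈ 8·10^12 bits), α = 10^-14 and n = 5 this gives ≈ 0.33, matching the 3.29E-01 entry in the comparison table.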

Page 14: Experience with NetApp at CERN IT/DB

Disks, redundancy comparison (2/2)

Data loss probability for different disk types and RAID group sizes (group size n = 5, 14, 28; RAID 1 and triple mirror do not depend on n):

1 TB SATA desktop, bit error rate 10^-14:
                  n=5        n=14       n=28
  RAID 1          7.68E-02
  RAID 5 (n+1)    3.29E-01   6.73E-01   8.93E-01
  ~RAID 6 (n+2)   1.60E-14   1.46E-13   6.05E-13
  ~triple mirror  8.00E-16

1 TB SATA enterprise, bit error rate 10^-15:
  RAID 1          7.96E-03
  RAID 5 (n+1)    3.92E-02   1.06E-01   2.01E-01
  ~RAID 6 (n+2)   1.60E-16   1.46E-15   6.05E-15
  ~triple mirror  8.00E-18

450 GB FC, bit error rate 10^-16:
  RAID 1          4.00E-04
  RAID 5 (n+1)    2.00E-03   5.58E-03   1.11E-02
  ~RAID 6 (n+2)   7.20E-19   6.55E-18   2.72E-17
  ~triple mirror  3.60E-20

10 TB SATA enterprise, bit error rate 10^-15:
  RAID 1          7.68E-02
  RAID 5 (n+1)    3.29E-01   6.73E-01   8.93E-01
  ~RAID 6 (n+2)   1.60E-15   1.46E-14   6.05E-14
  ~triple mirror  8.00E-17

Page 15: Experience with NetApp at CERN IT/DB

Key features

• Flash cache
• RAID-DP
• Snapshots
• Compression

Page 16: Experience with NetApp at CERN IT/DB

Snapshots


• T0: take snapshot 1

Page 17: Experience with NetApp at CERN IT/DB

Snapshots


• T0: take snapshot 1
• T1: file changed

Page 18: Experience with NetApp at CERN IT/DB

Snapshots


• T0: take snapshot 1
• T1: file changed
• T2: take snapshot 2
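The copy-on-write idea behind these three steps can be sketched in a few lines. This is an illustrative toy model, not ONTAP/WAFL internals; all names are made up:

```python
# Toy copy-on-write snapshot model (illustrative only, not WAFL internals).
# A snapshot copies block *pointers*, not data: unchanged blocks are shared,
# and a write simply points the active volume at a new block.
class Volume:
    def __init__(self):
        self.blocks = {}      # block number -> data
        self.snapshots = {}   # snapshot name -> frozen {block number: data}

    def write(self, blkno, data):
        # Old data is not overwritten in place; snapshots still reference it.
        self.blocks[blkno] = data

    def snapshot(self, name):
        self.snapshots[name] = dict(self.blocks)  # copy the pointer map only

    def read(self, blkno, snap=None):
        view = self.snapshots[snap] if snap else self.blocks
        return view[blkno]

vol = Volume()
vol.write(0, "old contents")
vol.snapshot("snap1")            # T0: take snapshot 1
vol.write(0, "new contents")     # T1: file changed
vol.snapshot("snap2")            # T2: take snapshot 2
print(vol.read(0, "snap1"))      # snapshot 1 still sees the T0 data
print(vol.read(0))               # the active volume sees the new data
```

Because only pointers are copied, taking a snapshot is nearly instantaneous and consumes space only as blocks subsequently change.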

Page 19: Experience with NetApp at CERN IT/DB

Snapshots for backups

• With data growth, restoring databases in a reasonable amount of time is impossible using “traditional” backup/restore techniques
• Example: 100 TB, 10 GbE, 4 tape drives
  – Tape drive restore performance ~120 MB/s
  – Restore ~58 hours (but it can be much longer)
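The ~58-hour figure follows directly from the numbers on the slide:

```python
# Back-of-envelope restore time for the slide's scenario:
# 100 TB to restore, 4 tape drives at ~120 MB/s each (decimal units).
data_tb = 100
drives = 4
rate_mb_per_s = 120

total_mb = data_tb * 1_000_000                 # 1 TB = 10^6 MB
seconds = total_mb / (drives * rate_mb_per_s)  # drives restore in parallel
print(f"{seconds / 3600:.0f} hours")           # prints "58 hours"
```

A snapshot-based restore avoids moving the data at all, which is why it stays fast as volumes grow.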

Page 20: Experience with NetApp at CERN IT/DB

Snapshots and Real Application Testing

[Diagram: the workload (insert/update/delete statements and PL/SQL) is captured on the original 10.2 database; a clone is taken and upgraded to 11.2; the captured workload is then replayed on the upgraded clone.]

Page 21: Experience with NetApp at CERN IT/DB

Snapshots and Real Application Testing

[Diagram: as on the previous slide, with SnapRestore® added: after each replay on the upgraded 11.2 clone, the clone is reverted to its snapshot so the captured workload can be replayed again and again.]

Page 22: Experience with NetApp at CERN IT/DB

Key features

• Flash cache
• RAID-DP
• Snapshots
• Compression

Page 23: Experience with NetApp at CERN IT/DB

NetApp compression factor

                                  Uncompressed GB   Compressed GB   Compression ratio
One day AISDB prod redo log            281.3             100.7             2.8
Recent one day ACCLOG datafile         118.1              49.4             2.4
CMSR full backup                       997.3             297.7             3.4
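The ratio column is just uncompressed size over compressed size; a quick check with the figures from the table (note the slide rounds 3.35 up to 3.4):

```python
# Verify the compression ratios from the table above.
datasets = [
    ("One day AISDB prod redo log", 281.3, 100.7),
    ("Recent one day ACCLOG datafile", 118.1, 49.4),
    ("CMSR full backup", 997.3, 297.7),
]
for name, uncompressed_gb, compressed_gb in datasets:
    print(f"{name}: {uncompressed_gb / compressed_gb:.2f}x")
```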

Page 24: Experience with NetApp at CERN IT/DB

Compression: backup on disk

[Diagram: RMAN file backups are written both to one tape copy and to a disk buffer; the buffer is ~1700 TiB raw (576 × 3 TB disks), ~1000 TiB usable, and with compression holds ~2 PiB of uncompressed data.]

Page 25: Experience with NetApp at CERN IT/DB

Future: ONTAP Cluster Mode

• Non-disruptive upgrades/operations: the “immortal cluster”
• Interesting new features
  – Internal DNS load balancing
  – Export policies: fine-grained access control for NFS exports
  – Encryption and compression at the storage level
  – NFS 4.1 implementation, parallel NFS (pNFS)
• Scale-out architecture: up to 24 nodes (512 theoretical)
• Seamless data moves for capacity/performance rebalancing or hardware replacement

Page 26: Experience with NetApp at CERN IT/DB

Architecture view: ONTAP Cluster Mode

Page 27: Experience with NetApp at CERN IT/DB

Possible implementation


Page 28: Experience with NetApp at CERN IT/DB

Logical components


Page 29: Experience with NetApp at CERN IT/DB

pNFS

• Part of the NFS 4.1 standard (client caching, Kerberos, ACLs)
• Coming with ONTAP 8.1RC2
  – Not yet natively supported by Oracle
  – Client support in RHEL 6.2
• Control protocol: provides synchronization between the data servers and the metadata server (MDS)
• pNFS between client and MDS: the client asks the MDS where the information is stored, then accesses the data servers directly
• Storage access protocols: file-based, block-based and object-based
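On the client side (RHEL 6.2 era), a pNFS mount is just an NFS 4.1 mount; a sketch with a hypothetical filer name and export path, not taken from the slides:

```shell
# Hypothetical example: mount an export over NFS 4.1/pNFS on RHEL 6.2.
# "minorversion=1" selects NFS 4.1; server name and path are made up.
mount -t nfs4 -o minorversion=1 filer1:/vol/oradata /mnt/oradata
```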

Page 30: Experience with NetApp at CERN IT/DB


Summary

• Good reliability
  – Six years of operations with minimal downtime
• Good flexibility
  – Same setup for different uses/workloads
• Scales to our needs

Page 31: Experience with NetApp at CERN IT/DB


Q&A

Thanks!

[email protected]@cern.ch
