Library of Congress Storage Environment Update 2018
Carl Watts, Information Technology Specialist
IT Services Operations / Operations and Maintenance / Unix Systems
September 17, 2018




Converged Storage Tiers


Content Storage

Content is a single copy of a digital object and its associated derivative(s).

Preservation Copies (currently)

  • Standard Collections – two (2) copies distributed across two (2) datacenters
  • Special Collections – two (2) different platforms holding two (2) copies distributed across two (2) datacenters

Presentation Copies

  • Currently – single online copy
  • Near future – two (2) copies across two (2) datacenters
  • Future – multiple copies across datacenters and “cloud” providers
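The preservation copy rules above can be expressed as a small policy check. This is only a sketch: the minimums come from the slide, but the class names, the dictionary keys, and the datacenter and platform labels are hypothetical and do not describe the Library's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    datacenter: str  # hypothetical site label, e.g. "DC1"
    platform: str    # hypothetical storage platform label, e.g. "tape"


@dataclass(frozen=True)
class CopyPolicy:
    min_copies: int
    min_datacenters: int
    min_platforms: int


# Minimums taken from the slide; the keys are illustrative names only.
PRESERVATION_POLICIES = {
    "standard": CopyPolicy(min_copies=2, min_datacenters=2, min_platforms=1),
    "special": CopyPolicy(min_copies=2, min_datacenters=2, min_platforms=2),
}


def meets_policy(copies, policy):
    """Return True when the copies satisfy every minimum in the policy."""
    datacenters = {c.datacenter for c in copies}
    platforms = {c.platform for c in copies}
    return (
        len(copies) >= policy.min_copies
        and len(datacenters) >= policy.min_datacenters
        and len(platforms) >= policy.min_platforms
    )


# Two copies on the same platform satisfy "standard" but not "special".
same_platform = [Copy("DC1", "disk"), Copy("DC2", "disk")]
mixed_platform = [Copy("DC1", "disk"), Copy("DC2", "tape")]
print(meets_policy(same_platform, PRESERVATION_POLICIES["standard"]))
print(meets_policy(same_platform, PRESERVATION_POLICIES["special"]))
print(meets_policy(mixed_platform, PRESERVATION_POLICIES["special"]))
```

Counting distinct datacenters and platforms separately keeps the two requirements independent, which mirrors how the slide states them.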

Content Growth – Preservation

Unique File Count: 410M total files

Long-term Storage (single copy, in TB)

  2014    6,856.40
  2015    8,303.83
  2016   11,365.67
  2017   14,075.11
  2018   15,982.04

Annual Growth (in TB)

  2015    1,447.43
  2016    3,061.85
  2017    2,709.44
  2018    1,906.93
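The annual growth figures on this slide appear to be the year-over-year differences of the single-copy long-term storage totals. A short sketch of that arithmetic follows; the variable and function names are illustrative, and the numbers are the ones shown on the slide.

```python
# Single-copy long-term storage totals in TB, as shown on the slide.
longterm_tb = {
    2014: 6856.40,
    2015: 8303.83,
    2016: 11365.67,
    2017: 14075.11,
    2018: 15982.04,
}


def annual_growth(totals):
    """Return {year: total[year] - total[previous year]} rounded to 2 places."""
    years = sorted(totals)
    return {
        year: round(totals[year] - totals[prev], 2)
        for prev, year in zip(years, years[1:])
    }


growth = annual_growth(longterm_tb)
for year, tb in growth.items():
    print(year, f"{tb:,.2f} TB")
```

The computed differences match the charted growth values to within 0.01 TB; the small gap for 2016 is consistent with the totals having been rounded for display.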

Content Growth – Preservation

Overall Long-term Storage Growth (all copies, in TB)

  2014   14,286
  2015   21,438
  2016   25,032
  2017   28,085
  2018   34,361
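One way to summarize the all-copies series is a compound annual growth rate (CAGR). The slide does not quote one, so the sketch below is an added calculation on the slide's own numbers, with illustrative names.

```python
# All-copies long-term storage totals in TB, as shown on the slide.
all_copies_tb = {2014: 14286, 2015: 21438, 2016: 25032, 2017: 28085, 2018: 34361}


def cagr(totals):
    """Compound annual growth rate between the first and last year."""
    first, last = min(totals), max(totals)
    periods = last - first  # number of yearly steps
    return (totals[last] / totals[first]) ** (1 / periods) - 1


rate = cagr(all_copies_tb)
print(f"CAGR 2014-2018: {rate:.1%}")
```

Over the four yearly steps from 2014 to 2018 this comes out a little under 25% per year.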

Content Growth – Presentation

Unique File Count: 344M total files

Access Storage (in TB)

  2013     236.40
  2014     546.60
  2015   1,095.10
  2016   1,620.70
  2017   2,086.30
  2018   2,572.46

Annual Terabyte Growth

  2013   236.40
  2014   310.20
  2015   548.50
  2016   525.60
  2017   465.60
  2018   504.19

Migrations Continue

Consolidating Preservation Storage

  • Combining resources to reduce cost

Migrating Data Centers

  • Completed migration of presentation storage to new location and system
  • Preparing to replicate data to new data center (2019)

Migrating Tape Technology

  • Preparing to migrate IBM TS1140 tape to TS1155 tape (2019)

Quad ‘P’ Dataflow (Current)

[Diagram. Stages: Procure, Preserve, Process, Present. Component labels recovered from the diagram, in extraction order with repeats collapsed:]

  • Workflow Engine(s)
  • esubmit.loc.gov (external push)
  • Media Exchange (external push)
  • Signiant Workflow (internal pull)
  • Media Shuttle (push/pull)
  • CTS via ingest servers
  • Fetcher (internal pull)
  • Transitory Storage Pool(s) (appears at multiple points)
  • External Client
  • Delivered Content (portable HD)
  • Client
  • sFTP
  • Web Site
  • In House Digitization
  • Processing VM
  • CTS Workflow Engine
  • Signiant Manager
  • Oracle HSM
  • IBM LTFS (Special Collections)
  • CTS VMs
  • Processing Storage Pools
  • CTS Scheduler
  • Processing VMs
  • Online Content Storage (xSTOR)
  • CDN
  • WebArchive
  • ChronAm
  • Web Server(s)
  • Other
  • DMS Workflow
  • PCWA
  • House Video Encoders
  • House Recording Studio

Looking to add Content Abstraction Layer

A Content Abstraction Layer (CAL) would:

  • Manage the procurement of data from multiple sources
  • Manage the preservation of content:
      • File fixity checking
      • File validation / usability
  • Manage the automation of content processing
  • Manage the movement / orchestration of data across multiple:
      • Systems
      • Data centers
      • Cloud providers
      • External entities
  • Provide a persistent namespace and access method to data
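File fixity checking, one of the preservation functions listed above, usually means recomputing a file's checksum and comparing it with a stored value. The sketch below shows the idea with SHA-256 from Python's standard library. The slide does not say which algorithm or manifest format would be used, so the choice of SHA-256, the function names, and the path-to-digest manifest are all assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large objects are never read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Compare files against expected digests.

    manifest maps a file path to its expected SHA-256 hex digest
    (a hypothetical format chosen for this example).
    Returns {path: "ok" | "mismatch" | "missing"}.
    """
    results = {}
    for path, expected in manifest.items():
        p = Path(path)
        if not p.is_file():
            results[path] = "missing"
        elif sha256_of(p) == expected.lower():
            results[path] = "ok"
        else:
            results[path] = "mismatch"
    return results
```

A scheduled job could run `check_fixity` over each stored copy and report anything other than "ok"; with two or more copies, a mismatched copy can then be repaired from one that still verifies.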

Quad ‘P’ Dataflow (Proposed)

[Diagram. Stages: Procure, Preserve, Process, Present, plus System Backup. Component labels recovered from the diagram, in extraction order with repeats collapsed:]

  • Workflow Engine(s)
  • esubmit.loc.gov (external push)
  • Media Shuttle (push/pull)
  • CTS via ingest servers
  • Fetcher (internal pull)
  • Transitory Storage Pool(s) (appears at multiple points)
  • Delivered Content (portable HD)
  • Client
  • sFTP
  • Web Site
  • In House Digitization
  • Processing VM
  • On-Prem Object Storage (Storage-as-a-Service)
  • Processing Storage Pools
  • Processing VMs
  • CDN
  • Web Capture
  • ChronAmer.
  • Web Server(s)
  • Other
  • DMS Workflow
  • PCWA
  • House Video Encoders
  • House Recording Studio
  • Content Abstraction Layer
  • Long-term Storage (Large File and Special Collections)
  • Tape Tech
  • Off-Site Cloud Storage (DC5) [AWS, Azure, Google, other…]
  • Off-Site Cold Cloud Storage (DC5) [AWS, Azure, Google, other…]
  • Public Datasets Cloud Storage (DC5) [AWS, Azure, Google, other…]
  • Shared Datasets [Agency, Academia, other...]
  • Policy Management
  • Object Discovery & Classification
  • Quota Management
  • Storage Analytics
  • Object Audit
  • Workflow Engine
  • Data Tiering
  • Data Validation and Verification
  • Protocols and interfaces shown: sFTP, NFS, S3, SMB/CIFS, HTTPS, REST
  • eCO NAS
  • eCO Submitter
  • Server
  • VMs
  • DB
  • Backup Server

[Diagram, untitled. Labels recovered from the diagram:]

  • Data Center 1 Storage
  • Data Center 2 Storage
  • Data Center 3 Storage
  • Data Center 4 Storage
  • DC5: Cloud Provider A
  • DC5: Cloud Provider B
  • DC5: Cloud Provide ...
  • Web Services Environment
  • Back-up Environment
  • Preservation Systems
  • Procurement Systems
  • Processing Systems
  • Content Abstraction Layer


Thank you
