Proposed Data Centre and base Infrastructure access strategy John Young & Jon Blake May 2007


Page 1: Proposed Data Centre and base Infrastructure access strategy

Proposed Data Centre and base Infrastructure access strategy

John Young & Jon Blake, May 2007

Page 2: Proposed Data Centre and base Infrastructure access strategy


Questions for IMC members

• Does the approach to DR outlined here cover your needs?

• Can you prioritise your systems into Gold, Silver and Bronze categories and specify critical periods?

• Can you make the case for investment to cover the selected Gold, Silver or Bronze rating? If not, what would be the impact of a loss of service, e.g. at Bronze level?

• Can you endorse this approach, noting the base resilience / Disaster Recovery actions?

Page 3: Proposed Data Centre and base Infrastructure access strategy


Contents

Background and issues

Principles

Two-phase approach

Current and planned network and datacentre layout

Data centre sourcing options

Benefits

Three-tier DR approach

Base infrastructure actions and impacts

Summary

Required actions

Page 4: Proposed Data Centre and base Infrastructure access strategy

Background

• ONS has 5 Data Centres (inc. Siemens) plus 2 server rooms, with different states of
  – security
  – capacity
  – appropriateness
• The March 2007 Capability review identified significant risks in
  – DR
  – Data Centre infrastructure
• DR capability
  – limited to external site DR for the PPI and RSI systems on NUMA, and Model 204
  – Can only be accessed from the service supplier's premises
  – Other systems' protection limited to tape backup
  – Odyssey requirement unspecified
• Odyssey capacity requirements need to be determined
• We need to find a solution to these complex issues and propose a two-phase approach
  – Phase 1: Base Infrastructure
  – Phase 2: Modernised System Requirements

Page 5: Proposed Data Centre and base Infrastructure access strategy

Proposed Principles

• Resilience – to provide appropriate availability across ONS
• Disaster Recovery Capability – appropriate to ONS business needs
• Good VFM for ONS, balancing costs versus risks
• Simplify the deployment of equipment, taking a shared holistic approach
• Resolve areas of weakness, i.e.
  – Data Centre environments and protection
  – Remote Access Support access and resilience
  – Firewall security
  – GSI network access (currently 1 very busy link) for mail and internet

Page 6: Proposed Data Centre and base Infrastructure access strategy

Phase 1: Base Infrastructure

Consists of: data centres, networks, external connectivity

Objectives
– provide a logical design which can support Disaster Recovery requirements as they are determined for each business application
– identify gaps and weaknesses in the capability of the base infrastructure to support ONS systems and the means to communicate with them
– identify options to implement the logical solution

Page 7: Proposed Data Centre and base Infrastructure access strategy

Phase 2: Modernised System requirements

A joint Odyssey / IM activity

Objectives
– Identify the capacity requirements of the Odyssey systems (including Atlas) for the next 3 years
– Obtain the performance, service level and DR requirements of each of these systems from the business
– Propose technical designs (including costs) which satisfy the above
– Subsequently establish a migration plan in conjunction with Phase 1 to realise the approach

Page 8: Proposed Data Centre and base Infrastructure access strategy

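Today – physical

[Diagram: current physical layout of ONS sites and data centres connected over the WAN.]

Sites and data centres shown: Titchfield, Titchfield Data Centre, Newport, Newport Data Centre, Cardiff Bay Data Centre, London Data Centre, Drummond Gate, Myddleton St, Siemens Data Centre.

Link types shown: backed-up WAN link; WAN link with no back-up; back-up WAN link (non-active); direct link.

Site annotations (the extracted text does not show which site each one belongs to):
• Lacks expansion
• Equipment proximity
• Excellent power and environment; limited space
• No fire protection
• Adequate power protection
• No fire protection; poor environment; no power backup
• Closing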

Page 9: Proposed Data Centre and base Infrastructure access strategy

Newport Data Centre 2007

[Diagram: Newport Data Centre equipment, 2007.]

• Main Data Centre, Room C002
• No Fire Suppression
• Grid Link 1, Grid Link 2
• 72 Wintel Servers: Odyssey Dev., Notes, File and Print, etc.
• HP EVA SAN: Local Mixed, 7 TB
• IBM NUMA C8 (DYNIX, Ingres): IDBR, Common Software, PPI etc.
• IBM NUMA C7 (DYNIX, Ingres): Legacy Maintenance
• IBM P55Q (AIX, Ingres): RPI Replatform
• Service Monitoring & Reporting
• Firewall, Remote Access, Internet, WWW, GSi

Page 10: Proposed Data Centre and base Infrastructure access strategy

Titchfield Data Centre 2007

[Diagram: Titchfield Data Centre equipment, 2007.]

• Main Data Centre, Room nnn
• No Fire Suppression
• Back-up Generator
• Grid Link 1
• IBM P690 (AIX, Oracle): NeSS and Odyssey Development & Test
• IBM Z Series (OS 390, Model 204): Social and Vital Statistics
• Hitachi Lightning SAN: NeSS, Odyssey Dev. & Test, Other Mixed, 16.6 TB
• Hitachi AMS1000 SAN: Census 2001, less demanding apps., 7.2 TB
• 195 Wintel Servers: NeSS, Odyssey Dev. & Test, Census 2001, Notes, File and Print, etc.
• Firewall, Remote Access, Internet

Page 11: Proposed Data Centre and base Infrastructure access strategy

Cardiff Bay Data Centre 2007

[Diagram: Cardiff Bay Data Centre equipment, 2007.]

• BT Cardiff Bay
• Highly secure
• Fire Suppression
• Back-up Generators
• Grid Link 1
• IBM P690 (AIX, Oracle): Odyssey
• IBM P590 (AIX, Oracle): ATLAS, Odyssey, GRIS, Gender Recognition
• Hitachi Lightning SAN: ATLAS, Odyssey, GRIS, Gender Recognition, 9.4 TB
• 47 Wintel Servers: Odyssey, GRIS, Gender Recognition, Metis, E-learning

Page 12: Proposed Data Centre and base Infrastructure access strategy

March 2008 – physical with network enhancement

• WAN resilience for Titchfield and Cardiff Bay
• Major computing equipment from DG moved to Titchfield and South Wales
• Subject to funding approval

[Diagram: planned physical layout. Labels shown: Titchfield, Titchfield Data Centre, Newport, Newport Data Centre, Cardiff Bay Data Centre, Myddleton St, Siemens Data Centre, WAN, Ethernet Extension. Link types: main WAN link, backup WAN link, direct link.]

Page 13: Proposed Data Centre and base Infrastructure access strategy

What we want to achieve – logical

[Diagram: target logical layout. Labels shown: Data Centre West, Data Centre South, Newport, Titchfield, Myddleton St, Web-hosting Site, Back-up, WAN, and Remote Access and GSi (each shown twice). Link types: main WAN link, second (active) WAN link.]

Page 14: Proposed Data Centre and base Infrastructure access strategy

Data Centre Sourcing Options

Option | Set-up Cost | Running Cost p.a. | Timescale
Seek more capacity at Cardiff Bay and move all Newport equipment to there | ? | £640k | 6-9 months
Refurbish Titchfield old Data Centre and move all on-site equipment to it | £1m? | ? | 18 months
Refurbish Newport Data Centre and bring back equipment from Cardiff Bay | £1m? | ? | 18 months
Move all Data Centre requirements to Isaac only if ‘Core+’ adopted | ? | ? | 6-9 months

• Permutations of the above

Page 15: Proposed Data Centre and base Infrastructure access strategy

Benefits

• Logical Data Centre and DR service approach provides
  – Predetermined DR service models for developers to use to propose and develop solutions tailored to business needs
  – Framework against which physical Data Centre solutions can be established and reviewed
  – Standardised framework with potential economies of scale
  – Economies of scale for equipment and operation
  – Clarity of management

Page 16: Proposed Data Centre and base Infrastructure access strategy

Illustrative 3 Tier DR approach

 | Gold | Silver | Bronze
Example | National Accounts, Atlas | RPI, PPI |
Data secured | Last hour or better | Overnight | Overnight
Restoration facilities (hardware) | Dedicated | Transferred from other users | None
Restoration time – limited users | Hours | Days | Following hardware procurement (weeks)

Rows whose values cannot be assigned to individual tiers from the extracted text:
• Application secured: Weekly and on change; Weekly/monthly
• Restoration time – further users: 1 week; Indeterminate
• Restoration time – full service: 4-8 weeks
• Other features:
  – Seek to establish priority replacement facilities with suppliers
  – Support services split between Data Centres to minimise DR impact
  – Systems only installed on shared corporate equipment
  – Aggressive server virtualisation to minimise / insulate from hardware risks
• Requirements: Predetermined schedule to prioritise recovery of systems with varying criticality through the year
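As a minimal sketch, the rows of the three-tier model that are legible per tier (data secured, restoration hardware, restoration time for limited users) could be held as a small lookup that developers consult when proposing a solution. The class name `DRTier`, the function `dr_profile` and the system-to-tier assignment are illustrative assumptions, not part of the proposal. The assignment simply reads the slide's example systems as Gold (National Accounts, Atlas) and Silver (RPI, PPI).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRTier:
    """One DR service tier, limited to rows with a clear value per tier."""
    name: str
    data_secured: str
    restoration_hardware: str
    restoration_time_limited_users: str


# Values copied from the illustrative three-tier slide.
TIERS = {
    "Gold": DRTier("Gold", "Last hour or better", "Dedicated", "Hours"),
    "Silver": DRTier("Silver", "Overnight", "Transferred from other users", "Days"),
    "Bronze": DRTier("Bronze", "Overnight", "None",
                     "Following hardware procurement (weeks)"),
}

# Assumed assignment, based on the slide's example systems only.
SYSTEM_TIER = {
    "National Accounts": "Gold",
    "Atlas": "Gold",
    "RPI": "Silver",
    "PPI": "Silver",
}


def dr_profile(system: str) -> DRTier:
    """Return the DR tier profile for a system (hypothetical helper)."""
    return TIERS[SYSTEM_TIER[system]]


if __name__ == "__main__":
    for system in SYSTEM_TIER:
        profile = dr_profile(system)
        print(f"{system}: {profile.name}, data secured: {profile.data_secured}")
```

Keeping the tier definitions separate from the system assignment mirrors the idea of predetermined service models: a system's rating can change without redefining what Gold, Silver or Bronze means.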

Page 17: Proposed Data Centre and base Infrastructure access strategy

Business evaluation of DR requirements

• ONS needs:
  – a common process across businesses;
  – a single body to prioritise DR requirements.

Example Method

Business Process | F / R | 4 hours | 24 hours | 48 hours | 72 hours | 5 days | 7 days | 14 days | 28 days
Process 1 | Financial | 1 | 1 | 2 | 3 | 3 | 3 | 3 | 3
Process 1 | Reputational | 1 | 2 | 3 | 4 | 5 | 5 | 5 | 5
Process 2 | Financial | | | | | | | |
Process 2 | Reputational | | | | | | | |

(Columns from "4 hours" onwards are recovery periods.)

• Financial and Reputational impact assessed on a 1-5 scale for each period
  – e.g. recovering Process 1 in less than 24 hours could have a low financial and reputational impact, but within 48 hours would begin to have adverse effects, at 72 hours would be more damaging, and 5 or 7 days is unacceptable.
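The example method can be sketched in a few lines, using the Process 1 scores from the table. Two things here are assumptions rather than part of the proposal: taking the worse of the Financial and Reputational scores for each period, and the `max_acceptable` threshold of 2. The function names are also illustrative.

```python
from typing import Dict, List, Optional

# Recovery periods in the order they appear in the table.
PERIODS = ["4 hours", "24 hours", "48 hours", "72 hours",
           "5 days", "7 days", "14 days", "28 days"]

# Process 1 impact scores (1-5 scale) as given in the table.
PROCESS_1 = {
    "Financial":    [1, 1, 2, 3, 3, 3, 3, 3],
    "Reputational": [1, 2, 3, 4, 5, 5, 5, 5],
}


def worst_impact(scores: Dict[str, List[int]]) -> List[int]:
    """For each recovery period, take the higher of the impact scores."""
    return [max(column) for column in zip(*scores.values())]


def longest_tolerable_period(scores: Dict[str, List[int]],
                             max_acceptable: int = 2) -> Optional[str]:
    """Return the longest period whose worst impact stays within the threshold.

    Returns None if even the shortest period exceeds the threshold.
    """
    tolerable = None
    for period, impact in zip(PERIODS, worst_impact(scores)):
        if impact > max_acceptable:
            break
        tolerable = period
    return tolerable


if __name__ == "__main__":
    print(worst_impact(PROCESS_1))
    print(longest_tolerable_period(PROCESS_1))
```

With a threshold of 2, Process 1 comes out as needing recovery within 24 hours, which matches the slide's reading that impact is low up to 24 hours and starts to become adverse at 48 hours. A single prioritising body could apply the same threshold across all business processes to compare them on a common basis.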

Page 18: Proposed Data Centre and base Infrastructure access strategy

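Illustrative Gold and Silver arrangement

[Diagram: Gold and Silver systems placed across Data Centre West and Data Centre South. Labels shown: Production 1 (NA) and Standby 1 (NA); Production 2 (Atlas) and Standby 2 (Atlas); Production RPI and Production PPI (each shown twice); Development environments / Shared critical period backup; Quiesced. The extracted text does not show which data centre hosts each item.]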

Page 19: Proposed Data Centre and base Infrastructure access strategy

Base infrastructure actions and impacts

Upgrade links between offices and Data Centres
– Provides high resilience
– Provides DR but DR events limited to 1-2 days
– £100k setup and similar p.a. running costs
– 3 months to implement
– Do as part of London move

Extend GSI, RAS, Firewall mechanisms
– Provides needed extra capacity and resilience
– Provides DR if more than 1 Data Centre
– Costs tba

Establish twin data centre strategy
– Can be used as a base for resilience
– Provides effective base for DR strategy
– Costs tba but offset by existing costs
– 18 months if ONS housed
– 6-9 months if via Isaac

Page 20: Proposed Data Centre and base Infrastructure access strategy

Summary

• ONS DR facilities are extremely limited and requirements are not established
• ONS Data Centre capability is fragmented and ranges from very poor to excellent
• ONS other base infrastructure is reasonably sound; however, it requires additional resilience and capacity to be installed, e.g. RAS, GSI, WAN links
• A logical model to rationalise this is proposed
• A 2nd piece of work is in hand to determine the Odyssey requirements (inc. Atlas)
• Further work is urgently required for NUMA based services to consider such systems as RPI and CSDB
• Several Data Centre sourcing options are proposed for investigation

Page 21: Proposed Data Centre and base Infrastructure access strategy

Required Actions

• IMC
  – Confirm the strategy provides a basis to support their business objectives and risk management needs
  – Members arrange the provision of the necessary input information about the business needs in their areas
  – Support is given for a project to establish the respective Data Centre cost evaluations and selection
  – Note that projects will be brought forward to NWEB to realise ….
• EMG
  – Note and endorse this proposal if supported by IMC