DCIM: An Integral Part of the Software Defined Data Centre


Do you know what DCIM is? Discover with Concurrent Thinking how to improve your data centre efficiency, how to overcome challenges in data centres and what the future of DCIM is. Find out more here: http://www.concurrent-thinking.com/

Increased Resiliency - Improved Operational Efficiencies - Reduced Energy Costs

Data Centre Infrastructure Management

DCIM: An Integral Part of the Software Defined Data Centre

Michael Rudgyard, CTO & Founder, Concurrent Thinking

michael@concurrent-thinking.com

Who we are and what we do…

• Young, dynamic company, formed in 2010

– Based in Birmingham, UK

– Private; funded by venture capital

• We develop an intuitive, end-to-end DCIM solution

• Company Vision

– To establish Concurrent COMMAND as the DCIM of choice for both the technical and commercial management of data centres

Company profile

USPs / Key Differentiators

• An end-to-end approach

– Integration across all systems delivers a complete view of data centre performance

• Vendor neutral

– Supports all of your existing and future infrastructure

• Architected to scale

– Meets the needs of small, large and multi-site data centres

• An intuitive and highly dynamic GUI

– Key Performance Indicators drive efficiency

– Critical alerting, technical fault-finding and planning

• An open framework

– Ensures that managers can customise the product to meet their precise requirements

So what is DCIM ?

The Evolution of DCIM

Building & Facilities Management

Facilities IT Systems

DCIM arguably started as Data Centre Facilities Management …

Power Chain / Energy Monitoring

Then energy management…

Environmental Monitoring

With an initial focus on cooling…

Server health monitoring

Vendors then realised that servers provided lots of power & environmental data themselves …

OS/VM monitoring

While reports claimed that the biggest waste of energy in the data centre was due to underutilised servers…

Asset Management

Which required knowing what and where these servers were…

Cable Management

Capacity Planning

Knowing what is where, as well as space, power, cooling and network requirements, allows you to plan…

Network & storage monitoring

Application monitoring

Knowing the power, and CPU, network, IO (and even application) usage, allows you to truly understand end-to-end efficiency …

VM migration

While active control of cooling, power distribution and IT resources can bring even greater savings …

The business case for DCIM

Improve resilience and reduce risk

Increase operational efficiencies

Drive energy savings

Business Drivers

Reducing Risk

[Figure: DCIM capability map, as built up under ‘The Evolution of DCIM’]

Improving Operational Efficiencies

[Figure: DCIM capability map, as built up under ‘The Evolution of DCIM’]

Driving Energy Savings

[Figure: DCIM capability map, as built up under ‘The Evolution of DCIM’]

Data Centre Efficiency

It’s all about virtualisation

It’s all about cooling

It’s all about planning

It’s about staff efficiency

What defines an efficient Data Centre ?

Energy

“Most data centers, by design, consume vast amounts of energy in an incongruously wasteful manner”

Source: ‘Power, Pollution and the Internet’, The New York Times, 22 September 2012

• The Energy Problem:

– Energy is a critical issue for the fast-growing data centre industry

– The cost of energy is substantial and growing fast (data centres account for an estimated 1.5-2% of global electricity use)

– Significant political pressure to reduce carbon emissions

Reducing ‘Facility’ overheads; PUE

• The data centre industry initially focussed on reducing cooling (and other) overheads

• A measure of this is the Power Usage Effectiveness:

$$\mathrm{PUE} = \frac{\text{Total power used by the data centre}}{\text{Power used by IT equipment}}$$
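As a minimal sketch of the ratio above, the following Python function computes PUE from two power readings. The function name, parameter names and the example figures are illustrative assumptions, not values from the presentation.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical readings: 1,500 kW drawn by the whole site, 1,000 kW by IT equipment.
example_pue = pue(total_facility_kw=1500.0, it_equipment_kw=1000.0)
print(f"PUE = {example_pue:.2f}")  # prints "PUE = 1.50"
```

A PUE of 1.0 would mean that every watt entering the site reaches the IT equipment; anything above that is facility overhead such as cooling and power distribution losses.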

Virtualisation

• Virtualisation shifts the focus to server rationalisation

• However, virtualisation often takes place:

– With few changes to the power and cooling infrastructure (the IT load falls while the overheads do not, so the PUE increases!)

– With little historical knowledge of server utilisation pre-virtualisation
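The PUE effect can be illustrated with a small worked example. It assumes, purely for illustration, that the facility overhead stays fixed at 500 kW while virtualisation cuts the IT load from 1,000 kW to 600 kW; in practice the cooling load would usually fall a little as well.

```python
def pue_with_fixed_overhead(it_kw: float, overhead_kw: float) -> float:
    """PUE when total power is the IT load plus a facility overhead."""
    return (it_kw + overhead_kw) / it_kw


OVERHEAD_KW = 500.0  # assumed unchanged by virtualisation (illustrative)

before = pue_with_fixed_overhead(it_kw=1000.0, overhead_kw=OVERHEAD_KW)
after = pue_with_fixed_overhead(it_kw=600.0, overhead_kw=OVERHEAD_KW)

print(f"PUE before: {before:.2f}, after: {after:.2f}")
```

Total power falls from 1,500 kW to 1,100 kW, which is a real saving, yet the PUE gets worse because the same overhead is now divided by a smaller IT load.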

Design vs. Operational Efficiency

• Most new data centres are currently designed against PUE targets

– For a given IT hardware capacity, PUE is a good planning metric

• But what if the servers are not doing any useful work in practice?

• PUE is actually a very poor operational metric

• We really need a measure of IT Usage Effectiveness

– i.e. how effectively power is used to deliver the necessary IT services

– Against which optimisation can be performed for maximum effect
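To see why PUE alone says little about operations, consider two hypothetical sites with identical power figures but different server utilisation. The `kw_per_utilised_it_kw` property below is an illustrative proxy invented for this sketch, not a standard metric and not a definition given in the presentation.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    total_kw: float
    it_kw: float
    mean_cpu_utilisation: float  # fraction between 0.0 and 1.0

    @property
    def pue(self) -> float:
        return self.total_kw / self.it_kw

    @property
    def kw_per_utilised_it_kw(self) -> float:
        # Hypothetical proxy: facility power divided by the share of IT
        # power that is attributed to utilised capacity.
        return self.total_kw / (self.it_kw * self.mean_cpu_utilisation)


busy = Site("busy", total_kw=1500.0, it_kw=1000.0, mean_cpu_utilisation=0.50)
idle = Site("idle", total_kw=1500.0, it_kw=1000.0, mean_cpu_utilisation=0.10)

for site in (busy, idle):
    print(site.name, round(site.pue, 2), round(site.kw_per_utilised_it_kw, 2))
```

Both sites report the same PUE, but under this proxy the mostly idle site spends five times as much power per unit of utilised capacity.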

[Chart, scale 0 to 1: Compute Utilisation, Effectiveness; Storage Utilisation, Effectiveness; Network Utilisation, Effectiveness]

• The industry has struggled to define ‘standard’ metrics that are meaningful (e.g. PUE, ITUE, ITEE, FVER)

• DCIM is a tool that should enable customers to use any standard, and even define their own KPIs

Data Centre KPIs
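One way to picture user-defined KPIs is a small registry of functions that each map raw readings to a single number. This is a sketch under assumptions: the registry, the decorator and the metric names are hypothetical and do not describe any particular DCIM product.

```python
from typing import Callable, Dict

Metrics = Dict[str, float]
KpiFunction = Callable[[Metrics], float]

KPI_REGISTRY: Dict[str, KpiFunction] = {}


def kpi(name: str) -> Callable[[KpiFunction], KpiFunction]:
    """Decorator that registers a KPI function under the given name."""
    def register(fn: KpiFunction) -> KpiFunction:
        KPI_REGISTRY[name] = fn
        return fn
    return register


@kpi("PUE")
def pue(m: Metrics) -> float:
    return m["total_kw"] / m["it_kw"]


@kpi("mean_utilisation")
def mean_utilisation(m: Metrics) -> float:
    # A custom KPI: plain average of compute, storage and network utilisation.
    return (m["compute_util"] + m["storage_util"] + m["network_util"]) / 3


def evaluate(m: Metrics) -> Dict[str, float]:
    """Evaluate every registered KPI against one set of readings."""
    return {name: fn(m) for name, fn in KPI_REGISTRY.items()}


readings = {"total_kw": 1500.0, "it_kw": 1000.0,
            "compute_util": 0.3, "storage_util": 0.6, "network_util": 0.3}
report = evaluate(readings)
print(report)
```

Adding another KPI only requires writing one more decorated function; the evaluation loop does not change.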

Optimising for change

DCIM provides hard evidence for making business decisions

• Which servers should be replaced, virtualised or retired?

– Compare utilisation across the estate

• Which servers are better at delivering a particular service?

– Provides useful procurement information

• When should equipment be retired?

– Sweating IT and cooling assets is often a very bad idea indeed!

– DCIM can combine power, utilisation and asset information (e.g. depreciation) and provide solid CAPEX vs. OPEX arguments for replacing/upgrading assets

• Do I really need to invest in new equipment or a new data centre?
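A simple energy payback calculation illustrates the kind of CAPEX vs. OPEX argument described above. All figures are hypothetical, and the sketch deliberately ignores depreciation, maintenance and licensing, which a real assessment would include.

```python
HOURS_PER_YEAR = 8760


def annual_energy_cost(it_watts: float, pue: float, price_per_kwh: float) -> float:
    """Yearly energy cost of an IT load, including facility overhead via PUE."""
    return it_watts / 1000 * pue * HOURS_PER_YEAR * price_per_kwh


def payback_years(purchase_cost: float, old_watts: float, new_watts: float,
                  pue: float, price_per_kwh: float) -> float:
    """Years of energy savings needed to recover the purchase cost."""
    saving = (annual_energy_cost(old_watts, pue, price_per_kwh)
              - annual_energy_cost(new_watts, pue, price_per_kwh))
    if saving <= 0:
        return float("inf")  # the replacement never pays back on energy alone
    return purchase_cost / saving


# Hypothetical case: four old 400 W servers consolidated onto one new 400 W server.
years = payback_years(purchase_cost=3000.0, old_watts=1600.0, new_watts=400.0,
                      pue=1.5, price_per_kwh=0.10)
print(f"Payback: {years:.1f} years")
```

With measured power and utilisation per asset, the same calculation can be run across the whole estate to rank replacement candidates.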

The Challenges for DCIM

Commercial Challenges

• Facilities and IT are often managed independently

– Complicated sell for an end-to-end solution, mitigated by having a modular application

– However, future data centres are likely to follow a more unified management approach (cf. Google, Yahoo, Facebook, etc.)

• Little ‘C’-level visibility of data centre risks, costs & efficiency

– Power is not ‘charged’ to IT; CAPEX decisions are made without evidence; etc.

– Ironically, providing this visibility is what DCIM sets out to achieve!

• Staff do not have the time to implement DCIM

– Ironically, DCIM relieves them of many manual tasks once it is in place

• Co-location providers have different needs to their customers

– Need for unified DCIM solutions that target both users as this sector grows

Technical Challenges – Product Breadth

• The ever-increasing scope of DCIM

– Likely to leave non-specialist (e.g. smaller hardware-focussed) providers behind

– Requires a well-thought-out product strategy

• Need to support diverse equipment from multiple vendors

– Drives a standards-based, agentless approach: e.g. SNMP, Modbus, BACnet, 1-wire, IPMI, WMI etc.
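A vendor-neutral, agentless design is often structured as one adapter per protocol behind a common interface. The sketch below shows that pattern only; the class names are hypothetical, and `StaticCollector` returns canned values in place of a real SNMP, Modbus or IPMI client so that the example runs without hardware.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Collector(ABC):
    """Common interface that every protocol adapter implements."""

    @abstractmethod
    def read(self) -> Dict[str, float]:
        """Return the latest readings keyed by '<device>.<metric>'."""


class StaticCollector(Collector):
    """Hypothetical stand-in for a real protocol client (canned readings)."""

    def __init__(self, protocol: str, device: str, readings: Dict[str, float]):
        self.protocol = protocol
        self.device = device
        self.readings = readings

    def read(self) -> Dict[str, float]:
        return {f"{self.device}.{metric}": value
                for metric, value in self.readings.items()}


def poll_all(collectors: List[Collector]) -> Dict[str, float]:
    """Merge the readings from every adapter into one flat view."""
    merged: Dict[str, float] = {}
    for collector in collectors:
        merged.update(collector.read())
    return merged


estate = [
    StaticCollector("snmp", "pdu-1", {"power_kw": 3.2}),
    StaticCollector("modbus", "crac-1", {"supply_temp_c": 18.5}),
]
snapshot = poll_all(estate)
print(snapshot)
```

Supporting a new protocol then means adding one more `Collector` subclass, while the polling and reporting code stays the same.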

Manage everything, from anywhere

BACnet network

Modbus RS485

‘IT’ Ethernet network

Management Ethernet network

1-wire sensor network

Protocol converter

Facilities management: BMS, branch circuits, UPS systems, generators, CRAC units, environmental sensors etc.

Environmental monitoring: low-cost, 1-wire sensors

PDU & sensor management

Rack cooling & access management

Server health management

OS, VM and application monitoring

Technical Challenges – Scale Out

• Imagine a high density data centre with ‘just’ 10,000 servers

– i.e. 300-500 racks and a similar number of PDUs and sensors

– and up to (say) 16 VMs per server

• You might want to monitor (derive reports from etc.)

– 300-1500 environmental sensors

– 20-30 data-points per server (IPMI, Power) = 200k-300k points

– 20-100 data-points per OS/VM (e.g. SNMP, WMI) = 3.2M-16M points

– … as well as user and application data.

• That’s a lot of information if you sample every 10 to 60s!

– But ‘scale-out’ data centres can be ten times this size…
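The figures above can be reproduced with a few lines of arithmetic. The daily sample counts are an extrapolation from the slide's own numbers (server and VM data-points only, environmental sensors excluded), not figures quoted in the presentation.

```python
SERVERS = 10_000
VMS_PER_SERVER = 16
SECONDS_PER_DAY = 86_400


def point_range(count: int, low: int, high: int) -> tuple:
    """Lower and upper bound on data-points for `count` monitored entities."""
    return count * low, count * high


server_points = point_range(SERVERS, 20, 30)                # 200k to 300k
vm_points = point_range(SERVERS * VMS_PER_SERVER, 20, 100)  # 3.2M to 16M


def samples_per_day(points: int, interval_s: int) -> int:
    """Number of samples collected per day at a fixed sampling interval."""
    return points * SECONDS_PER_DAY // interval_s


fewest = samples_per_day(server_points[0] + vm_points[0], interval_s=60)
most = samples_per_day(server_points[1] + vm_points[1], interval_s=10)
print(f"{fewest:,} to {most:,} samples per day")
```

Even the low end is several billion samples per day for a single site of this size.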

Things that won’t work (and that we don’t do !):

• Using a ‘single-instance’ software architecture

– Information will need to be processed in a distributed manner

• Putting unrefined data in a standard SQL database

– or you’ll need a data centre to store, process & retrieve this data!

• Expecting simple GUIs (e.g. lists and trees) to be effective

– Visualisation becomes a key aspect of usability

– Increased need for automation and data consolidation
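One common form of data consolidation is to reduce raw samples to per-window summaries before they are stored. The sketch below is a generic illustration of that idea, not the approach used by any specific product; the five-minute window and the sample values are assumptions.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def consolidate(samples: Iterable[Tuple[int, float]],
                window_s: int = 300) -> Dict[int, Tuple[float, float, float]]:
    """Reduce (timestamp_s, value) samples to (min, mean, max) per time window."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        window_start = timestamp - timestamp % window_s
        buckets[window_start].append(value)
    return {start: (min(values), mean(values), max(values))
            for start, values in sorted(buckets.items())}


# Hypothetical temperature readings taken every 60 seconds.
raw = [(0, 20.0), (60, 22.0), (120, 24.0), (300, 30.0), (360, 32.0)]
summary = consolidate(raw)
print(summary)
```

Keeping the minimum and maximum alongside the mean preserves short spikes that a plain average would hide, while cutting the stored volume by the number of samples per window.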

Technical Challenges – Scale Across

• Co-location / cloud providers are interested in providing their customers with portals for managing their own systems

• This provides further challenges for DCIM providers:

– Providing customers with relevant ‘facilities’ data

– Granular monitoring of data (e.g. power usage) in highly dynamic clouds

– Systems that are capable of scaling across many users
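As a rough illustration of exposing ‘facilities’ data per customer, the sketch below attributes rack-level power readings to tenants. The rack names, tenant names and the one-tenant-per-rack assumption are hypothetical; shared racks or dynamic cloud workloads would need finer-grained metering.

```python
from collections import defaultdict
from typing import Dict


def power_by_tenant(rack_power_kw: Dict[str, float],
                    rack_owner: Dict[str, str]) -> Dict[str, float]:
    """Sum rack-level power readings per tenant (one tenant per rack assumed)."""
    totals: Dict[str, float] = defaultdict(float)
    for rack, kw in rack_power_kw.items():
        totals[rack_owner[rack]] += kw
    return dict(totals)


rack_power_kw = {"rack-1": 3.0, "rack-2": 4.5, "rack-3": 2.5}
rack_owner = {"rack-1": "tenant-a", "rack-2": "tenant-a", "rack-3": "tenant-b"}

usage = power_by_tenant(rack_power_kw, rack_owner)
print(usage)
```

A customer portal would then show each tenant only its own entry from this mapping.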

The future of DCIM

Where DCIM Meets Cloud

Facilities IT Systems

DCIM & Cloud Infrastructure Management are likely to merge…

DCIM

Cloud Infrastructure Management

DCIM & the software-defined data centre

• In the future, we will move to the autonomous data centre

– Emphasis moves from monitoring to automated management by software

– Potential for very significant operational and energy savings…

• Real-time optimisation of complete service delivery

– Migration of virtual machines based on usage; active power control; localised cooling etc.
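Usage-based VM migration can be pictured as a packing problem: place VMs on as few hosts as possible so that the remaining hosts can be powered down. The sketch below uses the textbook first-fit-decreasing heuristic as a stand-in; it is not the method described in the presentation, and the loads and capacity are hypothetical CPU percentages.

```python
from typing import Dict, List


def consolidate_vms(vm_load: Dict[str, int], host_capacity: int) -> List[List[str]]:
    """Pack VMs onto hosts using first-fit decreasing on CPU demand."""
    hosts: List[list] = []  # each entry: [remaining_capacity, [vm names]]
    # Place the largest VMs first; each goes on the first host with room.
    for vm, load in sorted(vm_load.items(), key=lambda kv: kv[1], reverse=True):
        if load > host_capacity:
            raise ValueError(f"{vm} does not fit on any host")
        for host in hosts:
            if host[0] >= load:
                host[0] -= load
                host[1].append(vm)
                break
        else:
            # No existing host has room, so bring another host into use.
            hosts.append([host_capacity - load, [vm]])
    return [vms for _, vms in hosts]


# Hypothetical CPU demand per VM, in percent of one host.
vm_load = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10}
placement = consolidate_vms(vm_load, host_capacity=100)
print(placement)  # [['a', 'b', 'e'], ['c', 'd']]
```

A production scheduler would also have to respect memory, network, storage and QoS constraints, and weigh the cost of each migration against the expected saving.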

• There are numerous potential issues:

– Software platforms will need to talk to even more diverse systems

– The software will need to be scalable and able to deal with highly dynamic environments

– The control mechanisms will need to be defined by the data centre managers and IT team, using simple interfaces that abstract complexity

– There are many optimisation constraints relating to: physical issues; the IT, network and storage infrastructure; the required QoS etc.

• But one of the biggest issues is perceived risk

– Data centres are ‘mission critical’ and highly conservative

Questions ?

Visit our website http://www.concurrent-thinking.com/ for more information

Or fill in our contact form with any enquiries and we will endeavour to reply as quickly as possible.
