8/13/2019 Bringing FM and IT together – Volume II
1/18
Copyright Quocirca © 2014
Clive Longbottom
Quocirca Ltd
Tel : +44 118 9483360
Email: [email protected]
Bringing FM and IT together – Volume II
A series of articles brought together from Quocirca's writings for SDC during 2013
January 2014
Quocirca continued to write articles for SearchDataCenter throughout 2013,
looking at how facilities management (FM) and information technology (IT)
professionals needed to work together more closely than ever. This
report pulls those articles together as a two-volume set for ease of access.
Designing the data centre for tomorrow
Starting a new data centre design now is likely to be different to any data centre that you may
have been involved with in the past. A new way of thinking is required; one where the facility
can be just as flexible as the IT equipment held within it.
DCIM – a story of growing, overlapping functionality
Data centre infrastructure management (DCIM) software has come a long way. It is now
beginning to threaten existing IT systems management software in many ways – and this could
be where it runs into trouble. Just how far can, and should, DCIM go?
The data centre for intellectual property management
It may have cost millions to build your latest data centre, and it may house some expensive
equipment. However, none of this really matters to the organisation. It is the intellectual
property that matters to it – and the IT platform and the facility it is in must be architected to
reflect this.
The data centre and the remote office
Organisations are increasingly diffuse – decentralised through remote offices with mobile and
home workers. Yet all of these need to be well-served by the organisation’s technology. How
can this be best provided without breaking the bank – or compromising security?
Cascading data centre design
Are you a "server hugger"? If so, it is probably time to review your position and your worth to the
organisation. The future will be around hybrid IT with different data centre facilities playing their
part. Some of these may be under your control; others will be under various levels of control
from others.
Disaster recovery and business continuity – chalk and cheese?
Disaster recovery and business continuity should not be treated as a single entity. These two
distinct capabilities need their own teams working on them to ensure that the organisation gets
an overall approach that meets its own risk profile and is managed within the cost parameters
that the business can operate in.
Managing IT – converged infrastructure, private cloud or public cloud?
IT can be architected in many different ways. In some cases, a physical, one-application-per-(clustered)-server approach may still be the way forwards, whereas for others, it may be virtualisation or
cloud. The hardware that underpins these different approaches may also be changing, from rack-
mount, self-built systems to modular converged systems. Cloud computing throws more
variables into the mix – just what is an organisation to do?
The Tracks of my Tiers
There is a concept of the “tiering” of data centres which can be used by an organisation to see
whether an external facility will offer the levels of availability that it requires for certain IT
workloads. What are these tiers, and what do they mean to an organisation?
What to look for from a Tier III data centre provider
If you decide to go for an Uptime Institute Tier III data centre provider, what should you look out
for? As accredited Tier III facilities are few and far between, are there other things that will
enable a non-Uptime Institute facility to be chosen instead?
Designing the data centre for tomorrow
Since the dawning of the computer era in the 1960s, data centre design has essentially been evolutionary. Sure, there have been moves from mainframes to distributed computers; from water cooling to air cooling; from monolithic UPS
and power systems to more modular approaches. Yet the main evolution has been from small data centres to large
data centres.
Even where an organisation comes to the conclusion that the cost of the next data centre is too large for itself, the
move has been to a co-location facility where future growth can be allowed for.
Now, the world is changing. Application rationalisation, hardware virtualisation and consolidation have led to
organisations finding themselves with a large facility and a need to only house 50% or less of what they were running
previously. New, high-density server, storage and network equipment, along with highly engineered systems, such as
VCE VBlocks, Cisco UCSs, IBM Pure Systems and Dell Active Systems mean that less space is required for more effective
direct business compute power.
And then, cloud computing comes in. Suddenly, data centre managers and systems architects no longer just
have to decide how best to support a workload, but also through what means. A workload that would normally sit
on a stack totally owned by the organisation may now be put into co-location, or be outsourced through infrastructure,
platform or software as a service (I, P or SaaS).
Even where decisions are made to keep specific workloads in-house, it makes no sense to design a data centre to
house that workload in the long term. Cloud is still an immature platform, but within the next few years, it is likely to
become the platform of choice for the mainstream, and those organisations that have built a data centre for hosting
a specific application over a long period of time could see themselves at a disadvantage.
To design a data centre for the future, there are two parts to consider – the facility itself and the IT hardware that
it houses. From a hardware point of view, a full IT lifecycle management (ITLM) approach can ensure that a dynamic
infrastructure is maintained (reference previous article on ITLM). Use of the hardware assets can grow and shrink as
the needs change, with excess equipment being sold to recoup some cost. Through the use of subscription pricing,
software licenses can also be controlled, through signing up or shutting down subscriptions as required.
The main issues revolve around the facility. A data centre is a pretty fixed chunk of asset – if it is built to house 10,000
square feet of space and the business finds that it only needs 5,000 square feet, the walls cannot be that easily moved
to serve only this area. Even where new walls can be implemented, for example to create new office space, this is
only a small part of the problem solved.
A data centre facility is often built with a designed and relatively fixed layout for the services offered. Power
distribution units will be hard-fitted to the walls and other areas of the data centre; CRAC-based cooling systems will
be fixed to the roof in specific places and UPSs and auxiliary generators will be sized and sited to suit the original data
centre design.
So, a new approach to the facility is the key to designing and building tomorrow’s data centre.
The first place to start is with the physical design. If sloping sub-floors and raised data centre floors are preferred to
deal with any flooding issues (either natural or through the use of liquid-based fire suppressant), then make this
multiple gullies, rather than a single "V"-shaped system. This way, if downsizing is required, there will be raised
walls marking off each gully that can be used to build new walls from, without impacting the capability of the sub-floor
to allow drainage for the data centre itself.
Next is the cabling within the data centre. This will need to be fully structured, with data and power being carried
through separate paths and with an easy means of re-laying any cables should the layout of the data centre change.
Then, there is power distribution itself. Rather than build these against walls or pillars, it may be better to make them
free standing with power feeds coming from structured cabling from the roof. This way, should a redesign be
required, the power distribution is as flexible as the rest of the IT equipment and can be easily relocated.
With cooling systems, a move to free-air cooling or other low-need systems will mean that less impact will be felt in
redesigning the cooling when the data centre changes size. If combined with effective hot and cold aisle approaches
with ducted cooling, the cooling system can be sized appropriately and placement is less of an issue.
Even where a CRAC-based system is perceived to be needed, a move to a more modular system with multiple,
balanced, variable speed CRAC units will make life easier if the data centre needs to be resized.
The same goes for UPSs and auxiliary generators – a monolithic system could leave an organisation looking at a need
to buy a completely new unit if the needs of the data centre change, or having a massively over-engineered system
in place if it carries on using the same old UPS or generator when the data centre shrinks. As most UPS systems used
these days will be in-line, every single percentage loss of efficiency could be against the rating of the UPS – not against the actual power used by the equipment in the data centre. With a generator, its fuel usage will be pretty much in
line with its rating, so even when running below its rated power, it will use a lot more fuel than one which is correctly
engineered for the task.
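To put rough numbers on this, consider a sketch of what an oversized in-line UPS costs in wasted energy. The fixed-loss figure of 5% of rating is an assumption chosen purely for illustration, not a vendor figure:

```python
# Illustrative only: assumes an in-line UPS has fixed losses of 5% of its
# rated capacity, so losses track the rating rather than the actual IT load.
def ups_loss_kw(rating_kw, loss_fraction=0.05):
    """Fixed losses of an in-line UPS, proportional to its rating."""
    return rating_kw * loss_fraction

oversized = ups_loss_kw(500.0)    # monolithic unit sized for the original facility
right_sized = ups_loss_kw(125.0)  # modular unit sized for the shrunken load
print(f"Oversized UPS loss:   {oversized:.2f} kW")
print(f"Right-sized UPS loss: {right_sized:.2f} kW")
print(f"Excess energy per year: {(oversized - right_sized) * 8760:.0f} kWh")
```

Even with these made-up figures, the oversized unit wastes over 160 MWh a year against the rating rather than the load – which is why modular, resizeable UPS capacity matters.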
If your organisation is reaching a point where a new data centre is seen to be a necessity, bear in mind that the IT
world is going through a bigger change than it has ever done before. Planning to embrace this change will save money
in the mid- to long-term, and it will provide a far more flexible platform for the business.
DCIM – a story of growing, overlapping functionality
An organisation's technical environment can be seen as having two distinct parts – the main IT components of servers,
storage and networking along with the facility or facilities that are required to house the IT. Historically, these two
areas have fallen under the ownership and management of two different groups: IT has fallen under the IT
department, while the facility has fallen under the facilities management (FM) group.
This leads to problems, as FM tend to see the data centre as just another building to be managed alongside all the
other office and warehouse buildings, whereas IT tend to see the data centre as the be-all and end-all of their purpose
in life. One group's priorities may not match the other's – and the language that each group speaks can be
subtly (or not so subtly) different.
Another problem is emerging due to cloud. In the past, the general direction for a data centre has been for it to grow
as the business grows; cloud can now mean that the IT equipment required within the data centre shrinks
rapidly as workloads are pushed out to public cloud – yet managing this is difficult where the facilities equipment (such as
UPSs, CRAC units and power distribution systems) consists of monolithic items.
In order to ensure that everything runs optimally and supports the business in the way that is required, a single form
of design, maintenance and management is required that pulls FM and IT together and also enables "what-if"
scenarios to be run so that future planning can be carried out effectively. This has been emerging over the past few
years as data centre infrastructure management (DCIM).
DCIM systems started off more as a tool for the FM team, as part of a building information modelling
(BIM) tool. BIM software enables a building to be mapped out and the major equipment to be placed within a physical
representation, or schematic, of the facility. DCIM made this specific to the needs of a data centre, holding information
about power distribution, UPS and cooling systems, along with power cabling, environmental monitoring sensors
and so on. The diagrams could be printed out for when maintenance was required, or given to the IT team so that they could then draw in the IT equipment knowing where the facilities bits were.
It soon became apparent that allowing the IT equipment to be placed directly in the schematic was useful for both IT
and FM. This led to a need for DCIM systems to bring in asset discovery systems alongside databases of the physical
appearance and the technical description of the IT equipment so that existing data centre layouts could be more easily
created.
This brought DCIM systems into competition with the asset discovery and management systems that were part of the
IT systems management software. Interoperability between the two systems is not always available, yet a common
database, along the lines of a configuration management database (CMDB), makes sense to provide a single true view of
what is in a data centre.
A differentiation between DCIM systems is often how good their databases of equipment are – some will not be
updated with new equipment details on a regular basis; others will use “plate values” for areas such as power usage.
The difference between using a plate value (just taking the rated power usage) and the actual energy usage measured
in real time can be almost an order of magnitude, which can lead to over-engineering of power, backup and cooling
systems.
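As a quick illustration of that gap, compare a server's plate value with measured telemetry. The figures here are hypothetical, chosen only to show the shape of the calculation:

```python
def oversizing_factor(plate_w, measured_w):
    """How far the nameplate rating overstates the real measured peak draw."""
    return plate_w / max(measured_w)

# Hypothetical 1U server: 750 W on the plate; real-time power samples in watts
samples = [180, 210, 195, 240, 205]
print(f"Plate value overstates peak draw by {oversizing_factor(750, samples):.2f}x")
```

A DCIM system that plans power, backup and cooling from the 750 W figure would provision roughly three times the capacity this server ever actually draws.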
2D schematics have given way to 3D in many DCIM systems, so that rack-based systems can be engineered in situ
and viewed from multiple directions to make sure that pathways for humans remain traversable. 3D schematics
also allow for checking to see if new equipment can be brought directly into a spot in the data centre, or if there are
too many existing objects in the way.
From this came the capability to deal with “what if?” scenarios. For example, would placing this server in this rack
here cause an overload on this power distribution block? Would placing these power transformers here cause a hot
spot that could not be cooled through existing systems? Again, such capabilities help both FM and IT work together
to ensure that the data centre is optimally designed and gives the best support to the business.
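A minimal sketch of such a what-if check, assuming a simple power budget per power distribution unit (PDU) with an 80% headroom rule – both the figures and the rule are assumptions for illustration:

```python
def can_place(new_server_w, rack_draw_w, pdu_limit_w, headroom=0.8):
    """What-if check: would adding this server push the PDU past its safe load?
    The 80% headroom figure is an assumption for the example."""
    return sum(rack_draw_w) + new_server_w <= pdu_limit_w * headroom

rack = [450, 450, 380, 520]        # measured draw of servers already in the rack, watts
print(can_place(400, rack, 3000))  # within 80% of a 3 kW PDU
print(can_place(900, rack, 3000))  # would exceed the safe load
```

A real DCIM system runs the same style of check across power, cooling and weight constraints at once, using the measured values discussed above rather than plate values.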
With 3D visual representations and granular details of the systems involved along with real time data from
environmental sensors, the use of computational fluid dynamics then comes into play. Using empirical data from the
DCIM system to see what happens to cooling flows as systems are changed and new equipment added ensures that
hot spots are avoided right from the start.
The problems for DCIM lie mainly in trying to give a single tool that covers two different groups. The FM team will
often have their own BIM systems in place, and see the data centre as “just another building” with a few additional
needs. To the IT team, the data centre is the centre of their universe, but they tend to see it as a load of interesting
bits surrounded by a building. The need for the two teams to not only talk, but work from common data sources to
create an optimal solution is not always seen as a priority. Even where DCIM is seen as being a suitable way forward,
there will be a need to integrate it into existing systems so as not to replicate too much and create a whole new set
of data silos.
Vendors have also been part of the problem – the main IT vendors have been poor on covering the facility, preferring
to stick with archetypal systems management tools that just look at the IT equipment. It has been down to the
vendors of the UPSs and other “facilities” equipment alongside smaller new-to-market vendors to come up with full-
service DCIM tools and try and create a market.
Public free cloud – SaaS or function as a service (FaaS), such as Google or Bing Maps where a function is
taken on a best efforts support basis.
This mix of data centres also leads to a mix in areas where data and information will lie. No longer can an organisation
simply centralise all its data into a single storage area network (SAN).
On top of this is the lack of capability for the organisation to draw a line around a specific group of people and say
“this is the organisation”. The need for organisations to work across an extended value chain of contractors,
consultants, suppliers (and their suppliers), logistics companies, customers (and often their customers) means that
data and information flows are often moving into areas where the organisation has less control.
This is all made more complex through the impact of bring your own device (BYOD). The unstoppable tide of end
users expensing their own devices and expecting them to work with the enterprise’s own systems, and then
downloading consumer apps from appstores and so creating data and information in extra places unknown to the IT
department means that the value of data and information is being increasingly diluted.
IT now has to accept that the data centre itself is just part of the equation, and start to move to a model that pays far
more attention to the data and information the organisation is dependent upon.
To manage this, it is a waste of time looking at how firewalls should be deployed – after all, just where should this
wall be positioned along the extended value chain? It is equally wrong to look at applying security just at the
application or hardware levels, as, as soon as someone manages to breach that security, they will have free rein to
roam around the rest of the information held in that information store.
No, data and information now have to be secured and managed at a far more granular level, with users being identified
at different levels – from them as an individual, through their role within a team, to their level of corporate security
clearance. On top of this needs to be contextual knowledge, such as where the person is accessing the data from and
from what sort of device. Then the data itself needs to be classified against an agreed security taxonomy – which
could be as simple as tagging data and information as being "Public", "Commercial in confidence" or "For your eyes
only".
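A sketch of how such a decision might combine clearance, classification and context. The labels follow the taxonomy above; the rules and names are illustrative only – a real deployment would map them to the organisation's own identity systems:

```python
# Illustrative taxonomy and rules only; not a production access-control model.
CLEARANCE = {"Public": 0, "Commercial in confidence": 1, "For your eyes only": 2}

def may_access(user_clearance, data_label, managed_device, on_vpn):
    """Combine user clearance, data classification and access context."""
    level = CLEARANCE[data_label]
    if level == 0:
        return True                      # public data: no restriction
    if not (managed_device and on_vpn):
        return False                     # sensitive data needs a trusted context
    return user_clearance >= level

print(may_access(1, "Commercial in confidence", True, True))  # granted
print(may_access(2, "For your eyes only", True, False))       # refused: off-VPN
```

The key point is that the decision is made per asset and per request, not once at a perimeter.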
Touchpoints need to be implemented such that the organisation can see who is attempting to access information
assets – this is best done through virtual private networks (VPNs) and hybrid virtual desktops, which can enforce the
means by which corporate assets are accessed. Through these touchpoints, information security such as encryption
of data at rest and on the move, along with data leak prevention (DLP) and digital rights management (DRM), can be
applied alongside information rules based on access rights for the person and their context.
Mobile device management (MDM) can help to keep an eye on what devices are attaching to the network, and can
help to air lock them from full access to systems until appropriate identification of the individual using the device has
been made. This may require multi-level identification going well beyond the normal challenge and response
username/password pair, maybe to include single use access codes or biometrics.
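Single-use access codes of this kind are typically time-based one-time passwords (TOTP, as in RFC 6238). A minimal sketch using only the Python standard library:

```python
import hashlib
import hmac
import struct

def totp(secret: bytes, at: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 time-based one-time password (HMAC-SHA-1, 6 digits)."""
    counter = struct.pack(">Q", at // step)          # time window index
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test secret and timestamp
print(totp(b"12345678901234567890", at=59))
```

Because the code changes every 30 seconds, a captured username/password pair alone is not enough to get past the air lock described above.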
All of this then means that information assets are only accessible by the right people in the right place. Even if someone else can get hold of the digital representation of the asset, it will still be useless to them, as it will be
encrypted and controlled by a DRM certificate where necessary.
All of this needs changes in how the data centre operates – each aspect of the above will require new systems, new
applications and agreements with the business of what information security means to them. Much of this can now
be done outside of the corporately owned data centre – managed security providers are appearing which can provide
the functions required on a subscription basis without the need for massive capital investment by your organisation.
The heading to this article was “The data centre for intellectual property management”. As such, the title is
completely wrong. What needs to be put in place is an architectural platform for intellectual property management
– and this will transcend the single facility and move far over into a hybrid mix of needs across a range of different
facilities.
The data centre and the remote office
The remote office has always been a bit of a problem when it comes to technology. The employees in these offices
are still just as dependent on technology as their counterparts in the main offices – but they have little to no
qualifications to look after any technology that is co-located with them. Therefore, the aim has tended to be to
centralise the technology and provide access to the remote employees as required.
This has not tended to work well. Slow response and poor connectivity availability have pushed users away from
sticking with the preferred centralised solution, instead working around the systems with processes and solutions
they have chosen themselves.
As bring your own device (BYOD) has become more widespread, each individual has become their own IT expert –
unfortunately, with a little knowledge being a dangerous thing. The choice of consumer apps to carry out enterprise
tasks is leading to a new set of information silos – ones that central IT has no capability of managing; ones where
pulling together the disparate data for corporate analysis and reporting is impossible.
Architecting a new platform that meets everyone’s needs should now be possible – it just requires a little bit of give
and take.
Each individual has to accept that what pays their salary is a much greater entity – the organisation. If they do not
work in a manner that helps the organisation, the capability for the salary to be paid could be impacted. Therefore,
working in a manner that is organisation-centric is a basic requirement of having a job – and I don’t care that the
millennials scream that they won’t work for any organisation that doesn’t allow them 7 hours a day time for posting
on Facebook.
What IT has to look at is how best to put in place the right platform to meet the organisation’s and the individual’s
needs. This should start with a need for centralisation of the data – as long as the organisation can access all data and
information assets, it can analyse these and provide the findings through to those in the organisation who can then
make better informed decisions against all the available information.
Therefore, data and files should be stored within a single place – eventually. This does not stop enterprise file sharing
systems, such as Huddle, Box or Citrix ShareFile from being used; it just means that the information held within these
repositories needs to be integrated into the rest of the platform. Capturing the individual’s application usage is
important – being able to steer them in the direction of corporate equivalents of consumer applications can help
minimise problems at a later date when security is found to be below the organisation’s needs, or the lack of the
capability to centralise data leads to a poor decision being made.
It may well be that remote users would be best served through server-based computing approaches such as virtual
desktop infrastructure (VDI). Using modern acceleration technologies such as application streaming or Numecent's
Cloudpaging will provide very fast response for the remote user, while allowing them to travel between remote offices
and larger offices and still have full access to their specific desktop. Citrix, Centrix Software and RES Software also
provide the capabilities for these desktops to be accessible from the user’s BYOD devices – and apply excellent levels
of enterprise security to the system as well. What an organisation should be looking for is the capability to “sandbox”
the device – creating an area within any device which is completely separate to the rest of it. Through this means,
any security issues with the device can be kept at bay; enterprise information can be maintained within the sandbox
with no capability for the user to cut and paste from the corporate part to the consumer part of the device. Should
the user leave the organisation, the sandbox can be remotely wiped without impacting the user’s device itself.
For remote offices of a certain size, or which are in a geographic location where connectivity may be too slow for a
good end-user experience, a "server room" may be warranted to hold specific applications that the
office needs, and maybe to run desktop images for them locally. Data and information created can be replicated in
the background, using WAN acceleration from the likes of Veeam, Symantec, Silver Peak or Riverbed, ensuring that it
is still all available centrally.
Where such a server room is put in place, it is important to ensure that it can be run “lights out” from a more central
place. Depending on the person at the remote office who may have the biggest PC at home is no way to support a
mission critical or even business important environment. Dedicated staff with the right qualifications must be able to
log in remotely and carry out actions on the systems as required. Wherever possible, patching and updating should
be automated with full capability to identify which systems may not be able to take an upgrade (for example due to
a lack of disk space or an old device driver) and either remediate the issue or roll back any updates as required. Here,
the likes of CA and BMC offer good software management systems built around solid configuration management
databases (CMDBs).
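Such a pre-upgrade check can be sketched as follows. The thresholds and system fields are assumptions for illustration, not any particular vendor's schema:

```python
def preflight(system, min_free_gb=5, min_driver=(2, 0)):
    """Pre-upgrade checks: enough disk space and a recent enough device driver.
    Thresholds here are illustrative assumptions."""
    issues = []
    if system["free_gb"] < min_free_gb:
        issues.append("insufficient disk space")
    if system["driver_version"] < min_driver:
        issues.append("device driver too old")
    return issues  # empty list: safe to patch; otherwise remediate first

print(preflight({"free_gb": 12, "driver_version": (3, 1)}))  # []
print(preflight({"free_gb": 2,  "driver_version": (1, 4)}))
```

Systems that fail the check are flagged for remediation before the automated update runs, rather than being left half-patched.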
The increasing answer for many organisations, however, is to outsource the issue. As systems such as Microsoft’s
Office 365 become more capable, many service providers are offering fully managed desktops that provide a full office suite, along with Lync telephony, alongside other software. Some offer the capability for organisations to install their
own software on these desktops, so enabling any highly specific applications, such as old accountancy or engineering
software packages to be maintained for any individual’s usage. Cloud-based service providers should be able to
provide greater levels of systems availability and better response times and SLAs through their scalability – and should
be better positioned to maintain their platforms to a more up-to-date level.
With connectivity speed and availability improvements continuing, a centralised approach to remote offices should
be back on the data centre manager’s agenda. However, the choice has to be as to how that centralisation takes
place. For the majority, the use of cloud-based service provision of a suitable platform will probably be better than the
use of a server room or centralisation directly to an existing owned data centre. Quocirca’s recommendation is to
look to outsourcing wherever possible: use the existing data centre for differentiated core, mission critical activities
only.
Cascading data centre design
Back in the early days of computing, a concept of “time sharing” was common. Few organisations could afford the
cost or had the skills to build their own data centre facility, and so they shared someone else’s computer in that
organisation’s facility through the use of terminals and modems.
As computing became more widespread, the use of self-owned, central data centre facilities became more the norm.
The emergence of small distributed servers led to server rooms in branch offices – and even to servers under an
individual's desk. Control of systems suffered; departments started to buy their own computer equipment and applications. The move to a fully distributed platform was soon being pulled back together to a more centralised
approach – but often with a belt and braces, sticking plaster result. The end result for many was a combination of
multiple different facilities, each running to different service levels with poor overall systems utilisation and a lack of
overall systems availability.
Virtualisation – touted as the ultimate answer – may just have made things worse, as the number of virtual machines
(VMs) and live applications not being used have spiralled out of control. Cloud computing – again, another “silver
bullet” – means that the organisation is now having to deal not only with its own issues around multiple facilities, but
also other organisations’.
Increased mobility of the workforce, both through home working and the needs of the “road warrior” has led to a
need for “always on” access to enterprise applications – and also to a bring your own device (BYOD) appetite for using
apps from other environments.
It’s all a bit of a mess. Just what can be done to ensure that things get better, not worse?
The first thing that has to be done is a full audit of your own environment. Identify every single connected server
within your network, and every single application running on them – there are plenty of tools out there to do this.
Once you have this audit, you will need to identify the physical location of each server. This may be slightly more
difficult, but there is one way that is pretty effective where you cannot identify exactly where a server is: deny it access
to the main network – within a few minutes, there will be a call to the help desk from a user complaining, and they
will know where it is.
Now you have a physical asset map of where the servers are, and you know what applications are running on them.
First, identify all the different applications that are doing the same job. You may find that you have three or four
completely different customer relationship management (CRM) systems. Make sure that you identify your strategic
choice, and arrange with those using the non-strategic systems to migrate over as soon as possible. Now, identify all
the different instances of the same application that are running. Consolidate these down as far as possible – there
may be 5 different instances of the same enterprise resource planning (ERP) application in place.
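The audit data lends itself to a simple consolidation check. A sketch with a wholly hypothetical inventory, grouping applications by the business function they serve:

```python
from collections import defaultdict

# Hypothetical audit output: server -> (application product, business function)
inventory = {
    "srv-01": ("Salesforce", "CRM"), "srv-02": ("SugarCRM", "CRM"),
    "srv-03": ("SAP ERP", "ERP"),    "srv-04": ("SAP ERP", "ERP"),
    "srv-05": ("SAP ERP", "ERP"),
}

by_function = defaultdict(list)
for server, (product, function) in inventory.items():
    by_function[function].append(product)

# Flag any function served by more than one instance or product
for function, products in sorted(by_function.items()):
    if len(products) > 1:
        print(f"{function}: {len(products)} instances, {len(set(products))} product(s)")
```

Here CRM shows two competing products (pick a strategic one and migrate), while ERP shows three instances of the same product (consolidate them down).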
Such functional redundancy is not just bad for IT in the cost of servers, operating system licences, maintenance and
power that are required to keep them running, but also for the business. These systems will generally be running
completely separate to each other, and this means that the business does not have a single view of the truth.
Consolidation has to be carried out – for everyone’s sake.
At this point, you have a more consolidated environment, but there will still be lots of applications being run
by the organisation that could be better sourced through a SaaS model. Software that provides functionality
highly dependent on domain expertise – for example, payroll and expense management – is much better outsourced
to a third party, as they can ensure that all the legal aspects of the process are covered.
This then leads to dealing with your organisation’s overall IT platform in a more controlled yet flexible manner.
The overall internal IT platform for the organisation should be smaller than it was previously. Consolidation,
particularly when carried out with a fully planned virtualisation strategy, should reduce the amount of IT equipment
required by up to 80%. All the equipment can now be placed where you want it. But – should this be all in an owned
facility?
Probably not. There are problems in building and managing a highly flexible data centre. Power distribution and cooling tend to be designed and implemented to meet specific needs. Further shrinkage of the internal platform can lead to the facility’s power usage effectiveness (PUE) score growing, rather than shrinking. The always-on requirement means that multiple different connections from the facility to the outside world will be required.
No – a cascade design of data centres is what is required. There may be applications that for any reason (long-term
investment in the application and/or IT equipment, fears over data security) will be required to remain in an owned
facility. There will be many more applications that can be placed into a co-location facility. Here, someone else is
providing and managing the facility – they have the responsibility for connectivity, cooling, power distribution and so
on. You just have to manage the hardware and software in your part of the facility. Should your needs grow, the
facility owner can give you more space, power and cooling. Should your needs shrink, then you can negotiate a smaller
part of the facility.
SaaS based solutions take this even further – you have no responsibility whatsoever for the facility, hardware or
software. This is all someone else’s problem: you can concentrate on the business’ needs.
Ensuring that a cascaded data centre design works, consisting of an owned facility in conjunction with a co-location
facility and public cloud functionality, means having in place the right tools to manage the movement of workloads
from one environment to another. It also requires effective monitoring of application performance with the capability
to switch workloads around to maintain service levels. The more that is kept within an owned facility, the more
availability becomes an issue, and multiple connections to the outside world will be required.
However, getting it right will provide far greater flexibility at both a technical and business level. Quocirca strongly recommends that IT departments start on this process: carry out a complete and effective audit now – and plan how your IT platform will be housed and managed in the years to come.
Disaster recovery and business continuity –
chalk and cheese?
Most organisations will have an IT disaster recovery (DR) plan in place. However, it was probably created some time back and will, in many cases, be unfit for purpose.
The problem is that DR plans have to deal with the capabilities and constraints of a given IT environment at any one
time, so a DR plan created in 2005 would hardly be designed for virtualisation, cloud computing and fabric networks.
The good thing is that the relentless improvements in IT have created a much better environment – one where the
focus should now really be away from DR to business continuity (BC).
At this stage, it is probably best to come up with a basic definition of both terms so as to show how they differ.
Business Continuity – a plan that attempts to deal with the failure of any aspect of an IT platform in a manner
that still retains some capability for the organisation to carry on working.
Disaster recovery – a plan that attempts to get an organisation back up and working again after the failure
of any aspect of an IT platform.
Hopefully, you see the major difference here – BC is all about an IT platform coping with a problem: DR is all about
bringing things back when the IT platform hasn’t coped.
Historically, the costs and complexities of putting in place technical capabilities for BC meant that only the richest
organisations with the strongest needs for continuous operation could afford BC: now, it should be within the reach
of most organisations; at least to a reasonable extent.
Business continuity is based around the need for a high availability platform – something that was covered in an earlier article in this series, “Uptime – the heart of the matter”. By the correct use of “N+M” equipment alongside well-architected and implemented virtualisation, cloud and mirroring, an organisation should be able to ensure that a reasonable level of BC is in place for the majority of cases.
Note the use of the word “majority” here. Creating a full BC-capable IT platform is not a low-cost project. The
organisation must be fully involved in how far the BC approach goes – by balancing its own risk profile against the
costs involved, it can make the decision as to at what point a BC strategy becomes too expensive for the business to
fund.
This is where DR still comes in. Let’s assume that the business has agreed that the IT platform must be able to survive
the failure of any single item of equipment in the data centre itself. It has authorised the investment of funds for an
N+1 architecture at the IT equipment level, and as such, the IT team has now got one more server, storage system
and network path per system than is needed. However, as the data centre is based on monolithic technologies, the
costs of implementing an N+1 architecture around the UPS, the cooling system and the auxiliary generation systems
were deemed too high.
Therefore, the DR team has to look at what will be needed should there be a failure of any of these items, as well as
what happens if N+1 is not good enough.
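The state the DR team has to reason about can be sketched as a simple check over a component inventory. The categories and counts below are hypothetical, illustrating an N+1 design at the IT level with no redundancy for the monolithic facility components:

```python
def redundancy_state(installed, required):
    """Return how many component failures a category can absorb
    before capacity drops below what the workloads require."""
    return installed - required

# Hypothetical capacity-component inventory: one spare per category
# at the IT equipment level, none for the UPS or cooling.
inventory = {
    "servers": {"installed": 5, "required": 4},
    "storage": {"installed": 3, "required": 2},
    "ups":     {"installed": 1, "required": 1},
    "cooling": {"installed": 2, "required": 2},
}

for category, counts in inventory.items():
    spare = redundancy_state(counts["installed"], counts["required"])
    status = "N+%d" % spare if spare > 0 else "NO REDUNDANCY - DR plan required"
    print(category, status)
```

Any category reporting zero spares is exactly where the DR plan, rather than the BC architecture, has to carry the risk.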
The first areas that have to be agreed with the business are around how long it will take to get to a specified level of
recovery of function, and what that level of function is. These two points are known as the recovery time objective
(RTO) and the recovery point objective (RPO). This is not something that an IT team should be defining – the business
has to be involved and must fully understand what the RTO and RPO mean. In particular, the RPO defines how much
data has to be accepted as being lost – and this could have a knock-on impact on how the business views its BC
investment.
For example, in an N+1 architecture, the failure of a single item will have no direct impact on the business, as there is
still enough capacity for everything to keep running. Should a second item fail, then the best that will happen is that
the speed of response to the business for the workload or workloads on that set of equipment will be slower. The
worst that can happen is that the workload or workloads will fail to work. In the former case, the RPO will be to regain
the full speed of response within a stated RTO – which would generally be defined as the time taken for replacement
equipment to be obtained, installed and fully implemented. Therefore, the DR plan may state that a certain amount
of spares inventory have to be held, or that agreements with suppliers have to be in place for same-day delivery of
replacements – particularly for the large monolithic items such as UPSs. The plan must also then include all the steps
that will be required to install and implement the new equipment – and the timescales that are acceptable to ensure
that the RTO is met.
In the latter case where the workload has been stopped, then the RPO has to include a definition of the amount of
data that could be lost over specified periods. In most cases this will be per hour or per quarter hour; in high-
transaction systems, it could be per minute or per second. The impact on the RTO is therefore dependent on the
business’ view of how many “chunks” of data loss it believes it can afford. The DR team has to be able to provide a
fully quantified plan as to how to meet the RPO within the constraints of the business-defined RTO – and if it is a physical impossibility to balance these two, then it has to go back to the business, which will have to decide whether
to invest in a BC strategy for this area, or to lower its expectations on the RPO so that a reasonable RTO can be agreed.
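The arithmetic the DR team has to present can be sketched very simply. The transaction rate, RPO and restore times below are hypothetical figures, not from the source:

```python
def data_at_risk(transactions_per_hour, rpo_hours):
    """Worst-case number of transactions lost if recovery rolls
    back to the last recovery point within the RPO window."""
    return transactions_per_hour * rpo_hours

def rto_feasible(restore_hours, rebuild_hours, rto_hours):
    """Check whether data restore plus equipment rebuild time
    fits within the business-agreed RTO."""
    return restore_hours + rebuild_hours <= rto_hours

# Hypothetical figures: 2,000 transactions/hour, a 15-minute RPO,
# a 4-hour RTO, 1 hour to restore data, 2 hours to rebuild.
print(data_at_risk(2000, 0.25))   # 500 transactions at risk per incident
print(rto_feasible(1, 2, 4))      # the plan fits within the agreed RTO
```

If `rto_feasible` comes back false for realistic restore and rebuild times, that is precisely the point at which the plan goes back to the business to either fund BC or relax the RPO.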
In essence, BC has to be the main focus for a business: it is far more important to create and manage an IT platform in a manner that allows the organisation to maintain a business capability. The DR plan is essentially a safety net: it is there for when the BC plan fails. BC ensures that business continues, even if process flows (and therefore cash flows) are impacted to an extent. DR is there to try and stop a business from failing: once one or more workloads have been stopped, the process flows are no longer there.

Both BC and DR are critical for an organisation to have in place – the key is to make sure that each complements and feeds back into the other, ensuring that there are no holes in the overall strategy.
Managing IT – converged infrastructure,
private cloud or public cloud?
The days of taking servers, storage and network components, putting them together and running applications on them on an essentially one-to-one physical basis are rapidly passing. The uptake of virtualisation means that workloads share many resources, and the emergence of “as a service” means that the underlying resources have to be more flexible, and easier to implement and use, than ever before.
However, this still leaves a lot of choice to an end-user organisation. Should they go for a converged or engineered
system, such as a Cisco UCS, an IBM PureSystem or a Dell VRTX, or should they go for a more cloud-like scale out
model based on “commodity” servers? Should they go the whole way and forget about the physical platform itself
and just go for a public cloud based service?
Each has its own strengths – and its own weaknesses.
Converged systems take away a lot of the technical issues that organisations run up against when attempting to use
a pure scale out approach. By engineering from the outset how the servers, storage and networking equipment within
a system will work together, the requirements for management are simplified. However, expansion is not always
easy, and in many cases may well require over-engineering through implementing another converged system
alongside the existing one just to gain the desired headroom. There is also the issue of managing across multiple
systems – this may not be much of a problem if a homogeneous approach is taken, but if the fabric network consists
of multiple different vendors, or if there are converged systems from more than one vendor in place, it may be difficult
to ensure that everything is managed as expected.
A private cloud environment may then be seen as a better option. Although private cloud can (and generally should)
be implemented on converged systems, the majority of implementations Quocirca sees are based around the use of
standard high volume (SHV) servers built into racks and rows with separate storage and network systems. Adding
incremental resources is far simpler in this approach – new servers, storage and network resources can be plugged in
and embraced by the rest of the platform in a reasonably simple manner, provided that the management software
has the capabilities contained within it.
Provided that the right management software is in place, this can work. However, skills will be required covering not only the technical aspects of how such a platform works, but also areas such as how to populate a rack or a row in a manner that does not cause issues through hot spots, or by drawing too much power through any one spot.
Those choosing either of these paths must also make sure that any management software chosen does not just focus
on one aspect of the platform: the virtual environment has dependencies on the physical, and these must be
understood by the software. For example, the failure of a physical disk system will impact any virtual data stores that
sit on that system: the management software must be able to understand this – and ensure that backups or mirrors
are stored on a different physical system. It must also ensure that on any failure, the aim is for business continuity,
minimising any downtime and automating recovery as much as possible through the use of hot images, data mirroring
and network path virtualisation across multiple physical network interface cards (NICs) and connections.
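The physical-dependency check described above can be sketched as a mapping exercise. The store and array names are hypothetical, representing what a management tool might record:

```python
# Hypothetical mapping of virtual data stores to the physical
# arrays they sit on, plus each store's designated mirror.
placement = {
    "vstore-1": "array-A",
    "vstore-2": "array-B",
}
mirrors = {
    "vstore-1": "vstore-2",
    "vstore-2": "vstore-1",
}

def mirror_is_safe(vstore, mirrors, placement):
    """A mirror only protects against a physical disk-system failure
    if it sits on a different physical array from its primary."""
    return placement[vstore] != placement[mirrors[vstore]]

# Flag any mirror that shares a physical array with its primary
print(all(mirror_is_safe(v, mirrors, placement) for v in mirrors))
```

Management software that only sees the virtual layer would happily pass a configuration where both stores sit on the same array – which is exactly the failure mode the text warns against.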
Public cloud, whether infrastructure, platform or software as a service (I/P/SaaS), would seem to offer a means of removing all the issues around needing to manage the platform. However, ensuring that you have visibility at the technical level can help in spotting trends that your provider has missed (for example, are storage resources running low, or is end-user response suffering?) and is needed to help run the “what if?” scenarios that organisations need to be able to model these days.
In reality, the majority of organisations will end up with a hybrid mix of the above options. This brings in further issues
– whereas a single converged system may be pretty much capable of looking after itself, once it needs to interact with
a public cloud system, extra management services will be required.
Whatever platform an organisation goes for, the software really should be capable of looking at the system from end-
to-end. To the end user, one major issue will always be the response of a system. A converged system will report
that everything is running at hyper-speed, as it tends to look inwardly and will be monitoring performance at internal
connection speeds. The end user may be coming in from a hand-held device over a public network and using a mix
of functions from the converged system and a public cloud: the management software must be able to monitor all of
this and be able to understand what is causing any problems. It must then be able to try and remediate the problem
– for example, by using a less congested network, by offloading workload to a different virtual machine or by applying more storage. It must understand that providing more network bandwidth could mean that the resulting higher IOPS require a different tier of storage to be used, or more virtual cores to be thrown at the server. All of this needs to be carried out in essentially real time – or as near as makes no difference to the end user.
The existing systems management vendors, such as CA, IBM and BMC, are getting there with their propositions, with HP lagging behind. The data centre infrastructure management (DCIM) vendors, including nlyte and Emerson Network Power, are making great strides in adding to existing systems management tools through including monitoring and
management of the data centre facility and its equipment into the mix. EMC is making a play for the market through
its software defined data centre (SDDC) strategy, but may need to be bolstered by a better understanding of the
physical side as well as the virtual.
One thing is for sure – the continued move to a mix of platforms for supporting an organisation’s needs will continue to drive innovation in the systems management space. For an IT or data centre manager, now is the time to ensure that what is put in place is fit for purpose and will support the organisation going forward, no matter how the mix of platforms evolves.
The Tracks of my Tiers
It’s time for change. Your old data centre has reached the end of the road, and you need to decide whether to build
a new one or to move to a co-location partner. What should you be looking for in how the data centre is put together?
Luckily, a lot of the work has already been done for you. The Uptime Institute (uptimeinstitute.com) has created a simple set of tiers for data centres that describes what should be provided, in terms of overall availability, through a particular technical design of a facility.
There are four tiers, with Tier I being the simplest and least available, and Tier IV the most complex and most available. The Institute uses Roman numerals to discourage facility owners from claiming that they exceed one tier but aren’t quite the next, using nomenclature such as “Tier 3.5”. However, Quocirca has seen instances of facility owners saying that they are “Tier III+”, so the plan hasn’t quite worked.
It would be fair to say that in most cases, costs also reflect the tiering – Tier I should be the cheapest, with Tier IV
being the most expensive. However, this is not always the case, and a well implemented, well run Tier III or IV facility
could have costs that are comparable to a badly run lower Tier facility.
A quick look at the tiers gives the following as basic descriptors, with each tier having to meet or exceed the
capabilities of the previous tier:
Tier I: Single non-redundant power distribution paths serving IT equipment with non-redundant capacity components, leading to an availability target of 99.671%. Capacity components are items such as UPSs, cooling systems, auxiliary generators and so on. Any failure of a capacity component will result in downtime, and scheduled maintenance will also require downtime.
Tier II: A redundant site infrastructure with redundant capacity components, leading to an availability target
of 99.741%. The failure of any capacity component can be manually managed by switching over to a
redundant item with a short period of downtime, and scheduled maintenance will still require downtime.
Tier III: Multiple independent distribution paths serving IT equipment; at least dual power supplies for all IT
equipment; leading to an availability target of 99.982%. Planned maintenance can be carried out without
downtime. However, a capacity component failure still requires manual switching to a redundant
component and will result in downtime.
Tier IV: All cooling equipment to be dual powered; a complete fault tolerant architecture leading to an
availability target of 99.995%. Planned maintenance and the failure of a capacity component are dealt with
through automated switching to redundant components. Downtime should not occur.
Bear in mind that these availability targets are for the facility – not necessarily for the IT equipment within it. Organisations must ensure that the architecture of the servers, storage and networking equipment, along with external network connectivity, provides similar or greater levels of redundancy to ensure that the whole platform meets the business’ needs.
The percentage facility availabilities may seem very close and very precise – however, a Tier I facility will allow for the
best part of 30 hours of downtime per annum, whereas a Tier IV facility will only allow for under half an hour.
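The conversion from an availability percentage to a downtime budget is simple arithmetic, sketched below using the Uptime Institute targets quoted above:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 - ignoring leap years

def annual_downtime_hours(availability_pct):
    """Convert a facility availability target into the annual
    downtime budget it implies."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for tier, pct in [("I", 99.671), ("II", 99.741), ("III", 99.982), ("IV", 99.995)]:
    print("Tier %s: %.1f hours/year" % (tier, annual_downtime_hours(pct)))
# Tier I allows close to 29 hours a year; Tier IV under half an hour
```

Running the loop makes the gulf between the tiers obvious: the seemingly small percentage differences translate into downtime budgets an order of magnitude apart.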
The majority of Tier III and IV facilities will have their own internal targets of zero unplanned downtime, however –
and this should be an area of discussion when talking with possible providers or when designing your own facility.
It is tempting to look at the Tiers as a range of “worst-to-best” facilities. However, it really comes down more to the
business requirements that drive the need. For example, for a sub-office using a central data centre for the majority
of its critical needs, but having an on-site small server room for non-critical workloads, a Tier III data centre could be
overly expensive for its needs, and a Tier I or Tier II facility could be highly cost-effective. Although Tier I and Tier II
facilities are not generally suitable for mission critical workloads, if there are over-riding business reasons and the
risks are fully understood and plans are in place to manage how the business continues during downtime, then Tier I
could still be a solution.
It is Tiers III and IV where organisations should be looking for placing their more critical workloads. Tier III facilities
will still require a solid set of procedures in how to deal effectively with capacity component failures, and these plans
will need to be tested on a regular basis. Even with Tier IV, there is no case for assuming that everything will always
go according to plan. A simple single redundancy architecture (each capacity component being backed up by one
more) can still lead to non-availability. If a single capacity component fails, the facility is now back down to a non-
redundant configuration. If the failed component cannot be replaced rapidly, then a failure of the active component
will result in downtime.
Therefore, plans have to be in place as to whether replacement components are held in inventory, or whether there
is an agreement in place with a supplier to get a replacement on site – and probably installed by them – within a
reasonable amount of time. For a Tier IV facility, this should be measured in hours, not days.
If designing your own facility, the Uptime Institute’s facility Tiers give a good basis for what is required to create a
suitable data centre facility with requisite levels of availability around the capacity components. It will not provide
you with any reference designs – areas such as raised v. solid floors, in-row v. hot/cold aisle cooling and so on are not
part of the Institute’s remit.
If you are looking for a co-location partner, then the Institute runs a facility validation and certification process. Watch out for co-location vendors who say that their facility is Tier III or Tier IV “compliant” – this is meaningless. If they want to use the Tier nomenclature, then they should have gone through the Institute and become certified. A full list
of facilities that have been certified can be seen on the Institute’s site here:
http://uptimeinstitute.com/TierCertification/certMaps.php
What to look for from a Tier III data centre
provider
The Uptime Institute provides a set of criteria for the tiering of data centre facilities that can help when looking to use
either a co-location facility or an infrastructure, platform or software as a service (I/P/SaaS) service.
The idea of the tiers is to provide an indication of the overall availability of the facility – a Tier I facility is engineered to have no more than 28.8 hours of unplanned downtime per annum, a Tier II 22 hours, a Tier III 1.6 hours and a Tier IV 0.4 hours. As can be seen, there is a big jump from Tier II to Tier III – and this is why organisations should look for at least a Tier III facility when looking for a new facility to house their IT within.
A Tier III facility offers equipment redundancy in core areas, such that planned maintenance can be made while
workloads are still on-line, and where the failure of a single item will not cause the failure of a complete area. Tier IV takes this further to provide multi-redundancy, but will only be required by those who have a need for maximum availability of the facility and the IT platform within it. For most, Tier III will be sufficient.
However, there are lots of co-location, hosting and cloud vendors out there who indicate that they are “Tier III” (or, more often, “Tier 3” – nomenclature the Uptime Institute does not like), many of which are not fully compliant with the guidelines. It is a case of caveat emptor – buyer beware – but there are certain steps that can be taken to ensure that what you are getting is fit for purpose.
If you really and truly require an Uptime Institute Tier III facility, then it is really quite simple. A facility can only call
itself Uptime Institute Tier III if it is certificated accordingly.
The Uptime Institute provides three different types of certification – and these involve expense on the part of the facility owner. The only way to become certified is to go to the Uptime Institute’s professional services company and have it audit your plans, your operational approach or your physical data centre. Having just the plans audited is the quickest route, and results in a Tier Certification of Design Documents. This gives the facility owner a certificate, and the facility can be listed on the Uptime Institute’s site as a certificated member.
The certification of the physical data centre can only be obtained after the data centre plans have been certificated.
The Uptime Institute Professional Services company will then carry out a site visit and a full audit of the physical facility
to ensure that the build is in line with the plans. If this is the case, then the facility owner will get a Tier Certificate of
Constructed Facility – with a plaque to go on the vendor’s offices or wherever, as well as listing on the Institute’s site.
With the Operational Sustainability Certification, an on-site visit is made to evaluate the effectiveness of components
of the management and operations and building characteristics. These are compared to the specific requirements
outlined in the Institute’s document, Tier Standard: Operational Sustainability. Once validated, the facility owner gets
a certificate, plaque and listing on the Institute’s site.
Therefore, the first place to start when looking for a Tier III facility is the Uptime Institute’s site, as all certificate
owners will be listed there.
Does this mean that all of those who are not on the Institute’s site should be avoided? By no means. There are those
who believe that the Uptime Institute is too self-centred and that its certification process is not open enough. There
are those who object to having to pay for the certification process, and others who just do not see the point of having
an Uptime Institute Tiering at all.
The Telecommunications Industry Association (TIA) came up with a similar four-level facility tiering (Tiers 1-4) in 2005, under its tiering requirements in document ANSI/TIA-942. These requirements were modified in 2008, 2010 and 2013 to
reflect changes and advancements in data centre design. The tiers roughly equate with the Uptime Institute’s tiers,
and as such, anyone using the TIA’s system should also be looking for a Tier 3 facility.
For those facilities that have neither an Uptime Institute nor a TIA tiering, it is down to the buyer to carry out due diligence. Quocirca recommends that the buyer uses either the Uptime Institute’s or the TIA’s documents to
pull out the areas that they believe to be of the largest concern to them and insist that the facility owner shows how
they meet the needs of these.
Don’t let them fob you off with responses like “Of course – but we do it differently” – challenge them; get them to
quantify risks and show how they will ensure defined availability targets; get them to put financial or other penalty clauses into a service level agreement (SLA) so that they become more bought in to the need to manage availability
successfully. When you carry out your own site visit, ask questions – where’s the second generator; what happens if
that item fails; how do multiple power distribution systems come in and distribute around the facility?
Only through satisfying yourself will you be able to rest easy. Taking responses at face value could work out very
expensive – and it is in the nature of many facility owners to promise almost anything to get higher levels of occupancy
in their facility. They know that once you are in the facility, it is difficult to move out again.
Certainly, the Uptime Institute’s certification is the “Gold Standard”, as it is based on a rigorous evaluation of plans, facility and operational processes against a set of solid requirements. The TIA’s is a more open approach, which
does put more of the weight of due diligence on the buyer to ensure that the requirements have been fully followed.
A facility stating that it is “built to Tier III standards” requires yet more diligence – and an understanding of the
requirements.
Lastly – remember that these tiers only apply to the facility itself – they do not define how the IT equipment itself
needs to be put together to give the same or higher levels of availability. Ensuring that overall availability is high
requires yet more work to cover how the IT equipment is configured…
About Quocirca
Quocirca is a primary research and analysis company specialising in the business impact of information technology and communications (ITC). With world-wide, native language reach, Quocirca provides in-depth
insights into the views of buyers and influencers in large, mid-sized and
small organisations. Its analyst team is made up of real-world practitioners
with first-hand experience of ITC delivery who continuously research and
track the industry and its real usage in the markets.
Through researching perceptions, Quocirca uncovers the real hurdles to
technology adoption – the personal and political aspects of an
organisation’s environment and the pressures of the need for
demonstrable business value in any implementation. This capability to
uncover and report back on the end-user perceptions in the market
enables Quocirca to provide advice on the realities of technology adoption, not the promises.
Quocirca research is always pragmatic, business orientated and conducted
in the context of the bigger picture. ITC has the ability to transform businesses and the processes that drive them, but
often fails to do so. Quocirca’s mission is to help organisations improve their success rate in process enablement
through better levels of understanding and the adoption of the correct technologies at the correct time.
Quocirca has a pro-active primary research programme, regularly surveying users, purchasers and resellers of ITC
products and services on emerging, evolving and maturing technologies. Over time, Quocirca has built a picture of
long term investment trends, providing invaluable information for the whole of the ITC community.
Quocirca works with global and local providers of ITC products and services to help them deliver on the promise that ITC holds for business. Quocirca’s clients include Oracle, IBM, CA, O2, T-Mobile, HP, Xerox, Ricoh and Symantec, along
with other large and medium sized vendors, service providers and more specialist firms.
Details of Quocirca’s work and the services it offers can be found at http://www.quocirca.com
Disclaimer:
This report has been written independently by Quocirca Ltd. During the preparation of this report, Quocirca may have
used a number of sources for the information and views provided. Although Quocirca has attempted wherever
possible to validate the information received from each vendor, Quocirca cannot be held responsible for any errors
in information received in this manner.
Although Quocirca has taken what steps it can to ensure that the information provided in this report is true and
reflects real market conditions, Quocirca cannot take any responsibility for the ultimate reliability of the details
presented. Therefore, Quocirca expressly disclaims all warranties and claims as to the validity of the data presented
here, including any and all consequential losses incurred by any organisation or individual taking any action based on
such data and advice.
All brand and product names are recognised and acknowledged as trademarks or service marks of their respective
holders.
REPORT NOTE:
This report has been written independently by Quocirca Ltd to provide an overview of the issues facing organisations seeking to maximise the effectiveness of today’s dynamic workforce.

The report draws on Quocirca’s extensive knowledge of the technology and business arenas, and provides advice on the approach that organisations should take to create a more effective and efficient environment for future growth.