
Data Center Infrastructure Operational Best Practices

Steven Shapiro, P.E., ATD
Partner

Twitter: @datacenterworld

Information Classification: General

Data Center World – Certified Vendor Neutral

Each presenter is required to certify that their presentation will be vendor-neutral.

As an attendee, you have the right to enforce this no-sales-pitch policy by alerting the speaker if you feel the session is not being presented in a vendor-neutral fashion. If the issue continues to be a problem, please alert Data Center World staff after the session is complete.

Agenda
• General/Guideline Requirements
• Site Selection
• Building Construction
• Space Planning
• Physical Security
• Mechanical Systems
• Water and Plumbing
• Fire Protection
• Electrical Systems
• Facility Policies and Procedures

General Guidelines/Requirements

Energy Efficiency Standards
• LEED
• Energy Star
• Green Globes
• CEEDA

General Guidelines/Requirements

Uptime/Reliability Design and Infrastructure Standards
• Uptime Institute Tier Certification: Tier I to IV
• ANSI/BICSI 002-2014 – Class F0 to F4
• TIA 942 – Tier 1-4
• ICREA
• UL
• EN 50600 series 1, 2-1 to 2-6 – Availability Class 1-4
• EN 50173-5
• ISO/IEC 24764
• HIPAA – Health Insurance Portability and Accountability Act
• SOX – Sarbanes-Oxley Act of 2002
• SAS 70 – Type I or II
• Gramm-Leach-Bliley Act (GLBA)

General Guidelines/Requirements

Operations Standards
• Uptime Institute: Operational Sustainability
• ISO 9000 – Quality System
• ISO 14000 – Environmental Management System
• ISO 27001 – Information Security
• PCI – Payment Card Industry Security Standard
• SAS 70 and ISAE 3402 or SSAE 16 (USA) – Assurance Controls
• AMS-IX – Data Centre Business Continuity Standard
• EN 50600-2-6 – Management and Operational Information

General Guidelines/Requirements

• Facility Reliability Must Match the Requirements Detailed in the Business Case for the Facility
• Total Cost of Ownership (TCO): An Approach that Balances Capital and Operational Expenditures against Your Company's Needs. Part of this Process is Determining Design Criteria and Performance Characteristics Tailored Specifically to Your Organization
• When Operations Teams are Excluded from Facility Design, Modifications and Repairs Often Become Necessary
• CxA (Commissioning Authority) Team Participation is Essential from Concept to Final Pull-the-Plug Testing
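As a rough illustration of the TCO bullet, the sketch below adds capital cost to the discounted sum of yearly operating costs. The function name, the discounting approach, and all figures are assumptions made for illustration; they do not come from the presentation.

```python
from typing import Sequence


def total_cost_of_ownership(
    capex: float,
    annual_opex: Sequence[float],
    discount_rate: float = 0.0,
) -> float:
    """Capital cost plus the present value of each year's operating cost.

    Hypothetical helper: year 1 is discounted once, year 2 twice, and so on.
    A discount rate of 0.0 gives a plain sum.
    """
    present_value_opex = sum(
        cost / (1.0 + discount_rate) ** year
        for year, cost in enumerate(annual_opex, start=1)
    )
    return capex + present_value_opex
```

With made-up numbers and no discounting, a design costing 10 (million) to build and 1.2 per year to run totals 22 over ten years, while a design costing 12 to build and 0.8 per year to run totals 20. This is the kind of capital versus operational trade-off the bullet describes.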

Site Selection

Economic Incentives
• Tax Incentives
• Peak Shaving or Cogeneration Incentives

Sustainability Opportunities/Incentives
• Green Power Availability
• Utility Rebates for Energy Efficient Equipment
• Free Cooling Opportunities

Data Center “Friendly” Municipalities
• Expedited Approval and Other Helpful Processes
• Friendly Noise and Emissions Ordinances
• No Fuel Oil Storage Limitations

Site Selection

Natural Disaster Potential
• Seismic
• Tornado/Hurricane
• Flood
• Tsunami

Utility Availability/Reliability/Costs
• Power, Water, Sewer
• Power Cost, Power Cost and Power Cost!
• Multiple Fiber Carriers

Site Selection

Stay Away from Hostile Adjacencies…
• Airports/Flight Paths
• Railway Lines
• Nuclear and Other Power Plants
• Hazardous/Explosive Materials and Facilities
• Residential Neighborhoods

But Still Close Enough to…
• Airports
• Freeways
• Emergency Services
• Data Center Equipment Technicians
• Public Transportation

Site Selection

Needed Utilities
• Buried and Reliable Utilities – System Average Interruption Duration Index (SAIDI)
• Diverse Fiber and Power Availability
• Reliability of Power Grid
• Sufficient Water Availability
o City Water
o Well Water
o Water Storage
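SAIDI is conventionally computed as total customer interruption minutes divided by the total number of customers served over the reporting period. The sketch below shows that arithmetic for comparing candidate utilities. The function name and the sample outage data are hypothetical.

```python
from typing import Iterable, Tuple


def saidi(
    interruptions: Iterable[Tuple[float, int]],
    customers_served: int,
) -> float:
    """System Average Interruption Duration Index, in minutes per customer.

    Each interruption is (duration_minutes, customers_affected).
    """
    if customers_served <= 0:
        raise ValueError("customers_served must be positive")
    customer_minutes = sum(
        duration * affected for duration, affected in interruptions
    )
    return customer_minutes / customers_served


# Hypothetical year: a 90-minute outage hitting 1,000 customers and a
# 30-minute outage hitting 500, on a system serving 10,000 customers.
example = saidi([(90.0, 1000), (30.0, 500)], customers_served=10_000)
```

A lower value means the average customer spent less time without power, which is why the index is useful when comparing sites served by different utilities.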

Building Construction

• Fire Ratings of Areas
• Block or Sheet Rock Walls
• Clear Roof
• No Roof Drains
• No Windows
• Hurricane Resistant
• Single or Multi-story
• Structural Steel or Reinforced Concrete Shell
• Thickened Existing Slabs to Support Heavier Loads
• Roof to Withstand 120 mph Winds
• Floor to Roof Structure
• Roof Structure to Withstand Hanging Loads
• Large Column-less Areas
• Floor Loading
• Facility Time Horizon – Expected Useful Life

Space Planning

• Load Density
• White Space Utilization
• True White Space Square Footage
• CRAH Galleries/Mechanical – Electrical Corridor
• Tray Elevations/Power/Copper/Fiber
• Ceiling or No Ceiling
• Floor/No Floor
• Hot Aisle/Cold Aisle – Containment, Blanking Panels
• Systems Isolation/Segregation to Redundancy Level
• Feeder/Piping Path Isolation
• Cabinet/Rack Layout – 42U, 72U, 19” Wide, 24” Wide, 2 Post, with Wire Manager
• POEs with Dedicated Power/Rectifiers
• MPOs, MDF, IDF
• Door Sizes
• Equipment Delivery Path

Space Planning

People and Technician Space Definition
• Test/Assembly Area
• Lab Area
• Storage
• Vendor Storage/Offices
• Uncrating Area
• Dumpsters
o Electronics
o Paper
o Cardboard
o Wood
o Trash

Physical Security

Security Threat Analysis
• Bombs – Bag/Car
• Rockets/Planes
• Direct Attack
• Disgruntled Employee
• Corporate Espionage

Zones of Protection
• Cabinet
• White Space
• Gray Space and Hallways
• Office
• Site
• Perimeter

Physical Security

• Crime Prevention Through Environmental Design (CPTED)
• Segregated POE/Vendor Areas
• Devices
• Card Readers
• Biometrics
• Mantraps
• Cameras
• Dogs
• Berms
• EPO
• Escorts

Physical Security

• Contractor Badges
o Keys
o Escorts
o Rules
• Copper/Fiber Secure Wireways
• Cages Under Floor/Above Ceiling
• Bars in Ductwork
• Bollards
• Active Vehicle Barriers
• No Signage on Exterior

Mechanical Systems

• Energy Efficiency – PUE
• What Is PUE/DCiE?
• Elevated Data Center Temperature Maximizes Return Air Temperature but Accelerates Critical Equipment Failure
• So Says the Los Alamos Heat-to-Failure Study: with a 10 °C Rise the Failure Rate Doubles (APC White Paper)
• Containment – Hot/Cold Aisle
• Humidification Levels in the Data Center – ASHRAE is Opening the Window
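The two quantitative points on this slide can be sketched in a few lines. PUE is total facility energy divided by IT equipment energy, and DCiE is its reciprocal. The failure-rate helper simply encodes the "doubles per 10 °C rise" rule of thumb quoted above as an exponential. The exponential form for intermediate temperatures is an assumption, and all function names are hypothetical.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def dcie(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Data Center infrastructure Efficiency: the reciprocal of PUE (a fraction)."""
    return 1.0 / pue(total_facility_kwh, it_equipment_kwh)


def relative_failure_rate(delta_t_c: float, doubling_step_c: float = 10.0) -> float:
    """Failure-rate multiplier, assuming the rate doubles every `doubling_step_c`.

    Assumption: the doubling rule is applied as a smooth exponential, so a
    5 degree C rise gives a factor of about 1.41 rather than 1.5.
    """
    return 2.0 ** (delta_t_c / doubling_step_c)
```

For example, a facility drawing 1,500 kWh in total for every 1,000 kWh delivered to IT equipment has a PUE of 1.5 and a DCiE of about 67%. Under the stated rule of thumb, a 20 °C rise corresponds to a fourfold failure rate.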

Mechanical Systems

[Figure: ASHRAE 9.9 and 9.1 chart marking the Allowable Range and the Recommended Range]

Mechanical Systems

[Figure: Air Side Economizer Hours Possible in ASHRAE Allowable Range]

Mechanical Systems

[Figure: Air Side Economizer Hours Possible in ASHRAE Recommended Range]
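A first-pass estimate of economizer hours can be made by counting the hours in a weather record when outdoor dry-bulb temperature is at or below the supply-air limit. The sketch below is deliberately simplified: it ignores humidity and dew point, which a real study must include. The function name and the 27 °C and 32 °C limits used in the example are assumptions standing in for the recommended and allowable upper bounds, so check them against the ASHRAE edition that applies to your equipment.

```python
from typing import Iterable


def economizer_hours(
    hourly_dry_bulb_c: Iterable[float],
    max_supply_c: float,
    approach_c: float = 0.0,
) -> int:
    """Count hours when outdoor air is cool enough for air-side economization.

    `approach_c` is a hypothetical allowance for the temperature pickup
    between outdoor air and the server inlet. Colder hours also count,
    on the assumption that outdoor air can be mixed with return air.
    """
    limit_c = max_supply_c - approach_c
    return sum(1 for temp_c in hourly_dry_bulb_c if temp_c <= limit_c)


# Hypothetical six-hour sample of outdoor dry-bulb readings (degrees C).
sample = [10.0, 20.0, 26.0, 28.0, 31.0, 35.0]
recommended_hours = economizer_hours(sample, max_supply_c=27.0)  # assumed limit
allowable_hours = economizer_hours(sample, max_supply_c=32.0)    # assumed limit
```

Running the same count over a full year of hourly data for a candidate site gives the kind of comparison the two charts present: a wider inlet range yields more free-cooling hours.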

Mechanical Systems

• Variable Speed Server Fans
• EC Plug Fans – No VFD Required
• Upflow/Downflow/Freeflow

Mechanical Systems

• 2 Pipe/4 Pipe
• N+1, N+20%
• Continuous Cooling
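The difference between N+1 and N+20% shows up once unit counts are worked out. The sketch below assumes N is the number of identical cooling units needed to carry the design load and that "N+20%" means 20% extra units, rounded up to a whole unit. Both the interpretation and the function names are assumptions.

```python
import math


def units_required(load_kw: float, unit_capacity_kw: float) -> int:
    """N: the minimum number of identical units needed to carry the load."""
    return math.ceil(load_kw / unit_capacity_kw)


def n_plus_one(n: int) -> int:
    """N+1: one spare unit regardless of system size."""
    return n + 1


def n_plus_percent(n: int, percent: float = 20.0) -> int:
    """N+x%: spare units scale with system size (rounded up to whole units)."""
    return n + math.ceil(n * percent / 100.0)
```

For a hypothetical 900 kW load served by 100 kW units, N is 9, N+1 installs 10 units, and N+20% installs 11. For a small system with N of 4, both rules give 5 units. The percentage rule only adds more spares than N+1 as the plant grows.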

Mechanical Systems

• CBRN – Chemical, Biological, Radiological, Nuclear Filtration/Threat
• Water Storage
• Redundant Piping, Loop with Dual Risers, Redundant Coils
• Water Treatment
• Fuel Storage – 48, 72, 96 Hours, Above or Below Ground, Day Tanks, Belly Tanks
• Fuel Treatment – Polishing
• Fuel Delivery/Testing on Delivery
• Fuel Sampling
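The 48, 72 and 96 hour fuel storage targets translate directly into tank volume once a generator burn rate is known. The sketch below is a minimal sizing calculation. The function name, the reserve allowance, and the 70 gallons-per-hour burn rate are hypothetical, so use the manufacturer's consumption figure at your expected load.

```python
def fuel_required_gal(
    burn_rate_gph: float,
    hours: float,
    generators: int = 1,
    reserve_fraction: float = 0.0,
) -> float:
    """Stored fuel needed to run `generators` units for `hours` of runtime.

    `reserve_fraction` is an assumed allowance for unusable tank volume
    or delivery delays (0.10 means 10% extra).
    """
    return burn_rate_gph * hours * generators * (1.0 + reserve_fraction)


# Hypothetical plant: two generators, each burning 70 gallons per hour.
for target_hours in (48, 72, 96):
    print(target_hours, "hours:", fuel_required_gal(70.0, target_hours, generators=2), "gal")
```

The result is the total across day tanks, belly tanks and bulk storage. How it is split between them is a separate design decision.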

Cooling Technologies

Chilled Water
• Air Cooled Chillers or Water Cooled Chillers
• Economizer
o Water Side Economizer – Heat Exchanger
o Series/Parallel to Chiller
• Cooling Tower, Drycooler, Fluid Cooler, Dry Towers

Cooling Technologies

DX
• Condenser
• Drycooler
• Pumped Refrigerant
• Air Side Economizer
o Direct – Needs Corrosion Monitoring Systems Due to Particulate Introduced into the Data Center
o Indirect – Maintains Air Quality in the Data Center
o Evaporative – Used in Direct and Indirect Airside Economization
o Trimming Systems – Mechanical Cooling (DX, Chilled Water) Required for Air Side Economizer in Certain Geographies due to Local Weather

Cooling Technologies

• Outside Air for Pressurization has Humidification for Entire Facility
• Indoor and Outdoor Air Quality
• MERV and His Filters – 8 for Recirculation and 11 for Outdoor Air Economizers
• Separate Comfort Cooling (People Spaces) from Critical Cooling due to Temperature Requirements
• Primary Pumping with VSDs
• Primary/Secondary Pumping with VSDs
• Reuse of Data Center Waste Heat
• In-Row Cooling

Cooling Technologies

• In the Aisle – Suspended in Aisle or in Row with Cabinets
• Room Perimeter/Mechanical/CRAC Corridor
• Mechanical Equipment on UPS – Fans/Chilled Water Pumps – Continuous Cooling Requirements Depending on Load Density and Generator Restart Times
• Thermal Capacity in Piping, Access Flooring, Slab – Calculate and Test
• Chilled Water Storage for Emergency and Peak Shaving
• Ice Storage for Emergency and Peak Shaving
• Spot Coolers – Move-n-cool, Provide Power Connections throughout Data Center
• Rack Level Cooling Doors/Refrigerant/Fan Door
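The "calculate" half of "Calculate and Test" for chilled water storage can be sketched as a simple energy balance: the stored water absorbs heat until it has warmed by the allowable temperature rise. The version below assumes pure water, perfect mixing, and that pumps and fans stay running on UPS. The function name and example figures are hypothetical, and piping, flooring and slab mass are ignored.

```python
WATER_SPECIFIC_HEAT_KJ_PER_KG_K = 4.186  # approximate value for liquid water
WATER_DENSITY_KG_PER_L = 1.0             # approximation near room temperature


def ride_through_minutes(
    stored_volume_l: float,
    allowable_rise_k: float,
    heat_load_kw: float,
) -> float:
    """Minutes of cooling that a stored water volume can provide.

    Energy available = mass x specific heat x allowable temperature rise;
    dividing by the heat load (kW = kJ/s) gives seconds.
    """
    if heat_load_kw <= 0:
        raise ValueError("heat_load_kw must be positive")
    mass_kg = stored_volume_l * WATER_DENSITY_KG_PER_L
    energy_kj = mass_kg * WATER_SPECIFIC_HEAT_KJ_PER_KG_K * allowable_rise_k
    return energy_kj / heat_load_kw / 60.0
```

As a hypothetical check, 10,000 litres of water allowed to rise 6 K against a 500 kW load gives a little over 8 minutes. Compare that figure with generator start and chiller restart times, then confirm it by testing as the slide advises.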

Cooling Technologies

• Liquid Cooling – Bathtub/Tank Cooling – Mineral Oil, NOVEC, etc.
• Server Level Cooling
• Chip Level Cooling
• Data Centers Must Meet an ISO Class 8 Clean Room Standard
• Reheat/Humidification in CRACs – No
• CFD Analysis
• Containment
• Chimneys
• End Doors
• Cold Aisle
• Hot Aisle
• Partial Containment
• Fire Protection Issues with Containment

Containment


Controls

• BMS/BAS – Smart Controllers
• Controls at Units (Local)
• Master/Slave – Remote then Local
• VFD/VSD – Bypass, Harmonic Filtration on UPS Power and when on Generators
• Smart VFDs Match to Pump Operation
• Data Center Airflow/Variable Flow Based on Static Pressure Needed
• Rack Level Monitoring – Temperature Monitoring, Static Pressure Monitoring from Bottom to Top of Rack

Water/Plumbing

Conservation:
• Gray Water Storage/Use
• Irrigation Water
• Rainwater Management

Plumbing Issues:
• Sewer/Septic – Issues with Cooling Tower Blowdown
• Trap Primers or Plugs – Issues with Gas Suppression
• Well, City Water
• N, 2N Supplies

Fire Protection

• Clean Agent – Gas – CO2, FM200, Inergen, NOVEC
• Sprinkler – Wet/Dry/Preaction – Single/Double Interlock
• Diesel Fire Pump
• Mist – Generators/Data Center/Electrical Rooms/Mechanical Rooms

Fire Protection

• ASSD – Air Sampling Smoke Detection – Above and Below Floor, in Ceiling
• Spot Detection
• Cross Zoned – Software
• Photoelectric/Ionization/Cable
• Rated for Underfloor CFM

Electrical Systems

Onsite Cogeneration

Utility
• N
• 2N
• Voltage

UPS Configurations
• N
• N+1
• 2N
• 2(N+1)
• Distributed Redundant
• Block Redundant
• Isolated-Redundant
• Stranded Capacity
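For the four configurations that reduce to simple counting (N, N+1, 2N and 2(N+1)), the sketch below works out how many UPS modules are installed and how heavily the installed capacity is loaded in normal operation. It assumes identical modules and an evenly shared load. The function names are hypothetical, and the distributed, block and isolated redundant topologies are not modeled.

```python
import math


def ups_modules(load_kw: float, module_kw: float, config: str) -> int:
    """Installed module count for a given redundancy configuration."""
    n = math.ceil(load_kw / module_kw)  # modules needed to carry the load
    counts = {
        "N": n,
        "N+1": n + 1,
        "2N": 2 * n,
        "2(N+1)": 2 * (n + 1),
    }
    return counts[config]


def normal_load_fraction(load_kw: float, module_kw: float, config: str) -> float:
    """Fraction of installed capacity in use when everything is healthy."""
    installed_kw = ups_modules(load_kw, module_kw, config) * module_kw
    return load_kw / installed_kw
```

For a hypothetical 800 kW load on 500 kW modules, N installs 2 modules at 80% loading, N+1 installs 3 at about 53%, 2N installs 4 at 40%, and 2(N+1) installs 6 at about 27%. The falling utilization is the cost side of higher redundancy and feeds directly into the efficiency and TCO discussion.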

Electrical Systems

UPS with Transformer or Transformer Free
• Efficiency
• Ground Current Issues

UPS Modules with Internal Redundancy
• Internal Redundancy but Same Single Input/Output

UPS System Static Switch
• Module Level
• System Level

UPS Technology
• Double Conversion
• Offline
• Line Interactive
• Delta Conversion
• Rotary
• DRUPS – Diesel Rotary
• Hybrid
• Low Voltage
• Medium Voltage

Electrical Systems

Energy Storage
• Flooded
• VRLA
• Flywheel
• Coupling

Central UPS vs Distributed UPS

Maintenance Load Bank Substation – Alternate Source for MBP

Permanent Load Bank
• Portable Connection for Inductive
• For UPS
• For Generator

Electrical Systems

Eco Mode
• Good – Efficiency
• Bad – UPS Offline

ITIC Curve/CBEMA Curve

UPS Distribution Voltage
• 480V, 120/208V, 415V
• 3 Phase 3 Wire, 3 Phase 4 Wire

DC Systems and Voltage – 380VDC

Electrical Systems

Battery Monitoring
• Cell
• Jar
• System
• Loading
• Impedance
• Networked

Electrical Systems

• Harmonics
• Scalable
• Generator Configurations
o Capacity
o Voltage – Low, Medium
o Prime
o Continuous
o Standby
• Generator Distribution
o Single
o Parallel

Electrical Systems

Generator Protection
• Differential Protection
• Surge Arrestors
• R-C Snubber Circuits

Dedicated Life Safety Generator or Battery Emergency Lighting, Fire Alarm

Noise Issues

Electrical Systems

Power to the Rack
• 120/208VAC, 415VAC
• PDUs
• RPPs
• Busway
• Branch Circuit Monitoring
• Above the Floor
• Below the Floor
• Short Circuit Withstand
• Arc Flash
• Keep Both Cords Live

Electrical Systems

• Maintenance PDU
• RPP Tie – RPP Alternate Main
• 4 Pole Breaker or Neutral Contactor for 4 Wire Transfer
• Transformer Efficiencies and DOE 2016
• Transformer Inrush Current
• Switchgear/Switchboards – Quality – Resiliency
• Draw Out, Bolt On, Plug In CB
• Overcurrent Protection and Code
• Solid State Trip
• Zone Selective Interlock
• Maintenance Mode for Reduced Arc Flash
• Fuses


Electrical Systems

MV Circuit Breakers
• Vacuum – Snubbers
• SF6

IR
• Manual Camera
• Internal Camera
• Windows
• Thermocouples

Electrical Systems

EPMS – Monitoring, Time Stamping – GPS Timing

Arc Flash/Coordination Issues and PPE Requirements

Rack Level PDUs/MOAs (Multi Outlet Assemblies) – Power Strips
• Smart/Dumb
• Metering
• Short Circuit Rating

Electrical Systems

• Green and Energy Savings
• Solar
• Fuel Cells
• Micro-Turbines
• Wind
• Distributed Generation and Load Curtailment
• Emissions – Prime/Standby/Scrubbers
• Generator – Lower Block Heater Temp to Save Energy
• Demand Peak Shaving
• Lighting Controls

Electrical Systems

Other Electrical Concerns
• Grounding/Bonding/Lightning Protection
• Surge Suppression Devices, SPD, TVSS
• Power Factor Correction
• Portable Generator Connection
• Portable Load Bank Connection
• Load Bank Circuit Breaker
o Load Bank Switchgear
o Generator
o UPS – After the Static Switch

Electrical Systems

• Color Coded and Labeled Power Paths
• Snow/Rain and Generators
• Spare Part Storage – Power, UPS, Gens, HVAC
• Concurrent Maintainability
• Fault Tolerance

Facility Policies and Procedures

• A High Level of Redundancy Does Not Justify Lack of a Proper Operations and Maintenance Program
• Staffing Needs Should be Based on Your Risk Profile and Budget
• Workforce Scheduling:
• Emergency Response
• Equipment Maintenance
• Vendor Management
• Daily Minimums
• 7x24
• Weekends
• Contractor Support from First Call
• Vendor Support Timeline from First Call
• Size of Facility and Level of Automation will Change Requirements

Facility Policies and Procedures

Training:
• Program Effectively Provides and Verifies Proper Training
• Increases the Level of Expertise for All Individuals
• The Cost and Effort of Training Program Development are Offset by:
o Increased Uptime
o Lower Maintenance Cost
o Decreased Employee Turnover
• Ongoing Training Must Be Viewed as an Investment in the Overall Business
• Implement Methods to Help Retain Valued, Dedicated Staff Members
• Whether In-house or Outsourced

Facility Policies and Procedures

Training:
• Each Year of Experience Gained Increases the Likelihood an Individual will Make the Best Decision in a Crisis and Effectively Follow the Appropriate Procedure when Performing a Scheduled Task
• Provide Training Videos – From Commissioning Process
• Invest in Training Simulators for M and E Systems
• Failure to Overlay Your Operations Program with Documented Processes and Procedures Leads to Failure

Facility Policies and Procedures

For Technicians:
• Benchmark Compensation and Benefits
• Monetary Incentives
• Ownership of Processes and Systems/Rotate to Broaden Expertise
• Solicit Ideas from Personnel/Feedback
• Drill and Test Skills
• Create Drills for Emergency Procedures
• Develop Theory of Operation for Major Equipment/Systems
• Create Training Modules for Operating and Maintenance Procedures
• Develop Exams for Various Training Levels

Facility Policies and Procedures

Operations:
• Change Control Process
• SOPs – Standard Operating Procedures
• MP – Maintenance Procedure
• EOPs – Emergency Operating Procedures
• MOPs – Methods of Procedure

Facility Policies and Procedures

Operations:
• Vendor Management – OEM Recommended Maintenance is the Minimum. Stick to the Minimum for Your Mission?
• Emergency Response Needs Dedicated Training
• Detailed Incident Reporting
• Failure Analysis/Modeling
• Lessons Learned
• QA/QC/QI – Review and Provide Documented Feedback on Procedures
• Use Software Management Tools – DCIM, CMMS (Computerized Maintenance Management System, Asset Management), EPMS, Infrastructure Capacity Management, Single Pane of Glass/Dashboard

Facility Policies and Procedures

Operations:
• Operational Audits to Validate Work Done
• DMS – Document Management System
• Updated Basis of Design Document
• As-built and Record Drawings
o Updated Floor Plans, One Lines and Flow Diagrams
• Asset Database
• Preventative Maintenance Scope of Work
• Maintenance Schedule
• Critical Facility Work Rules
• Safety Program
• Facility Reports
• Walkthrough Checklist

Facility Policies and Procedures

Operations Cont.:
• Regulatory Conformance – Audits
o Internal
o 3rd Party
• Safety Program
• Equipment Labeling That Gives Location and System Type by Color
• Dedicated Maintenance Shop
• Lockers/Showers/Cots

Copyrighted by Capitoline LLP

Facility Policies and Procedures

Program Must be Based on the Risk Profile of the Data Center:
• When Can Maintenance be Performed – 7x24 Operation, All Critical All the Time?
• Concurrent Maintainability
• Fault Tolerance

Facility Policies and Procedures

• Contractor Work Rules/Site Training
o Mundane (No Food or Beverages in Critical Areas)
o Life Safety (Appropriate Arc Flash Gear will be Worn During Certain Electrical Work Activities)
o Use of Dedicated Bathrooms
o Cutting and Burning in the Data Center
o Fire System Disable and Enable During/After Work
o Lock Out Tag Out – LOTO
• Procedures and Guidelines for Safely Performing Work in an Active Data Center
o Uptime Institute White Paper

Facility Policies and Procedures

Maintenance:
• Break-fix, Preventive, Predictive Maintenance/Trending
• Lifecycle Strategy to Maintain Full Life of Equipment – Rotation, PM, Trending of Operation for Predictive Maintenance
• Vibration Trending and Analysis – All Rotating Equipment
• IR Scanning
• Energy Management
• Daily Inspections

Siemens Study on Electrical Maintenance 2012

5 Key Things You Have Learned During this Session

1. Codes and Standards in Design and Operation of a Data Center
2. Physical Security Best Practices
3. Mechanical Systems Best Practices
4. Electrical Best Practices
5. Operations Best Practices

Thank you

Steven Shapiro, P.E., ATD
Partner

(914) [email protected]

http://www.linkedin.com/in/stevenshapirope
Twitter: @stevenshapirope