Automated Failure Management
PUTTING TECHNOLOGY TO WORK IN BUSINESS CONTINUITY
Indra Mohan, CEO, Enigmatec Corp
© Enigmatec Corporation 2005
Agenda
• Company Introduction
• Business Continuity Planning - The IT Challenge
• Automated Failure Management - Use-Case
• Our Product
• Technology Trends
• Summary
Enigmatec Company Introduction
• Founded in 2001 to commercialise research pioneered at Edinburgh and Cambridge University
• Offices in London, New York & Palo Alto
• Policy-based Automation Product released June 2004
• We automate the Data Center “Run Book”
• Partners include:
  • Intel
  • Sun Microsystems
  • VMware
Business Continuity Planning
The IT Challenge
OVERALL BUSINESS GOALS: LESS DOWNTIME
• Number of Failures
• Recovery Time
EXTERNAL DRIVERS
• Competition
• New Regulations
INTERNAL DRIVERS
• Do More With Less
• Single System View
Business Continuity Planning
The Situation Today
• Data Center Business Continuity Today
  • Hardware is Over-Configured
  • Management Tools Were Built To Monitor, Not RESPOND
  • Too Much Human Intervention
• As A Result, Failure Response is
  • Inconsistent & Error-Prone
  • Too Slow
• Current Solutions Are Inadequate
  • Scripts and Manual Procedures Do Not Scale
  • High-Availability Clustering Is Expensive
Business Continuity Planning
How Technology Will Help
RECENT EVOLUTION OF IT MANAGEMENT SOFTWARE
• PROVISIONING: CONFIGURATION OF APPLICATIONS
• VIRTUALIZATION: SERVER CONSOLIDATION
• POLICY-BASED AUTOMATION: RUNNING OF APPLICATIONS
[Timeline axis: 2001 2002 2003 2004 2005 2006]
Automated Failure Management Use Case
Major Investment Bank
Application Environment
• Trading System
• Compute Grid
The Challenge
• Failure Response is Slow and Prone to Error
• Hardware is Over-Configured and Under-Utilized
[Diagram: TRADING SYSTEM & COMPUTE GRID]
• Compute grid stack: Sun hardware, Solaris O/S, database, app server, client-built compute grid, systems monitor
• Trading stack: Intel hardware, Linux O/S, database, app server, client-built trading apps, systems monitor
Automated Failure Management Use Case
Data Center Environment
• The Traditional Approach
  • 3 Data Centers
  • 4 Environments (Production, BC, Test & Dev.)
  • Application recovery procedures are manual and script-based
• Goals
  • Repeatable Failure Policies: Process Patterns
  • Reduce Application Recovery time from 1 hour to 15 minutes
  • Eliminate BC Data Center
[Diagram: three data centers (NY - Production; NJ - Test, Dev.; Delaware - BC), each with Servers, Network, Disk/SAN. Inputs: Alerts, Logs, Warnings. Responses: Run Books, Scripts, Manual, Ad Hoc]
Automated Failure Management Use Case
The Enigmatec Solution
• Enigmatec Solution
  • Enigmatec detects failure in the Production Data Center
  • Repurposes the Test & Dev. Data Center into BC
  • Restarts the application stack
  • Manual procedures automated using extensible policy-driven workflow
[Diagram: SLA Monitor over two data centers (NY - Production; NJ - Test, Dev.), each with Servers, Network, Disk/SAN]
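The detect, repurpose, restart sequence on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not Enigmatec's policy language or product API: the `DataCenter` class, the `fail_over` function, and the `STACK` ordering are hypothetical names introduced here, with the stack layers borrowed from the use-case diagram.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    """Hypothetical model of a site; not an Enigmatec type."""
    name: str
    role: str                      # e.g. "production", "test-dev", "bc"
    healthy: bool = True
    running: list = field(default_factory=list)


# Assumed restart order, bottom-up, loosely based on the use-case diagram.
STACK = ["database", "app server", "trading apps"]


def fail_over(production: DataCenter, standby: DataCenter) -> list:
    """Return the ordered actions taken; an empty list if production is healthy."""
    actions = []
    if production.healthy:
        return actions
    # Step 1: failure detected in the production data center.
    actions.append(f"detected failure in {production.name}")
    # Step 2: repurpose the test/dev site into the business-continuity role.
    standby.role = "bc"
    actions.append(f"repurposed {standby.name} as bc")
    # Step 3: restart the application stack layer by layer.
    for layer in STACK:
        standby.running.append(layer)
        actions.append(f"restarted {layer} in {standby.name}")
    return actions


ny = DataCenter("NY", "production", healthy=False)
nj = DataCenter("NJ", "test-dev")
for step in fail_over(ny, nj):
    print(step)
```

Returning the list of actions, rather than only performing them, gives an audit trail that a console could display, which matches the slide's emphasis on repeatable procedures.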
Automated Failure Management Use Case
Benefits and Next Steps
• Benefits
  • Reduced Application Recovery Time
    • 15 Minutes vs. 1 Hour +
  • Doing More With Less
    • Over $1M in Hardware Savings
    • Over $2M per year OpEx Savings
• Next Steps
  • Roll Out To Additional Business Units
  • Automate Additional IT Procedures:
    • Scale-In / Scale-Out
    • Validation and Testing of Production Systems
  • Single Systems Management View
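As a quick check on the recovery-time figures, the relative improvement can be worked out directly. The helper name `reduction_pct` is introduced here for illustration. Because the slide quotes the old recovery time as "1 Hour +", treating it as exactly 60 minutes gives a lower bound on the reduction.

```python
def reduction_pct(before_minutes: float, after_minutes: float) -> float:
    """Percentage reduction going from before_minutes to after_minutes."""
    return 100 * (before_minutes - after_minutes) / before_minutes


# 60 minutes is the minimum implied by "1 Hour +"; 15 minutes is the new figure.
print(reduction_pct(60, 15))  # 75.0
```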
The Enigmatec Product – Policy-Based Automation
Key Features
• Extensible Policy Execution Language
  • Self-organized deployment of policies across the network
  • Automated service discovery
  • Changes can be deployed “on-the-fly”
• No Single Point of Failure
  • Distributed Agents
  • Full peer-to-peer architecture
  • No centralized server
• Monitoring and Execution of Policies to SLAs
  • Simultaneously platform and application aware
  • Full service-oriented architecture
How We Do It
Core Concepts
IT ELEMENTS + WORKFLOWS = POLICIES
• Elements: “What” to Automate
  • All Leading Platforms
  • Pre-Built Interfaces
• Workflows: “How” to Automate
  • Replace Scripts and Manuals
  • Device Specific
• Policies: “When, Where, and Why”
  • Easy to Design
  • Easy to Manage
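One way to picture how the three concepts fit together is as plain data types. The sketch below is an assumption made for illustration, not Enigmatec's data model: `Element`, `Workflow`, and `Policy`, along with their fields, are hypothetical names. An element is the "what", a workflow is the "how", and a policy binds them with a "when", a "where", and a "why".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Element:
    """The 'what': a managed IT element and its observed state."""
    name: str
    state: Dict[str, float]


# The 'how': a workflow maps an element to an ordered list of actions.
Workflow = Callable[[Element], List[str]]


@dataclass
class Policy:
    """Binds an element selector and a trigger condition to a workflow."""
    name: str
    applies_to: Callable[[Element], bool]   # where
    condition: Callable[[Element], bool]    # when
    workflow: Workflow                      # how
    reason: str                             # why

    def evaluate(self, element: Element) -> List[str]:
        """Return the workflow's actions if the policy fires, else nothing."""
        if self.applies_to(element) and self.condition(element):
            return self.workflow(element)
        return []


restart_db = Policy(
    name="restart-db",
    applies_to=lambda e: e.name == "database",
    condition=lambda e: e.state["up"] == 0.0,
    workflow=lambda e: [f"restart {e.name}"],
    reason="meet recovery SLA",
)

print(restart_db.evaluate(Element("database", {"up": 0.0})))
```

Keeping the workflow separate from the trigger means the same recovery procedure can be reused under different policies, which is the slide's point about replacing one-off scripts.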
How We Do It
A Distributed Service Grid
1. Design Policies & Set SLA/RTOs
2. Test/Verify
3. Deploy to Data Center
4. Agents Monitor for Failure and Execute Policies
5. Agents Report SLA and RTO Behavior to Console
[Diagram: Design Repository feeding the ENIGMATEC SERVICE GRID, which runs over PHYSICAL RESOURCES: Web Farms, Clusters, App Servers, Blades, Racks, Network, Disk/SAN]
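Step 5, reporting recovery behavior against an RTO (recovery time objective), amounts to comparing measured recovery times with a target. The following is a minimal sketch under assumed names (`RecoveryRecord`, `rto_report`). The slides do not describe the actual report format, and the timestamps here are made-up illustration values in seconds.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RecoveryRecord:
    """One failure-and-recovery event as an agent might log it (hypothetical)."""
    service: str
    failed_at: float      # seconds since some reference point
    recovered_at: float


def rto_report(records: List[RecoveryRecord], rto_seconds: float) -> List[Dict]:
    """For each event, compute the recovery time and whether it met the RTO."""
    report = []
    for r in records:
        elapsed = r.recovered_at - r.failed_at
        report.append({
            "service": r.service,
            "recovery_seconds": elapsed,
            "met_rto": elapsed <= rto_seconds,
        })
    return report


# A 15-minute RTO, matching the recovery goal stated in the use case.
events = [
    RecoveryRecord("trading apps", failed_at=0.0, recovered_at=600.0),
    RecoveryRecord("compute grid", failed_at=100.0, recovered_at=1300.0),
]
for row in rto_report(events, rto_seconds=15 * 60):
    print(row)
```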
Data Center Macro Trends
Where We Are Going
• Shift from Dedicated to Shared Compute
  • Similar to Evolution of Storage Area Networks
  • Dynamic Re-Allocation of Resources
• Modular Data Centers
• SLA-Based Management
• Utility “Dial-Tone” Computing
…and Automated Failure Management is the First Step
Thank You!