  • 8/14/2019 Improving_the_Performance_of_IT_Operations.pdf

    Improving the Performance of IT Operations

    Creating, Managing, and Improving Your IT Metrics

    10/23/2008 Improving the Performance of IT Operations

    Overview

    Share our experiences with managing operational performance via metrics in IT

    Our approach

        Choosing the right metrics

        Method of sharing

        Developing action plans and improvement

    Examples

        Mean Time to Repair (Support Groups)

        Average Speed of Answer (Help Desk)

        Enterprise Data Warehouse Data Currency

    Q&A

    Historical Approach to Metrics - SPC

    Monthly Ops Reviews

    Large group formats

    Metrics driven by ticketing system

    Less About

    1. Explaining month-to-month variances

    2. Praise when better than target and dissatisfaction when worse

    3. Reacting to bad news without analysis

    More About

    1. Understanding the natural variation in a process

    2. Knowing how we are performing against customer desires

    3. Developing thoughtful action plans

    New Approach

    Focus on metrics that are meaningful to both the support organization and their customers

    Supportive and targeted

    Individualized

    Less about charting, more about analysis and improvement

    Metric Development

    Bring group management, key employees and customers together

    Brainstorm ways that demonstrate that we:

        Don't let it break

        If it breaks, we fix it fast

        We fix it right the 1st time

    Metric Development

    Don't let it break

    Customer Focused: Reactive | Proactive

    Back Office Focused: Reactive | Proactive

    Variants

    Project Management

        Don't let it be late

        If it's late, minimize the impact fast

        Plan it right the first time

    Architecture

        Don't let the design be incorrect

        If it's incorrect, fill the gap quickly

        Design it right the first time

    Security

        Don't let them compromise data

        If they do, catch them fast

        Plan security right the 1st time

    Sample IT Network Engineering & Support

    Don't let it break | If it breaks, we fix it fast | We fix it right the 1st time

    Traffic blocked at firewall

    Repeat troubles for customers or devices

    Repeat TAC calls to vendor

    Certifying device & config standards

    % of successful changes

    Mean Time to Repair

    Spanning Tree Convergence Time

    WAN interface utilization

    3rd party response times

    Ticket routing & assignment accuracy

    Memory utilization on key network devices

    Trunk utilization

    End of Life/Support Hardware

    Interface Errors

    Call quality for VOIP

    Traffic Analysis / utilization

    % routers with unsaved configs

    Network Engineering & Support Comparison

    Network Engineering & Support - Improving on Mean Time to Repair (MTTR)

    Problem: It was taking too long to repair low-priority problems.
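As a rough illustration of how a per-priority MTTR figure like the one behind this problem statement might be derived from ticket data, here is a minimal Python sketch. The ticket fields (`priority`, `opened`, `resolved`), the function name, and the sample records are all hypothetical; the deck does not describe the ticketing system's schema or how MTTR was actually calculated.

```python
from collections import defaultdict
from datetime import datetime


def mttr_hours_by_priority(tickets):
    """Mean hours from open to resolve, grouped by priority.

    Tickets with no resolve time are skipped (an assumption of this sketch).
    """
    durations = defaultdict(list)
    for ticket in tickets:
        if ticket["resolved"] is None:
            continue
        elapsed = ticket["resolved"] - ticket["opened"]
        durations[ticket["priority"]].append(elapsed.total_seconds() / 3600)
    return {prio: sum(hours) / len(hours) for prio, hours in durations.items()}


# Hypothetical sample records, not data from the presentation.
tickets = [
    {"priority": "low", "opened": datetime(2008, 1, 7, 9), "resolved": datetime(2008, 1, 8, 9)},
    {"priority": "low", "opened": datetime(2008, 1, 9, 9), "resolved": datetime(2008, 1, 11, 9)},
    {"priority": "high", "opened": datetime(2008, 1, 9, 9), "resolved": datetime(2008, 1, 9, 11)},
    {"priority": "low", "opened": datetime(2008, 1, 12, 9), "resolved": None},
]

for priority, hours in sorted(mttr_hours_by_priority(tickets).items()):
    print(f"{priority}: {hours:.1f} h")
```

Splitting the mean by priority is what makes a problem such as slow low-priority repairs visible; a single overall MTTR would blend it with the faster high-priority work.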

    Why-Because Pursuit

    Fifty reasons why the process was not capable, but only six common themes

    Common Themes

    Understanding of requests

    Ticket monitoring

    External noise

    Varying effort

    Priorities

    Timing of ticket entry

    NES MTTR Control Chart and Process Break

    [Control chart: MTTR (hours) by month, Jan through August; center line X-bar = 20.5, UCL = 31.7, LCL = 9.4]

    January - August 2008 MTTR
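The deck does not say which type of control chart was used. As a sketch only, the Python below computes limits for an individuals (XmR) chart, a common choice for one value per period: the center line is the mean, and the limits sit at 2.66 times the average moving range on either side (2.66 is 3 divided by the d2 constant 1.128 for ranges of two points). The function names and the monthly values are hypothetical and are not the data behind the chart above.

```python
from statistics import mean


def xmr_limits(values):
    """Return (center, lcl, ucl) for an individuals (XmR) control chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    center = mean(values)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center, center - spread, center + spread


def out_of_control(values):
    """Indexes of points falling outside the computed control limits."""
    _, lcl, ucl = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical monthly MTTR values in hours.
monthly_mttr = [22.0, 18.5, 24.0, 19.0, 21.5, 17.0, 23.0, 20.0]
center, lcl, ucl = xmr_limits(monthly_mttr)
print(f"center={center:.1f}  LCL={lcl:.1f}  UCL={ucl:.1f}")
print("points outside limits:", out_of_control(monthly_mttr))
```

When a chart shows a process break, the limits would typically be recomputed from the points after the break, so that the new limits describe the changed process rather than a mix of old and new behavior.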

    IT Help Desk - Improving Average Speed of Answer (ASA)

    Problem: The IT Help Desk was not answering employee phone calls quickly enough

    Through Why-Because pursuit, identified four areas of opportunity:

    Incentives

    HR policies

    Shift schedule changes

    Additional customer choices

    ASA Control Chart and Process Breaks

    [Control chart with four numbered process breaks, 1 through 4]

    Results

    Process now operates at 3 sigma

    Reduced wait time by 4 minutes
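The deck does not state how its sigma figures were calculated. One simple reading, assumed here, is the distance from the process mean to an upper target, measured in standard deviations. The sketch below shows that calculation in Python; the function name, the sample wait times, and the target value are hypothetical, and other definitions of sigma level (for example, ones based on defect rates) would give different numbers.

```python
from statistics import mean, stdev


def sigma_level_upper(values, upper_spec):
    """Standard deviations between the process mean and an upper target.

    Assumes a one-sided target where smaller values are better,
    as with wait time or repair time.
    """
    spread = stdev(values)  # sample standard deviation
    if spread == 0:
        raise ValueError("no variation in the data; sigma level is undefined")
    return (upper_spec - mean(values)) / spread


# Hypothetical answer times in seconds and a hypothetical 60-second target.
answer_times = [30, 42, 35, 28, 39, 33, 37, 31]
print(f"sigma level: {sigma_level_upper(answer_times, 60):.2f}")
```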

    System Administration - Improving Mean Time to Repair (MTTR)

    Problem: The System Administration group was not solving problems quickly enough

    Identified five areas of opportunity:

    Staff levels

    Improved ticket management

    Weekly awareness of metrics

    Optimized process for elevated permissions

    Developed incentive plan

    System Administration Control Chart

    Capability Comparison Over Time

    Results

    Problem appeared to be workload related, but proved to be both process and workload

    Now operates at > 4 sigma

    Enterprise Data Warehousing - Improving Load Times

    Problem:

    The EDW Nightlies Informatica data load needs to be more consistent on a day-to-day basis, resulting in fewer load errors, faster reaction times to errors, and overall improved load completion times.

    Enterprise Data Warehousing - Improving Load Times

    Illustration of Improvement

    NOT DONE YET!!

    Problem Statement

    The EDW Nightlies Informatica data load needs to be more consistent on a day-to-day basis, resulting in fewer load errors, faster reaction times to errors, and overall improved load completion times.

    Action Plan:

    A. Modify OBIEE parameter settings to reduce extra database processes from the application.

    B. Modify PEDW data parameter to increase shared memory allocation to the database, thus eliminating database lockups.

    C. Add SiteScope alerts on critical load failures (w/ Command Center callouts to EDW Support) by end of Sept

    D. Add SiteScope alerts when critical loads do not start on time (w/ Command Center callouts to EDW Support) by middle of Oct

    E. Upgrade PEDW to 10g on RAC by end of Oct

    F. Upgrade Informatica to 8.6 by end of Nov

    Current Status / Updates

    OBIEE changes done and work well.

    PEDW shared memory allocation parameters modified (8/20) and appear to have solved the lockup problems. We will continue monitoring.

    SiteScope alert on failure in progress, working out bugs involving Command Center action.

    SiteScope alert on load schedule miss is being designed.

    PEDW upgrade to 10g/RAC planned for week of Oct 6 with Oct 13 completion date. SEDW testing going well.

    Informatica 8.6 upgrade still being planned.

    Expected Results

    EDW Nightly load times and load error metrics will be capable as a result of these changes.

    The upgrade to 10g/RAC and Infa 8.6 should improve the completion times by at least 30% over the current 9i and Infa 7.1.3 infrastructure.

    Lessons Learned

    Those that volunteer first have been most successful

    If you're not happy with the metrics, start over

    Involve the front line employees

    Review in small groups, monthly at a minimum

    Share the results with your customers

    Document your successes!

    Wrap Up

    Matured our metrics program

    Collaborative approach to metric selection

    Focused on action planning and improvement

    Individual success stories

    Q&A