Climb Out of the Hole
CPTE 433 Chapter 2
Adapted by John Beckett from
The Practice of System & Network Administration
by Limoncelli, Hogan, & Chalup
The Problem
You are too busy fighting fires to get things done right
You have lots of fires because you aren’t doing things right
What is “Doing Things Right?”
• Use a trouble-ticket system
• Manage quick requests properly
• Adopt time-saving policies
• Start every host in a known state
• …
Use a Trouble-Ticket System
• Ensures that each ticket goes to completion
  – User, dispatcher, and tech sign off
• Documents current state of unresolved tickets
• Provides important historical data for management/planning
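The sign-off rule above can be sketched in a few lines. This is a minimal illustration, not any particular ticketing product: the `Ticket` class, the role names, and the `unresolved` helper are all hypothetical, chosen only to show a ticket that counts as complete once user, dispatcher, and tech have each signed off.

```python
from dataclasses import dataclass, field

# Hypothetical role names taken from the slide: user, dispatcher, tech.
REQUIRED_SIGNOFFS = frozenset({"user", "dispatcher", "tech"})


@dataclass
class Ticket:
    """A minimal trouble ticket that is complete only when all roles sign off."""
    ticket_id: int
    summary: str
    signoffs: set = field(default_factory=set)

    def sign_off(self, role: str) -> None:
        if role not in REQUIRED_SIGNOFFS:
            raise ValueError(f"unknown role: {role}")
        self.signoffs.add(role)

    @property
    def is_complete(self) -> bool:
        # Complete when every required role appears in the sign-off set.
        return REQUIRED_SIGNOFFS <= self.signoffs


def unresolved(tickets):
    """Return the tickets still waiting on at least one sign-off."""
    return [t for t in tickets if not t.is_complete]
```

The `unresolved` list is the "current state of unresolved tickets" in its simplest form; a real system would also keep timestamps for the historical data.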
Trouble Ticket Cycle
[Diagram: the Trouble Ticket System at the center, connected to SA Dispatch, SAs, SA Mgt, Clients, and Admin; other labels: Dashboard, Management Interface, Service]
A trouble ticket system is the core of SA management
Trouble Ticket System Messages
• SA Dispatch
  > Enter ticket
  > Reactivate ticket
  < Ticket status
• SAs
  < Ticket information
  > Tasks done
  > Needs
  > Add’l tickets
• SA Mgt
  < Dashboard
  > Decisions
• Client
  < Status information
  > Add’l data
• Administration
  < Value delivered
  > Money needed
Managing Quick Requests
• Have a shield
  – SA assigned to quick requests
• You may want to rotate this duty
• Are all SAs equally trained or do they specialize?
• Precursor tasks (e.g. password resets) need priority
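One way to rotate shield duty is a weekly round-robin. The sketch below assumes a weekly hand-off on Mondays and equally trained SAs; the function name `shield_for` and the roster are hypothetical, and a team with specialists would need something richer.

```python
from datetime import date


def shield_for(day: date, sas: list[str]) -> str:
    """Pick the SA on quick-request duty, rotating once per week."""
    if not sas:
        raise ValueError("need at least one SA")
    # toordinal() counts days from 0001-01-01, which was a Monday, so
    # this gives a continuous week index with weeks starting on Monday.
    week_index = (day.toordinal() - 1) // 7
    return sas[week_index % len(sas)]
```

Everyone gets the same answer for any day in the same week, so the schedule can be published in advance.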
Trouble Ticket Response Time Is (sort of) Like Network Packets
• Latency
  – How long it takes for a problem to reach the person who can solve it.
• Bandwidth
  – How quickly a person can solve a problem
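The analogy can be made concrete by splitting a ticket's total time into the two parts. This is a rough sketch that assumes the ticket system records three timestamps (opened, reached the solver, resolved); the function and field names are hypothetical.

```python
from datetime import datetime, timedelta


def response_breakdown(opened: datetime, reached_solver: datetime,
                       resolved: datetime) -> dict[str, timedelta]:
    """Split a ticket's total time into latency and solving time."""
    latency = reached_solver - opened    # waiting to reach the right person
    solving = resolved - reached_solver  # time the solver spent on it
    return {"latency": latency, "solving": solving,
            "total": latency + solving}
```

A ticket opened at 9:00 that reaches the right SA at 11:00 and is resolved at 11:30 has two hours of latency and thirty minutes of solving time, which suggests that routing, not skill, is where the time went.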
Time-Saving Policies
• Define how people are to get help from your group
• Define scope of responsibility for your team
• Define emergency
• Define “quick request”
• These policies are the responsibility of SA management
  – Individual SA discretion needs to be defined
Policy Tradeoffs
Loose policies
• Individuals define their own policies
• Users gravitate toward “loose” individuals as long as they have success, then chaos grows
• Quick response for trivial requests
• Poor response for longer requests because of interruptions
Tight policies
• Individuals stick with group policies
• Users get consistent results
• SAs are not as productive
• Consistent response for all requests
• Sequencing problems
Start Every Host in a Known State
• Have standard build methods for servers and clients.
  – Make this a key part of your technology platform
• Limit the number of options
  – Record the options in that workstation’s entry in your inventory/ticketing system
• Automate the build process
• New projects start with a standard build
  – Document steps to final state
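Once a standard build and a short list of options exist, checking a host against them is mechanical. The sketch below is illustrative only: the settings in `STANDARD_BUILD`, the option names in `ALLOWED_OPTIONS`, and the `build_drift` helper are all invented for the example, standing in for whatever the inventory system actually records.

```python
# Hypothetical standard build and the short list of permitted options.
STANDARD_BUILD = {"os": "base-image-1", "ntp": "on", "backup_agent": "on"}
ALLOWED_OPTIONS = {"gpu_driver", "second_monitor"}


def build_drift(host_settings: dict, host_options: set) -> dict:
    """Report where a host departs from the standard build."""
    # Settings whose recorded value differs from (or is missing in) the standard.
    changed = {key: host_settings.get(key)
               for key, expected in STANDARD_BUILD.items()
               if host_settings.get(key) != expected}
    # Options recorded for the host that are not on the permitted list.
    unapproved = set(host_options) - ALLOWED_OPTIONS
    return {"changed": changed, "unapproved_options": unapproved}
```

An empty report means the host is still in the known state plus approved options; anything else is a lead for the next ticket.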
Other tips
• Make email work well
  – Stable, reliable, functional
• Fix the biggest time drain
  – Identify what’s bleeding you, and give it the resources needed to solve it
  – “Rinse and repeat”
• Quick Fixes
  – Get production going
  – Cost more to fix properly later on
More tips
• Sufficient power and cooling
  – People and technology malfunction if they aren’t comfortable
• Simple monitoring
  – Email-enable systems that might fail
  – Set up a Web-based dashboard
    • Work to make it more comprehensive
    • Practice using it
    • Update it as reality changes
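"Email-enabling" a system can start very small. The sketch below, using only the Python standard library, builds an alert message when a disk passes a fullness threshold. The function names, the threshold, and the address are assumptions for illustration, and the message is only constructed, not sent.

```python
import shutil
from email.message import EmailMessage


def build_alert(path: str, used: int, total: int,
                threshold: float, to_addr: str):
    """Return an alert email if used/total reaches `threshold`, else None."""
    fraction_used = used / total
    if fraction_used < threshold:
        return None
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = f"Disk {path} is {fraction_used:.0%} full"
    msg.set_content(f"{total - used} bytes free of {total} on {path}.")
    return msg


def disk_alert(path: str, threshold: float, to_addr: str):
    """Measure `path` with shutil.disk_usage and build an alert if needed."""
    usage = shutil.disk_usage(path)
    return build_alert(path, usage.used, usage.total, threshold, to_addr)
```

Sending the result would be a matter of handing the message to `smtplib.SMTP(...).send_message(msg)` against whatever mail host the site uses, which is one more reason to make email work well first.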
Beckett’s Tips
• Clearly identify and label each device and service with a unique name that does not collide with any other namespace
  – Never, never, never re-designate (rooms, devices, etc.)
• Solo operation with no trouble ticket system? Start with 3x5 cards
  – Perhaps print one side with basic info like who reported it, phone number, resource failing or needing upgrade, and 1-line description
• Expect to outgrow a trouble ticket system
  – If it’s comprehensive enough for the future, you might never get it off the ground in the first place
• Work through your trouble ticket system, not around it
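The "never re-designate" rule is easy to enforce if names are tracked in one place. The following is a minimal sketch, assuming a single shared registry and case-insensitive names; the `NameRegistry` class is hypothetical and not part of any tool named in the slides.

```python
class NameRegistry:
    """Hands out device and service names and never lets one be reused."""

    def __init__(self):
        self._ever_used = {}  # lowercased name -> what it designates

    def assign(self, name: str, thing: str) -> None:
        key = name.lower()
        if key in self._ever_used:
            raise ValueError(
                f"{name!r} already designates {self._ever_used[key]!r}")
        self._ever_used[key] = thing

    def retire(self, name: str) -> None:
        # Retiring keeps the name on record so it cannot be re-designated.
        key = name.lower()
        self._ever_used[key] = f"retired: {self._ever_used[key]}"
```

Because retired names stay in the registry, an old ticket that mentions a name always refers to the same room or device it did when the ticket was written.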