Rebooting the Team - Surge 2013

DESCRIPTION

A year ago, our software development team ended up in a funk. Simply put, we had some bugs in our processes, relationships and environment that were preventing us from being the best team we could be. So we did what any good dev team does when it encounters bugs: we deconstructed the problems, determined the root causes and implemented some fixes. I’ll share our story and discuss the lessons we learned along the way. You’ll take away ideas and tools that can help you explore these critical, but often tricky, topics in order to prepare your team to really scale.

Lessons Learned in an Introspective Year

Rebooting the Team

Fran Fabrizio, IT Director, Minnesota Population Center

Twitter: @franfabrizio Email: fran@umn.edu

Motivation for Rebooting the Team

DISCLAIMER

This is not a talk specific to scalability.

So, why am I here?

It goes like this...

Motivation for Rebooting the Team: SCALABILITY EDITION!

The Surge organizers contacted me and said “We came across your talk about rebooting your team. We’re filling out our organizational scalability track - would your talk work well there?”

This was an interesting question. I embarked upon a “pondering walk”...

In the next 50 minutes we’ll review our case study, including:

• Why we need reboots
• Gut feelings --> specific symptoms
• The collaborative process used to get to the root causes
• Leveraging the insight to build consensus for change and action
• Where we are and what we learned
• Q&A

Audience Takeaway

Techniques, tools, activities, processes, ideas, sparks.... stuff you can use to help figure out what’s ailing your team, whether your team needs a reboot, and how to collaboratively enact lasting change.

Not going to be a silver bullet - I’m giving you design patterns. You will have homework!

What’s a Reboot?

A conscious decision to engage in deeper, more radical change than just incremental improvements.

A reboot typically impacts staff structure, work processes and communication patterns for your team and organization.

Why Do We Need Reboots?

[Figure: DEV TEAM POWER METER, a gauge running from 100% to 0%]

Team firing on all cylinders, shipping great code, everyone contributing, team greater than sum of its parts!

Enlightened organizations keep the needle at 100% by proactively anticipating the changes in their environment and responding to them gracefully over time.

Most organizations are not that enlightened. The needle begins to slide as the org is slow to respond to changes in their environment.

By the time there is awareness and consensus for change, the amount of change needed is often too great to achieve incrementally.

Wetware Reboots are Hard!

• Wetware is wonderfully, messily analog, nondeterministic, and mysterious - it doesn’t respond predictably to change.

• Reboots are costly and disruptive - we wouldn’t do them if we didn’t have to

• An engineer’s wetware skillset is typically less developed than their software/hardware skills

The Dev Manager’s Mission

Something feels wrong. The team’s not working as well as it used to. You’re not quite sure what it is yet, or more importantly why it’s happening, or how to fix it. But you’re the one everyone’s looking to for a fix.

How do you get to the whats, whys, and hows? You need...

Organizational Debugging

A framework for turning observed behaviors and stakeholder input into a clear understanding of where your team has deficiencies and how to address them.

1. Observe behavior and get stakeholder views
2. Distill into themes
3. Dig until you converge on root causes
4. Execute action plans for each root cause

[Diagram: the organizational debugging flow. Stage labels: Symptoms, Smells, Themes, Root Causes, Actions. Axis: Concrete to Abstract. Phases: Diagnosis, then Treatment. Decision point: “Do we need a reboot?”]

How is this Different than Normal Management?

It’s amplified.

How to Approach a Reboot

• Respect the day job
• Change is scary. Be consistent, overcommunicate, and check in often
• You are a facilitator: listen more than talk
• HRT - Humility, Respect and Trust - is essential. (from the book Team Geek)

Why HRT is so Important

• When organizational debugging is done collaboratively and with HRT, it produces momentum for change.

• When it’s not, it produces resentment and friction, and gets in the way of the organization executing needed change.

Observing the Symptoms

• Recall this typically starts as a gut feeling that “something’s not right”

• Start by writing down a list of symptoms that are giving you this gut feeling.

• Then put on your facilitator hat and start asking others (inside and outside your team) targeted but open-ended questions

Good Questions to Ask

• How did your week go?
• What are your biggest pain points right now?
• What do you think are the most important things for us to be doing / thinking about?
• Do you need anything from me?
• How are things with <insert customer>?

Listen Mindfully

As you’re having these conversations, listen for symptoms...

“Well, I spent the first half of the week setting up my dev environment on my new system, and then I had to put out a lot of fires on that one project. By the time I had any breathing room it was Friday and I couldn’t get any time with Mark, so I didn’t do any new feature development on my main project. Have you heard from the product team? I was wondering whether they want that new UI widget now, or if they want us to work on optimizing the query performance first?...”

Avoid

• What do you think is wrong with the team?
  – Sets off alarm bells and people feel compelled to answer even if they weren’t thinking it
• Try to avoid diving into solutions just yet
  – Suggestions fine, but there’s risk of treating symptoms and not the root cause at this stage.
• Don’t go too deep
  – First pass over your stakeholders, just getting a feel for things right now. Should feel casual.

You’ve been committed to things without your knowledge

Ship dates slip

Expected to do 5 things at once

No 1:1 meetings

Small changes take longer than they should

Setting up a dev environment takes 3 days

No one person knows how to do a full deployment

Every deployment results in a big mess

People miss key pieces of info

Sick time is spiking

You get a pit in your stomach when you walk in the door

New requirements appear late in projects

Staff working on similar problems not collaborating

I can’t move code between projects easily.

Staff members reluctant to share knowledge

Documentation is out of date

Status meetings turn into bitchfests

People lament the quality of their work

People are quitting!

Can’t say no to anything

Customers don’t trust what IT says.

Estimates are unreliable

Projects getting later and later

Secondary projects fall through the cracks

Build is always broken!

You’re on a death march, and everyone knows it.

People are struggling with their tools.

Every decision is made by committee

No consensus on priorities

Nobody knows what ‘done’ means

Team focusing on peripheral issues

Job descriptions no longer match reality

Surprises are common

Uptime is decreasing

Recruitment is getting more difficult

Long term goals aren’t getting closer

Exploratory work has stopped

I cannot determine the status of our systems.

I’m doing all the same things I was a year ago.

Vacations are disruptive

It’s too quiet!

Grouping Symptoms

• Go back to your symptom list and see if they group into related themes.

• Themes are “anchors” for discussion - rather than focusing on specific symptoms, which can get bogged down in the weeds
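
For readers who like to see a process as data, the grouping step can be sketched in a few lines of Python. The symptom texts and theme names are taken from the slides, but which theme each symptom is filed under, the function names group_by_theme and themes_by_weight, and the "Unthemed" bucket are all illustrative assumptions, not a record of the real mapping.

```python
from collections import defaultdict

# Hypothetical tagging: the symptom texts come from the slides, but the theme
# assigned to each one here is an illustrative guess.
TAGGED_SYMPTOMS = [
    ("Setting up a dev environment takes 3 days", "Ops Tools Deficient"),
    ("Build is always broken!", "Ops Tools Deficient"),
    ("I cannot determine the status of our systems.", "Ops Tools Deficient"),
    ("Staff working on similar problems not collaborating", "Teams are too silo'ed"),
    ("Staff members reluctant to share knowledge", "Teams are too silo'ed"),
    ("Ship dates slip", "Overpromising / Underdelivering"),
    ("Estimates are unreliable", "Overpromising / Underdelivering"),
    ("It's too quiet!", None),  # not every symptom has to fit a theme yet
]


def group_by_theme(tagged):
    """Return {theme: [symptoms]}; untagged symptoms go under 'Unthemed'."""
    themes = defaultdict(list)
    for symptom, theme in tagged:
        themes[theme or "Unthemed"].append(symptom)
    return dict(themes)


def themes_by_weight(themes):
    """Order theme names by how many symptoms they collect, largest first."""
    return sorted(themes, key=lambda name: len(themes[name]), reverse=True)


if __name__ == "__main__":
    grouped = group_by_theme(TAGGED_SYMPTOMS)
    for theme in themes_by_weight(grouped):
        print(f"{theme} ({len(grouped[theme])})")
        for symptom in grouped[theme]:
            print(f"  - {symptom}")
```

The symptom count per theme is only a rough signal of where to start a conversation; a theme with one symptom can still hide the most important root cause.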

Teams are too silo’ed

Ops Tools Deficient

Overpromising / Underdelivering

Themes that We Found

• Unfulfilled team members
• Bad office vibe
• Routine things are complicated
• Something’s always on fire
• Lack of trust in the developers
• Every day is a surprise party
• Ops Tools Deficient
• Single points of failure abound
• Overpromising/Underdelivering
• Leadership is distracted
• The team is too silo’ed

Getting to Root Causes and Solutions

Rooting out Root Causes

Second, much deeper pass with stakeholders.

Goals:
1. Build awareness that these problems exist
2. Achieve consensus re: root causes
3. Create momentum for change

The Estimates Story

Peeling off the Layers

• “IT isn’t good at estimates” really meant “We need to work towards more transparency and communication with our customers and management.”

• Deeper problem: communication disconnect between dev, management, and our customers. We need to treat this problem, not the “IT isn’t good at estimates” symptom.

The (Not So) Big Secret!

When you dig into a wetware problem, you’re always going to find at least one of these things:

TRUST
COMMUNICATION
PROCESS

But “We suck at communicating” is not actionable.

Get to more specific root causes.

Engaging the Stakeholders

To get the insight you need, people must be engaged in a way that’s meaningful to them.

Tailor approach, content and detail.

Understand their motivations, what they can uniquely contribute, and meet them there!

Engaging the Dev Team

• Their motivations:
  – Autonomy, Mastery and Purpose
  – Daniel Pink’s 2009 TED talk www.danpink.com/ac/ted-talk/
  – Build awesome stuff with awesome people
• What they can contribute:
  – They do root cause analysis all the time
  – Evidence-based view of the world
  – And of course, deep understanding of tech bits

• Your Goals:
  – Give the team as much ownership of the reboot as possible
    • Only fair, this is where the brunt of reboot change will land
  – Give visibility to long term strategy and external dependencies on the team
    • “State of the Outside World”

Dev Team Engagement Tactics

• Create dedicated time/space to foster group dialogue about the themes and root causes
  – Focus Friday
  – Forums/chat rooms
• “Squishy” book discussion. Example: Team Geek
• Meet privately with individuals regularly (should be doing this already).

Engage Management

• Their motivations: Exploiting market opportunities, mitigating risk, efficient use of resources - “macro things”
• What they can contribute:
  – org strategy
  – deep understanding of what other areas of the org are doing
  – better awareness of external opportunities and pressures

• Your goals: Provide visibility to issues, incorporate their big picture perspectives, obtain support for change
• Example Activities:
  – Roundtables - ask the “macro” questions
  – Vision & perception interview with key people
  – Use concise, powerful tools, such as...

[Figure: 2013 MPC IT MAJOR PROJECT ROADMAP, v1.0 (2/19/2013)]

Timeline columns: Q1, Q2, Q3, Q4
Row categories: INFRASTRUCTURE, PROCESS, FEATURE DEV
Project labels: NHGIS, IPUMS, TerraPop, Miscellaneous, Multiple Projects
Other label: 1960

Roadmap items:
• Rails 2 >> 3 Conversions
• Ruby 1.8 >> 1.9
• ATUS >> IPUMS Integration
• TerraPop Prototype
• Minimal Aggregation
• Nominal Integration
• Interpolation
• Solaris >> Linux
• Storage Upgrade
• Bibliography Rewrite
• Admin Core Database
• Archive Upgrade
• Subversion >> Git Conversions
• TeamCity >> Jenkins Conversions
• Separate Webapp from Extract
• IDHS Metadata Tools
• Time Series Support
• TerraPop Production
• IDHS Webapp Development
• User Authentication & Authorization Service
• MDT Metadata Tool Development
• General IPUMS Feature Development
• Static >> Drupal Migrations
• Column-Store Database Migration
• SDA Extensions & Modifications
• NHGIS Data Ingest
• Static Asset Management
• Systems Monitoring and Reporting
• Hadoop & Big Data Technologies R&D

Management Engagement Tactics

Develop a Product Vision Picture

Engaging the Customer

• Their Motivations:
  – Want solutions which solve their problems
  – Want to have transparency into the process
• What they can contribute:
  – User centric design and prioritization
  – More honest feedback - they don’t have as many relationships or as much baggage to protect

• Your Goals:
  – Bring the customer closer into the team
  – Get them to help your org prioritize
  – Make them happy :-)

Customer Engagement Tactics

• Roundtables with our customer teams
• Retrospectives after each major release
• Example tool: Mapping the communication flow

Example of a Communications Map

Converging on Root Causes

Dev Team Insight: “We get a lot of requests that pull us away from our core projects.”

Management Insight: “We want to help campus folks who are doing demographic research, but we don’t have a process for vetting or prioritizing requests.”

Customer Insight: “The MPC agreed to do my research web site, but when I call over there it seems that there’s always something more pressing.”

Suggests the root cause might be Lack of Focus on Mission

Why do we think “Something’s Always on Fire?”

Some of our Root Causes

• Lack of Focus on Core Mission

• Team Poorly Aligned with Strategy

• Operational Deficits

• Lack of Customer-Developer Transparency

Getting to Solutions

• For each root cause, collaboratively design an action plan that mixes easy wins with longer-term fixes

Quick sampling of our reboot action plans...

Solutions: Lack of Focus on Mission

• Early Wins:
  – Defined the mission!
  – Made all hidden work visible to management
  – Outsourced or killed fringe projects and other distractions and aligned effort to the core mission
• Long-term Work:
  – Align projects to org vision, not vice-versa
  – Team structure realignment

Solutions: Poor Alignment with Strategy

• Early Wins:
  – Product Vision document
  – Some early hires not tied to specific projects
• Long-term Work:
  – Conway’s Law and its application

Conway’s Law

“Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations.”

Melvin Conway, 1968

Align to the thing you want to build...

Solutions: Operational Deficits

• Early Wins:
  – Evolve tools: SVN → Git, TeamCity → Jenkins, Chef, Vagrant
  – Management support for 20% tech debt time
  – HipChat
• Long-term Work:
  – Ongoing technical debt reduction
  – System Monitoring API

Solutions: Lack of Customer-Developer Transparency

• Early Wins:
  – Formalize a quarterly planning process
  – Open work tracking tool to customers
• Long-term Work:
  – Refactoring the communications model

Messier, but more effective!

New Comm Model

Measuring Outcomes

• Probably the most important one is qualitative: “Hey, things feel better now!”  

• Quantitative indications will eventually come. Focus on measures that have meaning to your org’s culture (KPIs?)

• Share metrics and results widely to keep momentum going

Unintended Consequences

Be prepared to iterate!

Where are we now?

• We feel good about...
  – Management support for the development team
  – Long term vision and goals
  – Awareness of the need to prioritize/focus efforts
  – Communication patterns
• We are still working on...
  – Acting holistically within a project-specific funding model
  – Aligning the staff to best support the desired product
  – More effective use of ops tools to automate routine work

Lessons Learned

• Know who you are as an organization
  – History, context, strengths, weaknesses, constraints, strategy
• An Org-first AND People-first approach is possible
• People are messy
  – HRT underlies successful org change
  – Assume that everyone has good intentions
  – Don’t hide behind technology - wetware issues require face time
• Change is scary. Expect resistance. This is a process of influence - you can’t make others change directly.
• Don’t fear going against the grain - do what’s right for your org, don’t be a slave to any particular process, system, methodology

Avoid reboots if at all possible!

They’re expensive, disruptive and tricky to execute.

Instead...

Incorporate your newfound tools and techniques into your daily workflow to pay more attention to how change is impacting your team, engage your colleagues, and respond to it proactively before you need another reboot!

Thank You!

Continue the conversation...

Twitter: @franfabrizio
Email: fran@umn.edu

Special thanks to Peter Clark (@pclark) for his contributions to an earlier version of this presentation.