1
Scaling The Software Development Process:
Lessons Learned from The Sims Online
Greg Kearney, Larry Mellon, Darrin West
Spring 2003, GDC
2
Talk Overview
• Covers: software engineering techniques to help when projects get big
– Code structure
– Work processes (for programmers)
– Testing
• Does not cover:
– Game design / content pipeline
– Operations / project management
3
How to Apply It
• We didn’t do all of this right away
• Improve what you can
• Don’t change too much at once
• Prove that it works, and others will take up the cause
• Iterate
4
Match Process to Scale
[Chart: Team Efficiency vs. Team Size. A process for 5 to 15 programmers collapses into “Everything’s Broken Hell” as the team grows; a process for 30 to 50 programmers buries a small team in “Meeting Hell”. The crossover marks the point to change to a new process.]
5
What You Should Leave With
• TSO “Lessons Learned”
– Where we were with our software process
– What we did about it
– How it helped
• Some rules of thumb
– General practices that tend to smooth software development at scale
– Not a blueprint for MMP development
– A useful “frame of reference”
6
Classes of Lessons Learned & Rules
• Architecture / Design: Keep it Simple
– Minimizing dependencies, fatal couplings
– Minimizing complexity, brittleness
• Workspace Management: Keep it Clean
– Code and directory structure
– Check-in and integration strategies
• Dev. Support Structure: Make it Easy, Prove it
– Testing
– Automation
– All of these had to change as we scaled up.
– They eventually exceeded the team’s ability to cope using existing tools and processes.
7
Non-Geek Analogy
– Sharpen your tools.
– Clean up your mess.
– Measure twice, cut once.
– Stay with your buddy.
Bad flashbacks found at:
http://www.easthamptonhigh.org/cernak/
http://www.hancock.k12.mi.us/high/art/wood/index.html
8
Key Factors Affecting Efficiency
• High “churn rate”: a large number of coders times tightly coupled code equaled frequent breaks
– Our code had a deep root system
– And we had a forest of changes to make
“Big root ball” found at: http://www.on.ec.gc.ca/canwarn/norwich/norsummary-e.html
9
Make It Smaller
Evolve
10
Key Factors Affecting Efficiency
• “Key logs”: some issues were preventing other issues from even being worked on
11
Key Factors Affecting Efficiency
• A chain of single points of failure took out the entire team:
Login → Create an avatar → Enter a city → Buy a house → Enter a house → Buy the chair → Sit on a chair
12
So, What Did We Do That Worked
• Switched to a logical architecture with less coupling
• Switched to a code structure with fewer dependencies
• Put in scaffolding to keep everyone working
• Developed sophisticated configuration management
• Instituted automated testing
• Metrics, metrics, metrics
13
So, What Did We Do That Didn’t?
• Long-range milestone planning
• Network emulator(s)
• Over-engineered a few things (too general)
• Some tasks failed due to:
– Not replanning and reviewing long tasks
– Not breaking up long tasks
• Coding standard changed partway through
• …
14
What We Were Faced With
• 750K lines of legacy Windows code
• Port it to Linux
• Change from “multiplayer” to client/server
• 18 months
• Developers must remain alive after shipping
• Continuous releases starting at Beta
15
Go To Final Architecture ASAP
16
Go to final architecture ASAP
[Diagram: Multiplayer — each client runs its own copy of the Sim, kept in lockstep (“Here be Sync Hell”) — evolves into Client/Server — a single authoritative Sim on the server, with clients sending requests and receiving commands (“Nice, undemocratic”).]
17
Final Architecture ASAP: “Refactoring”
• Decomposed into multiple DLLs
– Found the Simulator
• Interfaces
• Reference counting
• Client/Server subclassing
How it helped:
– Reduced coupling. Even reduced compile times!
– Developers in different modules broke each other less often.
– We went everywhere and learned the code base.
18
Final Architecture ASAP: It Had to Always Run
• But clients would not behave predictably
• We could not even play test
• Game design was demoralized
• We needed a bridge, now!
19
Final Architecture ASAP: Incremental Sync
• A quick, temporary solution…
– Couldn’t wait for the final system to be finished
– High overhead; couldn’t ship it
• We took partial state snapshots on the server and restored to them on the client
How it helped:
– Could finally see the game as it would be.
– Allowed parallel game design and coding.
– Bought time to lay in the “right” stuff.
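The snapshot-and-restore bridge can be sketched in a few lines. This is an illustrative Python toy under assumed names (the class, fields, and methods are invented for this sketch, not TSO’s actual code):

```python
import copy

class HouseSim:
    """Toy stand-in for the simulator state (hypothetical structure)."""
    def __init__(self):
        self.state = {"tick": 0, "objects": {}}

    def step(self):
        # The authoritative server advances; unsynced clients drift.
        self.state["tick"] += 1

    def snapshot(self):
        # Server side: capture a (partial) state snapshot to send down.
        return copy.deepcopy(self.state)

    def restore(self, snap):
        # Client side: discard drifted local state and adopt the server's
        # snapshot. High overhead, but it forces the two back into sync.
        self.state = copy.deepcopy(snap)

server = HouseSim()
client = HouseSim()
server.step(); server.step()        # server advances; client is now stale
client.restore(server.snapshot())   # periodic resync over the wire
assert client.state == server.state
```

The point of the sketch is the trade-off the slide describes: correctness is restored by brute-force copying rather than by the final, efficient delta protocol, which is exactly why it worked quickly but couldn’t ship.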
20
Final Architecture ASAP: Null View
• Created a Null View HouseSim on Windows
– Same interface
– Null (text output) implementation
How it helped:
– No #ifdefs!
– Done under Windows, so we could test this first step.
– We knew it was working during the port.
– Allowed us to port to Linux only the “needed” parts.
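The “same interface, null implementation” pattern looks like this in miniature. A hedged Python sketch with invented names (`View`, `draw_object`, etc. are illustrative, not the original C++ interfaces):

```python
from abc import ABC, abstractmethod

class View(ABC):
    """The interface the simulator renders through (hypothetical)."""
    @abstractmethod
    def draw_object(self, obj_id, x, y): ...

class ClientView(View):
    def draw_object(self, obj_id, x, y):
        pass  # real rendering would live here, in the client build only

class NullView(View):
    """Text-output implementation: same interface, no graphics, no #ifdefs."""
    def __init__(self):
        self.log = []
    def draw_object(self, obj_id, x, y):
        self.log.append(f"draw {obj_id} at ({x}, {y})")

def run_sim(view: View):
    # Simulator code is view-agnostic: it never knows which build it's in.
    view.draw_object("chair", 3, 4)

null = NullView()
run_sim(null)
assert null.log == ["draw chair at (3, 4)"]
```

Because the simulator depends only on the abstract interface, the server build links the null implementation and the graphics code never needs to compile on Linux at all.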
21
Final Architecture ASAP: More “Bridges”
• HSBs: proxy on Linux, pass-through to a Windows Sim
• Disabled authentication, etc.
How it helped:
– Could exercise Linux components before finishing the HouseSim port.
– Allowed us to debug server scale, performance, and stability issues early.
– Made best use of Windows developers.
– Allowed single-platform development. Faster compiles.
– Could keep working even when some of the system wasn’t available.
22
Mainline *Must* Work!
23
If Mainline Doesn’t Work, Nobody Works
• The Mainline source control branch *must* run
• Never go dark: demo/play test every day
• If you hit a bug, do you sync to mainline, hoping someone else fixed it? Or did you just add it?
– If mainline breaks for “only” an hour, the project loses a man-week.
– If each developer breaks the mainline “only” once a month, it is broken every day.
24
Mainline must work: Sniff Test
• Mainline was breaking for “simple” things
– Features you “didn’t touch” (and didn’t test)
• Created an auto-test to exercise all core functions
• Quick to run. Fun to watch. Checked results.
• Mandated that it pass before submitting code changes
• Break the build: “feed the pig”
How it helped:
– Very simple test. Amazing difference.
– Sometimes we got lazy and trusted it too much. Doh!
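The essence of a sniff test is small: run every core function, check the result, and block submission on any failure. A minimal Python sketch (the function list and names are placeholders; the real test drove the actual game client):

```python
def login(): return True
def create_avatar(): return True
def enter_city(): return True   # stand-ins for the real core functions

CORE_FUNCTIONS = [login, create_avatar, enter_city]

def sniff_test():
    """Exercise all core functions and collect the names of any that fail."""
    return [f.__name__ for f in CORE_FUNCTIONS if not f()]

def may_submit():
    """Pre-submit gate: code changes are only allowed in if the sniff passes."""
    failures = sniff_test()
    if failures:
        print("Sniff test FAILED:", ", ".join(failures), "- feed the pig!")
        return False
    return True

assert may_submit()
```

The value is not sophistication but coverage of the “features you didn’t touch”: every submission re-proves the whole critical path, cheaply.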
25
Mainline must work: Stages to “Sandboxing”
1. Got it to build reliably.
2. Instituted auto-builds: email all on failure.
3. Used a “Pumpkin” to avoid duplicate merge-test cycles, pulling partial submissions, …
4. Used a Pumpkin Queue when we really got rolling.
How it helped:
– Far fewer thumbs twiddled.
– The extra process got on some people’s nerves.
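The “pumpkin” is a mutual-exclusion token: only the current holder may run the merge-test-submit cycle, and the queue serializes everyone else. A toy Python model of that discipline (the class and method names are illustrative, not a real tool):

```python
from collections import deque

class Pumpkin:
    """Integration token: only the holder may merge and submit."""
    def __init__(self):
        self.holder = None
        self.queue = deque()

    def request(self, dev):
        # Grant the token if free, otherwise line up; returns True if held.
        if self.holder is None:
            self.holder = dev
        elif self.holder != dev:
            self.queue.append(dev)
        return self.holder == dev

    def release(self, dev):
        # Hand the token to the next developer in line, if any.
        assert self.holder == dev, "only the holder can release the pumpkin"
        self.holder = self.queue.popleft() if self.queue else None

p = Pumpkin()
assert p.request("alice")      # alice integrates first
assert not p.request("bob")    # bob queues instead of merge-racing alice
p.release("alice")
assert p.holder == "bob"       # bob's turn: no duplicate merge-test cycles
```

Serializing integration trades a little waiting for the elimination of overlapping merges that each invalidate the other’s testing.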
26
Mainline must work: Sandboxing
5. Finally, went to per-developer branching.
– Develop on your own branch.
– Submit changes to an integration engineer.
– Full smoke test run per submission/feature.
– If it works, it is integrated to mainline in priority order; otherwise it is bounced.
How it helped:
– Mainline *always* runs. Pull any time.
– Releases are not delayed by partial features.
– No more code freezes going to release.
27
Support Structure
28
Background: Support Structure
• Team size placed design constraints on supporting tools
– Automation: big win in big teams
– Churn rate: tool accuracy / support cost
• Types of tools
– Data management: collection / correlation
– Testing: controlled, synchronized, repeatable inputs
– Baselines: my bug, your bug, or our bug?
29
Overview: Support Structure
• Automated testing: designs to minimize the impact of churn rate
• Automated data collection / correlation
– Distributed system == distributed data
– Dashboard / Esper / MonkeyWatcher
• Use case: load testing
– Controlled (tunable) inputs, observable results
– “Scale & Break”
30
Problem: Testing Accuracy
• Load & regression: inputs must be
– Accurate
– Repeatable
• Churn rate: logic/data in constant motion
– How to keep the testing client accurate?
• Solution: the game client becomes the test client
– Exact mimicry
– Lower maintenance costs
31
Test Client == Game Client
[Diagram: Test Client and Game Client share the same client-side game logic and Presentation Layer; the test client swaps the game GUI for test-control commands, exchanging commands and state through the same layer.]
32
Game Client: How Much To Keep?
[Diagram: the game client decomposed into View and Logic, with the Presentation Layer between them.]
33
What Level To Test At?
• Mouse clicks, above the View?
– Regression: too brittle (pixel shift)
– Load: too bulky
34
What Level To Test At?
• Internal events, between View and Logic?
– Regression: too brittle (churn rate vs. logic & data)
35
Semantic Abstractions
• Test at the Presentation Layer, in gameplay terms: Buy Lot, Enter Lot, Buy Object, Use Object, …
• Works for both the NullView client and the full ClientView
• Basic gameplay changes less frequently than UI or protocol implementations.
36
Scriptable User Play Sessions
• Test scripts: specific, ordered inputs
– Single-user play session
– Multiple-user play session
• SimScript
– Collection: Presentation Layer “primitives”
– Synchronization: wait_until, remote_command
– State probes: arbitrary game state
• Avatar’s body skill, lamp on/off, …
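A scripted play session built from those ingredients might look like the following. This is a hedged Python sketch of the idea, not SimScript’s actual syntax; the primitives, state keys, and `wait_until` behavior are invented for illustration (the real `wait_until` would poll with a timeout rather than assert immediately):

```python
class PlaySession:
    """Toy harness exposing Presentation Layer primitives (hypothetical API)."""
    def __init__(self):
        self.state = {"lot_owned": False, "in_lot": False, "body_skill": 0}

    # --- Presentation Layer primitives ---
    def buy_lot(self):   self.state["lot_owned"] = True
    def enter_lot(self): self.state["in_lot"] = self.state["lot_owned"]
    def use_object(self, name):
        if name == "exercise_machine":
            self.state["body_skill"] += 1

    # --- Script support ---
    def probe(self, key):
        # State probe: read arbitrary game state for per-step validation.
        return self.state[key]

    def wait_until(self, key, value):
        # Synchronization point, stubbed as an immediate check here.
        assert self.state[key] == value, f"timeout waiting for {key}={value}"

# A single-user play session: ordered inputs, validated at each step.
s = PlaySession()
s.buy_lot()
s.enter_lot()
s.wait_until("in_lot", True)
s.use_object("exercise_machine")
assert s.probe("body_skill") == 1
```

Because the script speaks in gameplay semantics (buy, enter, use) rather than pixels or packets, it survives UI and protocol churn, which is the next slide’s point.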
37
Scriptable User Play Sessions
• Scriptable play sessions: big win
– Load: tunable based on actual play
– Regression: walk a set of avatars through various play sessions, validating correctness per step
• Gameplay semantics: very stable
– UI / protocols shifted constantly
– Game play remained (about) the same
38
Automated Test: Team Baselines
• Hourly “critical path” stability tests
– Sync / clean / build / test
– Validate Mainline / Servers
• Snifftest weather report
– Hourly testing
– Constant reporting
39
How Automated Testing Helped
• Current, accurate baseline for developers
• Scale & Break found many bugs
• Greatly increased stability
– Code base was “safe”
– Server health was known (and better)
40
Tools & Large Teams
• High tool ROI: team_size * automation_savings
• Faster triage
– Quickly narrow down a problem
– Across any system component
• Monitoring tools became a focal point
• Wiki: central doc repository
41
Monitoring / Diagnostics
“When you can measure what you are speaking about and can express it in numbers, you know something about it. But when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.” – Lord Kelvin
• DeMarco: You cannot control what you cannot measure.
• Maxwell: To measure is to know.
• Pasteur: A science is as mature as its measurement tools.
42
Dashboard
• System resource & health tool
– CPU / Memory / Disk / …
• Central point to access
– Status
– Test results
– Errors
– Logs
– Cores
– …
43
Test Central / Monkey Watcher
• Test Central UI
– Control rig for developers & testers
• Monkey Watcher
– Collects & stores (distributed) test results
– Produces summarized reports across tests
– Filters known defects
– Provides a baseline of correctness
– Web frontend, unique IDs per test
44
Esper
• In-game profiler for a distributed system
• Internal probes may be viewed
– Per process / machine / cluster
– Time view or summary view
• Automated data management
– Coders: add a one-line probe
– Esper: data shows up on the web site
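The “one-line probe” contract can be sketched as follows. This is an illustrative Python toy, not Esper’s real API; the probe name, storage, and summary report are all invented to show the shape of the idea:

```python
import time
from collections import defaultdict

# name -> list of (timestamp, value); in the real system this would be
# shipped off-process and rendered on a web site automatically.
_PROBES = defaultdict(list)

def probe(name, value):
    """The one line a coder adds; everything downstream is automated."""
    _PROBES[name].append((time.time(), value))

def summary(name):
    # The "summary view": aggregate statistics over all samples so far.
    samples = [v for _, v in _PROBES[name]]
    return {"count": len(samples),
            "avg": sum(samples) / len(samples),
            "max": max(samples)}

# A coder instruments a hot path with a single line per sample:
for tick_ms in (12, 15, 33):
    probe("housesim.tick_ms", tick_ms)

assert summary("housesim.tick_ms")["count"] == 3
assert summary("housesim.tick_ms")["max"] == 33
```

The design choice worth noting is the asymmetry: the cost to the instrumenting programmer is one line, while collection, correlation across machines, and presentation are paid for once, centrally.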
45
Use Case: Scale & Break
• Never too early to begin scaling
– Idle: keep doubling server processes
– Busy: double the number of users and the dataset size
• Fix what broke, start again
• Tune input scripts using Beta data
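The doubling discipline reduces to a short loop: grow the load geometrically until something fails, record the breaking point, fix it, and repeat. A Python sketch with a stubbed cluster (the `run_load_test` stub and its capacity are invented for illustration):

```python
def run_load_test(n_clients, capacity=1000):
    """Stub: pretend the cluster falls over past `capacity` clients."""
    return n_clients <= capacity

def scale_and_break(start=1, limit=1 << 20):
    """Keep doubling load until something breaks; return the breaking point."""
    n = start
    while n <= limit and run_load_test(n):
        n *= 2          # cluster survived: double the load and go again
    return n            # first load level that broke (fix it, then restart)

broke_at = scale_and_break()
assert broke_at == 1024  # 512 clients passed; 1024 exceeded the stub's capacity
```

Doubling means only about log2(N) test runs are needed to find a breaking point near N, so the team is always working on the nearest real bottleneck rather than guessing at a target scale.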
Load Testing: Data Flow
[Diagram: a Load Control Rig drives banks of Test Clients running on Test Driver CPUs; game traffic flows into the Server Cluster. Client metrics, resource metrics from system monitors, internal probes, and debugging data all flow back to the Load Testing Team.]
47
Outline: Wrapup
• Wins / Losses
• Rules: analysis & discussion
• Recommended reading
• Questions
48
Process: Wins / Losses
• Wins
– Module decomposition
• Logical: client/server architecture
• Physical: code structure
– Scaffolding for parallel development
– Tools to improve workflow
– Automated regression / load testing
49
Process: Wins / Losses
• Losses
– Early lack of tools
– #ifdef as a cross-platform port
– Single points of failure blocked the entire development team
50
Not Done Yet: More Challenges
• How to ship, and ship, and ship…
• How to balance infrastructure cleanup against new feature development
• …
51
Rules of Thumb (1)
• KISS: software and processes
• Incremental changes
– <Inhale> <Hold it> <Exhale>
– <Say>: “Baby steps”
• Continual tool/process improvement
52
Rules of Thumb (2)
• Mainline has got to work
• Get something on the ground. Quickly.
53
Rules of Thumb (3)
• Key logs: break them up quickly, ruthlessly
• Scaffolding: keep others working
• Do important things, not urgent things
• Module separation (logically, physically)
• If you can’t measure it, you don’t understand it
54
Final Rule: “Sharpen The Saw”
• Efficiency is impacted by
– Component coupling / team size
– The compile / load / test / analyze cycle
• Tool justification in large teams
– Large ROI @ large scale
– 5% gain across 30 programmers
– “Fred Brooks”: the 31st programmer…
55
Recommended Reading
• Influences
– Extreme Programming
– Scott Meyers: large-scale software engineering
– Gamma et al.: Design Patterns
• Caveat emptor: slavish following not encouraged
– Consider the “ground conditions” for your project
56
Questions & Answers