
COMS W3156: Software Engineering, Fall 2001

Lecture #3: Intro to software engineering and the project: the big catch-up lecture

Janak J Parekh

janak@cs.columbia.edu

Administrivia (I)

• Website is your friend
– You are expected to be up-to-date on the readings, in particular: the course is about to “speed up”

• Questionnaire
– If you have not yet filled it out, do it today
– We will be determining TA office hours and recitations using this information

Administrivia (II)

• This should be the last set of slides to be posted so late; use the later PDFs for notes

• Bulletin board accounts created! Username is your UNI, password is the last 4 digits of your SSN. If you can’t log in:
– Make sure you’ve done the questionnaire!
– Otherwise, email suhit@cs.columbia.edu

Next class

• Schach, chapter 3; Lifecycles
– Will begin chapter 5 content, Tools, but reading not due next week

• Group proposals to be submitted
– Extremely simple:
  • Group name
  • List of people in group
– Should be between 4 and 6 people, preferably 5
– More info on website later today

Today’s class

• I was going to include another anecdote, but we’re a little behind, so we’re going to play some catch-up today

• Topics to discuss
– Intro to Software Engineering
– Software Engineering Teams (Schach)
– Project introduction
– Process model overview
– XML

Introduction to Software Engineering

• The practice, not the course

• In 1968, it was pointed out that software is delivered late, over budget, and with many residual faults

• Err… not much has changed, has it?
– Played a game recently?
– Updated drivers?

Bridges vs. operating systems

• Schach likes this example

1. Crash: a bridge needs to be completely rebuilt; software is just rebooted

2. Imperfect engineering: we “accept” faults, while we cannot on a bridge… is software really engineered?

3. Complexity: software uses discrete states, so a single changed bit can break everything, unlike a small change in wind on a bridge

4. Maintenance: no bridge is half-replaced, but this happens often with software

Economics of software engineering

• Put simply, it’s not apparent

• If a particular development mechanism is cheaper, that does not necessarily mean it is better: code that’s more difficult to maintain may be the result

• Yet, the cheaper mechanism may be adopted

• A tremendous amount of software development is maintenance and evolution: Schach’s cup of tea

Maintenance aspects

• Software, as previously mentioned, is not a build-once-and-throw-away process – that’s far too expensive, or at least we perceive it to be too expensive

• Ergo, software has a life cycle

• We need to implement a process so that software is maintained correctly, i.e. so the software lifecycle is sane

Software lifecycle model

• Schach identifies 7 basic phases; most people use some derivative of this
– Requirements
– Specification
– Design
– Implementation
  • Integration, carried out during implementation
– Maintenance
– Retirement

Where’s testing?

• In certain Software Engineering courses, testing was considered a separate phase

• Schach says no
– Need to test at each individual phase
– The design needs to be tested as much as the implementation itself
– Verification (at the end of each phase)
– Validation (before delivering finished product)

Where’s documentation?

• Again, no explicit documentation phase: all phases need to be documented

• Extremely important for maintainability

• Postponed documentation is rarely completed

Cost of each

• Which do you think is the most expensive phase?

[Chart: cost of each phase, 1976-1981]

Requirements phase (I)

• What are we doing, and why?

• Need to determine what the client needs, not what the client wants or thinks they need

• Worse, requirements are a moving target

Requirements phase (II)

• Common ways of building requirements include
– Prototyping (Schach likes)
– Requirements document: natural-language descriptions
  • Ambiguity is a problem, as with all natural-language documentation

• Use interviews to get information
– Difficult from busy laypeople
– We don’t have time for this either, so we’re giving you the requirements

Specification phase (I)

• The “contract” – this is frequently the legal document

• What will the product do, not how to do it

• Should not be:
– ambiguous, e.g. “optimal” or “98% complete”
– incomplete, e.g. omitting modules from the requirement
– contradictory

Specification phase (II)

• Detailed, to allow cost and duration estimation

• Schach separates classical from OO specification
– Classical: several mechanisms, including DFD, FSM, Petri Nets, Z
– Object-oriented: OOA (“analysis”), utilizing UML (Unified Modeling Language) diagrams

Design phase

• The “how” of the project; fills in the underlying tenets of the specification

• Design decisions last a long time, even after finished product
– Maintenance documentation
– Try to leave it open-ended

• Architectural design: decompose project into modules

• Detailed design: each module (data structures, algorithms)

Implementation phase

• Implement the detailed design in code

• Bind to language here: C/C++/Java/etc. (Phil listed ~ 12)

• Observe standardized programming mechanisms

• Testing: desk checking (black/white box), SQA, code review, etc.

• Documentation: commented code, test cases

Integration phase (I)

• Combine modules and check the product as a whole

• Top-down vs. bottom-up
– top-down: high-level modules debugged first, major design faults found
– bottom-up: low-level modules first, finds small operational faults and isolates them
– “Sandwich integration” does both
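The two integration orders above can be sketched in code. This is a minimal illustration, not the course's method: it assumes the common practice of using a stub to stand in for an unfinished low-level module (top-down) and a driver that calls a low-level module directly (bottom-up). All class and method names are hypothetical.

```java
// Hypothetical modules for illustration; none of these names come from the course project.
class IntegrationDemo {
    public static void main(String[] args) {
        // Top-down: check MovePlanner's logic against the stub.
        System.out.println(new MovePlanner(new PathFinderStub()).travelSeconds(2, 7, 3)); // prints 3
        // Bottom-up: this main method acts as a driver for the low-level module.
        System.out.println(new LinePathFinder().stepsBetween(2, 7)); // prints 5
        // Integrated: the real low-level module plugged into the high-level one.
        System.out.println(new MovePlanner(new LinePathFinder()).travelSeconds(2, 7, 3)); // prints 15
    }
}

// Contract between the high-level and low-level modules.
interface PathFinder {
    int stepsBetween(int from, int to);
}

// Stub: stands in for an unfinished low-level module during top-down integration.
class PathFinderStub implements PathFinder {
    @Override
    public int stepsBetween(int from, int to) {
        return 1; // canned answer, just enough to exercise the caller
    }
}

// Real low-level module, tested on its own first in bottom-up integration.
class LinePathFinder implements PathFinder {
    @Override
    public int stepsBetween(int from, int to) {
        return Math.abs(to - from);
    }
}

// High-level module that depends on a PathFinder.
class MovePlanner {
    private final PathFinder finder;

    MovePlanner(PathFinder finder) {
        this.finder = finder;
    }

    // Travel time is the number of steps times the seconds per step.
    int travelSeconds(int from, int to, int secondsPerStep) {
        return finder.stepsBetween(from, to) * secondsPerStep;
    }
}
```

Sandwich integration would combine both: stubs below the high-level modules and drivers above the low-level ones, meeting in the middle.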

Integration phase (II)

• Testing: product and acceptance testing; code review

• Documentation: commented source code and test cases

• Done continually with implementation; can’t wait until last minute

Maintenance phase (I)

• Maintenance is defined by Schach as any change once the client has accepted the software

• Most expensive phase, by far

• Poor (or lost) documentation often plagues the situation

• Programmers hate it

Maintenance phase (II)

• Several different types of maintenance:
– Corrective (bugs!)
– Perfective (additions to improve)
– Adaptive (system or other underlying changes)

• Testing maintenance: regression testing

• Documentation: must record all of the changes made, and why, as well as test cases

Retirement phase

• The last phase, of course

• Why?
– Changes too drastic, i.e., redesign
– Too many interdependencies: “house of cards”
– No documentation
– Hardware obsolete

• True retirement is rare: it happens when the product is no longer useful

Faults (I)

• Faults vs. errors
– Fault is the actual problem in the program
– Error is the observed effect

• Goal, obviously, is to minimize faults through software engineering
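A small sketch may help separate the two terms as this slide defines them. The method below is a made-up example, not from Schach: the fault is a single wrong loop bound, and the error is the wrong result that shows up only for some inputs.

```java
// Hypothetical example: the fault is one wrong loop bound; the error is the wrong result a user sees.
class FaultDemo {
    // Intended behavior: return the largest value in a non-empty array.
    static int maxWithFault(int[] values) {
        int max = values[0];
        // FAULT: the loop stops one element early (the bound should be values.length).
        for (int i = 1; i < values.length - 1; i++) {
            if (values[i] > max) {
                max = values[i];
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // No observed error: the largest value is not in the last slot.
        System.out.println(maxWithFault(new int[] {3, 9, 4})); // prints 9
        // Observed error: the largest value is last, so it is never examined.
        System.out.println(maxWithFault(new int[] {3, 4, 9})); // prints 4, but 9 was expected
    }
}
```

The fault is present in both runs, yet only the second input exposes it, which is why test inputs have to be chosen to provoke errors rather than confirm the happy path.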

Faults (II)

• 60-70% of faults are specification and design faults

• They are expensive to correct

• Hint: correct them early

Teams (I)

• Brooks’ Law: “Adding people to a late project makes it later”
– Training time
– Increased communication: pairs grow by n² while people/work grows by n
– How to divide software? This is not task-sharing
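The communication point above can be made concrete with a small calculation. This is a minimal sketch with hypothetical class and method names; it uses the exact pair count n(n-1)/2, which grows on the order of n².

```java
// Hypothetical helper: counts pairwise communication channels in a team.
class TeamChannels {
    // Number of distinct pairs among n people: n * (n - 1) / 2.
    static long pairs(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("team size must be non-negative");
        }
        return (long) n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        // Team sizes are illustrative only.
        for (int n = 4; n <= 12; n += 2) {
            System.out.println(n + " people -> " + pairs(n) + " pairs");
        }
    }
}
```

Doubling a team from 5 to 10 people doubles the hands available but raises the number of pairs from 10 to 45.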

Teams (II)

• Types of teams
– Democratic
– “Chief programmer”
– “Modern” teams
– Synchronize-and-Stabilize teams
– eXtreme Programming teams

Democratic Teams (I)

• Problem: programmers are highly attached to their code
– Naming code after themselves
– “It’s got to be perfect!”
– “A stray bug got into the code {but it isn’t my fault!}”

Democratic Teams (II)

• Basic concept: “egoless” programming

• The “group” owns the code

• Up to 10 egoless programmers

• Fundamental problem with this model: “Be egoless, darn it!”

Chief Programmer model (I)

• The Chief Programmer knows all: he is “god”

• There’s a “backup” programmer, who knows almost as much, in case the Chief Programmer is incapacitated, and to assist him: Vice-Presidential model

• Secretary to do clerical tasks

• Several programmers under chief programmer

Chief Programmer model (II)

• Problems
– You’re going to pay a backup programmer $$ to just sit there?
– Doesn’t scale beyond a few programmers

“Modern” Team

• Technical Leader and Team Manager: two separate people
– Manager: “HR”-esque, administrative only
– Tech lead focuses on technical issues
– Growable hierarchy

“Modern” Team (II)

“Modern” Team (III)

• Still decentralized

• Still, it’s popular

Synchronize-and-Stabilize Teams

• The Microsoft model

• Many sequential builds, many parallel teams

• Synchronized daily: everyone commits code, nightly build

• Problem: need really good people to do this

eXtreme Teams

• Code in pairs, no specialization

• One of the two writes up test cases, the other codes while the first watches

• Prevents turnover problems

• Somewhat egoless – centralized computers

• Problem: “watching” all the time, expensive HR-wise

Team conclusion

• There’s no one solution

• You’ll probably adopt a hybrid of democratic and modern teams

The project

• Role-playing game

• Three main parts
– Client
– Server
– AI

• Each group will work on one of the three

• We’re working on the way to choose; look for the group proposal documents

Client

• Graphical, tile-based side scroller

• Network communication with server

• Running animations

• Send commands from user to server, and receive animations/updates in the reverse direction

• Editor mode

• Clients are untrusted!
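Because clients are untrusted, the server would need to re-check every command it receives rather than assume the client enforced the rules. The sketch below shows one such check. The map size, the one-tile movement rule, and all names are assumptions for illustration, not part of the project requirements.

```java
// Hypothetical server-side check; the command format and movement limit are assumptions.
class MoveValidator {
    private final int width;
    private final int height;

    MoveValidator(int width, int height) {
        this.width = width;
        this.height = height;
    }

    // Accept a move only if the target is on the map and at most one tile away
    // (horizontally or vertically). The server never takes the client's word for it.
    boolean isLegal(int fromX, int fromY, int toX, int toY) {
        boolean onMap = toX >= 0 && toX < width && toY >= 0 && toY < height;
        int distance = Math.abs(toX - fromX) + Math.abs(toY - fromY);
        return onMap && distance <= 1;
    }

    public static void main(String[] args) {
        MoveValidator validator = new MoveValidator(10, 10);
        System.out.println(validator.isLegal(0, 0, 1, 0));  // true: one tile, on the map
        System.out.println(validator.isLegal(0, 0, 5, 5));  // false: a "teleport"
        System.out.println(validator.isLegal(0, 0, -1, 0)); // false: off the map
    }
}
```

The starting position passed to the check should come from the server's own model of the world, not from the client's message.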

Server

• Communicate with client over network

• Model the entire game world
– Combat
– Movement/player location
– etc.

• Manage game clock (e.g., pacing)

• Store to LDAP server (when players quit or move to another server)

AI

• Determine “bots” / “monster” actions

• Determine shortest path between points, without getting completely stuck

• Ability to converse
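For the shortest-path bullet above, one common approach on a tile map is breadth-first search, sketched below. This is an illustration under assumptions, not the required algorithm: the grid encoding ('#' for a wall, '.' for an open tile), the 4-directional movement, and the names are all hypothetical, and the grid is assumed to be rectangular.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Hypothetical sketch: breadth-first search on a tile grid ('#' = wall, '.' = open).
class GridPath {
    // Returns the number of steps on a shortest 4-directional path, or -1 if unreachable.
    static int shortestSteps(String[] grid, int startRow, int startCol, int goalRow, int goalCol) {
        int rows = grid.length;
        int cols = grid[0].length();
        int[][] dist = new int[rows][cols];
        for (int[] row : dist) {
            Arrays.fill(row, -1); // -1 marks "not visited yet"
        }
        Queue<int[]> queue = new ArrayDeque<>();
        dist[startRow][startCol] = 0;
        queue.add(new int[] {startRow, startCol});
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!queue.isEmpty()) {
            int[] cell = queue.remove();
            if (cell[0] == goalRow && cell[1] == goalCol) {
                return dist[goalRow][goalCol];
            }
            for (int[] move : moves) {
                int r = cell[0] + move[0];
                int c = cell[1] + move[1];
                boolean inside = r >= 0 && r < rows && c >= 0 && c < cols;
                if (inside && grid[r].charAt(c) != '#' && dist[r][c] == -1) {
                    dist[r][c] = dist[cell[0]][cell[1]] + 1;
                    queue.add(new int[] {r, c});
                }
            }
        }
        return -1; // every reachable tile was visited without finding the goal
    }

    public static void main(String[] args) {
        String[] map = {
            "..#.",
            ".##.",
            "...."
        };
        System.out.println(shortestSteps(map, 0, 0, 0, 3)); // prints 7
    }
}
```

Because each tile is visited at most once, the search always terminates, and the -1 result gives a bot a clear signal to give up instead of getting stuck.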

LDAP

• Lightweight Directory Access Protocol

• Both clients and servers will talk to this

• “Legacy” platform (in a sense, it is)

• Much, much more detail later (hint: JNDI)
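As a preview of the JNDI hint, the sketch below shows roughly how Java code could reach an LDAP directory. It is a sketch under assumptions: the server URL and the entry name are placeholders, the class and method names are hypothetical, and the course's actual setup may differ. Only `environment` can run without a directory server; `fetch` needs a reachable one.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical sketch of JNDI access to an LDAP directory.
class LdapSketch {
    // Builds the JNDI environment that selects the JDK's LDAP provider.
    static Hashtable<String, String> environment(String url) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, url);
        return env;
    }

    // Reads all attributes of one entry; requires a reachable LDAP server.
    static Attributes fetch(String url, String entryName) throws NamingException {
        DirContext ctx = new InitialDirContext(environment(url));
        try {
            return ctx.getAttributes(entryName);
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; nothing is contacted here.
        System.out.println(environment("ldap://localhost:389").get(Context.PROVIDER_URL));
    }
}
```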

XML

• We will use this for server-client communication

• eXtensible Markup Language, i.e., a generalized HTML (define your own tags)

• Grammar (schema), validation

• How to parse XML: DOM and SAX
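The two parsing styles named above can be compared on one small message. This is a minimal sketch using the standard Java XML parsing API; the message format (a `move` element with a `player` attribute) is invented for illustration and is not the project's protocol.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class XmlDemo {
    // Hypothetical message; the tag and attribute names are made up for illustration.
    static final String MESSAGE = "<move player=\"alice\"><to x=\"3\" y=\"4\"/></move>";

    // DOM: build the whole tree in memory, then navigate it.
    static String playerViaDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getAttribute("player");
    }

    // SAX: receive a callback for each element as the parser streams through the input.
    static int countElementsViaSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(playerViaDom(MESSAGE));        // prints alice
        System.out.println(countElementsViaSax(MESSAGE)); // prints 2
    }
}
```

DOM is convenient when the whole message is needed at once; SAX keeps memory use low because it never holds the full tree. Neither call here validates against a schema; that would be a separate configuration step.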
