Data Structures and Programming


  • 8/2/2019 Data Structures and Programming


    Data Structures and Programming

    Lecture 1

    Why Data Structures?

    In my opinion, there are only three important ideas which must be mastered to write interesting programs.

    Iteration - Do, While, Repeat, If
    Data Representation - variables and pointers
    Subprograms and Recursion - modular design and abstraction

    At this point, I expect that you have mastered about 1.5 of these 3.

    It is the purpose of Computer Science II to finish the job.

    Data types vs. Data Structures

    A data type is a well-defined collection of data with a well-defined set of operations on it.

    A data structure is an actual implementation of a particular abstract data type.

    Example: The abstract data type Set has the operations EmptySet(S), Insert(x,S), Delete(x,S), Intersection(S1,S2), Union(S1,S2), MemberQ(x,S), EqualQ(S1,S2), SubsetQ(S1,S2).

    This semester, we will learn to implement such abstract data types by building data structures from arrays, linked lists, etc.
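
    As a rough illustration of the difference between the abstract data type and a data structure that implements it, here is a minimal sketch in Python (not the Modula-3 used in this course). The class name ListSet and its method names are assumptions chosen to mirror the Set operations listed above; a plain list stands in for an array.

```python
class ListSet:
    """One possible data structure for the Set ADT: an unordered list without duplicates."""

    def __init__(self):
        self.items = []                  # EmptySet

    def insert(self, x):                 # Insert(x,S)
        if x not in self.items:
            self.items.append(x)

    def delete(self, x):                 # Delete(x,S)
        if x in self.items:
            self.items.remove(x)

    def member(self, x):                 # MemberQ(x,S)
        return x in self.items

    def union(self, other):              # Union(S1,S2)
        result = ListSet()
        for x in self.items + other.items:
            result.insert(x)
        return result

    def intersection(self, other):       # Intersection(S1,S2)
        result = ListSet()
        for x in self.items:
            if other.member(x):
                result.insert(x)
        return result

    def subset(self, other):             # SubsetQ(S1,S2)
        return all(other.member(x) for x in self.items)

    def equal(self, other):              # EqualQ(S1,S2)
        return self.subset(other) and other.subset(self)
```

    A program that only calls these operations would not need to change if the list were later replaced by a faster structure.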

    Modula-3 Programming

    Control Structures: IF-THEN-ELSE, CASE-OF

    Iteration Constructs: REPEAT-UNTIL (at least once), WHILE-DO (at least 0), FOR-DO (exactly n times).

    Elementary Data Types: INTEGER, REAL, BOOLEAN, CHAR

    Enumerated Types: COINSIDE = {HEADS, TAIL, SIDE}

    Operations: +, -, , #


    Elementary Data Structures

    Arrays - These let you access lots of data fast. (good)

    You can have arrays of any other data type. (good)

    However, you cannot make arrays bigger if your program decides it needs more space. (bad)

    Records - These let you organize non-homogeneous data into logical packages to keep everything together. (good)

    These packages do not include operations, just data fields. (bad, which is why we need objects)

    Records do not help you process distinct items in loops. (bad, which is why arrays of records are used)

    Sets - These let you represent subsets of a set with such operations as intersection, union, and equivalence. (good)

    Built-in sets are limited to a certain small size. (bad, but we can build our own set data type out of arrays to solve this problem if necessary)

    Subroutines

    Subprograms allow us to break programs into units of reasonable size and complexity, allowing us to organize and manage even very long programs.

    This semester, you will first encounter programs big enough that modularization will be necessary for survival.

    Functions are subroutines which return values, instead of communicating by parameters.

    Abstract data types have each operation defined by a subroutine.

    Subroutines which call themselves are recursive. Recursion provides a very powerful way to solve problems which takes some getting used to.

    Such standard data structures as linked lists and trees are inherently recursive data structures.


    Parameter Passing

    There are two mechanisms for passing data to a subprogram, depending upon whether the subprogram has the power to alter the data it is given.

    In pass by value, a copy of the data is passed to the subroutine, so that no matter what happens to the copy the original is unaffected.

    In pass by reference, the variable argument is renamed, not copied. Thus any changes within the subroutine affect the original data. These are the VAR parameters.

    Example: suppose the subroutine is declared Push(VAR s:stack, e:integer) and called with Push(t,x). Any changes within Push to e have no effect on x, but changes to s affect t.
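
    The Push(t,x) example can be imitated in Python, with the caveat that Python has no VAR parameters: every argument is an object reference passed by value. In this sketch, which is only an analogy, mutating the list s plays the role of the VAR parameter, while rebinding the integer e stays local. The function name push and the variables are hypothetical.

```python
def push(s, e):
    """s stands in for the VAR parameter; e behaves like a value parameter."""
    e = e + 1        # rebinding e changes only the local name
    s.append(e)      # mutating s is visible to the caller

t = []
x = 5
push(t, x)
# x is still 5, but t now holds the pushed element.
```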

    Generic Modula-3 Program I

    MODULE Prim EXPORTS Main;        (* Prime number testing with Repeat *)

    IMPORT SIO;

    VAR candidate, i: INTEGER;

    BEGIN
      SIO.PutText("Prime number test\n");
      REPEAT
        SIO.PutText("Please enter a positive number; enter 0 to quit. ");
        candidate:= SIO.GetInt();
        IF candidate > 2 THEN
          i:= 1;
          REPEAT
            i:= i + 1
          UNTIL ((candidate MOD i) = 0) OR (i * i > candidate);
          IF (candidate MOD i) = 0 THEN
            SIO.PutText("Not a prime number\n")
          ELSE
            SIO.PutText("Prime number\n")
          END; (*IF (candidate MOD i) = 0 ...*)
        ELSIF candidate > 0 THEN
          SIO.PutText("Prime number\n")   (*treats 1 and 2 as prime, although 1 is not prime by the usual definition*)
        END; (*IF candidate > 2*)
      UNTIL candidate


    (* The Euclidean algorithm (with controlled input):
       Compute the greatest common divisor (GCD) *)

    IMPORT SIO;

    VAR
      a, b: INTEGER;    (* input values *)
      x, y: CARDINAL;   (* working variables *)

    BEGIN (*statement part*)
      SIO.PutText("Euclidean algorithm\nEnter 2 positive numbers: ");
      a:= SIO.GetInt();
      WHILE a


    Program defensively - Add the debugging statements and routines at the beginning, because you know you are going to need them later.

    A good program is a pretty program. - Remember that you will spend more time reading your programs than we will.

    Perfect Shuffles

     1  1  1  1  1  1  1  1  1
     2 27 14 33 17  9  5  3  2
     3  2 27 14 33 17  9  5  3
     4 28 40 46 49 25 13  7  4
     5  3  2 27 14 33 17  9  5
     6 29 15  8 30 41 21 11  6
     7  4 28 40 46 49 25 13  7
     8 30 41 21 11  6 29 15  8
     9  5  3  2 27 14 33 17  9
    10 31 16 34 43 22 37 19 10
    11  6 29 15  8 30 41 21 11
    12 32 42 47 24 38 45 23 12
    13  7  4 28 40 46 49 25 13
    14 33 17  9  5  3  2 27 14
    15  8 30 41 21 11  6 29 15
    16 34 43 22 37 19 10 31 16
    17  9  5  3  2 27 14 33 17
    18 35 18 35 18 35 18 35 18
    19 10 31 16 34 43 22 37 19
    20 36 44 48 50 51 26 39 20
    21 11  6 29 15  8 30 41 21
    22 37 19 10 31 16 34 43 22
    23 12 32 42 47 24 38 45 23
    24 38 45 23 12 32 42 47 24
    25 13  7  4 28 40 46 49 25
    26 39 20 36 44 48 50 51 26
    27 14 33 17  9  5  3  2 27
    28 40 46 49 25 13  7  4 28
    29 15  8 30 41 21 11  6 29
    30 41 21 11  6 29 15  8 30
    31 16 34 43 22 37 19 10 31
    32 42 47 24 38 45 23 12 32
    33 17  9  5  3  2 27 14 33
    34 43 22 37 19 10 31 16 34
    35 18 35 18 35 18 35 18 35
    36 44 48 50 51 26 39 20 36
    37 19 10 31 16 34 43 22 37
    38 45 23 12 32 42 47 24 38
    39 20 36 44 48 50 51 26 39
    40 46 49 25 13  7  4 28 40
    41 21 11  6 29 15  8 30 41
    42 47 24 38 45 23 12 32 42
    43 22 37 19 10 31 16 34 43
    44 48 50 51 26 39 20 36 44
    45 23 12 32 42 47 24 38 45
    46 49 25 13  7  4 28 40 46
    47 24 38 45 23 12 32 42 47
    48 50 51 26 39 20 36 44 48
    49 25 13  7  4 28 40 46 49
    50 51 26 39 20 36 44 48 50
    51 26 39 20 36 44 48 50 51
    52 52 52 52 52 52 52 52 52
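
    The table appears to list, for each position in a 52-card deck, the card found there after 0 to 8 perfect shuffles. The sketch below, in Python rather than Modula-3, reproduces it under the assumption that a perfect shuffle cuts the deck exactly in half and interleaves the halves with the top card staying on top. The names out_shuffle, history, and row2 are hypothetical.

```python
def out_shuffle(deck):
    """Cut the deck in half and interleave the halves, keeping the top card on top."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(top, bottom):
        shuffled.extend([a, b])
    return shuffled

deck = list(range(1, 53))            # cards numbered 1..52
history = [deck]                     # history[k] is the deck after k shuffles
for _ in range(8):
    history.append(out_shuffle(history[-1]))

# Row for position 2 of the table: the card found there after 0..8 shuffles.
row2 = [d[1] for d in history]
```

    Under this assumption the deck returns to its original order after eight shuffles, which matches the first and last columns of the table being identical.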

    Software Engineering and Top-Down Design

    Lecture 2

    Software Engineering and Saddam Hussein

    Think about the Patriot missiles which tried to shoot down SCUD missiles in the Persian Gulf war and think about how difficult it is to produce working software!

    1. How do you test a missile defense system?
    2. How do you satisfy such tight constraints as program speed, computer size/weight, flexibility to recognize different types of missiles?
    3. How do you get hundreds of people to work on the same program without getting chaos?

    Even today, there is great controversy about how well the missiles actually did in the war.

    Testing and Verification

    How do you know that your program works? Not by testing it!

    ``Testing reveals the presence, but not the absence of bugs.'' - Dijkstra

    Still, it is important to design test cases which exercise the boundary conditions of the program.

    Example: Linked list insertion. The boundary cases include:

    insertion before the first element.
    insertion after the last element.
    insertion into the empty list.
    insertion between elements (the general case).
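
    The four boundary cases can be turned into concrete tests. The following is a sketch in Python (not Modula-3) which assumes, for illustration only, that insertion keeps the list in ascending order; Node, insert_sorted, to_list, and build are hypothetical names.

```python
class Node:
    def __init__(self, info, next=None):
        self.info = info
        self.next = next

def insert_sorted(head, key):
    """Insert key into an ascending linked list and return the (possibly new) head."""
    if head is None or key < head.info:      # empty list, or before the first element
        return Node(key, head)
    p = head
    while p.next is not None and p.next.info < key:
        p = p.next
    p.next = Node(key, p.next)               # between two elements, or after the last
    return head

def to_list(head):
    """Collect the list contents so a test can compare them."""
    out = []
    while head is not None:
        out.append(head.info)
        head = head.next
    return out

def build(values):
    head = None
    for v in values:
        head = insert_sorted(head, v)
    return head

# One check per boundary case.
assert to_list(insert_sorted(None, 5)) == [5]                   # empty list
assert to_list(insert_sorted(build([2, 4]), 1)) == [1, 2, 4]    # before the first
assert to_list(insert_sorted(build([2, 4]), 9)) == [2, 4, 9]    # after the last
assert to_list(insert_sorted(build([2, 4]), 3)) == [2, 3, 4]    # general case
```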

    Test Case Generation

    In the Microsoft Excel group, there is one tester for each programmer! Types of test cases include:

    Boundary cases - Make sure that each line of code and branch of IF is executed at least once.

    Random data - Automatically generated test data can be useful to test user patterns you might otherwise not consider, but you must verify that the results are correct!

    Other users - People who were not involved in writing the program will have vastly different ideas of how to use it.

    Adversaries - People who want to attack your program and have access to the source code often find bugs by reading it.

    Verification

    But how can we know that our program works? The ideal way is to mathematically prove it.

    For each subprogram, there is a precise set of preconditions which we assume is satisfied by the input parameters, and a precise set of post-conditions which are satisfied by the output parameters.

    If we can show that any input satisfying the preconditions is always transformed to output satisfying the post-conditions, we have proven the subprogram correct.
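
    Preconditions and post-conditions can at least be written down and checked at run time, even when no proof is attempted. The Python sketch below (not Modula-3) uses a hypothetical function integer_sqrt; the assertions only check the conditions for the inputs actually tried and do not prove correctness.

```python
def integer_sqrt(n):
    """Return the largest integer r with r*r <= n."""
    # Precondition: what we assume about the input parameter.
    assert isinstance(n, int) and n >= 0, "precondition: n is a non-negative integer"
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    # Post-condition: what must hold for the output.
    assert r * r <= n < (r + 1) * (r + 1), "post-condition violated"
    return r
```

    A proof would argue that the loop keeps r * r <= n true on every iteration and stops exactly when (r + 1) * (r + 1) > n.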

    Top-Down Refinement

    To correctly build a complicated system requires first setting broad goals and refining them over time. Advantages include:

    A hierarchy hides information - This permits you to focus attention on only a manageable amount of detail.

    With the interfaces defined by the hierarchy, changes can be made without affecting the rest of the structure - Thus systems can be maintained without grinding to a halt.

    Progress can be made in parallel by having different people work on different subsections - Thus you can organize to build large systems.

    Stepwise Refinement in Programming

    The best way to build complicated programs is to construct the hierarchy one level at a time, finally writing the actual functions when the task is small enough to be easily done.

    Build a prototype to throw away, because you will, anyway.
    Put off anything difficult for last. If necessary, decompose it into another level of detail.

    Most of software engineering is just common sense, but it is very easy to ignore common sense.

    Building a Military Threat

    Module Build-Military: The first decision is how to organize it, not what type of tank to buy.

    Several different organizations are possible, and in planning we should investigate each one:

    Offense, Defense
    Army, Navy, Air Force, Marines, Coast Guard ...

    Procedure Army: Tanks, Troops, Guns ...

    Procedure Troops: Training, Recruitment, Supplies

    Top-Down Design Example

    ``Teaching Software Engineering is like telling children to brush their teeth.'' - anonymous professor.

    To make this more concrete, let's outline how a non-trivial program should be structured.

    Suppose that you wanted to write a program to enable a person to play the game Battleship against a computer.

    Tell me what to do!

    What is Battleship?

    Each side places 5 ships on a grid, and then the sides take turns guessing grid points until one side has covered all of the other side's ships.

    For each query, the answer ``hit'', ``miss'', or ``you sunk my battleship'' must be given.

    There are two distinct views of the world, one reflecting the truth about the board, the other reflecting what your opponent knows.

    Program: Battleship

    Interesting subproblems are: display board, generate query, respond to query,generate initial configuration, move-loop (main routine).

    What data structure should we use? Two-dimensional arrays.

    How do I enforce separation between my view and your view?
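
    One way to keep the two views apart is to store them in separate two-dimensional arrays and let queries touch only the opponent's view. The following Python sketch (not Modula-3) is a minimal illustration; the board size, the cell symbols, and the names Side, place, and respond are all assumptions, and sinking a whole ship is not modelled.

```python
SIZE = 10                              # assumed board size
WATER, SHIP = ".", "S"                 # cells of the true board
UNKNOWN, HIT, MISS = "?", "X", "o"     # cells of the opponent's view

def new_board(fill):
    return [[fill] * SIZE for _ in range(SIZE)]

class Side:
    def __init__(self):
        self.truth = new_board(WATER)      # where this side's ships really are
        self.seen = new_board(UNKNOWN)     # what the opponent has learned so far

    def place(self, row, col):
        self.truth[row][col] = SHIP

    def respond(self, row, col):
        """Answer a query and record the answer only in the opponent's view."""
        result = HIT if self.truth[row][col] == SHIP else MISS
        self.seen[row][col] = result
        return result
```

    A display routine for the opponent would then be given seen and never truth.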

    Data Structures and Programming

    Lecture 3

    Steven S. Skiena

    Stacks and Queues

    The first data structures we will study this semester will be lists which have the property that the order in which the items are used is determined by the order they arrive.

    Stacks are data structures which maintain the order of last-in, first-out.
    Queues are data structures which maintain the order of first-in, first-out.

    Queues might seem fairer, which is why lines at stores are organized as queues instead of stacks, but both have important applications in programs as data structures.

    Operations on Stacks

    The terminology associated with stacks comes from the spring-loaded plate containers common in dining halls.

    When a new plate is washed it is pushed on the stack.

    When someone is hungry, a clean plate is popped off the stack.

    A stack is an appropriate data structure for this task since the plates don't care about when they are used!

    Maintaining Procedure Calls

    Stacks are used to maintain the return points when Modula-3 procedures call other procedures which call other procedures ...

    Jacob and Esau

    In the biblical story, Jacob and Esau were twin brothers where Esau was born first and thus inherited Isaac's birthright. However, Jacob got Esau to give it away for a bowl of soup, and so Jacob went on to become a patriarch of Israel.

    But why was Jacob justified in so tricking his brother???

    Rashi, a famous 11th century Jewish commentator, explained the problem by saying Jacob was conceived first, then Esau second, and Jacob could not get around the narrow tube to assume his rightful place first in line!

    Therefore Rebecca was modeled by a stack.

    Push ``Isaac'', Push ``Jacob'', Push ``Esau'', Pop ``Esau'', Pop ``Jacob''

    Abstract Operations on a Stack

    Push(x,s) and Pop(x,s) - Stack s, item x. Note that there is no search operation.

    Initialize(s), Full(s), Empty(s) - The latter two are Boolean queries.

    Defining these abstract operations lets us build a stack module to use and reuse without knowing the details of the implementation.

    The easiest implementation uses an array with an index variable to represent the top of the stack.

    An alternative implementation, using linked lists, is sometimes better, for it can't ever overflow. Note that we can change the implementations without the rest of the program knowing!

    Declarations for a stack

    INTERFACE Stack;                  (*14.07.94 RM, LB*)
    (* Stack of integer elements *)

    TYPE ET = INTEGER;                (*element type*)

    PROCEDURE Push(elem : ET);        (*adds element to top of stack*)
    PROCEDURE Pop(): ET;              (*removes and returns top element*)
    PROCEDURE Empty(): BOOLEAN;       (*returns true if stack is empty*)
    PROCEDURE Full(): BOOLEAN;        (*returns true if stack is full*)

    END Stack.

    Stack Implementation

    MODULE Stack;                     (*14.07.94 RM, LB*)
    (* Implementation of an integer stack *)

    CONST
      Max = 8;                        (*maximum number of elements on stack*)

    TYPE
      S = RECORD
        info: ARRAY [1 .. Max] OF ET;
        top: CARDINAL := 0;           (*initialize stack to empty*)
      END; (*S*)

    VAR stack: S;                     (*instance of stack*)

    PROCEDURE Push(elem:ET) =
    (*adds element to top of stack*)
    BEGIN
      INC(stack.top); stack.info[stack.top]:= elem
    END Push;

    PROCEDURE Pop(): ET =
    (*removes and returns top element*)
    BEGIN
      DEC(stack.top); RETURN stack.info[stack.top + 1]
    END Pop;

    PROCEDURE Empty(): BOOLEAN =
    (*returns true if stack is empty*)
    BEGIN
      RETURN stack.top = 0
    END Empty;

    PROCEDURE Full(): BOOLEAN =
    (*returns true if stack is full*)
    BEGIN
      RETURN stack.top = Max
    END Full;

    BEGIN
    END Stack.

    Using the Stack Type

    MODULE StackUser EXPORTS Main;    (*14.02.95. LB*)
    (* Example client of the integer stack *)

    FROM Stack IMPORT Push, Pop, Empty, Full;
    FROM SIO IMPORT Error, GetInt, PutInt, PutText, Nl;   (*suppress warning*)

    BEGIN
      PutText("Stack User. Please enter numbers:\n");
      WHILE NOT Full() DO
        Push(GetInt())                (*add entered number to stack*)
      END;
      WHILE NOT Empty() DO
        PutInt(Pop())                 (*remove number from stack and return it*)
      END;
      Nl();
    END StackUser.

    FIFO Queues

    Queues are more difficult to implement than stacks, because action happens at both ends.

    The easiest implementation uses an array, adds elements at one end, and moves all elements when something is taken off the queue.

    It is very wasteful moving all the elements on each DEQUEUE. Can we do better?

    More Efficient Queues

    Suppose that we maintain pointers to the first (head) and last (tail) elements in the array/queue?

    Note that there is no reason to explicitly clear previously used cells.

    Now both ENQUEUE and DEQUEUE are fast, but they are wasteful of space. We need an array at least as big as the total number of ENQUEUEs, instead of the maximum number of items stored at a particular time.
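
    The head/tail idea, and its waste of space, can be seen in a short Python sketch (not Modula-3). The class name LinearQueue and its fields are hypothetical; the point is only that dequeued cells are abandoned rather than reused.

```python
class LinearQueue:
    """Array queue with head and tail indices and no wraparound."""

    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.head = 0    # index of the next item to dequeue
        self.tail = 0    # index of the next free cell

    def enqueue(self, x):
        if self.tail == len(self.cells):
            raise OverflowError("array used up, even if earlier cells are free")
        self.cells[self.tail] = x
        self.tail += 1

    def dequeue(self):
        if self.head == self.tail:
            raise IndexError("queue is empty")
        x = self.cells[self.head]
        self.head += 1   # the old cell is simply abandoned, not cleared
        return x
```

    With a capacity of 2, a third enqueue fails even after a dequeue has freed a cell, because the capacity bounds the total number of enqueues rather than the number of items stored at once.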

    Circular Queues

    Circular queues let us reuse empty space!

    Note that the pointer to the front of the list is now behind the back pointer!

    When the queue is full, the two pointers point to neighboring elements.

    There are lots of possible ways to adjust the pointers for circular queues. All are tricky!

    How do you distinguish full from empty queues, since their pointer positions might be identical? The easiest way to distinguish full from empty is with a counter of how many elements are in the queue.

    FIFO Queue Interface

    INTERFACE Fifo;                   (*14.07.94 RM, LB*)
    (* A queue of text elements *)

    TYPE ET = TEXT;                   (*element type*)

    PROCEDURE Enqueue(elem:ET);       (*adds element to end*)
    PROCEDURE Dequeue(): ET;          (*removes and returns first element*)
    PROCEDURE Empty(): BOOLEAN;       (*returns true if queue is empty*)
    PROCEDURE Full(): BOOLEAN;        (*returns true if queue is full*)

    END Fifo.

    FIFO Queue Implementation

    MODULE Fifo;                      (*14.07.94 RM, LB*)
    (* Implementation of a fifo queue of text elements *)

    CONST
      Max = 8;                        (*Maximum number of elements in FIFO queue*)

    TYPE
      Fifo = RECORD
        info: ARRAY [0 .. Max - 1] OF ET;
        in, out, n: CARDINAL := 0;
      END; (*Fifo*)

    VAR w: Fifo;                      (*contains a FIFO queue*)

    PROCEDURE Enqueue(elem:ET) =
    (*adds element to end*)
    BEGIN
      w.info[w.in]:= elem;            (*stores new element*)
      w.in:= (w.in + 1) MOD Max;      (*increments in-pointer in ring*)
      INC(w.n);                       (*increments number of stored elements*)
    END Enqueue;

    PROCEDURE Dequeue(): ET =
    (*removes and returns first element*)
    VAR e: ET;
    BEGIN
      e:= w.info[w.out];              (*removes oldest element*)
      w.out:= (w.out + 1) MOD Max;    (*increments out-pointer in ring*)
      DEC(w.n);                       (*decrements number of stored elements*)
      RETURN e;                       (*returns the read element*)
    END Dequeue;

    Utility Routines

    PROCEDURE Empty(): BOOLEAN =
    (*returns true if queue is empty*)
    BEGIN
      RETURN w.n = 0;
    END Empty;

    PROCEDURE Full(): BOOLEAN =
    (*returns true if queue is full*)
    BEGIN
      RETURN w.n = Max
    END Full;

    BEGIN
    END Fifo.

    User Module

    MODULE FifoUser EXPORTS Main;     (*14.07.94. LB*)
    (* Example client of the text queue. *)

    FROM Fifo IMPORT Enqueue, Dequeue, Empty, Full;   (* operations of the queue *)
    FROM SIO IMPORT Error, GetText, PutText, Nl;      (*suppress warning*)

    BEGIN
      PutText("FIFO User. Please enter texts:\n");
      WHILE NOT Full() DO
        Enqueue(GetText())
      END;
      WHILE NOT Empty() DO
        PutText(Dequeue() & " ")
      END;
      Nl();
    END FifoUser.

    Other Queues

    Double-ended queues - These are data structures which support both push and pop and enqueue/dequeue operations.

    Priority Queues (heaps) - These support insertions and ``remove minimum'' operations, which are useful in simulations to maintain a queue of time events.

    We will discuss simulations in a future class.

    Pointers and Dynamic Memory Allocation

    Lecture 4

    Pointers and Dynamic Memory Allocation

    Although arrays are good things, we cannot adjust the size of them in the middle of the program.

    If our array is too small - our program will fail for large data.

    If our array is too big - we waste a lot of space, again restricting what we can do.

    The right solution is to build the data structure from small pieces, and add a new piece whenever we need to make it larger.

    Pointers are the connections which hold these pieces together!

    Pointers in Real Life

    In many ways, telephone numbers serve as pointers in today's society.

    To contact someone, you do not have to carry them with you at all times. All you need is their number.

    Many different people can all have your number simultaneously. All you need do is copy the pointer.

    More complicated structures can be built by combining pointers. For example, phone trees or directory information.

    Addresses are a more physically correct analogy for pointers, since pointers really are memory addresses.

    Linked Data Structures

    All the dynamic data structures we will build have certain shared properties.

    We need a pointer to the entire object so we can find it. Note that this is a pointer, not a cell.

    Each cell contains one or more data fields, which is what we want to store.

    Each cell contains a pointer field to at least one ``next'' cell. Thus much of the space used in linked data structures is not data!

    We must be able to detect the end of the data structure. This is why we need the NIL pointer.

    Pointers in Modula-3

    A node in a linked list can be declared:

    TYPE
      pointer = REF node;
      node = RECORD
        info : item;
        next : pointer;
      END;

    VAR
      p, q, r : pointer;   (* pointers *)
      x, y, z : node;      (* records *)

    Note the circular definition. Modula-3 lets you get away with this because pointer is a reference type. Pointers are the same size regardless of what they point to!

    We want dynamic data structures, where we make nodes as we need them. Thus declaring nodes as variables is not the way to go!

    Dynamic Allocation

    To get dynamic allocation, use NEW:

    p := NEW(ptype);

    NEW(ptype) allocates enough space to store exactly one object of the type which ptype points to. Further, it returns a pointer to this empty cell.

    Before a NEW or otherwise explicit initialization, a pointer variable has an arbitrary value which points to trouble!

    Warning - initialize all pointers before use. Since you cannot initialize them to explicit constants, your only choices are:

    NIL - meaning explicitly nothing.
    NEW(ptype) - a fresh chunk of memory.
    Assignment to some previously initialized pointer of the same type.

    Pointer Examples

    Example: p := NEW(node); q := NEW(node);

    p.x grants access to the field x of the record pointed to by p.

    p^.info := "music";
    q^.next := NIL;

    The pointer value itself may be copied, which does not change any of the otherfields.

    Note this difference between assigning pointers and what they point to.

    p := q;

    We get a real mess. We have completely lost access to music and can't get it back! Pointers are unidirectional.

    Alternatively, we could copy the object being pointed to instead of the pointer itself.

    p^ := q^;

    What happens in each case if we now did:

    p^.info := "data structures";
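
    The two cases can be traced with a small Python sketch (not Modula-3). Python names behave like pointers to objects, so p = q models p := q, while copying the fields one by one models p^ := q^. The class Node and the value "jazz" given to q are assumptions made only so the two nodes can be told apart.

```python
class Node:
    def __init__(self, info=None, next=None):
        self.info = info
        self.next = next

# Case 1: p := q copies the pointer, so both names refer to one node.
p, q = Node("music"), Node("jazz")
p = q
p.info = "data structures"
case1 = (p.info, q.info)

# Case 2: p^ := q^ copies the fields, so the two nodes stay separate.
p, q = Node("music"), Node("jazz")
p.info, p.next = q.info, q.next
p.info = "data structures"
case2 = (p.info, q.info)
```

    In the first case the change is visible through q as well; in the second case q's node is untouched.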


    Where Does the Space Come From?

    Can we really get as much memory as we want without limit just by using NEW?

    No, because there are the physical limits imposed by the size of the memory of the computer we are using. Usually Modula-3 systems let the dynamic memory come from the ``other side'' of the ``activation record stack'' used to maintain procedure calls:

    Just as the stack reuses memory when a procedure exits, dynamic storage must be recycled when we don't need it anymore.

    Garbage Collection

    The Modula-3 system is constantly keeping watch on the dynamic memory which it has allocated, making sure that something is still pointing to it. If not, there is no way for you to get access to it, so the space might as well be recycled.

    The garbage collector automatically frees up the memory which has nothing pointing to it.

    It frees you from having to worry about explicitly freeing memory, at the cost of leaving behind certain structures which a simple collector (for example, one based on reference counting) cannot recognize as garbage, such as a circular list.

    Explicit Deallocation

    Although certain languages like Modula-3 and Java support garbage collection, others like C++ require you to explicitly deallocate memory when you don't need it.

    Dispose(p) is the opposite of New - it takes the object which is pointed to by p and makes it available for reuse.

    Note that each dispose takes care of only one cell in a list. To dispose of an entire linked structure we must do it one cell at a time.

    Note we can get into trouble with dispose:

    Of course, it is too late to dispose of music, so it will endure forever without garbage collection.

    Suppose we dispose(p), and later allocate more dynamic memory with new. The cell we disposed of might be reused. Now what does q point to?

    Answer - the same location, but it means something else! So-called dangling references are a horrible error, and are the main reason why Modula-3 supports garbage collection.

    A dangling reference is like a friend left with your old phone number after you move. Reach out and touch someone - eliminate dangling references!

    Security in Java

    It is possible to explicitly dispose of memory in Modula-3 when it is really necessary, but it is strongly discouraged.

    Java does not allow one to do such operations on pointers at all. The reason is security.

    Pointers allow you access to raw memory locations. In the hands of skilled but evil people, unchecked access to pointers permits you to modify the operating system's or other people's memory contents.

    Java is a language whose programs are supposed to be transferred across the Internet to run on your computer. Would you allow a stranger's program to run on your machine if they could ruin your files?

    Linked Stacks and Queues

    Lecture 5

    Pointers about Pointers

    var p, q : ^node;

    p := new(node) creates a new node and sets p to point to it.

    p^ describes the node which is pointed to by p.

    p^.item describes the item field of the node pointed to by p.

    dispose(p) returns to the system the memory used by the node pointed to by p. This is not used because of Modula-3 garbage collection.

    NIL is the only value a pointer can have which is not an address.

    Linked Stacks

    The problem with array-based stacks is that the size must be determined at compile time. Instead, let's use a linked list, with the stack pointer pointing to the top element.

    To push a new element on the stack, we must do:

    p^.next := top;
    top := p;

    Note this works even for the first push if top is initialized to NIL!

    Popping from a Linked Stack

    To pop an item from a linked stack, we just have to reverse the operation.

    p := top;
    top := top^.next;
    p^.next := NIL; (*avoid dangling reference*)

    Note again that this works in the boundary case of one item on the stack.

    Note that to check we don't pop from an empty stack, we must test whether top = NIL before using top as a pointer. Otherwise things crash with a segmentation fault.

    Linked Stack in Modula-3

    MODULE Stacks;                    (*14.07.94 RM, LB*)
    (* Implementation of the abstract, generic stack. *)

    REVEAL
      T = BRANDED REF RECORD
        info: ET;
        next: T;
      END; (*T*)

    PROCEDURE Create(): T =
    (*creates and initializes a new stack*)
    BEGIN
      RETURN NIL;                     (* a new, empty stack is simply NIL *)
    END Create;

    PROCEDURE Push(VAR stack: T; elem:ET) =
    (*adds element to stack*)
    VAR new: T := NEW(T, info:= elem, next:= stack);   (*create element*)
    BEGIN
      stack:= new                     (*add element at top*)
    END Push;

    PROCEDURE Pop(VAR stack: T): ET =
    (*removes and returns top element, or NIL for empty stack*)
    VAR first: ET := NIL;             (* Pop returns NIL for empty stack*)
    BEGIN
      IF stack # NIL THEN
        first:= stack.info;           (*copy info from first element*)
        stack:= stack.next;           (*remove first element*)
      END; (*IF stack # NIL*)
      RETURN first;
    END Pop;

    PROCEDURE Empty(stack: T): BOOLEAN =
    (*returns TRUE for empty stack*)
    BEGIN
      RETURN stack = NIL
    END Empty;

    BEGIN
    END Stacks.

    Generic Stack Interface

    INTERFACE Stacks;                 (*14.07.94 RM, LB*)
    (* Abstract generic stack. *)

    TYPE
      T

    *)
    IMPORT Stacks;
    IMPORT FractionType;
    FROM Stacks IMPORT Push, Pop, Empty;
    FROM SIO IMPORT PutInt, PutText, Nl, PutReal, PutChar;

    TYPE
      Complex = REF RECORD r, i: REAL END;

    VAR
      stackFraction: Stacks.T:= Stacks.Create();
      stackComplex : Stacks.T:= Stacks.Create();
      c: Complex;
      f: FractionType.T;

    BEGIN (*StacksClient*)
      PutText("Stacks Client\n");

      FOR i:= 1 TO 4 DO
        Push(stackFraction, FractionType.Create(1, i));   (*stores numbers 1/i*)
      END;

      FOR i:= 1 TO 4 DO
        Push(stackComplex, NEW(Complex, r:= FLOAT(i), i:= 1.5 * FLOAT(i)));
      END;

      WHILE NOT Empty(stackFraction) DO
        f:= Pop(stackFraction);
        PutInt(FractionType.Numerator(f));
        PutText("/");
        PutInt(FractionType.Denominator(f), 1);
      END;
      Nl();

      WHILE NOT Empty(stackComplex) DO
        c:= Pop(stackComplex);
        PutReal(c.r);
        PutChar(':');
        PutReal(c.i);
        PutText(" ");
      END;
      Nl();

    END StacksClient.

    Linked Queues

    Queues in arrays were ugly because we need wrap around for circular queues. Linked lists make it easier.

    We need two pointers to represent our queue - one to the rear for enqueue operations, and one to the front for dequeue operations.

    Note that because both operations move forward through the list, no back pointers are necessary!

    Enqueue and Dequeue

    To enqueue an item :

    p^.next := NIL;
    if (back = NIL) then begin     (* empty queue *)
      front := p;
      back := p;
    end else begin                 (* non-empty queue *)
      back^.next := p;
      back := p;
    end;

    To dequeue an item:

    p := front;
    front := front^.next;
    p^.next := NIL;
    if (front = NIL) then back := NIL;   (* now-empty queue *)

    Building the Calculator

    Lecture 6

    Reverse Polish Notation

    HP Calculators use reverse Polish notation or postfix notation. Instead of the conventional a + b, we write a b +.

    Our calculator will do the same. Why? Because it is the easiest notation to implement!

    The rule for evaluation is to read the expression from left to right. When we see a number, push it on the operation stack. When we see an operation, pop the last two numbers on the stack, do the operation, and push the result on the stack.


    Look Ma, no parentheses!
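
    The stack rule for postfix expressions is short enough to sketch directly. This Python version (not Modula-3) assumes space-separated tokens, ordinary integers, and only the operators +, - and *; the function name eval_rpn is hypothetical, and no error handling for malformed input is attempted.

```python
def eval_rpn(expression):
    """Evaluate a space-separated postfix expression over integers."""
    stack = []
    for token in expression.split():
        if token in ("+", "-", "*"):
            right = stack.pop()          # the last number pushed is the right operand
            left = stack.pop()
            if token == "+":
                stack.append(left + right)
            elif token == "-":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            stack.append(int(token))     # a number: push it
    return stack.pop()
```

    For example, eval_rpn("5 1 2 + 4 * + 3 -") corresponds to the infix expression 5 + (1 + 2) * 4 - 3.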

    Algorithms for the calculator

    To implement addition, we add digits from right to left, carrying one to the next place if the sum is 10 or greater.

    Note that the last carry can go beyond one or both numbers, so you must handle this special case.
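
    A sketch of digit-by-digit addition in Python (not Modula-3) follows. It assumes each number is stored as a list of decimal digits with the most significant digit first, which is only one of several reasonable representations; add_digits is a hypothetical name.

```python
def add_digits(a, b):
    """Add two non-negative numbers given as digit lists, most significant digit first."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    # Keep going while either number has digits left or a carry is pending,
    # so a final carry beyond both numbers is handled by the same loop.
    while i >= 0 or j >= 0 or carry:
        total = carry
        if i >= 0:
            total += a[i]
            i -= 1
        if j >= 0:
            total += b[j]
            j -= 1
        result.append(total % 10)
        carry = total // 10
    result.reverse()
    return result
```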

    To implement subtraction, we work on digits from right to left, and borrow 10 if necessary from the digit to the left.

    A borrow from the leftmost digit is complicated, since that gives a negative number.

    This is why I suggest completing addition first before worrying about subtraction.

    I recommend testing which number has the larger absolute value, subtracting the smaller from it, and then adjusting the sign accordingly.
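
    The recommendation to compare magnitudes first can be sketched as follows, in Python rather than Modula-3. Numbers are assumed to be digit lists, most significant digit first and without leading zeros; the names subtract_digits, compare_digits, and signed_subtract are hypothetical.

```python
def subtract_digits(a, b):
    """Return a - b as a digit list, assuming a >= b >= 0."""
    result = []
    borrow = 0
    j = len(b) - 1
    for i in range(len(a) - 1, -1, -1):
        diff = a[i] - borrow - (b[j] if j >= 0 else 0)
        j -= 1
        if diff < 0:
            diff += 10          # borrow 10 from the digit to the left
            borrow = 1
        else:
            borrow = 0
        result.append(diff)
    while len(result) > 1 and result[-1] == 0:
        result.pop()            # strip leading zeros of the answer
    result.reverse()
    return result

def compare_digits(a, b):
    """Return -1, 0 or 1 according to whether a < b, a = b or a > b."""
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    return (a > b) - (a < b)    # same length: compare digit by digit from the left

def signed_subtract(a, b):
    """Return (sign, digits) for a - b, always subtracting the smaller from the larger."""
    if compare_digits(a, b) >= 0:
        return 1, subtract_digits(a, b)
    return -1, subtract_digits(b, a)
```

    Because the larger magnitude is always on the left of the subtraction, a borrow never runs off the leftmost digit.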

    Parsing the Input

    There are several possible ways to handle the problem of reading in the input line and parsing it, i.e. breaking it into its elementary components of numbers and operators.

    The way that seems best to me is to read the entire line as one character string in a variable of type TEXT.

    As detailed in your book, you can use the function Text.Length(S) to get the length of this string, and the function Text.GetChar(S,i) to retrieve any given character.

    Useful functions on characters include the function ORD(c), which returns the integer character code of c. Thus ORD(c) - ORD('0') returns the numerical value of a digit character.

    You can test characters for equality to identify special symbols.
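
    The character-by-character approach can be sketched in Python (not Modula-3), where len(line) and line[i] play the roles of Text.Length and Text.GetChar, and ord(c) - ord("0") converts a digit character to its value. The function name tokenize and the choice to skip unrecognized characters are assumptions.

```python
def tokenize(line):
    """Split an input line into integer and operator tokens, one character at a time."""
    tokens = []
    i = 0
    while i < len(line):
        c = line[i]
        if "0" <= c <= "9":
            value = 0
            while i < len(line) and "0" <= line[i] <= "9":
                value = value * 10 + (ord(line[i]) - ord("0"))
                i += 1
            tokens.append(value)
        elif c in "+-*":
            tokens.append(c)        # special symbols are found by equality tests
            i += 1
        else:
            i += 1                  # skip blanks and anything unrecognized
    return tokens
```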

    Standard I/O

    The easiest way to read and write from files is to use I/O redirection from UNIX.

    Suppose calc is your binary program, and it expects input from the keyboard and output to the screen. By running calc < filein at the command prompt, it will take its input from the file filein instead of the keyboard.

    Thus by writing your program to read from regular I/O, you can debug it interactively and also run my test files.

    Programming Hints

    1. Write the comments first, for your sake.

    2. Make sure your main routine is abstract enough that you can easily see what the program does.

    3. Isolate the details of your data structures to a few abstract operations.

    4. Build good debug print routines first.

    List Insertion and Deletion

    Lecture 7

    Search, Insert, Delete

    There are three fundamental operations we need for any database:

    Insert: add a new record at a given point

    Delete: remove an old record

    Search: find a record with a given key

    We will see a wide variety of different implementations of these operations over the course of the semester.

    How would you implement these using an array?

    With linked lists, we can create arbitrarily large structures, and never have to move any items.

    Most of these operations should be pretty simple now that you understand pointers!

    Searching in a Linked List

    Procedure Search(head:pointer; key:item):pointer;
    Var
        p:pointer;
        found:boolean;
    Begin
        found:=false;
        p:=head;
        While (p # NIL) AND (not found) Do
        Begin
            If (p^.info = key) then
                found := true
            Else
                p := p^.next;
        End;
        return p;
    End;

    Search performs better when the item is near the front of the list than the back.

    What happens when the item isn't found?

    Insertion into a Linked List

    The easiest way to insert a new node p into a linked list is to insert it at the front of the list:

    p^.next := front;
    front := p;

    To maintain lists in sorted order, however, we will want to insert a node between the two appropriate nodes. This means that as we traverse the list we must keep pointers to both the current node and the previous node.
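    As a rough illustration of the two-pointer traversal, here is a sketch in Python rather than Modula-3. Python has no VAR parameters, so each routine returns the (possibly new) head of the list; the class and function names are assumptions for the example.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def search(head, key):
    """Return the first node holding key, or None if it is absent."""
    p = head
    while p is not None and p.key != key:
        p = p.next
    return p


def insert_sorted(head, key):
    """Insert key in ascending order and return the head of the list."""
    new = Node(key)
    if head is None or key < head.key:      # new node becomes the front
        new.next = head
        return new
    previous, current = head, head.next
    while current is not None and current.key < key:
        previous, current = current, current.next
    new.next = current                      # splice between the two nodes
    previous.next = new
    return head


def remove(head, key):
    """Unlink the first node holding key; return (head, found)."""
    previous, current = None, head
    while current is not None and current.key != key:
        previous, current = current, current.next
    if current is None:
        return head, False
    if previous is None:                    # removing the first node
        return current.next, True
    previous.next = current.next
    return head, True


def to_list(head):
    """Collect the keys in order, for printing and checking."""
    keys = []
    while head is not None:
        keys.append(head.key)
        head = head.next
    return keys


head = None
for k in [5, 1, 3, 4]:
    head = insert_sorted(head, k)
print(to_list(head))    # [1, 3, 4, 5]
```

    No items are ever moved: both insertion and removal only redirect one or two links.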

    MODULE Intlist; (*16.07.94. RM, LB*)
    (* Implementation of sorted integer lists. *)

    REVEAL (*reveal inner structure of T*)
      T = BRANDED REF RECORD
        key: INTEGER; (*key value*)
        next: T := NIL; (*pointer to next element*)
      END; (*T*)

    PROCEDURE Create(): T =
    (* returns a new, empty list *)
    BEGIN
      RETURN NIL; (*creation is trivial; empty list is NIL*)
    END Create;

    PROCEDURE Insert(VAR list: T; value:INTEGER) =
    (* inserts new element in list and maintains order *)
    VAR
      current, previous: T;
      new: T := NEW(T, key:= value); (*create new element*)
    BEGIN
      IF list = NIL THEN list:= new (*first element*)
      ELSIF value < list.key THEN (*insert at beginning*)
        new.next:= list; list:= new;
      ELSE (*find position for insertion*)
        current:= list;
        previous:= current;
        WHILE (current # NIL) AND (current.key
    ...
          ELSE
            previous.next:= current.next
          END;
        END; (*IF current = NIL*)
      END; (*IF list = NIL*)
    END Remove;

    Passing Procedures as Arguments

    Note the passing of a procedure as a parameter - it is legal, and useful to make more general functions, for example a sort routine for both increasing and decreasing order, or any order.

    PROCEDURE Iterate(list: T; action: Action) =
    (* applies action to all elements (with key value as parameter) *)
    BEGIN
      WHILE list # NIL DO
        action(list.key);
        list:= list.next;
      END;
    END Iterate;

    BEGIN (* Intlist *)
    END Intlist.

    Pointers and Parameter Passing

    Pointers provide, for better or (usually) worse, an alternate way to modify parameters. Let us look at two different ways to swap the ``values'' of two pointers.

    Procedure Swap1(var p,q:pointer);
    Var r:pointer;
    begin
        r:=q;
        q:=p;
        p:=r;
    end;

    This is perhaps the simplest and best way - we just exchange the values of the pointers...

    Alternatively, we could swap the values of what is pointed to, and leave the pointers unchanged.

    Procedure Swap2(p,q : pointer);
    var tmp : node;
    begin
        tmp := q^; (*1*)
        q^ := p^; (*2*)
        p^ := tmp; (*3*)
    end;

    [Figures: the nodes and pointers after each of steps (*1*), (*2*), and (*3*).]

    Side Effects of Pointers

    In Swap2, since we do not change the values of p and q, they do not need to be var parameters!

    However, copying the values did not do the same thing as copying the pointers, because in the first case the physical location of the data changed, while in the second the data stayed put.

    If data which is pointed to moves, the value of what is pointed to can change!

    Moral: you must be careful about the side effects of pointer operations!!!

    The C language does not have var parameters. All side effects are done by passing pointers. Additional pointer operations in C help make this practical.
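    The difference between the two swaps shows up as soon as a third pointer refers to one of the nodes. The sketch below uses Python, where every variable holding an object is a reference; since Python has no var parameters, the Swap1 analogue hands the references back in exchanged order. The names are illustrative only.

```python
class Node:
    def __init__(self, info):
        self.info = info


def swap_references(p, q):
    """Analogue of Swap1: exchange the references, not the data."""
    return q, p


def swap_contents(p, q):
    """Analogue of Swap2: exchange the data; the references stay put."""
    p.info, q.info = q.info, p.info


a, b = Node("apple"), Node("banana")
alias = a                       # a third reference to the same node as a

a, b = swap_references(a, b)
print(alias.info)               # apple: the node itself was never touched
a, b = swap_references(a, b)    # swap back

swap_contents(a, b)
print(alias.info)               # banana: the data under alias has changed
```

    The second print is the side effect to watch for: nothing was assigned to alias, yet what it points to changed.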

    Doing the Shuffle

    Lecture 8

    Programming Style

    Although programming style (like writing style) is a somewhat subjective thing, there is a big difference between good and bad.

    The good programmer doesn't just strive for something that works, but something that works elegantly and efficiently; something that can be maintained and understood by others.

    Just as a good writer rereads and rewrites their prose, a good programmer rewrites their program to make it cleaner and better.

    To get a better sense of programming style, let's critique some representative solutions to the card-shuffling assignment to see what's good and what can be done better.

    Ugly looking main

    MODULE card EXPORTS Main;
    IMPORT SIO;
    TYPE
    index=[1..200];
    Start=ARRAY index OF INTEGER;
    Left=ARRAY index OF INTEGER;
    Right=ARRAY index OF INTEGER;
    Final=ARRAY index OF INTEGER;
    VAR
    i,j,times,mid,k,x: INTEGER;
    start: Start;
    left: Left;
    right: Right;
    final: Final;
    BEGIN
    SIO.PutText("deck size shuffles\n");
    SIO.PutText("--------- --------\n");
    SIO.PutText(" 200 ");
    SIO.PutInt( times );
    REPEAT (*Repeat the following until perfect shuffle*)
    i:=1; (*original deck*)
    WHILE i
    ...
    times:=times+1;
    END card.

    There are no variable or block comments. This program would be hard to understand.

    This is an ugly looking program - the structure of the program is not reflected by the white space.

    Indentation and blank lines appear to be added randomly.

    There are no subroutines used, so everything is one big mess.

    See how the dependence on the number of cards is used several times within the body of the program, instead of just in one CONST.

    What Does shufs Do?

    PROCEDURE shufs( nn : INTEGER )= (* shuffling procedure *)
    VAR
      i : INTEGER; (* index variable *)
      count : INTEGER; (* COUNT variable *)
    BEGIN
      FOR i := 1 TO 200 DO (* reset this array *)
        shuffled[i] := i;
      END;

      count := 0; (* start counter from 0 *)

      REPEAT
        count := count + 1;
        FOR i := 1 TO 200 DO (* copy shuffled -> tempshuf *)
          tempshuf[i] := shuffled[i];
        END;
        FOR i := 1 TO nn DO (* shuffle 1st half *)
          shuffled[2*i-1] := tempshuf[i];
        END;
        FOR i := nn+1 TO 2*nn DO (* shuffle 2nd half *)
          shuffled[2*(i-nn)] := tempshuf[i];
        END;
      UNTIL shuffled = unshuffled ; (* did it return to original? *)

      (* print out the data *)
      Wr.PutText(Stdio.stdout , "2*n= " & Fmt.Int(2*nn) & " \t" );
      Wr.PutText(Stdio.stdout , Fmt.Int(count) & "\n" );
    END shufs;

    Every subroutine should ``do'' something that is easily described. What does shufs do?

    The solution to such problems is to write the block comment describing what the subroutine does before writing the subroutine.

    If you can't easily explain what it does, you don't understand it.

    How many comments are enough?

    MODULE Shuffles EXPORTS Main;
    IMPORT SIO;

    TYPE
      Array= ARRAY [1..200] OF INTEGER; (*Create an integer array from *)
                                        (*1 to 200 and called Array *)
    VAR
      original, temp1, temp2: Array; (*Declare original,temp1 and *)
                                     (*temp2 to be Array *)
      counter: INTEGER; (*Declare counter to be integer*)

    (********************************************************************)
    (* This is a procedure called shuffle used to return a number of *)
    (* perfect shuffle. It input a number from the main and run the *)
    (* program with it and then return the final number of perfect shuffle*)
    (********************************************************************)

    PROCEDURE shuffle(total: INTEGER) :INTEGER =
    VAR
      half, j, p: INTEGER; (*Declare half, j, p to be integer *)
    BEGIN
      FOR j:= 1 TO total DO
        original[j] := j;
        temp1[j] := j;
      END; (*for*)
      half := total DIV 2;
      REPEAT
        j := 0;
        p := 1;
        REPEAT
          j := j + 1;
          temp2[p] := temp1[j]; (* Save the number from the first half *)
                                (* of the original array into temp2 *)
          p := p + 1;
          temp2[p] := temp1[half+j]; (* Save the number from the last half*)
                                     (* of the original array into temp2*)
          p := p + 1;
        UNTIL p = total + 1; (*REPEAT_UNTIL used to make a new array of temp1*)
        INC (counter); (* increament counter when they shuffle once *)
        FOR i := 1 TO total DO
          temp1[i] := temp2[i];
        END; (* FOR loop used to save all the elements from temp2 to temp1 *)
      UNTIL temp1 = original; (* REPEAT_UNTIL, when two array match exactly *)
                              (* same then quick *)
      RETURN counter; (* return the counter *)
    END shuffle; (* end procedure shuffle *)

    (********************************************************************)
    (* This is the Main for shuffle program that prints out the numbers *)
    (* of perfect shuffles necessary for a deck of 2n cards *)
    (********************************************************************)

    BEGIN
      ...
    END Shuffles. (* end the main program called Shuffles *)

    This program has many comments which should be obvious to anyone who can read Modula-3.

    More useful would be enhanced block comments telling you what the program does and how it works.

    The ``is it completely reshuffled yet?'' test is done cleanly, although all of the 200 cards are tested regardless of deck size.

    The shuffle algorithm is too complicated.

    Algorithms must be pretty, too

    MODULE prj1 EXPORTS Main;
    IMPORT SIO;
    CONST
      n : INTEGER = 100; (*size of split deck*)

    TYPE
      nArray = ARRAY[1..n] OF INTEGER; (*n sized deck type*)
      twonArray = ARRAY[1..2*n] OF INTEGER; (*2n sized deck type*)

    VAR
      merged : twonArray; (*merged deck*)
      count : INTEGER;

    PROCEDURE shuffle(size:INTEGER; VAR merged:twonArray)=
    VAR
      topdeck, botdeck : nArray; (*arrayed split decks*)
    BEGIN
      FOR i := 1 TO size DO
        topdeck[i] := merged[i]; (*split entire deck*)
        botdeck[i] := merged[i+size]; (*into top, bottom decks*)
      END;
      FOR j := 1 TO size DO
        merged[2*j-1] := topdeck[j]; (*If odd then 2*n-1 position.*)
        merged[2*j] := botdeck[j]; (*If even then 2*n position*)
      END;
    END shuffle;

    PROCEDURE printout(count:INTEGER; size:INTEGER)=
    BEGIN
      SIO.PutInt(size);
      SIO.PutText(" ");
      SIO.PutInt(count);
      SIO.PutText(" \n");
    END printout;

    PROCEDURE checkperfect(merged:twonArray; i:INTEGER) : BOOLEAN=
    VAR
      size : INTEGER;
      check : BOOLEAN;
    BEGIN
      check := FALSE;
      size := 0;
      REPEAT
        INC(size, 1); (*check to see if*)
        IF merged[size+1] - merged[size] = 1 THEN (*deck is perfectly*)
          check := TRUE; (*shuffled, if so *)
        END; (*card progresses by 1*)
      UNTIL (check = FALSE OR size - 1 = i);
      RETURN check;
    END checkperfect;

    Checkperfect is much more complicated than it need be; just check whether merged[i] = i. You can return without the BOOLEAN variable.

    A good thing is that the deck size is all a function of a CONST.

    The shuffle is slightly wasteful of space - two extra full arrays instead of two extra half arrays.

    Why does this work correctly?

    BEGIN
      SIO.PutLine("Welcome to Paul's card shuffling program!");
      SIO.PutLine(" DECK SIZE NUMBER OF SHUFFLES ");
      SIO.PutLine(" _________________________________ ");
      num_cards := 2;
      REPEAT
        counter := 0;
        FOR i := 1 TO (num_cards) DO
          deck[i] :=i;
        END; (*initializes deck*)
        REPEAT
          deck := Shuffle(deck,num_cards);
          INC(counter);
        UNTIL deck[2] = 2;
        SIO.PutInt(num_cards,16); SIO.PutInt(counter,19);
        SIO.PutText("\n");
        INC(num_cards,2);
        (*increments the number of cards in deck by 2.*)
      UNTIL ( num_cards = ((2*n)+2));
    END ShuffleCards.

    Why do we know that this stopping condition suffices to get all the cards in the right position? This should be proven prior to use.

    Why use a Repeat loop when For will do?
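    For comparison, here is one way the critiques above could be put together: a small shuffle routine, a separate counting routine, and a test against the whole original deck rather than a single card. It is a sketch in Python, not one of the submitted solutions, and the function names are made up. It assumes the same interleaving as the listings above, where the top card stays on top.

```python
def perfect_shuffle(deck):
    """Interleave the top and bottom halves; the top card stays on top."""
    half = len(deck) // 2
    shuffled = []
    for top, bottom in zip(deck[:half], deck[half:]):
        shuffled.append(top)
        shuffled.append(bottom)
    return shuffled


def shuffles_to_restore(size):
    """Count perfect shuffles until a deck of even size is in order again."""
    original = list(range(1, size + 1))
    deck = perfect_shuffle(original)
    count = 1
    while deck != original:        # compare the whole deck, not one card
        deck = perfect_shuffle(deck)
        count += 1
    return count


for size in range(2, 11, 2):
    print(size, shuffles_to_restore(size))
print(52, shuffles_to_restore(52))   # 52 8
```

    The deck size appears in exactly one place, the argument of shuffles_to_restore.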

    Program Defensively

    I am starting to see the wreckage of several programs because students are not building their programs to be debugged.

    Add useful debug print statements! Have your program describe what it is doing!

    Document what you think your program does! Otherwise, how do you know when it is doing it?

    Build your program in stages! Thus you localize your bugs, and make sure you understand simple things before going on to complicated things.

    Use spacing to show the structure of your program. A good program is a pretty program!

    Recursive and Doubly Linked Lists

    Lecture 9

    Recursive List Implementation

    The basic insertion and deletion routines for linked lists are more elegantly written using recursion.

    PROCEDURE Insert(VAR list: T; value:INTEGER) =
    (* inserts new element in list and maintains order *)
    VAR new: T; (*new node*)
    BEGIN
      IF list = NIL THEN
        list:= NEW(T, key := value) (*list is empty*)
      ELSIF value < list.key THEN (*proper place found: insert*)
        new := NEW(T, key := value);
        new.next := list;
        list := new;
      ELSE (*seek position for insertion*)
        Insert(list.next, value);
      END; (*IF list = NIL*)
    END Insert;

    PROCEDURE Remove(VAR list:T; value:INTEGER; VAR found:BOOLEAN) =
    (* deletes (first) element with value from sorted list,
       or returns false in found if the element was not found *)
    BEGIN
      IF list = NIL THEN (*empty list*)
        found := FALSE
      ELSIF value = list.key THEN (*element found*)
        found := TRUE;
        list := list.next
      ELSE (*seek for the element to delete*)
        Remove(list.next, value, found);
      END;
    END Remove;
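    The same recursive idea can be tried out in a language without VAR parameters by having each call return the head of the list it was given. The following Python sketch makes that substitution; it is an illustration under that assumption, not a translation endorsed by the text, and the names are hypothetical.

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def insert(node, value):
    """Insert value into the sorted list at node; return its new head."""
    if node is None or value < node.key:
        return Node(value, node)            # proper place found
    node.next = insert(node.next, value)    # seek further down the list
    return node


def remove(node, value):
    """Delete the first node holding value; return (head, found)."""
    if node is None:
        return None, False                  # empty list: not found
    if node.key == value:
        return node.next, True              # unlink this node
    node.next, found = remove(node.next, value)
    return node, found


def to_list(node):
    """Collect the keys recursively, for printing and checking."""
    return [] if node is None else [node.key] + to_list(node.next)


head = None
for v in [4, 2, 9, 7]:
    head = insert(head, v)
print(to_list(head))    # [2, 4, 7, 9]
```

    Assigning the result back to node.next does the job that passing list.next as a VAR parameter does in the Modula-3 version.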

    Doubly Linked Lists

    Often it is necessary to move both forward and backwards along a linked list. Thus we need another pointer from each node, to make it doubly linked.

    List types are analogous to dance structures:

    Conga line - singly linked list.
    Chorus line - doubly linked list.
    Hora circle - doubly linked circular list.

    Extra pointers allow the flexibility to have both forward and backwards linked lists:

    type
        pointer = REF node;
        node = record
            info : item;
            front : pointer;
            back : pointer;
        end;

    Insertion

    How do we insert p between nodes q and r in a doubly linked list?

    p^.front := r;
    p^.back := q;
    r^.back := p;
    q^.front := p;

    It is not absolutely necessary to have pointer r, since r = q^.front, but it makes it cleaner.

    The boundary conditions are inserting before the first and after the last element.

    How do we insert before the first element in a doubly linked list (head)?

    p^.back := NIL;
    p^.front := head;
    head^.back := p;
    head := p; (* must point to entire structure *)

    Inserting at the end is similar, except head doesn't change, and it is the new node's front pointer that is set to NIL.
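    The three insertion cases can be exercised with a short sketch. It is written in Python, keeps the field names front and back from the record above, and adds helper names (insert_between, insert_at_head, insert_at_tail, forward, backward) that are assumptions for the example. Unlike the fragments above, the boundary routines also accept an empty list.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.front = None    # link toward the end of the list
        self.back = None     # link toward the head of the list


def insert_between(p, q, r):
    """Insert p between adjacent nodes q and r."""
    p.front = r
    p.back = q
    r.back = p
    q.front = p


def insert_at_head(head, p):
    """Insert p before the first node; return the new head."""
    p.back = None
    p.front = head
    if head is not None:
        head.back = p
    return p


def insert_at_tail(tail, p):
    """Insert p after the last node; return the new tail."""
    p.front = None
    p.back = tail
    if tail is not None:
        tail.front = p
    return p


def forward(head):
    items = []
    while head is not None:
        items.append(head.info)
        head = head.front
    return items


def backward(tail):
    items = []
    while tail is not None:
        items.append(tail.info)
        tail = tail.back
    return items


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
head = insert_at_head(None, c)
head = insert_at_head(head, a)
insert_between(b, a, c)
tail = insert_at_tail(c, d)
print(forward(head))     # ['a', 'b', 'c', 'd']
print(backward(tail))    # ['d', 'c', 'b', 'a']
```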

    Linked Lists: Pro or Con?

    The advantages of linked lists include:

    Overflow can never occur unless the memory is actually full.
    Insertions and deletions are easier than for contiguous (array) lists.
    With large records, moving pointers is easier and faster than moving the items themselves.

    The disadvantages of linked lists include:

    The pointers require extra space.
    Linked lists do not allow random access.
    Time must be spent traversing and changing the pointers.
    Programming is typically trickier with pointers.

    Recursion and Backtracking

    Lecture 10

    Recursion

    Recursion is a wonderful, powerful way to solve problems.

    Elegant recursive procedures seem to work by magic, but the magic is the same reason mathematical induction works!

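    Example: Prove $\sum_{i=1}^{n} i = n(n+1)/2$.

    For n=1, $\sum_{i=1}^{1} i = 1 = 1(1+1)/2$, so it's true. Assume it is true up to n-1. Then $\sum_{i=1}^{n} i = n + \sum_{i=1}^{n-1} i = n + (n-1)n/2 = n(n+1)/2$.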

    Example: All horses are the same color! (be careful of your basis cases!)

    The Tower of Hanoi

    MODULE Hanoi EXPORTS Main; (*18.07.94*)
    (* Implementation of the game Towers of Hanoi. *)

    PROCEDURE Transfer(from, to: Post) =
    (*moves a disk from post "from" to post "to"*)
    BEGIN
      WITH f = posts[from], t = posts[to] DO
        INC(t.top);
        t.disks[t.top]:= f.disks[f.top];
        f.disks[f.top]:= 0;
        DEC(f.top);
      END; (*WITH f, t*)
    END Transfer;

    PROCEDURE Tower(height:[0..Height] ; from, to, between: Post) =
    (*Does the job through recursive calls on itself*)
    BEGIN
      IF height > 0 THEN
        Tower(height - 1, from, between, to);
        Transfer(from, to);
        Display();
        Tower(height - 1, between, to, from);
      END;
    END Tower;

    BEGIN (*main program Hanoi*)
      posts[Post.Start].top:= Height;
      FOR h:= 1 TO Height DO
        posts[Post.Start].disks[h]:= Height - (h - 1)
      END;
      Tower(Height, Post.Start, Post.Finish, Post.Temp);
    END Hanoi.

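    To count the number of moves made, note that a tower of height n takes $T(n) = 2T(n-1) + 1$ moves with $T(0) = 0$, so $T(n) = 2^n - 1$.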

    Recursion not only made a complicated problem understandable, it made it easy to understand.
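    The move count is easy to check experimentally. The sketch below mirrors the structure of Tower in Python, but records each move in a list instead of updating the posts; the parameter names and the moves list are assumptions for the illustration.

```python
def tower(height, source, target, spare, moves):
    """Record the moves that transfer height disks from source to target."""
    if height > 0:
        tower(height - 1, source, spare, target, moves)
        moves.append((source, target))      # move the largest free disk
        tower(height - 1, spare, target, source, moves)


moves = []
tower(3, "Start", "Finish", "Temp", moves)
print(len(moves))    # 7

for height in range(6):
    recorded = []
    tower(height, "Start", "Finish", "Temp", recorded)
    print(height, len(recorded))    # one less than a power of two each time
```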

    Combinatorial Objects

    Many mathematical objects have simple recursive definitions which can be exploited algorithmically.

    Example: How can we build all subsets of n items? Build all subsets of n-1 items, copy the subsets, and add item n to each of the subsets in one copy but not the other.

    Once you start thinking recursively, many things have simpler formulations, such as traversing a linked list or binary search.

    Gray codes

    We saw how to generate subsets recursively. Now let us generate them in an interesting order.

    All subsets of $\{1, \ldots, n\}$ can be represented as binary strings of length n, where bit i tells whether i is in the subset or not.

    Obviously, all subsets must differ in at least one element, or else they would be identical. An order where they differ by exactly one from each other is called a Gray code.

    For n=1, {},{1}.

    For n=2, {},{1},{1,2},{2}.

    For n=3, {},{1},{1,2},{2},{2,3},{1,2,3},{1,3},{3}

    Recursive construction algorithm: Build a Gray code of $\{1, \ldots, n-1\}$, make a reverse copy of it, append n to each subset in the reverse copy, and stick the two together!
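    The reverse-and-append construction translates almost word for word into code. This is a sketch in Python using frozensets for the subsets; the function name gray_code is an assumption.

```python
def gray_code(n):
    """Subsets of {1, ..., n}, each differing from the next by one element."""
    if n == 0:
        return [frozenset()]
    previous = gray_code(n - 1)
    # Reverse copy of the smaller code, with n added to every subset.
    mirrored = [subset | {n} for subset in reversed(previous)]
    return previous + mirrored


for subset in gray_code(3):
    print(sorted(subset))
```

    For n=3 this prints the subsets in the same order as the list above, and the last subset of the first half and the first subset of the mirrored half differ only in n.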

    Formulating Recursive Programs

    Think about the base cases, the small cases where the problem is simple enough to solve.

    Think about the general case, which you can solve if you can solve the smaller cases.

    Unfortunately, many of the simple examples of recursion are equally well done by iteration, making students suspicious.

    Further, many of these classic problems have hidden costs which make recursion seem expensive, but don't be fooled!

    Factorials

    PROCEDURE Factorial (n: CARDINAL): CARDINAL =
    BEGIN
      IF n = 0 THEN
        RETURN 1 (* trivial case *)
      ELSE
        RETURN n * Factorial(n-1) (* recursive branch *)
      END (* IF*)
    END Factorial;

    Be sure you understand how the parameter passing mechanism works.

    Would this program work if n was a VAR parameter?

    Fibonacci Numbers

    The Fibonacci numbers are given by the recurrence relation $F_n = F_{n-1} + F_{n-2}$.

    PROCEDURE Fibonacci(n : CARDINAL) : CARDINAL =
    BEGIN (* Fibonacci *)
      IF n ... 1*)
      END (* IF *)
    END Fibonacci;

    How much time does this elementary Fibonacci function take?
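    One way to get a feel for the answer is to count the calls. The sketch below is in Python and assumes the base cases F(0) = 0 and F(1) = 1, which the listing above does not confirm; the counter argument and both function names are added purely for the experiment.

```python
def fib_recursive(n, counter):
    """Direct use of the recurrence; counter[0] tallies every call made."""
    counter[0] += 1
    if n <= 1:
        return n
    return fib_recursive(n - 1, counter) + fib_recursive(n - 2, counter)


def fib_iterative(n):
    """Same values with one loop and no repeated work."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


for n in (5, 10, 20):
    calls = [0]
    value = fib_recursive(n, calls)
    print(n, value, calls[0], fib_iterative(n))
```

    The number of calls grows about as fast as the Fibonacci numbers themselves, that is, exponentially in n, while the loop takes n steps. The cost comes from recomputing the same subproblems, not from recursion as such.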

    Implementing Recursion

    Part of the mystery of recursion is the question of how the machine keeps everything straight.

    How come local variables don't get trashed?

    The answer is that whenever a procedure or function is called, the local variables are pushed on a stack, so the new recursive call is free to use them.

    When a procedure ends, the variables are popped off the stack to restore them to where they were before the call.

    Thus the space used is proportional to the depth of the recursion, since stack space is reused.

    Tail Recursion

    Tail recursion costs space, but not time. It can be removed mechanically, and some compilers do so.

    Moral: Do not be afraid to use recursion if the algorithm is efficient.

    The overhead of recursion vs. maintaining your own stack is too small to worry about.

    By being clever, you can sometimes save stack space. Consider the following variation of Quicksort:

    If (p-1 < h-p) then
        Qsort(1,p)
        Qsort(p,h)
    else
        Qsort(p,h)
        Qsort(1,p)

    By doing the smaller half first, the maximum stack depth is $O(\log n)$ in the worst case.

    Applications of Recursion

    You may say, ``I just want to get a job and make lots of money. What can recursion do for me?''

    We will look at three applications:

    Backtracking
    Game Tree Search
    Recursive Descent Compilation

    The N-Queens Problem

    Backtracking is a way to solve hard search problems.

    For example, how can we put n queens on an $n \times n$ board so that no two queens attack each other?

    Tree Pruning

    Backtracking really pays off when we can prune a node early in the search tree.

    Thus we need never look at its children, or grandchildren, or great-grandchildren...

    We apply backtracking to big problems, so the more clever we are, the more time we save.

    There are $\binom{64}{8}$ = 4,426,165,368 total sets of eight squares, but no two queens can be in the same row. There are $8^8$ = 16,777,216 ways to place eight queens in different rows. However, since no two queens can be in the same column, there are only 8! permutations of columns, or only 40,320 possibilities.

    We must also be clever enough to test as quickly as possible that the new queen does not violate a diagonal constraint.
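    A compact backtracking sketch along these lines places one queen per row and abandons a partial placement as soon as a column or diagonal is reused. It is written in Python; keeping three sets so that each test takes constant time is one possible answer to the "test as quickly as possible" remark, not necessarily the one intended, and the names are illustrative.

```python
def count_queens(n):
    """Count the ways to place n non-attacking queens on an n x n board."""
    columns, diag_down, diag_up = set(), set(), set()

    def place(row):
        if row == n:
            return 1                        # every row holds a queen
        total = 0
        for col in range(n):
            if (col in columns or (row - col) in diag_down
                    or (row + col) in diag_up):
                continue                    # prune: this square is attacked
            columns.add(col)
            diag_down.add(row - col)
            diag_up.add(row + col)
            total += place(row + 1)
            columns.remove(col)             # backtrack
            diag_down.remove(row - col)
            diag_up.remove(row + col)
        return total

    return place(0)


print(count_queens(8))    # 92
```

    Squares on the same diagonal share either row - col or row + col, which is why two sets suffice for the diagonal constraint.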

    Applications of Recursion

    Lecture 11

    Game Trees

    Chess playing programs work by constructing a tree of all possible moves from a given position, so as to select the best possible path.

    The player alternates at each level of the tree, but at each node the player whose move it is picks the path that is best for them.

    A player has a forced loss if led down a path where the other guy wins if they play correctly.

    This is a recursive problem since we can always maximize, by just changing perspective.

    In a game like chess, we will never reach the bottom of the tree, so we must stop at a particular depth.

    Alpha-beta Pruning

    Sometimes we don't have to look at the entire game tree to get the right answer:

    No matter what the red score is, it cannot help max and thus need not be looked at.

    An advanced strategy called alpha-beta pruning reduces search accordingly.

    Recursive Descent Compilation

    Compilers do two useful things:

    They identify whether a program is legal in the language.
    They translate it into assembly language.

    To do either, we need a precise description of the language, a BNF grammar which gives the syntax. A grammar for Modula-3 is given throughout your text.

    The language definition can be recursive!!

    Our compiler will follow the grammar to break the program into smaller and smaller pieces.

    When the pieces get small enough, we can spit out the appropriate chunk of assembly code.

    To avoid getting into infinite loops, we place our trust in the fellow who wrote the grammar. Proper design can ensure that there are no such troubles.

    Abstraction and Modules

    Lecture 12

    Abstract Data Types

    It is important to structure programs according to abstract data types: collections of data with well-defined operations on them.

    Example: Stack or Queue.
    Data: A sequence of items.
    Operations: Initialize, Empty?, Full?, Push, Pop, Enqueue, Dequeue.

    Example: Infinite Precision Integers.
    Data: Linked list of digits with sign bit.
    Operations: Print Number, Read Number, Add, Subtract, Multiply, Divide, Exponent, Modulo, Compare.

    Abstract data types add clarity by separating the definitions from the implementations.

    What Do We Want From Modules?

    Separate Compilation - We should be able to break the program into smaller files. Further, we shouldn't need the source for each module to link it together, just the compiled object code files.

    Communicate Desired Information Between Modules - We should be able to define a type or procedure in one module and use it in another.

    Information Hiding - We should be able to define a type or procedure in one module and forbid using it in another! Thus we can clearly separate the definition of an abstract data type from its implementation!

    Modula-3 supports all of these goals by separating interfaces (.i3 files) from implementations (.m3 files).

    Example: The Piggy Bank

    Below is an interface file to a piggy bank:

    INTERFACE PiggyBank; (*RM*)
    (* Interface to a piggy bank:

       You can insert money with "Deposit". The only other permissible
       operation is smashing the piggy bank to get the ``money back''.
       The procedure "Smash" returns the sum of all deposited amounts
       and makes the piggy bank unusable.
    *)

    PROCEDURE Deposit(cash: CARDINAL);
    PROCEDURE Smash(): CARDINAL;

    END PiggyBank.

    Note that this interface does not reveal where or how the total value is stored, nor how to initialize it.

    These are issues to be dealt with within the implementation of the module.

    Piggy Bank Implementation

    MODULE PiggyBank; (*RM/CW*)
    (* Implementation of the PiggyBank interface *)

    VAR contents: INTEGER; (* state of the piggy bank *)

    PROCEDURE Deposit(cash: CARDINAL) =
    (* changes the state of the piggy bank *)
    BEGIN
      <*ASSERT contents >= 0*> (* piggy bank still okay? *)
      contents := contents + cash
    END Deposit;

    PROCEDURE Smash(): CARDINAL =
    VAR oldContents: CARDINAL := contents; (* contents before smashing *)
    BEGIN
      contents := -1; (* smash piggy bank *)
      RETURN oldContents
    END Smash;

    BEGIN
      contents := 0 (* initialization of state variables in body *)
    END PiggyBank.

    A Client Program for the Bank

    MODULE Saving EXPORTS Main; (*RM*)
    (* Client of the piggy bank:

       In a loop the user is prompted for the amount of deposit.
       Entering a negative amount smashes the piggy bank.
    *)

    FROM PiggyBank IMPORT Deposit, Smash;
    FROM SIO IMPORT GetInt, PutInt, PutText, Nl, Error;

    VAR cash: INTEGER;

    BEGIN (* Saving *)
      PutText("Amount of deposit (negative smashes the piggy bank): \n");
      REPEAT
        cash := GetInt();
        IF cash >= 0 THEN
          Deposit(cash)
        ELSE
          PutText("The smashed piggy bank contained $");
          PutInt(Smash());
          Nl()
        END;
      UNTIL cash < 0
    END Saving.

    Interface File Conventions

    Imports describe which procedures a given module uses from other modules.

    Exports describe what we are willing to make public, ultimately including the ``MAIN'' program.

    By naming files with the same .m3 and .i3 names, the ``ezbuild'' make command can start from the file with the main program, and find all other relevant files.

    Ideally, the interface file should hide as much detail about the internal implementation of a module from its users as possible. This is not easy without sophisticated language features.

    Hiding the Details

    INTERFACE Fraction; (*RM*)
    (* defines the data type for rational numbers *)

    TYPE T = RECORD
      num : INTEGER;
      den : INTEGER;
    END;

    PROCEDURE Init (VAR fraction: T; num: INTEGER; den: INTEGER := 1);
    (* Initialize "fraction" to be "num/den" *)

    PROCEDURE Plus (x, y : T) : T; (* x + y *)
    PROCEDURE Minus (x, y : T) : T; (* x - y *)
    PROCEDURE Times (x, y : T) : T; (* x * y *)
    PROCEDURE Divide (x, y : T) : T; (* x / y *)

    PROCEDURE Numerator (x : T): INTEGER; (* returns the numerator of x *)
    PROCEDURE Denominator (x : T): INTEGER; (* returns the denominator of x *)

    END Fraction.

    Note that there is a dilemma here. We must make type T public so these procedures can use it, but would like to prevent users from accessing (or even knowing about) the fields num and den directly.

    Subtypes and REFANY

    Modula-3 permits one to declare subtypes of types, A ...

    With generic pointers, it becomes necessary for type checking to be done at run-time, instead of at compile-time as done to date.

    This gives more flexibility, but much more room for you to hang yourself. For example:

    TYPE
      Student = REF RECORD lastname,firstname:TEXT END;
      Address = REF RECORD street:TEXT; number:CARDINAL END;

    VAR
      r1 : Student;
      r2 := NEW(Student, firstname:="Julie", lastname:="Tall");
      adr := NEW(Address, street:="Washington", number:=21);
      any : REFANY;

    BEGIN
      any := r2; (* always a safe assignment *)
      r1 := any; (* legal because any is of type student *)
      adr := any; (* produces a run-time error, not compile-time *)

    You should worry about the ideas behind generic implementations (why does Modula-3 do it this way?) more than the syntactic details (how does Modula-3 let you do this?). It is very easy to get overwhelmed by the detail.

    Generic Types

    When we think about the abstract data type ``Stack'' or ``Queue'', the implementation of the data structure is pretty much the same whether we have a stack of integers or reals.

    Without generic types, we are forced to declare the type of everything at compile time. Thus we need two distinct sets of functions, like PushInteger and PushReal for each operation, which is wasteful.

    Object-oriented programming languages provide features which enable us to create abstract data types which are more truly generic, making it cleaner and easier to reuse code.

    Object-Oriented Programming

    Lecture 13

    Why Objects are Good Things

    Modules provide a logical grouping of procedures on a related topic.

    Objects provide a logical grouping of data and associated operations.

    The emphasis of modules is on procedures; the emphasis of objects is on data.

    Modules are verbs followed by nouns: Push(S,x), while objects are nouns followed by verbs: S.Push(x).

    This provides only an alternate notation for dealing with things, but different notations can sometimes make it easier to understand things - the history of Calculus is an example.

    Objects do a great job of encapsulating the data items within, because the only access to them is through the methods, or associated procedures.

    Stack Object

    MODULE StackObj EXPORTS Main; (*24.01.95. LB*)
    (* Stack implemented as object type. *)

    IMPORT SIO;

    TYPE
      ET = INTEGER; (*Type of elements*)
      Stack = OBJECT
        top: Node := NIL; (*points to stack*)
      METHODS
        push(elem:ET):= Push; (*Push implements push*)
        pop() :ET:= Pop; (*Pop implements pop*)
        empty(): BOOLEAN:= Empty; (*Empty implements empty*)
      END; (*Stack*)
      Node = REF RECORD
        info: ET; (*Stands for any information*)
        next: Node (*Points to the next node in the stack*)
      END; (*Node*)

    PROCEDURE Push(stack: Stack; elem:ET) =
    (*stack: receiver object (self)*)
    VAR
      new: Node := NEW(Node, info:= elem); (*Element instantiate*)
    BEGIN
      new.next:= stack.top;
      stack.top:= new; (*new element added to top*)
    END Push;

    PROCEDURE Pop(stack: Stack): ET =
    (*stack: receiver object (self)*)
    VAR first: ET;
    BEGIN
      first:= stack.top.info; (*Info copied from first element*)
      stack.top:= stack.top.next; (*first element removed*)
      RETURN first
    END Pop;

    PROCEDURE Empty(stack: Stack): BOOLEAN =
    (*stack: receiver object (self)*)
    BEGIN
      RETURN stack.top = NIL
    END Empty;

    VAR
      stack1, stack2: Stack := NEW(Stack); (*2 stack objects created*)
      i1, i2: INTEGER;

    BEGIN
      stack1.push(2); (*2 pushed onto stack1*)
      stack2.push(6); (*6 pushed onto stack2*)
      i1:= stack1.pop(); (*pop element from stack1*)
      i2:= stack2.pop(); (*pop element from stack2*)
      SIO.PutInt(i1);
      SIO.PutInt(i2);
      SIO.Nl();
    END StackObj.

    Object-Oriented Programming

    Object-oriented programming is a popular, recent way of thinking about program organization.

    OOP is typically characterized by three major ideas:

    Encapsulation - objects incorporate both data and procedures.

    Inheritance - classes (object types) are arranged in a hierarchy, and each class inherits but specializes methods and data from its ancestors.

    Polymorphism - a particular object can take on different types at different times. We saw this with REFANY variables, whose types depend upon what is assigned to them (dynamic binding).

    Inheritance

    When we define an object type (class), we can specify that it be derived from (a subtype of) another class. For example, we can specialize the Stack object into a GarbageCan:

    TYPE
      GarbageCan = Stack OBJECT
        METHODS
          dump() := RemoveAll;   (* Discard everything from can *)
        OVERRIDES
          pop := Yech;           (* Remove something from can?? *)
        END; (*GarbageCan*)

    The GarbageCan type is a form of stack (you can still push onto it the same way), but we have overridden the pop method and added a dump method.

    This subtype-supertype relation defines a hierarchy (rooted tree) of classes. The appropriate method for a given object is determined at run time (dynamic binding) according to the first class at or above the current class to define the method.
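
    The lookup rule above can be tried out in a few lines. The following is a small Python sketch (not the course's Modula-3 code) in which a GarbageCan inherits push and empty from Stack but supplies its own pop and dump. The error message and the list-based storage are assumptions made only for this illustration.

```python
class Stack:
    """Plain stack; the list-based storage is an illustrative assumption."""

    def __init__(self):
        self._items = []

    def push(self, elem):
        self._items.append(elem)

    def pop(self):
        return self._items.pop()

    def empty(self):
        return not self._items


class GarbageCan(Stack):
    """A Stack subtype: push and empty are inherited unchanged."""

    def pop(self):
        # Overrides Stack.pop: nobody takes things back out of the garbage.
        raise RuntimeError("Yech!")

    def dump(self):
        # New method: discard everything in the can.
        self._items.clear()
```

    Calling can.push(x) on a GarbageCan finds no push in GarbageCan itself and so runs Stack.push, while can.pop() runs the GarbageCan version, since that is the first class at or above the object's own class that defines it.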

    OOP and the Calculator Program

    How might object-oriented programming ideas have helped in writing the calculator program?

    Many of you noticed that the linked stack type was similar to the long integer type, and wanted to reuse the code from one in another.

    The following type hierarchy shows one way we could have exploited this, by creating special stack methods push and pop, and overriding the add and subtract methods for general long-integers.

    Philosophical issue: should Long-Integer be a subtype of Positive-Long-Integer or vice versa?

    Why didn't I urge you to do it this way? In my opinion, the complexity of mastering and using the OOP features of Modula-3 would very much overwhelm the code savings from such a small program. Object-oriented features differ significantly from language to language, but the basic principles outlined here are fairly common.

    However, you should see why inheritance can be a big win in organizing larger programs.

    Simulations

    Lecture 14

    Simulations

    Often, a system we are interested in may be too complicated to readily understand, and too expensive or big to experiment with.

    What direction will an oil spill move in the Persian Gulf, given certain weather conditions?

    How much will increases in the price of oil change the American unemployment rate?

    How much traffic can an airport accept before long delays become common?

    We can often get good insights into hard problems by performing mathematical simulations.

    Scoring in Jai-alai

    Jai-alai is a Basque variation of handball, which is important because you can bet on it in Connecticut. What is the best way to bet?

    The scoring system in use in Connecticut is very interesting. Eight players or teams appear in each match, numbered 1 to 8. The players are arranged in a queue, and the top two players in the queue play each other. The winner gets a point and keeps playing; the loser goes to the end of the queue. The winner is the first one to reach 7 points.

    This scoring obviously favors the low numbered players. For fairness, after the first trip through the queue, each point counts two.

    But now is this scoring system fair?

    Simulating Jai-Alai

    1 PLAYS 21 WINS THE POINT, GIVING HIM 11 PLAYS 33 WINS THE POINT, GIVING HIM 14 PLAYS 33 WINS THE POINT, GIVING HIM 25 PLAYS 3

    3 WINS THE POINT, GIVING HIM 36 PLAYS 33 WINS THE POINT, GIVING HIM 47 PLAYS 33 WINS THE POINT, GIVING HIM 58 PLAYS 3

  • 8/2/2019 Data Structures and Programming

    54/162

    8 WINS THE POINT, GIVING HIM 18 PLAYS 22 WINS THE POINT, GIVING HIM 21 PLAYS 22 WINS THE POINT, GIVING HIM 44 PLAYS 22 WINS THE POINT, GIVING HIM 65 PLAYS 25 WINS THE POINT, GIVING HIM 25 PLAYS 65 WINS THE POINT, GIVING HIM 45 PLAYS 77 WINS THE POINT, GIVING HIM 2

    3 PLAYS 77 WINS THE POINT, GIVING HIM 48 PLAYS 77 WINS THE POINT, GIVING HIM 61 PLAYS 77 WINS THE POINT, GIVING HIM 8WIN-PLACE-SHOW IS 7 2 3

    BETTER THAN AVERAGE TRIFECTAS: 1 TRIALS

    WIN PLACE SHOW OCCURRENCES

    7 2 3

    Is the Scoring Fair?

    How can we test if the scoring system is fair?

    We can simulate a lot of games and see how often each player wins the game!

    But when player A plays a point against player B, how do we decide who wins? If the players are all equally matched, we can flip a coin to decide. We can use a random number generator to flip the coin for us!

    What data structures do we need?

    A queue to maintain the order of who is next to play.

    An array to keep track of each player's score during the game.

    An array to keep track of how often a player has won so far.
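
    To make the three data structures concrete, here is a minimal Python sketch of such a simulation. It is not the program that produced the results in these notes. It assumes equally matched players (a fair coin per point), treats the first seven points as the "first trip through the queue", and tracks only the winner, not place and show. The function names play_game and simulate are illustrative.

```python
import random
from collections import deque


def play_game(rng, players=8, target=7):
    """Simulate one game and return the winner's post position (1-based)."""
    queue = deque(range(1, players + 1))   # who is next to play
    score = [0] * (players + 1)            # score[p] for player p; index 0 unused
    champion = queue.popleft()
    points_played = 0
    while True:
        challenger = queue.popleft()
        # Equally matched players: a fair coin decides each point.
        if rng.random() < 0.5:
            winner, loser = champion, challenger
        else:
            winner, loser = challenger, champion
        # Points count double after the first trip through the queue.
        score[winner] += 1 if points_played < players - 1 else 2
        points_played += 1
        if score[winner] >= target:
            return winner
        queue.append(loser)                # loser goes to the end of the queue
        champion = winner


def simulate(games, seed=0, players=8):
    """Return wins, where wins[p] counts the games won from post position p."""
    rng = random.Random(seed)
    wins = [0] * (players + 1)
    for _ in range(games):
        wins[play_game(rng, players)] += 1
    return wins
```

    Calling simulate(100000) and dividing each count by the number of games gives win percentages that can be compared by post position.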

    Simulation Results

    Jai-alai Simulation Results

    Pos     win   %wins   place  %places    show  %shows
      1   16549   16.55   17989    17.99   15123   15.12
      2   16207   16.21   17804    17.80   15002   15.00
      3   13584   13.58   16735    16.73   14551   14.55
      4   12349   12.35   13314    13.31   13786   13.79
      5   10103   10.10   10997    11.00   13059   13.06
      6   10352   10.35    7755     7.75   11286   11.29
      7    9027    9.03    8143     8.14    9007    9.01
      8   11829   11.83    7263     7.26    8186    8.19

    total games = 100000

    Compare these to the actual win results from Berenson's Jai-alai 1983-1986:

    1 14.1%, 2 14.6%, 3 12.8%, 4 11.5%, 5 12.0%, 6 12.4%, 7 11.1%, 8 11.3%

    Were these results good?

    Yes, but not good enough to bet with! The matchmakers put the best players in the middle, so as to even the results. A more complicated model will be necessary for better results.

    Limitations of Simulations

    Although simulations are good things, there are several reasons to be skeptical of any results we get.

    Is the underlying model for the simulation accurate?

    Are the implicit assumptions reasonable, or are there biases?

    How do we know the program is an accurate implementation of the given model?

    After all, we wrote the simulation because we do not know the answers! How do you debug a simulation of two galaxies colliding or the effect of oil price increases on the economy?

    So much rides on the accuracy of simulations that it is critical to build in self-verification tests, and prove the correctness of the implementation.

    Random Number Generator

    We have shown that random numbers are useful for simulations, but how do we get them?

    First we must realize that there is a philosophical problem with generating random numbers on a deterministic machine.

    ``Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.'' - John von Neumann

    What we really want is a good way to generate pseudo-random numbers, a sequence which has the same properties as a truly random source.

    This is quite difficult - people are lousy at picking random numbers. Note that the following sequence produces 0's + 1's with equal frequency but does not look like a fair coin:

    Even recognizing random sequences is hard. Are the digits of $\pi$ pseudo-random?

    Should all those palindromes (535, 979, 46264, 383) be there?

    The Middle Square Method

    Von Neumann suggested generating random numbers by taking a big integer, squaring it, and using the middle digits as the seed/random number.

    It looks random to me... But what happens when the middle digits just happen to be 0000000000? From then on, all digits will be zeros!
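
    A short Python sketch of the middle square method makes the failure mode easy to see. The four-digit width and the function name middle_square are assumptions chosen for illustration; the notes do not fix a size.

```python
def middle_square(seed, digits=4):
    """Square the seed and keep the middle `digits` digits as the next value."""
    squared = str(seed * seed).zfill(2 * digits)   # pad to exactly 2*digits digits
    start = digits // 2                            # skip the leading quarter
    return int(squared[start:start + digits])
```

    Starting from 1234 the method gives 5227, then 3215, and so on. A seed of 0 stays at 0 forever, and with four digits a seed of 100 also maps back to itself, so the sequence can get stuck even without reaching zero.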

    Linear Congruential Generators

    The most popular random number generators, because of simplicity, quality, and small state requirements, are linear congruential generators.

    If $R_n$ is the last random number we generated, then

    $R_{n+1} = (a R_n + c) \bmod m$

    The quality of the numbers generated depends upon careful selection of the seed and the constants a, c, and m.

    Why does it work? Clearly, the numbers are between 0 and m-1. Taking the remainder mod m is like seeing where a roulette ball drops in a wheel with m slots.
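
    The recurrence R(n+1) = (a R(n) + c) mod m takes only a few lines of Python. In the sketch below, the default constants are the ones used in the sample rand() of the ANSI C standard. They are an assumption for this example, not the constants used in the course, and the helper take is likewise illustrative.

```python
from itertools import islice


def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Yield R1, R2, ... where each value is (a * previous + c) mod m."""
    r = seed
    while True:
        r = (a * r + c) % m
        yield r


def take(count, generator):
    """Return the first `count` values of a generator as a list."""
    return list(islice(generator, count))
```

    With the toy constants a=5, c=3, m=16 and seed 1, take(16, lcg(1, 5, 3, 16)) visits every value from 0 to 15 exactly once before repeating, which shows how the choice of constants controls the period.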

    SUNY at Stony Brook
    Midterm 1

    CSE 214 - Data Structures        October 10, 1997

    Midterm Exam

    Name:                Signature:
    ID #:                Section #:

    INSTRUCTIONS:

    You may use either pen or pencil.
    Check to see that you have 4 exam pages plus this cover (5 total).
    Look over all problems before starting work.
    Your signature above signs the CSE 214 Honor Pledge: ``On my honor as a student I have neither given nor received aid on this exam.''
    Think before you write.
    Good luck!!

    1) (25 points) Assume that you have the linked structure on the left, where each node contains a .next field consisting of a pointer, and the pointer p points to the structure as shown. Describe the sequence of Modula-3 pointer manipulations necessary to convert it to the linked structure on the right. You may not change any of the .info fields, but you may use temporary pointers tmp1, tmp2, and tmp3 if you wish.

    Many different solutions were possible, including:

    tmp1 := p;
    p := p.next;
    p^.next^.next := tmp1;

    2) (30 points) Write a procedure which ``compresses'' a linked list by deleting consecutive copies of the same character from it. For example, the list (A,B,B,C,A,A,A,C,A) should be compressed to (A,B,C,A,C,A). Thus the same character can appear more than once in the compressed list, only not successively. Your procedure must have one argument as defined below, a VAR parameter head pointing to the front of the linked list. Each node in the list has .info and .next fields.

    PROCEDURE compress(VAR head : pointer);

    Many different solutions are possible, but recursive solutions are particularly clean and elegant.

    PROCEDURE compress(VAR head : pointer) =
    VAR
      second : pointer;   (* pointer to next element *)
    BEGIN
      IF (head # NIL) THEN
        second := head^.next;
        IF (second # NIL) THEN
          IF (head^.info = second^.info) THEN
            head^.next := second^.next;   (* unlink the duplicate *)
            compress(head);
          ELSE
            compress(head^.next);
          END;
        END;
      END;
    END compress;
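
    For comparison, here is an iterative version of the same idea as a Python sketch. It swaps the recursion for a loop, and the Node class and the helpers from_list and to_list are hypothetical scaffolding added only so the example can be run.

```python
class Node:
    """Singly linked list node with .info and .next fields."""

    def __init__(self, info, next=None):
        self.info = info
        self.next = next


def compress(head):
    """Delete consecutive duplicate nodes in place and return the head."""
    node = head
    while node is not None and node.next is not None:
        if node.info == node.next.info:
            node.next = node.next.next   # unlink the duplicate, stay on this node
        else:
            node = node.next             # values differ, move forward
    return head


def from_list(values):
    """Build a linked list from a Python sequence (helper for the example)."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head):
    """Collect the .info fields of a linked list into a Python list."""
    out = []
    while head is not None:
        out.append(head.info)
        head = head.next
    return out
```

    Running to_list(compress(from_list("ABBCAAACA"))) returns the characters A, B, C, A, C, A, matching the example in the question.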

    3) (20 points) Provide the output of the following program:

    MODULE strange EXPORTS Main;

    IMPORT SIO;

    TYPE
      ptr_to_integer = REF INTEGER;

    VAR
      a, b : ptr_to_integer;

    PROCEDURE modify(x : ptr_to_integer; VAR y : ptr_to_integer) =
    BEGIN
      x^ := 3;
      SIO.PutInt(a^); SIO.PutInt(x^); SIO.Nl();
      y^ := 4;
      SIO.PutInt(b^); SIO.PutInt(y^); SIO.Nl();
    END modify;

    BEGIN
      a := NEW(ptr_to_integer); b := NEW(ptr_to_integer);
      a^ := 1;
      b^ := 2;
      SIO.PutInt(a^); SIO.PutInt(b^); SIO.Nl();
      modify(a, b);
      SIO.PutInt(a^); SIO.PutInt(b^); SIO.Nl();
    END strange.

    Answers:

    1 2
    3 3
    4 4
    3 4

    4) (25 points)

    Write brief essays answering the following questions. Your answer must fit completely in the space allowed.

    (a) Explain the difference between objects and modules. ANSWER: Several answers are possible, but the basic differences are (1) the notation used to invoke them, and (2) that objects encapsulate both procedures and data, whereas modules are procedure oriented.

    (b) What is garbage collection? ANSWER: The automatic reuse of dynamic memory which is no longer accessible because no pointer refers to it any more.

    (c) What might be an advantage of a doubly-linked list over a singly-linked list for certain applications? ANSWER: Additional flexibility in moving both forward and in reverse on a linked list. Specific advantages include being able to delete a node from a list given just a pointer to the node, and efficiently implementing double-ended queues (supporting push, pop, enqueue, and dequeue).

    http://www.cs.sunysb.edu/~skiena/214/lectures/index.html
    http://www.cs.sunysb.edu/~skiena/214/lectures/lect15/node1.html
    Asymptotics

    Lecture 15

    Analyzing Algorithms

    There are often several different algorithms which correctly solve the same problem. How can we choose among them? There can be several different criteria:

    Ease of implementation
    Ease of understanding
    Efficiency in time and space

    The first two are somewhat subjective. However, efficiency is something we can study with mathematical analysis, and gain insight as to which is the fastest algorithm for a given problem.

    Time Complexity of Programs

    What would we like as the result of the analysis of an algorithm? We might hope for a formula describing exactly how long a program implementing it will run.

    Example: Binary search will take milliseconds on an array of n elements.

    This would be great, for we could predict exactly how long our program will take. But it is not realistic for several reasons:

    1. Dependence on machine type - Obviously, binary search will run faster on a CRAY than a PC. Maybe binary search will now take ms?

    2. Dependence on language/compiler - Should our time analysis change when someone uses an optimizing compiler?

    3. Dependence on the programmer - Two different people implementing the same algorithm will produce two different programs, each taking slightly different amounts of time.

    4. Should your time analysis be average or worst case? - Many algorithms return answers faster in some cases than others. How did you factor this in? Exactly what do you mean by average case?

    5. How big is your problem? - Sometimes small cases must be treated differently from big cases, so the same formula won't work.

    Time Complexity of Algorithms

    For all of these reasons, we cannot hope to analyze the performance of programs precisely. We can analyze the underlying algorithm, but at a less precise level.

    Example: Binary search will use about $\log_2 n$ iterations, where each iteration takes time independent of n, to search an array of n elements in the worst case.

    Note that this description is true for all binary search programs regardless of language, machine, and programmer.

    By describing the worst case instead of the average case, we saved ourselves some nasty analysis. What is the average case?
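
    The iteration count can be checked directly. Below is a Python sketch of binary search instrumented to report how many loop iterations it used; the return convention (index, iterations) is an assumption made for this example.

```python
def binary_search(items, target):
    """Search a sorted list; return (index or -1, number of loop iterations)."""
    low, high = 0, len(items) - 1
    iterations = 0
    while low <= high:
        iterations += 1
        mid = (low + high) // 2
        if items[mid] == target:
            return mid, iterations
        if items[mid] < target:
            low = mid + 1        # target can only be in the upper half
        else:
            high = mid - 1       # target can only be in the lower half
    return -1, iterations
```

    On a sorted array of 1000 elements no search, successful or not, needs more than 10 iterations, which agrees with the estimate of about log2(1000), roughly 9.97.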

    Algorithms for Multiplications

    Everyone knows two different algorithms for multiplication: repeated addition and digit-by-digit multiplication.

    Which is better? Let's analyze the complexity of multiplying an n-digit number by an m-digit number, where $n \geq m$.

    In repeated addition, we explicitly use that $x \times y = y + y + \cdots + y$ (x copies of y). Thus adding an n-digit + m-digit number requires ``about'' n+m steps, one for each digit.

    How many additions can we do in the worst case? The biggest n-digit number is all nines, and $99\ldots9 = 10^n - 1$.

    The total time complexity is the cost per addition times the number of additions, so the total complexity is $O((n+m) \cdot 10^n)$.

    Digit-by-Digit Multiplication

    Since multiplying one digit by one other digit can be done by looking up in a multiplication table (2D array), each step requires a constant amount of work.

    Thus to multiply an n-digit number by one digit requires ``about'' n steps. With m ``extra'' zeros (in the worst case), ``about'' n+m steps certainly suffice.

    We must do m such multiplications and add them up - each add costs as much as the multiplication.

    The total complexity is the cost-per-multiplication * number-of-multiplications + cost-per-addition * number-of-multiplications $= (n+m) \cdot m + (n+m) \cdot m = 2m(n+m)$.

    Which is faster?

    Clearly the repeated addition method is much slower by our analysis, and the difference is going to increase rapidly with n...

    Further, it explains the decline and fall of the Roman empire - you cannot do digit-by-digit multiplication with Roman numerals!
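
    The gap between the two methods can be seen by counting steps. The Python sketch below charges one step per digit column touched, which is one reasonable reading of the ``about'' n+m accounting above, not an exact reproduction of it. Both function names are illustrative.

```python
def repeated_addition_steps(x, y):
    """Compute x * y by adding y to a running total x times; count digit steps."""
    total, steps = 0, 0
    for _ in range(x):
        # One addition touches as many columns as the longer operand has digits.
        steps += max(len(str(total)), len(str(y)))
        total += y
    return total, steps


def digit_by_digit_steps(x, y):
    """Compute x * y one digit of x at a time; count digit steps."""
    total, steps = 0, 0
    for shift, digit in enumerate(reversed(str(x))):
        partial = int(digit) * y * 10 ** shift    # one shifted row of the product
        steps += len(str(y)) + shift              # table lookups plus trailing zeros
        steps += max(len(str(total)), len(str(partial)))   # adding the row in
        total += partial
    return total, steps
```

    For 999 times 999 both functions return 998001, but repeated addition needs thousands of digit steps while the digit-by-digit method needs only a few dozen.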

    Growth Rates of Functions

    To compare the efficiency of algorithms then, we need a notation to classify numerical functions according to their approximate rate of growth.

    We need a way of exactly comparing approximately defined functions. This is the big Oh Notation:

    If f(n) and g(n) are functions defined for positive integers, then f(n) = O(g(n)) means that there exists a constant c such that $|f(n)| \le c \cdot |g(n)|$ for all sufficiently large positive integers n.

    The idea is that if f(n) = O(g(n)), then f(n) grows no faster (and possibly slower) than g(n).

    Note this definition says nothing about algorithms - it is just a way to compare numerical functions!

    Examples

    Example: is . Why? For all n > 100, clearly , so it satisfies the definition for c=100.

    Example: is not . Why? No matter what value of c you pick, is not true for n > c!

    In the big Oh Notation, multiplicative constants and lower order terms are unimportant. Exponents are important.
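
    As a worked illustration of the definition, with functions chosen here for the purpose rather than taken from the lecture, consider f(n) = 3n^2 + 10n and g(n) = n^2. Since 10n <= n^2 whenever n >= 10, we get f(n) <= 4 g(n) for all n >= 10, so f(n) = O(n^2) with c = 4. In the other direction, n^2 is not O(n), because n^2 <= c n fails as soon as n > c. The Python sketch below spot-checks such bounds over a finite range, which is evidence rather than proof.

```python
def bounded_by(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def f(n):
    return 3 * n * n + 10 * n   # lower order term 10n, constant factor 3


def g(n):
    return n * n
```

    Here bounded_by(f, g, 4, 10) holds, while bounded_by(f, g, 4, 1) does not, which shows why the definition only asks for sufficiently large n.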

    Ranking functions by the Big Oh

    The following functions are different according to the big Oh notation, and are ranked in increasing order:

    $O(1)$         Constant growth
    $O(\log n)$    Logarithmic growth (note: independent of base!)
    $O(n^k)$       Polynomial growth: ordered by exponent

    An Application: The Complexity of Songs

    Suppose we want to sing a song which lasts for n units of time. Since n can be large, we want to memorize songs which require only a small amount of brain space, i.e. memory.

    Let S(n) be the space complexity of a song which lasts for n units of time.

    The amount of space we need to store a song can be measured in either the words or characters needed to memorize it. Note that the number of characters is $\Theta(\text{number of words})$, since every word in a song is at most 34 letters long - Supercalifragilisticexpialidocious!

    What bounds can we establish on S(n)? S(n) = O(n), since in the worst case we must explicitly memorize every word we sing - ``The Star-Spangled Banner''

    The Refrain

    Most popular songs have a refrain, which is a block of text which gets repeated after each stanza in the song:

    Bye, bye Miss American pie
    Drove my chevy to the levy but the levy was dry
    Them good old boys were drinking whiskey and rye
    Singing this will be the day that I die.

    Refrains make a song easier to remember, since you memorize it once yet sing it O(n) times. But do they reduce the space complexity?

    Not according to the big oh. If

    Then the space complexity is still O(n) since it is only halved (if the verse-size = refrain-size):

    The k Days of Christmas

    To reduce S(n), we must structure the song differently.

    Consider ``The k Days of Christmas''. All one must memorize is:

    On the kth Day of Christmas, my true love gave to me,

    On the First Day of Christmas, my true love gave to me, a partridge in a pear tree

    But the time it takes to sing it is $\sum_{i=1}^{k} i = k(k+1)/2 = \Theta(k^2)$.

    If $n = \Theta(k^2)$, then $k = \Theta(\sqrt{n})$, so $S(n) = O(\sqrt{n})$.

    100 Bottles of Beer

    What do kids sing on really long car trips?

    n bottles of beer on the wall,
    n bottles of beer.
    You take one down and pass it around
    n-1 bottles of beer on the wall.

    All you must remember in this song is this template of size $\Theta(1)$, and the current value of n. The storage size for n depends on its value, but $\log_2 n$ bits suffice.

    Thus for this song, $S(n) = O(\log n)$.
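
    The argument can be mirrored in code: the singer stores one fixed template plus a counter. The Python sketch below is only an illustration. The names TEMPLATE, sing, and counter_bits are made up here, and the counter size is measured with Python's int.bit_length.

```python
TEMPLATE = (
    "{n} bottles of beer on the wall,\n"
    "{n} bottles of beer.\n"
    "You take one down and pass it around,\n"
    "{m} bottles of beer on the wall.\n"
)


def sing(n):
    """Yield the verses in order, storing only the template and one counter."""
    while n > 0:
        yield TEMPLATE.format(n=n, m=n - 1)
        n -= 1


def counter_bits(n):
    """Number of bits needed to write the counter value n in binary."""
    return max(1, n.bit_length())
```

    The template never grows, and counter_bits(100) is 7, so the memory needed grows only logarithmically with the starting number of bottles even though the song itself gets longer and longer.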

    Uh-huh, uh-huh

    Is there a song which eliminates even the need to count?

    That's the way, uh-huh, uh-huh
    I like it, uh-huh, huh

    Reference: D. Knuth, `The Complexity of Songs', Comm. ACM, April 1984, pp. 18-24

    Introduction to Sorting

    Lecture 16

    Sorting

    Sorting is, without doubt, the most fundamental algorithmic problem.

    1. Supposedly, 25% of all CPU cycles are spent sorting.
    2. Sorting is fundamental to most other algorithmic problems, for example binary search.
    3. Many different approaches lead to useful sorting algorithms, and these ideas can be used to solve many other problems.

    What is sorting? It is the problem of taking an arbitrary permutation of n items and rearranging them into the total order $x_1 \le x_2 \le \cdots \le x_n$.

    Knuth, Volume 3 of ``The Art of Computer Programming'', is the definitive reference on sorting.

    Issues in Sorting

    Increasing or Decreasing Order? - The same algorithm can be used for both; all we need do is change $\le$ to $\ge$ in the comparison function as we desire.

    What about equal keys? - Does the order matter or not? Maybe we need to sort on secondary keys, or leave them in the same order as the original permutation.

    What about non-numerical data? - Alphabetizing is sorting text strings, and libraries have very complicated rules concerning punctuation, etc. Is Brown-Williams before or after Brown America, before or after Brown, John?

    We can ignore all three of these issues by assuming a comparison function which depends on the application. Compare(a,b) should return ``<'', ``>'', or ``=''.
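
    Here is one way such an application-specific comparison might look, as a Python sketch. The rule used (ignore case, spaces, and punctuation) is an invented example, not a real library filing rule, and the names compare and sort_with are illustrative. functools.cmp_to_key is the standard adapter for comparison-style functions.

```python
from functools import cmp_to_key


def compare(a, b):
    """Compare two strings, ignoring case, spaces, and punctuation."""
    key_a = "".join(ch for ch in a.lower() if ch.isalnum())
    key_b = "".join(ch for ch in b.lower() if ch.isalnum())
    if key_a < key_b:
        return "<"
    if key_a > key_b:
        return ">"
    return "="


def sort_with(items, compare, descending=False):
    """Sort using a comparison function that returns '<', '>' or '='."""
    sign = {"<": -1, "=": 0, ">": 1}

    def as_number(a, b):
        result = sign[compare(a, b)]
        return -result if descending else result   # flip the test to reverse

    # sorted() is stable, so items that compare '=' keep their original order.
    return sorted(items, key=cmp_to_key(as_number))
```

    The sorting routine never looks inside the items: switching between increasing and decreasing order, or changing how punctuation is treated, only changes the comparison.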

    Applications of Sorting

    One reason why sorting is so important is that once a set of items is sorted, many other problems become easy.

    Searching

    Binary search lets you test whether an item is in a dictionary in $O(\log n)$ time.

    Speeding up searching is perhaps the most important application of sorting.

    Closest pair

    Given n numbers, find the pair which are closest to each other.

    Once the numbers are sorted, the closest pair will be next to each other in sorted order, so an O(n) linear scan completes the job.

    Element uniqueness

    Given a set of n items, are they all unique or are there any duplicates?

    Sort them and do a linear scan to check all adjacent pairs.

    This is a special case of closest pair above.

    Frequency distribution - Mode

    Given a set of n items, which element occurs the largest number of times?

    Sort them and do a linear scan to measure the length of all adjacent runs.

    Median and Selection

    What is the kth largest item in the set?

    Once the keys are placed in sorted order in an array, the kth largest can be found in constant time by simply looking in the kth position of the array.
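
    Each of these applications is a sort followed by a simple scan or lookup. The Python sketches below use the built-in sorted function; the function names are illustrative, and closest_pair and mode assume non-empty input (at least two numbers for closest_pair).

```python
def closest_pair(numbers):
    """Return the two closest values; after sorting they must be adjacent."""
    s = sorted(numbers)
    return min(zip(s, s[1:]), key=lambda pair: pair[1] - pair[0])


def all_unique(items):
    """True if no value occurs twice; duplicates end up adjacent after sorting."""
    s = sorted(items)
    return all(a != b for a, b in zip(s, s[1:]))


def mode(items):
    """Return the most frequent value by measuring runs in sorted order."""
    s = sorted(items)
    best, best_run, run = s[0], 1, 1
    for previous, current in zip(s, s[1:]):
        run = run + 1 if current == previous else 1
        if run > best_run:
            best, best_run = current, run
    return best


def kth_largest(items, k):
    """Return the kth largest value (k = 1 is the maximum)."""
    return sorted(items)[-k]
```

    In every case the sort dominates the running time, and the work after it is a single pass or a single array access.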

    How do you sort?

    There are several different ideas which lead to sorting algorithms:

    Insertion - putting an element in the appropriate place in a sorted list yields a larger sorted list.

    Exchange - rearrange pairs of elements which are out of order, until no such pairs remain.

    Selection - extract the largest element from the list, remove it, and repeat.

    Distribution - separate into piles based on the first letter, then sort each pile.

    Merging - Two sorted lists can be easily combined to form a sorted list.

    Selection Sort

    In my opinion, the most