
Chapter 3

Complexity and Asymptotic Analysis

In this part of the course we will look at measures of program execution efficiency. In other words, how do we answer the question: which program is better? We will begin by discussing how we can analyse algorithms directly, and then use this to motivate a technique for approximating their behaviour as the problem size increases. This discussion will focus on upper bounds on the amount of time required. We will then extend the discussion to consider lower bounds and space complexity. We then meet the concept of complexity classes, which are a way of grouping functions together based on their complexity. The chapter will finish with a discussion of practical applications of these concepts.

    Learning Outcomes

    At the end of this Chapter you will:

• Be familiar with the methods used to analyze the efficiency of algorithms written in an imperative programming language;

• Be able to explain the ‘Big-Oh’ notation (and its relatives Ω and Θ);

• Be able to analyse and compare the asymptotic complexities of imperative programs; and

• Be able to describe the relationship between time complexity and space complexity, and the notion of a complexity class.



    3.1 Setting the Scene

    To set the scene we answer a few clarifying questions about what this chapter is focussing on.

    What are we analysing?

We are analysing programs (or algorithms; the two terms mean the same thing here) to measure their complexity. This is usually taken to mean predicting the resources required by the algorithm. Sometimes this is memory, communications bandwidth, or computer hardware, but usually we are concerned with the time taken to execute the algorithm, and for the first part of this chapter this is what we will focus on.

    Which version of a program are we analysing?

If we wanted to write a program to check whether an input n is at least 1000, we could write the following program:

    result := 0;
    while (n > 0 & result = 0)
      if n != 1000 then n := n - 1 else result := 1

If we give this program 1000 it terminates after a single iteration with the correct answer. If we give the program 999 it will perform 999 iterations before it terminates with the correct answer. If we give the program 10000 it will perform 9001 iterations (9000 decrements, plus a final iteration that sets result; in general, an input n > 1000 needs n − 999 iterations) before it terminates with the correct answer. If we assume that the program could receive numbers in the range 1 to 10000 with uniform probability, then the average number of iterations performed will be

\[
\frac{1}{10000} + \left(\sum_{n=1}^{999} \frac{1}{10000}\,n\right) + \left(\sum_{n=1001}^{10000} \frac{1}{10000}\,(n-999)\right)
\]
\[
= \frac{1}{10000}\left(1 + \frac{999 \times 1000}{2} + \left(\frac{10000 \times 10001}{2} - \frac{1000 \times 1001}{2} - 9000 \times 999\right)\right)
\]
\[
= \frac{1}{10000}\bigl(1 + 499500 + (50005000 - 500500 - 8991000)\bigr) = 4101.3001
\]
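These numbers are easy to get wrong, so it is worth checking them by brute force. Below is a minimal sketch in Python (not the while language of these notes; the function name iterations is our own) that runs the program on every input from 1 to 10000 and averages the number of loop-body executions:

    def iterations(n):
        # Simulate the program above, counting executions of the loop body.
        result = 0
        count = 0
        while n > 0 and result == 0:
            count += 1
            if n != 1000:
                n = n - 1
            else:
                result = 1
        return count

    # Average over all inputs 1..10000, assumed uniformly likely.
    total = sum(iterations(n) for n in range(1, 10001))
    print(total / 10000)  # prints 4101.3001, matching the calculation above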

This took some non-trivial calculation to work out.

Which behaviour are we interested in?

The best case? The worst case? The likely case? How do we know what the likely case is? Above I assumed that the inputs were drawn uniformly from some range, but we rarely know such details. If we are implementing a sorting function, are the inputs likely to be already sorted?


If we are implementing a function to turn an NFA into a DFA, is the input almost a DFA already?

Practically, we often think about the common case, but in these notes we will mainly be focussing on the worst case, as this allows us to compare programs more easily without worrying about what the expected use case is.

What’s the difference between programs and problems?

Previously we were very particular about the difference between a program and the function it computes, i.e. the problem it solves. That same distinction is still important here. Two programs can compute the same function (solve the same problem) and have different complexities. But we might also be interested in the inherent complexity of a problem itself. For example, finding the maximum value in an array is a problem that inherently requires us to look at every element of the array: no solution can be correct without doing this. This suggests a lower bound on the complexity of any solution. Throughout this chapter we will abuse terminology and talk about a problem’s complexity. When we do this we mean the complexity of the best possible program that could solve that problem.

    Therefore, when we talk about programs we will usually think in terms of upper bounds (worst case) and when we talk about problems we talk about lower bounds (best case). We clarify this more later.
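The maximum-finding example makes this concrete. As a sketch in Python (our own illustration; the course’s while language would do equally well), the obvious program inspects each element exactly once, and since no correct program can inspect fewer, it already meets the problem’s inherent lower bound:

    def maximum(a):
        # Every element must be examined: any element we skipped
        # could have been the maximum.
        m = a[0]
        for v in a[1:]:
            if v > m:
                m = v
        return m

    print(maximum([3, 1, 4, 1, 5, 9, 2, 6]))  # prints 9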

    What are we not doing?

This isn’t a course in algorithms. We won’t be introducing complicated algorithms to solve specific problems. In fact, we won’t be looking at many programs. The main learning outcome is to understand the definition of Big-Oh notation and what it means. This is a first-year course. We won’t be exploring advanced material on computational complexity. The only model of computation we have met is the while language, so we won’t be looking at things from the context of Turing Machines, and we won’t be considering parallel, non-deterministic, or random computation, all of which change the story.

We are not analysing programs in terms of their readability or understandability. This is not what we mean by complexity here. Such things are important, but this chapter is interested in the complexity of the computation performed by a program, not of the program text itself.


    3.2 Simplifying Assumptions

To answer the question of how long a program will take to execute we need to answer some simpler questions first. For example, how long does it take a computer to compute 1 + 2? This will depend on the architecture and its instruction set. For some reduced instruction set architectures, computing 1 + 2 might require multiple instructions to place 1 and 2 into registers first, taking multiple cycles. For a complex instruction set there may be a single instruction that does everything in one cycle, but typically such cycles take longer. This distinction might not matter for adding two 32-bit integers. But what about multiplication or division? What if we are doing security for e-commerce involving the RSA cryptographic protocols, where we might need to perform operations on 200-400 digit numbers? These will be implemented as sequences of machine instructions operating on vectors of 32-bit unsigned integers, so the number of operations performed will depend on the size of the integers concerned. In this course we have been considering Z in general, but arguably arithmetic does not have a fixed cost for all integers. As a simplifying assumption we can assume that we only deal with numbers that fit into 32-bit integers, or at least numbers for which the cost of arithmetic operations on them is fixed.
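To see why this assumption is needed, consider the following rough sketch in Python (our own illustration; Python integers are arbitrary precision, much like the Z we have been assuming). It times the squaring of increasingly large numbers, and the timings, which will vary from machine to machine, grow clearly with the number of digits:

    import timeit

    # Square an n-digit number 10 times and report the elapsed time.
    for digits in (10, 1_000, 100_000, 1_000_000):
        x = 10 ** digits
        t = timeit.timeit(lambda: x * x, number=10)
        print(f"{digits:>9} digits: {t:.6f}s for 10 multiplications")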

Can we access and write to all memory in constant time? Not necessarily: to perform an operation on a data value the computer may need to move it into a register, and then move the result back. More generally, computers have memory hierarchies with different access times. As a simplifying assumption we can assume that all accesses occur in constant time. However, in many practical programs memory access time is the limiting factor, and in such cases it should form part of the complexity analysis.
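The effect of the memory hierarchy can be glimpsed even from Python, although the interpreter adds considerable overhead of its own, so treat the following sketch (our own illustration) as a hint rather than a rigorous benchmark. It sums the same list twice, once in sequential order and once in a random order; on most machines the random order is noticeably slower, largely because of cache misses:

    import random
    import time

    N = 10_000_000
    data = list(range(N))
    order = list(range(N))

    start = time.perf_counter()
    total = sum(data[i] for i in order)   # access elements in order 0, 1, 2, ...
    t_sequential = time.perf_counter() - start

    random.shuffle(order)
    start = time.perf_counter()
    total = sum(data[i] for i in order)   # same work, random access order
    t_random = time.perf_counter() - start

    print(t_sequential, t_random)  # t_random is typically the larger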

As part of complexity analysis we measure the usage of some resource, but the units in which we measure will have a significant impact on the result. In particular, for time complexity we need to decide which operations matter and what their associated cost should be.

Important The choice of which instructions or operations we measure is an informed one, made by the computer scientist. Ideally, this choice should be made so that the results obtained are useful.


    3.3 Analyzing Algorithms

    Recall the division algorithm introduced in Exercise 1.2.1:

    1 r := x; d := 0;
    2 while y ≤ r do
    3   d := d + 1; r := r - y;

How many primitive operations are performed executing this program? Let’s count the operations used on each line. Line 1 contains two assignments. Line 2 performs the loop check: if the loop is executed N times then we perform N + 1 comparisons (≤). Similarly, on line 3 we will perform N additions, N subtractions, and 2N assignments. The only question we now need to address is: how big is N? Notice that the variable d is assigned the value 0, and each time we iterate through the loop we increment d by 1. At the end of the execution of the loop, d will have the value ⌊x/y⌋, and this is thus the value of N as well. So, if we assume that all of these operations have the same cost, the cost of this program is

\[
3 + 5\left\lfloor \frac{x}{y} \right\rfloor
\]
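This count can be checked mechanically. Below is a sketch in Python (our own translation; the counter ops is not part of the original program) that charges one unit for every assignment, comparison, addition, and subtraction, exactly as in the analysis above:

    def divide_counting_ops(x, y):
        ops = 0
        r = x; ops += 1            # line 1: assignment
        d = 0; ops += 1            # line 1: assignment
        while True:
            ops += 1               # line 2: one comparison per loop check
            if not (y <= r):
                break
            d = d + 1; ops += 2    # line 3: addition and assignment
            r = r - y; ops += 2    # line 3: subtraction and assignment
        return d, r, ops

    d, r, ops = divide_counting_ops(17, 5)
    print(ops, 5 * (17 // 5) + 3)  # both print 18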

Now let us consider the slightly more complex program from Example 1.2.5, which checks whether x is prime.

    1 y := x - 1;
    2 z := 1;
    3 while (y > 1 ∧ z = 1) do (
    4   r := x;
    5   (while (y ≤ r) do r := r - y);
    6   (if (r = 0) then z := 0);
    7   y := y - 1;
    8 )
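As with the division algorithm, a direct translation into Python (a sketch of our own; the function name and the iteration counter are not part of the original program) lets us check the counts derived below. For a prime x the outer loop should execute x − 2 times:

    def is_prime_by_subtraction(x):
        outer = 0                 # counts outer-loop iterations
        y = x - 1
        z = 1
        while y > 1 and z == 1:
            outer += 1
            r = x
            while y <= r:         # inner loop: r ends up as x mod y
                r = r - y
            if r == 0:            # y divides x, so x is not prime
                z = 0
            y = y - 1
        return z, outer

    print(is_prime_by_subtraction(13))  # prints (1, 11): prime, and 11 = 13 - 2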

Let’s consider the case where x is prime; this is the worst case, since when x is not prime the outer loop will terminate early. In this case we check whether every number less than x divides x. There are x − 2 such numbers, so the outer loop executes x − 2 times. The inner loop is the algorithm we used above, so we can reuse the results from that analysis. Given the three initial operations (two assignments and a −) before the outer loop and the additional 2 assignments,