Programming for Engineers in Python
Autumn 2011-12

Lecture 9: Sorting, Searching and Time Complexity Analysis

Lecture 8: Highlights

Design a recursive algorithm by
1. Solving big instances using the solution to smaller instances
2. Solving directly the base cases

Recursive algorithms have
1. Stopping criteria
2. Recursive case(s)
3. Construction of a solution using solution to smaller instances

Today

• Information
• Importance of quick access to information
• How can it be done?
• Preprocessing the data enables fast access to what we are interested in
• Example: dictionary
• The most basic data structure in Python: list
• Sorting → preprocessing
• Searching → fast access
• Time complexity

Information

There are about 20,000,000,000 web pages on the internet

Sorting

• A sorted array is an array whose values are in ascending/descending order
• Very useful
• Sorted array example: 1 2 5 8 9 13 67
• Super easy in Python – the sorted function
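As a minimal sketch of the built-in sorted function (the input list here is a made-up example, not taken from the slides), sorting in either direction could look like this:

```python
# Hypothetical unsorted input, chosen only for illustration.
values = [67, 8, 1, 13, 5, 9, 2]

# sorted() returns a new list and leaves the original untouched.
ascending = sorted(values)
descending = sorted(values, reverse=True)

print(ascending)
print(descending)
print(values)
```

Note that sorted builds a new list; the list method values.sort() would sort in place instead.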

Why is it Important to Sort Information?

• To find a value, and fast!
• Finding M values in a list of size N
• Naive solution: given a query, traverse the list and find the value
• Not efficient, average of N/2 operations per query
• Better: sort the array once and then perform each query much faster

Why not Use Dictionaries?

• Good idea!
• Not appropriate for all applications:
• Find the 5 most similar results to the query
• Query's percentile
• We will refer to array elements of the form (key, value)
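The two queries that a dictionary does not answer well can be sketched on a sorted list with the standard bisect module. The data and the helper names percentile and closest are assumptions made for this illustration, not part of the lecture:

```python
import bisect

# Hypothetical data, already sorted in ascending order.
scores = [48, 55, 61, 70, 70, 82, 90, 95]

def percentile(sorted_values, query):
    """Percentage of values strictly smaller than query."""
    rank = bisect.bisect_left(sorted_values, query)
    return 100.0 * rank / len(sorted_values)

def closest(sorted_values, query, k=5):
    """The k values nearest to query.

    The k nearest values must lie within k positions of the
    insertion point, so only that window is examined.
    """
    pos = bisect.bisect_left(sorted_values, query)
    window = sorted_values[max(0, pos - k):pos + k]
    return sorted(window, key=lambda v: abs(v - query))[:k]

print(percentile(scores, 70))
print(closest(scores, 72, k=3))
```

Both helpers rely on the order of the values, which is exactly the information a dictionary lookup does not provide.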

Naïve Search in a General Array

• Find location of a value in a given array
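A minimal sketch of the naive search, assuming a plain Python list and the convention (chosen here, not stated in the slides) of returning None when the value is absent:

```python
def naive_search(lst, value):
    """Return the index of the first occurrence of value, or None."""
    for index, item in enumerate(lst):
        if item == value:
            return index
    return None  # assumed convention for "not found"

numbers = [7, 2, 8, 5, 4]  # hypothetical unsorted input
print(naive_search(numbers, 5))
print(naive_search(numbers, 10))
```

In the worst case every element is examined, so the number of operations grows linearly with the length of the list.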


Binary Search (requires a sorted array)

• Input: sorted array A, query k
• Output: corresponding value / not found
• Algorithm:
• Check the middle element in A
• If the corresponding key equals k, return the corresponding value
• If k < middle, find k in A[0..middle-1] (both ends inclusive)
• If k > middle, find k in A[middle+1..end]
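The steps above can be sketched recursively as follows. This is one possible implementation, not necessarily the code shown in the lecture: it assumes the array is a list of (key, value) tuples sorted by key, passes index bounds instead of slicing, and returns None for "not found".

```python
def binary_search(pairs, key, low=0, high=None):
    """Search pairs (a list of (key, value) tuples sorted by key).

    Returns the value stored with key, or None if key is absent.
    low and high are inclusive index bounds of the current range.
    """
    if high is None:
        high = len(pairs) - 1
    if low > high:
        return None  # empty range: not found
    middle = (low + high) // 2
    mid_key, mid_value = pairs[middle]
    if key == mid_key:
        return mid_value
    if key < mid_key:
        return binary_search(pairs, key, low, middle - 1)
    return binary_search(pairs, key, middle + 1, high)

# Hypothetical (key, value) data, sorted by key.
phone_book = [(3, "Dana"), (8, "Avi"), (15, "Noa"), (42, "Tal")]
print(binary_search(phone_book, 15))
print(binary_search(phone_book, 9))
```

Passing low and high avoids copying half of the list on every call, which slicing would do.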

Example

index:   0   1   2   3   4   5   6   7   8   9
value:  -5  -3   0   4   8  11  22  56  57  97

• Searching for 56
• Searching for 4
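To follow the two searches by hand, the sketch below records which indices are examined. It assumes the middle index is computed as (low + high) // 2; a different rounding choice would visit different indices.

```python
def probed_indices(sorted_values, query):
    """Return the middle indices examined while searching for query."""
    low, high = 0, len(sorted_values) - 1
    probes = []
    while low <= high:
        middle = (low + high) // 2
        probes.append(middle)
        if sorted_values[middle] == query:
            break
        if query < sorted_values[middle]:
            high = middle - 1
        else:
            low = middle + 1
    return probes

values = [-5, -3, 0, 4, 8, 11, 22, 56, 57, 97]
print(probed_indices(values, 56))  # [4, 7]
print(probed_indices(values, 4))   # [4, 1, 2, 3]
```

Under this assumption 56 is found after two probes and 4 after four, out of ten elements.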

Code – Binary Search

Binary Search – 2nd Try

Time Complexity

• Worst case:
• The array size is halved with every recursive call
• Every step is extremely fast (constant number of operations, c)
• There are at most about log2(n) steps
• Total of approximately c*log2(n)
• For n = 1,000,000 binary search will take about 20 steps, much faster than the naive search
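The figure of 20 steps can be checked with a short sketch. The function name worst_case_probes is an assumption for this illustration; it counts how many times a range of n items can be probed when at most half of it survives each probe.

```python
import math

def worst_case_probes(n):
    """Number of middle elements examined before a range of n items is empty."""
    probes = 0
    while n > 0:
        probes += 1
        n //= 2  # at most half of the range remains after each probe
    return probes

print(worst_case_probes(1000000))  # 20
print(math.log2(1000000))          # a little under 20
```

For comparison, the naive search may need up to 1,000,000 comparisons on the same input.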

Time Complexity in a Nutshell

• An algorithm's complexity is measured by run time and space (memory)
• Ignoring quick operations that execute a constant number of times (independent of input size)
• Approximate time complexity in order of magnitude, denoted with O (http://en.wikipedia.org/wiki/Big_O_notation)
• Example: n = 1,000,000
• O(n^2) = constant * trillion (Tera)
• O(n) = constant * million (Mega)
• O(log2(n)) = constant * 20

Order of Magnitude

n             log2 n   n log2 n       n^2
1             0        0              1
16            4        64             256
256           8        2,048          65,536
4,096         12       49,152         16,777,216
65,536        16       1,048,576      4,294,967,296
1,048,576     20       20,971,520     1,099,511,627,776
16,777,216    24       402,653,184    281,474,976,710,656
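The rows of such a table can be recomputed with a few lines of Python. This is only a sketch; the helper name growth_row is an assumption, and it relies on every n being a power of two so that log2 is exact.

```python
import math

def growth_row(n):
    """Return (n, log2 n, n * log2 n, n^2) for n that is a power of two."""
    log_n = int(math.log2(n))  # exact for powers of two
    return n, log_n, n * log_n, n * n

for n in [1, 16, 256, 4096, 65536, 1048576, 16777216]:
    print("{:>12,} {:>4} {:>13,} {:>22,}".format(*growth_row(n)))
```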

Graphical Comparison


Code – Iterative Binary Search
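The original code for this slide is not preserved here. A minimal iterative sketch, assuming a list of (key, value) tuples sorted by key and None as the "not found" result, might look like this:

```python
def binary_search_iterative(pairs, key):
    """Return the value stored with key in pairs, or None if absent.

    pairs must be a list of (key, value) tuples sorted by key.
    """
    low, high = 0, len(pairs) - 1
    while low <= high:
        middle = (low + high) // 2
        mid_key, mid_value = pairs[middle]
        if key == mid_key:
            return mid_value
        if key < mid_key:
            high = middle - 1  # continue in the left half
        else:
            low = middle + 1   # continue in the right half
    return None

# Hypothetical (key, value) data, sorted by key.
table = [(-5, "a"), (0, "b"), (4, "c"), (11, "d"), (56, "e")]
print(binary_search_iterative(table, 56))
print(binary_search_iterative(table, 5))
```

The loop replaces the recursive calls: each iteration narrows the inclusive range [low, high] by at least half.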


Testing Efficiency: Preparations

(Tutorial for timeit: http://www.doughellmann.com/PyMOTW/timeit/)

Testing Efficiency
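The lecture's measurement code is not preserved here. As a rough sketch of how such a comparison could be set up with timeit, the example below times a linear scan (list.index) against the standard library's bisect.bisect_left, which stands in for a hand-written binary search. The list size, the number of queries and the repeat count are arbitrary assumptions.

```python
import bisect
import random
import timeit

N = 20000  # assumed list size
data = sorted(random.sample(range(10 * N), N))       # unique, sorted values
queries = [random.choice(data) for _ in range(100)]  # all queries are present

def naive_queries():
    # list.index scans from the start: linear time per query
    return [data.index(q) for q in queries]

def binary_queries():
    # bisect_left performs a binary search: logarithmic time per query
    return [bisect.bisect_left(data, q) for q in queries]

naive_time = timeit.timeit(naive_queries, number=5)
binary_time = timeit.timeit(binary_queries, number=5)
print("naive:  %.4f seconds" % naive_time)
print("binary: %.4f seconds" % binary_time)
```

The absolute numbers depend on the machine; the gap between the two timings is what matters, and it widens as N grows.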

Results

Until now we assumed that the array is sorted…

How to sort an array efficiently?

Bubble Sort

• Scan the array and compare each pair of adjacent values
• Swap them if they are in the wrong order
• Repeat the process until no more swaps are needed (the array is sorted)
• Why "bubbles"?
• In each pass, the algorithm "bubbles" the largest element to its correct place at the end of the array.

Bubble Sort Example

Pass 1: 7 2 8 5 4 → 2 7 8 5 4 → 2 7 8 5 4 → 2 7 5 8 4 → 2 7 5 4 8
Pass 2: 2 7 5 4 8 → 2 7 5 4 8 → 2 5 7 4 8 → 2 5 4 7 8
Pass 3: 2 5 4 7 8 → 2 5 4 7 8 → 2 4 5 7 8
Pass 4: 2 4 5 7 8 → 2 4 5 7 8 (done)

Code – Bubble Sort

• Outer loop: n iterations
• Inner loop: i iterations
• Loop body: constant
• Total: (n-1 + n-2 + n-3 + … + 1) * const ~ ½ * n^2
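The lecture's own code is not preserved here. The sketch below is one common way to write bubble sort; returning the number of comparisons is an addition made for this illustration, so that the (n-1) + (n-2) + … + 1 count can be observed directly.

```python
def bubble_sort(lst):
    """Sort lst in place in ascending order; return the number of comparisons."""
    comparisons = 0
    # After each pass the largest remaining element sits at index `end`.
    for end in range(len(lst) - 1, 0, -1):
        swapped = False
        for j in range(end):
            comparisons += 1
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        if not swapped:
            break  # no swaps in a full pass: the list is sorted
    return comparisons

data = [7, 2, 8, 5, 4]
count = bubble_sort(data)
print(data)   # [2, 4, 5, 7, 8]
print(count)  # 10, that is 4 + 3 + 2 + 1
```

For a list of 5 elements the worst case is 4 + 3 + 2 + 1 = 10 comparisons, which matches n(n-1)/2 and grows like ½ * n^2.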

Examples of Computing Run-Time Complexity

• Find the maximum value in an unsorted array
• Find the maximum value in a sorted array
• Find the fifth-largest value in a sorted array
• Find a specific value in an unsorted array
• Find a specific value in a sorted array
• Answer n "Fibonacci questions"
• Fibonacci question: what is the K-th value in the Fibonacci sequence?
• Assume that K is bounded to be smaller than MAX

We showed that it is possible to sort in O(n^2). Can we do it faster?

Yes! – Merge Sort

Comparing Bubble Sort with sorted

The Idea Behind Merge Sort

• Sorting a short array is much faster than sorting a long array
• Two sorted arrays can be merged into a combined sorted array quite fast (O(n))
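The two observations can be combined into a short sketch of merge sort. This is an illustrative implementation, not the lecture's code: merge walks both sorted lists once, and merge_sort splits the input in half until the pieces are trivially sorted.

```python
def merge(left, right):
    """Merge two ascending lists into one ascending list in O(n)."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the two lists still has elements left.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

def merge_sort(lst):
    """Return a new ascending list with the elements of lst."""
    if len(lst) <= 1:
        return lst[:]  # a list of 0 or 1 elements is already sorted
    middle = len(lst) // 2
    return merge(merge_sort(lst[:middle]), merge_sort(lst[middle:]))

print(merge([1, 5, 9], [2, 3, 10]))
print(merge_sort([7, 2, 8, 5, 4]))
```

The list is halved about log2(n) times and each level of merging costs O(n), which gives roughly n * log2(n) operations in total.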

Generic Sorting

• We would like to sort all kinds of data types
• Ascending / descending order
• What is the difference between the different cases?
• Same algorithm!
• Are we required to duplicate the same algorithm for each data type?

The Idea Behind Generic Sorting

• Write a single function that will be able to sort all types in ascending and descending order
• What are the parameters?
• The list
• Ascending/descending order
• A comparison function
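A minimal sketch of such a function, using the three parameters listed above. The names generic_sort, default_compare and by_length are assumptions for this illustration, as is the convention that the comparison function returns a negative, zero or positive number.

```python
def default_compare(a, b):
    """Return a negative, zero or positive number for a < b, a == b, a > b."""
    return (a > b) - (a < b)

def generic_sort(lst, ascending=True, compare=default_compare):
    """Return a sorted copy of lst, using bubble sort and the given comparison."""
    result = list(lst)
    sign = 1 if ascending else -1
    for end in range(len(result) - 1, 0, -1):
        for j in range(end):
            # Swap when the pair is out of order for the requested direction.
            if sign * compare(result[j], result[j + 1]) > 0:
                result[j], result[j + 1] = result[j + 1], result[j]
    return result

def by_length(a, b):
    """Hypothetical comparison function: order strings by their length."""
    return len(a) - len(b)

print(generic_sort([3, 1, 2]))
print(generic_sort([3, 1, 2], ascending=False))
print(generic_sort(["pear", "fig", "banana"], compare=by_length))
```

The sorting logic is written once; only the comparison function and the direction change between data types. The built-in sorted offers the same flexibility through its key and reverse parameters.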
