Department of Computer Science

Winter 1997 Class Notes for 308-251B

Landau symbols have mathematically precise definitions. They have many uses, such as in the analysis of algorithms. These symbols are used to evaluate and to concisely describe the performance of an algorithm, in time and in space.

Five different symbols were discussed in class. They are :

- O (called the *Big Oh*)
- Ω (upper case Greek letter *OMEGA*)
- Θ (upper case Greek letter *THETA*)
- o (called the *Little Oh*)
- ω (lower case Greek letter *omega*)

But only three of them were properly defined in class :

The *Big Oh* is the **upper bound** of a function. In the case of algorithm analysis, we use it to bound the worst-case running time, or the longest running time possible for *any* input of size n. We can say that the **maximum** running time of the algorithm is in the order of the *Big Oh*.

Ω is also an order of growth, but it is the opposite of the *Big Oh* : it is the **lower bound** of a function. We can say that the **minimum** running time of the algorithm is in the order of Ω.

Θ is the rate of growth that is the most interesting of the three. It is composed of **both** the *Big Oh* and Ω.

If a_{n} and b_{n} are positive sequences, then a_{n} = O(b_{n}) if there exist C > 0 and n_{0} such that a_{n} ≤ C b_{n} for all n ≥ n_{0}.

If a_{n} and b_{n} are positive sequences, then a_{n} = Ω(b_{n}) if there exist C > 0 and n_{0} such that a_{n} ≥ C b_{n} for all n ≥ n_{0}.

If a_{n} and b_{n} are positive sequences, then a_{n} = Θ(b_{n}) if both conditions hold :

- there exist C, n_{0} such that a_{n} ≤ C b_{n} for all n ≥ n_{0}, and
- there exist C', n_{0}' such that a_{n} ≥ C' b_{n} for all n ≥ n_{0}'.

This graph shows the bounds on a given function.

This function can depict the running time of an algorithm.

A few remarks :

- We only care what happens for large n (n ≥ n_{0}).
- O, Ω and Θ "hide" the constant C from us.
- If a_{n} is the worst-case time of an algorithm, then O(.) gives worst-case guarantees (good news), while Ω gives a lower bound (bad news).

Here is a simple example to show why C is often not needed.

Suppose we know the running time of a given algorithm to be Θ(n^{2}) (as in insertion sort or bubble sort of n elements). Say we timed the algorithm at n = 1000, and we found that it takes 4 seconds. If we want to find the time for n = 1 million, we don't have to run the algorithm over again. The input has grown by a factor of 1000, and since the time grows quadratically in this case, the estimated time for the problem should be close to (1000)^{2} * 4 seconds, or 4 million seconds. Had this algorithm been linear ( Θ(n) ), the time would have been 1000 * 4, or 4000 seconds only! In both of these cases (quadratic and linear), the constant C was not needed.
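This back-of-the-envelope scaling argument can be sketched in code. The helper `estimate_time` below is hypothetical (not part of the course material); it simply applies the growth-rate ratio from the example above.

```python
# Estimate the running time at a new input size from one timing measurement,
# assuming we already know the algorithm's order of growth.
# Illustrative sketch only; estimate_time is a hypothetical helper.

def estimate_time(t_measured, n_measured, n_target, growth):
    """Scale a time measured at size n_measured up to size n_target."""
    ratio = n_target / n_measured
    if growth == "linear":        # Theta(n): time scales with the ratio
        return t_measured * ratio
    if growth == "quadratic":     # Theta(n^2): time scales with the ratio squared
        return t_measured * ratio ** 2
    raise ValueError("unknown growth rate")

# The example from the notes: 4 seconds at n = 1000.
print(estimate_time(4, 1_000, 1_000_000, "quadratic"))  # 4000000.0 seconds
print(estimate_time(4, 1_000, 1_000_000, "linear"))     # 4000.0 seconds
```

Note that the hidden constant C cancels in the ratio, which is exactly why it was not needed above.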

This is a table of Complexity Classes. The row entries are in increasing order of growth.

Class | Function/Order | Examples | Remarks |
---|---|---|---|
1 | Constant time | A[i] | The running time of the algorithm doesn't depend on the value of n. |
log*n | Log star time | ... | log*n = min { i : log log ... log n ≤ 1, with log applied i times }. For practical purposes, log*n is never more than 5 : for example, log*(2^{65536}) = 5. FIGURE 1 shows the values of log star from 0 to 5. |
log_{a}n | Logarithmic time | binary search | Binary search : every time you go through the recursion (binary search uses recursion), the problem is reduced in half. So the largest number of times you can go through the recursion is log_{2}n. |
n | Linear time | printing, listing | The running time of the algorithm is proportional to the value of n. |
n * log(n) | ... | heapsort, mergesort | ... |
n^{2} | Quadratic time | bubblesort, insertion sort | Why is insertion sort quadratic? Comparisons : 1 + 2 + ... + (n-1) = n(n-1)/2 = Θ(n^{2}). |
n^{a} | Polynomial time | ... | Hypothesis : exponential is larger than polynomial. Compute the limit of n^{a} / c^{n} as n → ∞, for any c > 1. By the Taylor series expansion of e^{x}, c^{n} = e^{n ln c} ≥ (n ln c)^{a+1} / (a+1)!. Substituting, n^{a} / c^{n} ≤ (a+1)! / ( n (ln c)^{a+1} ) → 0. As a result, exponentials clearly swamp polynomials. |
2^{n}, 10^{n}, a^{n} | Exponential time | security codes | Hans Bremermann was an early computer scientist in Berkeley. In 1962, based upon physical reasoning, he stated that no computer, biological or artificial, could ever process more than 2 * 10^{47} bits per second per gram of its mass. Let's assume that this is true. Now, assume that we protect a computer system with a code consisting of 40 decimal digits. Breaking this code by brute force would in the worst case require trying out 10^{40} codes. Now, if we protected it with 100 digits, we would be safe, even if we had a 10 ton computer, for at least 10^{100} * log_{2}(10) / (2 * 10^{47} * 10^{7}) seconds. That is basically until eternity. |
n! | Factorial time | TSP | TSP is the traveling salesman problem : given the distances between n cities, find a shortest tour through all of them. Obviously, it would be naive to try all the permutations. There are (n-1)! permutations, and each permutation requires computation worth n (add n distances), for a total of about n!. Many have tried their hand at this problem, but few have found a good solution for it. |
e^{n^{1/2}} | Between Poly. and Exp. | ... | Is there life between polynomial and exponential? Yes : e^{√n} grows faster than any polynomial n^{a} but slower than any exponential c^{n} with c > 1. |

FIGURE 1 : values of log*n

n | log*n |
---|---|
1 | 0 |
2^{1} = 2 | 1 |
2^{2} = 4 | 2 |
2^{4} = 16 | 3 |
2^{16} = 65536 | 4 |
2^{65536} | 5 |
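The table above can be reproduced with a short function; this is a minimal sketch of the iterated logarithm, not code from the course.

```python
from math import log2

def log_star(n):
    """Iterated logarithm: how many times log2 must be applied before n <= 1."""
    count = 0
    while n > 1:
        n = log2(n)
        count += 1
    return count

# Reproduces FIGURE 1 (2**65536 is too large for float log2, so it is omitted).
for n in [1, 2, 4, 16, 65536]:
    print(n, log_star(n))   # prints 0, 1, 2, 3, 4 respectively
```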

Summations are used extensively in computer science as a mathematical tool to help analyze the running time of algorithms. In computer science, most algorithms are completed through the repetition of simple actions, or loops. Loops are created in two different ways: first, by constructs such as repeat, while, and for, and second, by recursion. In order to analyze the times taken by looping programs, we have to deal with summations, as one must sum the times taken in the various iterations. In order to analyze the times taken by recursive programs, we have to deal with recurrences, which is the subject of the next lecture.

There are five basic methods for dealing with summations.

Method | Example Easily Solved by the Given Method |
---|---|
Rough Bounding | Σ_{i=1 to n} i |
Exact Computations | Σ_{i=1 to n} i |
Induction | Σ_{i=0 to n} 3^{i} ≤ C * 3^{n} |
Telescoping Series | Σ_{i=1 to n} 1 / (i * (i + 1)) |
Comparison With an Integral | Σ_{i=1 to n} 1/i |

#### Rough Bounding

Given a summation, we would like to bound each term by something simple, and thus make the summation more manageable. While this method will not determine an exact measure of the running time of the algorithm which the summation is supposed to represent, it is useful for achieving bounds in Big Oh notation. In the example below, the summation represents the running time of many different algorithms, among them selection sort.

- Example :

  Σ_{i=1 to n} i = 1 + 2 + ... + n ≤ n + n + ... + n = n^{2}

  so Σ_{i=1 to n} i = O(n^{2}).
#### Exact Computations

As the name of the method suggests, we would like to compute an exact numerical value for the summation.

- Example 1 :

  Σ_{i=1 to n} i = n (n + 1) / 2

  Here is a short proof of this identity. Pair off the n numbers such that the first element is grouped with the last element (the n^{th} element), the second is grouped with the (n-1)^{th} element, and so on. For example, 1 is grouped with n, 2 is grouped with (n-1), 10 is grouped with (n-9), etc. Clearly, each pair sums to (n+1), and there are (n/2) such pairs. Thus the equality clearly follows.
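The closed form can be checked mechanically against a direct summation; a minimal sketch, with `sum_first_n` as a hypothetical helper name.

```python
def sum_first_n(n):
    """Closed form from the pairing argument: n(n+1)/2 (integer division is exact)."""
    return n * (n + 1) // 2

# Compare against a direct summation for a few values of n.
for n in [1, 5, 10, 100]:
    assert sum_first_n(n) == sum(range(1, n + 1))
print(sum_first_n(100))  # 5050
```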

- Example 2 : Geometric Series

  From high school mathematics we know that :

  1 + x + x^{2} + ... + x^{n} = (x^{n+1} - 1) / (x - 1)

  So if | x | < 1 then :

  Σ_{i=0 to ∞} x^{i} = 1 / (1 - x)

  since the limit of x^{n+1}, as n approaches ∞, is 0.

  Also, if | x | < 1, we can differentiate the right hand side and left hand side of the above equation with respect to x. As a result,

  Σ_{i=0 to ∞} i x^{i-1} = (1 - x)^{-2}

  Here is a brainteaser for you to solve by yourself : what is Σ_{i=0 to ∞} x^{i} i^{2} ? Hint : differentiate once more.
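The differentiated identity can be sanity-checked numerically; the truncation point 200 below is an arbitrary choice, and this sketch is not part of the original notes.

```python
# Numeric check of the differentiated geometric series:
# sum over i >= 0 of i * x^(i-1) equals 1 / (1 - x)^2 when |x| < 1.
x = 0.5
partial = sum(i * x ** (i - 1) for i in range(1, 200))  # the i = 0 term is zero
closed_form = (1 - x) ** -2
print(partial, closed_form)  # both approximately 4.0
```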
#### Induction

Induction can be used to prove a certain statement. We say it is "induction with respect to a parameter n", if we first show it for n = 0. Then we assume the statement to be true up to but not including n, and finally we verify that statement for n.

- Example :

  Σ_{i=0 to n} 3^{i} ≤ C * 3^{n}

  In this case, C is some constant to be determined. We believe the statement to be true because an exponentially growing series usually has a huge dominating last term.

  Step 1 : The inequality is trivially true for n = 0 provided C ≥ 1.

  Step 2 : Assume that the inequality is true for all integers up to and including n-1.

  Step 3 : Verify that it is true for n :

  Σ_{i=0 to n} 3^{i} = Σ_{i=0 to n-1} 3^{i} + 3^{n} ≤ (C * 3^{n-1}) + 3^{n} *(by the induction hypothesis)*

  The problem now boils down to whether the following inequality is satisfied :

  (C * 3^{n-1}) + 3^{n} ≤ C * 3^{n}

  Dividing both sides of the inequality by 3^{n-1} yields :

  C + 3 ≤ 3 * C

  3 ≤ 2 * C

  3/2 ≤ C

  Thus the inequality is satisfied as long as 3/2 ≤ C. We showed the original statement with C = 3/2. So indeed, the last term in an exponentially increasing sum is about the same order of magnitude as the entire sum.
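The induction result can be checked for small n with exact integer arithmetic; a quick sketch, not a substitute for the proof above.

```python
# Empirical check of the induction result:
# sum_{i=0}^{n} 3^i  <=  (3/2) * 3^n   for all n >= 0.
# Multiplying both sides by 2 keeps everything in exact integers.
for n in range(0, 20):
    total = sum(3 ** i for i in range(n + 1))
    assert 2 * total <= 3 * 3 ** n   # same as: total <= (3/2) * 3^n
print("bound holds for n = 0 .. 19")
```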
#### Telescoping Series

A telescoping series is a series where each term can be expressed as a difference in such a way that most of the terms cancel each other out.

- Example :

  Σ_{i=1 to n} 1 / (i * (i + 1))

  = Σ_{i=1 to n} [ (1 / i) - (1 / (i + 1)) ]

  = 1 - [1 / (n + 1)]
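The cancellation can be verified exactly with rational arithmetic; a small sketch using Python's `fractions` module.

```python
from fractions import Fraction

# Telescoping check: sum_{i=1}^{n} 1/(i(i+1)) == 1 - 1/(n+1), exactly.
def telescoped(n):
    return sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))

for n in [1, 2, 10, 50]:
    assert telescoped(n) == 1 - Fraction(1, n + 1)
print(telescoped(10))  # 10/11
```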
#### Comparison With An Integral

The basic principle behind this method is that, for a decreasing positive function f,

Σ_{i=1 to n} f(i) ≥ ∫_{1}^{n} f(x) dx

- Example : Harmonic Series

  Σ_{i=1 to n} (1/i) ≥ ∫_{1}^{n} (1/x) dx = ln (n)

  In the above diagram, we are taking a finite number of rectangles, each of width 1 and height 1/i, for i = 1 to n. Inspection of the diagram clearly shows that the sum of all these rectangles is less than 1 + ∫_{1}^{n} (1/x) dx. Together with the lower bound ∫_{1}^{n+1} (1/x) dx = ln(n+1), this gives :

  ln(n+1) ≤ Σ_{i=1 to n} (1/i) ≤ 1 + ∫_{1}^{n} (1/x) dx = 1 + ln(n)
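Both integral bounds on the harmonic series can be checked numerically; a short sketch, not part of the original notes.

```python
from math import log

# Check the integral bounds on the harmonic series:
# ln(n+1) <= sum_{i=1}^{n} 1/i <= 1 + ln(n).
for n in [1, 10, 100, 1000]:
    h = sum(1 / i for i in range(1, n + 1))
    assert log(n + 1) <= h <= 1 + log(n)
print("bounds hold")
```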

### Chip Testing Problem

This is Problem 4.7 from Cormen et al (our course textbook).

This problem is motivated by the VLSI industry, where millions of VLSI chips must be tested for defects. The idea is to let the chips test each other using built-in circuitry. Note that VLSI stands for *very large scale integration*, which is the integrated-circuit chip technology used to fabricate most microprocessors today.

There are two kinds of chips - good and bad. Good chips always tell the truth, i.e. they will report accurately whether the tested chip is good or bad. Bad chips may say anything about the other chip, and thus cannot be trusted. Our goal is to identify all good chips. The problem must be solved in O(n) time.

It is assumed that there are more good chips than bad. In other words, |G| > |B|. Our computational model states that each chip tested counts for one time unit.

It suffices to identify one good chip in time O(n). This is because one good chip can then identify all good chips in O(n) additional time.

#### A naive method

We are going to test each chip individually by comparing it against all remaining chips. We take votes from the remaining chips and let the majority decide. In case of equality, good wins. If a chip is declared bad, it is junked and we continue.

Time Analysis :

(n-1) + (n-2) + (n-3) + ... + (n/2) ≤ n + n + ... + n = (n^{2})/2 = O(n^{2})

However such time is unacceptable, for it greatly exceeds O(n) time.

#### Divide and Conquer Paradigm

Definitions :

- G : set of all good chips
- B : set of all bad chips
- B-B : pair of chips that has been declared "bad, bad"
- B-G : pair of chips that has been declared "bad, good"
- G-G : pair of chips that has been declared "good, good"

Here is one iteration of the divide and conquer paradigm:

If n = 1 or n = 2 :

- return one chip (it is good).

If n is odd, test the odd chip :

- if the odd chip is good, return it (it is good).
- if the odd chip is bad, junk it.

With the remaining (even number of) chips do :

- pair all chips
- junk all B-G, B-B pairs
- junk one chip of every G-G pair
- find (recursively) one good chip among the remaining chips
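The iteration above can be simulated in a few lines. This is an illustrative sketch, not the textbook's code : chips are modelled as booleans, and bad chips here answer at random, which is just one possible adversary (the counting argument in the remarks below shows the method works for any adversary).

```python
import random

def find_good_chip(truth):
    """Return the index of one good chip. truth[i] is True iff chip i is good.

    Assumes strictly more good chips than bad ones. Good chips report
    truthfully; bad chips answer at random (one possible adversary).
    Sketch of the lecture's divide-and-conquer method.
    """
    def says_good(tester, tested):
        return truth[tested] if truth[tester] else random.random() < 0.5

    chips = list(range(len(truth)))
    while len(chips) > 2:
        if len(chips) % 2 == 1:             # odd number of chips: test the odd one
            odd = chips.pop()
            votes = sum(says_good(c, odd) for c in chips)
            if 2 * votes >= len(chips):     # majority says good (ties -> good)
                return odd                  # a declared-good odd chip is good
            # otherwise the odd chip is bad: it stays junked, fall through
        survivors = []
        for a, b in zip(chips[0::2], chips[1::2]):
            if says_good(a, b) and says_good(b, a):   # a declared G-G pair
                survivors.append(a)         # keep one chip of every G-G pair
        chips = survivors                   # B-B and B-G pairs are junked
    return chips[0]                         # 1 or 2 chips left: good by invariant

truth = [True, False, True, True, False, True, True]   # 5 good, 2 bad
assert truth[find_good_chip(truth)]
```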

#### Remarks :

If before the iteration, |G| > |B|, then after the iteration, |G| > |B|. Before the pairing step, we have |G| ≥ |B| + 2 (the number of chips is even at that point, so the difference, being positive and even, is at least 2). Each B-B or B-G pair has at least one bad chip, so we junk at least as many bad chips as good ones; thus we still have |G| ≥ |B| + 2 going into the halving step. Each declared G-G pair has either two good chips or two bad ones; what matters is that both chips are alike. Thus, the halving step leaves us with |G| ≥ |B| + 1.

Time Complexity :

If T(n) is the worst-case time for n chips, then T(2) = T(1) = 0 and

T(n) ≤ n - 1 + floor(n/2) + T(floor(n/2))

< 3n/2 + T(floor(n/2))

< 3n/2 + 3n/4 + T(floor(n/4))

< 3n/2 + 3n/4 + 3n/8 + T(floor(n/8)) ...

So that clearly

- T(n) < 3n/2 (1 + 1/2 + 1/4 + 1/8 + ...) = 3n

where we used the summation formula for the geometric series seen above.
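The bound T(n) < 3n can also be checked by evaluating the recurrence directly; a small sketch, taking the recurrence at its worst case (equality).

```python
from functools import lru_cache

# Worst case of the recurrence: T(1) = T(2) = 0 and
# T(n) = n - 1 + floor(n/2) + T(floor(n/2)).
@lru_cache(maxsize=None)
def T(n):
    if n <= 2:
        return 0
    return n - 1 + n // 2 + T(n // 2)

for n in range(1, 10_000):
    assert T(n) < 3 * n      # the claimed linear bound
print(T(1000))               # 2978, comfortably below 3 * 1000
```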

### Java Applet on Chip Testing Problem

Here is the source code of the java applet : ShowChip.java

### Links to related material

- Design and Analysis of Algorithms, Department of Information of CS of University of California, Irvine : ICS 161 : Lecture notes for Jan 25, 1996.
- Advanced Data Structures & Algorithms, Department of Computer and Information Sciences at the New Jersey Institute of Technology : Spring 1997 Lecture notes.
- Algorithm Analysis and Complexity, University of Guelph, Data Structures, Course 27-242, Prof. David Calvert.
- Analyzing Algorithm Performance, Warren Harrison, Computer Science Professor at Portland State University, Spring 1996.
- Asymptotic Complexity, University of Ottawa, DSS T26 Data Structure Course, Prof. Denis Duchier.
- Best, Worst and Average cases, University of Ottawa, DSS T26 Data Structures Course, Prof. Denis Duchier.
- Big Oh notation as background for the Euler-Mascheroni constant
- Examples of Big Oh notation for Complexity Classes, University of Ottawa, DSS T26, Data Structures Course, Prof. Denis Duchier.

### References

- CORMEN, Thomas H., LEISERSON, Charles E. and RIVEST, Ronald L., *Introduction to Algorithms*, The MIT Press, Cambridge, Massachusetts, 1990.
- VAN DER LINDEN, Peter, *Just Java*, Sun Microsystems Inc., Mountain View, California, 1996, p. 350.
- Mathematical symbols made by Amy Chow.

### Web page designers

This section is provided to give readers the opportunity to contact us, to point out errors or omissions, to ask questions, and to suggest improvements. An explanation of the role each team member played in the creation of the web page is included.

The creators are, in no particular order of importance except alphabetical :

- Anastassios Soumas (asouma@cs.mcgill.ca) - Java Applet Programmer, References and Links
- Anatoly Likhatchev (tolik@dsuper.net) - Image and Graph Designer, Chip Testing Problem
- Andrew Bierbryer (bnss@musicb.music.ca) - Notes Compilation and Research, Links, HTML Editing
- Andrew Ma (hurc@cs.mcgill.ca) - HTML Generation and Editing, Notes Compilation

Last updated February 17, 1997 by Andrew Ma.

Copyright © 1997, Anastassios Soumas, Anatoly Likhatchev, Andrew Bierbryer and Andrew Ma. All rights reserved. Reproduction of all or part of this work is permitted for educational or research use provided that this copyright notice is included in any copy. Disclaimer: this collection of notes is experimental, and does not serve as-is as a substitute for attendance in the actual class.