## Topic #3: Analysis of O, THETA, OMEGA. Summations.

### Landau Symbols

Landau symbols have mathematically precise definitions. They have many uses, such as in the analysis of algorithms. These symbols are used to evaluate and to concisely describe the performance of an algorithm, in time and in space.

Five different symbols were discussed in class. They are :

• O (called the Big Oh)
• Ω (upper case Greek letter omega)
• Θ (upper case Greek letter theta)
• o (called the Little Oh)
• ω (lower case Greek letter omega)

But only three of them were properly defined in class :

• The Big Oh is the upper bound of a function. In the case of algorithm analysis, we use it to bound the worst-case running time, or the longest running time possible for any input of size n. We can say that the maximum running time of the algorithm is in the order of Big Oh.

• If a_n and b_n are positive sequences, then a_n = O(b_n)
-if there exist C and n₀ such that a_n ≤ C * b_n for all n ≥ n₀

• Ω is also an order of growth but it is the opposite of the Big Oh : it is the lower bound of a function. We can say that the minimum running time of the algorithm is in the order of Ω.

• If a_n and b_n are positive sequences, then a_n = Ω(b_n)
-if there exist C > 0 and n₀ such that a_n ≥ C * b_n for all n ≥ n₀

• Θ is the rate of growth that is the most interesting of the three. It is composed of both Big Oh and Ω.

• If a_n and b_n are positive sequences, then a_n = Θ(b_n)

-if there exist C₁ and n₀ such that a_n ≤ C₁ * b_n for all n ≥ n₀
-if there exist C₂ > 0 and n₀ such that a_n ≥ C₂ * b_n for all n ≥ n₀

This graph shows the bounds on a given function.
This function can depict the running time of an algorithm.

A few remarks :

1. We only care what happens for large n (n ≥ n₀)
2. O, Ω, Θ "hide" the constant C from us.
3. If a_n is the worst-case time of an algorithm, then O(.) gives worst-case guarantees (good news), while Ω gives a lower bound (bad news)

Here is a simple example to show why C is not needed.

Suppose we know the running time of a given algorithm to be Θ(n²) (as in Insertion Sort or Bubble Sort of n elements). Say we timed the algorithm at n = 1000, and we find that it takes 4 seconds. If we want to find the time for n = 1 million, we don't have to run the algorithm over again. Since the time grows quadratically in this case, the estimated time for the problem should be close to (1000)² * 4 seconds, or 4 million seconds. Had this algorithm been linear ( Θ(n) ), the time would have been 1000 * 4, or 4000 seconds only! In both of these cases (quadratic and linear), the constant C was not needed.
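The extrapolation above can be sketched as a tiny script; the function name and the 4-second figure are just the example's numbers, not a real benchmark. Note how the unknown constant C cancels out of the ratio:

```python
def extrapolate(t_measured, n_measured, n_target, exponent):
    """Scale a measured running time up to a larger input size,
    assuming the time grows like n**exponent. The hidden constant
    C cancels out of the ratio, which is why it is not needed."""
    return t_measured * (n_target / n_measured) ** exponent

# Quadratic algorithm: 4 s at n = 1000, estimate for n = 1 million.
print(extrapolate(4, 1000, 10**6, 2))  # 4000000.0 seconds
# Linear algorithm: same measurement.
print(extrapolate(4, 1000, 10**6, 1))  # 4000.0 seconds
```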

### Complexity Classes

This is a table of Complexity Classes. The row entries are in increasing order of growth.

| Class | Function / Order | Examples |
| --- | --- | --- |
| 1 | Constant time | A[i] |
| log* n | Log-star time | ... |
| log n | Logarithmic time | binary search |
| n | Linear time | printing, listing |
| n log n | ... | heapsort, mergesort |
| n^a | Polynomial time | ... |
| 2^n, 10^n, a^n | Exponential time | security codes |
| n! | Factorial time | TSP |
| e^(n^(1/2)) | Between poly. and exp. | ... |

Remarks on each class follow.

Constant time (1) : The running time of the algorithm doesn't depend on the value of n.

Log-star time (log* n) :

log* n = min { i : log₂ log₂ ... log₂ n ≤ 1 }, where i is the number of times log₂ is applied to n to bring the result down to 1 or below.

For practical purposes, log* n is never more than 5. For example, 2^65536 > 10^80, the number of atoms in the universe!
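A minimal sketch of log* as code, following the base-2 definition above:

```python
from math import log2

def log_star(n):
    """Iterated logarithm: the number of times log2 must be
    applied to n before the result drops to 1 or below."""
    count = 0
    while n > 1:
        n = log2(n)
        count += 1
    return count

print([log_star(n) for n in (1, 2, 4, 16, 65536)])  # [0, 1, 2, 3, 4]
```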

FIGURE 1 shows the values of log star from 0 to 5.

Logarithmic time (log n), example: binary search.

Every time you go through the recursion (binary search uses recursion), the problem is reduced by half. So the greatest number of times you can go through the recursion is log₂ n.

Linear time (n), examples: printing, listing. The running time of the algorithm is proportional to the value of n.

n log n time, examples: heapsort, mergesort.
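As an illustration, here is an iterative binary search; the notes describe the recursive form, but this version halves the search range the same way, so it also makes at most about log₂ n passes:

```python
def binary_search(a, target):
    """Return an index of target in the sorted list a, or -1 if absent.
    Each pass through the loop halves the remaining range."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```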

Comparisons :
1 + 2 + ... + n = ∑_{i=1}^{n} i = n(n + 1)/2 = n²/2 + n/2 = O(n²)

Polynomial time (n^a). Hypothesis : exponential is larger than polynomial.

Compute the limit of c^n / n^k for c > 1. To do this, find a lower bound for c^n :

c^n = e^(n * ln c) ≥ (n * ln c)^(2k) / (2k)!

This lower bound comes from the Taylor series expansion of e^x (keep just the (2k)-th term). Clearly, for c > 1 and a proper choice of a constant, this bound is Ω(n^(2k)).

Substituting this lower bound for c^n in the original problem shows that the limit we desire is at least the limit of n^(2k) / n^k = n^k. This limit is clearly ∞.

As a result, exponentials clearly swamp polynomials.
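This is not a proof, but a quick numerical sanity check of the hypothesis; c = 2 and k = 3 are arbitrary choices:

```python
# Watch c**n / n**k grow without bound for c > 1.
c, k = 2, 3
ratios = [c**n / n**k for n in (10, 20, 40, 80)]
print(ratios)
# Each ratio dwarfs the previous one.
assert all(prev < nxt for prev, nxt in zip(ratios, ratios[1:]))
```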

Exponential time (2^n, 10^n, a^n), example: security codes.

Hans Bremermann was an early computer scientist at Berkeley. In 1962, based upon physical reasoning, he stated that no computer, biological or artificial, could ever process more than 2 * 10^47 bits of information per second per gram.

Let's assume that this is true. Now, assume that we protect a computer system with a code consisting of 40 decimal digits. Breaking this code by brute force would in the worst case require trying out 10^40 combinations. A computer working at Bremermann's speed could do that in less than a second, even if it weighed only 1 gram.

Now, if we protected it with 100 digits, we would be safe, even if we had a 10-ton computer (10^7 grams), for at least

10^100 * log₂(10) / (2 * 10^47 * 10^7) seconds.

That is basically until eternity.

Factorial time (n!), example: TSP.

TSP is the Travelling Salesman Problem. Let's say you are a salesperson and you have to visit each of n cities to sell your product. You want to visit all the cities and return to the city you came from, and you want to find the shortest route that does so.

Obviously, it would be naive to try all the permutations. There are (n-1)! permutations, and each permutation requires computation worth n (add n distances), for a total of n!. Many have tried their hand at this problem, but few have found a good solution for it.
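A brute-force sketch of this naive approach; the distance matrix below is a made-up example of four cities at the corners of a unit square:

```python
from itertools import permutations

def tsp_brute_force(dist):
    """Length of the shortest tour that starts and ends at city 0,
    found by trying all (n-1)! orderings of the other cities.
    Feasible only for very small n, as the notes explain."""
    n = len(dist)
    best = float("inf")
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        length = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        best = min(best, length)
    return best

# Four cities at the corners of a unit square (diagonal = sqrt(2)).
r2 = 2 ** 0.5
square = [[0, 1, r2, 1],
          [1, 0, 1, r2],
          [r2, 1, 0, 1],
          [1, r2, 1, 0]]
print(tsp_brute_force(square))  # 4 (the perimeter)
```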

Between polynomial and exponential (e^(n^(1/2))).

Is there life between polynomial and exponential?

e^(n^(1/2)) is one such function : for every integer a it eventually grows faster than n^a, yet it grows more slowly than any exponential c^n with c > 1.

FIGURE 1

| n | log* n |
| --- | --- |
| 1 | 0 |
| 2^1 = 2 | 1 |
| 2^2 = 4 | 2 |
| 2^4 = 16 | 3 |
| 2^16 = 65536 | 4 |
| 2^65536 | 5 |

### Summations

Summations are used extensively in computer science as a mathematical tool to help analyze the running time of algorithms. In computer science, most algorithms are completed through the repetition of simple actions, or loops. Loops are created in two different ways: first, by constructs such as repeat, while, and for, and second, by recursion. In order to analyze the times taken by looping programs, we have to deal with summations, as one must sum the times taken in various iterations. In order to analyze the times taken by recursive programs, we have to deal with recurrences, which is the subject of the next lecture.

There are five basic methods for dealing with summations.

| Method | Example Easily Solved by the Given Method |
| --- | --- |
| Rough bounding | ∑_{i=1}^{n} i |
| Exact computation | ∑_{i=1}^{n} i |
| Induction | ∑_{i=0}^{n} 3^i ≤ C * 3^n |
| Telescoping series | ∑_{i=1}^{n} 1 / (i * (i + 1)) |
| Comparison with an integral | ∑_{i=1}^{n} 1/i |
1. #### Rough Bounding

Given a summation, we would like to bound each term by something simple, and thus make the summation more manageable. While this method will not determine an exact measure of the running time of the algorithm which the summation is supposed to represent, it is useful in achieving bounds to solve Big Oh notation questions. In the example below, the summation represents the running time of many different algorithms, among them selection sort.

• Example :
∑_{i=1}^{n} i
= 1 + 2 + .... + n
≤ n + n + .... + n
= n²

so ∑_{i=1}^{n} i = O(n²).

2. #### Exact Computations

As the name of the method suggests, we would like to compute an exact numerical value for the summation.

• Example 1 :
∑_{i=1}^{n} i = n(n + 1) / 2

Here is a short proof of this identity. Pair off the n numbers such that the first element is grouped with the last (nth) element, the second with the (n-1)th element, and so on. For example, 1 is grouped with n, 2 is grouped with (n-1), 10 is grouped with (n-9), etc. Each pair sums to (n+1), and there are n/2 such pairs, so the sum is n(n+1)/2. (When n is odd, the unpaired middle element (n+1)/2 contributes exactly the "half pair".)
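The identity is easy to spot-check in code, comparing the closed form against plain brute-force summation:

```python
def sum_formula(n):
    """Closed form n(n+1)/2 for 1 + 2 + ... + n."""
    return n * (n + 1) // 2

# Check the pairing argument against direct summation.
for n in (1, 2, 7, 100):
    assert sum_formula(n) == sum(range(1, n + 1))
print(sum_formula(100))  # 5050
```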

• Example 2 : Geometric Series

From high school mathematics we know that :

1 + x + x^2 + ... + x^n = (x^(n+1) - 1) / (x - 1)

So if | x | < 1 then :

∑_{i=0}^{∞} x^i = 1 / (1 - x)

since the limit of x^(n+1), as n approaches ∞, is 0.

Also, still for | x | < 1, we can differentiate the right hand side and left hand side of the above equation with respect to x. As a result,

∑_{i=0}^{∞} i * x^(i-1) = (1 - x)^(-2)
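Both closed forms can be spot-checked numerically with truncated series; x = 0.5 and 60 terms are arbitrary choices, and the neglected tail is negligible at that length:

```python
x, terms = 0.5, 60

# Partial sums of the geometric series and of its term-by-term derivative.
geometric = sum(x**i for i in range(terms))
derivative = sum(i * x**(i - 1) for i in range(terms))

print(geometric, 1 / (1 - x))        # both very close to 2
print(derivative, 1 / (1 - x) ** 2)  # both very close to 4
```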

Here is a brainteaser for you to solve by yourself.

What is ∑_{i=0}^{∞} i^2 * x^i ?
Hint: Differentiate once more.
3. #### Induction

Induction can be used to prove a certain statement. We say it is "induction with respect to a parameter n", if we first show it for n = 0. Then we assume the statement to be true up to but not including n, and finally we verify that statement for n.

• Example :
∑_{i=0}^{n} 3^i ≤ C * 3^n

In this case, C is some constant to be determined.
We believe the statement to be true because an exponentially growing series usually has a huge dominating last term.

Step 1 : The inequality is trivially true for n = 0 provided C ≥ 1
Step 2 : Assume that the inequality is true for all integers up to and including n-1
Step 3 : Verify that it is true for n

∑_{i=0}^{n} 3^i
= ∑_{i=0}^{n-1} 3^i + 3^n
≤ C * 3^(n-1) + 3^n (induction hypothesis)

The problem now boils down to whether the following inequality is satisfied

C * 3^(n-1) + 3^n ≤ C * 3^n

Dividing both sides of the inequality by 3^(n-1) yields :

C + 3 ≤ 3 * C
3 ≤ 2 * C
3/2 ≤ C

Thus the inequality is satisfied as long as C ≥ 3/2. We showed the original statement with C = 3/2. So indeed, the last term in an exponentially increasing sum is about the same order of magnitude as the entire sum.
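A quick empirical check of the bound with C = 3/2; of course, checking finitely many n proves nothing, the induction above is the proof:

```python
C = 3 / 2

# sum_{i=0}^{n} 3**i should never exceed C * 3**n.
for n in range(0, 15):
    total = sum(3**i for i in range(n + 1))
    assert total <= C * 3**n
print("bound holds for n = 0..14")
```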

4. #### Telescoping Series

A telescoping series is a series where each term can be expressed as a difference in such a way that most of the terms cancel each other out.

• Example :
∑_{i=1}^{n} 1 / (i * (i + 1))
= ∑_{i=1}^{n} [ (1 / i) - (1 / (i + 1)) ]
= 1 - [ 1 / (n + 1) ]
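With exact rational arithmetic the cancellation can be verified directly:

```python
from fractions import Fraction

def telescoping(n):
    """sum_{i=1}^{n} 1/(i(i+1)), computed term by term."""
    return sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))

# Matches the closed form 1 - 1/(n+1) exactly.
for n in (1, 5, 50):
    assert telescoping(n) == 1 - Fraction(1, n + 1)
print(telescoping(5))  # 5/6
```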
5. #### Comparison With An Integral

The basic principle behind this method is that ∑_{i=1}^{n} f(i) ≈ ∫_{1}^{n} f(x) dx, with the direction of the inequality depending on whether f is increasing or decreasing.

• Example : Harmonic Series
∑_{i=1}^{n} (1/i)
≥ ∫_{1}^{n} (1/x) dx
= ln(n)

Picture n rectangles, each of width 1 and height 1/i for i = 1 to n, drawn over the graph of 1/x. The sum of the areas of these rectangles is less than 1 + ∫_{1}^{n} (1/x) dx. Together with the matching lower bound ∫_{1}^{n+1} (1/x) dx, this gives :

ln(n+1)
≤ ∑_{i=1}^{n} (1/i)
≤ 1 + ∫_{1}^{n} (1/x) dx
= 1 + ln(n)
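The two bounds can be spot-checked numerically:

```python
from math import log

# ln(n+1) <= H_n <= 1 + ln(n) for the harmonic numbers H_n.
for n in (1, 10, 1000):
    h = sum(1 / i for i in range(1, n + 1))
    assert log(n + 1) <= h <= 1 + log(n)
print("harmonic bounds hold")
```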

### Chip Testing Problem

This is Problem 4.7 from Cormen et al (our course textbook).

This problem is motivated by the VLSI industry, where millions of VLSI chips must be tested for defects. The idea is to let the chips test each other using built-in circuitry. Note that VLSI stands for Very Large Scale Integration, the integrated-circuit chip technology used to fabricate most microprocessors today.

There are two kinds of chips - good and bad. Good chips always tell the truth, i.e. will report accurately whether the tested chip is good or bad. Bad chips may say anything about the other chip, and thus cannot be trusted. Our goal is to identify all good chips. The problem must be solved in O(n) time.

It is assumed that there are more good chips than bad. In other words, |G| > |B|. Our computational model states that each chip tested counts for one time unit.

It suffices to find one good chip in O(n) time. This is because one good chip can then identify all good chips in O(n) time.

#### A naive method

We are going to test each chip individually by comparing it against all remaining chips. We take votes from the remaining chips and let the majority decide. In case of a tie, good wins. If a chip is declared bad, it is junked and we continue.

Time Analysis :

(n-1) + (n-2) + (n-3) + ... + (n/2) ≤ n + n + .... + n = n²/2 = O(n²)

However such time is unacceptable, for it greatly exceeds O(n) time.

Definitions :

• G : set of all good chips
• B : set of all bad chips
• B-B : pair of chips that has been declared "bad, bad"
• B-G : pair of chips that has been declared "bad, good"
• G-G : pair of chips that has been declared "good, good"

Here is one iteration of the divide and conquer paradigm:

If n = 1 or n = 2,

then return one chip (it is good).

If n is odd,

test odd chip.

If odd chip is good,

return the chip (it is good).

with remaining chips do :
• pair all chips
• junk all B-G, B-B pairs
• junk one chip of every G-G pair
• find (recursively) one good chip among the remaining chips

#### Remarks :

1. If before the iteration, |G| > |B|, then after the iteration, |G| > |B|. Before the pairing step (with an even number of chips), we have |G| ≥ |B| + 2. Each B-B, B-G pair has at least one bad chip, so we junk at least as many bad chips as good ones. So, we still have |G| ≥ |B| + 2. Each surviving G-G pair has either two good chips or two bad ones; what matters is that both chips are alike. Thus, the halving step leaves us with |G| ≥ |B| + 1.

2. Time Complexity :
If T(n) is the worst-case time for n chips, then T(2) = T(1) = 0 and

T(n) ≤ n - 1 + floor(n/2) + T(floor(n/2))
< 3n/2 + T(floor(n/2))
< 3n/2 + 3n/4 + T(floor(n/4))
< 3n/2 + 3n/4 + 3n/8 + T(floor(n/8)) ...

So that clearly

T(n) < (3n/2) * (1 + 1/2 + 1/4 + 1/8 + ...) = 3n

where we used a summation formula for the geometric series seen above

### Java Applet on Chip Testing Problem

Here is the source code of the java applet : ShowChip.java

### References

• CORMEN, Thomas H., LEISERSON, Charles E., and RIVEST, Ronald L., Introduction to Algorithms, The MIT Press, Cambridge, Massachusetts, 1990.
• VAN DER LINDEN, Peter, Just Java, Sun Microsystems Inc., Mountain View, California, 1996, p. 350.

### Web page designers

This section is provided to give readers the opportunity to contact us, to point out errors or omissions, to ask questions, and to suggest improvements. An explanation of each team member's role in the creation of the web page is included.

The creators are, in no particular order of importance except alphabetical :

Last updated February 17, 1997 by Andrew Ma.

Copyright © 1997, Anastassios Soumas, Anatoly Likhatchev, Andrew Bierbryer and Andrew Ma. All rights reserved. Reproduction of all or part of this work is permitted for educational or research use provided that this copyright notice is included in any copy. Disclaimer: this collection of notes is experimental, and does not serve as-is as a substitute for attendance in the actual class.