Given a set of "n" unordered numbers we want to find the "k^{th}" smallest number. (k is an integer between 1 and n)
In the case of problems involving sorting, searching, and selection, a good indicator of time is the number of comparisons. Therefore, we will set
Algorithm Step  Description of the algorithm step  Running Time 
1  Sort the n elements using mergesort or heapsort  O(n log_{2}n) 
2  Return the k^{th}smallest element  0 
Total Running Time of the algorithm=  O(n log_{2}n) 
Selection can be performed in O(1) time in a sorted array. However, data are not always sorted with respect to the key that is used for selection. Example: we may sort data by name, and ask for the k^{th} smallest phone number
In this lecture, we will study two algorithms with linear worstcase and averagecase complexities. The naive algorithm above is simply not comptetitive.
This algorithm will find the k^{th}smallest in O(n) worst case time.
It is also called the medianfinding algorithm.
The method uses a divide & conquer strategy to solve the problem. Like the problem with the good and bad chips where we "threw away" 50% of the chips, if we can throw away 30% or even 10% without too much complication, we're cutting down the running time substantially. Therefore, we must efficiently eliminate as much as we can from the competition.
Algorithm Step  Description of the Algorithm step  Bound on the running time 
0  If n is small, for example n<6, just sort and return the k ^{th}smallest number.  7 
1  If n>5, then partition the numbers into groups of 5.  n/5 
2  Sort the numbers within each group. Select the middle elements (the medians)  7n/5 
3  Call your "Selection" routine recursively to find the median of n/5 medians and call it m.  T_{n/5} 
4(a)  Compare all n1 elements with the median of medians m and determine the sets L and R, where L contains all elements <m, and R contains all elements >m. Clearly, the rank of m is r=L+1 (L is the size or cardinality of L)  n 
4(b)  If k=r, then return m  
4(c)  If k<r, then return k^{th} smallest of the set L  T_{7n/10} 
4(d)  If k>r, then return kr^{th} smallest of the set R 
If n is small, it obviously doesn't make any sense to "juggle" the values. Let us set the small threshold at 5 elements. To find the "k^{th}" item from n elements, here are the least number of comparisons needed for values where n is less than 6.
Show how five elements can be sorted using at most seven comparisons.
The Oxford Dictionary defines Median: adj situated in the middle. 2.Middle value of a series.
When n is not an exact mutiple of 5, we divide the n elements into groups of 5 elements each, and at most one group with the remaining n mod 5 elements. Also, when there are an even number of medians, select the larger of the two median of medians.
T(n) is an upper bound on the worstcase time
We will prove that for all n and for some (yet unknown) constant C.
Luckily for us, it turns out that the third step , is true provided
This concludes the proof. We have actually shown that
There is a slight error in the derivation as 7n/10 and n/5 may not always be integervalued! However, it still remains true if we take truncations into consideration, that T_{n}=O_{n}.
Even numbers don't work since there won't be a median. On the other hand 5, 7, 11, 13, etc. work but why won't 3, the smallest odd number? When we try to use the value of 3 in the algorithm then we obtain the inequality
Justify each term in the inequality
ExerciseWhat is the best we can conclude from
The algorithm uses one concept from quicksort and returns the k^{th} smallest of n.
If n=1  then return the element 
If n>1  then grab a random element from your set and call it x. Cut the group into two sections. The left side (L) will contain all elements smaller than x, the right side(R) will contain all elements greater than x. Let r=L+1 be the rank of x. 
If k=r  then return the element. 
If k<r  then return the k^{th} smallest from the set L. 
If k>r  then return the kr^{th} smallest from the set R. 
The worst case of this algorithm happens when x is either the smallest or largest element in the set. There will be no splitting and the time will become
T(n) is for the same computational model but now shows the expected or average time
Case 1:  x(recall: the random element) is in the middle half. Then you recurse on the set of size at most 3n/4 
Case 2:  x is not in the middle half. Taking a pessimistic view, you recurse on a set of size at most n. (that means, we assume that no progress was made) 
As each case occurs with probability 50%, we get:
Rewrite this inequality as:
The algorithm of the previous section had
in the worst case. Using the inequality above, show that for the average case,
.
Hint: draw the recursion tree and sum over all levels.
Many data structures and algorithms courses are ``on the web''. Some useful pointers to get you started in your search are given below.
Algorithms Information and Course Materials on the Net  Courses on algorithms. 
Princeton Data Structures  Bob Sedgewick's course on data structures and algorithms at Princeton. 
Sedgewick's transparencies  Bob Sedgewick's lecture notes (copies of transparencies) on data structures and algorithms at Princeton. For our course, of particular interest are sections 10 (tries), 11 (strings), 12 (more strings), and 13 (compression), as this material is not covered in detail by Cormen, Leiserson and Rivest. 
Sugihara's lecture notes on strings  Sugihara's lecture notes on strings (storage, coding, compression, search) are a nice supplement to Cormen as well. Sugihara teaches at the University of Hawaii. 
Index of Sugihara's lecture notes.  Index of Sugihara's lecture notes. 
Java applets for five sorting algorithms  Animation for sorting algorithms. 
Algorithms at UMass 

C++, Algorithms, and Data Structures (CS202) Notes 

Rivest's bibliography on algorithms  Ron Rivest's bibliography on algorithms. 
MergeSort demo  Mergesort demo from University of Toronto. 
Linear Time Selection  Lecture notes 
Hoare's homepage  C.A.R. Hoare's homepage 
The problem of selection (finding the k^{th} largest of n elements) goes back to Rev. C. L. Dodgson's (aka Lewis Carroll) essay (Carroll 1883, p.56) on how prizes were awarded unfairly in tennis tournaments. Knuth (1973, p.209)
A linear method to solving this problem was always sought after and only after Manuel Blum discovered a method with O(nlog logn) steps in 1971 were R.Rivest and R.Tarjan (Rivest, 1973, p.448461) able to create the alorithm which we now call "Linear Time Selection". (Knuth 216)
Hoare, Charles Antony Richard published a method called quicksort in 1962 which is used in the algorithm of finding the k^{th }smallest of n elements.
Cormen, Thomas H, Charles E. Leirson, Ronald L. Rivest
"Introduction to Algorithms". Montreal: McGrawHill Book Company, 1991.
Section 10.3
St.James Gazette. 1 August, 1883
"Communications of the ACM" 1961.
Vol.4, pp.321322
"Volume 3: Sorting and Searching" Reading, Massachusetts: AddisonWesley Publishing Company, 1973.
Section 5.3.3. MinimumComparison Selection
Sams.net. (1997, p.365469)Rivest, Ronald L., and Robert E. Tarjan
Journal of Computers & Sytem Sciences 7: 1973 Vol.7, p.448461
Daichi Saeki worked on the HTML coding and creating the equations using LATEX2HTML, Chien Liao worked on the JAVA applet and Silva Baeva worked on research.