Midterm 1--CS251--Winter 2000


A 90 minute closed book examination.

Question 1 Lower bounds. In a comparison-based model of computation, give the decision tree-based lower bound (without working out logarithms!) for the number of comparisons to solve the following problems:
Part A
Given is a sorted list of ten integers, and two additional integers x and y. All integers are different. Create a sorted list of twelve elements.
Part B
Given are sorted lists of twenty and seven integers respectively. All integers are different. Create a sorted list of 27 elements.
Solution As we have a binary decision tree, the solution is ceiling(log_2 N) where N is the number of different solutions. Think about this more generally. Let A be a sorted list of n elements, let B be a sorted list of m elements, and let C be an unsorted list of k elements. For the first question, we need to merge A and C. The merged list of length n+k is totally determined if each element of C in turn were to know its position. The first element has n+k choices, the second n+k-1 and so forth until n+1. Thus, the number of solutions is
	   N = (n+k)(n+k-1)...(n+1) .
	
To merge A and B into a list of length n+m, we only need to specify the positions the elements of B (as a group!) occupy in it. There are (n+m choose m) such choices. Thus,
	   N = (n+m)! / n! m!
	
To get back to specific numbers, for part A, the answer is ceiling(log_2 (11x12))=8. For part B, the answer is ceiling(log_2 (27!/20!7!))=19.
Question 2 Recurrences, trees. We are given a complete position tree with a fanout of 41 (41 children per node). Given that there are n nodes in all, answer these questions:
Part A
What is the height of the tree as a function of n?
Part B
How many leaves are there?
Part C
Consider the algorithm that recursively visits the first and the last (41st) subtree of each node and writes down the label of each leaf it encounters. If T_n is the time taken by this algorithm, give a value for T_n in theta notation.
Solution Let h be the height of the tree. Two of a number of equivalent answers for part A. are:
	   h = ceiling(log_41 (40n+1)) -1
	   h = floor(log_41 (40n-39))
	
Verify that these formulas are exact for all n (even n=1). To get the first solution, count the number of nodes in each full level, and obtain these bounds:
	 1 = 41 + ... + 41^(h-1) < n <= 1 + 41 + ... + 41^h
	
from which we derive
	 (41^h -1)/40 <   n   <= (41^(h+1) -1)/40

	     41^h     < 40n+1 <=     41^(h+1) .
        
Now take logarithms to the base 41.
Part B: consider the standard array implementation of a k-ary position tree. A moment's thought shows that the number of leaves is n-parent[n], where parent[n] is the index of the parent of node n. Now, parent[n]=floor((n+39)/41) (this is the proper generalization of the formula for binary trees we saw in class). The answer is n-floor((n+39)/41), or, equivalently, floor((40n-39)/41).
For part C, the recurrence is
	T_n = 2 T_(n/41) + 1
	
which by the master theorem yields T_n = theta(n^a), where a = log_41 (2).
Question 3 Stacks, queues, algorithms. Given are the following operations on any kind of element:
  • The standard stack operations MAKENULL, EMPTY, POP, TOP, PUSH.
  • The standard queue operations MAKENULL, EMPTY, ENQUEUE, DEQUEUE, FRONT, REAR.
  • The operation JOINQUEUE which joins two queues in one time unit and returns the new queue.
  • The operation MAKEQUEUE (x) which returns a queue with an element x (of whatever type) all by itself.
We recall postfix expressions such as 345+*6-, with operators following the two operands they operate upon. In this example, the corresponding standard (or: infix) expression is (3*(4+5))-6. The postfix expression is given to you as a queue Q of elements. The boolean procedure ISOPERATOR (x) can be used to determine the nature of x. The objective is to write a short nonrecursive algorithm (not C or java) that makes one pass through Q for creating a queue I (I for infix) consisting of the infix expression (with parentheses added in when necessary; too many parentheses do not hurt, too few will).
Solution We will stick all elements and partial infix subexpressions in queues. It is helpful to write a short additional procedure JOIN(a,b,op) that takes queues a and b (these could be single-element queues holding operands, or multi-element queues holding subexpressions), and an operator op as input and returns a queue holding (, the elements of a, op , the elements of b, and ).
	  JOIN(a,b,op): MAKENULL(Q)
	                JOINQUEUE(Q,MAKEQUEUE("("))
	                JOINQUEUE(Q,a)
	                JOINQUEUE(Q,MAKEQUEUE(op))
	                JOINQUEUE(Q,b)
	                JOINQUEUE(Q,MAKEQUEUE(")"))
			RETURN(Q)
	
The main algorithm proceeds as follows:
	  MAKENULL(S)   (S is an auxiliary stack of queues)
	  while not EMPTY(Q) do:
	      x=DEQUEUE(Q)       (x is an element)
	      if ISOPERATOR(x)  then  b=POP(S)
		                      a=POP(S)
		                      PUSH(JOIN(a,b,x),S)  
		                else PUSH(MAKEQUEUE(x),S)
	  I=TOP(S)        (I is the infix queue)
	
Note: it is important to treat partially reconstructed pieces of the infix expression as we would individual elements. This can be achieved by treating all such pieces and all individual elements as queues (everyone is the same for the law). The JOIN operation puts the parentheses in the right places so that operator precedence is a non-issue. At the end, the stack S has only one element, the queue I. Technical note: JOINQUEUE(Q,MAKEQUEUE(op)) is equivalent to ENQUEUE(op,Q), but JOINQUEUE(Q,a) cannot be replaced by an ENQUEUE operation. It is also inefficient to enqueue all elements of the queue "a" individually as "a" may have many elements. For efficiency, JOINQUEUE is needed here.

Copyright © 2000 Luc Devroye. Email: luc@cs.mcgill.ca.