Here is cryptic collection of answers for the 1997 final of CS251B. Refer to the final, which is posted in the main page for the course. Question 1. Part A. This part has nothing to do with caterpillars. The adjacency list format for a graph uses an array[1..n] of pointers to linked lists, where each linked list consists of something like struct{ whatever key; list *next} list Part B. First of all, it is easy to verify in O(|V|) time that (1) all degrees are <= 3; (2) the graph is connected. Most students checked then for acyclicity, but that is not enough. Indeed, the complete binary tree with 4 levels of nodes is acyclic, but it is not a caterpillar! There are two solutions: A. (Very clever, but not noted by the students): a connected graph is a caterpillar if and only if the number of degree-1 nodes is equal to the number of degree-3 nodes plus 2. Clearly, this can be checked in O(|V|) time. B. Define an appendix as a degree-1 node followed by any number of degree-2 nodes. By a careful DFS, all appendices may be colored red. Then check that the remaining uncolored nodes form a chain. Question 2. Part A. Many possible solutions. Here is one: call the following function as postorder(root); it returns the postorder number of the subtree at root; set counter=0 where counter is external. function postorder (root) (returns index); nrright := nrleft := 0; (0 plays role of nil) if l[root] != nil then nrleft := postorder(l[root]); if r[root] != nil then nrright := postorder(r[root]); counter += 1; key[counter] := k[root]; left[counter] := nrleft; right[counter] := nrright; return (counter); Part B. This is easy, as we already have a postorder sequence. So we just go: for i=1 to n do if (left[i]>0) or (right[i]>0) (there is at least one child) then height[i] = max (height[left[i]], height[right[i]]) +1 else height[i] := 0; Part C. Note that node n is the root. So call diameter(n) where we have: function diameter(i): returns diameter for subtree at i; if i has two children: D = max( diameter(left[i]), diameter(right[i]), height[left[i]]+height[right[i]+2 ) if i has no children: D = 0 if i has only a left child: D = max( diameter(left[i]), height[left[i]]+1 ) if i has only a right child: D = max( diameter(right[i]), height[right[i]]+1 ) return D Question 3. I was looking for a table size m which had to be a prime number in the rough range 5000...20000. I also needed a primary hash function, something like (rg + gb + br) mod m. There were many possible answers, but I needed at least a suggestion to be able to give you marks. The secondary hash function for double hashing could have been anything. For example, f mod m would have been OK if f mod m was not zero. Question 4. Part A: direct addressing; in other words, an array of size N. Part B: a balanced binary search tree. The answers "heap" ad "linked list" are not good here, because duplicate checking in both structures is very costly. Question 5. I asked you to do one of two things. So, let's answer (ii). Let us merge all lists pairwise (not the smallest first!). So, after one round of merges, we are left with d/2 lists. We repeat this for about log_2 d rounds. Each round takes time n if we count comparisons. The overall cost is thus about n log_2 d. Now, we can do (i). Indeed, the Huffman strategy (merge the two smallest remaining lists) is optimal with respect to the number of comparisons. So, it must beat the one I explained in the previous paragraph! Thus, the number of comparisons is bounded from above by about n log_2 d. Note: the Huffman tree for merging the files does not necessarily have height O(log d), so any argument based on the height of this tree was wrong. Question 6. To convince you I have a median of 5 elements, I must exhibit to you the median and the two smaller elements. So, how many possible answers can I give you? Back-of-the- envelope calculations show that I can give you 30 answers. Now, if we use comparisons between elements, the decision tree argument would give us a lower bound of ceiling(log_2 30) = 5. I gave maximum marks for this answer. Remark: In fact, a better lower bound is 6. The way to get this is by performing one conmparison, and noting that each subtree of the decision tree has 18 possible answers. So the lower bound is one plus ceiling(log_2 18) = 6. Question 7. Let (a,b) and (c,d) be the crossing edges. We argue by contradiction. We assume that the EMST passes through (a,b) and (c,d). Note then that since it is a spanning tree, then removing both (a,b) and (c,d) leaves 3 disconnected trees. One of these trees thus has two of our four nodes, either (a,c), (a,d), (b,c) or (b,d) (but not (a,b) or (c,d)). Assume that it is (a,c). So, I will now reconnect the pieces by adding (a,d) and (b,c). This makes a spanning tree (note that I have to be careful to avoid cycles). But length(a,d) + lenght(b,c) < length(a,b) + length(c,d) by geometrical considerations (sum of diagonals of a quadrilateral is < sum of two opposite sides). So, I am able to construct a better spanning tree. Contradiction! Question 8. This was a midterm question. The simplest counterexample is 0^n 1^n, n zeros followed by n ones. The responses "n zeros", "n ones", and "consider a string that makes the suffix trie a complete binary tree" are all wrong. Question 9. The worst-case time of 2^n occurs when V is much larger than all elements, so that the algorithm will have to try all possible (2^n) subsets of n elements. Question 10. A. n-1 (n also accepted) B. 1 (2 also accepted) C. 2n-1 (2n also accepted) D. 2 (3 also accepted) E. (n-1)(n-2)/2 (n(n-1)/2 also accepted) Question 11. The Huffman tree was not unique. The correct answer was 172 bits, but I gave partial credits if I could see that you knew how to construct the Huffman tree and how to derive a Huffman code.