Answers to second set of practice questions--CS251--Winter 1999

This set contains practice questions on augmented data structures and on priority queues.

Question 1 Searchable stack. The ADT searchable stack is supposed to handle the following operations on on sets of data with two keys (for example, student name, and student ID number). The operations are PUSH, POP, TOP, MAKENULL, SEARCH-BY-KEY-1, and SEARCH-BY-KEY-2. Using the principle of augmentation of a data structure, suggest a data structure for this situation, such that PUSH, POP and both SEARCH operations take time O(log n) (n is the number of elements in the data structure at the time of the call), and the other operations take time O(1).
Answer Because we must search, it seems natural to augment a red-black search tree. One possible answer is to use a red-black tree for each of the two keys, with nodes containing the following information fields:
  • key1, key2: (type unimportant)
  • left1, right1, parent1: pointers for first red-black tree
  • color1: color of node in first red-black tree
  • left2, right2, parent2: pointers for second red-black tree
  • color2: color of node in second red-black tree
  • nextbelow: points to node below on the stack
The structure itself consists of two pointers: root-of-tree, and top-of-stack. We must verify that MAKENULL and TOP take O(1) time--clearly they do. PUSH involves an INSERT in both trees, and so two applications of red-black tree INSERT are required. Similarly, POP corresponds to a DELETE from the red-black tree. In all cases, all information can be updated in O(log n) time.
Question 2 Maxstack. The ADT maxstack is like an ordinary stack, with an extra operation, MAX, which returns the maximum of the key values of the elements in the stack. Show an implementation that allows all operations to be performed in O(1) worst-case time.
Answer In an ordinary linked-list based stack, we keep a pointer "nextbelow" as in a single linked list. However, if we keep a second pointer with each node, called "maxbelow", which points to the node with maximal key value at or below the given node, then all operations can be done in time O(1). Thus, a node contains the fields
Let TOP point to the top node. Then POP just needs:
And PUSH (newnode) needs:
    if key[newnode]>key[maxbelow[TOP]]
      then maxbelow[newnode]=newnode
      else maxbelow[newnode]=maxbelow[TOP]
Clearly, both stack operations, as well as the operation MAX (=key[maxbelow[TOP]) take time O(1).
Question 3 The ADT median-priority-queue (or MPQ for short). The ADT median-priority-queue operates on a set of elements with different key values, and involves the following operations: makenull, insert, delete-median, makeMPQ. It is known that the number of elements never exceeds a given number n. The operation makeMPQ takes a given number of elements and inserts these globally (not one by one; recall buildheap for a similar situation). For delete-median, note that the median of a set of 2n elements is the n-th smallest, and that the median of a set of 2n+1 elements is the n+1-st smallest element. Suggest, adapt or invent a data structure so that the following can be achieved:
  • Insert and delete-median take O(log n) worst-case time.
  • MakeMPQ takes O(n) time in the worst case, where n is the number of elements involved.
Your answer should be such that a good programmer would know how to implement your data structure. You must tell us therefore in words how makeMPQ, insert, delete-median are to be implemented, and you must draw the data structure.
Answer Among the possible solutions, the most elegant one uses two binary heaps of equal size, HL and HR (L for left, R for right). HL is a max-heap, HR is a min-heap, and all elements of HL are smaller than those of HR. Furthermore, if |.| denotes size, |HL| = |HR|, or |HL|=|HR|+1. To insert newnode, insert it in HL if key[newnode] < key[min[HR]] (in O(log n) time of course). Insert it in HR otherwise. Note that the median is always the node max[HL] (the root of HL), so it can be deleted in O(log n) time. After "insert" and "delete", the balance condition may be violated. But by transferring max[HL] to HR or min[HR] to HL, we may reestablish it. This requires at most one additional deletemin/deletemax and one additional heap-insert. The total time is O(log n). The operation MakeMPQ involves finding the median in O(n) time by a median-selection algorithm, and then running "buildheap" on the sets of elements at or below the median (to make HL) and on the elements above the median (to make HR).
Question 4 k-ary heaps. A k-ary heap is a generalization of a standard binary heap so that there are k children per node. Give an implicit data structure implementation, with details on how to find the i-th child of node j, and the parent of node j. Estimate the worst-case number of comparisons (as a function of n and k, where n is the number of elements) for a standard operation DELETEMIN. Same question for the operation INSERT.
Answer Clearly, one should use an array in which nodes are kept in level order. If we begin with a root of index 0, then the children of a node with index i have indices ki+1 through ki+k. And the parent of a node with index i has index floor((i-1)/k). The complete k-ary tree has height about log_k n. For a deletemin, we need k comparisons to descend one level, so the number of comparisons in the worst case is about k log_k n. For an insert, only one comparison per level is required, for a worst-case total of about log_k n. (Note that inserts are thus "cheaper" than deletemins).
Question 5 k-ary heaps (continued). Let a k-ary heap with n elements be given. Let the number of DELETEMIN and INSERT operations be equal; the sequence of the operations is such that the size of the heap remains about n. Argue why a 4-ary heap is best (you may need a calculator).
Answer By what we said in the previous answer, it is best to minimize (k+1) log_k n = ((k+1)/log k) log n. Now, take a calculator and compute (k+1)/log k, and notice that the values at k=2, 3, 4, and 5 are 4.32, 3.64, 3.60, and 3.72 respectively, and that these values increase from k=5 onwards. Thus, k=4 is the best choice!
Question 6 Beap. In 1976, Munro and Suwanda proposed the beap (biparental heap) as a possible implementation for a priority queue-like ADT supporting the operations makenull, insert, deletemin and search. A beap of n elements should be visualized as a triangular matrix such as the one shown below, which has 10 elements. The elements are added in cross-diagonal fashion, so that the inherent order is 1, 2, 4, 6, 5, 9, 21, 8, 20 and 13 in the example shown below.
 1   4   9  13  ...
 2   5  20  ...
 6   8  ...
21   ...
The beap has the property that along any row or column, the elements are encountered in increasing order. Each element can be thought of as having two coordinates, say (c,r) where c stands for column and r stands for row. We use the convention that the first element is at (0,0). In our example, the element at (2,1) is 20, and the last element, at (3,0), is 13. If we place elements in cross-diagonal fashion, then the n-th element ends up at coordinates (c(n),r(n)), where c(.) and r(.) are certain simple functions of n that are given. Thus, (c(1),r(1))=(0,0), (c(2),r(2))=(0,1), (c(3),r(3))=(1,0), (c(4),r(4))=(0,2), and so forth. Define an implicit data structure for the beap, assuming that n never exceeds "maxnum". (This question is very easy.)
Answer Of course, one could use a double array of kxk elements, where k = 1 + sqrt(2n). However, as half the array is necessarily unused, one could also use a linear array, where elements are placed in the order (0,0), (0,1),(1,0), (0,2),(1,1),(2,0), and so forth. In both cases, it is necessary, given n, to determine the n-th element in the structure. In the former case, neighbors in the structure are easy to find, while in the latter, some work is needed. Let us number the diagonal rows of elements, and keep track of the highest index on each diagonal. For the i-th diagonal, the last element has index 1 + 2 + ... + i = i(i+1)/2. Let i = diagonal(n) be the index of the diagonal containing n. This index is O(sqrt(n)) and diagonal(n) takes at most O(sqrt(n)) time to compute. indeed: given n, we find the unique i for which (i-1)i/2 < n <= i(i+1)/2 by verifying the inequalities starting with i=1. The i-th diagonal has i elements elements numbered (0,i-1), (1,i-2), ... , (i-1,0). The n-th element overall is thus at array position (i-1-m,m) where m = i(i+1)/2 -n.
Question 7 Beap (continued). Inspired by similar operations on heaps, describe how you would implement makenull, insert, search, and deletemin.
Answer Let us use the array approach. Assume that the beap holds n elements and that x is the key value of a new element to be inserted. We first fix the shape, by placing the new element at position n+1 (see previous question on how to do this in O(sqrt(n)) time).
    Let (c,r) be the array position of node n+1
    Repeat forever:
      Let (c',r') be the array position of the maximum of key[c,r], key[c-1,r], and key[c,r-1] (if the latter two are existing elements).
      If (c',r') = (c,r) then stop.
      Else exchange the keys of (c,r) and (c',r'), and set (c,r) <- (c',r').
The number of iterations cannot be more than the original c+r, because in each iteration, c or r decreases by one. But we saw that c+r = diagonal(n)-1 = O(sqrt(n)). For deletemin, we first move the n-th element to position (0,0), and fix things recursively.
    Move the element at position n to (0,0)
    Set (c,r) = (0,0)
    Repeat forever:
      Let (c',r') be the array position of the minimum of key[c,r], key[c+1,r], and key[c,r+1] (if the latter two are existing elements).
      If (c',r') = (c,r) then stop.
      Else exchange the keys of (c,r) and (c',r'), and set (c,r) <- (c',r').
This will terminate after O(sqrt(n)) steps. Finally, the search for key x starts at column 0 and either finds the element in that column (in time O(sqrt(n)) or finds an element (0,j) that is just larger. Now, move over one column, to (1,j), and look at elements (1,j), (1,j-1),... leading again to an exact match or an index (1,j') of an element that is just larger than x. We move over at most O(sqrt(n)) columns, and the second index (the "j" and "j'" above) can only decrease O(sqrt(n)) times, so the time for a search is bounded by O(sqrt(n)) as well.
Question 8 Beap (continued).
  • Give O and Omega statements about the worst-case complexity of your operations insert and deletemin.
  • Give the worst-case time complexity of beapsort, a sorting method in which n elements are inserted in a beap, and then deleted by n calls to deletemin.
  • For use as a future event set in discrete event simulation, would you prefer a beap over a heap?
  • The heap, the sorted array, the unsorted array and the beap are four structures capable of holding elements with the property that the position of an element in the structure contains information about the relative size of the element vis-a-vis some other elements. This property is not shared by pointer-based structures. Of the four structures cited above in boldface, which one is best, in the worst-case complexity sense, for the following tasks:
    • makenull, followed by n insert's, and then n deletemin's.
    • One search operation in a structure holding n elements.
    • makenull, followed by n insert's and n search operations.
Answer The first sub-question was addressed in my previous answer: it isd O(sqrt(n)). Beapsort takes time O(n^(3/2)) as each of the n inserts and n deletemins takes time bounded by a constant times n^(1/2). For a future event set, inserts and deletemins are equally frequent. Clearly, binary heaps are to be preferred. Finally, of the four data structures listed, the first group of operations is best handled by a binary heap (in time O(n log n)), the second one by the sorted array (in time O(log n)), and the third one by a beap (in time O(n^(3/2))) (the three other structures take worst-case time Theta (n^2) in the latter case).

Copyright © 1999 Luc Devroye. Email: