Question 1 | A simple recursive algorithm. Given is a standard min-heap H with n elements implemented as an array. Write a very simple recursive procedure that returns all the keys in the heap with value less than a given number z. If k elements are returned, the worst-case time should be O(k+1). |
Answer |
Perform a preorder traversal starting with report(1) (as 1 is
the root in the array implementation of H). We are also given
the heap size, n.
report(x):
    if key[x] < z then
        output key[x]
        if 2x <= n then report(2x)
        if 2x+1 <= n then report(2x+1) |
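A runnable Python sketch of this procedure, assuming the heap is stored one-indexed in a list key with index 0 unused; the function name and the output list are illustrative choices.

def report(key, n, z, x=1, out=None):
    # Report every heap key < z in the subtree rooted at position x.
    # We recurse into a child only when key[x] < z, so the total work
    # is O(k+1) when k keys are reported.
    if out is None:
        out = []
    if x <= n and key[x] < z:
        out.append(key[x])
        report(key, n, z, 2 * x, out)
        report(key, n, z, 2 * x + 1, out)
    return out

# Example: a min-heap with 7 elements stored at indices 1..7.
heap = [None, 2, 5, 3, 9, 6, 4, 8]
print(report(heap, n=7, z=6))   # prints [2, 5, 3, 4]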
Question 2 | Mathematical exactness. Define precisely what is meant when we say that an algorithm takes polynomial worst-case time on an input of size n. |
Answer | Usually what is meant is that the worst-case time is polynomial in the O(.) sense: thus, there exist finite constants C and K such that the maximal execution time of an algorithm over any input of size n is bounded by C n^K for all n. A similar answer for Theta(.) was also accepted. |
Question 3 | Huffman trees. We wish to construct the Huffman tree for n symbols, with observed frequencies integer-valued between 1 and n (with possible duplicates). We have seen that the Hu-Tucker algorithm can construct the Huffman tree in time O(n log n). However, in this particular instance, show how you can construct the tree in time O(n). |
Answer |
First, sort the (symbol, frequency) pairs by
frequency by placing them in a direct
hash table with chaining (of size n): the hash
function is h(frequency)=frequency.
This takes O(n) time in the worst-case, even if
all elements collide in one chain, as insertions
are at the fronts of the chains.
For convenience, concatenate the chains into one
long queue L, sorted from small to large by frequency
with cells holding pairs (i, p_i)
where i is the i-th symbol/leaf,
and p_i is the weight; set up an initially
empty second queue Q
for pairs (j, q_j), where j stands for
an internal node and q_j is its weight.
The algorithm repeats the following step until only one item
remains in L and Q combined (that item is the root of the tree):
    delete the 2 smallest-weight items from the fronts of (L, Q)
        (these may both come from the same queue);
    let (u, p_u), (v, p_v) be these pairs and let j be a new internal node;
    enqueue ((j, p_u+p_v), Q);
    left[j] <- u, right[j] <- v.
Because the merged weights are created in nondecreasing order, Q stays
sorted, so each step costs O(1) and the whole construction takes O(n). |
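A runnable Python sketch of this two-queue construction; the function name linear_huffman, the use of deque, and the dictionary representation of left/right are illustrative choices, not part of the original solution.

from collections import deque

def linear_huffman(freqs):
    # Build a Huffman tree in O(n) when all frequencies are integers in 1..n.
    # Leaves are the symbol indices 0..n-1; internal nodes get ids n, n+1, ...
    n = len(freqs)
    buckets = [[] for _ in range(n + 1)]      # "hash table": bucket by frequency
    for i, f in enumerate(freqs):
        buckets[f].append(i)
    L = deque()                               # leaves, sorted by weight
    for f in range(1, n + 1):
        for i in buckets[f]:
            L.append((i, f))
    Q = deque()                               # internal nodes, created in sorted order
    left, right = {}, {}
    next_id = n

    def pop_smallest():
        # Both queues are sorted, so the two global minima sit at the fronts.
        if not Q or (L and L[0][1] <= Q[0][1]):
            return L.popleft()
        return Q.popleft()

    while len(L) + len(Q) > 1:
        u, pu = pop_smallest()
        v, pv = pop_smallest()
        j = next_id
        next_id += 1
        left[j], right[j] = u, v
        Q.append((j, pu + pv))
    root = (L or Q)[0][0]
    return root, left, right

# Example with duplicated frequencies between 1 and n:
root, left, right = linear_huffman([3, 1, 4, 1, 5, 2, 2])
print(root, left, right)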
Question 4 | Algorithm on trees. Given is a parent pointer tree in array form: the array p[i], 1 <= i <= n, holds the pointers (integers) to the parents, and the nodes are numbered 1 through n. Give a linear time algorithm for making this into an ordered tree implemented by the oldest child-next sibling method. The ordering of the children is unimportant. Do this by filling in the arrays oldest[i], next[i], 1 <= i <= n. |
Answer |
oldest[.] is a pointer to one child,
and next[.] is a pointer to the next sibling.
The children of any node
are kept in a single linked list (as in
the adjacency lists of graphs).
Initially, we set oldest[x]=next[x]=NIL
for all x.
for i = 1 to n do (skipping the root, which has no parent)
    y = p[i]   (y is the parent of i)
    if oldest[y] = NIL
        then oldest[y] = i
        else next[i] = oldest[y], oldest[y] = i |
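A short Python sketch of the same loop, assuming the convention p[root] = 0 (the question leaves the root's parent pointer unspecified); array and function names are illustrative.

def parent_to_child_sibling(p):
    # p is 1-indexed (p[0] unused); p[i] == 0 marks the root.
    # Returns oldest[.] and next[.] arrays, with 0 playing the role of NIL.
    n = len(p) - 1
    oldest = [0] * (n + 1)
    nxt = [0] * (n + 1)
    for i in range(1, n + 1):
        y = p[i]
        if y == 0:
            continue                    # i is the root: nothing to record
        nxt[i] = oldest[y]              # prepend i to y's child list
        oldest[y] = i
    return oldest, nxt

# Example: node 1 is the root with children 2 and 3; node 3 has child 4.
oldest, nxt = parent_to_child_sibling([0, 0, 1, 1, 3])
print(oldest, nxt)   # oldest[1] == 3 and nxt[3] == 2: children of 1 are 3, then 2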
Question 5 | Algorithm on graphs. A directed graph is given in which each of the nodes has one neighbor, nb[.] (thus, (x,nb[x]) is an edge from x to nb[x], and loops (nb[x]=x) are not allowed). Part A. Why does this graph have a cycle? Part B. Write an O(n) worst-case time algorithm that outputs one (any) cycle. Give the details, but make the algorithm as "minimal" as possible (no extraneous variables, no unnecessary segments, etcetera). |
Answer |
Part A: Pick a node x(1) and walk repeatedly to the unique neighbor,
obtaining x(1),...,x(n+1). Since the graph has only n nodes, by the
pigeonhole principle this sequence must contain a repetition, say
x(i)=x(j) with i<j. Thus, x(i),...,x(j)=x(i) is a cycle.
Part B: We perform DFS (walking to neighbors) until a gray node is encountered. To fix the notation, let the DFS be at u and let the gray neighbor of u be v. Then, walk from v to u again, and add the edge (u,v) to obtain the cycle. The first step takes O(n) worst-case time as our graph has n vertices and n edges. The second step takes O(n) time because the size of a cycle is O(n). |
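A compact Python sketch of Part B, assuming nodes are 1-indexed with nb[0] unused; since every node has an out-edge, a single walk from any start node is guaranteed to revisit a gray node, so no restart is needed.

def find_cycle(nb):
    # First step: walk along the unique out-edges, coloring nodes gray,
    # until an already-gray node v is reached.
    gray = [False] * len(nb)
    u = 1
    while not gray[u]:
        gray[u] = True
        u = nb[u]
    v = u
    # Second step: walk from v around to v again to output the cycle.
    cycle = [v]
    w = nb[v]
    while w != v:
        cycle.append(w)
        w = nb[w]
    return cycle

# Example: edges 1 -> 2, 2 -> 3, 3 -> 2; the cycle found is [2, 3].
print(find_cycle([0, 2, 3, 2]))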
Question 6 | Sorting. The input is a sequence of n integers with many duplications, such that the number of distinct integers in the sequence is O(log n). Design a sorting algorithm (based on comparisons only) to sort such sequences using at most O(n log log n) comparisons in the worst case. Hint: A red-black tree may be useful here! |
Answer |
This question was taken from Manber's textbook, page 178.
We make a red-black tree of distinct keys,
keeping track of multiplicities in each node
via an augmented field count[.].
As the tree holds at most O(log n) distinct keys, its height is
O(log log n), so each dictionary operation takes O(log log n)
comparisons; hence the red-black tree may be constructed with
O(n log log n) comparisons in total.
A few details:
insert(x):
    y = search(x)   (returns position of x in tree, or NIL)
    if y = NIL
        then perform red-black-tree-insert(x) and set count[.] = 1 at the new node
        else count[y] += 1   (no insert needed)
main program:
    makenull(tree)
    for all x do insert(x)
    inordertraversal(tree)
        (note: repeat the element at node y count[y] times in the output) |
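A Python sketch of the same idea; it substitutes a small sorted array searched with the bisect module for the red-black tree, which still costs O(log log n) comparisons per element because only O(log n) distinct keys are ever stored. Names are illustrative.

from bisect import bisect_left

def sort_few_distinct(a):
    # Maintain the distinct keys in a sorted list with multiplicities,
    # then replay each key count-many times in order.
    keys, count = [], []
    for x in a:
        i = bisect_left(keys, x)        # O(log #distinct) = O(log log n) comparisons
        if i < len(keys) and keys[i] == x:
            count[i] += 1               # duplicate: bump the multiplicity
        else:
            keys.insert(i, x)           # new distinct key (only O(log n) of these)
            count.insert(i, 1)
    out = []
    for k, c in zip(keys, count):       # the "in-order traversal"
        out.extend([k] * c)
    return out

print(sort_few_distinct([5, 1, 5, 3, 1, 1, 3, 5, 5]))   # [1, 1, 1, 3, 3, 5, 5, 5, 5]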
Question 7 | Analysis. How many nodes are there in a complete 17-ary tree with n nodes that have the property that none of their ancestors is a youngest child (note: every litter has only one youngest child)? Give the answer (N) as a function of n in Theta notation. Point out what algorithm could be used to achieve this in time O(N)? Hint: This is an exercise on the master theorem. |
Answer |
The following recurrence holds, by considering what happens at the
root: the root always qualifies, its youngest child qualifies but none
of that child's descendants do, and each of the 16 subtrees rooted at
the other children contributes recursively. Hence
    N(n) = 16 N(n/17) + Theta(1).
By the master theorem (a = 16, b = 17, f(n) = Theta(1)),
    N = Theta(n^(log_17 16)).
Time O(N) is achieved by a depth-first traversal that reports each node
it visits and recurses into the children only when the current node is
not a youngest child: it visits exactly the N qualifying nodes and does
O(1) work per node. |
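A Python sketch of such a traversal on an implicitly stored complete 17-ary tree, where the children of node i (1-indexed) occupy positions 17(i-1)+2 through 17i+1; taking the last existing child of each litter as its youngest is an arbitrary convention assumed here.

def qualifying_nodes(n, d=17):
    # Collect the nodes none of whose ancestors is a youngest child,
    # visiting only those nodes (time O(N)).
    out = []

    def visit(i, i_is_youngest):
        out.append(i)                   # i qualifies
        if i_is_youngest:
            return                      # descendants of a youngest child never qualify
        first = d * (i - 1) + 2
        last = min(d * i + 1, n)        # last existing child of i, if any
        for c in range(first, last + 1):
            visit(c, c == last)         # treat the last existing child as the youngest

    visit(1, False)
    return out

# Example: a perfect 17-ary tree of depth 2 has n = 1 + 17 + 17*17 = 307 nodes;
# the qualifying nodes number 1 + 17 + 16*17 = 290.
print(len(qualifying_nodes(307)))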
Question 8 |
Hashing.
We are given an array A of size n
with records of shipments.
Shipment number i (the shipments are numbered from
1 to n) has the following fields:
descr[i]: the description of the contents of a shipment;
address[i]: the address for the shipment;
invoice[i]: a 9-digit invoice number;
sender[i]: the name of the sender.
Show how to organize these records so that a shipment can be located by
its invoice number in O(1) expected time and deleted in O(1) time. |
Answer |
One possibility is to use direct addressing,
building on the given arrays. However, to locate by
invoice number, we make a second table, also of size n,
with a hash function based on the invoice number only (!!!),
in which we employ chaining. One might use something like:
h(invoicenumber) = invoicenumber mod n.
Thinking about this, we see that each node
requires a next[.] field, pointing to the next
node in the same chain. To perform a deletion in O(1) time,
the chains must be doubly linked,
so we introduce an additional pointer, previous[.].
So, here is the proposal:
For each record i:
    weight[i], descr[i], address[i], invoice[i], sender[i];
    next[i]: next record in the same chain;
    previous[i]: previous record in the same chain.
In addition, a hash table H of size n consists of n pointers to the
headers of the chains, so that the search for a given invoice number is
done in the chain with first element H[h(invoicenumber)]. Search thus
takes O(1) expected time. |
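A Python sketch of this structure, with parallel arrays for the chain pointers and a size-n table of chain headers keyed on the invoice number; the class name and field names are illustrative.

class ShipmentIndex:
    def __init__(self, invoice):
        # invoice[1..n] holds the 9-digit invoice numbers; index 0 is unused.
        self.invoice = invoice
        self.n = len(invoice) - 1
        self.H = [0] * self.n                # chain headers; 0 plays the role of NIL
        self.next = [0] * (self.n + 1)
        self.prev = [0] * (self.n + 1)
        for i in range(1, self.n + 1):       # insert each record at the front of its chain
            b = self.h(invoice[i])
            self.next[i] = self.H[b]
            if self.H[b]:
                self.prev[self.H[b]] = i
            self.H[b] = i

    def h(self, inv):
        return inv % self.n                  # h(invoicenumber) = invoicenumber mod n

    def search(self, inv):
        # Expected O(1): walk the (short, on average) chain for this invoice number.
        i = self.H[self.h(inv)]
        while i and self.invoice[i] != inv:
            i = self.next[i]
        return i                             # record index, or 0 if absent

    def delete(self, i):
        # O(1) worst case thanks to the doubly linked chains.
        if self.prev[i]:
            self.next[self.prev[i]] = self.next[i]
        else:
            self.H[self.h(self.invoice[i])] = self.next[i]
        if self.next[i]:
            self.prev[self.next[i]] = self.prev[i]
        self.prev[i] = self.next[i] = 0

# Example with three shipments (index 0 unused):
idx = ShipmentIndex([None, 123456789, 987654321, 555000111])
i = idx.search(987654321)                    # finds record 2
idx.delete(i)
print(i, idx.search(987654321))              # prints: 2 0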
Copyright © 1999 Luc Devroye. Email: luc@cs.mcgill.ca.