## Third assignment--CS251--Winter 2000

 Question 1 k-d trees. Given is a k-d tree with the following information in each node: left[u] is the left child of u, right[u] is the right child of u, X[u] is the X-coordinate value of a point, Y[u] is the Y-coordinate value of a point, d[u] is the direction of the cut through (X[u],Y[u]) (horizontal or vertical). The root of the tree T is root[T]. Write a simple recursive algorithm for locating all points u in the tree for which the X-coordinate is in the interval [a,b], and whose Y-coordinate is in the interval [c,d]. Solution ``` visit(u) (visits all nodes in the subtree at u) if a <= X[u] <= b and c <= Y[u] <= d then output(u) if d[u]=vertical then if a <= X[u] then visit(left[u]) if X[u] <= b then visit(right[u]) else if c <= Y[u] then visit(left[u]) if Y[u] <= d then visit(right[u]) ``` The procedure above is called with "visit(root[T])". Question 2 BST with a cache. To "cache" something in a data structure means to move it to the front or the top so that it easily and quickly accessible. The guiding principle is that elements that are frequently searched for should be easily accessible. Applying this principle to a binary search tree, we may implement the operations insert and search in two phases: first perform an ordinary insert or search, and then move the inserted or searched element to the root, one level at a time, by consecutive rotations. Note that the resulting tree is as if that element were the first inserted element. Write the move-to-root algorithm, assuming that all nodes have left[.], right[.] and parent[.] pointers, as well as key[.] values. Solution Consider the subtree rooted at parent[x], where x is the present node. Among them, x and parent[x] have three disjoint subtrees. Rearrange these and iterate. The procedure below is called starting at the freshly inserted node x or the last node x searched for. ``` movetoroot(x): y=parent[x] if y != nil (x is not the root) then a=left[x] b=right[x] c=parent[y] ( a,b,c are the "outside world" ) ( a,b,c can be avoided, but the code would be less readable without them ) parent[y]=x parent[x]=c if left[y]=x then right[x]=y left[y]=b if b!= nil then parent[b]=y else left[x]=y right[y]=a if a!= nil then parent[a]=y movetoroot(x) ``` Question 3 Random binary search tree. What is the expected depth of the minimum value node in a random binary search tree? Solution Well, this is the expected number of records in a random permutation of length n, minus one (as the number of edges on that path is one less than the number of nodes), so we have ``` 1/2 + 1/3 + ... + 1/n ``` when n >= 2. Note that this is asymptotic to log (n), while, as we saw in class, the expected depth of the n-th inserted node is asymptotic to 2 log (n). Question 4 Sorting with repeated values. Assume that we are given n numbers with very many repetitions, such that only k different values occur, where k is much smaller than n. Using comparisons only, show how you can sort them in worst-case time time O(n log k). Hint: use some form of tree search. Solution We build a red-black tree by consecutive insertion of the n elements. With each node x, we keep a count field count[x] to count the number of repetitions of a key. When inserting a node with key value k, two situations occur: if the key is not present, insert as in a red-black tree, and set the repeat count for the node equal to one. If the key is present in node x (say), then we merely increase count[x] by one. Both cases can be implemented in time O(h), where h is the height of the tree. But h <= 2 log_2 (k+1) be a property of red-black trees on fewer than k nodes. Repeating this n times thus yields a comparison-based complexity of O(n log k). Once the tree is constructed, perform inorder traversal, and output key[x] count[x] times. This takes O(n) time in all. Question 5 Random binary search tree. What is the expected size of the subtree (root included) of the 99-th node inserted in a random binary search tree with 2000 nodes? Solution The 99 key values define 100 intervals (less than the smallest, between first and second smallest, and so forth). Each interval has an equal probability of receiving each of the remaining 1901 nodes, so each will receive on the average 1901/100=19.01 nodes. But the 99-th node's subtree will have all nodes from the two neighboring intervals, and thus, its subtree has expected size 1 + 2*19.01 = 39.02. Question 6 Dual data structures. Assume data are given with two keys, key1 and key2. Describe a data structure that will allow in O(log n) expected time to perform the operations search-by-key1, search-by-key2, insert, delete-by-key1, delete-by-key2. Solution For one key, one would use an ordinary binary search tree. Thus, it makes sense to use two binary search trees. We do not wish to waste space by repeating key values, so each cell will have the following fields: ``` cell x: key1[x], key2[x] : the key values left1[x], right1[x], p1[x] : pointers for tree 1 color1[x] : red-black color for tree 1 left2[x], right2[x], p2[x] : pointers for tree 2 color2[x] : red-black color for tree 2 ``` To search is trivial. To insert, first perform an insert in tree 1. This will fix the values of left1[x], right1[x], color1[x] and p1[x] for all x. Then do the same for tree 2.