## Fourth assignment--CS251--Winter 2000

 Question 1 Red-black trees. In class, we saw a rough algorithm for insert. Using the fields color[.], rank[.], left[.], right[.], key[.] and p[.], give the full algorithm, rotation excluded (so, you may assume that rotation works fine). Solution Assume that all external nodes are non-existent and thus correspond to nil pointers. Let k be the key to be inserted in the tree with root r. First we find the parent y of the new node x. ``` create new node x key[x]=k left[x]=right[x]=nil rank[x]=1 color[x]=red y=r if k < key[y] then z=left[y] else z=right[y] while z != nil do: y = z if k < key[y] then z=left[y] else z=right[y] (y is now the parent) p[x]=y if k < key[y] then left[y]=x else right[y]=x ``` Next we iterate two levels at a time as long as the parent and uncle of x are red. Without loss of generality we will assume that the color of the root is always black. The procedure uncle(x) is obvious: ``` if p[x]=nil then return nil (x is the root) else if p[p[x]]=nil then return nil (p[x] is root) else if left[p[p[x]]]=p[x] then return right[p[p[x]]] else return left[p[p[x]]] ``` With this in hand, we tackle the algorithm: ``` while uncle(x) != nil and color[p[x]]=red and color[uncle(x)]=red do color[p[x]]=black color[uncle(x)]=black x=p[p[x]] color[x]=red rank[x]+=1 if uncle(x) != nil and color[p[x]]=red and color[uncle(x)]=black do ROTATION! ``` Question 2 Data structure. Suggest a data structure for data with two keys, x and y, in which we can implement the following operations in O(log n) worst-case time: ``` makenull search (k) ; k is an x-key searched for insert(x,y) ; insert a node with keys x and y delete(u) ; delete node pointed to by u searchmax(a,b) ; find the maximum y-key among all nodes with x-key between a and b ``` Solution One possibility is a red-black tree organized on the x-keys. This will guarantee that the height is O(log n). As augmented fields we have: ``` y-key[u] max[u]: pointer to the node with the maximal y-key in the subtree of u ``` It is trivial to perform search, delete and insert in O(log n) time, following the algorithms for red-black trees. We need only point out how max[.] is updated. But when a key is inserted, max[.] can only change for the keys on the insertion path, and can thus be updated on the fly. Similarly. the red-black tree rotations cause no problems. The tricky part is searchmax(a,b). First we find the deepest node u on the search paths for both a and b. This was done in the tutorial and won't be repeated here. Clearly, its key is between a and b. If u does not exist, then searchmax(a,b) returns nil. If u exists, we proceed as follows: ``` x=u MAXKEY=y-key[x] while x!=nil do: if key[x] > a then x=left[x] else x=right[x] if x!=nil and key[x] in [a,b] then MAXKEY = max (y-key[x], y-key[max[right[x]]], MAXKEY) x=u while x!=nil do: if key[x] < b then x=right[x] else x=left[x] if x!=nil and key[x] in [a,b] then MAXKEY = max (y-key[x], y-key[max[left[x]]], MAXKEY) return MAXKEY ``` Question 3 Red-black tree with lazy delete. Consider a red-black tree with lazy-delete, that is, when 50% of the nodes are marked "deleted", the tree is junked and a new red-black tree is constructed from scratch from the non-marked nodes. Assume that the tree initially has n nodes, and that we perform n/2 operations from the set "search, insert, delete". With the standard red-black tree implementation, each operation would take time bounded by C log (3n/2) as the tree can have at most 3n/2 elements. The total time bound would be C n log (3n/2) = O(n log n). With lazy delete thrown in, we would like to get O(n log n) as well. Prove that. Then prove that it is still O(n log n) when we start from an empty tree and perform n arbitrary operations from the set. Hint for part 2: think about the cost of the rebuild operation in terms of the time since the last rebuild. Solution First of all, if a tree with n nodes of which 50% are marked deleted is given, then we can make a new red-black tree consisting only of the good (non-deleted) nodes in time about n. To do so, first list the nodes in order (in linear time), omitting marked nodes of course. Then make a binary search tree of minimal height h = floor(log_2 n) in linear time (as we saw in class). In this tree, all levels except the last one are full. Color all the nodes of the full levels black. Make all the nodes in the last level red, and you have a red-black tree. The time for this rebuilding is equal to n, say. Consider the first part, in which we can clearly perform at most one rebuild operation in the given n/2 iterations. In that case, we have at most C (n/2) log (3n/2) cost for the standard operations. To this, we add a time cost of n for a rebuild at the end (because a rebuild can only occur if all n/2 operations were lazy-deletes). For part 2, we will divide the time cost of a rebuild of a tree of size n over the n/2 previous iterations (none of which were rebuilds, by the way), giving a cost of 2 to each iteration. This is just another way of accounting. In this manner, each iteration "costs" 2 plus the real time it takes for whatever we want to do, the total being less than ``` 2 + C log (n) ``` as the total tree size is at most n at any given time. Thus, summed over all iterations, the time is bounded by ``` n( 2 + C log (n) ) = O(n log n). ```