School of Computer Science

Winter 1997 Class Notes for 308-251B

Balanced search trees are trees whose height in the worst case is and for which the operations INSERT, DELETE, and SEARCH can be implemented in time. Complete binary trees, if used as search trees, cannot be updated after an insertion or deletion in time less than . They are too rigid. The key to the design of balanced search trees is the introduction of glue, or degrees of freedom. This glue takes the form of red nodes in a red-black tree, nodes with 2 children (instead of three) in 2-3 trees, and nodes with a height-imbalance of one in AVL trees.

Binary search trees (see Figure 1) work well for many applications
but they are limiting because of their bad worst-case performance (height =
O(n)).
A binary search tree with this worst-case structure is no more efficient than
a regular linked list.

There exist over one hundred types of balanced search trees. Some of the more common types are: AVL trees, B-trees, and red-black trees. B-trees are a category of tree designed to work well on magnetic disks and other direct-access secondary storage devices (Cormen et al., 1996, p:381).

red-black tree:

AVL tree:

2-3 tree:

AVL trees were invented by two computer chess specialists in Russia, Adelson-Velskii and Landis in 1962 (hence the acronym AVL). See next section for detailed information regarding AVL trees.

A B-tree of order m is a tree exhibiting the following properties:

- the root is either a leaf or has between 2 and m children
- all nonleaf nodes (except the root) have between and m children
- all leaves are at the same depth

B-trees of order 4 are known as 2-3-4 trees, and a B-tree of order 3 is known as a 2-3 tree. The accompanying Java applet demonstrates insert operations on a 2-3 tree. In fact, 2-3 trees were "invented" by J.E. Hopcroft in 1970 (Cormen et al., 1996, p:399), prior to B-trees in 1972 by Bayer and McCreight.

Red-black trees are standard binary trees that satisfy the following conditions:

- every node is either red or black
- every external node (nil) is black
- no red node can have a red parent
- every simple path from a node to an external contains the same number of black nodes

AVL trees are identical to standard binary search trees except that for
every node in an AVL tree, the height of the left and right subtrees can differ
by at most 1 (Weiss, 1993, p:108). AVL trees are HB-k trees (height balanced
trees of order k) of order HB-1. The following is the height differential
formula:

When storing an AVL tree, a field must be added to each node with one of three values: 1, 0, or -1. A value of 1 in this field means that the left subtree has a height one more than the right subtree. A value of -1 denotes the opposite. A value of 0 indicates that the heights of both subtrees are the same.

Updates of AVL trees require up to rotations, whereas updating red-black trees can be done using only one or two rotations (up to color changes). For this reason, they (AVL trees) are considered a bit obsolete by some.

Sparse AVL trees are defined as AVL trees of height h with the fewest possible nodes. Figure 3 shows sparse AVL trees of heights 0, 1, 2, and 3.

We first analyze the number of nodes in a sparse AVL tree of height **k**.

From Figure 3, we can deduce the following recurrence relation:

which looks like the Fibonacci sequence given by:

Adding **1** on both sides of the equalities we obtain:

Taking to be , we rewrite the recurrence relation as:

which is the usual Fibonacci recurrence relation.

By inspecting the Fibonacci sequence, one suspects that it grows
somehow "exponentially" i.e. that it is of the form: ,
for some suitable constants **A** and **c** that we have to find.

Replacing by in the Fibonacci recurrence relation, we get:

After cancelling **A**, dividing by on both sides, the
following quadratic equation:

is solved with the usual method, and:

The theory for * linear homogeneous recurrence relations* (cf:
* Rosen, K. "Discrete Mathematics and its Applications", third edition.*)
says that the general solution of such a recurrence relation is given by:

Note that: is an integer. Also:

since , we have fast.

We still have to find the two unknowns and . Using the initial conditions in the recurrence relation, we know that and , so the constants are extracted from:

Solving we get: and,

Hence, a sparse AVL tree of height **k** has:

If an AVL tree in general has height **k**, and **n** nodes, then:

and therefore, after trivial operations,

Note: is known under the name * golden ratio* and powers of it
are very close to be integers.

In the worst case, a 2-3 tree is a complete binary tree.
Complete binary tree of height h has:

1 + 2 + 4 + ... + 2^h

= (2^(h+1) - 1)/(2 - 1)

= 2^(h+1) - 1 nodes.

Therefore, the height of such a tree with n nodes can be computed as:

n = 2^(h+1) - 1,

2^(h+1) = n + 1,

h + 1 = (n+1),

h = (n+1) - 1 ~= . (n >>)

The best case of a 2-3 tree is a complete ternary tree.
A complete ternary tree of height h has:

1 + 3 + 9 + ... + 3^h

= (3^(h+1) - 1)/(3 - 1)

= (3^(h+1) - 1)/2 nodes.

Thus, the height of such a tree with n node can be calculated in the same
way.

n = (3^(h+1) - 1)/2,

3^(h+1) = 2n + 1,

h + 1 = (2n+1) (base 3),

h = (2n+1) - 1 ~= . (n>>)

Therefore, the height of a 2-3 tree is sandwiched between and , thus, always beats a complete binary tree which has a height ~= .

- the internal(non-terminal) nodes have two or three children

- the 2-3 tree is an empty tree
- the tree has a root node with a single data item and each of its children are 2-3 trees
- the tree has a root node with two data items and three children, each of which is a 2-3 tree

Although searching operations on a 2-3 balanced search tree are guaranteed a very good worst case, this property comes at the expense of sometimes costly insertions. Figure 6 illustrates such a situation and the Java applet below performs the insertion.

http://www.cs.oberlin.edu/classes/cs275/labs/lab4/lab45.html#_15

Introduction to AVL trees. Explanation of construction and
insertion. Rhys Price Jones, Fritz Ruehr, Richard Salter, Oberlin College.

http://www.cs.nwu.edu/academics/courses/c11/notes/balanced-trees.html

Brief survey of some of the many kinds of balanced search trees. Chris
Riesbeck, Northwestern University.

http://www.research.digital.com/SRC/zeus/redblack3D.html

3D view illustrates the mapping between a 2-3-4 tree and a red-black
tree. Marc H. Brown, Marc A. Najork, Digital Equipment Corporation.

http://huckleberry.sfsu.edu/~cs410/12-AdvancedTablesSlides.frame.html

Lecture slides covering various balanced searc tree types: 2-3 trees,
2-3-4 trees, red-black trees, AVL trees. Jack Hodges Ph.D., San Francisco State University.

http://www.crl.com/www/users/wi/windbase/avltree.htm

AVL and Threaded AVL Balanced Binary Trees with source code in C and
C++. Windbase Software Inc..

http://orange.mcs.dundee.ac.uk:8080/cs202/tuts/mosaic/trees/AVLDemo/AVLTree.html

Java Applet. Construction of AVL Tree. Stepped or continuous Animation.
Dr. Glenn W. Rowe, University of Dundee.

http://www.cise.ufl.edu/~shan/Java/TwoThree/

Java Applet. Insertion and deletion in a 2-3 Tree. Woo Han Sang,
University of Florida.

Bell, Jim. |
Author introduces self-adjusting binary trees and shows the results of testing done also with BST and AVL tree and compares the results. |

Cormen, Thomas H., Charles E. Leiserson, and Ronald L. Rivest. |
Chapter 14 covers red-black trees: their properties, rotations, insertions, and deletion. |

Horowitz, Ellis. Sahni, Sartaj. |
AVL implementation in Sparks(insersion). Time analysis of operations with comparison with sequential list and linked list. |

Gotlieb, C. |
Lists and explains different ways of achieving completeness of the tree. M-ary trees(including 2-3 Tree). Visual representaton of insertion methods. Deletion algorithm and analysis. |

Koster, C.H.A. Van der Weide, Th.P. |
Maintaining efficiency is very complex in balanced trees such as AVL Tree and 2-3 Tree. Thus we can consider such class of trees (that the author refers to as "Hairy Tree") whose insersion algoriths perform local balancing around newly inserted node, without any backtrackings of the search path. Author performs time analysis of the algorithm and it's comparison with AVL, and 2-3 Trees. |

Reingold, Edward M. |
Rebalancing, deletion alogrithms and their time analysis. |

Robert Sedgewick. |
Chapter 15 covers Balanced Trees: Top-Down 2-3-4 trees, red-black trees, and other algorithms. |

Sorenson, Trembly. |
Different search trees. Search, intersion, rebalancing, and deletion algorithms. |

Weiss, Mark Allen. |
Section 4.4 covers single and double rotation of AVL trees. |

This web page was created by:

- Martin Beaugrand (beaugrand@cs.mcgill.ca) prepared the analysis of AVL trees and produced all mathematical symbols in this document.
- Tomozumi Kanayama (BVC9@musicb.mcgill.ca) prepared the analysis of 2-3 trees and found relevant links and references.
- Marc-Andre Sauve (mas@cs.mcgill.ca) programmed the Java applet.
- Marc van Kralingen (mvankr@cs.mcgill.ca) wrote the HTML, designed the tree diagrams, and found relevant links.

The tree diagrams were designed using Pacestar Software's Edge Diagrammer, imported into JASC Inc's Paint Shop Pro 3.0 and converted into GIF format.

The mathematical graphics and text were prepared with Latex and converted to GIF format using Nick Drako's latex to html translator (Copyright © 1993, 1994).

The Java applet was compiled with Sun's Java Development Kit.

Last updated March 27, 1997 by Marc van Kralingen.

Copyright © 1997, Martin Beaugrand, Tomozumi Kanayama, Marc-Andre Sauve, and Marc van Kralingen. All rights reserved. Reproduction of all or part of this work is permitted for educational or research use provided that this copyright notice is included in any copy.