Lectures in COMP 252--Winter 2025


CLR refers to Cormen, Leiserson, Rivest and Stein: "Introduction to Algorithms (Third Edition)".

Lecture 1 (Jan 7)

General introduction. Definition of an algorithm. Examples of simple computer models: the Turing model, the RAM model (or uniform cost model), the bit model (or logarithmic cost model), pointer-based machines, and the comparison model. Definition and examples of oracles (not done explicitly). The Fibonacci example: introducing recursion, iteration (the array method), and fast exponentiation. Worst-case time complexity. Landau symbols: O, Omega, Theta, small o, small omega. Examples of many complexity classes (linear, quadratic, polynomial, exponential).
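
For concreteness, here is a small Python sketch (the function names are mine) of the fast exponentiation method applied to Fibonacci numbers: F(n) is read off the n-th power of a fixed 2x2 matrix, computed with O(log n) multiplications in the RAM model, whether or not n is a power of 2.

    def mat_mult(a, b):
        # 2x2 matrix product (unit-cost arithmetic in the RAM model).
        return ((a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]),
                (a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]))

    def mat_pow(m, n):
        # Fast exponentiation by repeated squaring: O(log n) multiplications.
        result = ((1, 0), (0, 1))          # 2x2 identity
        while n > 0:
            if n & 1:                      # odd exponent: multiply once
                result = mat_mult(result, m)
            m = mat_mult(m, m)             # square
            n >>= 1
        return result

    def fib(n):
        # F(n) is the top-right entry of [[1, 1], [1, 0]]^n.
        return mat_pow(((1, 1), (1, 0)), n)[0][1]

    assert [fib(i) for i in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]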

Exercises mentioned during the lecture:

  • Show that for the fast exponentiation method in the RAM model, T_n = O(log(n)) even if n is not a power of 2.
  • Show that (log n)^1000 = o(n^0.001), n^1000=o(1.001^n), and 1000^n = o(n!).

Exercise to chew on, but not mentioned in class:

  • Determine in Big Theta notation the bit model time complexities of the recursion method, the array method, and the fast exponentiation method. In class, we only did the RAM model complexities.

Lecture 2 (Jan 9)

Introduction to the divide-and-conquer principle. Fast exponentiation revisited. Introduction to recurrences. The divide-and-conquer solution to the chip testing problem in CLR. Divide-and-conquer (Karatsuba) multiplication. Mergesort, convex hulls in the plane. We covered all of chapter 2 and part of chapter 3 of CLR.
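
A small Python sketch of Karatsuba multiplication (splitting by decimal digits for readability, and assuming nonnegative integers): three half-size products instead of four give T(n) = 3T(n/2) + O(n), i.e. O(n^{log_2 3}).

    def karatsuba(x, y):
        # Divide-and-conquer integer multiplication.
        if x < 10 or y < 10:
            return x * y                       # one-digit base case
        m = max(len(str(x)), len(str(y))) // 2
        p = 10 ** m
        xh, xl = divmod(x, p)                  # split x = xh * p + xl
        yh, yl = divmod(y, p)
        a = karatsuba(xh, yh)                  # high product
        b = karatsuba(xl, yl)                  # low product
        c = karatsuba(xh + xl, yh + yl) - a - b  # cross terms via one product
        return a * p * p + c * p + b

    assert karatsuba(1234, 5678) == 1234 * 5678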

Exercises mentioned during the lecture:

  • Describe in algorithmic form the linear time upper hull merge algorithm described in class.

Lecture 3 (Jan 14)

Binary search using a ternary oracle (with analysis by the substitution method). Strassen matrix multiplication. The master theorem (statement only). Recursion tree to explain the master theorem. Ham-sandwich and pancake theorems and halfspace counting.
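
A Python sketch of binary search with a ternary oracle, assuming a sorted array: each probe is one oracle call answering "<", "=" or ">", and equality lets the search stop early.

    def ternary_oracle_search(a, key):
        # Worst case: ceil(log_2(n + 1)) probes on an array of length n.
        lo, hi = 0, len(a) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if key == a[mid]:
                return mid          # oracle answered "="
            elif key < a[mid]:
                hi = mid - 1        # oracle answered "<"
            else:
                lo = mid + 1        # oracle answered ">"
        return -1                   # key is absent

    assert ternary_oracle_search([2, 3, 5, 7, 11], 7) == 3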

Lecture 4 (Jan 16)

Linear time selection: finding the k-th smallest by the algorithm of Blum, Floyd, Pratt, Rivest and Tarjan. O(n) complexity proof based on induction. Linear expected time selection by Hoare's algorithm "FIND". The induction method illustrated on mergesort. Euclid's algorithm for the greatest common divisor.
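
A Python sketch of Hoare's FIND. The three-way partition via list comprehensions is my simplification of the in-place version, but the analysis is the same: a random pivot gives expected linear time, while the worst case is quadratic (unlike the O(n) median-of-medians algorithm of Blum, Floyd, Pratt, Rivest and Tarjan).

    import random

    def find(a, k):
        # Return the k-th smallest element of a (k = 1, ..., len(a)).
        pivot = random.choice(a)
        smaller = [x for x in a if x < pivot]
        equal = [x for x in a if x == pivot]
        if k <= len(smaller):
            return find(smaller, k)
        if k <= len(smaller) + len(equal):
            return pivot
        larger = [x for x in a if x > pivot]
        return find(larger, k - len(smaller) - len(equal))

    assert find([9, 1, 8, 2, 7, 3], 4) == 7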

Exercises mentioned during the lecture:

  • For the binary search algorithm with ternary oracle, show by induction on n that the worst-case number of comparisons T_n satisfies
    T_n ≤ ⌈log_2 (n+1)⌉
    for all n.
  • For the binary search algorithm with binary oracle, show by induction on n that the worst-case number of comparisons T_n satisfies
    T_n ≤ 2 + ⌈log_2 (n)⌉
    for all n. Convince yourself that T_1 = 2, and that
    T_n = 1 + T_{⌊(1+n)/2⌋}.
  • Show that the bit complexity of Euclid's algorithm for gcd(n, m) is O(log n · log m).

Lecture 5 (Jan 21)

Assignment 1 due at 4 pm on myCourses. Assignment 2 handed out (on myCourses). Due January 30, 2025, 4 pm. The Fast Fourier Transform. Multiplying polynomials in time O(n log n).
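
A Python sketch of the method (a textbook radix-2 recursion; the posted notes may organize it differently): evaluate both polynomials at the n-th roots of unity, multiply pointwise, and interpolate with the inverse FFT.

    import cmath

    def fft(a, invert=False):
        # Recursive radix-2 FFT; len(a) must be a power of two.
        n = len(a)
        if n == 1:
            return a[:]
        even = fft(a[0::2], invert)
        odd = fft(a[1::2], invert)
        sign = -1 if invert else 1
        out = [0] * n
        for k in range(n // 2):
            w = cmath.exp(sign * 2j * cmath.pi * k / n)
            out[k] = even[k] + w * odd[k]
            out[k + n // 2] = even[k] - w * odd[k]
        return out

    def poly_mult(p, q):
        # Multiply coefficient lists in O(n log n).
        n = 1
        while n < len(p) + len(q) - 1:
            n *= 2                             # pad to a power of two
        fp = fft(p + [0] * (n - len(p)))
        fq = fft(q + [0] * (n - len(q)))
        prod = fft([x * y for x, y in zip(fp, fq)], invert=True)
        return [round(c.real / n) for c in prod]   # divide by n to invert

    # (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
    assert poly_mult([1, 2], [3, 4])[:3] == [3, 10, 8]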

Resources:

  • CLR, chapter 30.
  • Printed notes posted on myCourses.

Lecture 6 (Jan 23)

Dynamic programming. Computing binomial coefficients and partition numbers. The traveling salesman problem. The knapsack problem. The longest common subsequence problem.
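
As one concrete instance of the paradigm, a Python sketch of the longest common subsequence recurrence, with the table L[i][j] (LCS length of the prefixes x[:i] and y[:j]) filled in O(|x||y|) time:

    def lcs(x, y):
        m, n = len(x), len(y)
        L = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if x[i - 1] == y[j - 1]:
                    L[i][j] = L[i - 1][j - 1] + 1   # extend a common subsequence
                else:
                    L[i][j] = max(L[i - 1][j], L[i][j - 1])
        return L[m][n]

    assert lcs("ABCBDAB", "BDCABA") == 4   # e.g. "BCAB"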

Exercises mentioned in class:

  • Modify the knapsack algorithm for the case that we have an unlimited supply of items of the given sizes.
  • Can you count the number of knapsack solutions by dynamic programming?
  • Can you count the number of ways of paying a bill of n dollars with coins of values c_1 < c_2 < ... < c_k? Here the input is n, k, c_1,...,c_k.
  • In the shortest path problem, formulate the sub-solutions needed so that, when asked, one can produce the shortest path sequence from an arbitrary node j to node 1 in time O(n), where n is the number of nodes. Explain how to modify the TSP algorithm seen in class in the same way.

Lecture 7 (Jan 28)

Lower bounds. Oracles. Viewing algorithms as decision trees. Proof of lower bound for worst-case complexity based on decision trees. Examples include sorting and searching. Lower bounds for board games. Finding the median of 5 with 6 comparisons. Merging two sorted arrays using finger lists. Lower bounds via witnesses: finding the maximum.

Practice: What is meant by the statement that for an algorithm A, the worst-case time T_n(A) is Ω(n^2)? And what is meant when we say that for a given problem, the lower bound is Ω(n^7)?

Lecture 8 (Jan 30)

Assignment 2 due at 4 pm. Assignment 3 handed out. It is due on February 9 at 10 pm. Lower bounds via adversaries. Guessing a password. The decision tree method as a special case of the method of adversaries. Finding the maximum and the minimum. Finding the median. Finding the two largest elements.

An exercise and a question asked in class:

  • We mentioned that the height of a complete binary tree on n leaves is ⌈log_2 n⌉. Prove this. And show that for any binary tree, this is a lower bound.
  • Regarding Andy Yao's proof: a student asked whether the items eaten by the eventual winner had to be somewhere in a witness tree *after* they were eaten, i.e., after their weight was reduced to zero. Convince yourself that this is indeed the case (and that, in fact, it is at the core of Yao's argument).

Lecture 9 (Feb 4)

Introductory remarks about ADTs (abstract data types). Primitive ADTs such as stacks, queues, deques, and lists. Classification and use of linked lists, including representations of large integers and polynomials, window managers, and available space lists. Uses of stacks and queues in recursions and as scratchpads for organizing future work. Introduction to trees: free trees, ordered trees, position trees, binary trees. Bijection between ordered trees (forests) and binary trees via oldest child and next sibling pointers. Relationship between the number of leaves and the number of internal nodes with two children in a binary tree. Complete binary trees: implicit storage, height.
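
A tiny Python sketch of the implicit (array) storage of a complete binary tree mentioned at the end; the k-ary generalization is the first exercise below.

    def parent(i):
        # Root at index 0; node i has parent (i - 1) // 2.
        return (i - 1) // 2 if i > 0 else None

    def children(i, n):
        # Children of node i sit at 2i + 1 and 2i + 2, if they exist.
        return [c for c in (2 * i + 1, 2 * i + 2) if c < n]

    assert parent(0) is None and parent(4) == 1
    assert children(1, 6) == [3, 4] and children(2, 6) == [5]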

Exercises mentioned in class:

  • We saw how to navigate a complete binary tree using the positions of the nodes in an array. Generalize the navigation rules by computing, for a given node i in a complete k-ary tree with n nodes, the parent, the oldest child, and the youngest child (and indicate when any of these do not exist).
  • What is the height of a complete k-ary tree on n nodes?
  • Prove by induction that in a binary tree, the number of leaves is equal to the number of nodes with two children plus one.
  • Write an O(n) time program for creating the oldest-child-next-sibling binary tree seen in class from a given ordered tree in which the children of a node are stored from oldest to youngest in a linked list. Conversely, from a binary tree in which the root only has a left child, recreate in O(n) time the corresponding ordered tree.

Lecture 10 (Feb 6)

Assignment 3 due on February 9 at 10 pm. Continuing with trees. Depth, height, external nodes, leaves, preorder numbers, postorder numbers. Preorder, postorder, inorder and level order listings and traversals. Recursive and nonrecursive traversals. Stackless traversal if parent pointers are available. Representing and storing trees: binary trees require about 2n bits. Counting binary trees (Catalan's formula) via counting Dyck paths. Expression trees, infix, postfix (reverse Polish), and prefix (Polish) notation.
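
A Python sketch of a nonrecursive inorder traversal, with an explicit stack playing the role of the recursion (the node fields are my naming):

    class Node:
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right

    def inorder(root):
        out, stack, node = [], [], root
        while stack or node is not None:
            while node is not None:      # descend as far left as possible
                stack.append(node)
                node = node.left
            node = stack.pop()           # visit the node
            out.append(node.key)
            node = node.right            # then handle its right subtree
        return out

    # On a search tree, inorder lists the keys in sorted order.
    assert inorder(Node(2, Node(1), Node(3))) == [1, 2, 3]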

Exercises mentioned in class:

  • Write linear time algorithms to convert between a binary tree and any of the 2n-bit representations of binary trees.
  • Write an O(n) algorithm that computes for all nodes in an ordered tree of size n the sizes of their subtrees (and stores those sizes in the corresponding cells).
  • How many bits are needed to store the shape of a binary expression tree on n nodes?
  • How many bits are needed to store the shape of a rooted ordered tree on n nodes?

Lecture 11 (Feb 11)

Introduction to the ADT Dictionary. Definition of a binary search tree. Standard operations on binary search trees, all taking time O(h), where h is the height. Generic tree sort. The browsing operations. Definition of red-black trees. Red-black trees viewed as 2-3-4 trees.
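
A Python sketch of two standard operations; both walk a single root-to-leaf path, hence the O(h) bound (duplicate keys are simply ignored in this version):

    class Node:
        def __init__(self, key):
            self.key, self.left, self.right = key, None, None

    def insert(root, key):
        # Returns the (possibly new) root of the subtree.
        if root is None:
            return Node(key)
        if key < root.key:
            root.left = insert(root.left, key)
        elif key > root.key:
            root.right = insert(root.right, key)
        return root

    def search(root, key):
        while root is not None and root.key != key:
            root = root.left if key < root.key else root.right
        return root                     # None when the key is absent

    root = None
    for k in [5, 2, 8, 1]:
        root = insert(root, k)
    assert search(root, 8) is not None and search(root, 4) is None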

Exercises mentioned in class:

  • Show that √n sorted lists, each of length √n, can be merged into one sorted list of length n using only about (1/2) n log_2 n comparisons. Furthermore, show that an unsorted list of n items can be turned into √n sorted lists, each of length √n, using only about (1/2) n log_2 n comparisons.
  • Show by induction on h (the height) that h = O(log n) for AVL trees.

Exercise not mentioned in class:

  • We are given an arbitrary binary tree representing an electrical network. Within each node, we store the resistances of the left and right edges emanating from the node (if they exist). If one or both of those edges are missing, treat their resistances as zero. Imagine connecting all leaves to the ground (run a wire through all of them). Your task is to write a linear time algorithm that computes the resistance between the root and the ground.

Lecture 12 (Feb 13)

Relationships between rank, height and number of nodes. Insert operation on red-black trees. Standard deletion is not covered in class. Lazy delete. Split and join in red-black trees. Application to list implementations (concatenate and sublist in O(log n) time). Definition of HB-k, AVL and 2-3 trees.

Exercises mentioned in class:

  • Write a simple recursive algorithm to construct a red-black tree from a sorted array of n elements. Your algorithm should take O(n) RAM time and should not use any key comparisons.
  • Show by induction that for a red-black tree, r ≤ h ≤ 2r, where r is the rank of the root and h is the height of the tree.
  • Convince yourself that the split operation can be implemented in O(log n) time.

Lecture 13 (Feb 14)

Assignment 4 handed out, due February 27, 9 pm. Augmented data structures. Examples:

  • Data that participate in several data structures.
  • Combining search trees and sorted linked lists.
  • Order statistics trees. Operations rank and selection (see the sketch after this list).
  • Interval trees. VLSI chip layout application. Sweepline paradigm.

Not done this year:

  • k-d trees. Range search, partial match search.
  • Quadtrees.
  • Definition of BSP (binary space partition) trees. Painter's algorithm.
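
A Python sketch of selection in an order statistics tree (my own illustration): every node is augmented with the size of its subtree, so the k-th smallest key is found in O(h) time.

    class OSNode:
        # A binary search tree node augmented with its subtree size.
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right
            self.size = 1 + (left.size if left else 0) \
                          + (right.size if right else 0)

    def select(node, k):
        # Return the k-th smallest key (k = 1, ..., node.size).
        r = (node.left.size if node.left else 0) + 1   # rank of node
        if k == r:
            return node.key
        if k < r:
            return select(node.left, k)
        return select(node.right, k - r)

    t = OSNode(4, OSNode(3, OSNode(1)), OSNode(7))     # keys 1, 3, 4, 7
    assert [select(t, k) for k in (1, 2, 3, 4)] == [1, 3, 4, 7]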

Lecture 14 (Feb 18)

Midterm examination, McMed 504: 4:30-6 pm.

Lecture 15 (Feb 20)

Cartesian trees. Random binary search trees. Treaps. Analysis of the expected time of the main operations. Quicksort revisited.
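
A Python sketch of treap insertion: an ordinary binary search tree insert, followed by rotations that restore the (max-)heap order on the random priorities. The resulting tree is distributed like a random binary search tree, so the expected depth is O(log n).

    import random

    class TreapNode:
        def __init__(self, key):
            self.key = key
            self.priority = random.random()   # random priority
            self.left = self.right = None

    def rotate_right(y):
        x = y.left
        y.left, x.right = x.right, y          # x becomes the new subtree root
        return x

    def rotate_left(x):
        y = x.right
        x.right, y.left = y.left, x
        return y

    def treap_insert(root, key):
        if root is None:
            return TreapNode(key)
        if key < root.key:
            root.left = treap_insert(root.left, key)
            if root.left.priority > root.priority:
                root = rotate_right(root)     # fix the heap order
        else:
            root.right = treap_insert(root.right, key)
            if root.right.priority > root.priority:
                root = rotate_left(root)
        return root

    root = None
    for k in [3, 1, 2, 5, 4]:
        root = treap_insert(root, k)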

Resources:

  • CLR, section 12.4 (for random binary search trees).
  • CLR, chapter 7 (for quicksort).
  • Scribed notes.

Lecture 16 (Feb 25)

Hashing. Direct addressing. Hashing with chaining. Open addressing. Bloom filters.
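
A Python sketch of a Bloom filter: m bits and k hash functions, with possible false positives but no false negatives. The salted SHA-256 hashes are my choice for illustration, not necessarily the construction from class.

    import hashlib

    class BloomFilter:
        def __init__(self, m, k):
            self.m, self.k = m, k
            self.bits = [False] * m

        def _positions(self, item):
            # k bit positions derived from k salted hashes of the item.
            for i in range(self.k):
                h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
                yield int(h, 16) % self.m

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos] = True

        def might_contain(self, item):
            # False positives possible; false negatives impossible.
            return all(self.bits[pos] for pos in self._positions(item))

    bf = BloomFilter(m=1024, k=3)
    bf.add("comp252")
    assert bf.might_contain("comp252")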

Resources:

  • CLR, chapter 11 (hash tables).
  • Scribed notes.

We did not cover dynamic hashing, radix sort, or bucket sort.

Lecture 17 (Feb 27)

Assignment 4 due at 2:37 am on March 1, 2025. Priority queues (as abstract data types). Uses, such as in discrete event simulation and operating systems. Binary heaps. Heapsort. Pointer-based implementation of binary heaps. Tournament trees. k-ary heaps.
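
A Python sketch of a binary heap in its implicit array form, used for heapsort; sift_down does the work both in the O(n) heap construction and in each extraction.

    def sift_down(a, i, n):
        # Restore the max-heap property at index i within a[0:n];
        # the children of i sit at 2i + 1 and 2i + 2.
        while True:
            largest, l, r = i, 2 * i + 1, 2 * i + 2
            if l < n and a[l] > a[largest]:
                largest = l
            if r < n and a[r] > a[largest]:
                largest = r
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    def heapsort(a):
        n = len(a)
        for i in range(n // 2 - 1, -1, -1):   # build the heap in O(n)
            sift_down(a, i, n)
        for end in range(n - 1, 0, -1):       # extract the maximum n - 1 times
            a[0], a[end] = a[end], a[0]
            sift_down(a, 0, end)

    v = [3, 1, 4, 1, 5, 9, 2, 6]
    heapsort(v)
    assert v == [1, 1, 2, 3, 4, 5, 6, 9]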

Suggested exercise: in class, we implemented a lazy deletemin in a tournament tree. How would you implement a true deletemin?

Lecture 18 (Mar 11)

Assignment 5 posted on myCourses on March 2, 2025. It is due on March 15, at 1:11 am. Amortization. Potential functions. Examples:

  • Stack with multipop (see the sketch after this list).
  • Lazy delete in red-black trees.
  • Binary addition.
  • Splay trees.
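
A Python sketch of the first example (my own illustration). With the potential Φ = number of stacked items, a push has amortized cost 2 and a multipop has amortized cost 0, so any sequence of n operations costs O(n) in total.

    class MultipopStack:
        def __init__(self):
            self.items = []

        def push(self, x):
            # Actual cost 1, potential rises by 1: amortized cost 2.
            self.items.append(x)

        def multipop(self, k):
            # Actual cost min(k, size); the potential drops by the same
            # amount, so the amortized cost is 0.
            popped = []
            while k > 0 and self.items:
                popped.append(self.items.pop())
                k -= 1
            return popped

    s = MultipopStack()
    for i in range(5):
        s.push(i)
    assert s.multipop(3) == [4, 3, 2]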

Exercise suggested in class: if one wants to keep the load factor of a hash table with chaining between 1/2 and 2, one may occasionally rehash so that the load factor is exactly one after rehashing. Show that the expected amortized complexity of insert, delete, search, and rehashing is O(1). The key is to invent an appropriate potential function.

Lecture 19 (Mar 13)

Stringology. Data structures for strings and compression. Tries, PATRICIA trees, digital search trees. Suffix tries, suffix trees, suffix arrays. String searching in a large file via the Knuth-Morris-Pratt algorithm.
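
A Python sketch of Knuth-Morris-Pratt: precompute the failure (prefix) function, then scan the text once without ever moving backward in it, for O(n + m) total time. Note the single line after a full match that makes it report all occurrences (cf. the last exercise below).

    def failure(p):
        # f[i] = length of the longest proper prefix of p[:i+1]
        # that is also a suffix of it.
        f, k = [0] * len(p), 0
        for i in range(1, len(p)):
            while k > 0 and p[i] != p[k]:
                k = f[k - 1]
            if p[i] == p[k]:
                k += 1
            f[i] = k
        return f

    def kmp_search(text, p):
        f, k, hits = failure(p), 0, []
        for i, c in enumerate(text):
            while k > 0 and c != p[k]:
                k = f[k - 1]              # fall back, never rescan the text
            if c == p[k]:
                k += 1
            if k == len(p):
                hits.append(i - len(p) + 1)
                k = f[k - 1]              # continue: report every match
        return hits

    assert kmp_search("abababa", "aba") == [0, 2, 4]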

Resources:

  • CLR, section 32.4 (for the Knuth-Morris-Pratt algorithm).
  • Scribed notes.

Exercises:

  • Show that for a k-ary PATRICIA tree on n strings, and for a suffix tree on a text of n symbols, the number of nodes in the tree is at most 2n.
  • Give an algorithm for insertion into a trie.
  • Add one line to the KMP algorithm to make it return all pattern matches, not just one.

Lecture 20 (Mar 18)

Assignment 6 handed out. It is due on March 28 at 11:59 pm. Introduction to compression and coding. Fixed and variable width codes. Prefix codes. Huffman trees. Huffman code. An O(n log n) algorithm for finding the Huffman tree. Definition of a greedy algorithm. Data structures for Huffman coding. Merging sorted lists by the greedy method.
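
A Python sketch of the greedy Huffman construction driven by a binary heap; representing a subtree by its list of (symbol, partial codeword) pairs is my shortcut, and a pointer-based Huffman tree works equally well.

    import heapq
    from itertools import count

    def huffman(weights):
        # weights: dict symbol -> weight (at least two symbols assumed).
        # Returns dict symbol -> codeword. n - 1 merges, O(n log n) total.
        tie = count()   # tie-breaker so the heap never compares payloads
        heap = [(w, next(tie), [(s, "")]) for s, w in weights.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            w1, _, t1 = heapq.heappop(heap)   # the two lightest subtrees
            w2, _, t2 = heapq.heappop(heap)
            merged = ([(s, "0" + c) for s, c in t1] +
                      [(s, "1" + c) for s, c in t2])
            heapq.heappush(heap, (w1 + w2, next(tie), merged))
        return dict(heap[0][2])

    code = huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
    words = code.values()
    # The prefix property: no codeword is a prefix of another.
    assert all(not v.startswith(w) for v in words for w in words if v != w)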

Resources:

  • Scribed notes: introduction to information theory.
  • CLR, section 16.2 (for prefix and Huffman codes).

Exercises:

  • Design an algorithm that outputs all codewords given a prefix coding tree.
  • Show by example that there is a set of symbol weights for which the table of codewords for the Huffman code takes Ω(n^2) space.
  • Why is the expected length of a Huffman codeword for an n-symbol alphabet at most ⌈log_2 n⌉?

Lecture 21 (Mar 20)

Definition of entropy and Shannon's result on best possible compression ratios. Entropy of an input source. Kraft's inequality. Proof of Shannon's theorem. The Shannon-Fano code. Lempel-Ziv compression. Data structures for Lempel-Ziv compression. Digital search tree.
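
A Python sketch of an LZ78-style Lempel-Ziv parse (my own illustration): each phrase is a previously seen phrase extended by one symbol, and the phrase dictionary below is exactly what the digital search tree implements.

    def lz_compress(text):
        dictionary = {"": 0}              # phrase -> index
        phrase, out = "", []
        for ch in text:
            if phrase + ch in dictionary:
                phrase += ch              # keep extending the phrase
            else:
                out.append((dictionary[phrase], ch))
                dictionary[phrase + ch] = len(dictionary)
                phrase = ""
        if phrase:                        # flush the final, known phrase
            out.append((dictionary[phrase[:-1]], phrase[-1]))
        return out

    def lz_decompress(pairs):
        phrases = [""]
        for idx, ch in pairs:
            phrases.append(phrases[idx] + ch)
        return "".join(phrases[1:])

    s = "abababababa"
    assert lz_decompress(lz_compress(s)) == s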

Exercises:

  • Prove Kraft's inequality by induction on the height.
  • Prove Kraft's inequality for infinite trees by induction on the number of leaves. (Hard)
  • Write the full one-pass linear time algorithms for Lempel-Ziv compression and decoding.

Lecture 22 (Mar 21)

Introduction to graphs: notation, definitions, adjacency matrix, adjacency lists. Graph traversals as the fundamental procedures for most graph algorithms: depth first search. DFS forest and DFS properties. Classification of edges. Detecting cycles.
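
A Python sketch of cycle detection by DFS using the white/gray/black classification of vertices: a back edge, i.e. an edge into a gray (still active) vertex, witnesses a cycle.

    def has_cycle(adj):
        # adj: dict vertex -> list of out-neighbours (a directed graph).
        WHITE, GRAY, BLACK = 0, 1, 2
        color = {v: WHITE for v in adj}

        def visit(u):
            color[u] = GRAY
            for v in adj[u]:
                if color[v] == GRAY:      # back edge: cycle found
                    return True
                if color[v] == WHITE and visit(v):
                    return True
            color[u] = BLACK              # u is fully explored
            return False

        return any(color[v] == WHITE and visit(v) for v in adj)

    assert has_cycle({1: [2], 2: [3], 3: [1]})
    assert not has_cycle({1: [2, 3], 2: [3], 3: []})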

Resources:

  • CLR, sections 22.1, 22.2 and 22.3.
  • Scribed notes on graph algorithms.

Exercises:

  • Show that one can order all adjacency lists in O(|V|+|E|) time.
  • What is the worst-case time of the DFS algorithm for detecting a cycle that we saw in class?

Lecture 23 (Mar 25)

Euler tour. Breadth first search. BFS tree. Shortest path problem. Shortest path properties. Dijkstra's greedy algorithm for shortest paths.
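
A Python sketch of Dijkstra's algorithm with a binary heap; stale heap entries are skipped on arrival rather than decreased in place (a common implementation choice, not necessarily the one from class). With nonnegative weights this runs in O((|V| + |E|) log |V|).

    import heapq

    def dijkstra(adj, source):
        # adj: dict u -> list of (v, weight); returns shortest distances.
        dist = {source: 0}
        heap = [(0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue                  # stale entry: u already settled
            for v, w in adj[u]:
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w       # relax the edge (u, v)
                    heapq.heappush(heap, (dist[v], v))
        return dist

    g = {"s": [("a", 2), ("b", 5)], "a": [("b", 1)], "b": []}
    assert dijkstra(g, "s") == {"s": 0, "a": 2, "b": 3}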

Resources:

  • CLR, section 22.2.
  • Scribed notes on graph algorithms.
  • CLR, section 24.3 (Dijkstra's algorithm for shortest paths).

Exercises:

  • Modify the BFS algorithm to decide if a given graph is bipartite.
  • Suggest a data structure for managing the adjacency lists in the Euler tour algorithm.

Lecture 24 (Mar 27)

Minimal spanning tree. Main properties. Greedy approach for the MST: the Prim-Dijkstra algorithm. Application in hierarchical clustering. Kruskal's algorithm. A two-approximation for the traveling salesman problem. The all-pairs shortest path algorithm of Floyd and Warshall. Transitive closure of a graph and matrix multiplication.
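
A Python sketch of Kruskal's algorithm with a simple union-find (path compression only; union by rank is omitted for brevity): scan the edges by increasing weight and keep exactly those that join two different components.

    def kruskal(n, edges):
        # Vertices 0..n-1; edges is a list of (weight, u, v) triples.
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path compression
                x = parent[x]
            return x

        mst = []
        for w, u, v in sorted(edges):           # greedy: cheapest edge first
            ru, rv = find(u), find(v)
            if ru != rv:                        # different components: keep it
                parent[ru] = rv
                mst.append((u, v, w))
        return mst

    edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2)]
    assert sum(w for _, _, w in kruskal(3, edges)) == 3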

Resources:

  • CLR, chapter 23 (minimal spanning trees).
  • CLR, section 35.2 (the traveling salesman problem).
  • CLR, section 25.1 (transitive closure of a graph).
  • CLR, section 25.2 (the Floyd-Warshall algorithm).
  • Scribed notes on MST and shortest paths.

Exercises:

  • How would you find a spanning tree that minimizes the maximal weight (not the sum of the weights)?
  • Discuss the choice of priority queue in Dijkstra's algorithms for shortest path and MST when all edge weights are integers between 1 and 10. What is the overall complexity?

Lecture 25 (Apr 8)

Directed acyclic graphs (or: dags). Topological sorting. Activity (PERT) networks. NIM games. Strongly connected components in directed graphs.
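
A Python sketch of topological sorting by repeated removal of in-degree-zero vertices (Kahn's algorithm; a DFS-based variant works as well):

    from collections import deque

    def topological_sort(adj):
        # adj: dict vertex -> list of out-neighbours of a dag.
        # Runs in O(|V| + |E|); an output shorter than |V| means a cycle.
        indeg = {v: 0 for v in adj}
        for u in adj:
            for v in adj[u]:
                indeg[v] += 1
        queue = deque(v for v in adj if indeg[v] == 0)
        order = []
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                indeg[v] -= 1
                if indeg[v] == 0:         # all predecessors of v are placed
                    queue.append(v)
        return order

    assert topological_sort({1: [2, 3], 2: [4], 3: [4], 4: []}) == [1, 2, 3, 4]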

Resources:

  • Scribed notes.
  • CLR, section 22.5 (strongly connected components).

Exercises:

  • Add a few lines to the algorithms for NIM games and PERT networks to compute for each node the perfect move (in a NIM game) and the critical path (in a PERT network), respectively.
  • How would you determine if a dag contains a node u that can be reached from all roots?

Lecture 26 (Apr 10)

Network flows. The Ford-Fulkerson method. The Edmonds-Karp algorithm.
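
A Python sketch of Edmonds-Karp: the Ford-Fulkerson method in which every augmenting path is a shortest path found by BFS in the residual graph, giving O(|V| |E|^2) overall. The dictionary-of-dictionaries capacity representation is my choice.

    from collections import deque

    def edmonds_karp(capacity, s, t):
        # capacity: dict u -> dict v -> residual capacity (mutated in place).
        flow = 0
        while True:
            parent = {s: None}                 # BFS for an augmenting path
            queue = deque([s])
            while queue and t not in parent:
                u = queue.popleft()
                for v, c in capacity[u].items():
                    if c > 0 and v not in parent:
                        parent[v] = u
                        queue.append(v)
            if t not in parent:
                return flow                    # no path left: flow is maximum
            bottleneck, v = float("inf"), t    # bottleneck along the path
            while parent[v] is not None:
                bottleneck = min(bottleneck, capacity[parent[v]][v])
                v = parent[v]
            v = t                              # push flow, update residuals
            while parent[v] is not None:
                u = parent[v]
                capacity[u][v] -= bottleneck
                capacity[v][u] = capacity[v].get(u, 0) + bottleneck
                v = u
            flow += bottleneck

    cap = {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}, "t": {}}
    assert edmonds_karp(cap, "s", "t") == 4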

Resources:

  • CLR, sections 26.1 (network flows) and 26.2 (the Ford-Fulkerson method).
  • Scribed notes.