Lectures in COMP 252--Winter 2023


CLR refers to Cormen, Leiserson, Rivest and Stein: "Introduction to Algorithms (Third Edition)".

Lecture 1 (Jan 5)

General introduction. Definition of an algorithm. Examples of simple computer models: Turing model, RAM model, pointer based machines, oracles, comparison model. RAM model (or uniform cost model), bit model (or: logarithmic cost model). Definition and examples of oracles. The Fibonacci example: introduce recursion, iteration (the array method), and fast exponentiation. Worst-case time complexity. Examples of many complexity classes (linear, quadratic, polynomial, exponential). Landau symbols: O, Omega, Theta, small o, small omega.
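
The fast-exponentiation method for Fibonacci numbers can be sketched as follows (a Python sketch, not course-provided code; it uses the identity [[1,1],[1,0]]^n = [[F(n+1),F(n)],[F(n),F(n-1)]]):

```python
def fib(n):
    """Fibonacci via fast exponentiation of [[1,1],[1,0]]:
    O(log n) matrix multiplications in the RAM model."""
    def mat_mul(a, b):
        return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
                [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]
    result = [[1, 0], [0, 1]]           # identity matrix
    base = [[1, 1], [1, 0]]
    while n > 0:                        # square-and-multiply over the bits of n
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result[0][1]
```

Note that the O(log n) count of multiplications is a uniform-cost statement; in the bit model the numbers themselves grow, which is the point of the second exercise below.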

Resources:

Exercises mentioned during the lecture:

  • Show that n log n = n^{Theta(1)}.
  • Determine in Big Theta notation the bit model time complexities of the second, third and fourth methods seen in class for computing the n-th Fibonacci number.

Lecture 2 (Jan 10)

Introduction to the divide-and-conquer principle. Fast exponentiation revisited. Introduction to recurrences. The divide-and-conquer solution to the chip testing problem in CLR. Divide-and-conquer (Karatsuba) multiplication. Mergesort, convex hulls in the plane. We covered all of chapter 2 and part of chapter 3 of CLR.
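
Karatsuba's divide-and-conquer multiplication can be sketched like this (a Python sketch for nonnegative integers, not course-provided code):

```python
def karatsuba(x, y):
    """Karatsuba multiplication of nonnegative integers: three recursive
    products instead of four, giving O(n^{log_2 3}) ~ O(n^1.585) digit ops."""
    if x < 10 or y < 10:
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    p = 10 ** m
    xh, xl = divmod(x, p)               # split x = xh * 10^m + xl
    yh, yl = divmod(y, p)
    a = karatsuba(xh, yh)
    b = karatsuba(xl, yl)
    c = karatsuba(xh + xl, yh + yl) - a - b   # cross terms from one product
    return a * p * p + c * p + b
```

The recurrence is T(n) = 3T(n/2) + O(n), which solves to Theta(n^{log_2 3}).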

Resources:

Exercises mentioned during the lecture:

  • Rank the following complexities from small to large: 2^{n^2}, n, log n, n^2, 2^n, n!, (log n)^n, n^n.
  • Analyze the bit model complexity of the most efficient algorithm you can think of for computing 7^{3^n}.
  • Describe in algorithmic form the linear time upper hull merge algorithm described in class.

Lecture 3 (Jan 12)

Binary search using a ternary oracle (with analysis by the substitution method). Strassen matrix multiplication. The master theorem (statement only). Recursion tree to explain the master theorem. Ham-sandwich and pancake theorems and halfspace counting.
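
Binary search with a ternary oracle can be sketched as follows (a Python sketch, not course-provided code; each probe yields one of the three answers <, =, >):

```python
def ternary_oracle_search(a, x):
    """Binary search in a sorted array using a ternary oracle.
    Returns an index of x in a, or None. Worst case ~log_2 n probes."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if x == a[mid]:        # oracle answers "="
            return mid
        elif x < a[mid]:       # oracle answers "<": keep the left half
            hi = mid - 1
        else:                  # oracle answers ">": keep the right half
            lo = mid + 1
    return None
```

The first exercise of Lecture 4 asks for the precise worst-case bound T_n <= ceiling(log_2(n+1)), provable by the substitution method.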

Resources:

Exercises mentioned during the lecture:

  • Write the binary search algorithm using a binary oracle.
  • What is the "big theta" order of the solution of T_n = T_(n/2) + log n?

Assignment 1 handed out (on myCourses). Due January 19, 2023, 2:30pm.

Lecture 4 (Jan 17)

Linear time selection: finding the k-th smallest by the algorithm of Blum, Floyd, Pratt, Rivest and Tarjan. O(n) complexity proof based on induction. Linear expected time selection by Hoare's algorithm "FIND". Euclid's algorithm for the greatest common divisor. The induction method illustrated on mergesort.
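
Hoare's FIND can be sketched as follows (a Python sketch with a random pivot, not course-provided code; replacing the pivot by the median-of-medians of Blum, Floyd, Pratt, Rivest and Tarjan makes the O(n) bound worst-case rather than expected):

```python
import random

def find(a, k):
    """Hoare's FIND: the k-th smallest element of a (k = 1, ..., len(a)),
    in linear expected time."""
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    if k <= len(less):
        return find(less, k)
    if k <= len(less) + len(equal):
        return pivot
    greater = [x for x in a if x > pivot]
    return find(greater, k - len(less) - len(equal))
```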

Resources:

Exercises mentioned during the lecture:

  • For the binary search algorithm with ternary oracle, show by induction on n that the worst-case number of comparisons T_n satisfies T_n <= ceiling ( log_2 (n+1) ) for all n.
  • For the binary search algorithm with binary oracle, show by induction on n that the worst-case number of comparisons T_n satisfies T_n <= 2 + ceiling ( log_2 (n) ) for all n.
  • (Hard) Show that the bit complexity of gcd(n, m) is O(log n · log m).
  • Analyze T(n) = T(n/3) + T(2n/3) + n using the recursion tree.
  • Look at the median-of-3 version of the selection algorithm (instead of the median-of-5 version seen in class) and show, by using recursion trees, that its complexity is big theta of n log n.

Lecture 5 (Jan 19)

Assignment 1 due. The Fast Fourier Transform. Multiplying polynomials in time O(n log n).
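
The FFT-based polynomial multiplication can be sketched as follows (a Python sketch, not course-provided code; a recursive radix-2 FFT, so lengths are padded to a power of two):

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = -1 if invert else 1
    out = [0] * n
    for k in range(n // 2):            # butterfly: combine the two half-size DFTs
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def poly_mul(p, q):
    """Multiply coefficient vectors p and q in O(n log n):
    evaluate at roots of unity, multiply pointwise, interpolate back."""
    n = 1
    while n < len(p) + len(q) - 1:
        n *= 2
    fp = fft(p + [0] * (n - len(p)))
    fq = fft(q + [0] * (n - len(q)))
    prod = fft([x * y for x, y in zip(fp, fq)], invert=True)
    return [round((c / n).real) for c in prod[:len(p) + len(q) - 1]]
```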

Resources:

  • CLR, chapter 30.
  • Printed notes posted on myCourses.

Lecture 6 (Jan 23)

Assignment 2 will become available on myCourses at 4pm. It will be due on January 31 at 2:30pm. Dynamic programming. Computing binomial coefficients and partition numbers. The traveling salesman problem. The longest common subsequence problem. The knapsack problem.
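
The longest common subsequence recurrence can be sketched as follows (a Python sketch, not course-provided code):

```python
def lcs_length(x, y):
    """Longest common subsequence length by dynamic programming, O(nm) time.
    L[i][j] = LCS length of the prefixes x[:i] and y[:j]."""
    n, m = len(x), len(y)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i-1] == y[j-1]:                       # last symbols match
                L[i][j] = L[i-1][j-1] + 1
            else:                                       # drop one symbol
                L[i][j] = max(L[i-1][j], L[i][j-1])
    return L[n][m]
```

Recovering an actual longest matching from the filled table in time O(n+m) is the first exercise below.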

Resources:

Exercises mentioned in class:

  • From the n × m matrix of lengths of longest common subsequences, determine in time O(n+m) a longest matching.
  • Modify the knapsack algorithm for the case that we have an unlimited supply of items of the given sizes.
  • Can you count the number of knapsack solutions by dynamic programming?

Lecture 7 (Jan 26)

Lower bounds. Oracles. Viewing algorithms as decision trees. Proof of lower bound for worst-case complexity based on decision trees. Examples include sorting and searching. Lower bounds for board games. Finding the median of 5 with 6 comparisons. Merging two sorted arrays using finger lists. Lower bounds via witnesses: finding the maximum.

Resources:

Lecture 8 (Jan 31)

Assignment 2 due at 2:30pm. Assignment 3 will become available on myCourses at 4pm. It will be due on February 9 at 2:30pm. Lower bounds via adversaries. Guessing a password. The decision tree method as a special case of the method of adversaries. Finding the maximum and the minimum. Finding the median. Finding the two largest elements.

Resources:

Lecture 9 (Feb 2)

Introductory remarks about ADTs (abstract data types). Lists. Primitive ADTs such as stacks, queues, deques, and lists. Classification and use of linked lists, including representations of large integers and polynomials, window managers, and available space lists. Uses of stacks and queues in recursions and as scratchpads for organizing future work. Introduction to trees: free trees, ordered trees, position trees, binary trees. Bijection between ordered trees (forests) and binary trees via oldest child and next sibling pointers. Relationship between number of leaves and number of internal nodes with two children in a binary tree. Complete binary trees: implicit storage, height.

Resources:

Exercises mentioned in class:

  • We saw the way of navigating a complete binary tree using the locations of the nodes in an array. Generalize the navigation rules by computing for a given node i in a complete k-ary tree with n nodes, the parent, the oldest child and the youngest child (and indicate when any of these do not exist).
  • What is the height of a complete k-ary tree on n nodes?
  • Prove by induction that in a binary tree, the number of leaves is equal to the number of nodes with two children plus one.

Lecture 10 (Feb 7)

Continuing with trees. Depth, height, external nodes, leaves, preorder numbers, postorder numbers. Preorder, postorder, inorder and level order listings and traversals. Recursive and nonrecursive traversals. Stackless traversal if parent pointers are available. Representing and storing trees: binary trees require about 2n bits. Counting binary trees (Catalan's formula) via counting Dyck paths. Expression trees, infix, postfix (reverse Polish), and prefix (Polish) representation.
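
The nonrecursive inorder traversal can be sketched as follows (a Python sketch, not course-provided code; the explicit stack plays the scratchpad role mentioned in Lecture 9):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(root):
    """Nonrecursive inorder traversal with an explicit stack, O(n) time."""
    out, stack, node = [], [], root
    while stack or node:
        while node:                # push the whole left spine
            stack.append(node)
            node = node.left
        node = stack.pop()         # visit the node whose left subtree is done
        out.append(node.key)
        node = node.right          # then traverse its right subtree
    return out
```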

Resources:

Exercises mentioned in class:

  • Write linear time algorithms to go between any of the 2n-bit representations of binary trees, and a binary tree.
  • How many bits are needed to store the shape of a binary expression tree on n nodes?
  • How many bits are needed to store the shape of a rooted ordered tree on n nodes?

Lecture 11 (Feb 9)

Assignment 3 due, 2:30pm. The next assignment will be handed out on February 21. Introduction to the ADT Dictionary. Definition of a binary search tree. Standard operations on binary search trees, all taking time O(h), where h is the height. Generic tree sort. The browsing operations. Definition of red-black trees.
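
The standard binary search tree operations can be sketched as follows (a Python sketch, not course-provided code; both operations follow a single root-to-leaf path, hence the O(h) bound):

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    """Insert key into a binary search tree; O(h) time, h = height."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root

def bst_search(root, key):
    """Search: one comparison per level along the search path."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root
```

Generic tree sort is then: insert all keys, and output them by an inorder traversal.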

Resources:

Exercises mentioned in class:

  • Show by induction on h (the height) that h = O(log n) for AVL trees.

Lecture 12 (Feb 14)

Red-black trees viewed as 2-3-4 trees. Relationships between rank, height and number of nodes. Insert operation on red-black trees. Standard deletion is not covered in class. Lazy delete. Split and join in red-black trees. Application to list implementations (concatenate and sublist in O(log n) time).

Resources:

Exercises mentioned in class:

  • Show by induction that for a red-black tree, r <= h <= 2r, where r is the rank of the root and h is the height of a red-black tree.
  • Convince yourself that the split operation can be implemented in O(log n) time.

Lecture 13 (Feb 16)

Midterm examination.

Lecture 14 (Feb 21)

Assignment 4 handed out, due March 7, 2:30pm, preferably on paper. Augmented data structures. Examples:

  • Data that participate in several data structures.
  • Combining search trees and sorted linked lists.
  • Order statistics trees. Operations rank and selection.
  • Interval trees. VLSI chip layout application. Sweepline paradigm.
  • k-d-trees. Range search, partial match search, quadtrees. Not done in 2023.
  • Definition of BSP (binary space partition) trees. Painter's algorithm.

Resources:

Lecture 15 (Feb 23)

Cartesian trees. Random binary search trees. Treaps. Analysis of the expected time of the main operations. Quicksort revisited.

Resources:

  • CLR, section 12.4 (for random binary search trees).
  • CLR, chapter 7 (for quicksort).
  • Scribed notes.

Lecture 16 (Mar 7)

Priority queues (as abstract data types). Uses, such as in discrete event simulation and operating systems. Binary heaps. Heapsort. Pointer-based implementation of binary heaps. Tournament trees. k-ary heaps.
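
The array-based binary heap can be sketched as follows (a Python sketch of a min-heap, not course-provided code; node i has children 2i+1 and 2i+2 and parent (i-1)//2):

```python
class BinaryHeap:
    """Array-based min-heap with implicit complete-binary-tree navigation."""
    def __init__(self):
        self.a = []

    def insert(self, x):            # O(log n): sift the new leaf up
        self.a.append(x)
        i = len(self.a) - 1
        while i > 0 and self.a[(i - 1) // 2] > self.a[i]:
            self.a[(i - 1) // 2], self.a[i] = self.a[i], self.a[(i - 1) // 2]
            i = (i - 1) // 2

    def deletemin(self):            # O(log n): move last leaf to root, sift down
        a = self.a
        root, last = a[0], a.pop()
        if a:
            a[0] = last
            i = 0
            while True:
                c = 2 * i + 1                           # left child
                if c >= len(a):
                    break
                if c + 1 < len(a) and a[c + 1] < a[c]:  # pick smaller child
                    c += 1
                if a[i] <= a[c]:
                    break
                a[i], a[c] = a[c], a[i]
                i = c
        return root
```

Repeated deletemin on a heap built from the input is, in essence, heapsort.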

Resources:

Exercises mentioned in class:

  • Determine the navigation details for array-based k-ary heaps.
  • Show that the second to last level of a complete binary tree has at most n/2 elements.
  • Give the details of the deletemin operation in a tournament tree.

Lecture 17 (Mar 9)

Assignment 5 handed out, due March 16, 2:30pm. Amortization. Potential functions. Examples:

  • Lazy delete in red-black trees.
  • Binary addition.
  • Dynamic tables.
  • Splay trees.
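
The binary addition example can be sketched as follows (a Python sketch, not course-provided code):

```python
def increment(bits):
    """Binary increment on a list of bits (least significant first).
    One call can flip many bits, but n increments flip at most 2n bits in
    total: amortized O(1), e.g. with the potential Phi = number of 1-bits."""
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0        # each reset is paid for by the credit on that 1-bit
        i += 1
    if i < len(bits):
        bits[i] = 1        # the single 0 -> 1 flip pays for the new credit
    else:
        bits.append(1)
    return bits
```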

Resources:

Lecture 18 (Mar 14)

The union-find ADT. Data structures for disjoint sets. Implementation using linked lists. Implementation using parent-pointer trees.
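
The parent-pointer tree implementation can be sketched as follows (a Python sketch, not course-provided code; with union by rank and path compression combined, the amortized cost per operation is nearly constant):

```python
class UnionFind:
    """Disjoint sets via parent-pointer trees,
    with union by rank and path compression."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:                # union by rank
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
```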

Resources:

Exercise:

  • Consider parent pointer forests. Using the potential introduced for splay trees, show that the amortized time of each operation when using path compression (but no union-by-rank) is O(log n).

Lecture 19 (Mar 16)

Assignment 5 due today, 2:30pm. Stringology. Data structures for strings and compression. Tries, PATRICIA trees, digital search trees. Suffix tries, suffix trees, suffix arrays. String searching in a large file via the Knuth-Morris-Pratt algorithm.
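
The Knuth-Morris-Pratt algorithm can be sketched as follows (a Python sketch, not course-provided code; the failure function f[i] is the length of the longest proper prefix of pattern[:i+1] that is also a suffix of it):

```python
def kmp_search(text, pattern):
    """KMP string search in O(len(text) + len(pattern)) time.
    Returns the index of the first match, or -1."""
    m = len(pattern)
    f = [0] * m
    k = 0
    for i in range(1, m):                       # build the failure function
        while k > 0 and pattern[i] != pattern[k]:
            k = f[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        f[i] = k
    k = 0
    for i, c in enumerate(text):                # scan the text, never backing up
        while k > 0 and c != pattern[k]:
            k = f[k - 1]
        if c == pattern[k]:
            k += 1
        if k == m:
            return i - m + 1    # first match; setting k = f[k-1] here instead
    return -1                   # would continue on to report all matches
```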

Resources:

  • CLR, section 32.4 (for the Knuth-Morris-Pratt algorithm).
  • Scribed notes.

Exercises:

  • Show that for a k-ary PATRICIA tree on n strings, and for a suffix tree on a text of n symbols, the number of nodes in the tree is at most 2n.
  • Give an algorithm for insertion into a trie.
  • Add one line to the KMP algorithm to make it return all pattern matches, not just one.

Lecture 20 (Mar 17)

Assignment 6 handed out, due March 28, 2:30pm. Introduction to compression and coding. Fixed and variable width codes. Prefix codes. Definition of entropy and Shannon's result on best possible compression ratios. Huffman trees. Huffman code. An O(n log n) algorithm for finding the Huffman tree. Definition of a greedy algorithm. Data structures for Huffman coding. Merging sorted lists by the greedy method. Generating a random integer.
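
The greedy Huffman construction can be sketched as follows (a Python sketch using a binary heap, not course-provided code; codeword tables rather than an explicit tree, purely for brevity):

```python
import heapq

def huffman_codes(weights):
    """Build a Huffman code greedily: repeatedly merge the two lightest
    trees; O(n log n) with a binary heap. weights: dict symbol -> frequency.
    Returns a dict symbol -> codeword string."""
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(weights.items())]
    heapq.heapify(heap)                     # the counter i breaks weight ties
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}   # left subtree
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, count, merged))
        count += 1
    return heap[0][2]
```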

Resources:

  • Scribed notes: introduction to information theory.
  • Scribed notes: Shannon's theory.
  • CLR, section 16.2 (for prefix and Huffman codes).

Exercises:

  • Give an algorithm that makes the tree and fills in the thresholds for the random integer generation example.
  • Design an algorithm that outputs all codewords given a prefix coding tree.
  • Show by example that there is a set of symbol weights for which the table of codewords for the Huffman codes takes Omega (n^2) space.
  • Why is the expected length of a Huffman codeword for an n-symbol alphabet at most the ceiling of log_2 n?

Lecture 21 (Mar 21)

Entropy of an input source. Kraft's inequality. Proof of Shannon's theorem. The Shannon-Fano code. Lempel-Ziv compression. Data structures for Lempel-Ziv compression. Digital search tree.

Resources:

Exercises:

  • Prove Kraft's inequality by induction on the height.
  • Write the full one-pass algorithm for Lempel-Ziv compression.

Lecture 22 (Mar 23)

Introduction to graphs: notation, definitions, adjacency matrix, adjacency lists. Graph traversals as the fundamental procedures for most graph algorithms: depth first search. DFS forest and DFS properties. Classification of edges. Detecting cycles. Euler tour.
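
Cycle detection by DFS edge classification can be sketched as follows (a Python sketch for directed graphs, not course-provided code; a directed graph has a cycle iff DFS finds a back edge):

```python
def has_cycle(adj):
    """DFS on a directed graph given by adjacency lists (a dict).
    An edge into a gray (in-progress) vertex is a back edge, hence a cycle.
    O(|V| + |E|) time."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {u: WHITE for u in adj}

    def visit(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY:          # back edge found
                return True
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK                  # u and its descendants are finished
        return False

    return any(color[u] == WHITE and visit(u) for u in adj)
```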

Resources:

  • CLR, sections 22.1, 22.2 and 22.3.
  • Scribed notes on graph algorithms.

Exercises:

  • Show that one can order all adjacency lists in O(|V|+|E|) time.
  • What is the worst-case time of the DFS algorithm for detecting a cycle that we saw in class?

Lecture 23 (Mar 28)

Assignment 6 due today, 2:30pm. Breadth first search. BFS tree. Shortest path problem. Shortest path properties. Dijkstra's greedy algorithm for shortest paths.
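
Dijkstra's algorithm can be sketched as follows (a Python sketch with a binary heap and lazy deletion, not course-provided code):

```python
import heapq

def dijkstra(adj, s):
    """Dijkstra's greedy shortest-path algorithm from source s.
    adj: dict u -> list of (v, weight) with weights >= 0.
    O((|V| + |E|) log |V|) with a binary heap."""
    dist = {u: float("inf") for u in adj}
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                  # stale heap entry: skip (lazy delete)
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:          # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```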

Resources:

  • CLR, section 22.2.
  • Scribed notes on graph algorithms.
  • CLR, section 24.3 (Dijkstra's algorithm for shortest paths).

Exercises:

  • Modify the BFS algorithm to detect a bipartite graph.
  • A graph is given in the cell model, with a link to just one node. Write an algorithm that determines whether the graph is isomorphic to a hypercube.

Lecture 24 (Mar 30)

Minimal spanning tree. Main properties. Greedy approach for the MST: the Prim-Dijkstra algorithm. Application in hierarchical clustering. Kruskal's algorithm. A two-approximation for the traveling salesman problem. The all-pairs shortest path algorithm of Floyd and Warshall.
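
The Floyd-Warshall algorithm can be sketched as follows (a Python sketch, not course-provided code; the outer loop enlarges the set of allowed intermediate vertices one vertex at a time):

```python
def floyd_warshall(dist):
    """All-pairs shortest paths by dynamic programming, O(n^3) time.
    dist: n x n matrix, dist[i][j] = edge weight or float('inf'),
    dist[i][i] = 0. Modified in place and returned."""
    n = len(dist)
    for k in range(n):                   # paths may now route through vertex k
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```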

Resources:

Exercises:

  • How would you pick k if a k-ary heap is used in Dijkstra's algorithms for shortest path and MST?
  • How would you find a spanning tree that minimizes the maximal weight (not the sum of the weights)?

Lecture 25 (Apr 4)

Directed acyclic graphs (or: dags). Topological sorting. Activity (PERT) networks. NIM games. Strongly connected components in directed graphs. The stable matching algorithm of Gale and Shapley.
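
Topological sorting by DFS can be sketched as follows (a Python sketch, not course-provided code; a vertex finishes only after all its out-neighbours have finished, so reversed finishing order is a topological order):

```python
def topological_sort(adj):
    """Topological order of a dag given by adjacency lists (a dict).
    O(|V| + |E|) time via DFS postorder."""
    seen, order = set(), []

    def visit(u):
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                visit(v)
        order.append(u)            # postorder: u finishes after its descendants

    for u in adj:
        if u not in seen:
            visit(u)
    return order[::-1]
```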

Resources:

  • Scribed notes.
  • CLR, section 22.5 (strongly connected components).

Exercises:

  • Add a few lines to the algorithms for NIM games and PERT networks to compute for each node the perfect move (in a NIM game) and the critical path (in a PERT network), respectively.

Lecture 26 (Apr 6)

Network flows. The Ford-Fulkerson method. The Edmonds-Karp algorithm.
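
The Edmonds-Karp algorithm can be sketched as follows (a Python sketch, not course-provided code; it is Ford-Fulkerson with shortest augmenting paths found by BFS in the residual graph, giving the O(|V| |E|^2) bound):

```python
from collections import deque

def edmonds_karp(cap, s, t):
    """Maximum s-t flow. cap: dict of dicts of residual capacities,
    cap[u][v] = remaining capacity of edge (u, v). Modified in place."""
    flow = 0
    while True:
        parent = {s: None}                       # BFS in the residual graph
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in cap.get(u, {}):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                          # no augmenting path remains
        bottleneck, v = float("inf"), t          # min residual capacity on path
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, cap[u][v])
            v = u
        v = t                                    # augment and update residuals
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap.setdefault(v, {})[u] = cap.get(v, {}).get(u, 0) + bottleneck
            v = u
        flow += bottleneck
```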

Resources:

  • CLR, section 26.1 (network flows) and 26.2 (the Ford-Fulkerson method).
  • Scribed notes.