Lectures in COMP 252--Winter 2024


CLR refers to Cormen, Leiserson, Rivest and Stein: "Introduction to Algorithms (Third Edition)".

Lecture 1 (Jan 4)

General introduction. Definition of an algorithm. Examples of simple computer models: Turing model, RAM model, pointer based machines, oracles, comparison model. RAM model (or uniform cost model), bit model (or: logarithmic cost model). Definition and examples of oracles. The Fibonacci example: introduce recursion, iteration (the array method), and fast exponentiation. Worst-case time complexity. Landau symbols: O, Omega, Theta, small o, small omega.

Resources:

Exercise mentioned during the lecture:

  • Determine in Big Theta notation the bit model time complexities of the recursion, array and fast exponentiation methods seen in class for computing the n-th Fibonacci number.

Lecture 2 (Jan 9)

Examples of many complexity classes (linear, quadratic, polynomial, exponential). Introduction to the divide-and-conquer principle. Fast exponentiation revisited. Introduction to recurrences. The divide-and-conquer solution to the chip testing problem in CLR. Divide-and-conquer (Karatsuba) multiplication. Mergesort, convex hulls in the plane. We covered all of chapter 2 and part of chapter 3 of CLR.

Resources:

Exercises mentioned during the lecture:

  • Describe in algorithmic form the linear time upper hull merge algorithm described in class.

Assignment 1 handed out (on myCourses). Due January 18, 2024, 1 pm.

Lecture 3 (Jan 11)

Binary search using a ternary oracle (with analysis by the substitution method). Strassen matrix multiplication. The master theorem (statement only). Recursion tree to explain the master theorem. Ham-sandwich and pancake theorems and halfspace counting.

Resources:

Exercises mentioned during the lecture:

  • Write the binary search algorithm using a binary oracle.
  • What is the "big theta" order of the solution of T_n = T_(n/2) + log n, n > 0? (This was not mentioned but may make you think about recursion trees.)

Lecture 4 (Jan 16)

Linear time selection: finding the k-th smallest by the algorithm of Blum, Floyd, Pratt, Rivest and Tarjan. O(n) complexity proof based on induction. Linear expected time selection by Hoare's algorithm "FIND". The induction method illustrated on mergesort. Euclid's algorithm for the greatest common divisor.

Resources:

Exercises mentioned during the lecture:

  • For the binary search algorithm with ternary oracle, show by induction on n that the worst-case number of comparisions T_n satisfies T_n <= ceiling ( log_2 (n+1) ) for all n.
  • For the binary search algorithm with binary oracle, show by induction on n that the worst-case number of comparisions T_n satisfies T_n <= 2 + ceiling ( log_2 (n) ) for all n.
  • Show that the bit complexity of Eclid's algorithm for gcd (n,m) is O( log n . log m ).
  • Analyze T(n) = T(n/3) + T(2n/3) + n using the recursion tree.
  • Look at the median-of-3 version of the selection algorithm (instead the median-of-5 version seen in class) and show, by using recursion trees, that its complexity is big theta of n log n.

Lecture 5 (Jan 18)

Assignment 1 due at 1 pm on myCourses. Assignment 2 posted on myCourses. The Fast Fourier Transform. Multiplying polynomials in time O(n log n).

Resources:

  • CLR, chapter 30.
  • Printed notes posted on myCourses.

Lecture 6 (Jan 23)

Dynamic programming. Computing binomial coefficients and partition numbers. The traveling salesman problem. The longest common subsequence problem. The knapsack problem.

Resources:

Exercises mentioned in class:

  • Modify the knapsack algorithm for the case that we have an unlimited supply of items of the given sizes.
  • Can you count the number of knapsack solutions by dynamic programming?
  • In the shortest path problem, formulate the sub-solutions necessary so that when asked, one can get the shortest path sequence from an arbitrary node j to node 1 in time O(n), where n is the number of nodes. And tell us how to modify or update the TSP algorithm seen in class.

Lecture 7 (Jan 25)

Lower bounds. Oracles. Viewing algorithms as decision trees. Proof of lower bound for worst-case complexity based on decision trees. Examples include sorting and searching. Lower bounds for board games. Finding the median of 5 with 6 comparisons. Merging two sorted arrays using finger lists. Lower bounds via witnesses: finding the maximum.

Resources:

Lecture 8 (Jan 30)

Assignment 2 due at 11 pm. Assignment 3 handed out. It is due on February 9 at 10pm. Lower bounds via adversaries. Guessing a password. The decision tree method as a special case of the method of adversaries. Finding the maximum and the minimum. Finding the median. Finding the two largest elements.

Resources:

Lecture 9 (Feb 1)

Introductory remarks about ADTs (abstract data types). Lists. Primitive ADTs such as stacks, queues, deques, and lists. Classification and use of linked lists, including representations of large integers and polynomials, window managers, and available space lists. Uses of stacks and queues in recursions and as scratchpads for organizing future work. Introduction to trees: free trees, ordered trees, position trees, binary trees. Bijection between ordered trees (forests) and binary trees via oldest child and next sibling pointers. Relationship between number of leaves and number of internal nodes with two children in a binary tree. Complete binary trees: implicit storage, height.

Resources:

Exercises mentioned in class:

  • We saw the way of navigating a complete binary tree using the locations of the nodes in an array. Generalize the navigation rules by computing for a given node i in a complete k-ary tree with n nodes, the parent, the oldest child and the youngest child (and indicate when any of these do not exist).
  • What is the height of a complete k-ary tree on n nodes?
  • Prove by induction that in a binary tree, the number of leaves is equal to the number of nodes with two children plus one.
  • Write an O(n) time program for creating the oldest-child-next-sibling binary tree seen in class from a given ordered tree in which the children of a node are stored from oldest to youngest in a linked list. Conversely, from a binary tree in which the root only has a left child, recreate in O(n) time the corresponding ordered tree.

Lecture 10 (Feb 6)

Continuing with trees. Depth, height, external nodes, leaves, preorder numbers, postorder numbers. Preorder, postorder, inorder and level order listings and traversals. Recursive and nonrecursive traversals. Stackless traversal if parent pointers are available. Representing and storing trees: binary trees require about 2n bits. Counting binary trees (Catalan's formula) via counting Dyck paths. Expression trees, infix, postfix (reverse Polish), and prefix (Polish) representation.

Resources:

Exercises mentioned in class:

  • Write linear time algorithms to go between any of the 2n-bit representations of binary trees, and a binary tree.
  • Determine in time $O(n)$ whether a coded bit sequence of length $2n$ that uses the codes 00, 01, 10 and 11 for the four kinds of nodes, is a legal sequence, i.e., one that corresponds to a binary tree on $n$ nodes.
  • Write an $O(n)$ algorithm that computes for all nodes in an ordered tree of size $n$ the sizes of their subtrees (and stores those sizes in the corresponding cells).

Lecture 11 (Feb 8)

Assignment 3 due tomorrow at 10 pm on myCourses. Introduction to the ADT Dictionary. Definition of a binary search tree. Standard operations on binary search trees, all taking time O(h), where h is the height. Generic tree sort. The browsing operations. Definition of red-black trees.

Resources:

Exercises mentioned in class:

  • Given is an arbitrary binary tree representing an electrical network. Within each node, we also store the resistance of the left and right edges emanating from the node (if they exist). If one or two of those edges are missing, replace them by zero. Imagine connecting all leaves to the ground (run a wire through all of them). Then your task is to write a linear time algorithm that computes the resistance between the root and the ground.
  • If we have square root of n sorted lists, each of length square root of n, then they can be merged into one sorted list of length n using only about (1/2) n log_2 n comparisons. And from an unsorted list of n items, we can make square root of n sorted lists, each of length square root of n. Tell me why these two statements are true.

Lecture 12 (Feb 13)

Red-black trees viewed as 2-3-4 trees. Relationships between rank, height and number of nodes. Insert operation on red-black trees. Standard deletion is not covered in class. Lazy delete. Split and join in red-black trees. Application to list implementations (concatenate and sublist in O(log n) time).

Resources:

Exercises mentioned in class:

  • Write a simple recursive algorithm to construct a red-black tree from a sorted array of $n$ elements. Your algorithm should take O(n) RAM time, and should not use any key comparisons.
  • Show by induction that for a red-black tree, r <= h <= 2r, where r is the rank of the root and h is the height of a red-black tree.
  • Convince yourself that the split operation can be implemented in O(log n) time.

Lecture 13 (Feb 15)

Midterm examination.

Lecture 14 (Feb 20)

Assignment 4 handed out, due February 27, 9 pm. Augmented data structures. Examples:

  • Data that participate in several data structures.
  • Combining search trees and sorted linked lists.
  • Order statistics trees. Operations rank and selection.
  • Interval trees. VLSI chip layout application. Sweepline paradigm.
  • k-d-trees. Range search, partial match search, quadtrees.
  • Quadtrees.
  • Definition of BSP (binary space partition) trees. Painter's algorithm.

Resources:

Lecture 15 (Feb 22)

Cartesian trees. Random binary search trees. Treaps. Analysis of the expected time of the main operations. Quicksort revisited.

Resources:

  • CLR, section 12.4 (for random binary search trees).
  • CLR, chapter 7 (for quicksort).
  • Scribed notes. These notes will be updated in 2024, hopefully by March 10.

Lecture 16 (Feb 23)

Priority queues (as abstract data types). Uses, such as in discrete event simulation and operating systems. Binary heaps. Heapsort. Pointer-based implementation of binary heaps. Tournament trees. k-ary heaps.

Resources:

Exercises mentioned in class:

  • Give the details of the deletemin operation in a tournament tree.

Lecture 17 (Feb 27)

Assignment 4 due tonight at 9 pm on myCourses. Amortization. Potential functions. Examples:

  • Lazy delete in red-black trees.
  • Binary addition.
  • Fibonacci heaps.

Resources:

Lecture 18 (Feb 29)

Stringology. Data structures for strings and compression. Tries, PATRICIA trees, digital search trees. Suffix tries, suffix trees, suffix arrays. String searching in a large file via the Knuth-Morris-Pratt algorithm.

Resources:

  • CLR, section 32.4 (for the Knuth-Morris-Pratt algorithm).
  • Scribed notes.

Exercises:

  • Show that for a k-ary PATRICIA tree on n strings, and for a suffix tree on a text of n symbols, the number of the nodes in the tree is at most 2n.
  • Give an algorithm for insertion into a trie.
  • Add one line to the KMP algorithm to make it return all pattern matches, not just one.

Lecture 19 (Mar 12)

Introduction to compression and coding. Fixed and variable width codes. Prefix codes. Huffman trees. Huffman code. An O(n log n) algorithm for finding the Huffman tree. Definition of a greedy algorithm. Data structures for Huffman coding. Merging sorted lists by the greedy method. Generating a random integer.

Resources:

  • Scribed notes: introduction to information theory.
  • Scribed notes: Shannon's theory.
  • CLR, section 16.2 (for prefix and Huffman codes).

Exercises:

  • Give a linear time algorithm that makes the tree and fills in the thresholds for the random integer generation example.
  • Design an algorithm that outputs all codewords given a prefix coding tree.
  • Show by example that there is a set of symbol weights for which the table of codewords for the Huffman codes takes Omega (n^2) space.

Lecture 20 (Mar 14)

Assignment 5 due tomorrow, March 15, 9 pm. Definition of entropy and Shannon's result on best possible compression ratios. Entropy of an input source. Kraft's inequality. Proof of Shannon's theorem. The Shannon-Fano code. Lempel-Ziv compression. Data structures for Lempel-Ziv compression. Digital search tree.

Resources:

Exercises:

  • Prove Kraft's inequality by induction on the height.
  • Prove Kraft's inequality for infinite trees by induction on the number of leaves. (Hard)
  • Write the full one-pass algorithm for Lempel-Ziv compression.

Lecture 21 (Mar 15)

Introduction to graphs: notation, definitions, adjacency matrix, adjacency lists. Graph traversals as the fundamental procedures for most graph algorithms: depth first search. DFS forest and DFS properties. Classification of edges. Detecting cycles. Euler tour.

Resources:

  • CLR, sections 22.1, 22.2 and 22.3.
  • Scribed notes on graph algorithms.

Exercises:

  • Show that one can order all adjacency lists in O(|V|+|E|) time.
  • What is the worst-case time of the DFS algorithm for detecting a cycle that we saw in class?

Lecture 22 (Mar 19)

Breadth first search. BFS tree. Shortest path problem. Shortest path properties. Dijkstra's greedy algorithm for shortest paths.

Resources:

  • CLR, section 22.2.
  • Scribed notes on graph algorithms.
  • CLR, section 24.3 (Dijkstra's algorithm for shortest paths).

Exercises:

  • Modify the BFS algorithm to detect a bipartite graph.
  • A graph is given in the cell model, with a link to just one node. Write an algorithm that determines whether the graph is isomorphic to a hypercube.

Lecture 23 (Mar 21)

Minimal spanning tree. Main properties. Greedy approach for the MST: the Prim-Dijkstra algorithm. Application in hierarchical clustering. Kruskal's algorithm. A two-approximation for the traveling salesman problem. The all-pairs shortest path algorithm of Floyd and Warshall.

Resources:

Exercises:

  • How would you pick k if a k-ary heap is used in Dijkstra's algorithms for shortest path and MST?
  • How would you find a spanning tree that minimizes the maximal weight (not the sum of the weights)?
  • Show that the set membership contribution to Kruskal's algorithm takes only time $O(|V| \log |V|)$ if we always change the labels of the nodes in the smallest set. And how would you then keep track of the sizes of the sets?
  • Discuss the choice of priority queue in Dijkstra's algorithms for shortest path and MST when all edge weights are integers between $1$ and $10$. What is the overall complexity?

Lecture 24 (Apr 2)

Directed acyclic graphs (or: dags). Topological sorting. Activity (PERT) networks. NIM games. Strongly connected components in directed graphs. Linear time 2SAT by Tarjan's algorithm.

Resources:

  • Scribed notes.
  • CLR, section 22.5 (strongly connected components).

Exercises:

  • Add a few lines to the algorithms for NIM games and PERT networks to compute for each node the perfect move (in a NIM game) and the critical path (in a PERT network), respectively.
  • How would you determine if a dag contains a node u that can be reached from all roots?

Lecture 25 (Apr 4)

Hungarian method for the assignment problem. The stable matching algorithm of Gale and Shapley.

Resources:

Lecture 26 (Apr 9)

Network flows. The Ford-Fulkerson method. The Edmonds-Karp algorithm.

Resources:

  • CLR, section 26.1 (network flows) and 26.2 (the Ford-Fulkerson method).
  • Scribed notes.