Graduate Algorithms
 

Home

Publications

Teaching Data Structures

 

 

CS 105 : Algorithms and Data Structures (Graduate Level)

Term: Winter Term 2006 : Jan 04, 2006 - March 15, 2006
Time: 10 a.m. - 11:05 a.m./ x-period: Th 12:00 - 12:50
Location: Sudikoff 115
Frequency: M W F + x-period on Th
Course Description: This course provides an introduction to algorithms and algorithm design techniques at a level more advanced than a typical undergraduate course in algorithms, yet basic enough to be applicable widely across the many subfields of computer science. There is a strong emphasis on rigorously proving the correctness of and running time bounds on the algorithms studied. Typical techniques covered include, but are not limited to, randomization, amortized analysis, linear programming and approximation. Typical problems include, but are not limited to, splay trees, perfect hashing, skip lists, fast randomized algorithms for minimum cuts and minimum spanning trees, maximum matching, pattern matching, and some simple approximation algorithms: both combinatorial and based on linear programming relaxation.
Prerequisites: Computer Science 25 and 39. Additionally, students are expected to be familiar with basic concepts from graph theory, discrete mathematics, linear algebra, and probability.

Administrative Basics Syllabus Announcements Schedule Handouts References

Administrative Basics

Lecture

10 A.M. on MWF
Instructor

Rahul Ray | Office: Sudikoff 216 | 6-1613 | Office Hours: MW 11:30 a.m to 12:30 p.m and by appointment
Textbook(s)

"Introduction to Algorithms", by Cormen, Leiserson, Rivest and Stein, MIT Press. It will be referred to as "CLRS". Make sure that you have the second edition.
"Randomized Algorithms", by Motwani and Raghavan, Cambridge University Press. We will refer this book by [MR]

Prerequisites

CS 25, CS 39
Grading

  • 70% Homework + Class presentation (mandatory)
  • 30% Final Exam
  • Homework and Class Presentation

    Assignments will be given on Monday every week ( except for the first week, which will be given on Wednesday 1/4) and is strictly due on next Monday. You can drop it to my mail-box or hand it over in the class on or before due date. Students can do it in group of at most 3 people or individually. Class presentation for each student is mandatory and usually will be on Fridays. Please have a look at this page for an idea on how to give good talk. Topics for the talk will be assigned at least 1.5 week ahead of the scheduled presentation. If for any reason the assigned speaker is unable to attend the class, there will be NOTHING to do for us. So, inform me as early as possible for such an unlikely event. Occassionally, there will be class/presentation during x-period, which will be notified early.
    Final Exam

    It will be a take-home final exam, due around March 15 (date to be confirmed later).

    Syllabus

    Announcements

    Week 1

    Week 2

    • 1/9: We concluded hashing with open addressing scheme and perfect hashing. We gave a brief introduction of skip-list. From now on, homeworks will be given on Wednesdays and will be due on following Wednesday.
    • 1/11: We discussed skip-list. We proved the expected search complexity is O(log n) in skip-list. Reading material is posted on-line for skip-list. Homework 2 is distributed.
    • 1/13: David Callender presented Cuckoo Hashing by Rasmus Pagh & Flemming Friche Rodler. This paper appeared in Journal of Algorithms 51 (2004), p. 122-144. He explained what is cuckoo-hashing and how it can be used to have worst-case O(1) look-up time. David's presentation is here.

    Week 3

    • 1/16: No class today - MLK day.
    • 1/18: We discussed the definition of splay-tree and explained the zig, zig-zig, and zig-zag operations. We introduced amortized analysis and stated the access lemma.
    • 1/20: No class: Rahul is away

    Week 4

    • 1/23: We proved access lemma and then discussed the operations on a splay tree like Access, Splay, Join, Split, Delete. Also, we have proved the amortized complexity of splaying operation.
    • 1/25: We introduced graph algorithms. We defined few technical terms related to min-cut. Then, described the contraction procedure and stated KARGER's randomized min-cut algorithm. See the reading material in the table below.
    • 1/26 (x-period): Nicolas Lane presented "Randomized Dining Philosphers to TDMA Scheduling in Wireless Sensor Networks" by Injong Rhee et. al. He described an algorithm for one-time dining philosopher problem (ODP) and proved the Liveness theorem that gives an upper bound on the expected number of rounds during which a process remains in the trying and hopeful state.
    • 1/27: We discussed in detail the randomized min-cut algorithm by Karger et. al. and then showed how to improve it to O(n2 log n ) by recursion.

    Week 5

    • 1/30: Priya Natarajan presented how the weights of the items in a splay tree can be cleverly chosen to obtain strong results from Access Lemma like balance theorem, static optimality theorem and static finger theorem. She also gave a proof of working set theorem and some of the amortized times of splay tree operations from Update Lemma.
    • 2/1: We set the background of max-flow problem by defining some terminologies and proving some lemmas and gave an intuition about max-flow min-cut theorem. We also described the residual network.
    • 2/3: We showed how to construct residual network and proved the Max-flow min-cut theorem. We also presented the FORD-FULKERSON (F-F) algorithm and analyzed its running time. We showed how EDMONDS-KARP algorithm improves the F-F's running time to O(V E2 ).

    Week 6

    • 2/6: We defined the Minimum Spanning Trees (MST) and cited some properties of MST. We gave the generic algorithm for computing MST and then described the Red rule and Blue rule for MST in a graph and gave their proofs. We described Kruskal's algorithm for computing MST.
    • 2/8: We described PRIM's and Boruvka's algorithms for computing MST and then showed how a hybrid approach can make use of Boruvka and Prim's MST algorithm to improve the running time to O(m log (log n)). Next, we defined some terminologies to set the background for expected linear time algorithm for computing MST (minimum spanning forest as well) and then described the algorithm and proved the running time to be linear ( expected running time).
    • 2/10: We had a day off for winter carnival.

    Week 7

    • 2/13: We started a new topic which is string matching . After describing some terminologies and definitions, we detailed the naive string matching algorithm and then analyzed its complexity, which was O(n2). We showed how it can be improved to expected linear time using some number theoretic properties. We set the background for Rabin-Karp's string matching algorithm.
    • 2/15: We described the Rabin-Karp string matching algorithm and explained its running time which is expected linear time. We then concluded with the Knuth-Morris-Pratt (KMP) algorithm for exact string matching.
    • 2/16 (x-period): Matt Bell presented the Fibonacci Heaps (F-Heap) data structure today. He explained the operations in F-heap such as INSERT, MINIMUM, EXTRACT-MIN, UNION, DECREASE-KEY and DELETEMIN and showed that the operations that do not involve deleting an element run in O(1) amortized time.
    • 2/17: Jianyang Zeng started with the small-world phenomenon adopted from Stanley Milgram, and current famous small-world models. Then he presented the Kleinberg's model, and existing results on decentralized routing in small-world networks. He gave the proof of the upper bound and lower bound of routing complexity in Kleinberg's paper. He also talked about the method to obtain the O(log n) diameter of Kleinberg's small-world networks.

    Week 8

    • 2/20: Matching algorithms was the topic today. We started with some fundamental definitions like matching, maximal matching, maximum matching, perfect matching, M-alternating path, M-augmenting path. Then we gave a generic algorithm to find out a maximum matching in a graph. We proved Berge's theorem and showed how a maximum matching can be found in a bipartite graph. We shall handle the general graph in the next class.
    • 2/22: Patrick Tsang presented the topic of data structures for disjoint sets that find a lot of applications, e.g. algorithms for finding MST. The data structure can be implemented using linked lists or trees. Using tree-implementation, together with two heuristics Union by Rank and Path Compression, he showed that the amortized running time of m operations with n MAKE-SETs is O(ma(n)) . He discussed the slowly growing function a , which is known as inverse Ackermann function.
    • 2/23 (x-perid): John MacMaster presented "Randomized Primality Testing". This presentation covered some group and number theory fundamentals that are important for understanding the methods, and appreciating the challenges, of testing large integers for primality. The talk was focused on explaining the Miller-Rabin randomized primality testing algorithm and the auxiliary procedures that it uses.
    • 2/24: We proved the O( √n m) time bound for Hopcroft-Karp algorithm for maximum cardinality matching in bipartite graphs. Then we discussed the maximum matching for general graphs and proved the cycle shrinking lemma and then described Edmonds' algorithm for computing the maximum matching in general graphs.

    Week 9

    • 2/27: Rong Zhang presented Single pair shortest paths. First, she introduced "Bellman-Ford Algorithm", which solves the single-source shortest-paths problem in the general case in which edge weights may be negative and then proved its correctness and analyzed its running time. She also introduced Dijkstra's algorithm, which solves the problem for the case in which all edge weights are nonnegative. She showed that with good implementation the running of Dijkstra's algorithm is lower than that of Bellman-Ford algorithm.
    • 3/1: Keren Tan presented "Analysis of the Greedy Approach in Problems of Maximum k-Coverage" by by Dorit S. Hochbaum and Any Pathria. A brief introduction to approximation algorithm is given in the first part of this presentation, and the second part provides an analysis of the quality of solution of a greedy algorithm for the general problem of covering the maximum weight set of elements of a universal set via k structurally constrained subsets. As indicated, performing each greedy step may itself be NP-hard. This analysis provides a performance bound that is a function of beta, a measure of the quality of the pseudo-greedy solution found at each stage. Additional analysis about measuring solution quality in a particular instance is also discussed.
    • 3/2 (x-period): We discussed Linear Programing (LP) today in standard and slack form. Then we proved the convexity of the simplex region if the solution space is bounded. We also showed how to convert a given LP into standard form and then to slack form for simplex algorithm to work on. We ended our class showing how simplex algorithm runs for a given example and showed when simplex can cycle and eventually give worst-case running time.
    • 3/3: Nihal presented Huffman codes. In his presentation, he (1) showed how Huffman codes help in data compression, (2) gave & explained the Huffman code constructing algorithm & (3) proved its correctness via two lemmas.

    Week 10

    • 3/6: Vibhor presented the first interior point algorithm for linear programming invented by Narendra Karmarkar in 1984. This required to give the geometric picture of the linear programming problem in terms of convex sets. Then he showed how this algorithm by using repeated projective transformations of a simplex, minimizes over a sphere in the simplex. Finally he explained why this algorithm is linear time.
    • 3/8: We discussed the LP duality and showed how an LP in standard form can be converted to its dual form and proved the weak duality theorem and we discussed the final exam format. The question paper will be made available by 10th and is due on 15th by 5p.m. in my office.

    Schedule

    Date Lecture Topic Assignments/Readings Remark
    Jan 4 Introduction, syllabus, course structure, etc. Hash Tables and a quick overview of some mathematical fundamentals. Will be following chapter 11 from CLRS (second edition). Lecture Notes 1 . Homework 1 Due on 1/9
    Jan 6 We discussed universal hashing and a construction of universal hash function.    
    Jan 9 We discussed perfect hashing and open addressing scheme and gave a brief introduction of skip-list data structure. Homework 1 is due today .    
    Jan 11 We completed the discussion on skip-list. Recommended reading for skip-list is two papers by William Pugh, which are here (long) and here (short) too. Brief lecture note is here. Homework 2 Due on 1/18
    Jan 13 David Callender presented Cuckoo Hashing by Rasmus Pagh & Flemming Friche Rodler. This paper appeared in Journal of Algorithms 51 (2004), p. 122-144. © 2004 by Academic Presss. David's presentation is here . David's presentation  
    Jan 16 No Class Today - MLK Day    
    Jan 18 We discussed the definition of splay-tree and explained the zig, zig-zig, and zig-zag rotations. We introduced amortized analysis and stated the access lemma. Read the paper : Self-adjusting binary search trees by Daniel Sleator and Robert Endre Tarjan (1985, Journal of ACM). Homework 3 Due on 1/28
    Jan 20 Rahul away, no class today!!    
    Jan 23 We proved access lemma and then discussed the operations on a splay tree like Access, Splay, Join, Split, Delete. Also, we have proved the amortized complexity of splaying operation.    
    Jan 25 Graph Algorithms, Min-cut Reading [1] , [2] and Chapter 10 of Randomized Algorithms [MR] and CLRS.  
    Jan 26 (x-period) Nicolas Lane presented "Randomized Dining Philosphers to TDMA Scheduling in Wireless Sensor Networks" by Injong Rhee et. al. He described an algorithm for one-time dining philosopher problem (ODP) and proved the Liveness theorem that gives an upper bound on the expected number of rounds during which a process remains in the trying and hopeful state. Suggested reading [1] & [2] . Nic's presentation .  
    Jan 27 We discussed in detail the randomized min-cut algorithm by Karger et. al. and then showed how to improve it to O(n2 log n ) by recursion. Exercise 10.9 from [MR] Deterministic min-cut algorithm is still open for presenation. Suggested reading: "A Simple Min-cut Algorithm" by Stoer and Wagner and "A correctness certificate for the Stoer-Wagner mincut algorithm" by S. Arikati and K. Mehlhorn.
    Jan 30 Priya Natarajan presented how the weights of the items in a splay tree can be cleverly chosen to obtain strong results from Access Lemma like balance theorem, static optimality theorem and static finger theorem. She also gave a proof of working set theorem and some of the amortized times of splay tree operations from Update Lemma. She spoke about long splay theorem too. Read Self-adjusting binary search trees by Daniel Sleator and Robert Endre Tarjan (1985, Journal of ACM). Homework 2 returned
    Feb 1 We set the background of max-flow problem by defining some terminologies and proving some lemmas and gave an intuition about max-flow min-cut theorem. We also described the residual network. Read from CLRS.
    Homework 4
    Due on 2/8.
    Announcement: Every student's presentation will be a part of the syllabus for the final exam.
    Feb 3 We showed how to construct residual network and proved the Max-flow min-cut theorem. We also presented the FORD-FULKERSON (F-F) algorithm and analyzed its running time. We showed how EDMONDS-KARP algorithm improves the F-F's running time to O(V E2 ).    
    Feb 6 We defined the Minimum Spanning Trees (MST) and cited some properties of MST. We gave the generic algorithm for computing MST and then described the Red rule and Blue rule for MST in a graph and gave their proofs. We described Kruskal's algorithm for computing MST. [1] Read CLRS & [MR]
    [2] "A Randomized Linear-Time Algorithm to Find Minimum Spannning Trees" by Karger, Klein, Tarjan. JACM 42(2), 1995 and
    [3] "Verification and Sensitivity Analysis Of Minimum Spanning Trees In Linear Time" by Brandon Dixon, Monika Rauch, Robert E. Tarjan.
     
    Feb 8 We described PRIM's and Boruvka's algorithms for computing MST and then showed how a hybrid approach can make use of Boruvka and Prim's MST algorithm to improve the running time to O(m log (log n)). Next, we defined some terminologies to set the background for expected linear time algorithm for computing MST (minimum spanning forest as well) and then described the algorithm and proved the running time to be linear ( expected running time). Homework 5 Due on 2/15
    Feb 9 (x-period) CANCELLED !!
    John MacMaster will give his presentation on 2/23.
      "PRIMES is in P" by Agrawal et. al. and for background material read the article by Folkmar Bornemann.  
    Feb 13 We started a new topic which is string matching . After describing some terminologies and definitions, we detailed the naive string matching algorithm and then analyzed its complexity, which was O(n2). We showed how it can be improved to expected linear time using some number theoretic properties. We set the background for Rabin-Karp's string matching algorithm. Read Chapter 32 from CLRS  
    Feb 15 We described the Rabin-Karp string matching algorithm and explained its running time which is expected linear time. We then concluded with the Knuth-Morris-Pratt (KMP) algorithm for exact string matching. Homework 6 Due on 2/22
    Feb 16 (x-period) Matt Bell presented the Fibonacci Heaps (F-Heaps) data structure. He explained the operations on F-heap such as INSERT, MINIMUM, EXTRACT-MIN, UNION, DECREASE-KEY and DELETEMIN and showed that the operations that do not involve deleting an element run in O(1) amortized time. Read from CLRS &
    the original paper Fibonacci heaps and their uses in improved network optimization algorithms by Fredman and Tarjan (1998), who introduced the idea.
    Matt Bell's presentation .
     
    Feb 17 Jianyang Zeng started with the small-world phenomenon, adopted from Stanley Milgram and current famous small-world models. Then he presented the Kleinberg's model, and existing results on decentralized routing in small-world networks. He gave the proof of the upper bound and lower bound of routing complexity in Kleinberg's paper. He also talked about the method to obtain the O(log n) diameter of Kleinberg's small-world networks. Reading: The Small-World Phenomenon: An Algorithmic Perspective by Jon Kleinberg & Analyzing Kleinberg's (and other) small-world Models by Chip Martel , Van Nguyen.
    Jianyang's slides .
     
    Feb 20 Matching Algorithms was the topic today. We started with some fundamental definitions like matching, maximal matching, maximum matching, perfect matching, M-alternating path, M-augmenting path. Then we gave a generic algorithm to find out a maximum matching in a graph. We proved Berge's theorem and showed how a maximum matching can be found in a bipartite graph. We shall handle the general graph in the next class. Read Santosh Vempala's notes .  
    Feb 22 Patrick Tsang presented the topic of data structures for disjoint sets that find a lot of applications, e.g. algorithms for finding MST. The data structure can be implemented using linked lists or trees. Using tree-implementation, together with two heuristics Union by Rank and Path Compression, he showed that the amortized running time of m operations with n MAKE-SETs is O(ma(n)) . He discussed the slowly growing function a , which is known as inverse Ackermann function. Read Chapter 21 from CLRS.
    Patrick's presentation .
     
    Feb 23 (x-period) John MacMaster presented "Randomized Primality Testing". This presentation covered some group and number theory fundamentals that are important for understanding the methods, and appreciating the challenges, of testing large integers for primality. The talk was focused on explaining the Miller-Rabin randomized primality testing algorithm and the auxiliary procedures that it uses. Read chapter 31 from CLRS
    John's presentation .
     
    Feb 24 We proved the O( √n m) time bound for Hopcroft-Karp algorithm for maximum cardinality matching in bipartite graphs. Then we discussed the maximum matching for general graphs and proved the cycle shrinking lemma and then described Edmonds' algorithm for computing the maximum (cardinality) matching in general graphs. Homework 7
    Class-note is here.
    A survey paper on matching is here.
     
    Feb 27 Rong Zhang presented Single pair shortest paths. First, she introduced "Bellman-Ford Algorithm", which solves the single-source shortest-paths problem in the general case in which edge weights may be negative and then proved its correctness and analyzed its running time. She also introduced Dijkstra's algorithm, which solves the problem for the case in which all edge weights are nonnegative. She showed that with good implementation the running of Dijkstra's algorithm is lower than that of Bellman-Ford algorithm. Read chapter 24 from CLRS  
    March 1 Keren Tan presented "Analysis of the Greedy Approach in Problems of Maximum k-Coverage" by by Dorit S. Hochbaum and Any Pathria. A brief introduction to approximation algorithm is given in the first part of this presentation, and the second part provides an analysis of the quality of solution of a greedy algorithm for the general problem of covering the maximum weight set of elements of a universal set via k structurally constrained subsets. As indicated, performing each greedy step may itself be NP-hard. This analysis provides a performance bound that is a function of beta, a measure of the quality of the pseudo-greedy solution found at each stage. Additional analysis about measuring solution quality in a particular instance is also discussed.
    Read Analysis of the Greedy Approach in Problems of Maximum k-Coverage by Dorit S. Hochbaum and Any Pathria.
    Keren Tan's presentation.
     
    March 2 (x-period) We discussed Linear Programing (LP) today in standard and slack form. Then we proved the convexity of the simplex region if the solution space is bounded. We also showed how to convert a given LP into standard form and then to slack form for simplex algorithm to work on. We ended our class showing how simplex algorithm runs for a given example and showed when simplex can cycle and eventually give worst-case running time. Chapter 29 from CLRS  
    March 3 Nihal presented Huffman codes. In his presentation, he (1) showed how Huffman codes help in data compression, (2) gave & explained the Huffman code constructing algorithm & (3) proved its correctness via two lemmas. Read Chapter 16.3 from CLRS
    & Design and Analysis of Dynamuc Huffman Codes by J.S.Vitter
    Nihal's presentation.
     
    March 6 Vibhor presented the first interior point algorithm for linear programming invented by Narendra Karmarkar in 1984. This required to give the geometric picture of the linear programming problem in terms of convex sets. Then he showed how this algorithm by using repeated projective transformations of a simplex, minimizes over a sphere in the simplex. Finally he explained why this algorithm is linear time. Interior point method in LP
    Read this paper by N. Karmarkar.
     
    March 8 We discussed the Duality in LP and showed how an LP in standard form can be converted to its dual form and proved the weak duality theorem.
    We discussed the final exam format. The question paper will be made available by 10th and is due on 15th by 5p.m. in my office.
       
    March 10 Questions for FINAL EXAM are online Final exam Strictly due on 3/15/2006 by 5PM

    Handouts

    References

    Administrative Basics Syllabus Announcements Schedule Handouts References