Course Description: This course provides an
introduction to algorithms and algorithm design techniques at a level
more advanced than a typical undergraduate course in algorithms, yet
basic enough to be applicable widely across the many subfields of
computer science. There is a strong emphasis on rigorously proving the
correctness of and running time bounds on the algorithms studied.
Typical techniques covered include, but are not limited to,
randomization, amortized analysis, linear programming and
approximation. Typical problems include, but are not limited to, splay
trees, perfect hashing, skip lists, fast randomized algorithms for
minimum cuts and minimum spanning trees, maximum matching, pattern
matching, and some simple approximation algorithms: both combinatorial
and based on linear programming relaxation.
Prerequisites: Computer Science 25 and 39. Additionally, students
are expected to be familiar with basic concepts from graph theory,
discrete mathematics, linear algebra, and probability.
Rahul Ray | Office: Sudikoff 216 | 6-1613 | Office Hours: MW 11:30 a.m to 12:30 p.m and by appointment
Textbook(s)
"Introduction to Algorithms", by Cormen, Leiserson, Rivest
and Stein, MIT Press. It will be referred to as "CLRS". Make sure that you have the second edition.
"Randomized Algorithms", by Motwani and Raghavan, Cambridge
University Press. We will refer this book by [MR]
Prerequisites
CS 25, CS 39
Grading
70% Homework + Class presentation (mandatory)
30% Final Exam
Homework and Class Presentation
Assignments will be given on Monday every week ( except for the first week, which will be
given on Wednesday 1/4) and is strictly due on next Monday. You can drop it to my mail-box
or hand it over in the class on or before due date.
Students can do it in group of at most 3 people or individually.
Class presentation for each student is mandatory and
usually will be on Fridays. Please
have a look at
this page
for an idea on how to give good talk. Topics for the talk will be assigned at least 1.5 week ahead
of the scheduled presentation. If for any reason the assigned speaker is unable to attend
the class, there will be NOTHING to do for us. So, inform me as early as possible for such
an unlikely event. Occassionally, there will be class/presentation during x-period, which
will be notified early.
Final Exam
It will be a take-home final exam, due around March 15 (date to be
confirmed later).
6/1: We discussed universal hashing and a construction of universal hash function. We showed
how to prove universality of hash function and construct them. We saw the construction
of a class of universal hash function.
Week 2
1/9: We concluded hashing with open addressing scheme and perfect hashing. We gave a brief
introduction of skip-list. From now on, homeworks will be given on Wednesdays and will be due on following
Wednesday.
1/11: We discussed skip-list. We proved the expected search complexity
is O(log n) in skip-list. Reading material is posted on-line for skip-list.
Homework 2 is distributed.
1/13: David Callender presented Cuckoo Hashing by Rasmus Pagh & Flemming Friche Rodler.
This paper appeared in Journal of Algorithms 51 (2004), p. 122-144. He explained what is cuckoo-hashing and
how it can be used to have worst-case O(1) look-up time.
David's presentation is here.
Week 3
1/16: No class today - MLK day.
1/18: We discussed the definition of splay-tree and explained
the zig, zig-zig, and zig-zag operations. We introduced amortized
analysis and stated the access lemma.
1/20: No class: Rahul is away
Week 4
1/23: We
proved access lemma and then discussed the operations on a splay tree
like Access, Splay, Join, Split, Delete. Also, we have proved the
amortized complexity of splaying operation.
1/25: We introduced graph algorithms. We defined few technical terms related to min-cut. Then, described the
contraction procedure and stated KARGER's randomized min-cut algorithm. See the reading material in the table below.
1/26 (x-period): Nicolas Lane presented "Randomized Dining Philosphers to
TDMA Scheduling in Wireless Sensor Networks" by Injong Rhee et. al.
He described
an algorithm for one-time dining philosopher problem (ODP) and proved the
Liveness theorem that gives an upper bound on the expected number of rounds
during which a process remains in the trying and hopeful state.
1/27: We discussed in detail the randomized min-cut algorithm by Karger et. al. and then
showed how to improve it to O(n2 log n ) by recursion.
Week 5
1/30: Priya Natarajan presented how the weights of the items in a
splay tree can be cleverly chosen to obtain strong results
from Access Lemma like balance theorem, static optimality theorem and
static finger theorem. She also gave a proof of working set theorem and
some of the amortized times of splay tree operations from Update Lemma.
2/1: We set the background of max-flow problem by defining
some terminologies and proving some lemmas and gave an intuition about max-flow min-cut theorem.
We also described the residual network.
2/3: We showed how to construct residual network and proved
the Max-flow min-cut theorem.
We also presented the FORD-FULKERSON (F-F) algorithm and analyzed its
running time. We showed how EDMONDS-KARP algorithm improves the
F-F's running time to O(V E2 ).
Week 6
2/6: We defined the Minimum Spanning Trees (MST) and cited some properties of MST.
We gave the generic algorithm for computing MST and then
described the Red rule and Blue rule for MST in a graph and gave their proofs. We described
Kruskal's algorithm for computing MST.
2/8: We described PRIM's and Boruvka's algorithms for computing MST and then showed
how a hybrid approach can make use of Boruvka and Prim's MST algorithm to improve the running time
to O(m log (log n)). Next, we defined some terminologies to set the background for
expected linear time algorithm for computing MST (minimum spanning forest as well) and then described the algorithm and proved
the running time to be linear ( expected running time).
2/10: We had a day off for winter carnival.
Week 7
2/13: We started a new topic which is string matching . After describing
some terminologies and definitions, we detailed the naive string matching algorithm and
then analyzed its complexity, which was O(n2). We showed how it can be improved to expected linear
time using some number theoretic properties. We set the background for Rabin-Karp's string matching algorithm.
2/15: We described the Rabin-Karp string matching algorithm
and explained its running time which is expected linear time. We then
concluded with the Knuth-Morris-Pratt (KMP) algorithm for exact string matching.
2/16 (x-period): Matt Bell presented the Fibonacci Heaps (F-Heap) data structure today.
He explained the operations in F-heap such as INSERT, MINIMUM, EXTRACT-MIN, UNION, DECREASE-KEY
and DELETEMIN and showed that the operations that do not involve deleting an element run in
O(1) amortized time.
2/17: Jianyang Zeng started with the small-world phenomenon adopted from
Stanley Milgram, and current famous
small-world models. Then he
presented the Kleinberg's model, and existing results on decentralized routing in small-world networks. He
gave the proof of the upper bound and lower bound of routing complexity in Kleinberg's paper. He also talked
about the method to obtain the O(log n) diameter of Kleinberg's small-world networks.
Week 8
2/20: Matching algorithms was the topic today. We started with
some fundamental definitions like matching, maximal matching, maximum matching,
perfect matching, M-alternating path, M-augmenting path. Then we gave a generic
algorithm to find out a maximum matching in a graph. We proved Berge's theorem and
showed how a maximum matching can be found in a bipartite graph. We shall handle the general
graph in the next class.
2/22: Patrick Tsang presented the topic of data structures for disjoint
sets that find a lot of applications, e.g.
algorithms for finding MST. The data structure can be implemented using linked lists or trees. Using
tree-implementation, together with two heuristics Union by Rank and Path Compression,
he showed that the
amortized running time of m operations with n MAKE-SETs is
O(ma(n)) . He discussed the slowly
growing function a , which is known as inverse Ackermann function.
2/23 (x-perid): John MacMaster presented "Randomized Primality Testing".
This presentation covered some group and number theory fundamentals that are important for understanding the
methods, and appreciating the challenges, of testing large integers for primality. The talk was focused on
explaining the Miller-Rabin randomized primality testing algorithm and the auxiliary
procedures that it uses.
2/24: We proved the O( √n m) time bound for
Hopcroft-Karp algorithm for maximum cardinality matching in bipartite graphs.
Then we discussed the maximum matching for general graphs and proved the cycle
shrinking lemma and then described
Edmonds' algorithm for computing the maximum matching in general graphs.
Week 9
2/27: Rong Zhang presented Single pair shortest paths.
First, she introduced "Bellman-Ford Algorithm", which
solves the single-source shortest-paths problem in the general case in which edge weights
may be negative and then proved its correctness and analyzed its running time. She also introduced Dijkstra's
algorithm, which solves the problem for the case in which all edge weights
are nonnegative. She showed that with good implementation the running of Dijkstra's algorithm is lower than
that of Bellman-Ford algorithm.
3/1: Keren Tan presented "Analysis of the Greedy
Approach in Problems of Maximum k-Coverage" by by Dorit S. Hochbaum and
Any Pathria. A brief introduction to approximation algorithm is given
in the first part
of this presentation, and the second part provides an analysis of the
quality of solution of a greedy algorithm for the general problem of
covering the maximum weight set of elements of a universal set via k
structurally constrained subsets. As indicated, performing each greedy
step may itself be NP-hard. This analysis provides a performance bound
that is a function of beta, a measure of the quality of the
pseudo-greedy
solution found at each stage. Additional analysis about measuring
solution
quality in a particular instance is also discussed.
3/2 (x-period): We discussed Linear Programing (LP) today in
standard and slack form. Then we proved the convexity of the simplex
region if the solution space is bounded. We also showed how to convert a given
LP into standard form and then to slack form for simplex algorithm
to work on. We ended our class showing how simplex algorithm runs for a given
example and showed when simplex can cycle and eventually give
worst-case running time.
3/3: Nihal presented Huffman codes. In his presentation, he
(1) showed how Huffman codes help in data compression,
(2) gave & explained the Huffman code constructing algorithm &
(3) proved its correctness via two lemmas.
Week 10
3/6: Vibhor presented the first interior point algorithm for linear programming
invented by Narendra Karmarkar in 1984. This required to give the geometric
picture of the linear programming problem in terms of convex sets. Then he
showed how this algorithm by using repeated projective transformations of a
simplex, minimizes over a sphere in the simplex. Finally he explained why this
algorithm is linear time.
3/8: We discussed the LP duality and showed
how an LP in standard form can be converted to its dual form
and proved the weak duality theorem and
we discussed the final exam format. The question paper will be
made available by 10th and is due on 15th by 5p.m. in my office.
Introduction, syllabus, course structure, etc. Hash Tables
and a quick overview of some mathematical fundamentals.
Will be following chapter 11 from CLRS (second edition).
Lecture Notes 1 .
We discussed universal hashing and a construction of universal hash function.
Jan 9
We discussed perfect hashing and open addressing scheme and gave a brief introduction of
skip-list data structure.
Homework 1
is due today .
Jan 11
We completed the discussion on skip-list. Recommended reading for skip-list is two
papers by William Pugh, which are here (long) and
here (short) too. Brief
lecture note is here.
We discussed the definition of splay-tree and explained
the zig, zig-zig, and zig-zag rotations. We introduced amortized
analysis and stated the access lemma. Read the paper : Self-adjusting
binary search trees by Daniel Sleator and Robert Endre Tarjan (1985, Journal of ACM).
We proved access lemma and then discussed the operations on a splay tree like
Access, Splay, Join, Split, Delete. Also, we have proved the amortized complexity of splaying operation.
Jan 25
Graph Algorithms, Min-cut
Reading [1] ,
[2] and Chapter 10 of Randomized Algorithms [MR] and CLRS.
Jan 26 (x-period)
Nicolas Lane presented "Randomized Dining
Philosphers to TDMA Scheduling in Wireless Sensor Networks"
by Injong Rhee et. al. He described
an algorithm for one-time dining philosopher problem (ODP) and proved the
Liveness theorem that gives an upper bound on the expected number of rounds
during which a process remains in the trying and hopeful state.
Priya Natarajan presented how the weights of the items in a
splay tree
can be cleverly chosen to obtain strong results from Access Lemma like
balance theorem, static optimality theorem and static finger theorem.
She also gave a proof of working set theorem and some of the amortized
times of splay tree operations from Update Lemma. She spoke about long splay
theorem too.
We set the background of max-flow problem by defining
some terminologies and proving some lemmas and gave an intuition about max-flow min-cut theorem.
We also described the residual network.
Due on 2/8. Announcement: Every student's presentation will be a part of the syllabus for the final exam.
Feb 3
We showed how to construct residual network
and proved the Max-flow min-cut theorem.
We also presented the FORD-FULKERSON (F-F) algorithm and analyzed its
running time. We showed how EDMONDS-KARP algorithm improves the F-F's
running time to O(V E2 ).
Feb 6
We defined the Minimum Spanning Trees (MST) and cited some properties of MST.
We gave the generic algorithm for computing MST and then
described the Red rule and Blue rule for MST in a graph and gave their proofs. We described
Kruskal's algorithm for computing MST.
We described PRIM's and Boruvka's algorithms for computing MST and then showed
how a hybrid approach can make use of Boruvka and Prim's MST algorithm to improve the running time
to O(m log (log n)). Next, we defined some terminologies to set the background for
expected linear time algorithm for computing MST (minimum spanning forest as well) and then described the algorithm and proved
the running time to be linear ( expected running time).
We started a new topic which is string matching . After describing
some terminologies and definitions, we detailed the naive string matching algorithm and
then analyzed its complexity, which was O(n2). We showed how it can be improved to expected linear
time using some number theoretic properties. We set the background for Rabin-Karp's string matching algorithm.
Read Chapter 32 from CLRS
Feb 15
We described the Rabin-Karp string matching algorithm
and explained its running time which is expected linear time. We then
concluded with the Knuth-Morris-Pratt (KMP) algorithm for exact string matching.
Matt Bell presented the Fibonacci Heaps (F-Heaps) data structure.
He explained the operations on F-heap such as INSERT, MINIMUM, EXTRACT-MIN, UNION, DECREASE-KEY
and DELETEMIN and showed that the operations that do not involve deleting an element run in
O(1) amortized time.
Jianyang Zeng started with the small-world phenomenon, adopted from
Stanley Milgram and current famous
small-world models. Then he
presented the Kleinberg's model, and existing results on decentralized routing in small-world networks. He
gave the proof of the upper bound and lower bound of routing complexity in Kleinberg's paper. He also talked
about the method to obtain the O(log n) diameter of Kleinberg's small-world networks.
Matching Algorithms was the topic today. We started with
some fundamental definitions like matching, maximal matching, maximum matching,
perfect matching, M-alternating path, M-augmenting path. Then we gave a generic
algorithm to find out a maximum matching in a graph. We proved Berge's theorem and
showed how a maximum matching can be found in a bipartite graph. We shall handle the general
graph in the next class.
Patrick Tsang presented the topic of data structures for disjoint sets that find a lot of applications, e.g.
algorithms for finding MST. The data structure can be implemented using linked lists or trees. Using
tree-implementation, together with two heuristics Union by Rank and Path Compression,
he showed that the
amortized running time of m operations with n MAKE-SETs is
O(ma(n)) . He discussed the slowly
growing function a , which is known as inverse Ackermann function.
John MacMaster presented "Randomized Primality Testing".
This presentation covered some group and number theory fundamentals that are important for understanding the
methods, and appreciating the challenges, of testing large integers for primality. The talk was focused on
explaining the Miller-Rabin randomized primality testing algorithm and the auxiliary
procedures that it uses.
We proved the O( √n m) time bound for
Hopcroft-Karp algorithm for maximum cardinality matching in bipartite graphs.
Then we discussed the maximum matching for general graphs and proved the cycle
shrinking lemma and then described
Edmonds' algorithm for computing the maximum (cardinality) matching in general graphs.
Rong Zhang presented Single pair shortest paths.
First, she introduced "Bellman-Ford Algorithm", which
solves the single-source shortest-paths problem in the general case in which edge weights
may be negative and then proved its correctness and analyzed its running time. She also introduced Dijkstra's
algorithm, which solves the problem for the case in which all edge weights
are nonnegative. She showed that with good implementation the running of Dijkstra's algorithm is lower than
that of Bellman-Ford algorithm.
Read chapter 24 from CLRS
March 1
Keren Tan presented "Analysis of the Greedy Approach in
Problems of Maximum k-Coverage" by by Dorit S. Hochbaum and Any Pathria. A brief introduction
to approximation algorithm is given in the first part
of this presentation, and the second part provides an analysis of the
quality of solution of a greedy algorithm for the general problem of
covering the maximum weight set of elements of a universal set via k
structurally constrained subsets. As indicated, performing each greedy
step may itself be NP-hard. This analysis provides a performance bound
that is a function of beta, a measure of the quality of the pseudo-greedy
solution found at each stage. Additional analysis about measuring solution
quality in a particular instance is also discussed.
We discussed Linear Programing (LP) today in
standard and slack form. Then we proved the convexity of the simplex
region if the solution space is bounded. We also showed how to convert a given
LP into standard form and then to slack form for simplex algorithm
to work on. We ended our class showing how simplex algorithm runs for a given
example and showed when simplex can cycle and eventually give
worst-case running time.
Chapter 29 from CLRS
March 3
Nihal presented Huffman codes. In his presentation, he
(1) showed how Huffman codes help in data compression,
(2) gave & explained the Huffman code constructing algorithm &
(3) proved its correctness via two lemmas.
Vibhor presented the first interior point algorithm for linear programming
invented by Narendra Karmarkar in 1984. This required to give the geometric
picture of the linear programming problem in terms of convex sets. Then he
showed how this algorithm by using repeated projective transformations of a
simplex, minimizes over a sphere in the simplex. Finally he explained why this
algorithm is linear time.
Interior point method in LP
Read this paper by N.
Karmarkar.
March 8
We discussed the Duality in LP and showed
how an LP in standard form can be converted to its dual form
and proved the weak duality theorem.
We discussed the final exam format. The question paper will be
made available by 10th and is due on 15th by 5p.m. in my office.