The purpose of this course is to introduce students to large-scale machine learning. The focus is as much on optimization algorithms as on distributed systems.
Teacher: Giovanni Neglia
Main references:
Léon Bottou, Frank E. Curtis, Jorge Nocedal, Optimization Methods for Large-Scale Machine Learning, available here
Joseph E. Gonzalez, Emerging Systems for Large-Scale Machine Learning, invited tutorial at ICML 2014, slides [pdf], [pptx]
S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, see dedicated page
Mu Li, David G. Andersen, Alexander Smola, and Kai Yu, Communication Efficient Distributed Machine Learning with the Parameter Server, NIPS 2014, available here
Abadi et al., TensorFlow: A System for Large-Scale Machine Learning, OSDI 2016, [pdf]
Feng Niu, Benjamin Recht, Christopher Ré, and Stephen J. Wright, Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, available here
Eric P. Xing, Qirong Ho, Pengtao Xie, Wei Dai, Strategies and Principles of Distributed Machine Learning on Big Data, available here
Giovanni Neglia, Gianmarco Calbi, Don Towsley, and Gayane Vardoyan, The Role of Network Topology for Distributed Machine Learning, Infocom 2019
Other resources:
Sébastien Bubeck, Convex Optimization: Algorithms and Complexity, available here
Evaluation: 30% classwork (a 10-minute test in every lesson; only the 5 best marks will be considered), 30% individual project due in week 7, 40% final exam.
Lessons
You are free to use the slides below in your own presentations, but please let me know and acknowledge the source. Comments are welcome.
First lesson (December 19, 2018): introduction to the course, math refresher (gradient, Hessian, convex sets and functions), introduction to ML optimization and analysis of stochastic gradient methods (sections 1-4 of Bottou et al.)
Individual project
The individual project is an opportunity for the student to actively use the material taught in the course.
The student is free to choose the goal of his/her project, but is invited to discuss it with the teacher.
Possible goals are:
reproduce an experimental result in a paper,
design or perform an experiment to support or refute a statement in a paper,
apply some of the optimization algorithms described in the course to a specific problem the student is interested in (e.g. for another course, his/her final project, etc.),
compare different algorithms,
implement an algorithm in a distributed system (Spark, TensorFlow, …),
…
The mark will take into account the originality of the project, presentation quality, technical correctness, and task difficulty. Any form of plagiarism will lead to a reduction of the final mark.
A list of possible projects is provided below.
Submission rules
The student will provide:
1) a 3-page report formatted according to the ICLR template, with unlimited additional pages for the bibliography and for optional appendices containing proofs, descriptions of the code, or additional experiments,
2) the code developed,
3) a readme file containing instructions to run the code and reproduce the experiments in the report.
The report must clearly describe and motivate the goal of the project, provide any relevant background and explain the original contribution of the student.
What is explained in the course can be considered general knowledge and should not be repeated in the report.
The code must be well commented.
The student will make the material above available online in a zipped folder named with his/her name, and will send the link to the teacher.