Advanced Machine Learning

Course objectives

General objectives: The course presents advanced and recent concepts of machine learning and their application in computer vision via deep neural network (DNN) models. It includes theory and practical coding, as well as a final hands-on project. For the coding assignments and the final project, students will work in teams and present their ideas and project outcomes to the class.
Specific objectives: The first part of the course covers state-of-the-art DNN models for classification and regression, applied to detection (where the objects are in the image), pose estimation (whether people stand, sit or crouch) and re-identification (estimating a unique vector representation for each person). The course further discusses DNNs for multi-task objectives (joint detection, pose estimation, re-identification, segmentation, depth estimation, etc.). This first part also includes DNNs that apply to video sequences by leveraging memory (e.g. LSTMs) or attention (Transformers). The second part of the course covers models, training techniques and data manipulation for generalization, domain adaptation and meta-learning. Beyond transfer learning (how pre-trained models may be deployed for other tasks), it discusses multi-modal learning (with different sensor modalities such as depth or thermal cameras) and self-supervision (e.g. training the DNN model by solving jigsaw puzzles) to auto-annotate large amounts of data. Finally, it presents domain adaptation (e.g. applying daytime detectors to night vision) and meta-learning, a recent framework for learning how to learn a task, e.g. online or from little available data.
Knowledge and understanding: At the end of the course students will be familiar with state-of-the-art DNN models for multiple tasks and multi-task objectives, as well as with generalization and the effective use of labelled and unlabelled data for learning, self-supervision and meta-learning.
Apply knowledge and understanding: At the end of the course students will be familiar with the most recent advances in machine learning across a variety of tasks, their adaptation to novel domains and the continual self-learning of algorithms. They will be able to explain the algorithms and choose the most appropriate techniques for a given problem. They will be able to experiment with existing implementations, and to design and write programs that implement new solutions for a given task or problem in machine learning and computer vision.
Critical and judgment skills: Students will be able to analyse a problem or task and identify the most suitable methodologies and techniques, in terms of both how effectively they solve the problem (accuracy) and how feasible they are, including efficiency and the required amount of data and annotation. In addition to class discussions, critical and judgment skills will be developed through assignments, a course project and a final project report.
Communication skills: Students will acquire the ability to present their knowledge in a clear and organized way, which will be verified through a final project presentation and its discussion. Students will be able to express their solutions rigorously and to explain the structure of the code they have written.
Learning ability: The acquired knowledge will enable students to approach the study of other problems in machine learning and computer vision. Learning ability will result from the chosen lecture topics, which cover the broadest areas of advanced machine learning, as well as from the final project, for which students will dive deep into a new topic beyond the taught material.

Channel 1
FABIO GALASSO

Program - Frequency - Exams

Course program
The course will present advanced concepts of machine learning and their application in computer vision via deep neural network (DNN) models. It will include theory and practical coding, as well as a final hands-on project.
In the first part of the course, I will introduce state-of-the-art DNN models for classification, showing how to estimate which objects are within an image. I will then showcase regression, as applied to detection (where the objects are in the image), pose estimation (whether people stand, sit or crouch) and re-identification (estimating a unique vector representation for each person). I will further discuss DNNs for multi-task objectives (joint detection, pose estimation, re-identification, segmentation, depth estimation, etc.). This first part will include DNNs for video sequences, which leverage memory (e.g. LSTMs), attention (Transformers) and graph structures (Graph Neural Networks).
In the second part of the course, I will discuss generalization and the effective use of labelled and unlabelled data for learning. Beyond transfer learning (how pre-trained models may be deployed for other tasks), I will discuss multi-modal learning (with different modalities such as text, audio, video and event-based cameras) and self-supervision (e.g. training the DNN model by solving jigsaw puzzles, or via contrastive learning) to auto-annotate large amounts of data. I will also present domain adaptation (e.g. applying daytime detectors to night vision) and meta-learning, a recent framework for learning how to learn a task, e.g. online or from little available data.
Finally, I will introduce novel machine learning trends such as hyperbolic neural networks, and generative techniques such as diffusion, adversarial and auto-encoding models, together with their applications to tasks such as anomaly detection while estimating the model uncertainty.
For more information, see the course websites of the 24/25 academic year: https://sites.google.com/di.uniroma1.it/aml-2024-2025 and of the 25/26 academic year: https://sites.google.com/di.uniroma1.it/amlcv-2025-2026
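As an illustration of what "auto-annotating" data through self-supervision means, the jigsaw example mentioned in the program can be sketched as follows. This is a simplified sketch and not course material: the 2x2 grid, the function name `make_jigsaw_example` and the toy 8x8 image are assumptions made for illustration. The label is generated from the data itself (the index of the permutation used to shuffle the tiles), so no human annotation is needed; a DNN would then be trained to predict that label from the shuffled tiles.

```python
import itertools

import numpy as np

# All 24 orderings of a 2x2 grid of tiles (hypothetical setup for illustration).
# The index into this list serves as the automatically generated label.
PERMUTATIONS = list(itertools.permutations(range(4)))


def make_jigsaw_example(image, rng):
    """Split an image into 2x2 tiles, shuffle them, return (shuffled_tiles, label)."""
    h, w = image.shape[:2]
    th, tw = h // 2, w // 2
    # Tiles in row-major order: top-left, top-right, bottom-left, bottom-right.
    tiles = [
        image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
        for r in range(2)
        for c in range(2)
    ]
    label = int(rng.integers(len(PERMUTATIONS)))
    order = PERMUTATIONS[label]
    shuffled = np.stack([tiles[i] for i in order])
    return shuffled, label


rng = np.random.default_rng(0)
image = np.arange(8 * 8).reshape(8, 8)  # toy stand-in for a real image
shuffled, label = make_jigsaw_example(image, rng)
print(shuffled.shape, label)
```

Only the label generation is shown here; the network, the loss and the training loop are left out on purpose.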
Prerequisites
- Proficiency in Python, some high-level familiarity with C/C++: all class assignments will be in Python (and use NumPy), but some of the deep learning libraries may be in C++
- Calculus and Linear Algebra: taking derivatives, understanding matrix-vector operations and notation
- Basic Probability and Statistics: basics of probabilities, Gaussian distributions, mean, standard deviation, etc.
- Basic Machine Learning: cost functions, derivatives and optimization with gradient descent
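To give a rough sense of the expected starting level, the prerequisites above (NumPy, matrix-vector operations, a cost function, derivatives and gradient descent) come together in a sketch like the one below. It is only an illustrative example under stated assumptions: the function name `fit_linear_gd`, the synthetic noiseless data and the hyperparameters are all hypothetical and not taken from the course.

```python
import numpy as np


def fit_linear_gd(X, y, lr=0.1, steps=500):
    """Fit y ~ X @ w + b by minimising the mean squared error with gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        residual = X @ w + b - y             # prediction error, shape (n,)
        grad_w = 2.0 / n * (X.T @ residual)  # d(MSE)/dw
        grad_b = 2.0 * residual.mean()       # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic, noiseless data generated from known (hypothetical) parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([1.5, -2.0])
true_b = 0.5
y = X @ true_w + true_b

w, b = fit_linear_gd(X, y)
print(w, b)  # should be close to true_w and true_b
```

Students comfortable reading and modifying code of this kind should be well placed for the Python assignments.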
Books
Slides and coding scripts will be distributed after lectures, as well as references to online material, including papers and blogs.
Reference books for Machine Learning:
- Deisenroth, Faisal, Ong, 2020. Mathematics for Machine Learning (available at: https://mml-book.github.io/)
- Christopher M. Bishop, 2006. Pattern Recognition and Machine Learning
Reference books for Deep Learning:
- Aston Zhang, Zachary Lipton, Alexander J. Smola, Mu Li, 2023. Dive Into Deep Learning (available at: https://d2l.ai/)
- Francois Fleuret, 2024. The Little Book of Deep Learning (available at: https://fleuret.org/dlc/)
- Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković, 2024. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges (available at: https://geometricdeeplearning.com/book/)
- Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
- Andrew Ng, 2019. Machine Learning Yearning (available at: https://www.deeplearning.ai/machine-learning-yearning/)
- Zhang, Lipton, Li, Smola, 2019. Dive into Deep Learning (interactive book and code at: http://d2l.ai/index.html)
Reference books for Computer Vision:
- Antonio Torralba, Phillip Isola and William T. Freeman, 2024. Foundations of Computer Vision (https://mitpress.mit.edu/9780262048972/foundations-of-computer-vision/)
- Richard Szeliski, 2010. Computer Vision: Algorithms and Applications (available at: http://szeliski.org/Book)
Reference books for Robotics:
- Frank Dellaert, 2024. Robotics (available at: https://www.roboticsbook.org/)
Reference books for Python and PyTorch:
- Aston Zhang, Zachary Lipton, Alexander J. Smola, Mu Li, 2023. Dive Into Deep Learning (available at https://d2l.ai/ , select Notebooks/Pytorch)
- Jake VanderPlas, 2016. Python Data Science Handbook: Essential Tools for Working with Data (book and notebooks available at: https://github.com/jakevdp/PythonDataScienceHandbook)
Online tutorials for Python: https://docs.python.org/3/tutorial/
Online tutorials for PyTorch: https://pytorch.org/tutorials/
Teaching mode
Lectures with blackboard and slides, coding and project assignments and discussions.
Frequency
Please refer to the degree course regulations.
Exam mode
Exam:
1) Theory: 50% (oral exam)
2) Practice: 50%, including assignments in Python and PyTorch, to be submitted by given deadlines during the course, and a final project and presentation
Assignments:
- The assignments and the final project must be submitted in groups of 2 to 4 students
Final project:
- Ideas for projects and resources will be discussed in class
Bibliography
- Christopher M. Bishop, 2006. Pattern Recognition and Machine Learning
- Michael M. Bronstein, Joan Bruna, Taco Cohen, Petar Veličković, 2024. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges (available at: https://geometricdeeplearning.com/book/)
- Frank Dellaert, 2024. Robotics (available at: https://www.roboticsbook.org/)
- Deisenroth, Faisal, Ong, 2020. Mathematics for Machine Learning (available at: https://mml-book.github.io/)
- Francois Fleuret, 2024. The Little Book of Deep Learning (available at: https://fleuret.org/dlc/)
- Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
- Zhang, Lipton, Li, Smola, 2019. Dive into Deep Learning (interactive book and code at: http://d2l.ai/index.html)
- Andrew Ng, 2019. Machine Learning Yearning (available at: https://www.deeplearning.ai/machine-learning-yearning/)
- Antonio Torralba, Phillip Isola and William T. Freeman, 2024. Foundations of Computer Vision (https://mitpress.mit.edu/9780262048972/foundations-of-computer-vision/)
- Aston Zhang, Zachary Lipton, Alexander J. Smola, Mu Li, 2023. Dive Into Deep Learning (available at: https://d2l.ai/)
- Richard Szeliski, 2010. Computer Vision: Algorithms and Applications (available at: http://szeliski.org/Book)
- Jake VanderPlas, 2016. Python Data Science Handbook: Essential Tools for Working with Data (book and notebooks available at: https://github.com/jakevdp/PythonDataScienceHandbook)
Lesson mode
Lectures are delivered in person and alternate between slide presentations and writing at the blackboard. The course additionally includes small-group exercises during lectures, as well as assignments and projects to be prepared by the students outside of lectures. The student presentations of their final projects are also part of the course. All material will be distributed via the course mailing list. The course additionally includes guest lectures by field experts.
  • Lesson code: 10589621
  • Academic year: 2025/2026
  • Course: Data Science
  • Curriculum: Single curriculum
  • Year: 2nd year
  • Semester: 1st semester
  • SSD: INF/01
  • CFU: 6