ADVANCED MACHINE LEARNING - Single channel

Course coordinator and recording examiner: FABIO GALASSO

Learning objectives

General objectives:
The course presents advanced and recent concepts of machine learning and their application in computer vision via deep neural network (DNN) models. It includes theory and practical coding, as well as a final hands-on project. For the coding assignments and the final project, students will work in teams and present their ideas and project outcomes to the class.

Specific objectives:
The first part of the course covers state-of-the-art DNN models for classification and regression, applied to detection (where the objects are in the image), pose estimation (whether people stand, sit or crouch) and re-identification (estimating a unique vector representation for each person). The course further discusses DNNs for multi-task objectives (joint detection, pose estimation, re-identification, segmentation, depth estimation, etc.). This first part also includes DNNs that apply to video sequences by leveraging memory (e.g. LSTMs) or attention (Transformers).
The second part of the course covers models, training techniques and data manipulation for generalization, domain adaptation and meta-learning. Beyond transfer learning (how pre-trained models may be deployed for other tasks), it discusses multi-modal learning (with different sensor modalities such as depth or thermal cameras) and self-supervision (e.g. training the DNN model by solving jigsaw puzzles) to auto-annotate large amounts of data. Finally, it presents domain adaptation (e.g. applying daytime detectors to night vision) and meta-learning, a recent framework for learning how to learn a task, e.g. online or from little available data.

Knowledge and understanding:
At the end of the course students will be familiar with state-of-the-art DNN models for multiple tasks and multi-task objectives, as well as generalization and the effective use of labelled and unlabelled data for learning, self-supervision and meta-learning.

Apply knowledge and understanding:
At the end of the course students will be familiar with the most recent advances in machine learning across a variety of tasks, their adaptation to novel domains and the continual self-learning of algorithms. They will be able to explain the algorithms and choose the most appropriate techniques for a given problem. They will be able to experiment with existing implementations, and to design and write programs that solve a new task or problem in machine learning and computer vision.

Critical and judgment skills:
Students will be able to analyse a problem or task and identify the most suitable methodologies and techniques, considering both how effectively they solve the problem (accuracy) and how feasible they are, including efficiency and the required amount of data and annotation. Beyond class discussions, critical and judgment skills will be developed through assignments, a course project and a final project report.

Communication skills:
Students will acquire the ability to present their knowledge in a clear and organized way, which will be verified through a final project presentation and its discussion.
Students will be able to express their solutions rigorously and to explain the structure of the code they have written.

Learning ability:
The acquired knowledge will enable students to take on the study of other problems in machine learning and computer vision. Learning ability will result from the chosen lecture topics, which cover most of the broad areas in advanced machine learning, as well as from the final project, for which students will dive deep into a new topic beyond the taught material.

Prerequisites

- Proficiency in Python, some high-level familiarity with C/C++
  - All class assignments will be in Python (and use NumPy), but some of the deep learning libraries may be in C++
- Calculus and Linear Algebra
  - taking derivatives, understanding matrix-vector operations and notation
- Basic Probability and Statistics
  - basics of probabilities, Gaussian distributions, mean, standard deviation, etc.
- Basic Machine Learning
  - cost functions, derivatives and optimization with gradient descent
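
As a rough self-check of the expected entry level, the sketch below combines the listed prerequisites (Python with NumPy, matrix-vector notation, a cost function, its derivative, and gradient descent) on a toy least-squares problem. The function names, the toy data, the learning rate and the step count are illustrative assumptions, not course material.

```python
import numpy as np


def mse_cost(w, X, y):
    """Mean squared error of the linear model X @ w against targets y."""
    residual = X @ w - y
    return float(residual @ residual) / len(y)


def mse_gradient(w, X, y):
    """Analytic gradient of mse_cost with respect to w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)


def gradient_descent(X, y, lr=0.1, steps=500):
    """Full-batch gradient descent starting from w = 0 (assumed settings)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - lr * mse_gradient(w, X, y)
    return w


# Toy, noise-free data generated from y = 3*x + 1.
# The column of ones lets the second weight act as the intercept.
x = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([x, np.ones_like(x)])
y = 3.0 * x + 1.0

w_hat = gradient_descent(X, y)
print("estimated [slope, intercept]:", w_hat)
print("final cost:", mse_cost(w_hat, X, y))
```

Students who can follow why `mse_gradient` is the derivative of `mse_cost`, and why the learning rate affects convergence, should be comfortable with the mathematical prerequisites.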

Course program

The course will present advanced concepts of machine learning and their application in computer vision via deep neural network (DNN) models. It will include theory and practical coding, as well as a final hands-on project.
In the first part of the course, I will introduce state-of-the-art DNN models for classification, showing how to estimate which objects are within an image. I will then showcase regression, as applied to detection (where the objects are in the image), pose estimation (whether people stand, sit or crouch) and re-identification (estimating a unique vector representation for each person). I will further discuss DNNs for multi-task objectives (joint detection, pose estimation, re-identification, segmentation, depth estimation, etc.). This first part will also include DNNs that apply to video sequences by leveraging memory (e.g. LSTMs) or attention (Transformers).

In the second part of the course, I will discuss generalization and the effective use of labelled and unlabelled data for learning. Beyond transfer learning (how pre-trained models may be deployed for other tasks), I will discuss multi-modal learning (with different sensor modalities such as depth or thermal cameras) and self-supervision (e.g. training the DNN model by solving jigsaw puzzles) to auto-annotate large amounts of data. I will also present domain adaptation (e.g. applying daytime detectors to night vision) and meta-learning, a recent framework for learning how to learn a task, e.g. online or from little available data. Finally, I will introduce novel machine learning trends such as hyperbolic neural networks and generative diffusion models, and their applications to tasks such as anomaly detection while estimating the model uncertainty.
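
To make the jigsaw example of self-supervision concrete, here is a minimal sketch of how such a pretext task can produce labels without human annotation: an image is cut into tiles, the tiles are shuffled with a known permutation, and the index of that permutation becomes the target a network would be trained to predict. The 2x2 grid, the names `PERMUTATIONS` and `make_jigsaw_example`, and the toy image are assumptions made for illustration; no network is trained here.

```python
import itertools

import numpy as np

# Assumed setup: a 2x2 grid gives 4 tiles and 4! = 24 possible orderings.
# The index into this list is the automatically generated class label.
PERMUTATIONS = list(itertools.permutations(range(4)))


def make_jigsaw_example(image, rng):
    """Cut an image into 2x2 tiles, shuffle them, and return
    (shuffled_tiles, permutation_index)."""
    h, w = image.shape[:2]
    th, tw = h // 2, w // 2
    # Tiles in row-major order: top-left, top-right, bottom-left, bottom-right.
    tiles = [
        image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
        for r in range(2)
        for c in range(2)
    ]
    label = int(rng.integers(len(PERMUTATIONS)))
    order = PERMUTATIONS[label]
    # Position k of the shuffled stack holds original tile order[k].
    shuffled = np.stack([tiles[i] for i in order])
    return shuffled, label


rng = np.random.default_rng(0)
image = np.arange(8 * 8).reshape(8, 8)  # toy 8x8 single-channel "image"
tiles, label = make_jigsaw_example(image, rng)
print("tiles shape:", tiles.shape, "permutation label:", label)
```

In a real pipeline the shuffled tiles would be fed to a DNN and the permutation index used as the classification target, so any unlabelled image collection yields training pairs for free.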

For more information see also the '22-'23 course website: https://sites.google.com/di.uniroma1.it/aml-2022-2023

Reference texts

Slides and coding scripts will be distributed after lectures, as well as references to online material including papers and blogs.

Reference books for Machine Learning:
Christopher M. Bishop, 2006. Pattern Recognition and Machine Learning
Deisenroth, Faisal, Ong, 2020. Mathematics for Machine Learning (available at: https://mml-book.github.io/)

Reference books for Deep Learning:
Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
Aston Zhang, Zachary Lipton, Mu Li, Alexander J. Smola, 2019. Dive into Deep Learning (interactive book and code at: http://d2l.ai/index.html)
Andrew Ng, 2019. Machine Learning Yearning (available at: https://www.deeplearning.ai/machine-learning-yearning/)

Reference books for Computer Vision:
Richard Szeliski, 2010. Computer Vision: Algorithms and Applications

Reference books for Python:
Allen B. Downey, 2015. Think Python: How to Think Like a Computer Scientist
Jake VanderPlas, 2016. Python Data Science Handbook: Essential Tools for Working with Data (book and notebooks available at: https://github.com/jakevdp/PythonDataScienceHandbook)

Online tutorials for Python: https://docs.python.org/3/tutorial/
Online tutorials for PyTorch: https://pytorch.org/tutorials/

Teaching methods

Lectures are delivered in class, alternating slide presentations and writing at the blackboard. The course additionally includes small group exercises during lectures, as well as assignments and projects to be prepared by the students outside lecture hours. The student presentations of their final projects are also part of the course. All material will be distributed via the course mailing list. The course additionally includes guest lectures by field experts.

Attendance

Please refer to the degree course regulations.

Assessment method

Exam:
1) Theory: 50% (oral)
2) Practice: 50%, including assignments in Python and PyTorch, to be submitted by the given deadlines during the course, and a final project and presentation

Theory:
- oral exam

Assignments:
- The assignments and the final project must be submitted in groups of 2 to 4 students

Final project:
- Algorithms, objectives and topics for the final project may be freely chosen
- Project ideas and related resources will be discussed in class

Sample questions

Typical project presentation questions concern the insights that emerged from the project and the analysis of the presented results.

Oral exams test knowledge of the course program and the design of a machine learning system.

A few sample questions will be provided during the course and distributed via the course mailing list.

Schedule of teaching activities

  • General Concepts of Deep Learning [10 hours]
    • Reference texts: Aston Zhang, Zachary Lipton, Alexander J. Smola, Mu Li, 2023. Dive Into Deep Learning

  • Advanced Concepts of Deep Learning [25 hours]
    • Reference texts: Aston Zhang, Zachary Lipton, Alexander J. Smola, Mu Li, 2023. Dive Into Deep Learning

  • Coding of Advanced Machine Learning and Computer Vision [10 hours]
    • Reference texts: Aston Zhang, Zachary Lipton, Alexander J. Smola, Mu Li, 2023. Dive Into Deep Learning

  • Advanced Concepts of Computer Vision [15 hours]
    • Reference texts: Antonio Torralba, Phillip Isola and William T. Freeman, 2024. Foundations of Computer Vision

Sustainable Development Goals - UN 2030 Agenda

  • Goal 4
  • Academic year: 2024/2025
  • Degree programme: Data Science
  • Course code: 10589621
  • Year and semester: 2nd year, 1st semester
  • Type: Core educational activities
  • Area: Computer science and information training
  • SSD: INF/01
  • Mandatory attendance: No
  • Language: English
  • Credits: 6 CFU
  • Total duration: 60 hours
  • Distribution of hours: 36 classroom hours, 24 training hours