FOUNDATIONS OF DATA SCIENCE

Course objectives

General goals: Acquiring the basics of data science and machine learning.
Specific goals: To make students aware of the theoretical and practical tools of data science and machine learning, as well as of their intrinsic limitations; to enable students to tackle real problems with the most appropriate tools.
Knowledge and understanding: The course provides the basic notions, techniques and methodologies employed in data science and machine learning. It also provides the fundamental programming skills needed to apply the theory to real-world scenarios.
Applying knowledge and understanding: At the end of the course, students will be able to deal with real-world data science problems, from casting them into a theoretical framework to manipulating the actual data with the right software tools.
Critical and judgmental abilities: Students will be able to select the techniques to be applied to the case at hand and to evaluate their performance.
Communication skills: Students will be able to represent and communicate the information extracted from data through the rational use of graphics and indicators.
Learning ability: Students will be able to learn autonomously both the theory and the practice of the field.

Channel 1
MATTEO CINELLI

Course program
The course is built around three core dimensions.
Machine Learning Foundations: Datasets and their representation (6h); Linear Regression with bias-variance trade-off and regularization (7h); Classification, Calibration, and Performance Evaluation (6h); Non-Parametric models: K-NN, Decision Trees, Random Forest, and XGBoost (5h); Neural Networks and Backpropagation (4h); Image Representation and Convolution (3h); CNNs and other Network Components (5h); Autoencoders and Variational Inference (5h); Text Representation, Self-Attention, and Transformers (3h); Multimodal Machine Learning (2h).
Complex Networks and Network Science: Introduction to Network Data and Structural Properties of Networks (10h); Generative Models of Network Formation (7h); Mechanistic Models of Network Formation (5h); Community Detection and Graph Clustering Methods (8h).
Programming and Practice: Each objective will be addressed theoretically and through practical programming exercises with Python.
Prerequisites
Calculus and Linear Algebra, including taking derivatives and understanding matrix-vector operations and notation.
Basic Probability and Statistics, including the basics of probability, Gaussian distributions, mean and standard deviation.
Books
Data Science:
Bertsimas, O'Hair, Pulleyblank. The Analytics Edge.
Machine Learning:
Christopher M. Bishop, 2006. Pattern Recognition and Machine Learning.
Deisenroth, Faisal, Ong, 2020. Mathematics for Machine Learning (available at: https://mml-book.github.io/)
Deep Learning:
S. Prince, 2023. Understanding Deep Learning. MIT Press.
Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
Zhang, Lipton, Li, Smola, 2019. Dive into Deep Learning (interactive book and code at: http://d2l.ai/index.html)
Image Analysis and Recognition, Computer Vision:
Richard Szeliski, 2010. Computer Vision: Algorithms and Applications (available at: http://szeliski.org/Book)
Network Science:
Mark Newman, 2018. Networks, 2nd edn. Oxford (online edn, Oxford Academic, 18 Oct. 2018), https://doi.org/10.1093/oso/9780198805090.001.0001
A.-L. Barabási, M. Pósfai, 2016. Network Science. Cambridge: Cambridge University Press. ISBN: 9781107076266, 1107076269
Python:
Allen B. Downey, 2015. Think Python: How to Think Like a Computer Scientist (available at: https://www.greenteapress.com/thinkpython/thinkpython.html)
Jake VanderPlas, 2016. Python Data Science Handbook: Tools and Techniques for Developers: Essential Tools for working with Data (book and notebooks available at: https://github.com/jakevdp/PythonDataScienceHandbook)
Frequency
In-person lessons
Exam mode
The evaluation is structured into three components, each assessing a different set of skills.
1) Theory (34%): A 30-minute multiple-choice exam on the foundational principles of data science and machine learning.
2) Practice (33%): Hands-on work carried out in groups of 3-5 students, combining:
a) Coding Assignments (16.5%): Two Python-based tasks emphasizing technical implementation.
b) Final Project & Presentation (16.5%): A real-world problem, from model design to stakeholder communication.
3) Network Science Laboratory (33%): A 30-minute multiple-choice exam on network science concepts.
Lesson mode
In-person lessons
INDRO SPINELLI

Course program
Machine Learning Foundations: Datasets and their representation (6h); Linear Regression with bias-variance trade-off and regularization (7h); Classification, Calibration, and Performance Evaluation (6h); Non-Parametric models: K-NN, Decision Trees, Random Forest, and XGBoost (5h); Neural Networks and Backpropagation (4h); Image Representation and Convolution (3h); CNNs and other Network Components (5h); Autoencoders and Variational Inference (5h); Text Representation, Self-Attention, and Transformers (3h); Multimodal Machine Learning (2h).
Programming and Practice: Each objective will be addressed theoretically and through practical programming exercises with Python.
Prerequisites
Calculus and Linear Algebra, including taking derivatives and understanding matrix-vector operations and notation.
Basic Probability and Statistics, including the basics of probability, Gaussian distributions, mean and standard deviation.
Books
Data Science:
Bertsimas, O'Hair, Pulleyblank. The Analytics Edge.
Machine Learning:
Christopher M. Bishop, 2006. Pattern Recognition and Machine Learning.
Deisenroth, Faisal, Ong, 2020. Mathematics for Machine Learning (available at: https://mml-book.github.io/)
Deep Learning:
S. Prince, 2023. Understanding Deep Learning. MIT Press.
Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
Zhang, Lipton, Li, Smola, 2019. Dive into Deep Learning (interactive book and code at: http://d2l.ai/index.html)
Image Analysis and Recognition, Computer Vision:
Richard Szeliski, 2010. Computer Vision: Algorithms and Applications (available at: http://szeliski.org/Book)
Python:
Allen B. Downey, 2015. Think Python: How to Think Like a Computer Scientist (available at: https://www.greenteapress.com/thinkpython/thinkpython.html)
Jake VanderPlas, 2016. Python Data Science Handbook: Tools and Techniques for Developers: Essential Tools for working with Data (book and notebooks available at: https://github.com/jakevdp/PythonDataScienceHandbook)
Frequency
Please refer to the degree course regulations.
Exam mode
The evaluation is structured into two components, each assessing a different set of skills.
1) Theory (50%): A 30-minute multiple-choice exam on the foundational principles of data science and machine learning.
2) Practice (50%): Hands-on work carried out in groups of 3-5 students, combining:
a) Coding Assignments (25%): Python-based work emphasizing technical implementation.
b) Final Project & Presentation (25%): A real-world problem, from model design to stakeholder communication.
Lesson mode
Traditional in-person lectures
  • Lesson code: 1047627
  • Academic year: 2025/2026
  • Course: Computer Science
  • Curriculum: Single curriculum
  • Year: 2nd year
  • Semester: 1st semester
  • SSD: INF/01
  • CFU: 6