Statistical Machine Learning

Course objectives

Devising new machine learning methods and statistical models is a fun and extremely fruitful “art”. But these powerful tools are not useful unless we understand when they work and when they fail. The main goal of statistical learning theory is thus to study, in a statistical framework, the properties of learning algorithms, mainly in the form of so-called error bounds. This course introduces the techniques used to obtain such results, combining methodology with theoretical foundations and computational aspects. It treats both the basic principles for designing successful learning algorithms and the “science” of analyzing an algorithm’s statistical properties and performance guarantees. Theorems are presented together with practical aspects of methodology and intuition, to help students develop the tools for selecting appropriate methods and approaches to problems in their own data analyses. Methods for a wide variety of applied problems will be explored and implemented in open-source software such as R (www.r-project.org), Keras (https://keras.io/) and TensorFlow (https://www.tensorflow.org/).

Knowledge and understanding
On successful completion of this course, students will:
- know the main learning methodologies and paradigms, with their strengths and weaknesses;
- be able to identify an appropriate learning model for a given problem;
- assess the empirical and theoretical performance of different learning models;
- know the main platforms, programming languages and solutions for developing effective implementations.

Applying knowledge and understanding
Beyond the theoretical aspects, through applied homework assignments and a final project, possibly linked to hackathons or other data-analysis competitions, students will constantly be challenged to use and evaluate modern learning techniques and algorithms.
Making judgements
On successful completion of this course, students will develop a constructive, critical attitude towards the empirical and theoretical evaluation of statistical learning paradigms and techniques.

Communication skills
In preparing the written report and oral presentation for the final project, students will learn how to effectively communicate original ideas, experimental results and the principles behind advanced data-analytic techniques, both in writing and orally. They will also learn how to offer constructive critiques of their peers’ presentations.

Learning skills
In this course students will develop the skills needed both to understand and to develop new learning methodologies, together with their effective implementation. The ultimate goal is to foster an active attitude towards continued learning throughout a professional career.

Channel 1
Lecturer: Pierpaolo Brutti


Course program
1. Review of basic probability and inference.
2. Concentration of measure.
3. Basics of convex optimization.
4. Statistical functionals: bootstrap & subsampling.
5. Nonparametric regression and density estimation: kernels and RKHS.
6. Nonparametric classification.
7. Nonparametric clustering: k-means, density clustering.
8. Graphical models and their applications: parametric and nonparametric approaches.
9. Hints of nonparametric Bayes.
10. Minimaxity & sparsity theory.
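To give a flavor of the kind of methods listed above as implemented in R, the course's main software, here is a minimal sketch of the nonparametric bootstrap for a statistical functional (the sample median). The sample, the number of resamples, and the confidence level are arbitrary illustrative choices, not course material.

```r
# Nonparametric bootstrap for a statistical functional (the median),
# using only base R.
set.seed(42)
x <- rnorm(100)    # a sample from an "unknown" distribution
B <- 1000          # number of bootstrap resamples

# Resample with replacement B times, recomputing the functional each time
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))

se_hat <- sd(boot_medians)                      # bootstrap standard error
ci <- quantile(boot_medians, c(0.025, 0.975))   # 95% percentile interval
```

The same template applies to any plug-in statistical functional: replace `median` with the estimator of interest and the bootstrap distribution of `boot_medians` approximates its sampling distribution.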
Prerequisites
Statistical Inference, Basic Probability, Linear Algebra, Multivariable Calculus
Books
Main references:
- Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (2018). Foundations of Machine Learning. MIT Press. Available at: https://cs.nyu.edu/~mohri/mlbook/
- Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (2013). An Introduction to Statistical Learning with Applications in R. Available at: http://www-bcf.usc.edu/~gareth/ISL/
- Larry Wasserman (2005). All of Nonparametric Statistics.
- Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Available at: http://stanford.edu/~boyd/cvxbook/
Teaching mode
Lectures
Frequency
Not mandatory
Exam mode
Homeworks + Final Project/Hackathon
Bibliography
More advanced / in depth:
- Hastie et al., The Elements of Statistical Learning (https://web.stanford.edu/~hastie/ElemStatLearn/)
- Mohri et al., Foundations of Machine Learning (2018)
- Tsybakov, Introduction to Nonparametric Estimation (2009)
- Shawe-Taylor and Cristianini, Kernel Methods for Pattern Analysis (2004)

Deadly alternatives:
- Devroye et al., A Probabilistic Theory of Pattern Recognition (1996)
- Wainwright, High-Dimensional Statistics (2019)

R programming:
- R for Data Science (http://r4ds.had.co.nz/)
- Applied Predictive Modeling (http://appliedpredictivemodeling.com/)
- Feature Engineering and Selection (http://www.feat.engineering/)
Lesson mode
Class and Lab lectures
  • Lesson code: 10621099
  • Academic year: 2025/2026
  • Course: Data Science
  • Curriculum: Single curriculum
  • Year: 1st year
  • Semester: 2nd semester
  • SSD: SECS-S/01
  • CFU: 6