FOUNDATIONS OF DATA SCIENCE

Learning objectives

General objectives: Acquire the foundations of data science and machine learning.
Specific objectives: Make students aware of the theoretical and practical tools of data science and machine learning, as well as their intrinsic limits; enable students to tackle real problems with the most appropriate tools.
Knowledge and understanding: The course provides the basic notions, techniques, and methodologies used in data science and machine learning. It also provides the programming basics needed to apply the theory to real cases.
Applying knowledge and understanding: By the end of the course, students will be able to tackle concrete data science problems, from their formalization through to the manipulation of data with appropriate software tools.
Critical and judgment skills: Students will be able to choose the techniques to apply to a specific case and to evaluate their performance.
Communication skills: Students will be able to represent and communicate the information extracted from data through the rational use of charts and indicators.
Learning skills: Students will be equipped to learn both theoretical and practical notions of the field on their own.

Channel 1
FABIO GALASSO

Programs - Attendance - Exams

Program
The course is an introduction to the basic toolkit of data analysis and machine learning using the Python programming language. Covered topics include: basics of digital image processing (8 hours); regression (6 hours); classification with discriminative and generative models (16 hours); optimization (8 hours); bias/variance (6 hours); regularization (5 hours); clustering (3 hours); dimensionality reduction (3 hours); introduction to neural networks (5 hours). For more information see the course website: https://sites.google.com/di.uniroma1.it/fds-2022-2023
Prerequisites
- Calculus and Linear Algebra, including taking derivatives and understanding matrix-vector operations and notation
- Basic Probability and Statistics, including basics of probability, Gaussian distributions, mean and standard deviation
Reference texts
Data Science:
- Bertsimas, O'Hair, Pulleyblank. The Analytics Edge.
- Jure Leskovec, Anand Rajaraman, Jeffrey D. Ullman, 2019. Mining of Massive Datasets. Cambridge University Press (available at: http://infolab.stanford.edu/~ullman/mmdsn.html)
Machine Learning:
- Christopher M. Bishop, 2006. Pattern Recognition and Machine Learning
- Deisenroth, Faisal, Ong, 2020. Mathematics for Machine Learning (available at: https://mml-book.github.io/)
Deep Learning:
- Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
- Zhang, Lipton, Li, Smola, 2019. Dive into Deep Learning (interactive book and code at: http://d2l.ai/index.html)
- Andrew Ng, 2019. Machine Learning Yearning (available at: https://www.deeplearning.ai/machine-learning-yearning/)
Image Analysis and Recognition, Computer Vision:
- Richard Szeliski, 2010. Computer Vision: Algorithms and Applications (available at: http://szeliski.org/Book)
Python:
- Allen B. Downey, 2015. Think Python: How to Think Like a Computer Scientist (available at: https://www.greenteapress.com/thinkpython/thinkpython.html)
- Jake VanderPlas, 2016. Python Data Science Handbook: Essential Tools for Working with Data (book and notebooks available at: https://github.com/jakevdp/PythonDataScienceHandbook)
- Online Python tutorials: https://docs.python.org/3/tutorial/
Other recommended books:
- Kleinberg, Tardos. Algorithm Design. Addison Wesley.
- Boyd, Vandenberghe. Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares. Cambridge University Press.
Teaching mode
Traditional lectures
Attendance
Please refer to the degree course regulations.
Exam format
Exam:
1) Theory: 50% (written)
2) Practice: 50%, of which
- 2/3 from assignments in Python, to be submitted by given deadlines during the course
- 1/3 from a final project and presentation
Theory:
- written exam with open-ended questions
- duration: 30 minutes
Assignments:
- The assignments and the final project must be submitted in groups of 3 to 5 students
Final project:
- Algorithms, objectives, and topics for the final project may be freely chosen
- Project ideas and related resources will be discussed in class
Bibliography
More references for Data Science:
- Jure Leskovec, Anand Rajaraman, Jeffrey D. Ullman, 2019. Mining of Massive Datasets. Cambridge University Press (available at: http://infolab.stanford.edu/~ullman/mmdsn.html)
For Deep Learning:
- Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
- Andrew Ng, 2019. Machine Learning Yearning (available at: https://www.deeplearning.ai/machine-learning-yearning/)
Other recommended books:
- Kleinberg, Tardos. Algorithm Design. Addison Wesley.
- Boyd, Vandenberghe. Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares. Cambridge University Press.
Book reference for Python:
- Allen B. Downey, 2015. Think Python: How to Think Like a Computer Scientist (available at: https://www.greenteapress.com/thinkpython/thinkpython.html)
Online tutorials for Python:
- https://docs.python.org/3/tutorial/
Delivery mode
Traditional lectures
INDRO SPINELLI

Programs - Attendance - Exams

Program
The course is an introduction to the basic toolkit of data analysis and machine learning using the Python programming language. Covered topics include: basics of digital image processing (8 hours); regression (6 hours); classification with discriminative and generative models (16 hours); optimization (8 hours); bias/variance (6 hours); regularization (5 hours); clustering (3 hours); dimensionality reduction (3 hours); introduction to neural networks (5 hours). For more information see the course website: https://sites.google.com/di.uniroma1.it/fds-2022-2023
Prerequisites
- Calculus and Linear Algebra, including taking derivatives and understanding matrix-vector operations and notation
- Basic Probability and Statistics, including basics of probability, Gaussian distributions, mean and standard deviation
Reference texts
Data Science:
- Bertsimas, O'Hair, Pulleyblank. The Analytics Edge.
- Jure Leskovec, Anand Rajaraman, Jeffrey D. Ullman, 2019. Mining of Massive Datasets. Cambridge University Press (available at: http://infolab.stanford.edu/~ullman/mmdsn.html)
Machine Learning:
- Christopher M. Bishop, 2006. Pattern Recognition and Machine Learning
- Deisenroth, Faisal, Ong, 2020. Mathematics for Machine Learning (available at: https://mml-book.github.io/)
Deep Learning:
- Ian Goodfellow, Yoshua Bengio, Aaron Courville, 2017. Deep Learning (available at: https://www.deeplearningbook.org/)
- Zhang, Lipton, Li, Smola, 2019. Dive into Deep Learning (interactive book and code at: http://d2l.ai/index.html)
- Andrew Ng, 2019. Machine Learning Yearning (available at: https://www.deeplearning.ai/machine-learning-yearning/)
Image Analysis and Recognition, Computer Vision:
- Richard Szeliski, 2010. Computer Vision: Algorithms and Applications (available at: http://szeliski.org/Book)
Python:
- Allen B. Downey, 2015. Think Python: How to Think Like a Computer Scientist (available at: https://www.greenteapress.com/thinkpython/thinkpython.html)
- Jake VanderPlas, 2016. Python Data Science Handbook: Essential Tools for Working with Data (book and notebooks available at: https://github.com/jakevdp/PythonDataScienceHandbook)
- Online Python tutorials: https://docs.python.org/3/tutorial/
Other recommended books:
- Kleinberg, Tardos. Algorithm Design. Addison Wesley.
- Boyd, Vandenberghe. Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares. Cambridge University Press.
Attendance
Please refer to the degree course regulations.
Exam format
1) Theory: 50% (written)
2) Practice: 50%, of which
- 2/3 from assignments in Python, to be submitted by given deadlines during the course
- 1/3 from a final project and presentation
Theory:
- written exam with open-ended questions
- duration: 30 minutes
Assignments:
- The assignments and the final project must be submitted in groups of 3 to 5 students
Final project:
- Algorithms, objectives, and topics for the final project may be freely chosen
- Project ideas and related resources will be discussed in class
Delivery mode
Traditional lectures
  • Course code: 1047627
  • Academic year: 2024/2025
  • Degree programme: Computer Science - Informatica
  • Curriculum: Single curriculum
  • Year: 2nd year
  • Semester: 1st semester
  • SSD (scientific-disciplinary sector): INF/01
  • CFU (credits): 6
  • Disciplinary area: Related or supplementary educational activities