Fundamentals of Data Science
Learning objectives
General objectives: This course introduces the foundational tools of data science, combining machine learning, statistical modeling, and network science to explore real-world data in its structural and dynamic complexity. It equips students to treat data as a strategic asset, using Python programming, data analysis, machine learning, and approaches from complex systems to develop a more interpretive and systemic understanding of data. Through industry-standard methods, participants will learn to analyze datasets, uncover meaningful patterns, and produce accurate predictions. The curriculum provides the skills to design discriminative models for classification and regression and generative models for tasks such as data synthesis and significance evaluation.

Specific objectives: The course is built around three core dimensions.

Machine Learning Foundations:
- Datasets and their representation (6h)
- Linear Regression with bias-variance trade-off and regularization (7h)
- Classification, Calibration, and Performance Evaluation (6h)
- Non-Parametric models: K-NN, Decision Trees, Random Forest, and XGBoost (5h)
- Neural Networks and Backpropagation (4h)
- Image Representation and Convolution (3h)
- CNNs and other Network Components (5h)
- Autoencoders and Variational Inference (5h)
- Text Representation, Self-Attention, and Transformers (3h)
- Multimodal Machine Learning (2h)

Complex Networks and Network Science:
- Introduction to Network Data and Structural Properties of Networks (10h)
- Generative Models of Network Formation (7h)
- Mechanistic Models of Network Formation (5h)
- Community Detection and Graph Clustering Methods (8h)

Programming and Practice: Each objective will be addressed both theoretically and through practical programming exercises in Python.

Knowledge and understanding: This course introduces the foundational concepts, theories, techniques, and methodologies of data science. It explains the core principles behind the discipline and critically examines their inherent limitations. The course also highlights practical applications through focused case studies in computer vision and network science, giving students a well-rounded understanding of theory and practice.

Apply knowledge and understanding: By the end of the course, students will be proficient in tackling real-world data science challenges by translating complex phenomena into formal analytical and machine learning frameworks. They will be able to select and apply appropriate algorithms, refine models, and extract actionable insights from data across domains. The curriculum emphasizes a full data science workflow: data acquisition, representation, preprocessing, and exploratory analysis, followed by model training, tuning, evaluation, and deployment. The course systematically builds the advanced programming and modeling skills that a contemporary data scientist needs.

Critical and judgment skills: Students will develop the ability to analyze real-world challenges and select the most suitable data science techniques by weighing data characteristics, computational constraints, and domain-specific objectives. They will evaluate their models using quantitative metrics to make informed, context-driven decisions that balance technical excellence with broader societal impact.

Communication skills: Students will learn to present and communicate data-driven insights effectively using well-designed visualizations and key performance indicators. They will learn to articulate their analytical solutions rigorously and to explain the structure of their code systematically. This emphasis on communication is reinforced through a final project presentation and an interactive discussion session, so that students can convey complex technical concepts clearly to both technical and non-technical audiences.

Learning ability: Students will be able to study both the theory and the practice of the field independently, so that they can tackle new problems in data analysis, machine learning, computer vision, and network science.
Syllabus - Attendance - Exams
Syllabus
Prerequisites
Reference texts
Attendance
Exam format
Delivery mode
- Course code: 1047224
- Academic year: 2025/2026
- Degree programme: Data Science
- Curriculum: Single curriculum
- Year: 1st year
- Semester: 1st semester
- Scientific-disciplinary sector (SSD): INF/01
- Credits (CFU): 9