DATA ANALYSIS AND DATA MINING

Course objectives

The course provides the fundamental principles of multidimensional statistics for the study of complex phenomena at an exploratory, non-probabilistic level. Students learn the processes aimed at reducing the quantity of information by creating global synthetic indicators or multidimensional classification models, and apply these methods to real data matrices in accordance with the data mining approach. At the end of the course, students will be able to process complex multidimensional data systems derived from different types of databases. They will be able to choose the data mining technique best suited to the objective of the analysis and to the type of data: quantitative variables, categorical variables (structured qualitative data), and textual data (unstructured qualitative data).

Channel 1
LUCA SALVATI

Program - Frequency - Exams

Course program
1. General introduction.
2. Operational data mining.
3. Database theory and big data.
4. Operational principles of multivariate statistics and data interpretation; software for multivariate statistics.
5. Factor analysis and principal component analysis.
6. Cluster analysis.
7. Metric and non-metric multidimensional scaling (MDS).
8. Correspondence analysis and canonical correlation analysis (CCA).
9. Regression analysis.
Prerequisites
Linear algebra; Descriptive and inferential statistics
Books
Slides on Google Drive.
Theory: Maialetti M. - Sateriano A. (2024). Analisi esplorativa dei dati. CISU, Roma (latest edition).
Exercises/applications: Maialetti M. - Salvati L. (2024). Sostenibilità e resilienza. Analisi quantitativa e applicazioni economiche. Franco Angeli, Milano (latest edition).
A third book is adopted for remote students: Orlandi V. - Maialetti M. - Salvati L. (2024). Indicatori territoriali e sviluppo locale. Verso un'economia del paesaggio. Carocci, Roma.
Free software.
Teaching mode
Classroom lectures. Laboratory sessions with software.
Attendance
In-class attendance
Exam mode
Written and oral exam; partial assessments are possible during the term; lab or project work may be done individually or in teams.
Bibliography
Same as above.
Lesson mode
Classroom lectures. Laboratory sessions with software.
LUCA SALVATI

Course program
1. General introduction.
2. Operational data mining.
3. Database theory and big data.
4. Operational principles of multivariate statistics and data interpretation; software for multivariate statistics.
5. Factor analysis and principal component analysis.
6. Cluster analysis.
7. Metric and non-metric multidimensional scaling (MDS).
8. Correspondence analysis and canonical correlation analysis (CCA).
9. Regression analysis.
Prerequisites
Linear algebra; Descriptive and inferential statistics
Books
Class notes (Google Drive).
Theory: Maialetti M. - Sateriano A. (2024). Analisi esplorativa dei dati. CISU, Roma (latest edition).
Exercises and examples: Anzalone F.M. - Maialetti M. - Salvati L. (2024). I territori del PNRR. Applicazioni economiche con indicatori statistici. CISU, Roma.
Reading a third book is compulsory for remote students: Salvati L. (2024). Statistica, economia e sostenibilità. Indicatori per l'analisi regionale. Franco Angeli, Milano.
Free software for exercises.
Teaching mode
Classroom lectures. Laboratory sessions with software.
Attendance
In-class attendance
Exam mode
Written and oral exam; partial assessments are possible during the term; lab or project work may be done individually or in teams.
Bibliography
Same as above.
Lesson mode
Classroom lectures. Laboratory sessions with software.
  • Lesson code: 10592615
  • Academic year: 2024/2025
  • Course: Business Administration
  • Curriculum: Management delle aziende pubbliche
  • Year: 1st year
  • Semester: 2nd semester
  • SSD: SECS-S/01
  • CFU: 9
  • Subject area: Statistics and mathematics