MACHINE LEARNING AND COMPUTATIONAL BIOLOGY

Course objectives

General objectives

The general objective of the course is to convey the state of the art in machine learning and computational biology, following the advent of massive sequencing technologies for producing genomic and proteomic data. These foundations are necessary for students to acquire the skills to analyse problems specific to the area properly and to design and implement software suited to solving them. The course therefore aims to train professionals who can contribute to the resolution and management of computing projects in the biomolecular field, with particular reference to machine learning techniques.

Specific objectives

The course aims to prepare future experts in machine learning techniques for the analysis of biomedical data, as well as designers of software systems who possess a basic knowledge of molecular biology and of the bioinformatics tools used to handle the large flow of data generated in this field. Starting from the experimental data production platform, people with this professional profile will be able to determine which algorithms, and machine learning algorithms in particular, are of interest for the analysis of raw data. A special focus will be devoted to issues arising from data produced with massive sequencing. Students should also develop a critical mindset and be able to define an analysis protocol for the data that takes the available computational resources into account and optimizes the analysis accordingly. At the end of the course, students will also present tools they have developed for managing, integrating and querying the vast amounts of data produced by the analysis, in order to obtain organised and useful final results. These tools will follow the software development standards typical of the bioinformatics community.

1. Application of knowledge and understanding: The training objectives are achieved through lectures, laboratory activities and exercise sessions. The activities include simulations of work projects, in-class collaboration, and discussions in which students take a direct part in examining problems and analysing case studies.
2. Autonomy of judgment: Students will acquire the ability to process complex or fragmentary information. For example, they will handle sequence data that are only partly annotated (only some sequences are associated with a chromosome interval of a sequenced organism), often in a nonstandard manner. They will be required to produce an original, independently designed data model chosen according to the biological scope of their experimental design.
3. Communication skills: Students will be able to converse with researchers in the biomedical area in a clear, logical and effective way, using the methodological tools acquired during the course and the terminology specific to computational biology. The acquisition of these skills will be tested through an oral examination and several projects developed in the laboratory.
4. Learning skills: Students should acquire a critical, original and autonomous ability to approach problems specific to computational biology projects and to apply the knowledge acquired during the course independently, with a view to a possible continuation of studies at a higher level (master's degree) or to further cultural and professional development in the case of employment in the biomedical or bioinformatics area.

Channel 1
Antonio LUCIANO


Course program
  • Introduction to ML and Computational Biology
  • Machine Learning Environment
  • Unsupervised Learning: Dimensionality Reduction (PCA, Eigenvectors, SVD)
  • Unsupervised Learning: Clustering (k-means, GMM)
  • Supervised Learning, Non-parametric: Decision Trees, Random Forest, Nearest Neighbours
  • Supervised Learning, Parametric: Linear Regression, Polynomial Regression, Under/Overfitting
  • Supervised Learning, Parametric: Logistic Regression (LR), SVM
Prerequisites
Probability Theory, Statistics, Linear Algebra and Programming Skills in Python.
Books
  • Course slides
  • Kevin P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press
  • Christopher M. Bishop, “Pattern Recognition and Machine Learning”
Frequency
Course attendance is recommended.
Exam mode
The examination may comprise a written part and a practical part, in which a basic ML system has to be implemented and analysed. The practical part can be completed through homework assigned during the course or through a final project. A mark of at least 18/30 is required to pass the examination. To obtain a mark of 30/30 with distinction, the student must demonstrate excellent knowledge of all the topics covered during the course and link them logically and coherently.
Lesson mode
Lectures will be held in class.
  • Lesson code: 10602994
  • Academic year: 2024/2025
  • Course: Molecular Biology, Medicinal Chemistry and Computer Science for Pharmaceutical Applications
  • Curriculum: Single curriculum
  • Year: 3rd year
  • Semester: 1st semester
  • SSD: INF/01
  • CFU: 6
  • Subject area: Mathematical, Physical, Computer Science and Statistical Disciplines