Checking date: 21/03/2025 14:13:57


Academic year: 2025/2026

Machine learning applications
(16503)
Dual Bachelor Data Science and Engineering - Telecommunication Technologies Engineering (Study Plan 2020) (Plan: 456 - Estudio: 371)


Coordinating teacher: SEVILLA SALCEDO, CARLOS

Department assigned to the subject: Signal and Communications Theory Department

Type: Compulsory
ECTS Credits: 6.0 ECTS

Course:
Semester:




Requirements (Subjects that are assumed to be known)
It is recommended to have completed the first-year subjects on mathematical foundations (Calculus I and II, Linear Algebra, Probability and Data Analysis), the subjects related to programming and algorithms (Programming, and Data Structures and Algorithms), and the subject Statistical Learning. Students are also advised to have already taken the Machine Learning (I and II) courses.
Objectives
- Design a data model suitable for an analysis task.
- Correctly and efficiently choose and use one or more data analysis methods including statistical or algorithmic techniques.
- Evaluate the results of the analysis and propose modifications to the analysis process.
- Know how to design and apply unsupervised inference methods for models with latent variables.
- Know how to design and apply data adaptation and curation techniques.
- Know how to design and apply natural language processing methods.
- Know how to design and apply recommendation systems.
Learning Outcomes
LEARNING OUTCOMES
RA1: Students should have acquired advanced knowledge and demonstrated an understanding of the theoretical and practical aspects and working methodology in the field of data science and engineering with a depth that reaches the forefront of knowledge.
RA2: Be capable of applying their knowledge and problem-solving skills, through arguments or procedures developed and sustained by themselves, in complex or professional and specialized work settings that require the use of creative and innovative ideas.
RA3: Have the ability to collect and interpret data and information on which to base their conclusions including, where appropriate and pertinent, reflection on issues of a social, scientific or ethical nature within their field of study.
RA4: Be able to cope with complex situations or those that require the development of new solutions in the academic, work or professional field within their field of study.
RA5: Know how to communicate to all types of audiences (specialized or not) in a clear and precise manner, knowledge, methodologies, ideas, problems and solutions within the scope of their field of study.
RA6: Be able to identify their own training needs in their field of study and work or professional environment and organize their own learning with a high degree of autonomy in all types of contexts (structured or not).

BASIC COMPETENCES
CB1: Students have demonstrated possession and understanding of knowledge in an area of study that builds on the foundation of general secondary education, and is usually at a level that, while relying on advanced textbooks, also includes some aspects that involve knowledge from the cutting edge of their field of study.
CB2: Students are able to apply their knowledge to their work or vocation in a professional manner and possess the competences usually demonstrated through the development and defence of arguments and problem solving within their field of study.
CB5: Students will have developed the learning skills necessary to undertake further study with a high degree of autonomy.

GENERAL COMPETENCES
CG1: Adequate knowledge and skills to analyze and synthesize basic problems related to engineering and data science, solve them and communicate them efficiently.
CG2: Knowledge of basic scientific and technical subjects that qualify for the learning of new methods and technologies, as well as providing a great versatility to adapt to new situations.
CG4: Ability to solve technological, computer, mathematical and statistical problems that may arise in data engineering and science.
CG5: Ability to solve mathematically formulated problems applied to various subjects, using numerical algorithms and computational techniques.
CG6: Ability to synthesize the conclusions obtained from the analyses carried out and present them clearly and convincingly both in writing and orally.

SPECIFIC COMPETENCES
CE1: Ability to solve mathematical problems that may arise in data engineering and science. Ability to apply knowledge of: algebra; geometry; differential and integral calculus; numerical methods; numerical algorithms; statistics and optimization.
CE2: Ability to correctly identify predictive problems corresponding to certain objectives and data and to use the basic results of regression analysis as the basis for prediction methods.
CE3: Ability to correctly identify classification problems corresponding to certain objectives and data and to use the basic results of multivariate analysis as the basis for classification, clustering and dimension reduction methods.
CE4: Capability for mathematical modeling, algorithmic implementation and optimization problem solving related to data science.
CE13: Ability to apply and design machine learning methods in classification, regression and clustering problems and for supervised, unsupervised and reinforcement learning tasks.
CE15: Ability to design solutions based on machine learning for applications in specific domains such as recommendation systems, natural language processing, Web or social networks.

TRANSVERSAL COMPETENCES
CT1: Ability to communicate knowledge orally and in writing to both specialised and non-specialised audiences.
Description of contents: programme
This course is divided into 3 thematic blocks. The first concerns the problem of adapting and cleaning a database, a critical preprocessing step that is addressed prior to any machine learning application. The next two blocks address two industry-relevant applications where machine learning techniques have achieved great success. Understanding how the different machine learning techniques have to be adapted to solve specific problems of interest to industry and society will provide students with a practical and general vision of applied Machine Learning. The course ends with a final block where visualization tools will be presented to the students, who will use them for the final project assignment.

PART I: DATA CURATION AND CLEANING TECHNIQUES
1. Problem introduction. Data representation and visualization.
2. Organization and integration of databases from different sources.
3. Feature extraction and selection. Multivariate Analysis and Mutual Information Methods.
4. Data cleaning: data characterization, detection and imputation of corrupt data. Outlier detection.

PART II: NATURAL LANGUAGE PROCESSING
5. Text processing pipelines. Word and document vectorization.
6. Topic Modeling: Latent Dirichlet Allocation.
7. Introduction to transformers.
8. Graph visualization and analysis.

PART III: RECOMMENDATION SYSTEMS
9. Content-based recommendation systems.
10. Collaborative filtering recommendation systems. ALS and Prod2Vec.
11. DL-based recommendation systems: Neural Collaborative Filtering (NCF) and Prod2Vec.

BONUS TRACK: ADVANCED DATA VISUALIZATION TOOLS
- Business Intelligence Tools: Dash
Learning activities and methodology
AF1: THEORETICAL-PRACTICAL CLASSES. These classes present the knowledge that students should acquire. Students receive the class notes and basic reference texts to help them follow the classes and carry out the subsequent work. Students solve exercises and practical problems, and workshops and evaluation tests are held so that they acquire the necessary skills.
AF2: Updated to allegation
AF3: INDIVIDUAL OR GROUP WORK OF THE STUDENT.
AF9: FINAL EXAM. The knowledge, skills and abilities acquired throughout the course are assessed globally.
MD1: CLASS THEORY. Lectures given by the teacher with the support of computer and audiovisual media, in which the main concepts of the subject are developed and the materials and bibliography are provided to complement the students' learning.
MD2: PRACTICES. Resolution of practical cases, problems, etc. posed by the teacher, individually or in groups.
MD3: TUTORIALS. Individualized assistance (individual tutorials) or group assistance (collective tutorials) given to students by the teacher.
Assessment System
  • % end-of-term-examination/test 30
  • % of continuous assessment (assignments, laboratory, practicals...) 70

Calendar of Continuous assessment


Extraordinary call: regulations
Basic Bibliography
  • Data Visualization with Python for Beginners: Visualize Your Data using Pandas, Matplotlib and Seaborn. AI Publishing LLC. 2020
  • C.C. Aggarwal. Recommender Systems: The Textbook. Springer. 2016
  • D. Jurafsky, J.H. Martin. Speech and Language Processing. Prentice Hall; 2nd edition. 2008
  • J. Eisenstein. Introduction to Natural Language Processing. MIT Press. 2019
  • J. Han, M. Kamber. Data Mining: Concepts and Techniques (3rd. ed). Morgan Kaufmann. 2011
  • S. Bird, E. Klein, E. Loper. Natural Language Processing with Python. O'Reilly Media. 2009
Electronic Resources *
Additional Bibliography
  • C. Manning, H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press. 1999
  • D. Rothman. Transformers for Natural Language Processing (2nd ed). Packt. 2022
  • K. Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press. 2012
  • M. W. Berry. Survey of Text Mining: Clustering, Classification, and Retrieval. Springer. 2004
(*) Access to some electronic resources may be restricted to members of the university community and require validation through Campus Global. If you try to connect from outside the University, you will need to set up a VPN.


The course syllabus may change due to academic events or other reasons.