Checking date: 23/04/2024

Course: 2024/2025

Data Processing
(14311)
Master in Telecommunications Engineering (Plan: 171 - Study: 227)
EPI

Coordinating teacher: ARENAS GARCIA, JERONIMO

Department assigned to the subject: Signal and Communications Theory Department

Type: Compulsory
ECTS Credits: 6.0 ECTS

Course:
Semester:

Requirements (Subjects that are assumed to be known)
This is a first-term course, so no other courses of the Master's programme are prerequisites for it. However, it is highly desirable that students be familiar with basic concepts from statistics.
Objectives
After this course, students will understand the principles underlying the general regression, classification, and data analysis problems, and will be familiar with the different approaches for dealing with them. Students will learn that a correct understanding of these problems requires mastering three basic elements of probability theory: 1) the likelihood, 2) the difference between a priori and a posteriori uncertainty, and 3) Bayes' Theorem. From a practical point of view, students will be presented with different approaches for learning from data to solve these problems: non-parametric techniques, methods based on empirical risk minimization, and methods that follow Bayesian principles.
More specifically, the following list summarizes the main objectives of this course, stated as competences to be acquired by the students:
- Knowledge of the theoretical principles underlying several of the most important techniques for learning from data.
- Ability to apply such techniques to real problems, and to extract results and conclusions.
- Understanding of classic methods for estimation and classification, and the skills to apply them correctly.
- Ability to use machine learning tools: Gaussian processes, support vector machines, non-parametric methods.
- Knowledge of other data analysis problems, in particular in the domain of natural language processing.
Skills and learning outcomes
Description of contents: programme
Unit 0: Introduction to data processing
Unit 1: Data preprocessing
1.1. Data normalization
1.2. Dimensionality reduction
1.3. Clustering
Unit 2: Regression
2.1. The regression problem
2.2. Non-parametric regression: k-NN
2.3. Linear and polynomial least squares regression
2.4. Bayesian regression
2.5. Other regression algorithms
Unit 3: Classification
3.1. The classification problem
3.2. Non-parametric methods: k-NN
3.3. Logistic regression
3.4. Other classification algorithms
Unit 4: Deep learning
4.1. Introduction to PyTorch
4.2. Multilayer perceptron
4.3. Convolutional neural networks
Unit 5: Natural Language Processing
5.1. Text preprocessing
5.2. Word embeddings
5.3. Transformers
5.4. Neural topic models
Learning activities and methodology
LECTURES AND PRACTICAL SESSIONS
Theory sessions consist of lectures in which the basic concepts of the course are introduced and illustrated with a large number of examples. Exercises and problems similar to those proposed in the exam will also be solved throughout the course.
LAB SESSIONS
Sessions in which students apply the concepts presented in the course with the help of a computer. Students will work on real data analysis problems and will have to evaluate the performance of the systems they implement.
Assessment System
• % end-of-term examination: 45
• % of continuous assessment (assignments, laboratory, practicals...): 55

Calendar of Continuous assessment

Basic Bibliography
• Denis Rothman. Transformers for Natural Language Processing (2nd ed.). Packt. 2022
• R. O. Duda, P. E. Hart, D. G. Stork. Pattern Classification (2nd ed.). Wiley Interscience. 2001
Electronic Resources