Academic year: 2019/2020

Data Processing

(14311)

Master in Telecommunications Engineering (Plan: 171 - Study: 227)

EPI

Coordinating teacher: ARENAS GARCIA, JERONIMO

Department assigned to the subject: Signal and Communications Theory Department

Type: Compulsory

ECTS Credits: 6.0 ECTS

Year: 1st

Semester: 1st

Requirements (Subjects that are assumed to be known)

This is a first-term course, so no other courses of the Master programme are prerequisites for it. However, it is highly desirable that students be familiar with basic concepts from statistics.

Objectives

After this course, students will understand the principles underlying the general regression, classification and data analysis problems, and will be familiar with the different approaches for dealing with them. Students will learn that a correct understanding of these problems requires mastering three basic elements of probability theory: 1) the likelihood, 2) the difference between a priori and a posteriori uncertainty, and 3) Bayes' Theorem. From a practical point of view, students will be introduced to different approaches for learning from data to solve these problems: non-parametric techniques, methods based on empirical risk minimization, and methods that follow Bayesian principles.

More specifically, the main objectives of this course, expressed as competences to be acquired by the students, are:

- Knowledge of the theoretical principles underlying several of the most important techniques for learning from data.
- Ability to apply such techniques to real problems, and to extract results and conclusions.
- Understanding of classic methods for estimation and classification, and the skills for their correct application.
- Ability to use machine learning tools.
- Knowledge of other data analysis problems, such as topic modeling or recommendation systems.

Description of contents: programme

Unit 0: Introduction to data processing

Unit 1: Regression
   1.1. The regression problem
   1.2. Non-parametric regression: k-NN
   1.3. Linear and polynomial least squares regression
   1.4. Bayesian regression

Unit 2: Classification
   2.1. The classification problem
   2.2. Non-parametric methods: k-NN
   2.3. Logistic regression
   2.4. Neural networks

Unit 3: Data clustering
   3.1. k-means clustering
   3.2. Spectral clustering

Unit 4: Topic models
   4.1. Text analysis
   4.2. Latent Dirichlet Allocation

Learning activities and methodology

LECTURES AND PRACTICAL SESSIONS
Theory sessions consist of lectures in which the basic concepts of the course are introduced and illustrated with a large number of examples. Exercises and problems similar to those set in the exam will also be solved throughout the course.

LAB SESSIONS
Sessions in which students apply the concepts presented in the course with the help of a computer. Students will work on real data analysis problems and will have to evaluate the performance of the systems they implement.

Assessment System

During the ordinary period, students will be graded according to:

* Continuous assessment: 70%
   - Classification or regression challenge: 25%
   - Data analysis project: 25%
   - Intermediate assessment of the regression block: 10%
   - Intermediate assessment of the classification block: 10%

* Final assessment: 30%
   - Theory exam: 10%
   - Laboratory test: 20%
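
As a rough illustration of how the ordinary-period weights above combine into a single mark, the following Python sketch computes the weighted sum. The 0-10 scale, the function name and the dictionary keys are assumptions made for this example; the syllabus does not prescribe them, and the sketch ignores the retake and 60% options described further on.

```python
# Weights of the ordinary-period components, as listed above (they sum to 1).
# The key names are illustrative labels, not official identifiers.
ORDINARY_WEIGHTS = {
    "challenge": 0.25,
    "project": 0.25,
    "regression_block": 0.10,
    "classification_block": 0.10,
    "theory_exam": 0.10,
    "lab_test": 0.20,
}


def ordinary_grade(scores):
    """Return the weighted mark; each score is assumed to be on a 0-10 scale."""
    missing = set(ORDINARY_WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(w * scores[name] for name, w in ORDINARY_WEIGHTS.items())


# Hypothetical scores, for illustration only.
example_scores = {
    "challenge": 8.0,
    "project": 7.0,
    "regression_block": 6.0,
    "classification_block": 9.0,
    "theory_exam": 5.0,
    "lab_test": 7.0,
}
print(round(ordinary_grade(example_scores), 2))
```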

The intermediate assessments of the regression and classification blocks can be retaken during the final exam. Furthermore, students will have the option of a final assessment worth 60% of the total mark; in this case, the final score will be multiplied by the corresponding correction factor.

The extraordinary call will consist of three parts: data analysis project, theory exam and lab exam. Students can carry over the scores obtained in the ordinary call to the extraordinary call. However, sitting any of these tests in the extraordinary call automatically cancels the corresponding score from the ordinary call.

The challenge will not be held during the extraordinary call. Students can keep the challenge mark from the ordinary call, or discard it. In the latter case, the final grade will be computed as follows:

- Data analysis project (40 %)
- Lab test (30 %)
- Theory test (30 %)
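
A minimal sketch of the computation above, for the case where the challenge mark is discarded in the extraordinary call. The function name, argument names and the 0-10 scale are assumptions made for illustration only.

```python
# Extraordinary-call weights when the challenge mark is discarded,
# taken from the list above (40% + 30% + 30%).
EXTRAORDINARY_WEIGHTS = {"project": 0.40, "lab_test": 0.30, "theory_test": 0.30}


def extraordinary_grade(project, lab_test, theory_test):
    """Return the weighted mark; inputs are assumed to be on a 0-10 scale."""
    scores = {"project": project, "lab_test": lab_test, "theory_test": theory_test}
    return sum(w * scores[name] for name, w in EXTRAORDINARY_WEIGHTS.items())


# Hypothetical scores, for illustration only.
print(round(extraordinary_grade(project=8.0, lab_test=6.0, theory_test=7.0), 2))
```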

End-of-term examination: 30%
Continuous assessment (assignments, laboratory, practicals...): 70%

Basic Bibliography

C. E. Rasmussen, C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press. 2006.
R. O. Duda, P. E. Hart, D. G. Stork. Pattern Classification (2nd ed.). Wiley Interscience. 2001.

Electronic Resources *

Jesús Cid Sueiro, Jerónimo Arenas García. Introductory Notebooks on Machine Learning Topics. https://github.com/ML4DS/ML4all

Additional Bibliography

C. M. Bishop. Pattern Recognition and Machine Learning. Springer. 2006.

(*) Access to some electronic resources may be restricted to members of the university community and require validation through Campus Global. If you connect from outside the University, you will need to set up a VPN.

The course syllabus may change due to academic events or other reasons.