Course sheet


Course: 2024/2025

(18496)

Bachelor in Sound and Image Engineering (Study Plan 2019) (Plan: 441 - Estudio: 214)

Coordinating teacher: FERNANDEZ TORRES, MIGUEL ANGEL

Department assigned to the subject: Signal and Communications Theory Department

Type: Electives

ECTS Credits: 3.0 ECTS

Course:

Semester:

Requirements (Subjects that are assumed to be known)

Students are expected to have studied Linear Systems. Although not mandatory, basic knowledge of Digital Image Processing is welcome.

Objectives

Learning results and their relation with course contents:
- To learn about digital images and the spatial filtering operation on images.
- To know the basic concepts of Machine Learning: loss functions, regularization, hyperparameters, data augmentation, etc.
- To understand deep neural networks and their training algorithms: gradient descent and back-propagation.
- To learn about Convolutional Neural Networks (CNNs) and their most common processing blocks/layers.
- To understand, design and train CNN architectures for image classification.
- To understand, design and train advanced CNN architectures that address other visual recognition tasks: object detection, image captioning, image segmentation, image synthesis, etc.

Learning Outcomes

CB1: Students have demonstrated possession and understanding of knowledge in an area of study that builds on the foundation of general secondary education, and is usually at a level that, while relying on advanced textbooks, also includes some aspects that involve knowledge from the cutting edge of their field of study.

CB2: Students are able to apply their knowledge to their work or vocation in a professional manner and possess the competences usually demonstrated through the development and defence of arguments and problem solving within their field of study.

CG3: Knowledge of basic and technological subject areas which enable acquisition of new methods and technologies, as well as endowing the technical engineer with the versatility necessary to adapt to any new situation.

RA1: To acquire the knowledge and understanding of the general basic fundamentals of engineering, as well as, in particular, of multimedia communications networks and services, audio and video signal processing, room acoustic control, distributed multimedia systems and interactive multimedia applications specific to Sound and Image Engineering within the telecommunications family.

RA5: Be competent to apply the knowledge acquired to solve problems and design audiovisual networks and services, to configure their devices, as well as to deploy adaptive, personal audiovisual applications and services on them, bringing network intelligence to the value for the user, maximising the potential of multimedia networks and services in the different social and economic spheres, knowing the environmental, commercial and industrial implications of the practice of engineering in accordance with professional ethics.

Description of contents: programme

Unit 1. Basic concepts of visual recognition
    1.1 Digital Images
    1.2 Spatial Filtering
    1.3 Part-models for object recognition
Unit 2. Basic concepts of Deep Learning
    2.1 Machine Learning algorithms
    2.2 Loss Functions
    2.3 Regularization
    2.4 Hyperparameters and validation
    2.5 Deep Neural Networks
    2.6 Gradient Descent-based learning algorithms
    2.7 Backpropagation
Unit 3. Convolutional Neural Networks (CNNs) for image classification
    3.1 Introduction
    3.2 Basic processing layers in a CNN
    3.3 CNN architectures for image classification
    3.4 Training a CNN for image classification: data pre-processing, data augmentation and initialization
Unit 4. Deep networks for other image-related tasks
    4.1 Networks for object detection
    4.2 Networks for semantic image segmentation
    4.3 Networks for image synthesis: GANs, Diffusion Models, VAEs
    4.4 Networks for image matching

Learning activities and methodology

Two teaching activities are proposed: lectures and laboratory sessions.

LECTURES
The lecture sessions will be supported by slides or by any other means of illustrating the concepts explained, and the explanations will be complemented with examples. In these sessions the student will acquire the basic concepts of the course. These classes require initiative and both personal and group involvement from the students (there will be concepts that the students themselves should develop).

LABORATORY SESSIONS
This course has a strong practical component, and students will attend laboratory sessions very often. In them, the concepts explained during the lectures will be put into practice using deep learning software libraries (e.g. PyTorch). The laboratory has machines equipped with high-performance GPUs, and free cloud computing services such as Google Colab will also be used.

Assessment System

% end-of-term-examination/test 60
% of continuous assessment (assignments, laboratory, practicals...) 40

Calendar of Continuous assessment

The course will be evaluated in its entirety through continuous assessment of the students' work throughout the semester.
To that end, two lab assignments will be evaluated (5 points each), each associated with one main block of the subject:
    1) Lab assignment on image classification with CNNs (5 points).
    2) Lab assignment on another application of deep learning to images (5 points).

Extraordinary call: regulations

Basic Bibliography

François Chollet. Deep Learning with Python. Manning Publications. 2017
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. The MIT Press. 2016

Additional Bibliography

Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer. 2006
David A. Forsyth and Jean Ponce. Computer Vision: A Modern Approach. Pearson. 2012

The course syllabus may change due to academic events or other reasons.