Course Description

Academic Year: 2025/2026

Audio processing, Video processing and Computer vision

(16508)

Dual Bachelor Data Science and Engineering - Telecommunication Technologies Engineering (Study Plan 2020) (Plan: 456 - Study: 371)

Coordinator: GONZALEZ DIAZ, IVAN

Department assigned to the subject: Signal and Communications Theory Department

Type: Compulsory

Credits: 6.0 ECTS

Year: 5th

Semester: 1st

Requirements (Subjects that are assumed to be known)

Neural Networks
Signals and Systems
Machine Learning I and II

Objectives

Students must achieve the following objectives:
1) Know image, audio, speech and video signals, as well as their main parameters and the digitization process.
2) Know the most important techniques of image, video and audio processing, as well as the main tasks in computer vision and audio analysis.
3) Apply the machine learning and deep learning techniques studied in previous subjects to the analysis of audiovisual content: images, video, audio and speech.
4) Develop intelligent applications that involve the automatic analysis of audiovisual content.

Description of contents: programme

The course is divided into two main blocks based on the signal modalities: image and video on the one hand, and speech and audio on the other. In both cases, the signals and their main characteristics are presented first, including certain notions of the visual and auditory systems. Next, the most common processing techniques for each type of signal are studied, illustrating their use in selected applications. Finally, the most modern approaches are introduced, based on the application of deep learning (e.g. CNNs and RNNs), which currently constitute the state of the art.

The course program is organized as follows:

Block 1: Processing of visual signals: image and video
======================================================
Topic 1: Introduction to digital video and images
Topic 2: Fundamentals of image and video processing
Topic 3: Image Representation: low-level descriptors
Topic 4: Image Segmentation
Topic 5: Convolutional Neural Networks (CNNs) for image classification
Topic 6: Other applications of CNNs in visual analysis: object detection, semantic segmentation, image generation, style transfer

Block 2: Processing of speech and audio signals
===============================================
Topic 7: Fundamentals of digital audio and speech: generation, perception and digitization
Topic 8: Time-localized analysis for speech and audio signals
Topic 9: Low-level speech and audio descriptors
Topic 10: Neural Networks for Sequential Data Analysis: Temporal CNNs, Recurrent Neural Networks, Transformers, State-Space Models (SSMs). Applications of these models to audio/voice signals.

Learning activities and methodology

Two teaching activities are proposed: lectures and lab sessions.

LECTURES. The lecture sessions will be supported by slides or by any other means of illustrating the concepts explained, and the explanations will be complemented with examples. In these sessions the student will acquire the basic concepts of the course. These classes require initiative and both personal and group involvement from the students: some concepts are to be developed by the students themselves.

LABORATORY SESSIONS. This course has a strong practical component, and students will attend laboratory sessions very often. In them, the concepts explained during the lectures will be put into practice using the Python programming language and software libraries for image analysis and computer vision (scikit-image, PIL, OpenCV), audio analysis (scikit-sound), machine learning (scikit-learn) and deep learning (PyTorch). Machines equipped with high-performance GPUs are available in the laboratory, but students can also use free cloud computing services such as Google Colab.

Assessment System

Percentage weight of the final exam: 40
Percentage weight of the rest of the assessment: 60

Calendar of Continuous assessment

FINAL EXAM. This will assess, in a comprehensive manner, the knowledge, skills, and competencies acquired throughout the course. It will account for 40% of the final grade.

CONTINUOUS ASSESSMENT. This will evaluate the exercises and practical work completed during the laboratory sessions throughout the course. Tests, short quizzes, and assessments through competitions or challenges may all be used. It will account for 60% of the final grade.

Extraordinary call: regulations

The course programme may be subject to changes due to duly justified force majeure or to academic events announced in advance.