Checking date: 18/05/2023


Course: 2023/2024

Audio processing, Video processing and Computer vision
(16508)
Dual Bachelor Data Science and Engineering - Telecommunication Technologies Engineering (Plan: 456 - Estudio: 371)


Coordinating teacher: GONZALEZ DIAZ, IVAN

Department assigned to the subject: Signal and Communications Theory Department

Type: Compulsory
ECTS Credits: 6.0 ECTS

Course:
Semester:




Requirements (Subjects that are assumed to be known)
Neural Networks
Skills and learning outcomes
Description of contents: programme
The goal of this subject is to provide the student with an introduction to signal processing techniques with application to speech, audio, image and video. To that end, the emphasis is put on lab exercises, so that the student can be assessed according to her work on a mini project.

The subject is divided into two main blocks: first, image and video processing and, second, voice and audio processing. In both blocks, the signals and their characteristics are presented first, including certain notions of the visual and auditory systems. Next, the fundamental processing techniques for specific signals are presented, illustrating the use of these techniques in selected applications. Then, convolutional neural networks are introduced, and several applications are described in both areas (image & video and speech & audio).

PROGRAMME
  • Fundamentals of Image and Video Processing
  • A first approach to Image Classification
  • Convolutional Neural Networks (CNNs)
      - Brief Review of Neural Networks (NNs) and Deep Neural Networks (DNNs)
      - Fundamentals and Building Blocks
      - Applications in Computer Vision
  • Recurrent Neural Networks (RNNs)
      - Fundamentals
      - Applications in Computer Vision
  • Fundamentals of Speech and Audio Processing
  • Overview of Speech and Audio Technologies
  • Deep Learning-based Speech and Audio Technologies
Learning activities and methodology
AF1: THEORETICAL-PRACTICAL CLASSES. These classes present the knowledge that students should acquire. Students will receive the class notes and basic reference texts to help them follow the classes and carry out the subsequent work. Students will solve exercises and practical problems. Workshops and evaluation tests will be held so that students acquire the necessary skills.
AF3: INDIVIDUAL OR GROUP WORK OF THE STUDENT.
AF8: WORKSHOPS AND LABORATORIES.
AF9: FINAL EXAM. The knowledge, skills and abilities acquired throughout the course will be assessed globally.
MD1: CLASS THEORY. Lectures given by the teacher, supported by computer and audiovisual media, in which the main concepts of the subject are developed and the materials and bibliography are provided to complement the students' learning.
MD2: PRACTICES. Resolution of practical cases, problems, etc. posed by the teacher, individually or in groups.
MD3: TUTORIALS. Individual or group (collective) tutorials in which the teacher assists students.
MD6: LABORATORY PRACTICES. Applied / experimental teaching in laboratories under the supervision of a tutor.
Assessment System
  • % end-of-term-examination 60
  • % of continuous assessment (assignments, laboratory, practicals...) 40

Basic Bibliography
  • Ken C. Pohlmann. Principles of Digital Audio (5th Edition). McGraw-Hill/TAB Electronics. 2005
  • N. Morgan and B. Gold. Speech and Audio Signal Processing: Processing and Perception of Speech and Music. John Wiley & Sons, Inc. New York, NY, USA. 1999
  • Rafael C. González and Richard E. Woods. Digital Image Processing (4th Edition). Pearson. 2018
Additional Bibliography
  • D. O'Shaughnessy. Automatic speech recognition: History, methods and challenges. Pattern Recognition, 41 (10) pp. 2965-2979. 2008
  • David A. Forsyth and Jean Ponce. Computer Vision: A Modern Approach (2nd Edition). Pearson. 2012
  • Ian Goodfellow and Yoshua Bengio and Aaron Courville. Deep Learning. MIT Press. 2016
  • X. Huang, A. Acero, H.W. Hon. Spoken Language Processing: A Guide to Theory, Algorithms and System Development. Prentice Hall. 2001
  • Wilhelm Burger and Mark J. Burge. Principles of Digital Image Processing: Fundamental Techniques. Springer-Verlag. 2009
  • Wilhelm Burger and Mark J. Burge. Principles of Digital Image Processing: Core Techniques. Springer-Verlag. 2009

The course syllabus may change due to academic events or other reasons.