Checking date: 28/04/2023

Course: 2023/2024

Speech technologies for health
Master in Machine Learning for Health (Plan: 480 - Estudio: 359)

Coordinating teacher: PELAEZ MORENO, CARMEN

Department assigned to the subject: Signal and Communications Theory Department

Type: Electives
ECTS Credits: 3.0 ECTS


Speech Technologies for Health will provide the students with knowledge on:
  • the speech production and perception mechanisms and the linguistic categories of the voice
  • the state of the art and fundamentals of speech and audio coders, speech recognition, speech synthesis or text-to-speech, speaker recognition and dialogue systems
  • their applications to health
They will also acquire abilities to:
  • initiate research work in speech and audio coding, speech recognition, speech synthesis or text-to-speech, speaker recognition and natural language processing
  • apply the previous knowledge to pose research questions on their applications to health
Skills and learning outcomes
Description of contents: programme
Unit 0. Introduction to Speech Technologies
Unit 1. The Auditory System and Speech Perception
Unit 2. The Speech Production System and Phonation. Speech and Audio Coding
Unit 3. Speech recognition:
  • Feature extraction
  • Acoustic & Language models
  • End-to-end recognition and deep learning models
Unit 4. Speaker recognition and biometrics
Unit 5. Text-to-Speech synthesis
Unit 6. Applications to health:
  • Affective computing: Emotion and Sentiment analysis
  • Pathologies detection
Learning activities and methodology
The following learning activities and methodologies are employed: combined master and lab classes, flipped classes, gamification and a final project. Teachers are available for 2 hours per week for office hours.

COMBINED MASTER AND LAB CLASSES (1.5 ECTS)
Master classes provide an overview of the main theoretical and mathematical concepts of the representation and processing of speech, along with the analytic tools employed. Lab examples will be introduced as part of the theoretical expositions: all the sessions (lab availability permitting) will take place in the lab so that practical examples are interwoven with the explanations, adding dynamism to the class. This also helps to address differences in the students' backgrounds.
Moreover, every unit will begin with a debate on its technological implications. For this purpose, flipped classroom methodologies will be employed. In particular, students will be provided in advance with selected videos to motivate the debate, together with a list of (sometimes controversial) questions that the instructor will not answer categorically, to encourage discussion. In this way, we expect to awaken the students' curiosity about the materials that will subsequently be explained.

GAMIFICATION (0.75 ECTS)
We will illustrate the process of scientific publishing with a role-playing game. The students will join teams with different roles: editor, reviewer and author. Guidelines with the requirements and forms needed to carry out their roles will be provided, together with deadlines (for reviewing, revising and editorial decisions). A paper will be assigned to each team and the outcomes of the process will be publicly debated. In this way, we promote teamwork and the critical and active reading of research papers.

FINAL PROJECT (0.75 ECTS)
Students will work on a project in which they will program a complete modular system for one of the tools explained in class. They will be provided with guidelines and preparatory sessions based on problem-based learning.
Assessment System
  • % end-of-term-examination 0
  • % of continuous assessment (assignments, laboratory, practicals...) 100
Calendar of Continuous assessment
Basic Bibliography
  • Ben Gold (Author), Nelson Morgan (Author), Dan Ellis (Author). Speech and Audio Signal Processing: Processing and Perception of Speech and Music. Wiley. 2011
  • Dan Jurafsky, James H. Martin. Speech and Language Processing (3rd ed.). Prentice Hall. 2018
  • Dong Yu, Li Deng. Automatic Speech Recognition. Springer. 2015
Additional Bibliography
  • Amy Neustein (Editor), Jenay M. Beer (Contributor), Conrad Bzura (Contributor) et al. Speech and Automata in Health Care. De Gruyter. 2014
  • Amy Neustein (Editor), Hemant A. Patil (Editor). Acoustic Analysis of Pathologies: From Infancy to Young Adulthood. De Gruyter. 2020
  • Deborah Dahl (Author), Katharine Beals (Author), Marcia Linebarger (Author), Ruth Fink (Author). Speech and Language Technology for Language Disorders. De Gruyter. 2015
  • Rupayan Chakraborty, Meghna Pandharipande, Sunil Kumar Kopparapu. Analyzing Emotion in Spontaneous Speech. Springer. 2018

The course syllabus may change due to academic events or other reasons.