Checking date: 28/04/2023


Course: 2024/2025

Speech Technologies for Health
(18848)
Master in Information Health Engineering (Plan: 427 - Estudio: 359)
EPI


Coordinating teacher: PELAEZ MORENO, CARMEN

Department assigned to the subject: Signal and Communications Theory Department

Type: Electives
ECTS Credits: 3.0 ECTS

Course:
Semester:




Objectives
Speech Technologies for Health will provide students with knowledge of:
  • the speech production and perception mechanisms and the linguistic categories of the voice
  • the state of the art and fundamentals of speech and audio coders, speech recognition, speech synthesis (text-to-speech), speaker recognition and dialogue systems
  • their applications to health

Students will also acquire the ability to:
  • initiate research work in speech and audio coding, speech recognition, speech synthesis (text-to-speech), speaker recognition and natural language processing
  • apply this knowledge to pose research questions on its applications to health

Basic competences
  • CB6 Having and understanding the knowledge that provides a basis or opportunity to be original in the development and/or application of ideas, often in a research context.
  • CB7 Students know how to apply their acquired knowledge and problem-solving skills in new or unfamiliar settings within broader (or multidisciplinary) contexts related to their field of study.
  • CB8 Students are able to integrate knowledge and to face the complexity of making judgments based on information that, being incomplete or limited, includes reflections on the social and ethical responsibilities linked to the application of their knowledge and judgments.
  • CB9 Students know how to communicate their conclusions, and the knowledge and ultimate reasons behind them, to specialised and non-specialised audiences in a clear and unambiguous way.
  • CB10 Students have the learning skills that will enable them to continue studying in a way that will be largely self-directed or autonomous.

General competences
  • CG2 Ability to apply the knowledge of skills and research methods related to engineering.
  • CG3 Ability to apply the knowledge of research skills and methods related to Life Sciences.
  • CG4 Ability to contribute to the widening of the frontiers of knowledge through original research, part of which merits publication referenced at an international level.
  • CG5 Ability to perform a critical analysis, evaluation and synthesis of new and complex ideas.
  • CG6 Ability to communicate with the academic and scientific community and with society in general about their fields of knowledge in the modes and languages commonly used in their international scientific community.

Specific competences
  • CE8 Ability to handle with ease the mathematical concepts and foundations necessary for the analysis, design and implementation of automatic learning algorithms that operate under given specifications, in particular in the field of speech and natural language processing.
  • CE9 Ability to handle advanced automatic learning techniques for their application in the field of biomedicine.
Skills and learning outcomes
Description of contents: programme
Unit 0. Introduction to Speech Technologies
Unit 1. The Auditory System and Speech Perception
Unit 2. The Speech Production System and Phonation. Speech and Audio Coding
Unit 3. Speech recognition
  • Feature extraction
  • Acoustic & Language models
  • End-to-end recognition and deep learning models
Unit 4. Speaker recognition and biometrics
Unit 5. Text-to-Speech synthesis
Unit 6. Applications to health
  • Affective computing: Emotion and Sentiment analysis
  • Pathologies detection
Learning activities and methodology
The following learning activities and methodologies are employed: combined master and lab classes, flipped classes, gamification and a final project. Teachers are available for office hours 2 hours per week.

COMBINED MASTER AND LAB CLASSES (1.5 ECTS)
Master classes provide an overview of the main theoretical and mathematical concepts of speech representation and processing, along with the analytic tools employed. Lab examples will be introduced as part of the theoretical expositions: all teaching sessions will take place in the lab, subject to lab availability, so that practical examples can be woven into the explanations and make the class more dynamic. This also helps address differences in students' backgrounds.

Moreover, every unit will begin with a debate on its technological implications. For this purpose, flipped classroom methodologies will be employed. In particular, students will be provided in advance with selected videos to motivate the debate, together with a list of questions (sometimes controversial) that the instructor will not answer categorically, in order to encourage discussion. In this way, we expect to awaken the students' curiosity about the materials that will subsequently be explained.

GAMIFICATION (0.75 ECTS)
We will illustrate the process of scientific publishing with a role-playing game. The students will join teams with different roles: editor, reviewer and author. Guidelines with the requirements and forms needed to carry out their roles will be provided, together with deadlines (for reviewing, revising and editorial decisions). A paper will be assigned to each team and the outcomes of the process will be publicly debated. In this way, we promote teamwork and the critical and active reading of research papers.

FINAL PROJECT (0.75 ECTS)
Students will work on a project in which they will program a complete modular system for one of the tools explained in class. The students will be provided with guidelines and preparatory sessions based on problem-based learning.
Assessment System
  • % end-of-term-examination 0
  • % of continuous assessment (assignments, laboratory, practicals...) 100

Calendar of Continuous assessment


Basic Bibliography
  • Ben Gold, Nelson Morgan, Dan Ellis. Speech and Audio Signal Processing: Processing and Perception of Speech and Music. Wiley. 2011
  • Dan Jurafsky, James H. Martin. Speech and Language Processing (3rd ed.). Prentice Hall. 2018
  • Dong Yu, Li Deng. Automatic Speech Recognition. Springer. 2015
Additional Bibliography
  • Amy Neustein (Editor), Jenay M. Beer (Contributor), Conrad Bzura (Contributor) et al. Speech and Automata in Health Care. De Gruyter. 2014
  • Amy Neustein (Editor), Hemant A. Patil (Editor). Acoustic Analysis of Pathologies: From Infancy to Young Adulthood. De Gruyter. 2020
  • Deborah Dahl, Katharine Beals, Marcia Linebarger, Ruth Fink. Speech and Language Technology for Language Disorders. De Gruyter. 2015
  • Rupayan Chakraborty, Meghna Pandharipande, Sunil Kumar Kopparapu. Analyzing Emotion in Spontaneous Speech. Springer. 2018

The course syllabus may change due to academic events or other reasons.