Checking date: 20/05/2023


Course: 2023/2024

Natural Language Processing
(19211)
Master in Applied Artificial Intelligence (Plan: 475 - Estudio: 378)
EPI


Coordinating teacher: ARENAS GARCIA, JERONIMO

Department assigned to the subject: Signal and Communications Theory Department

Type: Electives
ECTS Credits: 3.0 ECTS

Course:
Semester:




Requirements (Subjects that are assumed to be known)
  • It is recommended to have passed the Machine Learning subject.
  • The Deep Learning subject (also offered in the second term) also provides competences of interest, although it is not essential to take it.
Objectives
  • Familiarize students with some commonly used methods for natural language processing, both for preprocessing unstructured text and for building models based on machine learning.
  • Learn various approaches for calculating semantic similarity between documents, and their use in building and analyzing semantic graphs.
  • Present some tools for the interactive visualization of machine learning and natural language processing models, based on graphs and interactive dashboards.
  • Familiarize students with some relevant applications of natural language processing.
  • Encourage maturity in the knowledge of these technologies, and the autonomy to explore the concepts explained in class in greater depth, by working on a final group project.
Skills and learning outcomes
Description of contents: programme
1. Introduction to Natural Language Processing
2. Word and document vector representation
   2.1. Text homogenization and cleaning
   2.2. spaCy and Spark NLP
   2.3. One-hot encoding
   2.4. Word Embeddings. Word2Vec. GloVe
   2.5. Other Embedding representations
3. Transformers
   3.1. Introduction to Transformers. Hugging Face
   3.2. Text Classification: Sentiment Analysis
   3.3. Other applications
        • Zero-shot classification
        • Text Generation
        • Neural Machine Translation
        • Question Answering
4. Topic Modeling
   4.1. Latent Dirichlet Allocation
   4.2. Neural Topic Modeling
5. Semantic Graph Analysis
   5.1. Semantic Similarity Metrics
   5.2. Semantic Graphs
   5.3. Graph Analysis
   5.4. Graph Visualization
   5.5. Semantic Information Retrieval
Learning activities and methodology
The following learning activities and methodologies are employed:
  • Combined lecture and lab classes: Lectures provide an overview of the main theoretical and mathematical concepts of natural language processing, along with the analytic tools. Lab examples will be introduced as part of the theoretical explanations: subject to lab availability, all teaching sessions will take place in the lab, so that practical examples can be interwoven with the explanations and make the classes more dynamic. This also helps to address differences in students' backgrounds (MD1 and MD3).
  • Final Project: Students will work on a project in which they will program a complete modular system for one of the tools explained in class. Students will be given guidelines and preparatory sessions based on problem-based learning (MD5).
Teachers are available for office hours 2 hours per week.
Assessment System
  • % end-of-term-examination 0
  • % of continuous assessment (assignments, laboratory, practicals...) 100
Calendar of Continuous assessment
Basic Bibliography
  • Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola. Dive into Deep Learning. https://d2l.ai. 2020
  • Christopher D. Manning, Hinrich Schütze. Foundations of Statistical Natural Language Processing. MIT Press. 1999
  • Dan Jurafsky and James H. Martin. Speech and Language Processing. Prentice Hall. 2018
  • Denis Rothman. Transformers for Natural Language Processing. Packt. 2022 (2nd Ed)
  • Li Deng (Editor), Yang Liu (Editor). Deep Learning in Natural Language Processing. Springer. 2018

The course syllabus may change due to academic events or other reasons.