Checking date: 13/04/2023


Course: 2023/2024

Information access and retrieval
(15760)
Bachelor in Computer Science and Engineering (2018 Study Plan) (Plan: 431 - Study: 218)


Coordinating teacher: MORATO LARA, JORGE LUIS

Department assigned to the subject: Computer Science and Engineering Department

Type: Electives
ECTS Credits: 6.0 ECTS

Course:
Semester:




Requirements (Subjects that are assumed to be known)
- Files and Databases (Bachelor in Informatics Engineering, 2nd Course, 2nd Semester, Compulsory)
- Object oriented programming (Bachelor in Informatics Engineering, 1st Course, 2nd Semester, Compulsory)
Objectives
The purpose of the course is to improve knowledge of the following aspects:
1. Retrieval models
2. Natural Language Processing techniques
3. Systems to formalize, synthesize, and structure information
4. Traceability systems
5. Ability to show results in an appropriate way
6. Improvement of retrieval and knowledge reuse systems on the Web and in Software Engineering

To this end, the following will be carried out:
1. Design of retrieval systems
2. Design of natural language analyzers
3. Application of text mining techniques to improve the representation and sorting of results
Skills and learning outcomes
Description of contents: programme
Description: Retrieval Models, Natural Language Processing, semantic analysis, metadata, linked data, information retrieval, positioning techniques, knowledge reuse, data mining.

The course examines fundamental concepts of retrieval systems and introduces a variety of basic techniques. These include the use of knowledge organization systems, positioning techniques, natural language processing techniques and resources, and evaluation with retrieval metrics.

Course content, 3 units:

Unit 1. Information retrieval
- Lesson 1. Search basics in different web types: classic web, Semantic Web, Social Web, Data Web, Dark Web, Deep Web, question-answering web, and commercial web
- Lesson 2. Search Engine Optimization (SEO/SEM)
- Lesson 3. Basic information retrieval models
- Lesson 4. Access, acquisition and cleansing of Semantic Web data and big data
- Lesson 5. Crawlers, scrapers and search engine architecture

Unit 2. Retrieval evaluation
- Lesson 6. Evaluation metrics for information retrieval systems

Unit 3. Advanced techniques for information retrieval systems
- Lesson 7. Natural Language Processing (NLP)
- Lesson 8. Information extraction techniques (IE)
- Lesson 9. Relevance feedback and query expansion
Learning activities and methodology
- Lectures (theory): 1.6 ECTS. To achieve the specific cognitive competences of the course.
- Lectures (practices): To develop the attitudinal and specific competences as well as most of the general ones, such as collaborative teamwork, skills to apply theoretical concepts, design planning, information organization, analysis, and abstraction. Students must design and develop an information retrieval system.
- Workshops and labs: 0.2 ECTS.
- Individual or group exercises: 3 ECTS.
- Tutorials to solve practical and theoretical questions: 1 ECTS.
- Exercises and examination: 0.2 ECTS. The goal is to complete the development of specific cognitive and procedural capacities. Exercises and results of the practices will be discussed in class.
Assessment System
  • % end-of-term-examination 40
  • % of continuous assessment (assignments, laboratory, practicals...) 60
Calendar of Continuous assessment
Basic Bibliography
  • Aurélien Géron. Hands-On Machine Learning with Scikit-Learn and TensorFlow. O'Reilly. 2017
  • Benjamin Bengfort, Rebecca Bilbro, Tony Ojeda. Applied Text Analysis with Python. O'Reilly. 2018
  • R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval: The Concepts and Technology behind Search (2nd edition). Addison Wesley. 2011
  • Verborgh, R., De Wilde, M., & Sawant, A. Using OpenRefine: The essential OpenRefine guide that takes you from data analysis and error fixing to linking your dataset to the web. Packt Publishing. 2013
Electronic Resources *
Additional Bibliography
  • Anne Ahola Ward. The SEO battlefield: winning strategies for search marketing programs. O'Reilly. 2017
  • Dale R. Handbook of Natural Language Processing. Marcel Dekker. 2000
  • Dean Allemang, James Hendler. Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL. Elsevier. 2011
  • Gábor László Hajba. Website Scraping with Python: Using BeautifulSoup. Google Books. 2018
  • Ian H. Witten, Alistair Moffat and Timothy C. Bell. Managing Gigabytes: compressing and indexing documents. Morgan Kaufmann. 1999
  • J. Urbano, M. Marrero, D. Martín and J. Morato. Bringing Undergraduate Students Closer to a Real-World Information Retrieval Setting: Methodology and Resources. ACM SIGCSE ITiCSE. 2011
  • Moens Marie-Francine. Information Extraction: algorithms and prospects in a retrieval context (Chps. 1, 2 & 4). Springer. 2006
  • Morato, J., Sánchez-Cuadrado, S., Moreno, V., Moreiro, J. A. Evolución de los factores de posicionamiento web y adaptación de las herramientas de optimización. Revista Española de Documentación Científica, Vol 36, No 3. 2013
  • Nadeau D. and Sekine S. A survey of named entity recognition and classification. Lingvisticae Investigationes vol. 30 n.1. 2007
  • Stuart Russell, Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson. 2018
(*) Access to some electronic resources may be restricted to members of the university community and may require validation through Campus Global. If you connect from outside the University, you will need to set up a VPN.


The course syllabus may change due to academic events or other reasons.