Checking date: 21/05/2019


Academic year: 2019/2020

Big Data: Data Analysis Techniques
(17302)
Study: Master in Libraries, Archives and Digital Continuity (335)
EPH


Coordinating teacher: ROBLEDANO ARILLO, JESUS

Department assigned to the subject: Department of Library Science and Documentation

Type: Electives
ECTS Credits: 3.0 ECTS

Course:
Semester:




Students are expected to have completed
- Database Design (340-17452).
- Statistical Data Analysis (340-17467).
- Data Visualisation (340-17468).
- Metric Studies of Information (340-17461).
- Data Science (340-17477).
- Information Visualisation (335-17283).
Competences and skills that will be acquired and learning results.
BASIC COMPETENCES
CB8 Learn how to design strategies for the analysis and exploitation of large volumes of data.
CB9 Make proposals for data management for different contexts and organisations.

GENERAL COMPETENCES
CG2 Learn to identify work and research lines related to the analysis, cleansing and exploitation of data.
CG6 Know the different business models related to Big Data.
CG8 Learn to identify the potential value and use of data.
CG9 Learn how to adapt and develop methods and techniques according to the new knowledge and skills required for data management and analysis.

LEARNING RESULTS
- Knowledge of methods and techniques used for designing and assessing data management strategies.

The student, after passing the subject, must:
- Apply techniques to prepare studies and reports analysing data management in organisations.
- Know and understand concepts and terms related to Big Data.
- Design, plan and implement data cleansing and analysis processes.
- Learn to identify strategic, high-impact data in the organisation.
- Know the different methods and techniques of data extraction, data cleansing, and linked data.
- Acquire a good command of tools for data extraction, data cleansing, and linked data.

The student, after passing the subject, must:
- Know the limitations and technical capabilities required in Big Data.
- Understand the value of data from different sources.
- Know tools that allow designing, preparing, analysing and managing large volumes of structured or unstructured information.
- Use data analysis techniques to obtain valid conclusions for decision making.
- Normalise, relate and enrich the information provided by different datasets.
- Use different techniques and tools for data cleansing, data analysis and data linking.
Description of contents: programme
Lesson 1. Introduction to Big Data.
1.1. Introduction to Big Data.
1.2. Definition of Big Data.
1.3. Efficient data management.
1.4. Connections to other specialities.

Lesson 2. Data Extraction.
2.1. Introduction to data extraction.
2.2. Data collection and data pre-processing.

Lesson 3. Data Storage.
3.1. Introduction to data storage.
3.2. Data warehouse.
3.3. Data warehouse architecture.
3.4. Data warehouse management.
3.5. Data warehouse in data mining.

Lesson 4. Data Preparation.
4.1. Introduction to data cleansing.
4.2. Data integration.
4.3. Data transformation.

Lesson 5. Data Selection.
5.1. Introduction to data selection.
5.2. Analysis of the environment and the audience.
5.3. Data exploration.
5.4. Exploratory data techniques.
5.5. Data selection.

Lesson 6. Data Analysis.
6.1. Introduction to data analysis.
6.2. Data analysis methods and techniques.
6.3. Data mining.
6.4. Data models.
6.5. Results interpretation and validation.

Lesson 7. Challenges of Big Data.
7.1. Introduction to current and future challenges.
7.2. Ethical and legal questions.
7.3. Future trends in data mining.
Learning activities and methodology
TRAINING ACTIVITIES
AF1 Individual work for the study of theoretical and practical materials.
AF2 Practical cases.
AF3 Theoretical-practical classes.
AF4 Tutorship.
AF5 Final work.
AF6 Active participation in the course forums.

TEACHING METHODS
MD1 Oral presentations describing the key concepts of the subject.
MD2 Critical reading of recommended texts.
MD3 Resolution of practical exercises individually or in a group.
MD4 Class discussions related to the content of the subject and/or practical cases.
MD5 Preparation of individual and group work and reports.
MD6 Reading of theoretical and practical teaching materials.

TUTORSHIP SCHEME
The tutorship schedules are available on Aula Global. In addition to these official tutorship hours, students may request and arrange additional tutorship sessions with the teacher.
Assessment System
  • % end-of-term examination: 30
  • % of continuous assessment (assignments, laboratory, practicals...): 70
Basic Bibliography
  • GÓMEZ GARCÍA, José Luis. Introducción al big data. Barcelona: UOC. 2015
  • JOYANES AGUILAR, Luis. Big data: análisis de grandes volúmenes de datos en organizaciones. Barcelona: Marcombo. 2013
  • LARA TORRALBO, Juan Alfonso. Minería de datos. Madrid: CEF. 2014
  • MAYER-SCHÖNBERGER, Viktor. Big data: la revolución de los datos masivos. Madrid: Turner. 2013
  • NETTLETON, David F. Data mining: fundamentos y metodologías. Barcelona: UOC. 2007
  • SCHMARZO, Bill. Big data: el poder de los datos. Madrid: Anaya Multimedia. 2014
  • SIEGEL, Eric. Analítica predictiva: predecir el futuro utilizando Big Data. Madrid: Anaya Multimedia. 2013

The course syllabus and the weekly academic planning may change due to academic events or other reasons.