Checking date: 04/10/2018


Academic year: 2019/2020

Back-end of big data analysis
(16916)
Study: Master in Big Data Analytics (322)
EPI


Coordinating teacher: CALLE GOMEZ, FRANCISCO JAVIER

Department assigned to the subject: Department of Computer Science and Engineering

Type: Compulsory
ECTS Credits: 3.0 ECTS

Course:
Semester:




Students are expected to have completed
- Programming skills
- Experience in a Windows environment
Although not required, the following are also desirable:
- Basics of structured databases (at least, the Relational Model)
- Notions of relational algebra (or at least, set theory)
- Basics of data languages (especially SQL)
- Basics of JavaScript
All of these will be introduced in class, and further materials will be provided so that students can catch up (or at least get introduced) at home.
Competences and skills that will be acquired and learning results.
The aims of this course are:
- To distinguish different approaches to secondary storage according to needs, focusing on storage for analytic purposes.
- To learn about the wide diversity of solutions, and specifically to be introduced to some of the most widespread tools supporting Big Data deployments.
The agenda ranges from information acquisition and preparation to data manipulation on some DBMS. The course therefore has a largely practical focus.
To achieve these goals, the student must acquire a set of generic capabilities, knowledge, skills and attitudes.
Cross/Generic Capabilities
      o Analysis and synthesis abilities
      o Organization and planning abilities
      o Troubleshooting
      o Ability to apply knowledge in practice
Specific Capabilities
Cognitive (Knowledge): storage paradigms, information lifecycle, back-end solutions for Big Data
Procedural/Instrumental (Know how)
- Data manipulation
- Information acquisition and preparation
- Use of a document-oriented DBMS
- Use of a column-oriented DBMS
Attitudinal (To be): ability to design queries (creativity), concern for effectiveness and efficiency, and ability to discuss and clarify the diverse solutions to each specific problem
Description of contents: programme
ITEM 1. Storage Paradigms
- Introduction to Storage: archives and files
- Databases and DBMS
- Evolution of Storage and DBMS: OLTP vs. OLAP
- Massive storage: ROLAP vs. RTOLAP
ITEM 2. Structured Storages
- Relational databases
- Analytical Processing on Relational DB
ITEM 3. Information Acquisition and Preparation
- Acquisition and extraction
- Transformation, Cleaning & Integration
ITEM 4. Supporting Big Data: document-oriented approach and MongoDB
ITEM 5. Supporting Big Data: key-value / column approach and Cassandra
Learning activities and methodology
Lectures: Highly practical lectures, most of which will take place in a computer room. Most tools involved in the course are free, so students are encouraged to practice with them after class. The syllabus is ambitious (and large): some further details will be left for students to review at home (complementary readings).
Assignments: Except for the first item (introductory and theoretical in nature), every block will end with an assignment for students to solve at home. Practical exercises (similar to those in the assignment) will be proposed in the classroom.
Tutoring sessions: Individual tutoring sessions will be provided to resolve any doubts, so that students can catch up with the group if some parts have not been completely understood.
Assessment System
  • % end-of-term-examination 25
  • % of continuous assessment (assignments, laboratory, practicals...) 75
Basic Bibliography
  • J. Calle. Course Teaching Materials (provided via Aula Global webpage) Each item will have specific references (mostly, links to webpages where documentation on tools usage, syntax, etc. can be freely accessed). Aula Global. 2018
Additional Bibliography
  • Elmasri, R. and Navathe, S.B. Fundamentals of Database Systems. Pearson.
  • Hurwitz, J., Nugent, A., Halper, F., Kaufman, M. Big Data for Dummies. Wiley. 2013
  • Ramakrishnan, R. and Gehrke, J. Database Management Systems. McGraw Hill.
  • Warden, P. Big Data Glossary. A Guide to the New Generation of Data Tools. O'Reilly. 2011
  • Express Learning: Database Management Systems. ITL Education Solutions Lt. Pearson India Pubs. 2012
  • Rijmenam, M.V. Think Bigger (ISBN-13: 978-0-8144-3415-4). Amacom. 2014

The course syllabus and the academic weekly planning may change due to academic events or other reasons.