Checking date: 14/05/2023


Course: 2023/2024

Back-end of big data analysis
(16916)
Master in Big Data Analytics (Plan: 352 - Estudio: 322)
EPI


Coordinating teacher: CALLE GOMEZ, FRANCISCO JAVIER

Department assigned to the subject: Computer Science and Engineering Department

Type: Compulsory
ECTS Credits: 3.0 ECTS

Course:
Semester:




Requirements (Subjects that are assumed to be known)
- Programming skills
- Experience in a Windows environment
Although not required, it is also desirable to have:
- basics of structured databases (at least, the Relational Model)
- notions of relational algebra (or at least, set theory)
- basics of data languages (especially SQL)
- basics of JavaScript
All of these will be introduced in class, and further materials will be provided so that students can catch up (or at least get introduced to them) at home.
Objectives
The aims of this course are:
- To distinguish different approaches to secondary storage, depending on the needs, but focusing on storage for analytic purposes.
- To learn about the wide diversity of solutions, and specifically to get introduced to some of the most widespread tools supporting Big Data deployments.
The agenda runs from information acquisition and preparation to data manipulation on several DBMS. The course is therefore largely characterized by a practical focus.
To achieve these goals, the student must acquire a set of generic capabilities, knowledge, skills and attitudes.
Cross/Generic Capabilities
      o Analysis and synthesis abilities
      o Organization and planning abilities
      o Troubleshooting
      o Ability to apply knowledge in practice
Specific Capabilities
Cognitive (Knowledge): storage paradigms, information lifecycle, back-end solutions for Big Data
Procedural/Instrumental (Know how):
- Information acquisition and preparation
- Data manipulation (in different languages) on diverse DBMS:
      o Structured DBMS
      o Document-oriented DBMS
      o Column-oriented DBMS
Attitudinal (To be): ability to design queries (creativity), concern for effectiveness and efficiency, and ability to discuss and clarify the diverse solutions to each specific problem
Skills and learning outcomes
Description of contents: programme
ITEM 1. Storage Paradigms
- Introduction to Storage: archives and files
- Databases and DBMS
- Evolution of Storage and DBMS: OLTP vs. OLAP
- Massive storage: ROLAP vs. RTOLAP
ITEM 2. Structured Storage
- Relational databases
- Data Warehousing
- Analytical Processing on Relational DB
ITEM 3. Information Acquisition and Preparation
- Acquisition and extraction
- Transformation, Cleaning & Integration
ITEM 4. DBMS Supporting Big Data: MongoDB (document-oriented)
ITEM 5. Introduction to other Back-End DBMS: Cassandra and Neo4J
Learning activities and methodology
Lectures: Highly practical lectures, most of which will take place in a computer room. Most tools involved in the course are free, so students are encouraged to practice with them after class. The syllabus is ambitious (and large): some further details will be left for students to review at home (complementary lectures).
Assignments: Except for the first item (which is introductory and theoretical in nature), every block will end with an assignment for the student to solve at home. Practical exercises (similar to those in the assignment) will be proposed in the classroom.
Tutoring sessions: Individual tutoring sessions will be provided to resolve any doubts, so that students can catch up with the group if some parts have not been fully understood.
Assessment System
  • % end-of-term-examination 25
  • % of continuous assessment (assignments, laboratory, practicals...) 75
Calendar of Continuous assessment
Basic Bibliography
  • J. Calle. Course Teaching Materials (provided via the Aula Global webpage). Each item will have specific references (mostly links to webpages where documentation on tool usage, syntax, etc. can be freely accessed). Aula Global. 2018
Electronic Resources *
Additional Bibliography
  • Elmasri, R. and Navathe, S.B. Fundamentals of Database Systems. Pearson.
  • Hurwitz, J., Nugent, A., Halper, F., Kaufman, M. Big Data for Dummies. Wiley. 2013
  • Ramakrishnan, R. and Gehrke, J. Database Management Systems. McGraw Hill.
  • Warden, P. Big Data Glossary: A Guide to the New Generation of Data Tools. O'Reilly. 2011
  • ITL Education Solutions Ltd. Express Learning: Database Management Systems. Pearson India Pubs. 2012
  • Rijmenam, M.V. Think Bigger (ISBN-13: 978-0-8144-3415-4). Amacom. 2014
(*) Access to some electronic resources may be restricted to members of the university community and require validation through Campus Global. If you try to connect from outside the University, you will need to set up a VPN.


The course syllabus may change due to academic events or other reasons.