Ficha

Versión en español

Course: 2025/2026

High-performance computing for big data in companies

(17231)

Master in Big Data Analytics (Plan: 352 - Estudio: 322)

EPI

Coordinating teacher: GARCIA BLAS, FRANCISCO JAVIER

Department assigned to the subject: Computer Science and Engineering Department

Type: Compulsory

ECTS Credits: 3.0 ECTS

Course: 1º

Semester: 1º

Objectives

Basic Skills * Knowledge and understanding that provide a basis or opportunity for originality in developing and / or applying ideas, often in a research context * That the students can apply the broader (or multidisciplinary) acquired knowledge and ability to solve problems in new or unfamiliar environments within contexts related to their field of study * Students must possess the learning skills that enable them to continue studying in a way that will be largely self-directed or autonomous. General Competencies * Apply the theoretical underpinnings of the techniques for the high-performance processing of large volumes of data as a basis for the development and adaptation of such techniques to specific problems * Identify different techniques and paradigms for processing large amounts of data, and differentiate them according to their theoretical and practical features * Use skills for teamwork and getting along with other independently Specific Skills * Apply basic knowledge of big data programming techniques using advanced technologies and methods for treating large volumes of data * Identify opportunities that data processing techniques can make to the improvement of the activity of enterprises and organizations * Provide basic and fundamental knowledge of big data processing frameworks * Identify and select suitable frameworks and software tools for the treatment of large amounts of data * Making efficient use of distributed platforms for high-performance data processing Learning Results * Manage the basics of big data processing frameworks. * Ability to use high-performance architectures and technologies for large volumes of data. * Knowledge of design techniques and application development of high-performance big data computing. * Skills to analyze and model the most appropriate frameworks for each problem, adapting to the specifications of individual cases

Learning Outcomes

Link to document

Description of contents: programme

1. Introduction to Big Data Processing 2. MapReduce Paradigm 3. Storage Systems Big Data environments * HDFS as distributed file system * Commands for managing files in HDFS 4. Frameworks for intensive computing data * Introduction to Apache Hadoop * Apache Spark * Access and processing a large volume of data * Streaming Data Processing 4. Management computational resources * Introduction to Apache Yarn * Deploying applications in corporate Big Data environments * Tools for monitoring Big Data applications

Learning activities and methodology

Learning activities: * Lectures * Hands-on land lab projects * Personal student work. Teaching methodology: * Presential lectures imparted in the class, using multimedia and informatics support, to develop the course's main concepts. Reading materials will be provided to complement student's knowledge. * Reading recommended texts, from papers, technical journals, manuals, and reports, to extend the student's knowledge of the subject topics. * Solving practical jobs, problems, etc. proposed in class (individually or in groups). * AI tools are allowed under the declaration of the student.

Assessment System

% end-of-term-examination/test 60
% of continuous assessment (assigments, laboratory, practicals...) 40

Calendar of Continuous assessment

Use of Artificial Intelligence tools selectively allowed in this subject. The teacher may indicate a list of assignments and exercises that the student may perform using AI tools, specifying how they should be used, and how the student should describe the use he/she has made of them. If the use of AI by the student gives rise to academic fraud by falsifying the results of an exam or work required to accredit academic performance, the provisions of the Regulation of the University Carlos III of Madrid of partial development of the Law 3/2022, of February 24, of university coexistence will be applied.

1.- Continuous evaluation (40%)

    * Class activities.
    * Individual or collective projects made along the course.

2.- Final exam (60%)

It is mandatory to obtain at least 4 points over 10 in each of the evaluable parts of the subject.

Basic Bibliography

Holden Karau, Andy Konwinski, Patrick Wendell & Matei Zaharia. Learning Spark. O¿Reilly. 2015
Martin Odersky, Lex Spoon, Bil Venners. Programming in Scala. Artima.

The course syllabus may change due academic events or other reasons.

More information: https://researchportal.uc3m.es/display/inv36190