Checking date: 17/06/2025 15:53:36


Course: 2025/2026

Big Data for Business
(19618)
Bachelor in International Studies (Plan: 504 - Estudio: 305)


Coordinating teacher: AUSIN OLIVERA, MARIA CONCEPCION

Department assigned to the subject: Statistics Department

Type: Electives
ECTS Credits: 6.0 ECTS

Course:
Semester:




Requirements (Subjects that are assumed to be known)
  • Statistics for social sciences I: Introduction to statistics
  • Statistics for social sciences II: Multivariate techniques
Objectives
1. Understand the importance of transforming large volumes of data into relevant information for decision making and business development in organizations, companies and individuals.
2. Learn the basic techniques of data preprocessing and visualization. Gain knowledge of methods for handling missing and atypical data. Acquire the ability to use dimension reduction techniques.
3. Gain knowledge of the main supervised learning methods for regression and their usefulness in prediction problems. Distinguish between linear and non-linear models and understand the importance of model selection methods.
4. Become familiar with the usual supervised learning procedures for classification. Understand the most common classifiers and their limitations. Gain knowledge of advanced classification methods and their benefits in business.
5. Be able to identify the appropriate Big Data techniques for real business problems: customer classification, scoring, risk management, fraud detection, bankruptcy prediction, etc.
Learning Outcomes
K3: To know basic humanistic contents, oral and written expression, following ethical principles and completing a multidisciplinary training profile.
S4: Use information interpreting relevant data avoiding plagiarism, and in accordance with the academic and professional conventions of the area of study, being able to assess the reliability and quality of such information.
S7: Be able to identify, access and manage sources of information relevant to comparative analysis in the field of politics, economics and international relations.
S8: Knowing how to propose and use the appropriate tools to solve basic problems of economic, social and political content, especially in the international context.
S11: Ability to discern which quantitative or qualitative research technique is the appropriate one to apply depending on the phenomenon being analyzed.
C3: Ability to establish good interpersonal communication and to work in multidisciplinary and international teams.
Description of contents: programme
1. Introduction.
2. Data collection, sampling and preprocessing.
   2.1. Types of data.
   2.2. Sampling.
   2.3. Data visualization tools.
   2.4. Missing values.
   2.5. Outlier detection and treatment.
   2.6. Data transformations.
   2.7. Dimension reduction.
   2.8. Application: Risk management in the stock market.
3. Supervised learning: regression.
   3.1. Linear and polynomial regression.
   3.2. Cross-validation.
   3.3. Model selection and regularization methods (ridge and lasso).
   3.4. Nonlinear models, splines and generalized additive models.
   3.5. Application: Credit-scoring prediction.
4. Supervised learning: classification.
   4.1. Bayes classifiers.
   4.2. Logistic regression.
   4.3. K-nearest neighbors.
   4.4. Random forest.
   4.5. Support-vector machines.
   4.6. Boosting.
   4.7. Application: Credit risk.
   4.8. Application: Fraud detection.
   4.9. Application: Bankruptcy prediction.
Learning activities and methodology
Theory (2 ECTS): Lectures, with course material posted online.
Problems (4 ECTS): Problem-solving classes. Computational exercises in the computer room. Group work assignments.
Weekly office hours to assist students on an individual and group basis.
Assessment System
  • % end-of-term examination/test: 50
  • % of continuous assessment (assignments, laboratory, practicals...): 50

Calendar of Continuous assessment


Extraordinary call: regulations
Basic Bibliography
  • Bradley Efron, Trevor Hastie. Computer Age Statistical Inference: Algorithms, Evidence and Data Science. Cambridge University Press. 2016
  • Daniel Peña. Análisis de datos multivariantes. McGraw-Hill. 2002
  • James, G., Witten, D., Hastie, T., Tibshirani, R. An Introduction to Statistical Learning with Applications in R. Springer. 2021 (2nd Edition)
  • T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer. 2009

The course syllabus may change due to academic events or other reasons.