Checking date: 24/04/2018


Course: 2018/2019

Predictive Modeling
(16494)
Bachelor in Data Science and Engineering (Plan: 392 - Estudio: 350)


Coordinating teacher: GARCIA PORTUGUES, EDUARDO

Department assigned to the subject: Statistics Department

Type: Compulsory
ECTS Credits: 6.0 ECTS

Course:
Semester:




Requirements (Subjects that are assumed to be known)
  • Calculus I and II
  • Linear Algebra
  • Programming
  • Probability and Data Analysis
  • Introduction to Statistical Modeling
  • Statistical Learning
* General competences
  - CG1: Adequate knowledge and skills to analyse and synthesise basic problems related to engineering and data science, solve them and communicate them efficiently.
  - CG4: Ability to solve technological, computational, mathematical and statistical problems that may arise in engineering and data science.
  - CG5: Ability to solve mathematically formulated problems applied to different subjects, using numerical algorithms and computational techniques.
  - CG6: Ability to synthesise the conclusions obtained from the analyses carried out and present them clearly and convincingly, both in writing and orally.
* Transversal competences
  - CT1: Ability to communicate knowledge orally and in writing to both specialised and non-specialised audiences.
* Specific competences
  - CE1: Ability to solve mathematical problems that may arise in engineering and data science. Ability to apply knowledge of: algebra; geometry; differential and integral calculus; numerical methods; numerical algorithms; statistics and optimisation.
  - CE2: Ability to properly identify problems of a predictive nature corresponding to certain objectives and data, and to use the basic results of regression analysis as the basis of prediction methods.
  - CE5: Ability to understand and handle fundamental concepts of probability and statistics, and to represent and manipulate data to extract meaningful information from them.
  - CE7: Understanding of the basic concepts of programming and ability to write programs aimed at data analysis.
Description of contents: programme
This course is designed to give a panoramic view of several tools available for predictive modeling, at an introductory-intermediate level. This view covers in depth the main concepts in (simple and multiple) linear models, gives an overview of their extensions, and treats regression trees more superficially. The focus is placed on providing the main insights into the statistical/mathematical foundations of the models and on showing the effective implementation of the methods through the use of the statistical software R.

1. Introduction
  1.1 Course overview
  1.2 What is predictive modeling?
  1.3 Review on statistical inference
  1.4 Review on probability
  1.5 Software
2. Simple linear regression
  2.1 Model formulation and estimation
  2.2 Assumptions of the model
  2.3 Inference for model parameters
  2.4 Prediction
  2.5 ANOVA and model fit
3. Multiple linear regression
  3.1 Model formulation and estimation
  3.2 Assumptions of the models
  3.3 Inference for model parameters
  3.4 ANOVA and model fit
  3.5 Model selection
  3.6 Use of qualitative predictors
  3.7 Model diagnostics and multicollinearity
4. Linear regression extensions
  4.1 Dimension reduction techniques
  4.2 Regularization
  4.3 Handling nonlinear relationships
  4.4 Regression splines
  4.5 Local linear regression
  4.6 Logistic regression
5. Regression trees
  5.1 Decision trees
  5.2 Bagging
  5.3 Random forest
  5.4 Boosting

The program is subject to minor modifications depending on the course development and/or the academic calendar.
Learning activities and methodology
The lessons consist of a mixture of theory (description of the methods) and practice (implementation and practical usage of the methods). The methods are implemented in the statistical language R.
Assessment System
  • % end-of-term-examination 0
  • % of continuous assessment (assignments, laboratory, practicals...) 100

Basic Bibliography
  • James, G., Witten, D., Hastie, T. and Tibshirani, R. An Introduction to Statistical Learning with Applications in R. Springer-Verlag. 2013
Additional Bibliography
  • Kuhn, M. and Johnson, K. Applied Predictive Modeling. Springer. 2013
  • Peña, D. Regresión y Diseño de Experimentos. Alianza Editorial. 2002
  • Wood, S. N. Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC. 2006
Electronic Resources *
(*) Access to some electronic resources may be restricted to members of the university community and require validation through Campus Global. If you try to connect from outside the University, you will need to set up a VPN.


The course syllabus may change due to academic events or other reasons.


More information: https://www.uc3m.es/ss/Satellite/Grado/en/Detalle/Estudio_C/1371241688824/1371212987094/Bachelor_s_Degree_in_Data_Science_and_Engineering