Statistical Data Modelling - 2023 entry
MODULE TITLE | Statistical Data Modelling | CREDIT VALUE | 15 |
---|---|---|---|
MODULE CODE | MTHM506 | MODULE CONVENER | Dr Oscar Rodriguez De Rivera Ortega (Coordinator) |
DURATION: TERM | 1 | 2 | 3 |
---|---|---|---|
DURATION: WEEKS | 5 (October start) / 0 (January start) | 0 (October start) / 5 (January start) |
Number of Students Taking Module (anticipated) | 50 |
---|
Statistical modelling lies at the heart of modern data analysis and is a vital part of data science, particularly when decision making is involved. Simple statistical models include linear regression, familiar from most foundation courses in statistics. This module places linear regression into the very broad framework of Bayesian statistical data modelling, which has become one of the most popular approaches to data analysis. Bayesian inference will be introduced as a unifying modelling framework, and the module will introduce modelling concepts such as Generalized Linear Models, Generalized Additive Models, Hierarchical Models, Multi-Level Models, Discrete Mixture Models, Models for Flawed Data and predictive model validation. These will provide you with a toolbox and the ability to analyse any real-world data set, including binary data, count data, contingency tables, data with temporal and spatial structure, and data that are missing or partially missing. We will use the statistical software R as the main platform to fit this wide range of models, and will use it in practical sessions so that, alongside a sound theoretical basis, you will develop an understanding of how to apply the techniques discussed in the module to practical data analysis.
Statistical data modelling offers a systematic and rigorous way of describing data and thus the mechanisms and processes that generated them. Uncertainty is formally quantified in terms of probability. This module will formally define statistical data modelling as a process by which we can use the data as well as subjective judgement to construct a mathematical description of the data. It will then argue that Bayesian inference is truly a unifying framework with which we can build and check the validity of statistical data models, while fully quantifying the different sources of uncertainty that result in the apparently haphazard nature of real data sets. The module will introduce well-established but fairly restrictive models such as GLMs, then move on to present more state-of-the-art approaches such as GAMs and Bayesian Hierarchical Models, as well as a conceptual framework for correcting flaws in observational data sets (such as censoring). The module will draw on many real data sets spanning a wide range of applications such as public health, weather, climate, ecology, biology, epidemiology, natural hazards and many others.
On successful completion of this module you should be able to:
Module Specific Skills and Knowledge
2 Demonstrate awareness of, and ability to apply, the unifying power of Bayesian inference for data analysis and its use in inference (e.g. quantifying relationships) and prediction;
3 Show awareness of, and ability to apply, related modern developments in statistical modelling techniques, including nonparametric and semi-parametric formulations (GAMs), Bayesian hierarchical modelling and models for flawed data;
4 Utilise appropriate software and a suitable computer language for advanced modelling of data;
Discipline Specific Skills and Knowledge
6 Apply simulation-based numerical integration methods in the context of Bayesian statistical modelling;
7 Appreciate and apply the concept of piecewise processes and their use in semi-parametric statistical models;
8 Understand the multivariate Normal distribution and its use in Bayesian statistical modelling;
Personal and Key Transferable / Employment Skills and Knowledge
10 Apply relevant computer software competently;
11 Use learning resources appropriately;
12 Exemplify self-management and time-management skills;
13 Gain experience in problem solving using data analysis.
- Introduction of linear regression as a special case of a statistical model and of statistical modelling as a method;
- Value of Bayesian inference as a unifying modelling framework;
- Posterior predictive model checking;
- Generalised Linear Models (GLMs): definition and historical use;
- Generalised Additive Models (GAMs): definition and a method to capture space-time structures;
- Normal approximation to the posterior and connection to maximum likelihood;
- Hierarchical Models: definition and links to random effects and multi-level models;
- Discrete mixture models and zero-inflation;
- Models for flawed data.
Scheduled Learning & Teaching Activities | 30 | Guided Independent Study | 120 | Placement / Study Abroad |
---|
Category | Hours of study time | Description |
Scheduled learning and teaching | 20 | Lectures |
Scheduled learning and teaching | 10 | Hands-on practical sessions |
Guided Independent Study | 36 | Post lecture study and reading |
Guided Independent Study | 84 | Formative and summative coursework preparation |
Form of Assessment | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method |
---|---|---|---|
Unassessed Practical Modelling Exercises | 20 exercises | 1-13 | Verbal, in class |
Coursework | 100 | Written Exams | 0 | Practical Exams |
---|
Form of Assessment | % of Credit | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method |
---|---|---|---|---|
Coursework – practical modelling exercises and theoretical problems | 50 | 10 Hours | 1-13 | Written and oral |
Coursework – data analysis project | 50 | 20 Hours | 1-13 | Written and oral |
Original Form of Assessment | Form of Re-assessment | ILOs Re-assessed | Time Scale for Re-assessment |
---|---|---|---|
CW - Practical modelling exercises 1* | CW - Practical modelling exercises 1 | 1-13 | Ref/Def Period |
CW - data analysis group project * | CW - data analysis individual project | 1-13 | Ref/Def Period |
Deferrals: Reassessment will be by coursework in the deferred element only. For deferred candidates, the module mark will be uncapped.
Referrals: Reassessment will be by a single piece of coursework worth 100% of the module only. As it is a referral, the mark will be capped at 50%.
information that you are expected to consult. Further guidance will be provided by the Module Convener
Reading list for this module:
Type | Author | Title | Edition | Publisher | Year | ISBN |
---|---|---|---|---|---|---|
Set | Aitkin, M., Francis, B., Hinde, J. and Darnell, R. | Statistical Modelling in R | | Oxford University Press | 2008 | 9780199219131 |
Set | Crawley, M.J. | The R Book | | Wiley | 2007 | 9780470510247 |
Set | Faraway, J.J. | Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models | | Chapman & Hall | 2006 | 158488424X |
Set | Wood, Simon N. | Generalized Additive Models: An Introduction with R | | Chapman & Hall/CRC | 2006 | 978-1584884743 |
Set | Gelman, A. and Hill, J. | Data Analysis Using Regression and Multilevel/Hierarchical Models | | Cambridge University Press | 2007 | 052168689X |
Set | Krzanowski, W.J. | An Introduction to Statistical Modelling | | Arnold | 1998 | 000-0-340-69185-9 |
CREDIT VALUE | 15 | ECTS VALUE | 7.5 |
---|---|---|---|
PRE-REQUISITE MODULES | None |
---|---|
CO-REQUISITE MODULES | None |
NQF LEVEL (FHEQ) | 7 | AVAILABLE AS DISTANCE LEARNING | No |
---|---|---|---|
ORIGIN DATE | Monday 14th September 2020 | LAST REVISION DATE | Friday 9th December 2022 |
KEY WORDS SEARCH | None Defined |
---|
Please note that all modules are subject to change. Please get in touch if you have any questions about this module.