Bayesian Philosophy and Methods in Data Science - 2024 entry
MODULE TITLE | Bayesian Philosophy and Methods in Data Science | CREDIT VALUE | 15 |
---|---|---|---|
MODULE CODE | MTHM508 | MODULE CONVENER | Dr James Salter (Coordinator) |
DURATION: TERM | 1 | 2 | 3 |
---|---|---|---|
DURATION: WEEKS | 11 |
Number of Students Taking Module (anticipated) | 28 |
---|
Since the 1980s, computational advances and novel algorithms have seen Bayesian methods explode in popularity, today underpinning modern techniques in data science and machine learning with applications across science, social science, the humanities and finance.
This module will introduce Bayesian statistics and reasoning. It will develop the philosophical and mathematical ideas of subjective probability theory for decision-making and explore the place subjectivity has in scientific reasoning. It will develop Bayesian methods for data analysis and introduce modern Bayesian simulation, including Markov Chain Monte Carlo and Hamiltonian Monte Carlo. The course balances philosophy, theory, mathematical calculation and analysis of real data, ensuring the student is equipped to use Bayesian methods in future jobs aligned to data analysis and to undertake Bayesian research projects.
Pre-requisites: A basic introduction to probability and to classical statistics, plus experience of a programming language for data science such as R or Python. A preliminary online refresher course covering some basics in probability, integration and likelihood theory, supported by the module leader, is given alongside the first 2 weeks of the module to ensure students have the required knowledge to complete the course.
This module will cover the Bayesian approach to modelling, data analysis and statistical inference. The module describes the underpinning philosophies behind the Bayesian approach, looking at subjective probability theory, subjectivity in science, the notion and handling of prior knowledge, and the theory of decision-making under uncertainty. Bayesian modelling and inference are studied in depth, looking at parameter estimation and inference first in simple models and then in hierarchical models. We explore simulation-based inference in Bayesian analyses and develop important algorithms for Bayesian simulation by Markov Chain Monte Carlo (MCMC), such as the Gibbs sampler, Metropolis-Hastings and Hamiltonian Monte Carlo. We introduce decision theory with Bayes as a route to personalised decision-making under uncertainty. The module aims to teach methods along with the mathematics to demonstrate why they work and the philosophy behind when, why and how they should be used. Unlike versions of this module with mathematics codes (MTH3041/MTHM047), the focus of the assessment is application, understanding and reasoning appropriate for data science students who have not completed a mathematics degree. It is not available to students on mathematics programmes (who may take the mathematics equivalent).
Module Specific Skills and Knowledge:
1. Show understanding of the subjective approach to probabilistic reasoning.
2. Demonstrate an awareness of Bayesian approaches to statistical modelling and inference and an ability to apply them in practice.
3. Demonstrate understanding of the value of simulation-based inference and knowledge of techniques such as MCMC and the theories underpinning them.
4. Demonstrate the ability to apply statistical inference in decision-making.
5. Utilise appropriate software and a suitable computer language for Bayesian modelling and inference from data.
Discipline Specific Skills and Knowledge:
6. Demonstrate understanding, appreciation of and aptitude in the quantification of uncertainty using advanced mathematical modelling.
Personal and Key Transferable/ Employment Skills and Knowledge:
7. Show Bayesian data analysis skills and be able to communicate associated reasoning and interpretations effectively in writing;
8. Apply relevant computer software competently;
9. Use learning resources appropriately;
10. Exemplify self-management and time-management skills.
Introduction: Bayesian vs Classical statistics, Nature of probability and uncertainty, Subjectivism.
Bayesian inference: Conjugate models, Prior and Posterior predictive distributions, Posterior summaries and simulation, Objective and subjective priors, Normal approximation, Bernstein-von Mises results, Bayesian hierarchical models, Bayesian regression and logistic regression.
Bayesian Computation: Monte Carlo, Inverse CDF, Rejection Sampling, Importance Sampling, Markov Chain Monte Carlo (MCMC), The Gibbs sampler, Metropolis-Hastings, Hamiltonian Monte Carlo.
Decision Theory: Bayes’ rule, Decision trees, Utility theory.
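As an illustration of how two of the syllabus topics above connect, the following is a minimal sketch (not module material) that compares a conjugate Beta-Binomial posterior mean with an estimate from a random-walk Metropolis sampler. The data (7 successes, 3 failures), the uniform Beta(1, 1) prior, the step size and the function names are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(theta, successes, failures, a=1.0, b=1.0):
    """Unnormalised log posterior: Beta(a, b) prior with binomial data."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return ((a + successes - 1.0) * math.log(theta)
            + (b + failures - 1.0) * math.log(1.0 - theta))


def metropolis(successes, failures, n_iter=20000, step=0.3, seed=1):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = 0.5  # arbitrary starting value inside (0, 1)
    current = log_posterior(theta, successes, failures)
    samples = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, failures)
        # Accept with probability min(1, posterior ratio);
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


# Illustrative data (assumed): 7 successes and 3 failures.
successes, failures = 7, 3

# Conjugate result: Beta(1 + 7, 1 + 3) posterior, mean a / (a + b).
exact_mean = (1.0 + successes) / (2.0 + successes + failures)

# Simulation-based estimate, discarding the first 2000 draws as burn-in.
draws = metropolis(successes, failures)[2000:]
mcmc_mean = sum(draws) / len(draws)

print(f"conjugate mean: {exact_mean:.3f}, MCMC mean: {mcmc_mean:.3f}")
```

In a conjugate model like this the posterior is available in closed form, so the sampler is unnecessary; the comparison is only a check that simulation recovers the known answer before such methods are applied to models without closed-form posteriors.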
Scheduled Learning & Teaching Activities | 33 | Guided Independent Study | 117 | Placement / Study Abroad |
---|
Category | Hours of study time | Description |
Scheduled learning and teaching activities | 33 | Lectures/practical classes |
Guided independent study | 33 | Post-lecture study and reading |
Guided independent study | 40 | Formative and summative coursework preparation, attempting unassessed problems |
Guided independent study | 44 | Exam revision/preparation |
Form of Assessment | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method |
---|---|---|---|
Practical and theoretical exercises | 11 hours (1 hour each week) | All | Verbal, in class and written on script |
Coursework | 50 | Written Exams | 50 | Practical Exams |
---|
Form of Assessment | % of Credit | Size of Assessment (e.g. duration/length) | ILOs Assessed | Feedback Method |
---|---|---|---|---|
Written exam – Restricted Note (1 A4 sheet (2 sides) of typed or handwritten notes) | 50 | 2 hours (Summer) | 1-7, 9, 10 | Verbal on specific request |
Coursework - practical and theoretical exercises I | 25 | 15 hours | All | Written feedback on script and oral feedback in office hour. |
Coursework - practical and theoretical exercises II | 25 | 15 hours | All | Written feedback on script and oral feedback in office hour. |
Original Form of Assessment | Form of Re-assessment | ILOs Re-assessed | Time Scale for Re-assessment |
---|---|---|---|
Written exam * | Written exam (2 hours) | 1-7, 9, 10 | August Ref/Def period |
Coursework 1 * | Coursework 1 | All | August Ref/Def period |
Coursework 2 * | Coursework 2 | All | August Ref/Def period |
*Please refer to reassessment notes for details on deferral vs. referral reassessment.
Deferrals: Reassessment will be by coursework and/or written exam in the deferred element only. For deferred candidates, the module mark will be uncapped.
Referrals: Reassessment will be by a single written exam worth 100% of the module only. As it is a referral, the mark will be capped at 50%.
The following list is offered as an indication of the type and level of information that you are expected to consult. Further guidance will be provided by the Module Convener.
Reading list for this module:
Type | Author | Title | Edition | Publisher | Year | ISBN |
---|---|---|---|---|---|---|
Set | Gelman, A., Carlin, J., Stern, H., Dunson, D., Vehtari, A. and Rubin, D. | Bayesian Data Analysis | 3rd | CRC | 2013 | |
Set | Lindley, Dennis V. | Making Decisions | 2nd Edition | John Wiley & Sons | 1991 | 9780471908081 |
Set | DeGroot, M.H. | Optimal Statistical Decisions | WCL Ed edition | Wiley-Blackwell | 2004 | 9780471680291 |
Set | Sivia, Devinderjit | Data Analysis: A Bayesian Tutorial | 2nd Edition | Oxford University Press | 2006 | 9780198568322 |
CREDIT VALUE | 15 | ECTS VALUE | |
---|---|---|---|
PRE-REQUISITE MODULES | None |
---|---|
CO-REQUISITE MODULES | None |
NQF LEVEL (FHEQ) | AVAILABLE AS DISTANCE LEARNING | No | |
---|---|---|---|
ORIGIN DATE | Tuesday 12th March 2024 | LAST REVISION DATE | Friday 15th March 2024 |
KEY WORDS SEARCH | None Defined |
---|
Please note that all modules are subject to change. Please get in touch if you have any questions about this module.