Course details
Tutors
Aims
This course aims to:
- introduce you to the fundamentals of health data science and the latest advances in the field
- provide you with a robust understanding of the terminology used in the field and the distinctive governance required to protect individuals’ health records while enabling research
- enable you to understand how programming approaches are used to generate insights from health data
Course content
In this course, you’ll learn how to turn real-world health data into trustworthy evidence that can improve services, inform policy, and ultimately support better patient outcomes. You’ll start by building a clear, practical understanding of how healthcare systems operate globally and why data science has become central to modern healthcare decision-making, from evaluating interventions to targeting resources and reducing avoidable harm.
You’ll then tackle one of the most important realities of health data science: working safely and responsibly with sensitive patient information. You’ll learn how secure data access works in practice, how governance is applied, and how to operate confidently within Trusted Research Environments (TREs), using concepts such as the Five Safes to guide ethical, compliant research. By the end of this section, you’ll be able to explain what “safe access” means, what evidence you need to demonstrate good practice, and how to plan work that meets governance expectations.
Next, you’ll move from ideas to execution. You’ll learn how to scope a health data science study, identify what data and metadata you need, assess feasibility, and translate a research question into a clear study protocol. You’ll practise early-stage data exploration and learn how to document requirements in a way that supports efficient approvals and reproducible delivery.
The course culminates in a briefing on the practical steps involved in an open-source, collaborative health data science research project. You’ll adopt and apply the provided code and materials to run a set of exploratory data analyses that assess data completeness, visualise patterns, and support a feasibility-led statistical analysis. You’ll finish by strengthening your professional workflow: organising your code, outputs, and documentation into a reproducible, accessible project structure, and learning how to build maintainable analytical pipelines in an agile development environment.
What to expect on this course
The course is taught in an in-person, immersive classroom format designed to help you learn quickly and confidently through doing, discussing, and applying. Each day blends short, interactive teaching segments with live demonstrations of key methods and tools, followed by guided practical activities where you can immediately try what you’ve learned with support in the room.
You’ll be part of a highly engaged cohort experience: you’ll take part in live polls and quick knowledge checks, work through real-world examples, and discuss practical challenges drawn from healthcare settings. You’re encouraged to share your own perspectives, questions, and experiences. These conversations are a core part of the learning, helping you connect concepts to your context and learn from others across different roles and sectors.
Please bring your own laptop (or equivalent device). You’ll use it to follow demonstrations, participate in interactive voting activities, and complete hands-on exercises. Teaching is structured so you always know what’s coming next: clear explanations, step-by-step walkthroughs, time to practise, and opportunities to ask questions and troubleshoot in real time.
A key benefit of the in-person format is momentum and support: you can build skills faster because you’re learning alongside others, getting immediate feedback, and leaving with a complete, practical workflow you can take back to your work. By the end of the course, you’ll not only understand the core ideas; you’ll also have practised them in an immersive setting and feel ready to apply them with confidence.
Course sessions
- Introduction to Data Science in Healthcare: The fundamentals of data science in different types of healthcare systems across the globe. Concepts of designing a medical study. The importance of data-driven decision-making in improving patient outcomes, and of reproducible approaches in data science.
- Trusted Research Environment: Developing an understanding of safe access to patient data and governance requirements. Principles of the Five Safes, working in Trusted Research Environments (TREs), the steps of project development in a TRE, and the essential tools and software used in a health data science project.
- Health Data Science study design: Requirements for a health data science study. Data exploration and data access requirements, metadata availability, feasibility assessment and study protocols.
- Practical project: Hands-on work focused on exploratory data analysis. Assessing completeness of data, data visualisation, feasibility assessment and correlation analysis.
- Analysis pipeline development: Ensuring that all code and resources for the project are saved in a reproducible and accessible manner, and that the research pipeline is adaptable within an agile research development environment.
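
To give a flavour of the kind of exploratory checks named in the Practical project session, the sketch below computes per-field completeness and a simple correlation. It is an illustration only, not the course's own code or materials: the records are synthetic and invented for this example, the field names (`age`, `systolic_bp`, `bmi`) and the helper functions `completeness` and `pearson` are hypothetical, and the course may well use different tools or libraries.

```python
from math import sqrt

# Synthetic toy records, invented purely for illustration (not real patient data).
# None marks a missing value.
records = [
    {"age": 34, "systolic_bp": 118, "bmi": 22.5},
    {"age": 51, "systolic_bp": 131, "bmi": None},
    {"age": 67, "systolic_bp": 145, "bmi": 29.1},
    {"age": 45, "systolic_bp": None, "bmi": 26.4},
    {"age": 72, "systolic_bp": 150, "bmi": 27.8},
]


def completeness(rows, field):
    """Return the fraction of rows in which `field` is present (not None)."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(field) is not None)
    return present / len(rows)


def pearson(rows, x_field, y_field):
    """Pearson correlation, using only rows where both fields are present."""
    pairs = [
        (row[x_field], row[y_field])
        for row in rows
        if row.get(x_field) is not None and row.get(y_field) is not None
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant field")
    return cov / sqrt(var_x * var_y)


# Completeness per field, then one pairwise correlation on complete cases.
completeness_report = {
    field: completeness(records, field)
    for field in ("age", "systolic_bp", "bmi")
}
age_bp_correlation = pearson(records, "age", "systolic_bp")

for field, fraction in completeness_report.items():
    print(f"{field}: {fraction:.0%} complete")
print(f"age vs systolic_bp correlation: {age_bp_correlation:.3f}")
```

Note that the correlation here is computed on complete cases only, which is one simple choice among several for handling missing values; whether that choice is appropriate depends on why the data are missing, which is itself part of a feasibility assessment.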
Learning outcomes
As a result of the course, you will gain a greater understanding of the subject and you should be able to:
- understand the principles of designing medical studies, data access requirements and the governance for ethical use of patient data in trusted research environments
- develop the structural and computational thinking skills required to apply theoretical knowledge to real-world scenarios, through a hands-on practical project in exploratory data analysis, including assessing data completeness, using data visualisation techniques, and performing feasibility assessments and correlation analysis
Required reading
There is no required reading for this course. See Course materials for supplementary reading once registered.