Data Science: with a focus on healthcare

This course offers an introductory exploration of the core principles of data science. Through real-world examples, we will explore the inherent value of information and its potential. We will delve into the realm of big data, focusing on its significance within the context of healthcare data. The course will cover the latest advancements in data science approaches, as well as innovations in information storage and processing. You will be introduced to fundamental terminologies in the field, empowering you to engage effectively in data science environments. We will follow a journey from the foundational concepts to more advanced levels. No specific background is required or assumed.

Course details

Start Date
28 Jul 2024
Duration
5 Sessions over one week
End Date
3 Aug 2024
Application Deadline
23 Jun 2024
Location
International Summer Programme
Code
W45Pm30

Tutors

Dr Fatemeh Torabi

Assistant Professor of Health Data Science

Aims

This course aims to: 

  • introduce you to the fundamentals of health data science and the latest advancements in the field
     
  • provide you with a robust understanding of terminology used in the field and the distinctive governance required for protection of individuals’ health records while enabling research 
     
  • enable you to gain an understanding of working with programming approaches to generate insights from health data

Content

This course provides a comprehensive exploration of the principles of data science within the context of global healthcare systems, commencing with foundational concepts such as healthcare systems, health data at scale, health research and the pivotal role of data-driven decision-making in enhancing patient outcomes. Participants will engage in an in-depth examination of ensuring secure access to patient data and adherence to governance standards, employing methodologies such as the five-safes framework and Trusted Research Environments (TRE). Through practical sessions, they will acquire proficiency in navigating data exploration, data access requirements, feasibility assessments, and the development of study protocols. The course also provides a hands-on project focusing on fundamentals of exploratory data analysis, followed by guidance on reproducible project management practices.

The course will begin with an exploration of the fundamentals of data science in various types of healthcare systems across the globe. We will look at the concepts of designing a medical study, the importance of data-driven decision-making in improving patient outcomes and reproducible approaches in data science.

Next, we will develop an understanding of how to ensure safe access to patient data and adherence to governance requirements. This will be supplemented by the following: the principles of the five-safes framework, working in Trusted Research Environments (TREs), the steps of project development in a TRE, and the essential tools and software used in a health data science project.

The third session will focus on the requirements for a health data science study. We will look at data exploration and data access requirements, metadata availability, feasibility assessment and study protocols.

The fourth session involves a hands-on practical project, focused on exploratory data analysis. We will look at assessing completeness of data, data visualisation, feasibility assessment and statistical analysis.
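As a rough illustration of what assessing completeness and a simple statistical analysis can look like, here is a minimal Python sketch. It is not course material: the records, the field names (`age`, `systolic_bp`, `bmi`) and the helper functions `completeness` and `pearson` are all hypothetical, and the values are invented toy data rather than real patient data.

```python
from math import sqrt

# Hypothetical toy records (invented values); None marks a missing entry.
records = [
    {"age": 34, "systolic_bp": 118, "bmi": 22.1},
    {"age": 51, "systolic_bp": 131, "bmi": None},
    {"age": 67, "systolic_bp": 145, "bmi": 29.4},
    {"age": 45, "systolic_bp": None, "bmi": 26.0},
]


def completeness(rows, field):
    """Fraction of rows in which `field` is present and not None."""
    present = sum(1 for row in rows if row.get(field) is not None)
    return present / len(rows)


def pearson(rows, x, y):
    """Pearson correlation of x and y over rows where both are recorded."""
    pairs = [
        (row[x], row[y])
        for row in rows
        if row.get(x) is not None and row.get(y) is not None
    ]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / sqrt(var_x * var_y)


# Completeness per field, then one correlation on the complete pairs only.
report = {f: completeness(records, f) for f in ("age", "systolic_bp", "bmi")}
age_bp_r = pearson(records, "age", "systolic_bp")

for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
print(f"age vs systolic_bp: r = {age_bp_r:.3f}")
```

Dropping incomplete pairs before computing the correlation is only one possible way to handle missing values; whether it is appropriate depends on why the data are missing.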

Building on the previous session, the focus on the last day is to ensure that all code and resources generated for the small project are saved in a reproducible and accessible manner. We will review recommendations for generating a reproducible pipeline in an agile development environment.

Presentation of the course 

The course will take place in a classroom setting, using interactive presentation tools to aid with demos of technical methods. It is highly recommended that you bring your personal laptop or IT equipment so that you can take part in the live voting and hands-on work with tools. Students will be encouraged to contribute to discussions in the classroom by offering opinions, experiences and observations.

Course sessions

  1. Introduction to Data Science in Healthcare 
    The fundamentals of data science in various types of healthcare systems across the globe. Concepts of designing a medical study. The importance of data-driven decision-making in improving patient outcomes and reproducible approaches in data science.
     
  2. Trusted Research Environment 
    Developing an understanding of safe access to patient data and governance requirements. Principles of five-safes, working in the Trusted Research Environments (TRE), steps of project development in a TRE, essential tools and software used in a health data science project.
     
  3. Health Data Science study design 
    Requirements for a health data science study. Data exploration and data access requirements, metadata availability, feasibility assessment and study protocols.
     
  4. Practical project 
    A hands-on session focused on exploratory data analysis. Assessing completeness of data, data visualisation, feasibility assessment and correlation analysis.
     
  5. Analysis pipeline development 
    Ensuring that all code and resources for the project are saved in a reproducible and accessible manner and that the research pipeline is adaptive in an agile research development environment.

Learning outcomes

You are expected to gain from this series of classroom sessions a greater understanding of the subject and of the core issues and arguments central to the course. 

The learning outcomes for this course are:

  • to gain an understanding of the principles of designing medical studies, data access requirements and the governance for ethical use of patient data in trusted research environments 
     
  • to develop the structural and computational thinking skills required to apply theoretical knowledge to real-world scenarios, through a hands-on practical project in exploratory data analysis, including assessing data completeness, using data visualisation techniques, and performing feasibility assessments and correlation analysis

Required reading

There are no compulsory readings for this course. 

Typical week: Monday to Friday 

Courses run from Monday to Friday. For each week of study, you select a morning (Am) course and an afternoon (Pm) course. The maximum class size is 25 students.   

Courses are complemented by a series of daily plenary lectures, exploring new ideas in a wide range of disciplines. To add to your learning experience, we are also planning additional evening talks and events. 

c.7.30am-9.00am  Breakfast in College (for residents)  
9.00am-10.30am  Am Course  
11.00am-12.15pm  Plenary Lecture  
12.15pm-1.30pm  Lunch 
1.30pm-3.00pm  Pm Course  
3.30pm-4.45pm  Plenary Lecture/Free 
6.00pm/6.15pm-7.15pm Dinner in College (for residents)  
7.30pm onwards Evening talk/Event/Free  

Evaluation and Academic Credit  

If you are seeking to enhance your own study experience, or earn academic credit from your Cambridge Summer Programme studies at your home institution, you can submit written work for assessment for one or more of your courses.  

Essay questions are set and assessed against the University of Cambridge standard by your Course Director. A list of essay questions can be found in the Course Materials. Essays are submitted two weeks after the end of each course, so those studying for multiple weeks need to plan their time accordingly. There is an evaluation fee of £75 per essay. 

For more information about writing essays, see Evaluation and Academic Credit.

Certificate of attendance 

A certificate of attendance will be sent to you electronically after the programme.