What is data science?
Data science sits at the intersection of statistics, computer science and deep business or subject knowledge.
This data science Venn diagram, adapted from Drew Conway, visualises how the disciplines that make up data science interact.
The intersection between:
- Maths and substantive expertise is traditional research - the systematic exploration of a subject
- Coding and maths is machine learning - the automation of statistical processes for ‘learning’
- Coding and substantive expertise is a danger zone - data science techniques can be used incorrectly if they are not supported by a rigorous understanding of mathematics
A data scientist may not be an expert in all of these areas. However, they will require a working knowledge of all disciplines and will often work closely with other experts in a multidisciplinary team.
What can you do with data science?
Visualisation
Data visualisation means presenting data in a digestible and interpretable way. It helps people understand what the data shows and communicate insights to others.
Analysis
As summarised by Microsoft Azure, there are five basic types of question that data science can be used to answer:
| Question | Type of Algorithm | Description |
| --- | --- | --- |
| Is it A or B? | Classification | Grouping data into predefined categories |
| Is this unusual? | Anomaly detection | Identifying unexpected or unusual events or behaviours |
| How much or how many? | Regression | Making numerical predictions |
| How is this data organised? | Clustering | Understanding the structure of the data |
| What to do next? | Reinforcement learning | Making decisions within an environment where correct actions are rewarded and future decisions improve |
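To make two of these question types concrete, here is a minimal Python sketch using only the standard library. It is an illustration rather than a recommended method: the function names `flag_anomalies` and `fit_line`, the threshold of 2 standard deviations and all of the figures are hypothetical, chosen only to show the idea. The first function answers "Is this unusual?" with a simple z-score rule, and the second answers "How much or how many?" with a straight-line least squares fit.

```python
from statistics import mean, pstdev


def flag_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean.

    A simple z-score rule; the threshold of 2.0 is an assumption for this sketch.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing is unusual
    return [v for v in values if abs(v - mu) / sigma > threshold]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# "Is this unusual?" Hypothetical daily counts of applications received.
daily_counts = [102, 98, 101, 99, 100, 103, 97, 250]
print(flag_anomalies(daily_counts))  # [250]

# "How much or how many?" Hypothetical caseload for months 1 to 5.
months = [1, 2, 3, 4, 5]
cases = [110, 120, 130, 140, 150]
slope, intercept = fit_line(months, cases)
print(slope * 6 + intercept)  # 160.0, the predicted caseload for month 6
```

In practice these tasks are usually handled with dedicated statistics or machine learning libraries and checked against held-out data. The sketch only shows how each question maps to a calculation.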
How is data science being used in government?
The adoption of data science is enabling government to unlock the value of the data it holds.
It is being used to:
- visualise and understand data, e.g. presenting NEET (Not in Education, Employment or Training) status among people aged 16-24
- build tools that help policymakers access and use information, e.g. DWP’s ‘Churchill’ app
- carry out analysis that helps improve a service or drive efficiency, e.g. identifying potential savings for the NHS
The Open Innovation Team is identifying opportunities to increase academic engagement in support of data science projects across government. If you would like to get in touch with us about our data science work, you can email us at openinnovation@cabinetoffice.gov.uk or follow us on Twitter @openinnovteam.
If you want to explore data science further, this post from the Government Digital Service outlines their approach to data science projects and this blog gives some suggestions on how to get started.