1. Introduction

The Integrated Data Service (IDS) is a cloud-based Trusted Research Environment (TRE) developed as a cross-government service.

The IDS has now been brought into the Office for National Statistics (ONS), providing an internal secure cloud-based analytical environment. The data linkage architecture and analytical tools delivered in the service will focus on analysis and research that supports the ONS's priority economic and population statistical outputs.

As a result, the IDS is only available to internal ONS analysts.


2. Available datasets

The Integrated Data Service (IDS) provides access to over 100 datasets. These datasets are available to Office for National Statistics analysts for research and analysis.

You can learn more about the datasets and search by date, theme and keywords.

Access the IDS Data Catalogue


3. Available tooling

The Integrated Data Service (IDS) provides accredited analysts and researchers with a suite of analytical tools.

These are some of the analytical tools currently available in the IDS.

Undertaking analysis

  • Google Vertex AI Workbench instances for Python

  • Google BigQuery to query and manipulate data using Structured Query Language (SQL)

  • Python coding through an integrated development environment (Code OSS)

  • R coding through an integrated development environment (RStudio)

You can request tools when you apply for a project. You will need to specify which users on your project need these tools.

Managing code

  • Source code through Git

  • Packages through Artifactory


4. Access to the Integrated Data Service

Access to the Integrated Data Service (IDS) is only available to Office for National Statistics analysts. External research projects can be carried out using other Trusted Research Environments (TRE), such as the Secure Research Service.

Learn how to become an accredited researcher.


5. Apply for a project

To apply for a project on the Integrated Data Service (IDS), you and your research team must:

  • be internal to the Office for National Statistics

  • have discussed your project requirements with IDS.customer.support@ons.gov.uk and been approved to use the platform


6. Using the Integrated Data Service

The Integrated Data Service (IDS) is a cloud-based data platform for internal Office for National Statistics analysts. This means it stores data online and provides the tools to analyse it.

You do not need to be an expert in our technology to access the IDS. Guidance and tutorials are provided for new users. However, you will need some coding skills to run a successful project, so some prior training in Structured Query Language (SQL) and in R or Python would be beneficial.

The IDS uses several technologies that work through Google Cloud to provide its service.

Once your account has been approved and you are a member of a project, you will have access to the following tools. 

Your workspace

This is the analytical area where you can write code to access data, perform analyses and create graphs, tables or other work outputs.

For each project you are a member of, you will have one or more of the following workspaces: 

  • Vertex AI Workbench – a JupyterLab environment that supports Python
  • RStudio – a virtual version of the RStudio environment supporting R
  • Code OSS – a virtual code editor, similar to VS Code, supporting Python

You can mix your code with formatted text to explain your project using documents called Notebooks in the Vertex AI Workbench, or R Notebooks or R Markdown files in RStudio.

There is more user guidance on how to use your workspace after you have been approved as a member of an IDS project.

Remote browsing

Remote browsing is a security measure used by the IDS. For this, we use Cloudflare Remote Browser Isolation (RBI). In practice, you will see a small blue address bar at the top of your internet browser.

This should happen without any action on your part, but there are some points to note:

  • IDS features will only open within the remote browser
  • other websites will not open within the remote browser
  • copying and pasting into or out of the remote browser is disabled
  • copying and pasting between remote browser windows is allowed, but the clipboard for the remote browser currently needs to be activated by first copying some text outside of the remote browser
  • the remote browser may ask you to sign in a second time 

Storing data

The IDS stores its data in a central database that runs on BigQuery, a data storage platform by Google.

We hold a range of de-identified datasets, which you can view through our data catalogue.

However, accredited researchers only have access to de-identified data that have been approved for a project they are a member of. The metadata (that is, the data about the data) can be viewed using BigQuery Studio.

You can open BigQuery Studio directly through the Hub to inspect your data, or interact with BigQuery through code in your Notebook to retrieve and store data.
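As a rough illustration of interacting with BigQuery through code, the sketch below wraps a query call in a small helper. It assumes the google-cloud-bigquery Python client library is installed in your workspace and that your project and credentials are picked up from the environment; neither is confirmed here, so follow the guidance inside the IDS for the supported approach. The helper name fetch_dataframe and the table name in the usage comment are hypothetical.

```python
def fetch_dataframe(sql, client=None):
    """Run a SQL query and return the result as a pandas DataFrame.

    Hypothetical helper for illustration; not part of the IDS.
    """
    if client is None:
        # Assumption: the google-cloud-bigquery library is available and
        # the project and credentials are taken from the environment.
        from google.cloud import bigquery

        client = bigquery.Client()
    # client.query() starts the query job; to_dataframe() waits for it
    # to finish and loads the rows into a DataFrame.
    return client.query(sql).to_dataframe()


# Example call (hypothetical project, dataset and table names):
# df = fetch_dataframe(
#     "SELECT * FROM `my-project.my_dataset.my_table` LIMIT 10"
# )
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before running it against real data.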

The IDS database uses Structured Query Language (SQL) as the standard way to interact with it. Guidance on this, as well as tutorials and example code, is available inside the IDS.

Interacting with data

To interact with BigQuery, you will need to use Structured Query Language (SQL), a coding language used mainly when interacting with databases.

Pieces of SQL code are often called "queries" because their purpose is to query a database. A query can also manipulate the data, for example by aggregating or filtering it, before returning a result. The output you see is then already processed, which can save memory in your work environment.
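To illustrate filtering and aggregating inside a query, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database, not BigQuery, and the table, columns and values are invented for the example. SQL syntax also differs slightly between database systems.

```python
import sqlite3

# Stand-in for the real database: an in-memory SQLite database.
# The table name, column names and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE survey_responses (region TEXT, year INTEGER, responses INTEGER)"
)
conn.executemany(
    "INSERT INTO survey_responses VALUES (?, ?, ?)",
    [
        ("North", 2022, 10),
        ("North", 2023, 15),
        ("South", 2022, 7),
        ("South", 2023, 9),
        ("South", 2023, 4),
    ],
)

# The query filters rows (WHERE) and aggregates them (GROUP BY) inside
# the database, so only one summary row per region is returned.
query = """
    SELECT region, SUM(responses) AS total
    FROM survey_responses
    WHERE year = 2023
    GROUP BY region
    ORDER BY region
"""
summary = conn.execute(query).fetchall()
print(summary)
```

Five rows are stored, but only two summary rows reach the work environment. With large datasets, this is where the memory saving comes from.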

You will need some skills in SQL coding to effectively access, process and link your data. 

Analysing data

The coding tools currently available for analysing data are R and Python. 

Both are free, open-source programming languages, and plenty of free resources are available for learning them.

You should be comfortable with using these to: 

  • manipulate data
  • perform analysis using code
  • create basic data visualisations 
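As a rough guide to the level of skill involved, the sketch below works through all three tasks on a few invented records, using only the Python standard library. In practice you would more likely use data analysis and plotting packages. Which packages are available inside the IDS is not stated here, so none are assumed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration only.
records = [
    {"region": "North", "age": 34},
    {"region": "North", "age": 51},
    {"region": "South", "age": 28},
    {"region": "South", "age": 45},
    {"region": "South", "age": 62},
]

# Manipulate: group the ages by region.
ages_by_region = defaultdict(list)
for row in records:
    ages_by_region[row["region"]].append(row["age"])

# Analyse: calculate the mean age in each region.
mean_age = {region: mean(ages) for region, ages in ages_by_region.items()}

# Visualise: a basic text bar chart, roughly one '#' per five years.
for region, value in sorted(mean_age.items()):
    print(f"{region:<6} {'#' * round(value / 5)} {value:.1f}")
```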

Storing code

Code is stored and shared on GitHub, a platform for storing code while maintaining version control.

The IDS has a dedicated, private GitHub server that supports complex collaboration and provides backups of any work stored on it. It supports a folder structure and is the only permanent storage area for user code on the IDS.

When a user clones or creates a repository, any changes made to files in their local copy of the repository will be tracked. Each time a user commits and pushes code to their repository, the changes are added to its history, so that old versions of the code can be restored. It also allows different people to work on the same code at the same time and merge their changes later.


7. Contact details

You can contact the Integrated Data Service (IDS) team for more information by email: IDS.customer.support@ons.gov.uk.

Our support hours are Monday to Friday, 9am to 5pm. We aim to respond as soon as possible.
