1. Main points
For Census 2021, households were sent either an initial contact letter containing an online access code ("online-first"), or a paper questionnaire ("paper-first"); we determined the Lower layer Super Output Areas (LSOAs) that would be treated as paper-first using two hard-to-count indices, Hard-to-Count Willingness and Hard-to-Count Digital.
The Digital Propensity Index (DPI) is a unique, rich, and relevant data source of digital uptake for LSOAs across England and Wales.
The DPI combines, with associated measures of uncertainty, the actual online share of responses observed for online-first LSOAs and the predicted online share of responses for paper-first LSOAs, if they had been online-first.
Of the responding households that were sent a paper questionnaire (including an access code to respond online) in Census 2021, 46.4% responded online; the DPI models predict that this proportion would have been 92.5% had these households been online-first.
The DPI, a measure of how confident households are using government online resources, is an alternative to the Hard-to-Count Digital Index which was used in planning Census 2021.
These results potentially imply that future research could more confidently take an online approach to data collection, although we recognise that not everyone will be capable of responding, or willing to respond, online.
There is always uncertainty and error in a model's predictions. Read how we have accounted for this in "Interpreting the data" in Section 7: Data sources and quality.
2. About the online-first census
Census 2021 was the first online-first census in England and Wales, with all households receiving a unique access code (UAC) to complete the census online. Most households received only a UAC and are referred to as online-first areas. However, to maximise response, 11% of households (50% of Welsh households and 9% of English households) first received a paper questionnaire and a UAC. These are referred to as paper-first areas.
We identified Lower layer Super Output Areas (LSOAs) rather than individual addresses as needing to be paper-first. This is because we recognise there will be some similarity between an area's households and residents. Additionally, the decision about which areas would be paper-first was based on data at 2011 LSOA level. Specifically, paper-first areas were those where the take-up of the online option was expected to be low, but willingness to take part without further prompts (reminder letters or field visits) was high.
Online share
Census 2021 had quality targets specifically linked to it being the first online-first census. We exceeded a target of achieving at least 75% of responses to Census 2021 online. In total, 88.9% of responding households in England and Wales chose to respond online.
Specifically, paper-first LSOAs had an online share of 46.4%, whereas online-first LSOAs had an online share of 94.2%.
However, the Census 2021 collection design influenced the online share. The online share is the proportion of responding households that have responded online. It is not the proportion of all households in England and Wales (including non-responding households) that have responded online.
These figures are not about overall response, and a high digital uptake in an area does not necessarily correspond to a high overall response rate. For example, urban areas often had a lower overall response for Census 2021, but a higher proportion of the responding households in these areas completed an online census form.
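The distinction between online share and overall response can be illustrated with a short Python sketch. The function names and the household counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not Census 2021 figures.

```python
def online_share(online: int, paper: int) -> float:
    """Proportion of RESPONDING households that responded online."""
    responding = online + paper
    if responding == 0:
        raise ValueError("no responding households")
    return online / responding


def response_rate(online: int, paper: int, households: int) -> float:
    """Proportion of ALL households that responded by any mode."""
    return (online + paper) / households


# Hypothetical urban area: 1,000 households, 800 respond, 760 of them online.
urban_share = online_share(760, 40)
urban_response = response_rate(760, 40, 1000)

# Hypothetical rural area: 1,000 households, 950 respond, 760 of them online.
rural_share = online_share(760, 190)
rural_response = response_rate(760, 190, 1000)

print(f"urban: online share {urban_share:.2f}, response rate {urban_response:.2f}")
print(f"rural: online share {rural_share:.2f}, response rate {rural_response:.2f}")
```

In this made-up example the urban area has the higher online share but the lower overall response rate, which is the pattern described above.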
Read more about the share of households responding online, and how the collection design influenced it, in our Designing a digital-first census article.
3. About the Digital Propensity Index
We created the 2021 Digital Propensity Index (DPI) to give a relative measure of digital propensity at Lower layer Super Output Area (LSOA) level across England and Wales. This adds value to Census 2021 data by giving anyone planning digital services or researching digital deprivation a tool they can use.
To do this, we predicted the online share for paper-first areas had they been online first. In doing so, we provided a comparative measure of digital propensity at LSOA level across England and Wales, independent of the census paper strategy. Combining these modelled predictions with the actual online shares for online-first areas gives us DPI scores for every LSOA (as defined following the 2011 Census) and local authority in England and Wales.
You can now download the DPI along with associated measures of uncertainty at LSOA level.
5. Digital Propensity Index data
Digital Propensity Index for Census 2021 at Lower layer Super Output Areas (LSOAs), England and Wales
Dataset | Released 8 February 2023
Digital Propensity Index scores and associated confidence intervals for LSOAs as defined in 2011 in England and Wales.
Digital Propensity Index for Census 2021 at local authority, region and country level, England and Wales
Dataset | Released 8 February 2023
Digital Propensity Index scores for local authorities, as defined in December 2020 and December 2021, regions and countries in England and Wales.
6. Glossary
Digital Propensity Index
A measure of how confident households are using government online resources.
Hard-to-Count Index
A measure of the relative willingness of residents in an area to respond to the census without further prompts and the relative likelihood that they will respond online. The Hard-to-Count Index is made up of the Hard-to-Count Digital Index and the Hard-to-Count Willingness Index, both of which are used as variables in the model.
Lower layer Super Output Area (LSOA)
Lower layer Super Output Areas (LSOAs) are made up of groups of Output Areas (OAs), usually four or five. They comprise between 400 and 1,200 households and have a usually resident population of between 1,000 and 3,000 persons. They were first created following the 2001 Census and may change after each census.
Online-first area
LSOAs where households first received only a letter with an access code to complete the census online.
Paper-first area
LSOAs where households first received a paper questionnaire and an access code to complete the census online.
Binomial logistic regression
A logistic regression model for a binomial dependent variable with y successes out of n trials and one or more independent variables.
7. Data sources and quality
Measuring the data
We used a binomial logistic regression for the Lower layer Super Output Area (LSOA) predictions. This is because it allowed us to model how multiple independent variables affect the likelihood that households will respond online. We created two models at LSOA level, an English and a Welsh model, to account for the respective indices of multiple deprivation from each country.
We used the Census 2021 online share for online-first areas as the dependent variable because we aimed to predict the online share of returns. The dependent variable is not binary, in the sense that it is not on a 0/1 scale. However, it is binomial, in that y is defined as the number of households in each LSOA responding online and n is defined as the total number of responding households in each LSOA.
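Written out, a binomial logistic regression of this kind takes the following general form. The symbols p_i, beta and x_ik are notation introduced here for illustration (they do not appear in the article): i indexes LSOAs, p_i is the probability that a responding household in LSOA i responds online, and x_ik are the independent variables.

```latex
y_i \sim \mathrm{Binomial}(n_i,\, p_i),
\qquad
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i}
  = \beta_0 + \sum_{k} \beta_k x_{ik}
```

The fitted online share for an LSOA is then the inverse logit of the linear predictor, which always lies between 0 and 1.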
The independent variables included in the models are:
age - the proportion of household reference persons in the age group 65 years and over at LSOA level from the 2011 Census data; the variable was logit transformed to better meet model assumptions, and 2011 Census data were used because, at the time of modelling, age was not available at LSOA level from Census 2021
Hard-to-Count Digital Index (HtC D) - an index from 1 to 5 (5 being the hardest to count), showing the relative propensity of households in an area to respond to Census 2021 online (HtC D 1 is the reference category)
Hard-to-Count Willingness Index (HtC W) - an index from 1 to 5 (5 being the hardest to count), showing the relative willingness of households in an area to respond to Census 2021 within 10 days after Census Day (HtC W 1 is the reference category)
urban/rural classification - urban/rural as defined by the Official Statistics 2011 Rural Urban Classification, which states whether the LSOA is classed as urban or rural (rural is the reference category)
region - included only in the English model, region as defined by Eurostat's Nomenclature of Territorial Units for Statistics (NUTS) in the UK (London is the reference category)
English Index of Multiple Deprivation (IMD) - included only in the English model, an index from 1 to 10, showing the relative deprivation for an LSOA (IMD 1 is the reference category)
Welsh Index of Multiple Deprivation (WIMD) - included only in the Welsh model, an index from 1 to 10, showing the relative deprivation for an LSOA (WIMD 1 is the reference category)
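Two of the preparation steps implied by this list, the logit transform of the age proportion and the coding of an index against a reference category, can be sketched in Python as follows. The helper names `logit` and `dummy_code` and the input values are hypothetical; the article does not describe how the variables were coded in software.

```python
import math


def logit(p: float) -> float:
    """Map a proportion strictly between 0 and 1 onto the logit scale."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def dummy_code(level: int, levels: range, reference: int) -> list[int]:
    """One indicator per non-reference level; the reference level is all zeros."""
    return [1 if level == k else 0 for k in levels if k != reference]


# Hypothetical LSOA: 25% of household reference persons aged 65 years and over.
age_term = logit(0.25)

# Hypothetical LSOA in HtC D category 3, with category 1 as the reference.
htc_d = dummy_code(3, range(1, 6), reference=1)

print(age_term, htc_d)
```

With this coding, each fitted coefficient for an index category is read relative to the reference category.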
We created and trained both models using the data from the online-first areas. Then, we applied the models to the respective paper-first areas to produce the predicted online share for paper-first areas, had they been online first.
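The train-then-predict workflow can be sketched in plain Python as below. This is an illustration only, not the method used for the published models: it fits a single hypothetical covariate by Newton-Raphson, whereas the article's models use several independent variables and would normally be fitted with a statistical package. All data values and function names are made up.

```python
import math


def expit(z: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_binomial_logit(x, y, n, iterations=25):
    """Fit logit(p_i) = b0 + b1 * x_i to y_i successes out of n_i trials
    by Newton-Raphson on the binomial log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, y, n):
            p = expit(b0 + b1 * xi)
            r = yi - ni * p           # score contribution
            w = ni * p * (1.0 - p)    # information weight
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information matrix) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical online-first LSOAs (training data).
x_train = [-1.5, -1.0, -0.5, 0.0]   # e.g. logit of the share aged 65 and over
n_train = [600, 650, 700, 620]      # responding households
y_train = [576, 605, 616, 508]      # households responding online

b0, b1 = fit_binomial_logit(x_train, y_train, n_train)

# Hypothetical paper-first LSOA: predicted online share had it been online-first.
predicted_share = expit(b0 + b1 * 0.5)
print(b0, b1, predicted_share)
```

The key design point is that the coefficients are estimated only from areas where the online-first treatment was actually observed, and are then applied to the covariates of areas where it was not.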
We used all the Welsh online-first areas as the sample for the Welsh model. The sample size for the Welsh model was 934 LSOAs.
We used the variable region in the English model. To make the sample more representative of what we were predicting, we made the sample match the regional breakdown of the paper-first areas. For example, 26.3% of paper-first areas in England are from the West Midlands, so 26.3% of the online sample we used in the model were from the West Midlands.
To achieve this regional representation, the sample size for the English model was 9,998 LSOAs of the 29,835 online-first LSOAs in England. Sensitivity analyses were conducted to ensure that the sample of LSOAs selected for the English model had minimal impact on the final predictions.
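One way to draw a sample whose regional breakdown matches a set of target shares is stratified random sampling, sketched below. The function name, the LSOA codes, the two-region setup and the target shares are all hypothetical; the article does not state how the English sample was drawn in practice.

```python
import random
from collections import defaultdict


def sample_matching_shares(online_first, target_shares, sample_size, seed=0):
    """Draw LSOA codes so each region's share of the sample matches target_shares.

    online_first: list of (lsoa_code, region) pairs for online-first LSOAs.
    target_shares: region -> share of paper-first LSOAs in that region.
    Raises ValueError if a region has too few online-first LSOAs.
    """
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for code, region in online_first:
        by_region[region].append(code)
    sample = []
    for region, share in target_shares.items():
        k = round(share * sample_size)
        sample.extend(rng.sample(by_region[region], k))
    return sample


# Hypothetical pool: 50 online-first LSOAs in one region, 150 in another.
pool = [(f"A{i}", "West Midlands") for i in range(50)]
pool += [(f"B{i}", "London") for i in range(150)]

# Hypothetical regional breakdown of the paper-first areas.
targets = {"West Midlands": 0.25, "London": 0.75}

sample = sample_matching_shares(pool, targets, sample_size=40)
print(len(sample), sum(code.startswith("A") for code in sample))
```

Because each region can only contribute as many LSOAs as it has online-first areas, matching the target shares generally means using a subset of the available online-first LSOAs rather than all of them.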
We carried out checks to ensure the assumptions underpinning the models were met. The checks we used were:
Cook's distance, to ensure no LSOAs were having an undue impact on the final predictions
Variance Inflation Factors (VIFs), to ensure there was no severe multicollinearity
linearity of continuous independent variables on the logit scale
residual plots
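As a simplified illustration of the multicollinearity check, the sketch below computes the VIF for the special case of exactly two predictors, where it reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The general VIF regresses each predictor on all of the others; the two-predictor case, the function names and the values are assumptions made here to keep the example short.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has only these two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values for four LSOAs.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 3.0, 2.0, 4.0]

vif = vif_two_predictors(x1, x2)
print(vif)
```

A VIF close to 1 indicates little shared variation between predictors; larger values indicate stronger collinearity.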
Then, we used the online share from online-first areas and the predicted online share from paper-first areas to create the Digital Propensity Index (DPI).
The LSOA online shares were also aggregated to produce local authority and regional results for quality assurance and publishing. This was done by calculating a weighted average that takes into account the number of households in each LSOA, rather than simply averaging the LSOA predictions in each local authority.
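The difference between a household-weighted average and a simple average can be seen in a small Python sketch. The function name and the figures are hypothetical, and the article does not specify exactly which household count was used as the weight.

```python
def aggregate_online_share(lsoas):
    """Household-weighted average of LSOA online shares.

    lsoas: list of (online_share, households) pairs for one local authority.
    """
    total_households = sum(households for _, households in lsoas)
    return sum(share * households for share, households in lsoas) / total_households


# Hypothetical local authority made up of two LSOAs of different sizes.
local_authority = [(0.95, 800), (0.60, 400)]

weighted = aggregate_online_share(local_authority)
simple = sum(share for share, _ in local_authority) / len(local_authority)

print(f"weighted: {weighted:.4f}, simple: {simple:.4f}")
```

Here the larger LSOA pulls the weighted result above the simple average, so the aggregate reflects households rather than treating every LSOA as equally sized.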
Interpreting the data
When interpreting the results, it is important to remember that 11% of the LSOAs across England and Wales were paper first. As such, 11% of the LSOA results are predictions produced from the models. There is always some level of error and uncertainty in a model's predictions as the model is limited by the data used, and not all factors can be considered.
To reflect this, we have published confidence intervals (CIs) with the paper-first predictions and calculated coefficients of variation (CVs). These CIs and CVs show how much uncertainty there is in the modelled values, and the CIs give a range of values around the estimate that is likely to contain the correct value. Note that it is not possible to capture all forms of uncertainty in the confidence intervals. For example, the confidence intervals for the English model do not capture the uncertainty introduced from using a subset of LSOAs in the final model. Table 6 shows the CV results from each model. Often a CV under 20% is acceptable and the highest CV for both models is below 20%, with the mean well below 20%.
Statistic | Welsh model | English model |
---|---|---|
Highest CV | 6.7% | 1.2% |
Mean CV | 2.7% | 0.7% |
Median CV | 2.8% | 0.7% |
Table 6: The coefficient of variation (CV) results from the predictions of the Welsh model and English model, England and Wales LSOAs, Census 2021
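For readers unfamiliar with the measure, a coefficient of variation is commonly calculated as the standard error of an estimate divided by the estimate, expressed as a percentage. The sketch below uses that common definition; the article does not spell out the exact calculation behind the published CVs, and the input values are hypothetical.

```python
def coefficient_of_variation(estimate: float, standard_error: float) -> float:
    """CV as a percentage: 100 * standard error / estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return 100.0 * standard_error / estimate


# Hypothetical prediction: online share of 0.90 with a standard error of 0.018.
cv = coefficient_of_variation(0.90, 0.018)
acceptable = cv < 20.0  # the rule of thumb quoted in the text

print(f"CV: {cv:.1f}%, under 20%: {acceptable}")
```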
8. Future developments
These data on the relative digital propensity of households at Lower layer Super Output Area (LSOA) level across England and Wales are useful for research and for planning digital services. These results potentially imply that future research could more confidently take an online approach to data collection, which is important as more of our surveys move online. However, we recognise that not everyone will be capable of responding, or willing to respond, online.
10. Cite this article
Office for National Statistics (ONS) released 8 February 2023, ONS website, article, Digital Propensity Index for England and Wales LSOAs: Census 2021