Contents
- Main points
- Change in Coronavirus (COVID-19) Infection Survey data collection method
- Likelihood of testing positive for COVID-19 by data collection method
- Likelihood of strong positive COVID-19 cases reporting symptoms by data collection method
- Representativeness of the Coronavirus (COVID-19) Infection Survey population sample by data collection method
- Summary of findings
- How the data are measured
- Coronavirus (COVID-19) Infection Survey, Quality Report data
- Collaboration
- Glossary
- Related links
- Cite this article
1. Main points
Following on from our September quality report, we have continued to assess the impact of the change to our data collection method from study worker home visits to remote data collection on our coronavirus (COVID-19) estimates.
We compared the estimated likelihood of testing positive for COVID-19 on a nose and throat swab, as well as the likelihood of people with a strong positive test reporting symptoms, by data collection method while adjusting for several variables. We also assessed the representativeness of our remote data collection and study worker population samples by comparing them with demographic data from the 2021 Census.
In the first week of the period studied (11 to 31 July 2022), participants who provided a swab sample by remote data collection were more likely to test positive compared with those who provided a swab sample with a study worker home visit; however, after this there was no difference between the groups in their likelihood of testing positive.
Analysis using data from 19 July to 1 August 2022 showed that remote data collection led to participants with a strong positive test being 2.7 (95% confidence interval: 2.2 to 3.3) times more likely to report symptoms compared with those who had study worker home visit data collection.
Data from the COVID-19 Infection Survey on the percentage of people with a strong positive test reporting symptoms should not be considered equivalent across the two data collection methods; however, data from both collection methods still provide valuable insights on the most commonly reported symptoms and trends in reported symptoms when analysed separately.
Remote data collection and study worker population samples are representative of the Census 2021 population by sex, age and region.
The demographic profiles of remote data collection and study worker home visit population samples are very similar to each other.
3. Likelihood of testing positive for COVID-19 by data collection method
In our August quality report, we compared modelled estimates of the percentage of people testing positive for coronavirus (COVID-19) on a nose and throat swab by data collection method. The estimates were produced using the same methods as those in our weekly Coronavirus (COVID-19) Infection Survey bulletin, which use a Bayesian multi-level regression post-stratification (MRP) model that adjusts for age, sex and region. More information on the methods used to produce COVID-19 positivity rates in our weekly bulletin can be found in our methods article.
To further assess the impact of the change in how the data are collected, we compared the estimated likelihood of testing positive for COVID-19 on a nose and throat swab by data collection method and calendar date while adjusting for several variables. The analysis is based on regression models similar to those presented in our Analysis of populations in the UK by risk of testing positive for coronavirus (COVID-19) September 2021 publication, which provides a more detailed explanation of the methods used.
The models presented here additionally include an interaction between data collection method and calendar date to test for variation in any effect of data collection method on the likelihood of testing positive for COVID-19 over time.
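To illustrate what an interaction between data collection method and calendar date means on the odds scale, here is a minimal sketch. It assumes a simplified logistic model in which calendar date enters as a linear day count, which may not match how date was modelled in the survey analysis. The coefficient values and the names `B_METHOD`, `B_INTERACTION` and `odds_ratio_on_day` are hypothetical and are not taken from the survey models.

```python
import math

# Hypothetical log-odds coefficients, chosen only for illustration.
B_METHOD = 0.30        # effect of remote collection on day 0
B_INTERACTION = -0.05  # change in that effect per additional day


def odds_ratio_on_day(day: int) -> float:
    """Odds ratio (remote vs study worker visit) on a given day.

    In a logistic model with a method-by-day interaction, the log-odds
    difference between methods on day d is B_METHOD + B_INTERACTION * d.
    """
    return math.exp(B_METHOD + B_INTERACTION * day)


for day in (0, 3, 6):
    print(f"day {day}: odds ratio = {odds_ratio_on_day(day):.2f}")
```

Without the interaction term, the odds ratio between methods would be forced to be the same on every day, so a difference confined to the first week could not be detected.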
The analysis in this section uses data from 11 to 31 July 2022 and includes 4,104 positive results from 89,030 people who provided a nose and throat swab to study workers, and 2,430 positive results from 53,718 people who provided a nose and throat swab via post or courier. Our first regression model allowed us to test the effect of data collection method by calendar date on the likelihood of testing positive for COVID-19 on a nose and throat swab, while controlling for the following demographic variables:
- age
- sex
- geographical region the participant lives in
- ethnicity
- deprivation score
- household size
- whether the household was multigenerational
- urban or rural classification of the participant's address
- effect of a disability (from not having a disability to affected "a lot" by a disability)
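As a rough check on the counts quoted above, the following sketch computes the pooled positivity for each data collection method and a crude (unadjusted) odds ratio with a 95% Wald interval. It assumes each positive result corresponds to a different person, which the text does not state, and it ignores calendar date and all adjustment variables, so it is not a reproduction of the regression models. The function name `crude_odds_ratio` is hypothetical.

```python
import math

# Counts quoted in the text for 11 to 31 July 2022.
HOME_POSITIVE, HOME_PEOPLE = 4_104, 89_030      # study worker home visits
REMOTE_POSITIVE, REMOTE_PEOPLE = 2_430, 53_718  # post or courier


def crude_odds_ratio(pos_a, n_a, pos_b, n_b):
    """Unadjusted odds ratio of group A vs group B with a 95% Wald interval.

    Returns (estimate, lower, upper). Assumes one result per person.
    """
    neg_a, neg_b = n_a - pos_a, n_b - pos_b
    log_or = math.log((pos_a / neg_a) / (pos_b / neg_b))
    se = math.sqrt(1 / pos_a + 1 / neg_a + 1 / pos_b + 1 / neg_b)
    return tuple(math.exp(log_or + z * se) for z in (0.0, -1.96, 1.96))


estimate, lower, upper = crude_odds_ratio(
    REMOTE_POSITIVE, REMOTE_PEOPLE, HOME_POSITIVE, HOME_PEOPLE
)
print(f"remote positivity: {REMOTE_POSITIVE / REMOTE_PEOPLE:.2%}")
print(f"study worker positivity: {HOME_POSITIVE / HOME_PEOPLE:.2%}")
print(f"crude odds ratio: {estimate:.2f} ({lower:.2f} to {upper:.2f})")
```

Pooled over the whole period, the crude interval spans 1. This pooled figure cannot show variation by day, which is what the modelled estimates are designed to test.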
The likelihood of testing positive for COVID-19 for those who provided a nose and throat swab sample remotely, compared with those who provided a swab sample at study worker home visits, by calendar date from 11 to 31 July 2022, is shown in Figure 1.
Results show that between 11 and 17 July 2022, participants who provided a swab sample by remote data collection were more likely to test positive than those who provided a swab sample at a study worker home visit. It is possible that those who provided a swab sample remotely in the initial few days of its launch were different in a way that meant that their risk of testing positive was higher than those who provided a swab sample through a study worker home visit in the same time period. For example, symptomatic participants may have provided a swab sample remotely sooner in these earlier days of the online survey launch than participants with no symptoms so that they could know their infection status.
All participants who provided a swab sample by remote data collection at the beginning of the online survey launch were at the start of their 14-day data collection window. Subsequent samples could be taken at any time during a participant's 14-day data collection window, and invitations to move to remote data collection were staggered, so later samples came from participants at different points in their testing windows. In the first week there was no such overlap between data collection windows, which is why the behaviour of symptomatic participants may have affected the results in that week.
Between 18 and 31 July 2022, there was no statistical evidence of a difference between those who provided a swab sample by remote data collection and those who provided a swab sample at a study worker home visit in the likelihood of testing positive for COVID-19 on a nose and throat swab. This finding supports overall comparability of the results obtained from remote data collection and study worker home visits.
The odds ratios for this analysis are shown in Figure 1. An odds ratio of greater than 1 indicates a greater likelihood of an outcome in the specified group compared with the reference group, and an odds ratio of less than 1 indicates a lower likelihood. In this case, an odds ratio of greater than 1 indicates an increased likelihood of testing positive for COVID-19 for those who provided a swab sample remotely compared with those who provided a swab sample with a study worker home visit. An odds ratio of less than 1 indicates a decreased likelihood of testing positive for COVID-19.
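The reading of an odds ratio against 1 can be sketched as a small helper. This is an illustration only: it assumes the common convention that a 95% confidence interval spanning 1 is read as no statistical evidence of a difference, and the function name and the example intervals are hypothetical rather than values from Figure 1.

```python
def interpret_odds_ratio(lower: float, upper: float) -> str:
    """Describe a 95% interval for an odds ratio relative to 1."""
    if lower > 1:
        return "higher likelihood than the reference group"
    if upper < 1:
        return "lower likelihood than the reference group"
    return "no statistical evidence of a difference"


# Hypothetical intervals, for illustration only.
print(interpret_odds_ratio(1.2, 1.9))
print(interpret_odds_ratio(0.9, 1.1))
```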
Figure 1: There was no statistical evidence of a difference in the likelihood of testing positive for coronavirus (COVID-19) between remote and study worker home visit data collection methods, from 18 to 31 July 2022
Estimated likelihood of testing positive for COVID-19 on nose and throat swabs by day for those who provided a swab sample remotely compared with those who provided a swab sample at a study worker home visit, UK, 11 to 31 July 2022
Notes:
An odds ratio of greater than 1 indicates a greater likelihood of an outcome in the specified group compared with the reference group, and an odds ratio of less than 1 indicates a lower likelihood.
This model controls for age, sex, geographical region the participant lives in, ethnicity, deprivation score, household size, whether the household was multigenerational, urban or rural classification of participant’s address, and the effect of a disability.
Sensitivity analysis was produced using a second regression model, which controlled for the variables mentioned previously, as well as other variables that are associated with COVID-19 positivity, such as COVID-19 vaccinations, previous COVID-19 infection and recent contact with hospitals. When controlling for these additional variables, the results comparing the two data collection methods were very similar. Odds ratios from this model and the previous model can be found in Tables 1a and 1b of the Coronavirus (COVID-19) Infection Survey quality report: December 2022 dataset.
All variables used and variables considered for these models can be found in Section 7: How the data are measured.
4. Likelihood of strong positive COVID-19 cases reporting symptoms by data collection method
This section considers the effect of the data collection method on the reporting of symptoms in people with a strong positive coronavirus (COVID-19) test (cycle threshold (Ct) value less than 30). This analysis uses data from 19 July to 1 August 2022 where both remote and study worker home visit data collection methods were used. Participants across the UK were asked to report whether they experienced the following symptoms in the seven days before they were tested, and separately whether they felt that they had symptoms compatible with a COVID-19 infection in the last seven days:
- fever
- muscle ache (myalgia)
- fatigue (weakness or tiredness)
- sore throat
- cough
- shortness of breath
- headache
- nausea or vomiting
- abdominal pain
- diarrhoea
- loss of taste or loss of smell
Symptoms were self-reported and were not professionally diagnosed.
Among those who tested positive for COVID-19 with a strong positive test, the percentage who reported symptoms was 60% (95% confidence interval: 57% to 62%) for data collected by study worker home visits, and 78% (95% confidence interval: 76% to 80%) for data collected remotely.
To further assess the effect of how the data were collected, we used a logistic regression model to compare the estimated likelihood of reporting symptoms by data collection method, among those with a strong positive test. The model controlled for age, sex, region, ethnicity, long-term health condition, work sector and deprivation score.
The results showed that remote data collection led to participants with a strong positive test being 2.7 (95% confidence interval: 2.2 to 3.3) times more likely to report symptoms compared with those who had study worker home visit data collection.
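To put the 2.7 figure next to the unadjusted percentages reported above (60% and 78%), the sketch below computes both the ratio of percentages and the unadjusted odds ratio. It assumes that the 2.7 estimate is an adjusted odds ratio, as is usual for a logistic regression model; the text does not state this explicitly. The helper name `odds` is hypothetical.

```python
def odds(probability: float) -> float:
    """Convert a probability to odds."""
    return probability / (1 - probability)


# Unadjusted percentages reporting symptoms, as quoted in the text.
HOME_VISIT = 0.60
REMOTE = 0.78

ratio_of_percentages = REMOTE / HOME_VISIT
unadjusted_odds_ratio = odds(REMOTE) / odds(HOME_VISIT)

print(f"ratio of percentages: {ratio_of_percentages:.2f}")
print(f"unadjusted odds ratio: {unadjusted_odds_ratio:.2f}")
```

The two ratios differ because odds grow faster than probabilities when an outcome is common. Under the stated assumption, the modelled estimate of 2.7 is on the odds scale and is best compared with the unadjusted odds ratio rather than with the ratio of percentages.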
There are several potential reasons why the reporting of symptoms may differ by data collection method. This analysis only included participants who tested positive for COVID-19 with a strong positive test, and participants may be more likely to choose to complete the survey and test using remote data collection while experiencing symptoms. In contrast, study worker visits were scheduled independently, so participants did not have the same choice about when they occurred. Other potential reasons may relate to differences in how the questionnaire was interpreted when completed remotely compared with completion with a study worker.
These results show that data from the COVID-19 Infection Survey on the percentage of people with a strong positive test reporting symptoms should not be considered equivalent across the two data collection methods. Data from both collection methods provide valuable insights on the most commonly reported symptoms and trends in reported symptoms among the population when analysed separately.
Analysis of symptoms presented in our Coronavirus (COVID-19) Infection Survey: characteristics of people testing positive for COVID-19, UK publications used a different method. This method considered the percentage of people who reported symptoms at survey visits within 35 days of the first positive test in a positive episode where any test was a strong positive. For this reason, the symptoms analysis presented in this publication is not directly comparable with our previous publications.
6. Summary of findings
The findings presented in this article, as well as findings from our August 2022 Coronavirus (COVID-19) Infection Survey quality report and September 2022 Coronavirus (COVID-19) Infection Survey quality report, indicate that the change to a remote data collection method has had minimal impact on most survey results, including the likelihood of testing positive for COVID-19. However, results show that data from the COVID-19 Infection Survey on the percentage of people with a strong positive test reporting symptoms should not be considered equivalent across the two data collection methods.
7. How the data are measured
Likelihood of testing positive for coronavirus (COVID-19) by data collection method
The models described in Section 3: Likelihood of testing positive for COVID-19 by data collection method test the effect of data collection method by day on the likelihood of testing positive for COVID-19, while controlling for several other variables. Variables controlled for in our first model were:
- age
- sex
- geographical region the participant lives in
- ethnicity
- deprivation score
- household size
- whether the household was multigenerational
- urban or rural classification of the participant's address
- effect of a disability (from not having a disability to affected "a lot" by a disability)
Variables controlled for in our second model were:
- all of the variables controlled for in our first model
- work status (responses were grouped into "Employed, working", "Employed, not working", "Not working", "Retired" and "Child/student")
- whether the participant was previously infected with COVID-19 based on a positive swab test (in the survey, the English national testing programme or self-reported)
- whether the participant had travelled abroad in the previous 28 days
- COVID-19 vaccinations
- contact with hospitals in the previous 28 days
- contact with care homes in the previous 28 days
- whether the participant currently smoked
Additional variables considered for the model that were not included were:
- whether a child aged 16 years or under lived in the household
- whether an adult aged 70 years or over lived in the household
- days worked outside the home
- whether the participant worked in a patient-facing healthcare role, a health and social care role or a care home
- whether the participant worked in a role that involves direct contact with others
- work sector
- work or school location (at home or elsewhere)
- social distancing at work or school
- how the participant travels to work or school
These variables were not included in the model because our screening process revealed no statistical evidence of association between them and the likelihood of testing positive for COVID-19.
9. Collaboration
The Coronavirus (COVID-19) Infection Survey analysis was produced by the Office for National Statistics (ONS) in collaboration with our research partners at the University of Oxford, the University of Manchester, UK Health Security Agency (UK HSA) and Wellcome Trust. Of particular note are:
- Sarah Walker - University of Oxford, Nuffield Department for Medicine: Professor of Medical Statistics and Epidemiology and Study Chief Investigator
- Koen Pouwels - University of Oxford, Health Economics Research Centre, Nuffield Department of Population Health: Senior Researcher in Biostatistics and Health Economics
- Thomas House - University of Manchester, Department of Mathematics: Reader in Mathematical Statistics
10. Glossary
Deprivation
Deprivation is based on an index of multiple deprivation (IMD) score or equivalent scoring method for the devolved administrations, from 1, which represents most deprived, up to 100, which represents least deprived. The hazard or odds ratio shows how a 10-unit increase in deprivation score, which is equivalent to 10 percentiles or 1 decile, affects the likelihood of testing positive for COVID-19.
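As an illustration of how an odds ratio expressed per 10-unit increase can be rescaled to other increments, here is a minimal sketch. It assumes deprivation score enters the model as a linear term on the log-odds scale, so that odds ratios multiply across increments. The value 0.95 and the function name `scale_odds_ratio` are hypothetical and are not survey estimates.

```python
def scale_odds_ratio(odds_ratio_per_10: float, units: float) -> float:
    """Rescale an odds ratio quoted per 10-unit increase to another increment.

    Assumes a linear effect on the log-odds scale.
    """
    return odds_ratio_per_10 ** (units / 10)


HYPOTHETICAL_OR_PER_10 = 0.95  # illustration only, not a survey estimate

for units in (10, 50, 90):
    scaled = scale_odds_ratio(HYPOTHETICAL_OR_PER_10, units)
    print(f"{units}-unit increase: odds ratio = {scaled:.3f}")
```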
SARS-CoV-2
This is the scientific name given to the specific virus that causes COVID-19.
Effect of a disability
To measure how severely a disability affected participants, we asked them if any long-lasting health conditions reduced their ability to carry out day-to-day activities, as part of our Coronavirus (COVID-19) Infection Survey questionnaire. The response options for this question were: "Yes, a lot", "Yes, a little" or "Not at all".
Odds ratio
An odds ratio indicates the likelihood of an individual testing positive for COVID-19 given a particular characteristic or variable. When a characteristic or variable has an odds ratio of 1, this means there is neither an increase nor a decrease in the likelihood of testing positive for COVID-19 compared with the reference category. An odds ratio greater than 1 indicates an increased likelihood of testing positive for COVID-19 compared with the reference category. An odds ratio less than 1 indicates a decreased likelihood of testing positive for COVID-19 compared with the reference category.
Confidence interval
A confidence interval gives an indication of the degree of uncertainty of an estimate, showing the precision of a sample estimate. The 95% confidence intervals are calculated so that if we repeated the study many times, 95% of the time the true unknown value would lie between the lower and upper confidence limits. A wider interval indicates more uncertainty in the estimate. Overlapping confidence intervals indicate that there may not be a true difference between two estimates. For more information, see our methodology page on statistical uncertainty.
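The overlap check described here can be sketched as follows, using the 95% confidence intervals reported in Section 4 for the percentage of strong positive cases reporting symptoms. The function name `intervals_overlap` is hypothetical, and the check is an informal guide rather than a formal statistical test.

```python
def intervals_overlap(first: tuple, second: tuple) -> bool:
    """Return True if two (lower, upper) intervals share any values."""
    return first[0] <= second[1] and second[0] <= first[1]


# 95% confidence intervals (percent) quoted in Section 4.
HOME_VISIT_CI = (57, 62)
REMOTE_CI = (76, 80)

print(intervals_overlap(HOME_VISIT_CI, REMOTE_CI))
```

The two intervals do not overlap, which is consistent with the difference in symptom reporting between data collection methods found by the regression model.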
Cycle threshold (Ct) values
The strength of a positive coronavirus (COVID-19) test is determined by how quickly the virus is detected, measured by a cycle threshold (Ct) value. The lower the Ct value, the higher the viral load and the stronger the positive test. Positive results with a high Ct value can be seen in the early stages of infection when virus levels are rising, or late in the infection, when the risk of transmission is low.
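The strong positive definition used in Section 4 (a Ct value less than 30) can be expressed as a simple filter. This is a sketch only: the function name and the example Ct values are hypothetical, and the survey's laboratory processing involves more than this single threshold.

```python
STRONG_POSITIVE_CT_THRESHOLD = 30  # strong positive: Ct value less than 30


def is_strong_positive(ct_value: float) -> bool:
    """Return True if a positive test's Ct value counts as a strong positive."""
    return ct_value < STRONG_POSITIVE_CT_THRESHOLD


# Hypothetical Ct values for positive tests, for illustration only.
example_ct_values = [18.4, 29.9, 30.0, 34.2]
strong = [ct for ct in example_ct_values if is_strong_positive(ct)]
print(strong)
```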
12. Cite this article
Office for National Statistics (ONS), published 21 December 2022, ONS website, methodology article, Coronavirus (COVID-19) Infection Survey, Quality Report: December 2022