1. Main changes

  • Trends in population size, ageing and mortality rates are accounted for by the new method for estimating the expected number of deaths used in the calculation of excess mortality (the difference between the actual and expected number of deaths); this is not the case for the current method, which uses a simple five-year average to estimate the number of expected deaths.

  • Individual weeks and months that were substantially affected by the immediate mortality impact of the coronavirus (COVID-19) pandemic are removed from the data when estimating expected deaths in subsequent periods, whereas the current approach involves removing data for the whole of 2020.

  • Use of a statistical model means that multiple demographic, trend, seasonal and calendar effects can be included simultaneously in the estimation of expected deaths, and confidence intervals can readily be obtained.

  • A "bottom-up" approach to aggregation means that estimates of excess deaths are additive across age groups, sexes, and high-level geographies, and between months and years.

  • Having a common methodology for all four UK countries means that estimates of excess deaths are consistent and comparable across all parts of the UK, and the new methodology is largely coherent with (though not identical to) that used by the Office for Health Improvement and Disparities to estimate excess deaths in English local authorities.


2. Overview of the new methodology for estimating excess mortality

Excess mortality is the difference between the observed number of deaths in a particular period and the number of deaths that would have been expected in that period, based on historical data. The expected number of deaths is sometimes known as the "baseline" for excess mortality calculations. Our new methodology for estimating excess mortality involves the following steps:

  1. Create a dataset comprising the number of deaths for each unique combination of age group, sex, and geography (referred to as an age-sex-geography stratum) and time period; the geographies are the four UK countries and nine former Government Office Regions of England.

  2. Fit a statistical model to the number of deaths in previous periods in each age-sex-geography stratum.

  3. Use the model to predict the number of deaths in the reference period in each age-sex-geography stratum.

  4. Sum the expected number of deaths across age-sex-geography strata to obtain the expected total number of deaths registered in the reference period.

  5. Estimate the number of excess deaths as the difference between the observed and expected number of deaths in the reference period.

More contextual information on excess mortality and the development of our new methodology can be found in the accompanying National Statistical blog post, Excess deaths - a new methodology and better understanding. Other UK government bodies also publish estimates of excess deaths, as summarised on the GOV.UK website.

Statistical modelling

Our new methodology for estimating the expected number of deaths involves fitting a quasi-Poisson regression model to aggregated death registration data. This statistical model provides the expected number of deaths registered in the current period, if trends in mortality rates had remained the same as those from recent periods and in the absence of extraordinary events affecting mortality, such as the peak of the coronavirus (COVID-19) pandemic.

To prepare the dataset for modelling, numbers of deaths are summed by period (weeks or months), age group, sex, and geography. For weekly data, the expected number of deaths in age-sex-geography stratum i in period t is the predicted value, d-hat[i,t], from the statistical model fitted to the weekly data.

The model includes:

  • age group: under one year, one to four years, and then five-year age bands up to 90 years and older (consistent with the European Standard Population 2013)

  • age coarse: under 30 years, 30 to 69 years, and then five-year age bands up to 90 years and older

  • sex: male or female

  • region: English region of residence (only in the model for deaths registered in England and Wales among residents of England)

  • trend: a linear time trend, modelled as a time index ranging from one to the number of periods in the dataset

  • week number: a seasonal component, modelled as a categorical variable representing the week number within each year (mortality rates follow a clear seasonal pattern, with more deaths in the winter than the summer each year)

  • age by week: each pairwise combination of age coarse and week number

  • population: the estimated size of the population

For monthly data, the expected number of deaths in age-sex-geography stratum i in period t is the predicted value, d-hat[i,t], from the statistical model fitted to the monthly data.

The model fitted to monthly data is similar to the one fitted to weekly data, except week number is replaced with month number (a categorical variable representing the month number within each year). Correspondingly, age by week is replaced with age by month. The weekdays variable represents the number of weekdays in the month, to reflect reduced death registration activity on weekends compared with weekdays and potentially an extra weekday in February in leap years. This approach does not adjust for bank holidays (see Section 7: Future developments).
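As an illustration of the weekdays variable, the number of Monday-to-Friday days in a month can be counted with Python's standard calendar module. This is a minimal sketch; the function name is hypothetical and, in line with the approach described above, bank holidays are not excluded.

```python
import calendar


def weekdays_in_month(year: int, month: int) -> int:
    """Count the Monday-to-Friday days in a calendar month.

    Bank holidays are NOT excluded, mirroring the limitation noted in the text.
    """
    _, days_in_month = calendar.monthrange(year, month)
    return sum(
        1
        for day in range(1, days_in_month + 1)
        if calendar.weekday(year, month, day) < 5  # 0 to 4 are Monday to Friday
    )


# February in a non-leap year and in a leap year
print(weekdays_in_month(2023, 2))
print(weekdays_in_month(2024, 2))
```

In this example the leap day in February 2024 falls on a weekday, so that month has one more weekday than February 2023.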

The coefficients are estimated within the model. Each of the categorical characteristics (age group, coarse age group, sex, English region, week number, and the pairwise combination of coarse age group and week number) is modelled using a set of indicator variables, each equalling one if age-sex-geography stratum i has that attribute and zero otherwise. One level of each categorical variable is assigned as a reference level and, therefore, does not have an estimated coefficient associated with it. For example, although the age variable in the model comprises 20 groups, only 19 coefficients are estimated. The choice of reference level does not affect the expected number of deaths produced by the model.
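The indicator coding can be illustrated as follows. This is a sketch only: the age group labels, the choice of the youngest group as the reference level, and the function names are assumptions made for illustration and are not taken from the published code.

```python
def indicator_levels(levels, reference):
    """Return the levels that receive an indicator variable (all but the reference)."""
    if reference not in levels:
        raise ValueError("reference level must be one of the levels")
    return [level for level in levels if level != reference]


def encode(value, levels, reference):
    """Encode one stratum's attribute as a list of 0/1 indicator values."""
    return [1 if value == level else 0 for level in indicator_levels(levels, reference)]


# Under one year, one to four years, then five-year bands up to 90 years and older
AGE_GROUPS = ["<1", "1-4"] + [f"{low}-{low + 4}" for low in range(5, 90, 5)] + ["90+"]
REFERENCE = "<1"  # assumed reference level, chosen only for this example

print(len(AGE_GROUPS))                                 # number of age groups
print(len(indicator_levels(AGE_GROUPS, REFERENCE)))    # number of estimated coefficients
print(encode("90+", AGE_GROUPS, REFERENCE))            # a single 1 in the last position
```

A stratum in the reference level is encoded as all zeros, which is why no separate coefficient is needed for it.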

The sex, trend and seasonal terms are interacted with a coarse age group variable to allow these components to vary by age group. For example, cold weather in the winter might be expected to have a bigger effect on the mortality rate among older people than among younger people.

In a quasi-Poisson regression model, modelling the number of deaths as the dependent variable and including the natural logarithm of population size as an offset term is analogous to modelling the mortality rate in each age-sex-geography stratum.
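The equivalence between a count model with an offset and a rate model can be written out as below. This is a sketch in generic notation rather than the published model equation: the vector x[i,t] stands for the explanatory terms of stratum i in period t, β for their coefficients, P[i,t] for the population size and φ for the quasi-Poisson dispersion parameter. All of these symbols are assumptions introduced here for illustration.

```latex
\log \mathbb{E}\left[d_{i,t}\right]
  = \mathbf{x}_{i,t}^{\top}\boldsymbol{\beta} + \log P_{i,t}
\quad\Longleftrightarrow\quad
\frac{\mathbb{E}\left[d_{i,t}\right]}{P_{i,t}}
  = \exp\!\left(\mathbf{x}_{i,t}^{\top}\boldsymbol{\beta}\right),
\qquad
\operatorname{Var}\left(d_{i,t}\right) = \phi\,\mathbb{E}\left[d_{i,t}\right]
```

Because the offset enters with a fixed coefficient of one, the linear predictor describes the logarithm of the expected mortality rate rather than of the expected count.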

Separate models are fitted for each of the four countries of the UK (plus one for England and Wales combined). Each model is fitted to five years of data with a lag of one year from the end of the fitting period to the current period, so the expected number of deaths in each week or month has its own five-year baseline period. For example, when estimating the expected number of deaths in January 2024, the model will be fitted to data spanning February 2018 to January 2023. This lag ensures that a large number of excess deaths in the current period (for example, because of an outbreak of a new infectious disease or another emerging public health event) does not immediately influence the model fitted to estimate excess deaths in the following period. This gives time to identify and decide how best to treat extraordinary mortality events in the estimation process.
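The rolling baseline window can be sketched as below, assuming monthly data, a five-year fitting period and a 12-month lag as described above. The function name and its parameters are hypothetical, and the sketch does not remove the pandemic-affected periods from the window.

```python
def fitting_window(year, month, years_of_data=5, lag_months=12):
    """Return ((start_year, start_month), (end_year, end_month)) of the fitting period.

    The fitting period ends `lag_months` before the target month and
    covers `years_of_data` full years of monthly data.
    """

    def to_year_month(index):
        # Convert a count of months back to (year, month) with month in 1..12
        y, m0 = divmod(index, 12)
        return y, m0 + 1

    target = year * 12 + (month - 1)        # target month as a count of months
    end = target - lag_months               # last month of the fitting period
    start = end - years_of_data * 12 + 1    # first month of the fitting period
    return to_year_month(start), to_year_month(end)


# Target period: January 2024
print(fitting_window(2024, 1))
```

For January 2024 this reproduces the window given in the text, February 2018 to January 2023.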


3. Further methodological details

Calculation of excess deaths

The number of excess deaths in each period and age-sex-geography stratum is calculated as the difference between the observed and expected number of deaths:

E-hat[i,t] = d[i,t] - d-hat[i,t]

Where E-hat[i,t] is the estimated number of excess deaths, d[i,t] is the observed number of deaths and d-hat[i,t] is the expected number of deaths in age-sex-geography stratum i in period t.

The estimated total number of excess deaths in each period is obtained by summing estimated excess deaths across age groups, sexes, and geographies:

E-hat[t] = Σ_i E-hat[i,t]

This "bottom-up" approach ensures additivity throughout the aggregation structure. For example:

  • estimated excess deaths by age group for males in a particular UK country will sum to the total estimated excess deaths across all age groups for males in that UK country

  • estimated excess deaths for males and females in a particular UK country will sum to the total estimated excess deaths for both sexes combined in that UK country

  • estimated excess deaths in individual UK countries (England and Wales combined, Scotland, and Northern Ireland), including deaths among non-residents, will sum to the total estimated excess deaths for the UK

Temporal additivity between monthly and annual estimates of excess deaths is also achieved by summing the estimated excess deaths obtained from the monthly model to derive annual totals. However, weekly estimates will not necessarily sum to annual estimates, as weeks may straddle calendar years at the beginning and end of each year.
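The additivity property can be illustrated with a small sketch. All figures below are made up for illustration, and the function name and data layout are assumptions rather than part of the published code.

```python
from collections import defaultdict

# Hypothetical stratum-level figures: (country, sex, age group) -> (observed, expected)
strata = {
    ("Scotland", "male", "70-74"): (120, 110.0),
    ("Scotland", "male", "75-79"): (150, 155.5),
    ("Scotland", "female", "70-74"): (100, 92.0),
    ("Scotland", "female", "75-79"): (140, 131.5),
}


def excess_by(strata, key_positions):
    """Sum stratum-level excess deaths over the chosen parts of the stratum key."""
    totals = defaultdict(float)
    for key, (observed, expected) in strata.items():
        group = tuple(key[position] for position in key_positions)
        totals[group] += observed - expected
    return dict(totals)


by_sex = excess_by(strata, (0, 1))     # country and sex
by_country = excess_by(strata, (0,))   # country only

print(by_sex)
print(by_country)
```

Because every aggregate is built from the same stratum-level values, the totals by sex sum to the country total by construction.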

Confidence intervals

The number of excess deaths in a period is always an estimate rather than a known value, because the number of expected deaths is a counterfactual quantity that must be estimated from observed data using statistical techniques. To reflect the uncertainty inherent in expected and excess deaths estimates, 95% confidence intervals are constructed around the excess deaths estimates using the following formula:

E-hat[i,t] ± 1.96 × SE(E-hat[i,t])

Where E-hat[i,t] is the estimated excess deaths in age-sex-geography stratum i in period t, and SE(E-hat[i,t]) is the standard error of the estimate, which is the square root of the variance of the estimate, V(E-hat[i,t]). See our Uncertainty and how we measure it methodology for more information on confidence intervals and standard errors.

The number of excess deaths is estimated as the difference between the observed and expected number of deaths, so the variance of the estimated excess deaths is a combination of the variances of both these quantities. However, the observed number of deaths is a known quantity rather than an estimate, so it has no variance. Therefore:

V(E-hat[i,t]) = V(d[i,t] - d-hat[i,t]) = V(d-hat[i,t])

Where d-hat[i,t] is the expected number of deaths in age-sex-geography stratum i in period t and V(d-hat[i,t]) is its variance, approximated through the Delta method.

The overall variance of the expected total number of deaths across age groups, sexes and geographies in each period can be found by summing the stratum-specific variances within periods:

V(d-hat[t]) = Σ_i V(d-hat[i,t])
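The variance summation and interval construction described in this section can be sketched in Python. The input figures are made up, the function name is hypothetical, and the multiplier of 1.96 assumes a normal approximation for the 95% interval.

```python
import math


def excess_with_interval(observed, expected, variances, z=1.96):
    """Return (excess, lower, upper) for one period.

    observed, expected and variances are equal-length sequences with one
    entry per age-sex-geography stratum. Observed counts are treated as
    known, so only the variances of the expected deaths contribute.
    """
    excess = sum(observed) - sum(expected)
    total_variance = sum(variances)            # sum of stratum-specific variances
    half_width = z * math.sqrt(total_variance)
    return excess, excess - half_width, excess + half_width


# Two hypothetical strata
estimate, lower, upper = excess_with_interval(
    observed=[120, 150],
    expected=[110.0, 155.5],
    variances=[9.0, 16.0],
)
print(estimate, lower, upper)
```

Here the interval spans zero, so this hypothetical period would not show excess deaths distinguishable from the baseline at the 95% level.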

Population denominators

The population denominators used to calculate mortality rates in each period and age-sex-geography stratum are derived from mid-year population estimates. These population estimates are not timely enough to feed into contemporary estimates of excess deaths. For example, estimates relating to mid-2022 were not published until August 2023 for Northern Ireland, on the Northern Ireland Statistics and Research Agency (NISRA) website, and November 2023 for England and Wales in our 2021-based National population projections bulletin. They have not yet been published for Scotland.

In the future, the Dynamic Population Model and resulting admin-based population estimates may provide more timely estimates (see our Admin-based population estimates: local authorities in England and Wales article). For the time being, the mid-year population estimates are extrapolated with population projections in each age-sex-geography stratum. Historical estimates of excess deaths will be revised whenever population projections for a given year are replaced by the mid-year population estimate.

National population projections are typically updated once every two years but subnational projections (needed for population denominators in the English regions) are only updated once every four years. These are published several months after the corresponding national update. For example, our 2021-based National population projections bulletin was published in January 2024, and before this our 2020-based National population projections bulletin was published in January 2022.

However, our latest available Subnational population projections bulletin (2018-based) was published in March 2020. Therefore, contemporary population sizes for the English regions are obtained by applying the regional proportions from the latest mid-year population estimates to the latest available national population projections for England. This ensures that the population denominators used for calculating mortality rates across the English regions sum to the national population denominator for England.
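The regional apportionment can be sketched as follows. The population figures are invented, only three regions are shown for brevity, and the function name is hypothetical; in practice the same calculation would be applied within each age and sex group.

```python
def regional_denominators(regional_mid_year_estimates, national_projection):
    """Share a national projection across regions using mid-year estimate proportions."""
    total = sum(regional_mid_year_estimates.values())
    return {
        region: national_projection * estimate / total
        for region, estimate in regional_mid_year_estimates.items()
    }


# Hypothetical latest mid-year estimates for three regions
mid_year = {"North East": 2_600_000, "London": 8_800_000, "South West": 5_700_000}
projection = 17_500_000  # hypothetical national projection for a later year

denominators = regional_denominators(mid_year, projection)
print(denominators)
print(sum(denominators.values()))
```

Scaling by proportions guarantees that the regional denominators sum to the national figure, apart from floating-point rounding.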

Population estimates and projections relate to the estimated population size at the mid-point of each year, but population denominators are needed on weekly and monthly bases for excess deaths calculations. Therefore, weekly and monthly population estimates are linearly interpolated between the mid-year estimates.

Accounting for the coronavirus (COVID-19) pandemic

The pandemic saw a large increase in death registrations, particularly in certain weeks and months that coincided with "waves" of infection (for example, when new COVID-19 variants became widespread in the population). To avoid these periods affecting estimates of expected deaths in subsequent periods, they are removed from the dataset when the model is fitted so that they do not contribute to the mortality baseline. This means that estimates of excess deaths in subsequent periods relate to the additional deaths registered in the period, over and above what would be expected from previous periods had they not been extraordinarily affected by the pandemic.

We define periods extraordinarily affected by the direct mortality impacts of the pandemic as being those where COVID-19 was given as the underlying cause of death for at least 15% of all deaths registered in the period across the UK. This threshold gives the greatest coherence between the weekly and monthly data in terms of periods excluded from the model fitting. These periods are April and May 2020, and November 2020 to February 2021 for monthly data; they are Weeks 14 to 22 of 2020, and Week 45 of 2020 to Week 8 of 2021 for weekly data.
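The exclusion rule can be sketched as a simple threshold check. The death counts below are invented for illustration and the function name is hypothetical; integer arithmetic is used so that a share of exactly 15% is classified consistently.

```python
def is_extraordinary(covid_deaths, all_deaths, threshold_percent=15):
    """True if COVID-19 was the underlying cause for at least the threshold share of deaths."""
    return covid_deaths * 100 >= threshold_percent * all_deaths


# Hypothetical UK-wide counts: period -> (COVID-19 underlying cause deaths, all deaths)
periods = {
    "2020-03": (1_000, 50_000),
    "2020-04": (30_000, 90_000),
    "2020-06": (3_000, 45_000),
}

# Periods retained in the dataset used to fit the model
retained = [
    period
    for period, (covid, total) in periods.items()
    if not is_extraordinary(covid, total)
]
print(retained)
```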

Excess deaths in Week 53

The annual calendar on which we report our weekly mortality statistics usually comprises 52 seven-day weeks and is 364 days in length. By contrast, the Gregorian calendar year (used by most countries across the world) is 365 days long for non-leap years and 366 days long for leap years. This means that the reporting calendar slips out of alignment with the Gregorian calendar by one or two days each year. To avoid this misalignment becoming too severe, there is international agreement that a "Week 53" should periodically be added to the reporting calendar.

Week 53 occurs infrequently (it was last added to the mortality calendar in 2020, and before that in 2015), so it is not practical to estimate a separate seasonal term for it when fitting models to five years of data. Instead, any instances of Week 53 are re-labelled as Week 52 when fitting models and obtaining expected numbers of deaths. This assumes that the mortality rate in a typical Week 53 is similar to that in a typical Week 52.
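The re-labelling step amounts to capping the seasonal week number at 52, as in this minimal sketch (the function name is hypothetical):

```python
def seasonal_week(week_number):
    """Map a reporting week number to the level used for the seasonal term."""
    if not 1 <= week_number <= 53:
        raise ValueError("week number must be between 1 and 53")
    return min(week_number, 52)  # Week 53 shares the seasonal term of Week 52


print([seasonal_week(week) for week in (1, 52, 53)])
```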

Producing estimates for individual UK countries and the UK as a whole

In the future, it is anticipated that we will publish estimates of excess deaths in each of the four UK countries as well as the total excess deaths in the UK as a whole. National Records of Scotland (NRS) and NISRA will also separately publish estimates of excess deaths for Scotland and Northern Ireland, respectively, using the same methodology as the Office for National Statistics (ONS). This will ensure consistent and comparable estimates across all parts of the UK.

For consistency with the death registrations data published by us and by the devolved administrations, the following models are fitted to estimate excess deaths:

  • deaths registered in England or Wales, including those for non-residents

  • deaths registered in Scotland, including those for non-residents

  • deaths registered in Northern Ireland, including those for non-residents

  • deaths registered in England or Wales among residents of England

  • deaths registered in England or Wales among residents of Wales

The total number of estimated excess deaths across the UK is then derived by summing the outputs from the first three models listed. The fourth model listed includes English region of residence as an explanatory variable.

In practice, 10 models are fitted to obtain estimates of excess deaths: five for weekly data and five for monthly data. In addition, five models are fitted to annual data to obtain standard errors and confidence intervals around the annualised estimates. Monthly excess deaths estimates can be summed within years to obtain annual estimates, but the same is not possible for the standard errors, because successive monthly estimates are correlated (as is generally the case with time series data). To obtain the variance of the annualised estimate, we assume that its coefficient of variation is the same as that of the estimate from the model fitted to annual data.

The models fitted to annual data include age group, sex, English region (only in the model for deaths registered in England or Wales among residents of England), a trend component and the number of weekdays in the year.

Comparison with the current methodology

In our current approach to estimating excess deaths in England and Wales, and that of the devolved administrations of Scotland and Northern Ireland, the expected (baseline) number of deaths is estimated as the average number of deaths registered in a recent five-year period. In contrast, our new methodology is based on age-specific mortality rates rather than death counts, so trends in population size and age structure are accounted for. Furthermore, the five-year average mortality rate is adjusted for a trend, so historical changes in population mortality rates are also accounted for.

Before the pandemic, the five-year period used in the current methodology was the five years preceding the current year. For example, the expected number of deaths in 2019 was estimated as the average number of deaths registered from 2014 to 2018 (inclusive). Weekly and monthly expected deaths were estimated as the average number of deaths registered in the same week or month over the past five years. For example, the expected number of deaths in Week 1 of 2019 was estimated as the average number of deaths registered in Week 1 from 2014 to 2018 (inclusive).

The expected number of deaths in 2021 was estimated as the average of deaths registered from 2015 to 2019 rather than 2016 to 2020, to avoid the pandemic distorting the excess deaths calculation. The expected number of deaths in 2022 was estimated as the average of deaths registered in 2016, 2017, 2018, 2019 and 2021.
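For comparison, the five-year average used by the current method can be sketched as below. The death counts are invented, the function name is hypothetical, and the baseline years are passed in explicitly so that only the year sets stated in the text are used.

```python
def five_year_average(deaths_by_year, baseline_years):
    """Average the deaths registered in the same week or month across the baseline years."""
    return sum(deaths_by_year[year] for year in baseline_years) / len(baseline_years)


# Hypothetical deaths registered in Week 1 of each year
week_1_deaths = {
    2014: 11_000, 2015: 12_500, 2016: 11_500, 2017: 12_000,
    2018: 12_500, 2019: 11_000, 2020: 12_000, 2021: 17_500,
}

expected_2019 = five_year_average(week_1_deaths, range(2014, 2019))
expected_2022 = five_year_average(week_1_deaths, [2016, 2017, 2018, 2019, 2021])
print(expected_2019, expected_2022)
```

Leaving 2020 out of the 2022 baseline while keeping 2021 mirrors the adjustment described above.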

In contrast, individual weeks and months, rather than whole years, that were substantially affected by the immediate mortality impact of the pandemic are removed from the expected deaths calculation under the new methodology.

Other improvements brought about by the change in methodology include:

  • use of a statistical model means that multiple demographic, trend, seasonal and calendar effects can be included simultaneously in the estimation of expected deaths, and confidence intervals can readily be obtained

  • a "bottom-up" approach to aggregation means that estimates of excess deaths are additive across age groups, sexes, and high-level geographies, and between months and years

  • having a common methodology for all four UK countries means that estimates of excess deaths are consistent and comparable across all parts of the UK, and the new methodology is largely coherent with (though not identical to) that used by the Office for Health Improvement and Disparities (OHID) to estimate excess deaths in English local authorities


4. Estimates from the new and current methods

The weekly estimates of excess mortality relate to the whole of the UK. Weekly, monthly, and annual estimates of expected and excess mortality broken down by UK country, age group and sex, as well as 95% confidence intervals around all estimates, can be found in the accompanying dataset. The death registrations data and population estimates in the dataset can be used alongside the code available on GitHub to recreate our new methodology for estimating excess mortality.

Before the coronavirus (COVID-19) pandemic

The new and current methods produce estimates of excess deaths with similar trends and seasonal patterns over the nine-year period before the coronavirus pandemic, 2011 to 2019 (Figure 1). From 2011 to 2013, estimates from the new method are generally higher than those from the current method. However, estimates from the new method are consistently lower than those from the current method from 2015 onwards, particularly in 2019.

On an annual basis, the new method estimates negative 34,408 excess deaths (that is, fewer deaths registered than would be expected) in the UK throughout the whole of 2019 compared with 6,006 for the current method, a difference of negative 40,415 excess deaths (Table 1). This is coherent with the age-standardised mortality rate for the UK population, which fell by 3.9% between 2018 and 2019, the largest annual fall since 2009 (5.3%).

During and after the peak of the pandemic

The new and current methods estimate similar numbers of excess deaths during the pandemic (Figure 2). In particular, the two approaches produce similar peaks in estimated excess mortality in the second quarter of 2020 and the winter of 2020 to 2021. However, estimates from the new method are generally lower than those from the current method throughout the latest year, 2023, by an increasing amount.

On an annual basis, the new method estimates 76,412 excess deaths in the UK in 2020, compared with 84,064 estimated by the current method (Table 2). For context, the highest number of excess deaths estimated by the new method over the nine years before the pandemic is 30,858 in 2015. In the latest year, 2023, the new method estimates 10,994 excess deaths in the UK, 20,448 fewer than the current method.


5. Components of differences between estimates from the new and current methods

Unlike the current method, the new method accounts for trends in population size, age structure and mortality rates. The UK population grew by 12.2% between 2006 and 2023 (based on mid-year population estimates to 2022 for England, Wales, and Northern Ireland, and 2021 for Scotland, appended with population projections), from 60.8 million to 68.3 million (Figure 3). All else being equal, having more people in the population each year means we can expect more deaths to occur. Furthermore, people aged at least 70 years, the group in which most deaths occur each year, made up an increasing share of the UK population (from 11.5% in 2006 to 13.8% in 2023), with the size of this group growing by 35.4% over the period.

Conversely, mortality rates have been decreasing over time. The age-standardised mortality rates (ASMRs) among the UK population generally decreased from 2006 to 2011, before levelling off from 2012 to 2018 and dropping again in 2019 (before the start of the pandemic).

These increasing trends in population size and ageing, and the generally decreasing trend in mortality rates, are not accounted for by the current methodology for estimating excess deaths. However, they are reflected in the new methodology.

Figure 4 shows the overall difference in the estimated number of excess deaths between the current and new methods (solid line), and the contributions to this difference of:

  • accounting for population growth and ageing

  • accounting for trends in mortality rates

  • all other methodological changes between the current and new methods

Positive bars indicate positive contributions to the overall difference in excess deaths estimated by the two methods, while negative bars indicate negative contributions to the overall difference.

In all years from 2011 to 2023, accounting for population growth contributes negatively to the difference in estimated excess deaths between the new and current methods. The growing, ageing population of the UK means that, all else being equal, more deaths would be expected from one year to the next. This has the effect of pulling up expected deaths, and pushing down excess deaths, for the new method relative to the current method.

From 2011 to 2013, the biggest contributor to the difference between the estimates of excess deaths is accounting for trends in mortality rates. This is likely to be attributable to the decreasing trend in the population ASMRs from 2006 to 2011, which is taken into account by the new method but not the current method. This feature also makes a relatively large contribution to the difference between the estimates in 2020 and 2021, following a notable annual decrease in the ASMR in 2019. This has the effect of pushing down expected deaths and pulling up excess deaths for the new method, relative to the current method.

Other changes to the methodology, besides accounting for trends in population size, ageing and mortality rates, make a relatively large contribution to the difference between the new and current estimates in 2022 and 2023. These other changes include the treatment of periods substantially affected by the coronavirus (COVID-19) pandemic (removing individual months rather than the whole of 2020), which makes a notable contribution in 2022 and 2023, and using a quasi-Poisson regression model to estimate the number of expected deaths in each period.


6. Estimating excess deaths in the UK, methodology changes data

Estimating excess deaths in the UK, methodology changes
Dataset | Released 20 February 2024
Outputs from the new method for estimating excess deaths across UK countries.


7. Future developments

We will regularly review estimates produced by the new excess deaths methodology, with further refinements to the approach being undertaken if necessary. As such, our estimates of excess deaths produced by the new methodology will be labelled as official statistics in development while further review, testing and development work is undertaken.

Future work may seek to investigate:

  • additional statistical properties of the new methodology and improved estimation of confidence intervals through simulation

  • alternative approaches to handling periods affected by the coronavirus (COVID-19) pandemic, such as varying the threshold used to identify these periods (currently COVID-19 deaths making up at least 15% of all deaths) and accounting for the proportion of COVID-19 deaths as a continuous variable rather than applying a discrete threshold

  • alternative methods of interpolating daily population sizes in between mid-year estimates and projections, such as spline interpolation to obtain a smooth curve rather than assuming linearity between the annual values

  • the sensitivity of the estimates to using more or less than five years of historical data when fitting the statistical model, and increasing or decreasing the lag between the end of the fitting period and the current period

  • different statistical modelling specifications; for example, testing a wider range of interactions between variables than those currently included in the model, and further exploring time series modelling techniques

  • the effect of modelling the number of public holidays (when death registration activity is generally reduced) in a week or month, though it may be challenging to apply this consistently across UK countries; for example, some public holidays and registration office closures are determined locally in Scotland and Northern Ireland, so it may not be possible to derive a suitable national-level variable to include in the model

  • implementations of the models by place of death (for example, at home, in hospital or in a care home) and cause of death, enabling the production of place- and cause-specific estimates of excess deaths, and producing estimates for subnational geographies beyond English regions

  • implementation of the models based on date of death or date of notification rather than date of registration

  • approaches to accounting for large numbers of excess deaths in one period generally being followed by lower or negative excess deaths in later periods (the "mortality displacement" effect, as detailed in our Excess mortality and mortality displacement in England and Wales article)

8. Collaboration

This methodology was developed in collaboration with the Office for Health Improvement and Disparities (OHID), the UK Health Security Agency (UKHSA), Public Health Wales, the Welsh Government, National Records of Scotland (NRS), Northern Ireland Statistics and Research Agency (NISRA), and members of the actuarial profession from Lane Clark & Peacock LLP and the Continuous Mortality Investigation. An earlier iteration of this work was reviewed by the Office for National Statistics (ONS) Methodological Assurance Review Panel.


9. Cite this article

Office for National Statistics (ONS), released 20 February 2024, ONS website, article, Estimating excess deaths in the UK, methodology changes: February 2024


Contact details for this article

Health Statistics and Research
health.data@ons.gov.uk
Telephone: +44 1633 455825