1. Introduction

This guide provides technical information on the data and methodology used for the analyses presented in the article "Living longer: is age 70 the new age 65?". We have included an overview of the data sources used, how issues with survey data were overcome, a description of the calculation of prospective age, and findings from sensitivity analyses.

Survey data, taken from the household population, were used to estimate the proportion of the population reporting poor health or limiting longstanding illness. The main issues encountered with the data were changes in the wording of the general health and limiting longstanding illness questions over time, and small sample sizes for the calculation of the poor health/illness proportions when survey data are split by single year of age, sex, and survey year.

We conducted sensitivity analyses using 2001 and 2011 census data examining the effect of the exclusion of communal establishments on these health/illness proportions, and comparing the proportions in census data with survey data for the same years.

2. Data sources

Population estimates and projections

England and Wales population estimates by single year of age and sex for 1911, 1921 and 1931 from Census reports. Publicly available on histpop.org.

England and Wales population estimates by single year of age and sex for 1951, 1961, and 1971 from Census reports. Provided by ONS Census Customer Services.

Great Britain mid-year population estimates by single year of age and sex for 1981 to 2018 from mid-2018 Population Estimates (detailed time-series supporting file).

Great Britain mid-year population projections by single year of age and sex for 2019 to 2066 from 2018-based National Population Projections.

Life expectancy

England and Wales period life expectancy by single year of age and sex for 1911, 1921, 1931, 1951, 1961, and 1971 from internal ONS database.

Great Britain period life expectancy by single year of age and sex for 1981 to 2016, and projected from 2017 to 2066, from 2016-based Past and Projected life tables release.

Survey data (for health/illness proportions)

General Household Survey (GHS) 1972 to 2004: Time series end user licence dataset available from UK Data Archive.

General Household Survey (GHS) 2005 and 2006. End user licence dataset available from UK Data Archive.

Opinions and Lifestyle Survey (OPN) 2006 to 2017. Compiled from 98 end user licence datasets available from UK Data Archive. Only modules which included both general health and limiting longstanding illness questions were included (with some exceptions in 2006 and 2007).

Census data (for sensitivity analyses)

2001 census tables C1413 and C1414. Bespoke tables provided by ONS Census Customer Services.

2011 census tables CT1044 and CT1045. Publicly available on ONS website.

Adjustments to survey data

The 2001 General Household Survey (GHS) appeared to have a maximum age group of 85 years plus, with no data on ages over 85 years, and a high number of respondents at age 85 years relative to younger ages. We deleted data for this open-ended age group as it was not possible to distinguish between single years of age.

The Opinions and Lifestyle Survey (OPN) had overlapping data in January and February 2015. This was apparently because data collection for the 2014 survey year ran over into early 2015 and was added to data from the 2015 survey year in these months. Because of concerns about weighting and representativeness, we excluded the 2014 survey months which took place in 2015 from the main analysis.

The number of respondents at ages 60 years and over who answered the general health and the limiting longstanding illness questions in each survey year is provided in Figure 1.


For estimates of the proportion of the population at and over different prospective ages, data for England and Wales were used for years prior to 1981. This is because life expectancy by single year of age and sex for higher UK geographies was unavailable for earlier years.

For the time series 1981 to 2017, and for the projections of the population age structure to 2066, data for Great Britain are used. Both the GHS and OPN surveys are representative of the population of Great Britain.


No survey weights were available for the GHS before 2000, when the survey was redeveloped with a new design. Weights were not produced for earlier years because response rates tended to be higher then. From 2000 to 2005, the GHS non-response weight was used, and from 2006 to 2017 the OPN calibration weight (which adjusts for likelihood of selection within the household and non-response rates, and scales up to population level) was used.

3. Converting between general health and limiting longstanding illness question versions

During the time period covered, the general health question changed from a three-point to a five-point response scale. This followed the European Union Statistics on Income and Living Conditions (EU-SILC) standardisation of the question, which allows comparison with other EU countries. For the proportion of people reporting poor health, which was used in this paper, the response to the general health question was converted into a dichotomous health outcome.

The three-point general health question was used in the General Household Survey (GHS) from 1977 to 2004. In GHS 2005, Quarters 2 to 4 (Apr to Dec), and GHS 2006 Quarter 1 (Jan to Mar) both versions of the health question were asked. In OPN 2006 to 2017 the five-point (EU-SILC) version of the question was used.

We used data from respondents who answered both versions of the health question to convert responses in earlier years from the three-point to the five-point version. We needed to do this because people could be coded as having poor health based on one version of the question, but not when based on the other version. For instance, a large proportion of people answering “fairly good” on the three-point health question (coded as “good” dichotomous health) went on to answer “fair” on the five-point version (coded as “not good” dichotomous health).

Table 1 shows how the three- and five-point versions of the question were converted into dichotomous health status.

Using data from GHS 2005 to 2006 (when both versions of the health question were asked) we assigned responses to a dichotomous health category based on the three-point health question (Table 1). We then calculated the proportion of respondents (by age, sex, and three-point response category) who had a different dichotomous health category when based on the five-point version.

Using these conversion proportions (by age, sex, and three-point response category) we switched the dichotomous health status of that proportion of respondents for all years where the three-point health question was used (1977 to 2004), and also for 2005 because of lower responses to the five-point than to the three-point question.
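The calculation of conversion proportions described above can be sketched as follows. This is a minimal illustration rather than the code used for the analysis. Table 1 is not reproduced in this text, so the dichotomous coding below is partly an assumption: only "fairly good" (coded as good) and "fair" (coded as not good) are stated in this guide. The function name `conversion_proportions` and the sample respondents are hypothetical.

```python
from collections import defaultdict

# Assumed dichotomous coding. Only "fairly good" -> good and
# "fair" -> not good are stated in the guide; the rest is an assumption.
THREE_POINT = {"good": "good", "fairly good": "good", "not good": "not good"}
FIVE_POINT = {"very good": "good", "good": "good", "fair": "not good",
              "bad": "not good", "very bad": "not good"}


def conversion_proportions(respondents):
    """Return {(age, sex, three_point_answer): (proportion, n)}.

    The proportion is the share of respondents in the cell whose dichotomous
    status differs when based on the five-point answer; n is the cell size.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [switched, total]
    for age, sex, three, five in respondents:
        cell = (age, sex, three)
        counts[cell][1] += 1
        if THREE_POINT[three] != FIVE_POINT[five]:
            counts[cell][0] += 1
    return {cell: (switched / total, total)
            for cell, (switched, total) in counts.items()}


# Hypothetical respondents who answered both versions of the question:
# (age, sex, three-point answer, five-point answer)
sample = [
    (70, "F", "fairly good", "fair"),
    (70, "F", "fairly good", "good"),
    (70, "F", "fairly good", "fair"),
    (70, "F", "fairly good", "good"),
    (70, "F", "good", "very good"),
]
props = conversion_proportions(sample)
```

In this made-up sample, half of the women aged 70 years who answered "fairly good" would have their dichotomous status switched from good to not good.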

This methodology is adapted from a 2009 Health Statistics Quarterly article by Michael Smith and Chris White, "An investigation into the impact of question change on estimates of General Health Status and Healthy Life Expectancy".

Because of the small numbers, when split by single year of age, sex, and three-point response category, many of these conversion proportions were based on just a few people or were missing entirely (particularly at older ages). We deleted proportions which were based on fewer than nine respondents. This is an arbitrary number which was chosen because, compared to larger numbers, it minimised missing data at older ages (which are the focus of the analysis).

Additionally, to smooth fluctuations and impute missing conversion proportions, we calculated five-age moving averages of the conversion proportions. To reduce bias, if a proportion for any of the response categories was missing, we deleted the proportions for the other response categories for that age and sex.
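A rough sketch of the thresholding and five-age moving average for a single sex and response category is given below. The guide does not state how missing values within the window were handled, so averaging over whichever ages remain valid is an assumption, as is the centred window. The function name `five_age_average` and the figures are hypothetical.

```python
MIN_RESPONDENTS = 9  # threshold stated in the guide for general health


def five_age_average(props_by_age, counts_by_age, ages):
    """Centred five-age moving average of conversion proportions.

    Proportions based on fewer than MIN_RESPONDENTS are dropped first.
    Averaging over the ages that remain in the window is an assumption.
    """
    valid = {age: p for age, p in props_by_age.items()
             if counts_by_age.get(age, 0) >= MIN_RESPONDENTS}
    smoothed = {}
    for age in ages:
        window = [valid[a] for a in range(age - 2, age + 3) if a in valid]
        smoothed[age] = sum(window) / len(window) if window else None
    return smoothed


# Hypothetical conversion proportions and cell sizes for one sex and category.
props = {83: 0.30, 84: 0.40, 85: 0.90, 86: 0.50, 87: 0.60}
counts = {83: 20, 84: 15, 85: 3, 86: 12, 87: 10}
result = five_age_average(props, counts, ages=[85])
```

Here the proportion at age 85 years is based on only three respondents, so it is dropped and the smoothed value is imputed from the four neighbouring ages (approximately 0.45).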

Weights were not used in the calculation of these conversion proportions (only in subsequent analyses after these conversion proportions had been applied).

We applied the same approach for calculating conversion proportions for the limiting longstanding illness question. Two versions of the limiting longstanding illness question were used during the time period: whether activities are limited by the illness (yes or no), and whether the illness reduces the ability to carry out day-to-day activities (a little, a lot, or not at all).

The limited activities version of the question was used in GHS from 1979 to 2005, and in OPN from 2006 to 2015. The reduced activities version of the question was used in OPN in 2014, 2016 and 2017. As such, both versions of the illness question were asked in OPN 2014.

As previously highlighted, for the 2014 survey year, data collection ran into the first two months of 2015. This overlapping 2015 data was included to boost numbers in the calculation of the illness conversion proportions but was not included in the main analysis.

The illness questions were two stage, with respondents first being asked if they had a longstanding illness, and a follow-up question about whether it limited, or how much it reduced their activities. Table 2 shows the response options, and the dichotomous limiting longstanding illness status we applied to these responses.

We converted cases from the reduced activities version of the question to the limited activities version because the limited activities version was used for all survey years except for 2016 and 2017. We did this by applying the same process of calculating conversion proportions as used for the general health question.

For 2014, the limited activities version of the question was used where it was present. If the limited activities version was not present but the reduced activities version was, then the reduced activities version was used (and converted).

Illness conversion proportions based on fewer than five respondents were deleted. This is a lower threshold than for general health because there were more missing proportions at older ages, partly because respondents are split across four response categories rather than three.

4. Smoothing poor health/illness proportions across three adjacent ages and years

The sample size, when split by survey year, single year of age, and sex was small, particularly at older ages. This resulted in big fluctuations in the proportion of people reporting poor general health or a limiting longstanding illness, and a lot of missing data. Poor health/illness proportions which were based on fewer than 10 respondents were deleted, resulting in even more missing data.

To smooth fluctuations, impute missing proportions, and boost sample sizes, a moving three-age average of health/illness proportions was calculated, where there were valid proportions for at least two out of the three ages. These three-age average proportions were then additionally averaged across three adjacent years. For example, the proportion reporting poor health at age 85 years in 1995 is based on the proportions in Table 3.
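The two-stage averaging can be sketched as below. Table 3 is not reproduced in this text, so the figures are invented. Treating each stage as a simple unweighted mean of the valid values is an assumption, and the function names are hypothetical.

```python
def three_age_average(props, age, year):
    """Mean of the proportions at age-1, age and age+1 in one year.

    Returns None unless at least two of the three ages have a valid value.
    """
    values = [props.get((a, year)) for a in (age - 1, age, age + 1)]
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if len(valid) >= 2 else None


def three_year_average(props, age, year):
    """Mean of the three-age averages for year-1, year and year+1.

    Using only the years with a valid three-age average is an assumption.
    """
    values = [three_age_average(props, age, y)
              for y in (year - 1, year, year + 1)]
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical proportions keyed by (age, year); age 86 in 1994 is missing.
props = {
    (84, 1994): 0.5, (85, 1994): 0.25,
    (84, 1995): 0.5, (85, 1995): 0.5, (86, 1995): 0.5,
    (84, 1996): 0.25, (85, 1996): 0.5, (86, 1996): 0.75,
}
age85_1995 = three_year_average(props, age=85, year=1995)
```

The value for age 85 years in 1995 therefore draws on up to nine age and year cells, which is what boosts the effective sample size.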

The first year of the time series used in the analysis is 1981. As the survey data goes back to 1977 and 1979 respectively for health and illness, it was possible to calculate three-year average proportions for 1981. For the last year in the time series (2017), proportions for 2017 and 2016 were averaged.

Data were not collected in 1997 or 1999. Proportions for 1998 were averaged across 1996, 1998 and 2000. Values for 1997 and 1999 were then imputed from 1996 and 1998, and 1998 and 2000, respectively.
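As a short illustration of the gap-year imputation, the sketch below assumes that "imputed from" means taking the simple mean of the two adjacent years. The guide does not state this explicitly. The function name `impute_gap_year` and the values are hypothetical.

```python
def impute_gap_year(smoothed, year):
    """Impute a year with no survey as the mean of the two adjacent years.

    The use of a simple mean is an assumption.
    """
    before, after = smoothed.get(year - 1), smoothed.get(year + 1)
    if before is None or after is None:
        return None
    return (before + after) / 2


# Hypothetical smoothed proportions for one age in the surveyed years.
smoothed = {1996: 0.5, 1998: 0.25, 2000: 0.75}
for gap_year in (1997, 1999):
    smoothed[gap_year] = impute_gap_year(smoothed, gap_year)
```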

5. Combining male and female data

To further smooth fluctuations and boost sample sizes, proportions of men and women reporting poor health/limiting longstanding illness were combined to person level for each chronological and prospective year of age. This greatly boosted the minimum and average number of respondents that proportions were calculated from (see Table 4).

For chronological ages, person-level proportions were achieved by calculating the three-age and three-year average proportion of people at each age reporting poor health/limiting longstanding illness from the weighted survey data.

For prospective ages, men and women are usually at different ages when they have a set number of years of remaining life expectancy (RLE). For instance, the attained (completed) age at RLE15 in 2017 was 70 years for men and 72 years for women.

For this example, the three-age three-year average poor health/illness proportions were applied to the number of men aged 70 years and women aged 72 years in 2017, and then combined to estimate the number of people at age RLE15 in 2017 who reported poor health or a limiting longstanding illness.

This was then divided by the total number of people at age RLE15 in 2017 (men aged 70 years and women aged 72 years), to calculate the proportion of people at RLE15 in 2017 who reported poor health/a limiting longstanding illness.
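The combination of male and female proportions at a prospective age amounts to a population-weighted average, which can be sketched as follows. The population counts and proportions are invented for illustration, and the function name `combined_proportion` is hypothetical.

```python
def combined_proportion(groups):
    """Combine sex-specific proportions into a person-level proportion.

    groups: list of (population, proportion) pairs, one per sex, each taken
    at that sex's attained age for the prospective age in question.
    """
    cases = sum(population * proportion for population, proportion in groups)
    total = sum(population for population, _ in groups)
    return cases / total


# Hypothetical 2017 figures: men aged 70 years and women aged 72 years.
rle15_2017 = combined_proportion([(300_000, 0.25), (250_000, 0.5)])
```

With these made-up inputs, 75,000 men and 125,000 women are estimated to report poor health, giving 200,000 out of 550,000 people at RLE15 (around 36%).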

6. Confidence intervals around poor health/illness proportions

To assess the level of uncertainty in the poor health and illness proportions, and whether observed differences in these proportions would hold up to statistical scrutiny, we calculated 95% confidence intervals around each proportion. We used these confidence intervals internally to assess trends but did not include them in the published article to avoid confusion and for clarity of results.

When calculating three-age and three-year average health/illness proportions, the unweighted numbers of respondents on which each proportion was based were added together. It is this combined number which was used to calculate confidence intervals, using the following formula:
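The formula itself is not reproduced in this text. As a minimal sketch only, the code below assumes the standard normal-approximation (Wald) interval for a proportion, p ± 1.96 × sqrt(p(1 − p) / n), where n is the combined unweighted number of respondents. This may differ from the formula actually used. The function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(p, n, z=1.96):
    """95% confidence interval for a proportion (normal approximation).

    p: proportion reporting poor health or illness
    n: combined unweighted number of respondents behind the proportion
    The choice of the Wald interval is an assumption, not the guide's formula.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    lower = max(0.0, p - z * standard_error)
    upper = min(1.0, p + z * standard_error)
    return lower, upper


# Hypothetical example: a proportion of 0.5 based on 100 respondents.
lower, upper = wald_interval(0.5, 100)
```

Under this assumption, a proportion of 0.5 based on 100 respondents would have an interval of roughly 0.40 to 0.60, which illustrates how wide the intervals become at small sample sizes.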

7. Calculating prospective age

The age at which there is 15 years of life expectancy remaining is not simply life expectancy at birth minus 15. For instance, in 2017, male period life expectancy at birth was 79.6 years, but men aged 70 years could expect to live a further 15 years (to age 85 years) on average. This is because life expectancy at birth takes into account mortality rates at all ages, yet by age 70 years a person has already survived past many of these.

Calculating the age at which there was a set number of years (for example 15) of remaining life expectancy (RLE) involved identifying the last age at which there was more than 15 years of life expectancy. For women in 2017, this was 72 years (Table 5).

For this example, at some point while a woman is aged 72 years and before her 73rd birthday, she will have exactly 15 years of life expectancy remaining (on average). This means 72 years is the attained age at which women have RLE15 in 2017. The estimate of the population at the attained age is used in the calculation of health/illness proportions.

But this is not the exact age at RLE15, which lies somewhere between the 72nd and 73rd birthdays. The exact age is required to calculate the number and proportion of the population at a given RLE or older over time.

Using the example in Table 5, between ages 72 and 73 years, life expectancy drops by less than a year (0.8 years) from 15.4 to 14.6 years. This is despite having lived for a whole year. Half of this decrease (15.4 minus 15 is 0.4 years) occurs before life expectancy reaches 15 years, and half (15 minus 14.6 is 0.4 years) occurs afterwards.

If we assume that life expectancy between any two ages declines linearly, we can estimate that RLE15 occurs at exact age 72.5 years. To then calculate the number of women in the population aged RLE15 or older, we take population estimates for age 73 years and over, and add half of the estimate for age 72 years (because we have identified that RLE15 occurs halfway through this age).
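The linear interpolation and the resulting population count can be sketched as follows. The life expectancies at ages 72 and 73 years are those quoted from Table 5. The value at age 71 years, the population counts and the function names are hypothetical.

```python
def exact_age_at_rle(life_expectancy, target):
    """Exact age at which remaining life expectancy equals target.

    life_expectancy: dict of exact age -> remaining life expectancy.
    Assumes life expectancy declines linearly between adjacent ages.
    """
    for age in sorted(life_expectancy):
        e_now = life_expectancy[age]
        e_next = life_expectancy.get(age + 1)
        if e_next is not None and e_now > target >= e_next:
            return age + (e_now - target) / (e_now - e_next)
    return None


def population_at_or_over(pop_by_age, exact_age):
    """Population at or above an exact age.

    Takes everyone older than the attained age plus the share of the
    attained age that lies above the exact age.
    """
    attained = int(exact_age)
    share_above = 1 - (exact_age - attained)
    older = sum(n for age, n in pop_by_age.items() if age > attained)
    return older + share_above * pop_by_age[attained]


# Women in 2017: ages 72 and 73 from Table 5; age 71 is hypothetical.
female_le = {71: 16.2, 72: 15.4, 73: 14.6}
rle15_age = exact_age_at_rle(female_le, target=15)

# Hypothetical population counts by single year of age.
female_pop = {72: 1000, 73: 900, 74: 800}
rle15_or_older = population_at_or_over(female_pop, rle15_age)
```

With these inputs the exact age comes out at approximately 72.5 years, so half of the women aged 72 years are added to all of those aged 73 years and over.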

8. Limiting results to direction of change for multiple ages and years

After smoothing the proportions out across three age groups and three adjacent years, and combining males and females, there were still large fluctuations between different years and different ages, for both prospective and chronological ages. Figures 2 to 5 demonstrate this.

Both poor health and illness proportions at different chronological ages appeared to decline over the time period, while there was a mixed picture for prospective ages. However, whether a proportion declined over time, and whether this was statistically significant, was dependent on the specific age and years selected for comparison.

Large confidence intervals, particularly at older ages, meant declines in health/illness proportions for chronological ages were only significant around half of the time. We propose that the failure to reach statistical significance for many of the declines in health/illness proportions at chronological ages over time is the result of the following data limitations:

  • large confidence intervals resulting from small sample sizes (despite the measures taken to boost numbers)

  • changes in the question wording over the time period which introduced extra uncertainty

Comparing the direction of change for multiple ages and pairs of time points enabled us to assess whether the trend of declining health/illness proportions at chronological ages is consistent. This was the reason for making multiple comparisons which examine the direction of change for a wide range of ages and years.
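One way to summarise the direction of change across many comparisons is sketched below. The guide does not specify which ages or pairs of years were compared, so comparing every pair of years at each age is an assumption. The proportions and the function name `share_declining` are hypothetical.

```python
from itertools import combinations


def share_declining(props):
    """Share of (age, earlier year, later year) comparisons showing a decline.

    props: dict of (age, year) -> proportion reporting poor health or illness.
    Comparing every pair of years at each age is an assumption.
    """
    ages = sorted({age for age, _ in props})
    declines = total = 0
    for age in ages:
        years = sorted(year for a, year in props if a == age)
        for early, late in combinations(years, 2):
            total += 1
            if props[(age, late)] < props[(age, early)]:
                declines += 1
    return declines / total if total else None


# Hypothetical smoothed proportions at two chronological ages.
props = {
    (65, 1981): 0.40, (65, 1999): 0.35, (65, 2017): 0.36,
    (75, 1981): 0.50, (75, 1999): 0.45, (75, 2017): 0.40,
}
share = share_declining(props)
```

In this made-up example, five of the six comparisons show a decline, even though one pair of years at age 65 years shows a small increase.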

9. Sensitivity analyses using 2001 and 2011 census data

We conducted two sensitivity analyses using census data: a comparison of poor health/illness prevalence between the household and the total population, and a comparison of health/illness proportions calculated from survey and census data for the same year.

Data from the 2001 and 2011 censuses showed that a large proportion of the older population, particularly women, live in communal establishments. Older people living in communal establishments also report poorer general health and a higher prevalence of limiting longstanding illness. The weighted survey data that we used to calculate poor health/illness proportions are representative of the household population (and exclude those living in communal establishments).

To estimate the extent to which this biased our results, we used census data to compare poor health/illness proportions for the whole population with proportions for the household population (excluding communal establishments). We converted between the different general health and limiting longstanding illness question versions used in the 2001 and 2011 censuses using the same methodology as detailed above.

For ages 60 to 90 years, the prevalence of poor health was between 0.24% and 3.12% (mean 0.97%) higher for the whole population than for the household population. For limiting longstanding illness, prevalence was between 0.72% and 5.9% (mean 2.0%) higher for the whole population. The largest differences were concentrated at the oldest ages, reflecting the larger communal establishment population at these ages.

This suggests that the exclusion of communal establishments in our calculations leads to an underestimation of the true poor health/illness prevalence in the population by up to 5.9%. This underestimation is the greatest at older ages where we additionally see the smallest survey sample sizes, the greatest fluctuations, and the widest confidence intervals.

When comparing poor health/illness proportions between census (household population only) and survey data, we found that the proportions from the census data were often higher than the upper confidence limit of the proportions from the survey data. This was the case for general health in 2011 at all ages, and for limiting longstanding illness in both 2001 and 2011 from around age 75 years. The differences were greatest at older ages, and generally larger for women than men.

This suggests that from around age 75 years, the survey data underestimate the prevalence of limiting longstanding illness in the household population, particularly for women.

The census is both compulsory and universal across all ages, and in the absence of the respondent completing it themselves, is likely to be completed by proxy. In comparison, completing survey questions on poor health is voluntary, and potential respondents suffering from poor health may be less likely to respond, particularly at older ages.

There are two further constraints of these analyses.

There was not a consistent time series for health state life expectancies over a sufficiently long period for this analysis. Health state life expectancies are also confounded by changes in total life expectancy. In a situation where life expectancy increases faster than healthy life expectancy, increases in healthy life expectancy would be coupled with more years, and a larger proportion of life, lived in poor health.

In the three-point version of the general health question, "fairly good" was categorised as "good" health on the dichotomous outcome, while "fair" on the five-point version was categorised as "not good". There are concerns around how people interpreted these categories and whether someone reporting fair health aligns with poor health. Further research to explore this would be beneficial.

Contact details for this Methodology

Ngaire Coombs, Angele Storey, Rose Giddings
Telephone: +44 (0)1329 444661