Developing improved estimates of quality adjusted labour inputs using the Annual Survey of Hours and Earnings

1. Abstract

This article describes work in progress to improve our estimates of quality adjusted labour inputs (QALI) using data collected in the Annual Survey of Hours and Earnings (ASHE). ASHE provides detailed estimates of the hourly earnings of UK employees, which we plan to use to augment the compilation of QALI indices. These indices currently rely almost exclusively on the Labour Force Survey (LFS). Because ASHE does not record levels of education, this means using information from both ASHE and LFS on the occupational classification of workers for the first time.

There are several strands to this work. Firstly, in order to construct a reasonable time series we need to convert historic occupational classifications used in earlier ASHE vintages to the most recent equivalent classification. This process is similar to the conversion of industrial classifications, which is a fairly routine task within the Office for National Statistics (ONS). However, conversion of historic occupational classifications is a non-trivial task. Although we have made some progress, more work remains to be done in this area.

Secondly, since ASHE records earnings per paid hour and QALI uses earnings per actual hour worked, we report some exploratory analysis on the relationship between actual and paid hours. This analysis suggests that the relationship between actual and paid hours can be modelled satisfactorily in terms of the characteristics that we use to stratify hourly earnings estimates on ASHE; namely age group, sex, industry and occupation.

Thirdly, the article describes a method of benchmarking hourly earnings in the QALI framework to ASHE estimates (the latter adjusted to an actual hours basis). To do this we first need to expand the LFS-based QALI framework to include occupation in addition to the existing age, sex, industry and education dimensions. This leads to a large number of cells with missing pay data, particularly as we also propose expanding the current QALI industry breakdown from 10 to 19 industries. We propose to fill these empty cells using model-based estimates, which capture the relationships between pay and education for each occupation.

Fourthly, ASHE includes some sectoral information, which we have used to re-visit previous work on sectorisation of labour market metrics. In particular ASHE provides an improved source of estimates of non-market sector workers other than those in central and local government, as well as information on the sectoral dimension of second jobs, which is not available on LFS.

Lastly, and not directly related to the use of ASHE, we report some small methodological changes to how QALI deals with LFS respondents who do not report their level of education.

All of the work reported in this article is exploratory. We plan to do more work on converting occupation classifications and on modelling relationships within the LFS microdata and we need to develop the ASHE-LFS benchmarking framework from proof-of-concept to a full operational process. We will report on these planned developments alongside the next QALI release, which is scheduled for October.

As always, your feedback is welcome and can be sent to productivity@ons.gov.uk or to kris.johannsson@ons.gov.uk.

2. Introduction

A Quality Adjusted Labour Index (QALI) augments traditional measures of labour input by taking account of changes in labour composition. As such, it is one measure of the effective supply of labour: changes in the hours worked of relatively high (low) productivity workers are weighted more heavily (lightly), producing an index that reflects changes in both the quantity and the quality of the labour supply.

As currently specified, QALI stratifies the employed labour force into 360 segments across four categories: education (six strata), sex (two), age group (three) and industry (10). We collect data from the Labour Force Survey (LFS) on hours worked and hourly earnings of each category in each quarter. These raw estimates are then benchmarked to industry-level estimates of hours worked and labour income. QALI indices are then compiled by weighting (log) changes in hours worked by the income weights implied by the combination of hours worked and average hourly remuneration of each QALI category. Other things equal, a QALI index will increase faster than a simple measure of hours worked when labour composition is shifting towards those categories with relatively higher hourly remuneration, for example, an increasing share of graduates in the employed labour force, or a rising share of labour employed in industries that tend to pay higher wages.
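The weighting step described above can be sketched as follows. This is a minimal illustration, not the production QALI code: the two cells and their hours and pay are made-up numbers, the function name `qali_log_change` is hypothetical, and averaging each cell's income share over the two periods (a Törnqvist-style weight) is an assumption about the exact weighting formula.

```python
import math

def qali_log_change(hours_prev, hours_curr, pay_prev, pay_curr):
    """Income-share-weighted sum of log changes in hours across cells."""
    income_prev = {k: hours_prev[k] * pay_prev[k] for k in hours_prev}
    income_curr = {k: hours_curr[k] * pay_curr[k] for k in hours_curr}
    total_prev = sum(income_prev.values())
    total_curr = sum(income_curr.values())
    change = 0.0
    for k in hours_prev:
        # Assumption: weight = average of the cell's income share in both periods
        weight = 0.5 * (income_prev[k] / total_prev + income_curr[k] / total_curr)
        change += weight * math.log(hours_curr[k] / hours_prev[k])
    return change

# Toy example (made-up data): total hours are unchanged, but hours shift
# towards the cell with higher hourly remuneration.
hours_prev = {"graduate": 100.0, "non_graduate": 100.0}
hours_curr = {"graduate": 110.0, "non_graduate": 90.0}
pay = {"graduate": 20.0, "non_graduate": 10.0}

qali_growth = qali_log_change(hours_prev, hours_curr, pay, pay)
raw_hours_growth = math.log(sum(hours_curr.values()) / sum(hours_prev.values()))
print(f"QALI log change: {qali_growth:.4f}, raw hours log change: {raw_hours_growth:.4f}")
```

In this toy case the simple hours measure does not move, while the quality adjusted measure rises because the composition of hours shifts towards the higher-paid cell.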

We have published experimental QALI estimates for a number of years. QALI indices are of some interest in their own right, but the principal reason for their compilation by the Office for National Statistics (ONS) is as a set of inputs to our multi-factor productivity (MFP) estimates. In the growth accounting literature, MFP is what is left over after subtracting the contributions to economic growth that can be ascribed to movements in capital services, movements in hours worked and movements in labour composition.

The work reported in this article is motivated by two principal drivers. First, as well as having a larger number of unique respondents than the LFS, the Annual Survey of Hours and Earnings (ASHE) has the merit of being a survey of businesses about their employees. This is widely thought to avoid some problems of reporting bias, to provide more accurate industry allocation, and to reduce the propensity to round reported hours. A further issue is that LFS collects earnings information only in the first and fifth quarterly waves, resulting in many missing pay estimates in any particular quarterly LFS dataset, which will contain cohorts from all five waves.

Second, utilising a secondary data source provides a route to delivering finer industry granularity. Some earlier work by the growth accounting team suggested that it might be feasible to expand the industry granularity of QALI from the current 10-industry specification. But it is already the case that some QALI cells are very thin (or missing entirely) on the LFS, whereas ASHE is sufficiently large to support a much more detailed granularity.

In the first instance the work reported in this article expands the industry granularity from 10 to 19 industries (all letter level industries in Standard Industrial Classification 2007: SIC 2007 apart from S, T and U which are aggregated). Subject to your feedback we intend to use this breakdown for forthcoming quarterly QALI and MFP estimates. We are planning to develop functionality for a finer industry granularity (around 60 2-digit industries) for QALI and MFP as an annual system.

The layout of the rest of the article is as follows. Section 3 explores issues arising from the use of occupational classification data for the first time. Section 4 reports some work on identifying relationships between actual and paid (or usual) hours worked. Section 5 describes an approach to adjusting LFS hourly pay estimates in terms of QALI categories to align with ASHE estimates adjusted as described in the previous section. This involves expanding the number of pay and hours observations collected from LFS to include occupation groups (as well as finer industry granularity), replacing missing pay observations with estimated equivalents, aligning to the ASHE hourly earnings estimates before re-aggregating back to the original QALI stratification. Initial results suggest that this method generates pay differentials that are similar but not identical to those from LFS alone.
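The alignment step outlined for Section 5 can be illustrated with the figures from the stylised example in Table 7. The sketch below is one reading of the arithmetic in that table rather than the operational method: the variable names are hypothetical, and the three estimated pay relatives are taken as given from the table instead of being produced by a model.

```python
ASHE_HOURLY_PAY = 19.00  # ASHE estimate for the cell being benchmarked

# (education group, LFS hourly pay or None if missing,
#  share of hours worked, model-estimated pay relative or None)
cells = [
    ("No qualifications", None, 0.02, 0.712),
    ("GCSEs or equivalent", 14.00, 0.04, None),
    ("A-levels or trade apprenticeships", None, 0.05, 0.782),
    ("Certificate of education or equivalent", None, 0.10, 0.892),
    ("First degree or other degrees", 16.50, 0.42, None),
    ("Masters and doctorates", 18.50, 0.37, None),
]

# Hours-weighted average pay across the cells with observed LFS pay
observed = [(pay, share) for _, pay, share, _ in cells if pay is not None]
lfs_average = sum(p * s for p, s in observed) / sum(s for _, s in observed)

# Pay relatives: observed where available, estimated otherwise
relatives = {
    name: (pay / lfs_average if pay is not None else estimate)
    for name, pay, _, estimate in cells
}
shares = {name: share for name, _, share, _ in cells}

# Apply the relatives to the ASHE level, then rescale so that the
# hours-weighted average matches the ASHE estimate exactly
unbenched = {name: rel * ASHE_HOURLY_PAY for name, rel in relatives.items()}
unbenched_average = sum(unbenched[n] * shares[n] for n in shares)
scale = ASHE_HOURLY_PAY / unbenched_average
benched = {name: value * scale for name, value in unbenched.items()}
benched_average = sum(benched[n] * shares[n] for n in shares)

for name in shares:
    print(f"{name}: unbenched {unbenched[name]:.2f}, benched {benched[name]:.2f}")
```

Small differences from the published table at the second decimal place are possible because the table reports rounded pay relatives.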

Appendix 1 also uses ASHE data but in the context of sectorisation of the labour market between market and non-market components, and for the purpose of deriving industry level benchmarks for sectoral hours worked and sectoral labour remuneration. Using ASHE for this purpose will have some impacts on market sector QALI that are independent of the use of ASHE component level hourly earnings.

Appendix 2 describes further proposed changes to the QALI methodology that are independent of ASHE, specifically dealing with the treatment of LFS respondents who do not report their level of education.

3. Working with occupation classifications in LFS and ASHE

To make greater use of Annual Survey of Hours and Earnings (ASHE) data in our Quality Adjusted Labour Index (QALI), it is first necessary to ensure an overlap of the characteristics that we use from the Labour Force Survey (LFS) with those available from ASHE. Our QALI methodology utilises information on education qualifications, along with age, sex and industry of employment. ASHE collects information on age, sex and industry but not on education. The closest alternative to education that is available on ASHE is occupation, which is also available on LFS. As there is a sizeable literature on the relationship between education and occupation, occupation forms our bridging variable. Before we explore this relationship further, and before we make the changes outlined previously (utilising the larger sample size from ASHE and increasing the industry granularity of our QALI estimates), it is first necessary to determine what level of occupational categories to use.

Two considerations guide the choice of occupational grouping. Firstly, a more granular categorisation would ensure that differences in hourly remuneration can be better captured. To the extent that there are notable changes in hours or earnings within an occupational category, these will be averaged away at a higher level of aggregation, but made plain with a more detailed classification. All else equal, a more detailed breakdown is therefore preferred. However, a more detailed classification could result in a large number of cells that are empty or contain few observations, reducing the quality of our estimates. The cell size resulting from a given level of classification is consequently the second consideration.

At the 2-digit level there are 25 different occupation groups (Table 1) and using so many occupational groupings would result in 17,100 QALI categories on our expanded 19-industry granularity. This would result in many categories not having any observations for hourly remuneration and other cells with a small sample of pay observations. Two-digit occupations could be amalgamated; for instance into four skill groups as shown in Table 1. However, these are quite aggregated and are likely to mask significant variation in pay and hours.
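The cell counts quoted here follow directly from multiplying the number of strata in each dimension. A short check, with the dictionary and function names chosen purely for illustration:

```python
from math import prod

# Strata in the existing non-industry QALI dimensions
existing = {"education": 6, "sex": 2, "age_group": 3}

def n_cells(industries, occupations=1):
    """Number of QALI cells for a given industry and occupation breakdown."""
    return prod(existing.values()) * industries * occupations

current = n_cells(10)        # current specification: 360 cells
two_digit = n_cells(19, 25)  # 19 industries x 25 2-digit occupation groups
one_digit = n_cells(19, 9)   # 19 industries x 9 1-digit occupation groups

print(current, two_digit, one_digit)
```

On the same basis, the nine 1-digit occupation groups would give 6,156 cells, roughly a third of the 2-digit figure.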

Table 1: Comparison of the sub-major groups of Standard Occupational Classification SOC2000 and SOC2010

Skill level	SOC 2000 code	SOC 2000 sub-major group	SOC 2010 code	SOC 2010 sub-major group
Level 4	11	Corporate managers	11	Corporate managers and directors
	21	Science and technology professionals	21	Science, research, engineering and technology professionals
	22	Health professionals	22	Health professionals
	23	Teaching and research professionals	23	Teaching and educational professionals
	24	Business and public service professionals	24	Business, media and public service professionals
Level 3	12	Managers and proprietors in agriculture and services	12	Other managers and proprietors
	31	Science and technology associate professionals	31	Science, engineering and technology associate professionals
	32	Health and social welfare associate professionals	32	Health and social care associate professionals
	33	Protective service occupations	33	Protective service occupations
	34	Culture, media and sports occupations	34	Culture, media and sports occupations
	35	Business and public service associate professionals	35	Business and public service associate professionals
	51	Skilled agricultural trades	51	Skilled agricultural and related trades
	52	Skilled metal and electrical trades	52	Skilled metal, electrical and electronic trades
	53	Skilled construction and building trades	53	Skilled construction and building trades
	54	Textiles, printing and other skilled trades	54	Textiles, printing and other skilled trades
Level 2	41	Administrative occupations	41	Administrative occupations
	42	Secretarial and related occupations	42	Secretarial and related occupations
	61	Caring personal service occupations	61	Caring personal service occupations
	62	Leisure and other personal service occupations	62	Leisure, travel and related personal service occupations
	71	Sales occupations	71	Sales occupations
	72	Customer service occupations	72	Customer service occupations
	81	Process, plant and machine operatives	81	Process, plant and machine operatives
	82	Transport and mobile machine drivers and operatives	82	Transport and mobile machine drivers and operatives
Level 1	91	Elementary trades, plant and storage related occupations	91	Elementary trades and related occupations
	92	Elementary administration and service occupations	92	Elementary administration and service occupations
Source: Office for National Statistics

Our point of departure is therefore to assess the degree to which there are differences in hourly remuneration between occupation categories and the number of empty pay cells at the level of the nine separate 1-digit Standard Occupational Classification 2010: SOC10 occupation categories shown in Table 2.

Table 2: Standard Occupational Classification (SOC) 1–digit categories

The SOC Hierarchy
Occupation Group 1	Managers, directors and senior officials
Occupation Group 2	Professional occupations
Occupation Group 3	Associate professional and technical occupations
Occupation Group 4	Administrative and secretarial occupations
Occupation Group 5	Skilled trades occupations
Occupation Group 6	Caring, leisure and other service occupations
Occupation Group 7	Sales and customer service occupations
Occupation Group 8	Process, plant and machine operatives
Occupation Group 9	Elementary occupations
Source: Office for National Statistics

To examine the extent of differences in earnings within skill groups and across 1-digit occupational groups we adopt a regression approach. Regressions on the log of hourly remuneration in ASHE over the period 1997 to 2015 (using a modal mapping of earlier SOC classifications, see the SOC conversion sub-section later in this section) show that there are quite substantial differences in hourly pay between 1-digit occupation categories that fall within the same skill level, after controlling for other factors likely to affect hourly pay. Model 1 includes only occupation group and year. Each subsequent model adds further control variables: Model 2 adds industry controls, Model 3 adds age group controls to Model 2, and Model 4 adds controls for sex to Model 3 (Table 3).

Table 3: Modelling pay by occupation

	Model 1	Model 2	Model 3	Model 4
Dependent variable	ln (hourly pay)
Controls	Year	+ Industry	+ Age Group	+ Sex

Professional occupations	0.000750	0.0116***	0.0344***	0.0477***
	(0.77)	(11.89)	(35.76)	(50.30)

Associate professional and technical occupations	-0.205***	-0.223***	-0.198***	-0.190***
	(-199.84)	(-222.64)	(-201.20)	(-196.36)

Administrative and secretarial occupations	-0.612***	-0.625***	-0.593***	-0.530***
	(-637.25)	(-667.50)	(-643.77)	(-571.77)

Skilled trades occupations	-0.581***	-0.563***	-0.536***	-0.564***
	(-513.72)	(-505.14)	(-490.25)	(-522.26)

Caring, leisure and other service occupations	-0.820***	-0.756***	-0.715***	-0.669***
	(-737.35)	(-660.62)	(-635.24)	(-599.31)

Sales and customer service occupations	-0.916***	-0.831***	-0.772***	-0.716***
	(-875.80)	(-778.02)	(-729.71)	(-678.82)

Process, plant and machine operatives	-0.677***	-0.706***	-0.688***	-0.701***
	(-596.24)	(-620.94)	(-616.45)	(-638.39)

Elementary occupations	-0.929***	-0.864***	-0.831***	-0.815***
	(-937.35)	(-875.04)	(-855.79)	(-851.50)

R²	0.4758	0.5146	0.534	0.5487
N	3377976	3377976	3377976	3377976

t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
Source: Office for National Statistics
Notes:
1. Estimated coefficients can be interpreted as logs of hourly pay relative to the control group (Managers, directors and senior officials). For instance, Model 4 suggests that workers in Elementary occupations earn around exp(-0.815) = ~0.44 of the hourly pay of the control group.

As expected, higher-numbered occupation groups (that is, lower skill groups) tend to receive lower levels of hourly remuneration. The regressions also show that associate professional and technical occupations receive significantly more pay than skilled trades occupations (Model 4 coefficients of -0.190 and -0.564 respectively), despite being in the same skill grouping in Table 1. Using 1-digit occupation groups would therefore capture differences in labour quality better than using skill levels, while being likely to produce fewer cells with low observation counts than a full 2-digit breakdown.
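Because the dependent variable is in logs, the coefficients in Table 3 convert to pay relatives by exponentiation, as in Note 1 to the table. The sketch below applies this to the Model 4 column; the dictionary and variable names are illustrative only.

```python
import math

# Model 4 coefficients from Table 3 (log points relative to
# Managers, directors and senior officials)
model4 = {
    "Professional occupations": 0.0477,
    "Associate professional and technical occupations": -0.190,
    "Administrative and secretarial occupations": -0.530,
    "Skilled trades occupations": -0.564,
    "Caring, leisure and other service occupations": -0.669,
    "Sales and customer service occupations": -0.716,
    "Process, plant and machine operatives": -0.701,
    "Elementary occupations": -0.815,
}

# Hourly pay relative to the control group
relatives = {occupation: math.exp(coef) for occupation, coef in model4.items()}

# Two groups in the same skill level (Level 3) in Table 1
associate_vs_skilled = (
    relatives["Associate professional and technical occupations"]
    / relatives["Skilled trades occupations"]
)

for occupation, relative in relatives.items():
    print(f"{occupation}: {relative:.2f}")
print(f"Associate professional relative to skilled trades: {associate_vs_skilled:.2f}")
```

The last figure is the ratio of two estimated relatives, so it shares the conditioning of Model 4 (year, industry, age group and sex).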

SOC conversion

In order to use ASHE data from 1997 it is necessary to convert earlier Standard Occupational Classification codes (SOC90 and SOC00) into SOC10. There are a number of different methods that can be used to map previous SOC codes to SOC10, most of which depend on correspondence tables that draw on dual coded observations for a limited period, which show how each old classification maps to a new one. For instance, the conversion of SOC90 to SOC00 codes for LFS data was done using correspondence tables produced from dual-coded LFS data from winter 2000 to 2001. The SOC00 to SOC10 conversion uses a correspondence matrix derived from the dual coding of LFS for winter 1996 to 1997, the 2001 Census and the first quarter (January to March) of 2007.

One method of conversion using these data is modal conversion. An example is that for women in SOC90 code 345 (dispensing opticians), the relationship from the correspondence tables is that 75% are coded to SOC00 code 3216 (dispensing opticians) and 25% are coded to SOC00 code 2214 (ophthalmic opticians). Using a modal conversion, all SOC90 code 345 records would be mapped to SOC00 code 3216. A drawback of this method is that, as in this example, correspondence tables generally do not map each old code to a single new code, so a modal conversion discards the minority mappings (here, the 25% of records coded to SOC00 code 2214).
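A minimal sketch of a modal conversion, using the dispensing opticians example. The data structure, the function name `modal_map` and the sample record are assumptions for illustration; in practice the correspondence shares may differ by sex or other characteristics, as the example above suggests.

```python
# Correspondence shares for one old code (from the example above)
correspondence = {
    "345": {"3216": 0.75, "2214": 0.25},  # SOC90 345 -> SOC00 codes
}

def modal_map(table):
    """Map each old code to the new code with the largest share."""
    return {old: max(shares, key=shares.get) for old, shares in table.items()}

modal = modal_map(correspondence)

# Hypothetical microdata record: every SOC90 345 record receives the modal code
records = [{"soc90": "345", "hours": 37.5}]
converted = [{**record, "soc00": modal[record["soc90"]]} for record in records]

print(converted)
```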

An alternative method is to use a one-to-many mapping, proportionately splitting existing records and weighting them accordingly. So for the previous example each record for SOC90 code 345 (dispensing opticians) would be split into two: one with SOC00 code 3216 (dispensing opticians) with a weight of 0.75 and another with SOC00 code 2214 (ophthalmic opticians) with a weight of 0.25. This more accurately reflects the relationship of the mapping, but at the cost of significantly increasing the size and complexity of the dataset. This is particularly apparent when converting occupational classifications more than once. For example, where a SOC90 code is converted to 10 different SOC00 codes and each of these is then converted to 10 SOC10 codes, the original SOC90 record will be split into 100 separate records in terms of SOC10, many of which are likely to have negligible weights.
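The proportional split, and the way record counts multiply under chained conversions, can be sketched as follows. The function name, record layout and the evenly split 10-by-10 correspondence tables are hypothetical; only the 75%/25% shares for SOC90 code 345 come from the correspondence table example.

```python
def proportional_split(record, table, old_key, new_key):
    """Split one record into weighted copies, one per mapped new code."""
    pieces = []
    for new_code, share in table[record[old_key]].items():
        piece = dict(record)
        piece[new_key] = new_code
        piece["weight"] = record["weight"] * share
        pieces.append(piece)
    return pieces

# Dispensing opticians example: one record becomes two weighted records
soc90_to_soc00 = {"345": {"3216": 0.75, "2214": 0.25}}
record = {"soc90": "345", "weight": 1.0}
split = proportional_split(record, soc90_to_soc00, "soc90", "soc00")

# Hypothetical chained conversion: 1 old code -> 10 codes -> 10 codes each
first = {"A": {f"B{i}": 0.1 for i in range(10)}}
second = {f"B{i}": {f"C{i}{j}": 0.1 for j in range(10)} for i in range(10)}

stage1 = proportional_split({"old": "A", "weight": 1.0}, first, "old", "mid")
stage2 = [
    piece
    for intermediate in stage1
    for piece in proportional_split(intermediate, second, "mid", "new")
]

print(len(split), len(stage2), sum(p["weight"] for p in stage2))
```

The total weight is preserved at each stage, so aggregates such as hours worked are unchanged overall while being redistributed across the new codes.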

Figure 1 shows the proportion of hours worked in each occupation group in the LFS using a modal mapping, and Figure 2 shows the equivalent proportions using a proportional mapping. At the changes of SOC classification in 2001 and 2011, the shares of hours worked in the 1-digit occupation categories shift much less in Figure 2 than in Figure 1.

Figure 1: Percentage of hours worked in first jobs in Labour Force Survey for each 1-digit occupation group using modal mapping

UK, 1997 to 2015

Source: Office for National Statistics

Notes:

Occupational groups 1 to 9 as described in Table 2.

Figure 2: Percentage of hours worked in first jobs in Labour Force Survey for each 1-digit occupation group using proportional mapping

UK, 1997 to 2015

Source: Office for National Statistics

Figure 3: Percentage of hours worked in Annual Survey of Hours and Earnings for each 1-digit occupation group using modal mapping

UK, 1997 to 2015

Source: Office for National Statistics

Figure 4: Percentage of hours worked in Annual Survey of Hours and Earnings for each 1-digit occupation group using proportional mapping

UK, 1997 to 2015

Source: Office for National Statistics

Table 4: Regression of actual to paid hours ratio on categories of employees in Labour Force Survey

Dependent variable	acthr/usuhr
year	0.000499***
	(-230.41)
30 to 49 years	-0.00548***
	(-4.97)
50 to 99 years	-0.0203***
	(-17.34)
Male	0.0479***
	(-49.85)
Professional occupations	-0.0212***
	(-13.04)
Associate professional and technical occupations	-0.0448***
	(-26.70)
Administrative and secretarial occupations	-0.0485***
	(-27.13)
Skilled trades occupations	-0.0747***
	(-40.98)
Caring, leisure and other service occupations	-0.0730***
	(-36.57)
Sales and customer service occupations	-0.0494***
	(-23.88)
Process, plant and machine operatives	-0.0766***
	(-36.11)
Elementary occupations	-0.0714***
	(-38.90)
R²	0.8364
N	1015447
t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
Industry controls all significant at p < 0.001
Source: Office for National Statistics

Figure 5: Actual:paid hours ratios by occupation

UK, 1997 to 2015

Source: Office for National Statistics

Table 5: Labour Force Survey cells with missing hours worked or pay estimates by occupation, 2011 to 2015

1-digit level occupation category	1	2	3	4	5	6	7	8	9	Total
Hours missing	20.0%	22.0%	13.0%	15.3%	33.7%	41.9%	29.9%	46.3%	25.8%	27.6%
Hours present, pay missing	12.3%	11.3%	10.3%	11.5%	16.2%	16.4%	18.2%	23.6%	13.7%	14.3%
as % of hours worked	1.3%	0.4%	0.9%	1.0%	1.2%	1.1%	1.6%	1.8%	1.2%	1.0%
Source: Office for National Statistics

Table 6: Sample regression results

Dependent variable	ln (hourly pay)

Occupation group	1	3	4	7
Year	2015	2015	2015	2015
Controls	Age/sex/industry

GCSEs or equivalent	0.209*	0.154*	0.128**	0.0631
	(2.52)	(1.97)	(2.95)	(1.78)

A-levels or trade apprenticeships	0.280***	0.248**	0.159***	0.179***
	(3.39)	(3.18)	(3.56)	(4.68)

Certificate of Education or equivalent	0.418***	0.287***	0.223***	0.219***
	(4.80)	(3.59)	(4.62)	(4.39)

First Degrees and other degrees	0.490***	0.394***	0.276***	0.261***
	(5.96)	(5.06)	(5.96)	(5.99)

Masters and doctorates	0.644***	0.539***	0.339***	0.278***
	(7.36)	(6.61)	(5.99)	(3.50)

R²	0.1284	0.1281	0.0746	0.1656
N	4008	5457	5030	3396

t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
Source: Office for National Statistics
Notes:
1. Occupation groups are described in Table 2.

Figure 6: Estimated coefficients on education controls

Process, plant and machine operatives, UK, 1997 to 2015

Source: Office for National Statistics

Figure 7: Number of hourly pay observations for education group 6

Lower skilled occupations, UK, 1997 to 2015

Source: Office for National Statistics

Figure 8: Confidence intervals for HQ6 estimated coefficients

Process, plant and machine operatives, UK, 1997 to 2015

Source: Office for National Statistics

Figure 9: Estimated coefficients on education controls

Sales and customer service occupations, UK, 1997 to 2015

Source: Office for National Statistics

Table 7: Stylised example

Source	Education group	Hourly pay	Hours worked (%)	Pay relative (from LFS)	Pay relative (estimated)	Adjusted hourly pay (unbenched)	Adjusted hourly pay (benched)
LFS	No qualifications	.	2%		0.712	£13.53	£13.92
	GCSEs or equivalent	£14.00	4%	0.811		£15.40	£15.84
	A-levels or trade apprenticeships	.	5%		0.782	£14.87	£15.29
	Certificate of education or equivalent	.	10%		0.892	£16.95	£17.43
	First degree or other degrees	£16.50	42%	0.955		£18.15	£18.66
	Masters and doctorates	£18.50	37%	1.071		£20.35	£20.93
	Weighted average	£17.27	100%			£18.48	£19.00

ASHE		£19.00
Source: Office for National Statistics

Provisional results

Figure 10: Hourly pay relatives by education

UK, 2015

Source: Office for National Statistics

Figure 11: Hourly pay relatives; males

UK, 2002 to 2015

Source: Office for National Statistics

Figure 12: Hourly pay relatives, age 16 to 29

UK, 2002 to 2015

Source: Office for National Statistics

Figure 13: Hourly pay relatives by occupation

UK, 2015