Preliminary estimates of quarterly UK public service productivity, inputs and output did not systematically under or overestimate the growth rate relative to the later estimates.
There is no statistically significant bias in the quarterly UK public service productivity, inputs and output (Experimental Statistics) growth revisions at any horizon past the initial estimate, from one quarter to seven quarters after the first published estimate.
The effects of methodological changes in the quarterly national accounts (QNA) due to Blue Book publications and internal development work done by the public service productivity team are reflected in the revisions.
The methodological changes have improved the accuracy of the quarterly estimates compared with the annual estimates over most of the time series; the outcome is mixed for 2015.
There are not enough observations to analyse the reliability of the timeliest estimates or nowcast across time.
The Office for National Statistics (ONS) produces annual estimates of total public service productivity as well as breakdowns of specific service areas including healthcare, education and adult social care. The annual estimates are compiled using data from finalised sources and use established and statistically sound methods. However, these annual estimates are produced with a two-year lag, limiting their value in public sector planning. This lag is due to the time needed to compile and publish the data necessary to calculate the productivity estimates from estimates of inputs and output. To address this, we publish the Quarterly UK public service productivity (Experimental Statistics) estimates to provide a short-term, timely indicator of the future path of annual productivity.
The quarterly estimates are published on an experimental basis, compiled using raw data that are not final and/or are themselves estimated. The methodology used to compile the experimental statistics is explained in New nowcasting methods for more timely quarterly estimates of UK total public service productivity.
Public service productivity estimates operate an open revisions policy. This means that new data and/or methods can be incorporated at any time and will be implemented for the entire time series. This article analyses the revisions of the quarterly experimental time series starting in Quarter 1 (Jan to Mar) 2011 and ending in Quarter 1 2016 (extending forward in time by one quarter with each new estimate). The first experimental quarterly estimates were published in July 2016, followed by seven later releases, the latest (eighth) being published in April 2018. This means that the productivity, inputs and output estimates could have been revised up to seven times.
Annex A: Technical Annex contains more detailed information about the dates of publication of all the estimates and how the revisions and related summary statistics are calculated. It also contains information on other methods of analysing revisions.
The quality of a statistic can be measured by its accuracy, timeliness and reliability. Revisions analysis examines the reliability of estimates over time. It examines if revisions to preliminary estimates are significantly different from zero. If they are, then the initial estimates are biased relative to the later estimates, making them unreliable as they would systematically under or overestimate the final value of a variable.
Quarterly productivity estimates are revised for four main reasons:
changes to raw data (this having many causes itself)
changes to methodology
effect of seasonal adjustment
correction of errors
It is important for users of data that the estimates are published without errors that lead to statistically significant revisions. Revisions should ideally improve the accuracy of estimates, for example, updates to incomplete raw data from final returns and/or late submissions, but not have a systematic and identifiable pattern.
Revisions may also be caused by methodological changes to the way estimates of variables are calculated or compiled. The aim of changing methodology is to improve the accuracy of published estimates, therefore there may be a one-off systematic revision between vintages of estimates at a specific date. Users requiring timelier data should expect to see revisions to the estimates. This is especially common for estimates of very recent periods due to the more intense use of provisional or preliminary data and statistical estimation methods used to produce the estimates.
Methodological changes to quarterly UK public service productivity estimates come from two sources: indirect methodological changes apparent in revised data in the quarterly national accounts (QNA), and direct methodological changes made by the public service productivity team in the development of the productivity, inputs and output estimates. The experimental estimates are developed to bring them, where possible, closer to the methodology of Public service productivity estimates: total public service, UK: 2015 and to improve their accuracy and quality relative to the annual estimates.
The national accounts (NA) provide the expenditure data used in the compilation of public service productivity. The NA are revised through Blue Book publications, as the Office for National Statistics (ONS) needs to comply with international accounting standards such as The European System of National and Regional Accounts 2010: ESA 2010. This includes methodology and classification changes that revise the data used for estimating public service productivity from inputs and output. Changes introduced through Blue Book 2017 are covered in the following section’s analysis. These changes revised the raw data through new methodology to improve the alignment of public sector finances and national accounts, a change in the base year from 2013 to 2014, and general national accounts methodology changes.
To carry out the revisions analysis to detect bias in revisions to preliminary estimates, this article follows the advice and methodology published by the Organisation for Economic Co-operation and Development (OECD). It focuses on the average or expected revision and whether it is significantly different from zero. The article also breaks down the mean squared revision into its causes and presents a selection of other summary statistics to help to describe the revisions. All quarterly estimates analysed in Section 4 are seasonally adjusted. All comparisons in Section 5 are against non-quality adjusted annual estimates of public service productivity.
The preliminary estimates of Quarterly UK public service productivity did not systematically under or overestimate productivity growth relative to the later estimates. The average revisions between preliminary and later estimates of productivity cannot be confidently distinguished from zero. The mean revisions between all estimates have no statistically significant bias to systematically under or overestimate the final value. This suggests that the average revisions have a value of zero in their confidence interval or, equivalently, the null hypothesis of a zero mean revision cannot be rejected for a given significance level (5% was used in this article).
Figure 1 presents the time series of the UK public service productivity percentage growth rate published in July 2016 (estimate 1) as the blue line and published in April 2018 (estimate 8) as the yellow line. The blue bars show the percentage point revisions between the two estimates. The conceptual or technical difference between the estimates published in July 2016 and April 2018 is potentially the largest between any two releases, because it accumulates all the interim data and methodology changes.
It shows relatively large negative revisions in Quarter 1 (Jan to Mar) 2011, Quarter 2 (Apr to June) 2011 and Quarter 1 2013; and relatively large positive revisions in Quarter 2 2012, Quarter 3 (July to Sept) 2013, Quarter 2 2014 and Quarter 1 2016. The other revisions are a mix of positive and negative as well as smaller in size.
Between the July 2016 and April 2018 publications, the estimates have gone through a number of direct and indirect methodological changes. Many deflators used to calculate volume measures have been improved. Constant price (deflated expenditure) labour measures have been replaced with direct volume measures. The method of application of seasonal adjustment of time series has changed and there have been methodological changes to the use of raw data from the quarterly national accounts (QNA).
Also, the raw data that are used to compile the QNA have gone through revisions due to their timely nature, just like the estimates being analysed in this article. The revision between the first and eighth published estimates is on average negative 0.05 percentage points, but given the size of the standard error of 0.11 percentage points, it is very likely that it is non-zero only because of “noise”.
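As a rough check of this statement, the test can be worked through with the rounded figures quoted above. The sketch assumes 21 common observations (Quarter 1 2011 to Quarter 1 2016), so 20 degrees of freedom and a two-sided 5% critical value of about 2.086; because the inputs are rounded, the results are approximate.

```latex
\[
t = \frac{\bar{R}}{\operatorname{se}(\bar{R})} \approx \frac{-0.05}{0.11} \approx -0.45,
\qquad |t| < t_{0.975,\,20} \approx 2.086
\]
\[
\bar{R} \pm t_{0.975,\,20}\,\operatorname{se}(\bar{R})
\approx -0.05 \pm 2.086 \times 0.11
\approx [-0.28,\; 0.18]
\]
```

The interval contains zero, so the null hypothesis of a zero mean revision is not rejected at the 5% level.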
Figure 2 compares the same published series as in Figure 1. However, to account for impact of methodology changes, both series are calculated on the same (most up-to-date) basis, using the data that were available at the time.
Figure 2 takes away the effects of revisions due to direct methodology changes made by the public service productivity team. It retains indirect methodological changes due to Blue Book 2017 and raw data changes between the two vintages. The chart shows that the revisions are a mix of positive and negative changes but much smaller in size than in Figure 1.
The largest revision is in the estimate for Quarter 1 2016, the timeliest estimate in July 2016. This is expected as timely data are based on incomplete or preliminary raw data sources that are liable to change. This suggests that if the most up-to-date methodology was available back in July 2016, there would be very little or no change in the reported trend of productivity growth. The mean revision is 0.02 percentage points, but given the size of the standard error of 0.06 percentage points, it is very likely that it is non-zero due to “noise”.
This section covers the revisions between the initial published experimental quarterly productivity estimates (Quarterly public service productivity (Experimental Statistics): January to March 2016 – published July 2016) and all subsequently published estimates. The revisions are summarised, but the data and detailed analysis of results are published alongside for any user wanting information regarding specific revisions.
Table 1 contains a selection of summary statistics that describe the revisions between estimate 1 and all subsequently published estimates.
Table 1: Summary statistics for revisions between the first published estimate of seasonally adjusted quarterly UK public service productivity and subsequent estimates
|Summary Statistic of Revisions between Estimate 1 (Jul 2016) and Estimates published:||1 quarter later (Oct 2016)||2 quarters later (Jan 2017)||3 quarters later (Apr 2017)||4 quarters later (Jul 2017)||5 quarters later (Oct 2017)||6 quarters later (Jan 2018)||7 quarters later (Apr 2018)|
|Standard Error - Heteroskedasticity and Autocorrelation Consistent||0.05||0.07||0.10||0.10||0.10||0.11||0.11|
|Source: Office for National Statistics|
|1. Standard error is adjusted for serial correlation using the Newey-West estimator.|
|2. t-statistic is obtained by dividing the mean by the standard error.|
|3. t-critical comes from the Student's t table for (n-1) degrees of freedom and 5% significance level.|
|4. Figures may not sum due to rounding.|
Table 1 highlights that the mean revisions are negative and close to zero, suggesting that the initial estimate might have overestimated productivity growth. However, once the size of the relevant standard errors (adjusted for serial correlation as explained in the technical annex) of the revisions are taken into consideration, it shows that the confidence intervals of the mean revisions contain the value of zero. This suggests that the mean revisions are not significantly different from zero, therefore the initial estimate did not systematically overestimate productivity growth of UK public services relative to the later estimates.
Other statistics (from the Summary Statistics Tables) show that the later estimates showed a similar trend to the first estimate. The proportion of the first and later estimates having the same direction of productivity growth fell initially but remained high as more estimates were published. The proportion of later estimates having a higher value than the first estimate also indicates that the estimates do not differ greatly. To summarise this, Theil’s U (UII) is a measure of how good the preliminary estimates are compared against an estimate of no change. A value between zero and one indicates that the preliminary estimates are better than a no change estimate; a value of one or more indicates that the estimates are no better than a no change estimate. For the first published estimate, the value of Theil’s U increases with the revision horizon but stays below one, indicating that the first published estimate is a good preliminary estimate.
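To make the "no change" benchmark concrete, the following is a minimal sketch of one common form of Theil's U for revisions: the root of the summed squared revisions divided by the root of the summed squared later estimates. The function name and the sample growth rates are illustrative assumptions, and the exact formula used for the published tables may differ.

```python
import math


def theil_u2(preliminary, later):
    """One common form of Theil's U: sqrt(sum((L - P)^2) / sum(L^2)).

    With growth rates, a 'no change' preliminary estimate is zero growth
    in every period, which gives a value of exactly 1.
    """
    sum_sq_revisions = sum((l - p) ** 2 for p, l in zip(preliminary, later))
    sum_sq_later = sum(l ** 2 for l in later)
    if sum_sq_later == 0:
        raise ValueError("later estimates are all zero; U is undefined")
    return math.sqrt(sum_sq_revisions / sum_sq_later)


# Made-up quarterly growth rates (%), purely for illustration.
later = [0.2, -0.1, 0.3, 0.4]
preliminary = [0.1, -0.1, 0.2, 0.5]
no_change = [0.0, 0.0, 0.0, 0.0]

u_preliminary = theil_u2(preliminary, later)
u_no_change = theil_u2(no_change, later)
print(round(u_preliminary, 3), u_no_change)
```

Values below 1 mean the preliminary estimate tracks the later estimate more closely than a forecast of zero growth would.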
The average size of the revisions between the first published estimate and subsequent estimates is represented by the mean absolute revision (MAR). The MAR shows the average distance between preliminary and later estimates while not considering direction. The use of the mean squared revision (MSR), similarly to the MAR, also gets around the problem of positive and negative revisions cancelling each other out so it can be used together with MAR to compare how the average size of the revisions evolved with subsequent releases. Figure 3 investigates this by plotting the MAR and MSR between the first published estimate and later estimates.
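The two measures can be sketched in a few lines. The function names and the sample growth rates below are assumptions made for illustration, not published figures; a revision is taken as the later estimate minus the preliminary one, as in the technical annex.

```python
def revisions(preliminary, later):
    """Percentage point revisions: later estimate minus preliminary estimate."""
    return [l - p for p, l in zip(preliminary, later)]


def mean_revision(preliminary, later):
    r = revisions(preliminary, later)
    return sum(r) / len(r)


def mean_absolute_revision(preliminary, later):
    r = revisions(preliminary, later)
    return sum(abs(x) for x in r) / len(r)


def mean_squared_revision(preliminary, later):
    r = revisions(preliminary, later)
    return sum(x * x for x in r) / len(r)


# Made-up quarterly growth rates (%), purely for illustration.
preliminary = [0.4, -0.2, 0.1, 0.3]
later = [0.2, 0.0, 0.1, 0.5]

mean_rev = mean_revision(preliminary, later)
mar = mean_absolute_revision(preliminary, later)
msr = mean_squared_revision(preliminary, later)
print(round(mean_rev, 3), round(mar, 3), round(msr, 3))
```

In this example the positive and negative revisions largely cancel in the mean revision, while the MAR and MSR still register their size.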
These measures show that the revisions initially got bigger in size on average, as more estimates were published, but then fell slightly and remained stable. You would expect the size of the revisions to get bigger as the horizon between estimates increased, as the estimates would differ from each other more over time. This shows that the reliability of the first published estimate remained stable over time.
To understand this further, the MSR can be broken down into the underlying causes, based on a quadratic loss function as explained in the technical annex. UM represents the proportion of the MSR due to the mean revision being different from zero. UR represents the proportion of the MSR caused by systematic differences between estimates. UD shows the proportion of the MSR that is caused by noise. Good preliminary estimates have high proportions of UD and low proportions of UM and UR.
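A minimal sketch of this breakdown follows, assuming the standard Theil decomposition of a mean squared error into bias, regression and disturbance parts (population standard deviations, dividing by n). The function name and data are illustrative, and the technical annex remains the reference for the exact definitions used.

```python
import math


def decompose_msr(preliminary, later):
    """Split the mean squared revision into proportions (UM, UR, UD).

    Assumes both series vary and are not identical, so that the standard
    deviations and the MSR are non-zero.
    """
    n = len(preliminary)
    mean_p = sum(preliminary) / n
    mean_l = sum(later) / n
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in preliminary) / n)
    sd_l = math.sqrt(sum((l - mean_l) ** 2 for l in later) / n)
    cov = sum((p - mean_p) * (l - mean_l)
              for p, l in zip(preliminary, later)) / n
    rho = cov / (sd_p * sd_l)  # correlation between the two vintages

    msr = sum((l - p) ** 2 for p, l in zip(preliminary, later)) / n
    um = (mean_l - mean_p) ** 2 / msr        # mean revision differs from zero
    ur = (sd_p - rho * sd_l) ** 2 / msr      # systematic (regression) part
    ud = (1 - rho ** 2) * sd_l ** 2 / msr    # noise (disturbance) part
    return um, ur, ud


# Made-up quarterly growth rates (%), purely for illustration.
preliminary = [0.4, -0.2, 0.1, 0.3, -0.1]
later = [0.2, 0.0, 0.1, 0.5, -0.3]

um, ur, ud = decompose_msr(preliminary, later)
print(round(um, 3), round(ur, 3), round(ud, 3))
```

The three proportions sum to one by construction, which is why they can be shown as shares of the MSR.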
As Figure 4 shows, estimate 1 is a good preliminary estimate as UD accounts for a high proportion of the MSR across the revision horizons. The increased proportion of UR (the proportion of the MSR due to systematic differences) from the fourth estimate onwards reflects the change in methodology to use direct measures of labour input. The inclusion of methodological changes in quarterly national accounts data due to the publication of Blue Book 2017, which coincided with the sixth estimate, increased the systematic proportion of the MSR slightly.
To show the effects of indirect methodology changes and raw data changes, Figure 5 shows the MSRs and MARs between all eight quarterly public service productivity estimates processed using the methodology used in the publication of the eighth estimate. The raw data used are as they were available at the time of publication of the preliminary estimates. Therefore, it is said to be processed on a consistent internal methodology basis.
Figure 5 shows that removing the effects of direct methodology changes on revisions decreases both the MARs and MSRs for all comparisons. As a result, Figure 5 shows where productivity estimates were affected by both raw data changes and indirect methodology improvements due to Blue Book 2017, the latter being present from the sixth publication (October 2017) onwards.
As before, Figure 6 breaks down the causes of the MSRs between the estimates published using consistent internal methodology.
Figure 6 shows that external methodology changes caused little systematic difference between the first estimate and subsequent estimates. The clear majority of the MSR was caused by “noise” through the changes in the raw data used. It also confirms that the main reason for the systematic differences present in the MSR between preliminary and later published estimates (Figure 4) is the internal methodology changes. Examples of these changes include the choice to use direct measures of labour input and the method of aggregating service areas together for inputs, both of which coincided with the fourth release.
The same analysis was carried out for growth estimates of inputs and output of UK public services. The preliminary estimates of inputs and output did not consistently under or overestimate the growth in the variables relative to later estimates. There is no statistically significant bias in the revisions to preliminary estimates of growth of UK public service inputs and output. Most of the revisions to productivity are caused by revisions to inputs, as output estimates are relatively more stable over time.
The quarterly UK public service productivity estimates are published due to the need for timelier estimates of UK public service productivity. The experimental quarterly measure is used to nowcast the annual estimate; informing the likely path that the annual measure might take in the future. This section sheds light on the usefulness of the experimental estimates as nowcasts of the annual estimates. However, the analysis is limited due to the small number of observations available. The reliability of the timeliest estimates also cannot be thoroughly analysed (through detection of statistically significant bias in revisions) because of the limited number of observations.
Figure 7 compares the annual non-quality adjusted UK public service productivity index, from 1997 to 2015, and the annualised (see note 1) experimental Quarterly UK public service productivity estimates covering the same period. The quarterly experimental estimates are not benchmarked to the annual estimates.
The annual estimate for non-quality adjusted productivity growth in 2015 (represented by the blue line) was published at the same time as the seventh estimate, therefore the first six experimental estimates were used to nowcast the path of the annual statistic.
Figure 7 shows how the timelier quarterly estimates (first to eighth) perform against the annual statistic (Annual public service productivity estimates), as published in Public service productivity estimates: total public service, UK: 2015. According to the annual estimate, non-quality adjusted productivity is around 6% lower in 2015 than it was in 1997. We can see from Figure 7 that all the experimental measures underestimate the productivity level when compared with the annual estimate. The first quarterly estimate underestimates productivity growth the most, suggesting productivity was around 12% lower in 2015 than in 1997. This has improved through methodological improvements in subsequent iterations of the quarterly estimate. The latest quarterly estimate underestimates productivity growth the least, suggesting that public service productivity is about 8% lower in 2015 than in 1997.
The first noticeable improvement is the publication of the fourth estimate, where major direct methodological changes were made. Examples include:
volume of labour input was measured directly for the first time
changes to the way service area inputs were aggregated together
changes to the way volume of output was seasonally adjusted
other changes to price deflators
The next noticeable improvement was made with the publication of the sixth estimate, especially for the estimates of more recent productivity levels. The sixth estimate included indirect methodological changes as a result of publication of Blue Book 2017, as well as other minor revisions to direct labour input measures and price deflators.
Although the number of observations is small and the statistical inference that can be drawn is limited, Figure 7 shows that methodological changes (direct or indirect) have improved the accuracy of the nowcast relative to the annual productivity estimates. The revisions shown in Figure 7 therefore illustrate the purpose of methodological changes, which is to improve the accuracy of the quarterly estimates relative to the annual estimates.
The indices presented in Figure 7 show that the quarterly estimates have improved their relative longer-term fit compared with the annual estimates. However, to assess the experimental estimates’ usefulness as nowcasts of the annual estimates, it is useful to compare the estimates of productivity growth of the most recent period. Figure 8 compares the estimates for UK public service productivity growth for 2015, from the annual statistic and the annualised quarterly statistics.
The first three experimental estimates underestimated the annual estimate of 0.2% growth in UK public service productivity, instead suggesting contractions in productivity of 0.2%, 0.6% and 0.4% respectively. The direct methodological changes helped to improve the relative accuracy of the quarterly measure: the fourth and fifth estimates showed the correct direction and only very slightly overestimated productivity growth, suggesting productivity growth of 0.3%. The indirect methodological changes made through Blue Book 2017 appear to have contributed to a deterioration in the relative accuracy of the nowcast. The sixth, seventh and eighth estimates underestimate productivity growth, although they do indicate the correct direction, suggesting productivity growth of 0.0%, 0.1% and 0.1% respectively.
The analysis of the data outlined in Figure 8 needs to be considered in the context of the relationship between the annual and quarterly estimates. Figure 8 compares the relative accuracy of the quarterly estimates with the current annual estimates for 2015 as they are available at the time of writing. However, the current estimates of annual productivity growth may be revised in the upcoming release of the annual productivity estimates in January 2019. Therefore, to better assess the relative accuracy of the nowcast, more time needs to pass and more annual estimates need to be published. To investigate the relative accuracy of the quarterly estimates further, Figure 9 carries out a similar exercise, comparing the annualised quarterly estimates of productivity growth for 2014 with annual estimates from the January 2018 and January 2017 publications.
Figure 9 shows that the annual estimates are subject to revision too, especially for recent periods. Figure 9 shows that in January 2017 the annual estimate for non-quality adjusted productivity growth was 0%. This meant that all quarterly estimates overestimated productivity growth and quarterly estimates were relatively inaccurate. However, as new and updated annual estimates were published in January 2018, the current annual estimate for 2014 non-quality adjusted productivity growth is 0.4%. The revision to the annual estimates has improved the relative accuracy of the quarterly estimates. This shows that more estimates of quarterly and annual measures need to be published to better assess the relative accuracy of the nowcast.
Growth in productivity can be proxied by the difference between output growth and inputs growth, or measured as the growth in the index of the ratio of the output and inputs indices. Figure 7 suggests that the experimental estimates underestimate the growth in non-quality adjusted productivity. This can result from overestimating inputs growth, underestimating output growth or a combination of both. Applying similar analysis to quarterly inputs and output estimates, the findings suggest that both series overestimate the growth of the variables relative to the annual estimates, although inputs were overestimated to a greater extent. The latest experimental estimates of inputs overestimate growth from 1997 to 2015 relative to the national statistic by 6 percentage points. The latest experimental estimates of output overestimate growth from 1997 to 2015 relative to the national statistic by 2 percentage points. This leads the latest experimental estimates of productivity to underestimate growth from 1997 to 2015 relative to the national statistic by 2 percentage points.
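The link between the two ways of expressing productivity growth can be written out as a short identity. In this sketch, g_O, g_I and g_P are symbols chosen here for the growth rates of output, inputs and productivity, expressed as decimal fractions:

```latex
\[
1 + g_P = \frac{1 + g_O}{1 + g_I}
\quad\Longrightarrow\quad
g_P = \frac{g_O - g_I}{1 + g_I} \approx g_O - g_I
\]
```

The approximation is close when inputs growth is small, which is why the simple difference works as a proxy for quarterly or annual rates but drifts from the ratio-based figure over long spans such as 1997 to 2015.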
Notes for: Quarterly public service productivity (Experimental Statistic) compared with Annual public service productivity (National Statistic)
- The quarterly indices are annualised by calculating the average value of the four quarters in each calendar year. This means that a rolling average of the quarterly index is calculated where the calculation moves forward by four quarters each time.
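The annualisation described in this note can be sketched as follows. The function name, the assumption that the series starts in Quarter 1, and the index values are all illustrative; incomplete years at the end of the series are dropped.

```python
def annualise(quarterly_index, first_year):
    """Average each calendar year's four quarterly index values.

    Assumes quarterly_index starts at Quarter 1 of first_year.
    A trailing year with fewer than four quarters is ignored.
    """
    annual = {}
    for start in range(0, len(quarterly_index) - 3, 4):
        year = first_year + start // 4
        annual[year] = sum(quarterly_index[start:start + 4]) / 4
    return annual


# Made-up quarterly index values: two full years plus one extra quarter.
quarterly = [100, 102, 104, 106, 108, 110, 112, 114, 116]
annual_index = annualise(quarterly, first_year=2014)
print(annual_index)  # {2014: 103.0, 2015: 111.0}
```

Annual growth rates for comparison with the national statistic can then be taken from consecutive values of the annualised index.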
The revisions analysis carried out for this article has shown that, historically, the preliminary estimates of Quarterly UK public service productivity did not systematically under or overestimate the relative rate of change in UK public service productivity, inputs and output. In other words, there is no statistically significant bias in the revisions to the estimates. Where the mean revisions are non-zero it is very likely that they are due to “noise”. Where the revisions were more substantial (still not statistically significant), this coincided with the publication of Blue Book 2017 and with methodological changes made by the public service productivity team to improve the estimates. The more substantial revisions due to methodological changes improved the accuracy of the experimental series relative to the national series.
The purpose of this section is to describe and explain the methods used to analyse the historic revisions to quarterly UK public service productivity estimates. ONS publishes Quarterly UK public service productivity estimates to provide a short-term, timely indicator of the future path of the annual productivity estimates. The quarterly estimates lag by one quarter. They have an open revisions policy, meaning that new data or methods can be incorporated at any time and will be implemented for the entire time series. The first estimates were published in July 2016, with seven more estimates published quarterly afterwards.
The time series available and published goes back to Quarter 1 (Jan to Mar) 2011 for every release since July 2016. The first release published estimates from Quarter 1 2011 to Quarter 1 2016, resulting in 21 observations.
Each subsequent publication included revised estimates of the previous version and, in addition, an estimate for one more quarter. The analysis was carried out on the percentage growth rate of productivity. Productivity is measured in index form, as the index of output divided by the index of inputs, using the following equation:
The percentage growth rate of productivity was calculated as:
A revision, for an estimate for period t, was calculated as the percentage point difference between a later (L) and earlier/preliminary (P) estimate, using the equation:
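Written out from the verbal definitions above, the three equations take roughly the following form. The symbol names are assumptions of this sketch rather than the article's own notation, any rescaling of the index (for example, multiplying by 100) is left out, and growth is assumed to be measured on the previous quarter.

```latex
\begin{align*}
\text{Productivity}_t &= \frac{\text{Output}_t}{\text{Inputs}_t} \\[4pt]
g_t &= \left( \frac{\text{Productivity}_t}{\text{Productivity}_{t-1}} - 1 \right) \times 100 \\[4pt]
R_t &= L_t - P_t
\end{align*}
```

Here g_t is the percentage growth rate in period t, and L_t and P_t are the growth rates for period t taken from the later and the earlier (preliminary) vintage respectively, so R_t is in percentage points.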
As each publication revised estimates for the same time periods in addition to producing an estimate for a new period, the publications can be treated as vintages of the estimates. i was used to denote the vector of later vintages, that is, all estimates published from October 2016 onwards:
i = October 2016, January 2017, April 2017, July 2017, October 2017, January 2018, April 2018
j was used to denote the vector of earlier or preliminary vintages, that is, all estimates published before April 2018:
j = July 2016, October 2016, January 2017, April 2017, July 2017, October 2017, January 2018
There is cross-over between the periods that i and j cover, but vintages are compared against each other, so j can be thought of as a one-quarter lagged version of i.
With eight vintages of estimates, there are seven revision windows, but every early or preliminary estimate can be compared against all the subsequent estimates. This means that there are 28 possible comparisons to make:
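The count can be checked by enumerating the pairs directly (7 + 6 + 5 + 4 + 3 + 2 + 1 = 28). This is a small illustrative sketch; the variable names are assumptions.

```python
from itertools import combinations

vintages = [
    "July 2016", "October 2016", "January 2017", "April 2017",
    "July 2017", "October 2017", "January 2018", "April 2018",
]

# Pair every earlier vintage with every later one.
# The list is in publication order, so combinations() yields (earlier, later).
comparisons = [(later, earlier) for earlier, later in combinations(vintages, 2)]

# Revision windows: each vintage against the one published just before it.
windows = list(zip(vintages[1:], vintages[:-1]))

print(len(windows))      # 7
print(len(comparisons))  # 28
```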
Table 2 helps to visualise the number of possible comparisons.
Table 2: Possible comparisons of vintages
|Revisions||Vintage 1||Vintage 2||Vintage 3||Vintage 4||Vintage 5||Vintage 6||Vintage 7||Vintage 8|
|Vintage 1||N/A||2 vs 1||3 vs 1||4 vs 1||5 vs 1||6 vs 1||7 vs 1||8 vs 1|
|Vintage 2||1 vs 2||N/A||3 vs 2||4 vs 2||5 vs 2||6 vs 2||7 vs 2||8 vs 2|
|Vintage 3||1 vs 3||2 vs 3||N/A||4 vs 3||5 vs 3||6 vs 3||7 vs 3||8 vs 3|
|Vintage 4||1 vs 4||2 vs 4||3 vs 4||N/A||5 vs 4||6 vs 4||7 vs 4||8 vs 4|
|Vintage 5||1 vs 5||2 vs 5||3 vs 5||4 vs 5||N/A||6 vs 5||7 vs 5||8 vs 5|
|Vintage 6||1 vs 6||2 vs 6||3 vs 6||4 vs 6||5 vs 6||N/A||7 vs 6||8 vs 6|
|Vintage 7||1 vs 7||2 vs 7||3 vs 7||4 vs 7||5 vs 7||6 vs 7||N/A||8 vs 7|
|Vintage 8||1 vs 8||2 vs 8||3 vs 8||4 vs 8||5 vs 8||6 vs 8||7 vs 8||N/A|
|Source: Office for National Statistics|
The focus of this article is to test for statistically significant bias in the revisions to determine if the early or preliminary estimates under or overestimated productivity growth. If the average or mean revision is significantly different from zero, this means that the early or preliminary estimates under or overestimated productivity growth and are therefore unreliable. For a preliminary estimate to be unbiased, the expected value of the revisions to it should be zero, where the expected value is equal to the average value:
The way to test for this is to set up a hypothesis test, where the null hypothesis is that the average revision is equal to zero:
The alternative hypothesis is that the average revision is not equal to zero:
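In symbols, the unbiasedness condition and the two hypotheses can be sketched as follows, where the notation is assumed here: R_t is the revision for period t, n is the number of periods compared, and R-bar is the sample mean revision used to estimate the expected value.

```latex
\begin{align*}
E(R_t) &= 0, \qquad \bar{R} = \frac{1}{n} \sum_{t=1}^{n} R_t \\[4pt]
H_0 &: \bar{R} = 0 \\
H_1 &: \bar{R} \neq 0
\end{align*}
```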
The null hypothesis can be rejected if its t-statistic is greater in absolute size than a t-critical value from a Student’s t distribution for a given significance level (5% has been used for this article). The t-statistic is calculated as the average revision divided by the standard error of the revisions. The standard error needed to be adjusted for serial correlation, as the revisions showed strong signs of it for several lag lengths. Serial correlation leads to incorrect inferences being made from samples. The standard error was adjusted for serial correlation using the Newey-West method, as recommended by the Organisation for Economic Co-operation and Development (OECD). This is because the alternative way that the ONS adjusts the standard error for serial correlation assumes that serial correlation is significant only at the first lag, that is, the revisions time series is auto-regressive of order 1:
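A first-order autoregressive process for the revisions is conventionally written as below. The symbols are assumed for this sketch (rho for the first-lag coefficient, epsilon for an uncorrelated error term), and any constant term is omitted.

```latex
\[
R_t = \rho \, R_{t-1} + \varepsilon_t
\]
```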
However, auto-correlation and partial auto-correlation functions or plots of the revisions time series showed evidence of higher orders of serial correlation. The results (not published in this article) did not differ when the standard error was adjusted using the alternative method that assumes first-order serial correlation.
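The test described in this annex can be sketched in plain Python. This is a minimal illustration, assuming Bartlett weights for the Newey-West adjustment and a lag length chosen by the user; the function name, the lag of 2 and the revision values are all assumptions, and small-sample conventions differ between implementations.

```python
import math


def newey_west_t_stat(revisions, max_lag):
    """Return (mean, HAC standard error, t-statistic) for the mean revision.

    The long-run variance is the lag-0 autocovariance plus Bartlett-weighted
    autocovariances up to max_lag; max_lag=0 gives the unadjusted case.
    """
    n = len(revisions)
    mean = sum(revisions) / n
    dev = [r - mean for r in revisions]

    def autocov(k):
        return sum(dev[t] * dev[t - k] for t in range(k, n)) / n

    long_run_var = autocov(0)
    for k in range(1, max_lag + 1):
        weight = 1 - k / (max_lag + 1)  # Bartlett kernel weight
        long_run_var += 2 * weight * autocov(k)

    std_error = math.sqrt(long_run_var / n)
    return mean, std_error, mean / std_error


# Made-up percentage point revisions for 21 quarters, purely for illustration.
revisions = [-0.3, -0.2, 0.1, 0.0, 0.1, 0.3, -0.1, 0.0, -0.4, 0.1, 0.2,
             0.0, -0.1, 0.2, 0.0, -0.2, 0.1, -0.1, 0.0, -0.1, 0.3]

mean, std_error, t_stat = newey_west_t_stat(revisions, max_lag=2)
T_CRITICAL = 2.086  # two-sided 5% value for 20 degrees of freedom
reject_null = abs(t_stat) > T_CRITICAL
print(round(mean, 3), round(std_error, 3), round(t_stat, 2), reject_null)
```

The null hypothesis of a zero mean revision is rejected only when the absolute t-statistic exceeds the critical value for n-1 degrees of freedom.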
Contact details for this methodology
Telephone: +44 (0)1633 455750