New index number methods in consumer price statistics

1. Main points

We have produced a quality framework and tested different methods that can be used to produce consumer price indices at the lowest level of aggregation (index number methods), including emerging multilateral methods that other national statistical institutes (NSIs) use with scanner and web-scraped data. This research will help us to understand whether different index number methods are needed for different data sources (for example, scanner and web-scraped data) and different spending categories (for example, groceries, clothing or package holidays).
Our framework and subsequent testing have shown that, at the lowest level of aggregation, consumer price indices produced using multilateral methods for web-scraped and scanner data will be more comprehensive and accurate than those produced using fixed-base or chained bilateral methods. Quality-adjusted Geary Khamis (QU-GK) was the highest-scoring method against the criteria of our index number method framework, and it performed well in our testing. QU-GK is therefore our current proposed method for scanner and web-scraped data when expenditure information, or an approximation thereof, is available.
In the absence of expenditure information, we propose GEKS-Jevons as a suitable alternative, although we stress the importance of research into approximating product-level expenditures as a more effective approach for use with web-scraped data, provided suitable expenditure approximations can be made.
The move to use scanner and web-scraped data for consumer price statistics is a current point of interest among many other NSIs, and we will continue to produce indices for our favoured methods in parallel while research, both in the UK and internationally, progresses. Should international consensus converge on a preferred method, we will likely follow international guidance and best practice.

2. Introduction

We are investigating automated ways of collecting data for consumer price statistics with increased product coverage and frequency of collection relative to our traditional data sources. The data sources we plan to use, by 2023, are scanner and web-scraped data. These new data sources have several features that mean that new price index number methods may be required to maximise their use in our statistics. However, there is an ever-growing number of index number methods that can be used, with little international consensus, currently, as to what the optimal method is.

The choice of which index number methods to use at the lowest level of aggregation will depend heavily on the data source and information available as well as the desired properties of the method itself.

3. Overview of index number methods

Traditional price collection

Currently, the price quotes used in UK consumer price statistics are collected manually from physical stores, websites and by phoning the retailer or business. This way of collecting prices means that traditional data sources do not typically contain information on the number of each product sold and, as such, most indices use what we refer to as unweighted index number methods at the lowest level of aggregation.

To calculate an index using an unweighted index number method, a sample of products and retailers is chosen that is considered representative of consumer spending in each region and country of the UK. Prices for this same sample of products are collected each month and an average¹ of price movements is calculated to produce an initial price index, known as an elementary aggregate (EA) index. For example, a price collector will observe the price of a commonly bought loaf of white bread in London every month. An EA index is created for bread, based on the average price movement of all loaves of bread sampled in London. Above this EA level, expenditure weights are used to aggregate price movements of London bread together with price movements of bread in other regions and countries of the UK. Price movements for all bread are then weighted together with price movements of all other goods and services in the UK in calculating the headline rates of consumer price inflation.
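As a minimal sketch of the unweighted averaging described above, the following compares the three elementary aggregate formulas considered in this article (Jevons, Carli and Dutot) on a matched sample. The prices are hypothetical and chosen purely for illustration; they are not taken from our price collection.

```python
import math

def jevons(base_prices, current_prices):
    """Geometric mean of price relatives."""
    relatives = [c / b for b, c in zip(base_prices, current_prices)]
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))

def carli(base_prices, current_prices):
    """Arithmetic mean of price relatives."""
    relatives = [c / b for b, c in zip(base_prices, current_prices)]
    return sum(relatives) / len(relatives)

def dutot(base_prices, current_prices):
    """Ratio of arithmetic mean prices."""
    mean_current = sum(current_prices) / len(current_prices)
    mean_base = sum(base_prices) / len(base_prices)
    return mean_current / mean_base

# Hypothetical prices (GBP) for the same three loaves in the base and current month
base = [1.00, 2.00, 4.00]
current = [1.10, 2.00, 3.60]

for name, formula in [("Jevons", jevons), ("Carli", carli), ("Dutot", dutot)]:
    print(name, round(formula(base, current), 4))
```

In this example the Dutot index is pulled furthest from 1 because the most expensive loaf has the largest influence on a ratio of average prices, whereas Jevons and Carli treat each price relative equally.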

For more information on how prices are collected and price indices are constructed in our current measures of consumer price inflation, please refer to our Consumer Prices Indices Technical Manual.

New data sources

Web-scraped data are collected from retailers’ websites. Compared to traditional data sources, they can be collected more frequently, cover a broader range of products, and can contain a wealth of information about the product and its attributes. A monthly price can be calculated for web-scraped data by taking an average of the observed prices for each product across a month. Research is ongoing into how this average should be calculated. Web-scraped data lack information on the number of products sold. This means that, where we have not been able to approximate the likely number of each product sold, we will need to use unweighted index number methods for web-scraped data.

In contrast, scanner data are collected by retailers at the point of sale, providing information on the number and type of products sold. These data allow us to use weighted index number methods, meaning that products that have a higher value of sales will have greater influence over the inflation rate. A monthly price is calculated for each product by taking its total expenditure across each month and dividing it by the number of sales. For example, if a loaf of the retailer’s own-brand white bread had a total sales value of £100 in March, and 100 loaves of this type of bread had been sold, we would calculate that the average price of that loaf of bread in March would have been £1. Once we have a price for each product in each month, we can aggregate the price changes for individual products using the total sales value as an indication of how much weight to give each product in the resulting index. This means that, for scanner data, we can use weighted index number methods to produce item-level inflation estimates that account for the number of each product that has been sold.
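The monthly price calculation described above (total sales value divided by quantity sold) can be sketched as follows. The record layout and product names are assumptions made for illustration and do not reflect the format of any retailer's data.

```python
from collections import defaultdict

# Hypothetical scanner records: (product, month, sales value in GBP, quantity sold)
transactions = [
    ("own_brand_white_loaf", "2020-03", 60.0, 60),
    ("own_brand_white_loaf", "2020-03", 40.0, 40),
    ("branded_white_loaf", "2020-03", 150.0, 100),
]

def monthly_unit_values(records):
    """Return {(product, month): (unit value price, total sales value)}."""
    value = defaultdict(float)
    quantity = defaultdict(float)
    for product, month, sales_value, quantity_sold in records:
        value[(product, month)] += sales_value
        quantity[(product, month)] += quantity_sold
    return {
        key: (value[key] / quantity[key], value[key])
        for key in value
        if quantity[key] > 0  # a price cannot be derived without any sales
    }

unit_values = monthly_unit_values(transactions)
price, sales = unit_values[("own_brand_white_loaf", "2020-03")]
print(price, sales)  # 1.0 100.0
```

The total sales value returned alongside each price is what a weighted index number method would use to decide how much influence each product has.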

Weighting period used in constructing price indices

For weighted indices, we need to decide what period to use the expenditure weights from. For example, to calculate inflation between January and February, the price movements can be aggregated based on sales values for either January, February, or an average of both. Using the first period (in this case, January) would result in a “base weighted” index number method, such as a Laspeyres. Using the second period (in this case, February) would result in a “current-period weighted” index number method, such as a Paasche. Using a combination of the weights in each period would result in a “superlative” index number method, such as a Fisher or a Törnqvist (technical descriptions of the methods included in this article can be found in Annex A).
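The choice of weighting period can be illustrated with a short sketch of the four weighted bilateral formulas named above, written in their standard textbook forms. The January and February prices and quantities are hypothetical: product A rises in price and consumers buy less of it.

```python
import math

def laspeyres(p0, p1, q0, q1):
    """Base weighted: values the base-period basket at both sets of prices."""
    return sum(p1[i] * q0[i] for i in p0) / sum(p0[i] * q0[i] for i in p0)

def paasche(p0, p1, q0, q1):
    """Current-period weighted: values the current-period basket at both sets of prices."""
    return sum(p1[i] * q1[i] for i in p0) / sum(p0[i] * q1[i] for i in p0)

def fisher(p0, p1, q0, q1):
    """Superlative: geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres(p0, p1, q0, q1) * paasche(p0, p1, q0, q1))

def tornqvist(p0, p1, q0, q1):
    """Superlative: price relatives weighted by average expenditure shares."""
    v0 = sum(p0[i] * q0[i] for i in p0)
    v1 = sum(p1[i] * q1[i] for i in p0)
    log_index = sum(
        0.5 * (p0[i] * q0[i] / v0 + p1[i] * q1[i] / v1) * math.log(p1[i] / p0[i])
        for i in p0
    )
    return math.exp(log_index)

# Hypothetical prices (GBP) and quantities for two products
p_jan, q_jan = {"A": 1.00, "B": 2.00}, {"A": 100, "B": 50}
p_feb, q_feb = {"A": 1.20, "B": 2.00}, {"A": 60, "B": 70}

for formula in (laspeyres, paasche, fisher, tornqvist):
    print(formula.__name__, round(formula(p_jan, p_feb, q_jan, q_feb), 4))
```

With these figures the Laspeyres index is 1.10 and the Paasche index is 1.06, with both superlative indices falling between the two.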

In normal economic conditions, consumers tend to substitute towards cheaper products when prices increase or towards discounted products. In a base weighted index, the weights are calculated before this substitution occurs, meaning that a base weighted index could overstate the true cost of living. The reverse is true for a current-period weighted index as the weights are calculated after substitution away from more expensive products has taken place. Superlative indices are better approximations of the cost of living as they better account for substitution behaviour between the base and current periods. Current-period weighted and superlative indices have not been used historically in UK consumer price indices because of a lack of reliable timely expenditure information in the current period.

Time periods used in constructing price indices

Another consideration is the time periods that should be used in calculating the index. Bilateral methods consider price changes for a consistent sample of products between two time periods, although these time periods are not necessarily consecutive. For example, in the current calculation of consumer price indices, prices in each month of a year are expressed relative to the price of the same product in January of the same year. This current method is referred to as a fixed-base bilateral index number method.

Fixed-base methods only measure price movements for products that were available in the base month, or products that have been used as replacements if the original products are out of stock. Frequent chaining can be used to incorporate new products more regularly, to ensure that new and disappearing products can be accounted for so that the sample remains representative of the market over time. Monthly chaining is where consistent product sets for pairs of months are taken throughout the year and their price movements are chained together to form a continuous series. For example, the price change for a set of products between January and February, a set of products between February and March, and a set of products between March and April would be calculated, and these movements would be chained together to show the overall price change between January and April. While frequent chaining has the advantage over fixed-base methods in that it can account for new and disappearing products, it also typically suffers from a phenomenon referred to as “chain-drift”. This is explored more in Section 5: Stress-testing shortlisted index number methods.
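The difference between a fixed-base and a monthly chained calculation can be sketched with an unweighted Jevons index over matched products. The prices below are hypothetical, and the sketch shows only how chaining picks up a product that enters after the base month; it does not attempt to reproduce chain-drift.

```python
import math

def matched_jevons(base, current):
    """Jevons index over products priced in both periods (None if there are none)."""
    common = set(base) & set(current)
    if not common:
        return None
    return math.exp(sum(math.log(current[k] / base[k]) for k in common) / len(common))

def fixed_base_series(periods):
    """Compare every period directly with the first period."""
    return [matched_jevons(periods[0], period) for period in periods]

def chained_series(periods):
    """Multiply together the month-on-month matched movements."""
    index = [1.0]
    for previous, current in zip(periods, periods[1:]):
        link = matched_jevons(previous, current)
        if link is None:
            raise ValueError("no matched products between consecutive periods")
        index.append(index[-1] * link)
    return index

# Hypothetical prices (GBP): product A leaves after February, product C enters in February
periods = [
    {"A": 1.00, "B": 2.00},             # January
    {"A": 1.10, "B": 2.00, "C": 3.00},  # February
    {"B": 2.20, "C": 3.00},             # March
    {"B": 2.20, "C": 3.30},             # April
]

print([round(x, 4) for x in fixed_base_series(periods)])
print([round(x, 4) for x in chained_series(periods)])
```

By April the fixed-base index rests on product B alone, so it misses the price rise for product C that the chained index captures.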

In comparison, multilateral methods simultaneously make use of all data over a given time period. The use of multilateral methods for calculating temporal price indices is relatively new internationally, but these methods have been shown to have some desirable properties relative to their bilateral method counterparts, in that they account for new and disappearing products (to remain representative of the market) while also reducing the scale of chain-drift. Multilateral methods can use a specified number of time periods to calculate the resulting price index; the number of time-periods used by multilateral methods is commonly defined as a “window length”.
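As an illustration of how a multilateral method uses every period in its window, the sketch below implements GEKS-Jevons in its standard form: the index between the first period and a target period is the geometric mean, over every link period in the window, of the bilateral index from the first period to the link multiplied by the bilateral index from the link to the target. The four-period window and its prices are hypothetical, and the function names are our own for this example.

```python
import math

def matched_jevons(base, current):
    """Bilateral Jevons index over products priced in both periods."""
    common = set(base) & set(current)
    if not common:
        raise ValueError("no matched products between the two periods")
    return math.exp(sum(math.log(current[k] / base[k]) for k in common) / len(common))

def geks_jevons(window):
    """GEKS-Jevons index for each period in the window, first period = 1."""
    n = len(window)
    indices = []
    for target in window:
        log_total = 0.0
        for link in window:
            # Route the comparison through every period in the window
            log_total += math.log(matched_jevons(window[0], link))
            log_total += math.log(matched_jevons(link, target))
        indices.append(math.exp(log_total / n))
    return indices

# Hypothetical window of four months (window length 4) with product churn
window = [
    {"A": 1.00, "B": 2.00},
    {"A": 1.10, "B": 2.00, "C": 3.00},
    {"B": 2.20, "C": 3.00},
    {"B": 2.20, "C": 3.30},
]

geks = geks_jevons(window)
print([round(x, 4) for x in geks])
```

Because every pair of periods contributes, products that enter or leave during the window still inform the index, and the result does not depend on the order in which bilateral links are chained.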

Varieties of bilateral index number methods (comparing two time periods)

All weighted and unweighted bilateral methods that simply compare prices between two chosen time periods can use both fixed-base and chained varieties. Table 1 provides a list of the bilateral methods considered in this article, grouped by the period from which their weights are derived. Technical descriptions of all methods considered in this article are provided in Annex A.

Table 1: Bilateral index number methods under consideration
Unweighted	Base weighted	Current-period weighted	Superlative
Jevons	Arithmetic Laspeyres	Paasche	Fisher
Dutot	Geometric Laspeyres		Törnqvist
Carli

While bilateral methods are relatively simple to understand, they can be problematic in certain conditions, particularly when there are a high number of products entering and leaving the market (referred to as churn). The weaknesses of bilateral methods are demonstrated further in Section 5: Stress-testing shortlisted index number methods.

Varieties of multilateral methods (comparing multiple time periods simultaneously)

Multilateral methods overcome some of the problems experienced in bilateral methods by simultaneously making use of all data available in all time periods. But while multilateral methods have many advantageous properties compared to their bilateral counterparts, in their purest form they are subject to revisions as newer data become available to inform the calculation of previous periods. For example, a multilateral index calculated for March 2020 could be calculated using price changes in all available time periods between January 2020 and January 2021 (using a 13-month window length). Therefore, as each future month between March 2020 and January 2021 becomes available, there is more information to inform the March index value and it would likely be revised.

Using the same example, at the time of publishing a multilateral price index for March 2020, we would not yet have the data for the remainder of the window, which would only be complete once all data had been collected at the end of January 2021. The practical way in which we can extend our time series without waiting for this information is known as an extension method. Extension methods can be used in combination with multilateral methods to overcome the need for revisions (details of extension method calculations can be found in Annex A), which are impractical and undesirable for many users of consumer price statistics. A range of multilateral methods paired with a range of extension methods is considered in this article and is presented in Table 2. Multiple combinations of these methods can be used; for example, the GEKS-Jevons can be combined with a movement splice and Quality-adjusted Geary Khamis can be combined with a fixed-base monthly expanding window.
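A minimal sketch of one such pairing, GEKS-Jevons with a movement splice, is given below. It assumes the commonly used definition of the movement splice: each new month, the multilateral index is recalculated on the latest rolling window and only the movement between the final two periods of that window is chained onto the already published series. The window length of three and the prices are hypothetical and kept short for readability.

```python
import math

def matched_jevons(base, current):
    """Bilateral Jevons index over products priced in both periods."""
    common = set(base) & set(current)
    if not common:
        raise ValueError("no matched products between the two periods")
    return math.exp(sum(math.log(current[k] / base[k]) for k in common) / len(common))

def geks_jevons(window):
    """GEKS-Jevons index for each period in the window, first period = 1."""
    n = len(window)
    return [
        math.exp(
            sum(
                math.log(matched_jevons(window[0], link))
                + math.log(matched_jevons(link, target))
                for link in window
            ) / n
        )
        for target in window
    ]

def movement_splice(periods, window_length, multilateral):
    """Extend a multilateral index one period at a time without revising earlier values."""
    published = list(multilateral(periods[:window_length]))
    for end in range(window_length, len(periods)):
        window = periods[end - window_length + 1 : end + 1]
        index = multilateral(window)
        # Chain on only the latest movement from the new window
        published.append(published[-1] * index[-1] / index[-2])
    return published

# Hypothetical monthly prices (GBP) with product churn
periods = [
    {"A": 1.00, "B": 2.00},
    {"A": 1.10, "B": 2.00, "C": 3.00},
    {"B": 2.20, "C": 3.00},
    {"B": 2.20, "C": 3.30},
    {"B": 2.30, "C": 3.30, "D": 5.00},
]

series = movement_splice(periods, window_length=3, multilateral=geks_jevons)
print([round(x, 4) for x in series])
```

Passing the multilateral method as a function means the same splice could be paired with another method from Table 2, provided that method returns one index value per period in the window.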

Table 2: Multilateral methods and extension methods under consideration
Multilateral methods	Extension methods
GEKS-Jevons (GEKS-J)²	Direct extension (DE)
GEKS-Törnqvist (GEKS-T)	Movement splice (MS)
GEKS-Fisher (GEKS-F)	Window splice (WS)
Quality-adjusted Geary Khamis (QU-GK)	Half window splice (HWS)
Time product dummy (TPD)²	Geometric mean splice (GMS)
Time product dummy hedonic (TPH)²	Fixed base monthly expanding window (FBME)

Notes for: Overview of index number methods

1. Different methods of averaging can be used, such as geometric (Jevons) or arithmetic (Carli) averaging. Another respected method, known as Dutot, calculates the ratio of average prices, rather than taking the average of price movements.

4. Framework for shortlisting index number methods

In total, the combination of different multilateral and extension methods, along with the fixed-base and chained bilateral methods, gives rise to over 50 potential methods that we could use in our consumer price statistics at the lowest level of aggregation. To decide on an appropriate index number method for each data source (for example, scanner and web-scraped data) and each category of spending (for example, clothing, groceries or package holidays), we intend to complete the following steps.

Step 1: Shortlist methods

Exclude methods that do not meet minimum resource requirements.
Exclude methods that do not meet minimum interpretability requirements.
Apply theoretical framework to remaining methods: theoretical properties (55%), resource (20%), interpretability (15%) and flexibility (10%).
Methods that score within the top 10 are chosen for the shortlist.

Step 2: Assess shortlisted methods

Test methods against a range of synthetic datasets with different pricing behaviours.

Step 3: Choose the appropriate method

Assess pricing behaviour of unique item over given time series.
Determine the characteristics of the data.
Assess whether the highest ranked method is suitable given the pricing behaviour and characteristics of unique item.
If the highest ranked method is unsuitable, choose appropriate alternative from the shortlist.

To limit the number of methods available for use, we have produced a shortlist of methods based on a quality framework of pre-determined criteria. A large number of methods is undesirable as it is both impractical to implement and complex to explain. However, a single method may not be suitable for all data sources and categories of spending. We have therefore produced two shortlists of appropriate methods: one shortlist for use when expenditure weights, or approximations, are available and one shortlist for when this information cannot be obtained.

The framework for assessing index number methods has been discussed with, and informed by, both our Technical and Stakeholder Advisory Panels on Consumer Prices (APCPs). The framework will be periodically reviewed and updated in line with our own and international research and guidance as well as with any emerging price index number methods.

There are five criteria that we use to produce our shortlist. Table 3 provides the criteria (with reference to the European Statistical System’s (ESS’s) quality dimensions) and their respective weights within our framework. Detailed information about the framework, criteria weights and how the index number methods performed can be found in The winning formula? A framework for choosing an appropriate index method for use on web-scraped and scanner data, presented to the Technical APCP in January 2020.

Table 3: Criteria weights within the Index Number Methods Framework
Criterion	Weight
a. Theoretical properties (accuracy and reliability)	55%
b. Resource (timeliness and frequency)	20%
c. Interpretability (accessibility and clarity)	15%
d. Flexibility (relevance)	10%
e. Coherence (coherence and comparability)	0% (used as a secondary filter)

New methods will be assessed against our framework and ranked against existing methods as they emerge. Scores for our existing methods will be reviewed periodically to ensure that they account for the most recent research and developments in the international literature.

Prior to assessing methods against our framework, two primary filters are applied. First, if the information-processing requirements are unmanageable, then the method is excluded as we do not want to hinder the timeliness or frequency of the consumer price inflation publication. Secondly, if the price movements are not intuitive to those producing or using the data, then the method is excluded as we believe any price movements should be understandable to both producers and users.

After applying these primary filters, each method is assessed against each criterion. The final scores are used to rank the methods and produce the shortlists of appropriate index methods for UK consumer price statistics. In cases of equal scores between methods, the coherence criterion is used as a secondary filter to separate methods in the rankings. For example, if two methods received the same score, the method in use by other national statistical institutes (NSIs) would take precedence in the shortlist.
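The filtering, scoring and tie-breaking steps described above can be sketched as follows. The criterion weights are those in Table 3; the method names, the criterion scores (on an assumed 0 to 100 scale) and the filter flags are entirely hypothetical and are not the scores from our framework.

```python
# Criterion weights from Table 3 (coherence carries no weight and is used only to break ties)
WEIGHTS = {
    "theoretical": 0.55,
    "resource": 0.20,
    "interpretability": 0.15,
    "flexibility": 0.10,
}

# Hypothetical candidate methods and scores, for illustration only
candidates = [
    {"name": "Method X", "manageable": True, "intuitive": True, "used_by_other_nsis": False,
     "scores": {"theoretical": 80, "resource": 60, "interpretability": 70, "flexibility": 50}},
    {"name": "Method Y", "manageable": True, "intuitive": True, "used_by_other_nsis": True,
     "scores": {"theoretical": 80, "resource": 60, "interpretability": 70, "flexibility": 50}},
    {"name": "Method Z", "manageable": True, "intuitive": True, "used_by_other_nsis": False,
     "scores": {"theoretical": 90, "resource": 40, "interpretability": 60, "flexibility": 70}},
    {"name": "Method W", "manageable": False, "intuitive": True, "used_by_other_nsis": True,
     "scores": {"theoretical": 95, "resource": 10, "interpretability": 60, "flexibility": 70}},
]

def weighted_score(scores):
    """Weighted sum of the four scored criteria."""
    return sum(WEIGHTS[criterion] * scores[criterion] for criterion in WEIGHTS)

def shortlist(methods, size=10):
    # Primary filters: drop methods that are unmanageable or not intuitive
    eligible = [m for m in methods if m["manageable"] and m["intuitive"]]
    # Rank by score; on equal scores, methods in use by other NSIs come first
    ranked = sorted(
        eligible,
        key=lambda m: (-round(weighted_score(m["scores"]), 6), not m["used_by_other_nsis"]),
    )
    return ranked[:size]

for rank, method in enumerate(shortlist(candidates), start=1):
    print(rank, method["name"], round(weighted_score(method["scores"]), 1))
```

In this example Method W is removed by the first primary filter despite its high theoretical score, and Methods X and Y tie on 71.5 so the coherence filter places Method Y ahead.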

Following discussions with our Technical APCP and other index number method experts, we have made some small alterations to the framework scores and resulting shortlists. Our current shortlists of index number methods for when weighting information (or an approximation thereof) is available, and for when it is unavailable, are shown in Tables 4 and 5 respectively.

Table 4: Shortlist 1, favoured index number methods when weighting information is available
Rank		Method
1		Quality-adjusted Geary Khamis (QU-GK) using a Fixed Base Monthly Expanding window (FBME)
2		GEKS-Törnqvist using a Movement Splice
3		GEKS-Fisher using a Movement Splice
4		GEKS-Jevons using a Movement Splice
5		GEKS-Törnqvist using a Geometric Mean Splice
6		GEKS-Fisher using a Geometric Mean Splice
7		GEKS-Törnqvist using a Window Splice
8		GEKS-Fisher using a Window Splice
9		GEKS-Jevons using a Geometric Mean Splice
10		GEKS-Jevons using a Window Splice

Table 5: Shortlist 2, favoured index number methods when weighting information is unavailable
Rank	Method
4	GEKS-Jevons using a Movement Splice
9	GEKS-Jevons using a Geometric Mean Splice
10	GEKS-Jevons using a Window Splice
33	Chained Jevons (CJ)
44	Fixed base Jevons (FBJ)

Tables 4 and 5 show that the multilateral methods consistently outperform the bilateral methods in both shortlists. Table 4 shows that our unweighted multilateral method (GEKS-Jevons) outperforms our weighted bilateral methods. Only three unweighted methods ranked within the top 10, as seen in Table 5.

While bilateral methods did not rank well in the framework, we have chosen to include a fixed-base and a chained Jevons index in our second shortlist (in Table 5). This provides a comparison to the multilateral methods and means that, if the GEKS methodology proved unsuitable for a dataset, we could revert to more traditional index number methods where we deem them appropriate. In the future, we may also consider the hedonic approaches for use when expenditure information is unavailable, but as these methods did not rank within our top 10, they have not yet been assessed as part of this research.

5. Stress-testing shortlisted index number methods

To assess the potential suitability of our shortlisted methods in the production of consumer price indices, we have produced a range of synthetic datasets demonstrating isolated pricing behaviours to stress-test each method’s performance. The behaviours we have isolated and include in this section are high attrition rates and product churn, product obsolescence, high variance in prices, and high quantities of products sold.

The synthetic datasets were produced through modification of an open source dataset known as Dominick’s Finer Food data (Dominick’s). These data have been provided by the James M. Kilts Center, University of Chicago Booth School of Business. The data were restricted to a single store, and values with an absent price or quantity were removed before taking a random sample of the data. A model was then fitted to the data to understand the relationship between price and quantity, and this model was subsequently used to build a synthetic dataset. Once the base data had been created, behaviours could be added into the dataset in isolation to see their impact on the resulting indices.

A simple base dataset was initially produced to understand the differences in each method’s index values when the dataset shows a static set of products, so all products are available in all time periods, with no changes in the underlying quality of the sample. There are relatively small changes in prices and quantities throughout the 27-month period studied, as shown by the reduction in mean prices between Periods 1 and 27 in Figure 1.

Table 6: Price data for four products in four periods
Period	Product A	Product B	Product C	Product D
January (1)	£1.00	£1.50		
February (2)	£1.00	£1.50	£1.20	
March (3)		£1.50	£1.20	£2.00
April (4)			£1.20	£2.00
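Using the product availability shown in the four-product, four-period price table above, the short sketch below lists which products can be matched under a fixed-base approach (every month compared with January) and under a monthly chained approach (each month compared with the previous month). The variable names are our own for this illustration.

```python
# Products with an observed price in each month, taken from the table above
availability = {
    "January": {"A", "B"},
    "February": {"A", "B", "C"},
    "March": {"B", "C", "D"},
    "April": {"C", "D"},
}
months = list(availability)

# Fixed base: only products that were also priced in January can be used
fixed_base_matches = {
    month: sorted(availability["January"] & availability[month])
    for month in months[1:]
}

# Monthly chained: products priced in both of two consecutive months can be used
chained_matches = {
    f"{previous}-{current}": sorted(availability[previous] & availability[current])
    for previous, current in zip(months, months[1:])
}

print(fixed_base_matches)
# {'February': ['A', 'B'], 'March': ['B'], 'April': []}
print(chained_matches)
# {'January-February': ['A', 'B'], 'February-March': ['B', 'C'], 'March-April': ['C', 'D']}
```

By April no January product remains, so a fixed-base index has nothing to compare, while the chained approach still matches two products in every pair of months.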

Table 8: Indices (I) for price and quantity data in Table 7
Period	I T(FB)	I F(FB)	I GEKS-T	I GEKS-F	I T(C)	I F(C)
January (1)	1	1	1	1	1	1
February (2)	0.69	0.7	0.7	0.7	0.69	0.7
March (3)	1	1	0.99	0.99	0.97	0.98
April (4)	1	1	1	1	0.97	0.98

Table 9: Other synthetic data sets for testing index number methods
Feature	Description
Dump prices	When a product ceases to be produced, it is possible that retailers will apply a “dump” price on their remaining stock of the product to encourage sales.
External shocks	Any external shocks in supply or demand can be replicated in the synthetic data to see how well the index number methods account for these shocks.
Level shift	This dataset will replicate any level shifts that may be seen in pricing structures, such as an increase in VAT.
Flat line	If there are no price changes for any products between two periods, we would want a price index to remain the same as the previous month.
Seasonality	Seasonality is a characteristic of a time series in which the product experiences regular and predictable changes in price or sales, or both, that recur at consistent intervals.
Short series	A method should be able to cope should a data series come to an abrupt ending. For example, if all products in an EA cease to be sold, the method should not continue producing index values for future periods.

Table 10: Comparison of computational run times for some methods in ONS consumer prices pipeline
Computational run time for each method is given in seconds, to 2 decimal places
Data set	Number of unique items	Number of rows of data	Window length of method	QU-GK FBME	GEKS-T MS	GEKS-F MS	GEKS-J MS	Chained Jevons
A	827	226278	13	66.08	16.76	9.8	12.37	6.55
B	827	452556	13	449.22	22.41	11.75	15.87	7.56
C	863	465799	27	478.63	37.01	25.9	35.65	8.19
D	863	931598	27	570.26	27.1	22.21	35.9	8.07

Table 11: Average difference between some unweighted and weighted methods for five Dominick’s food categories, index points
Category	Weighted method	GEKS-J MS	GEKS-J WS	GEKS-J GMS
Beer	QU-GK	3.07	3.05	3.05
Beer	GEKS-T MS	1.21	1.19	1.19
Beer	GEKS-F MS	1.27	1.26	1.26
Cereal	QU-GK	2.14	2.19	1.76
Cereal	GEKS-T MS	0.42	0.47	0.04
Cereal	GEKS-F MS	0.53	0.58	0.15
Laundry detergent	QU-GK	0.96	0.99	0.91
Laundry detergent	GEKS-T MS	1.49	1.52	1.44
Laundry detergent	GEKS-F MS	1.48	1.51	1.43
Soft drinks	QU-GK	2.49	2.79	5.94
Soft drinks	GEKS-T MS	2.01	2.32	5.47
Soft drinks	GEKS-F MS	1.53	1.83	4.99
Toothpaste	QU-GK	0.15	0.09	0.08
Toothpaste	GEKS-T MS	0.77	0.71	0.7
Toothpaste	GEKS-F MS	0.87	0.82	0.8

Figure 1: Mean price across time for the base synthetic data

Source: Office for National Statistics

Figure 2: Indices produced for shortlisted methods using base synthetic dataset

Source: Office for National Statistics – New index number methods in consumer price statistics

High attrition rate and product churn

Table 6: Price data for four products in four periods

Figure 3: Comparison of fixed base and chained approach

Source: Office for National Statistics – New index number methods in consumer price statistics

Table 7: Price and quantity data for two products for four periods

Table 8: Indices (I) for price and quantity data in Table 7

Figure 4: Indices produced for top shortlisted methods using a high churn synthetic dataset

Source: Office for National Statistics – New index number methods in consumer price statistics

Figure 5: Sum of absolute differences between high churn and base index values across 27 periods

Source: Office for National Statistics – New index number methods in consumer price statistics

Product obsolescence

Figure 6: Indices produced for shortlisted methods using the product obsolescence synthetic dataset

Source: Office for National Statistics – New index number methods in consumer price statistics

High price variance

Figure 7: Indices produced for some shortlisted methods using the high price variance synthetic dataset

Source: Office for National Statistics – New index number methods in consumer price statistics

High quantities of products sold

Figure 8: Indices produced for weighted shortlisted methods using the high sales synthetic dataset

Source: Office for National Statistics – New index number methods in consumer price statistics

Other features assessed

Table 9: Other synthetic data sets for testing index number methods

Computational run time of the methods

Table 10: Comparison of computational run times for some methods in ONS consumer prices pipeline

Notes for: Stress-testing shortlisted index number methods

Figure 9: Indices produced using weighted shortlisted methods for soft drinks

Source: Office for National Statistics – Dominick’s

Figure 10: Indices produced using a range of shortlisted methods for soft drinks

Source: Office for National Statistics – Dominick’s

Table 11: Average difference between some unweighted and weighted methods for five Dominick’s food categories, index points

A.1. Unweighted bilateral indices

Jevons

Carli

Dutot

A.2. Weighted bilateral indices

Laspeyres (and Lowe)

Paasche

Fisher

Törnqvist

A.3. Multilateral indices

GEKS

Geary-Khamis

Time product and time product dummy

A.4. Extension methods for multilateral indices

Direct extension

Splicing (window, half-window, movement and geometric)

Figure 11a: Time series values for two windows of data before splicing

Source: Office for National Statistics – New index number methods in consumer price statistics