This vignette provides a detailed explanation of the statistical measures and hypothesis tests used in get_revision_analysis(). The function is designed to analyze revisions between preliminary and final data releases, helping to assess bias, efficiency, correlation, seasonality, and the relative contributions of news and noise in revisions. The function returns a tibble with the desired statistics and hypothesis tests. In the following sections, we explain each statistic and hypothesis test in detail and give the corresponding column name of the output in parentheses.
Let’s define some notation to facilitate the discussion of the statistics and hypothesis tests used in the function:
- Let $y_t^h$ denote the $h$-th released value for time $t$, with $h = 0$ being the preliminary release
- Let $y_t^f$ denote the final revised value for time $t$
- Let $r_t^h = y_t^f - y_t^h$ be the final revision compared to the $h$-th release for time $t$
- $n$ is the number of observations
The function takes two mandatory arguments:
- df: A data frame containing the preliminary data releases.
- final_release: A data frame containing the final release data against which the revisions are calculated and to which the preliminary data will be compared.
The function argument degree allows users to specify which statistics and hypothesis tests to include in the output:
- degree = 1 includes information about revision size (default)
- degree = 2 includes correlation statistics of revisions
- degree = 3 includes news and noise tests
- degree = 4 includes sign switches, seasonality analysis, and Theil’s U
- degree = 5 includes all the statistics and hypothesis tests
Summary Statistics
Revision Size
Revisions provide insights into the reliability of initial data releases. Various metrics assess their magnitude and distribution:
1. Mean Revision ("Bias (mean)")
The mean revision quantifies systematic bias in the revisions:
$$\bar{r} = \frac{1}{n} \sum_{t=1}^{n} r_t^h$$
If the mean revision is significantly different from zero, it indicates a tendency for initial releases to be systematically over- or under-estimated. A t-test is conducted to test the null hypothesis $H_0: \mathbb{E}[r_t^h] = 0$.
- "Bias (p-value)" reports the standard t-test p-value.
- "Bias (robust p-value)" provides a heteroskedasticity-robust alternative.
2. Mean Absolute Revision ("MAR")
The mean absolute revision measures the average size of revisions, regardless of direction:
$$\text{MAR} = \frac{1}{n} \sum_{t=1}^{n} \left| r_t^h \right|$$
This metric is useful when evaluating the magnitude of revisions rather than their direction.
3. Minimum and Maximum Revisions ("Minimum", "Maximum")
These statistics capture the most extreme downward and upward revisions in the dataset.
4. Percentiles of Revisions ("10Q", "Median", "90Q")
The 10th, 50th (median), and 90th percentiles provide a distributional perspective, helping to identify skewness and tail behavior in revisions.
5. Standard Deviation of Revisions ("Std. Dev.")
The standard deviation quantifies the dispersion of revisions:
$$\hat{\sigma}_r = \sqrt{\frac{1}{n-1} \sum_{t=1}^{n} \left( r_t^h - \bar{r} \right)^2}$$
A higher standard deviation indicates greater variability in the revision process.
6. Noise-to-Signal Ratio ("Noise/Signal")
This metric measures the relative size of revisions compared to the total variance in the final data:
$$\text{Noise/Signal} = \frac{\hat{\sigma}_r}{\hat{\sigma}_f}$$
where $\hat{\sigma}_f$ is the standard deviation of the final value $y_t^f$. A high noise-to-signal ratio suggests that revisions are large relative to the underlying variability in the final data, indicating high uncertainty in initial estimates.
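Before turning to the function itself, the summary statistics above can be computed by hand in base R on a toy revision series (the vectors below are invented for illustration and are not part of reviser):

```r
# Illustrative sketch (toy data, not the reviser implementation):
# computing the degree = 1 statistics by hand.
y_prelim <- c(0.5, 1.2, -0.3, 0.8, 0.1)  # preliminary releases
y_final  <- c(0.7, 1.0, -0.1, 0.9, 0.2)  # final values
r <- y_final - y_prelim                  # revisions r_t

bias <- mean(r)                    # "Bias (mean)": 0.08
mar  <- mean(abs(r))               # "MAR": 0.16
sdr  <- sd(r)                      # "Std. Dev."
ns   <- sd(r) / sd(y_final)        # "Noise/Signal"
pval <- t.test(r, mu = 0)$p.value  # "Bias (p-value)"
```

get_revision_analysis() performs these computations (plus the robust bias test) for every id and release in the input data.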
library(reviser)
library(dplyr)
library(tsbox)
gdp <- reviser::gdp %>%
ts_pc() %>%
na.omit()
df <- get_nth_release(gdp, 0:1)
final_release <- get_latest_release(gdp)
results <- get_revision_analysis(
df,
final_release,
degree = 1
)
head(results)
#> # A tibble: 6 × 14
#> id release N `Bias (mean)` `Bias (p-value)` `Bias (robust p-value)`
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 CHE release_0 178 0.110 0.00692 0.000107
#> 2 CHE release_1 177 0.0960 0.0148 0.000202
#> 3 EA release_0 178 0.0553 0.00167 0.000450
#> 4 EA release_1 177 0.0502 0.00367 0.000673
#> 5 JP release_0 178 0.00989 0.848 0.795
#> 6 JP release_1 177 0.00920 0.859 0.790
#> # ℹ 8 more variables: Minimum <dbl>, Maximum <dbl>, `10Q` <dbl>, Median <dbl>,
#> # `90Q` <dbl>, MAR <dbl>, `Std. Dev.` <dbl>, `Noise/Signal` <dbl>
Correlation of Revisions
Understanding how revisions relate to initial releases and past revisions can reveal patterns in the revision process:
1. Correlation Between Revisions and Initial Releases ("Correlation")
This measures whether revisions are systematically related to the initial release $y_t^h$, which serves as a proxy for the information available at the time of release:
$$\rho = \text{Corr}\left( r_t^h, y_t^h \right)$$
A significant correlation suggests that initial estimates contain information that predicts later revisions. A t-test ("Correlation (p-value)") is used to test whether $\rho$ is significantly different from zero.
2. 1st Order Autocorrelation of Revisions ("Autocorrelation (1st)")
The first-order autocorrelation measures the persistence in the revision process by examining whether past revisions predict future revisions:
$$\rho_1 = \text{Corr}\left( r_t^h, r_{t-1}^h \right)$$
If revisions exhibit strong autocorrelation, it may indicate a systematic pattern in the revision process rather than purely random adjustments. A t-test ("Autocorrelation (1st p-value)") is used to assess whether $\rho_1$ is significantly different from zero.
3. Autocorrelation of Revisions up to 1 year ("Autocorrelation up to 1yr (Ljung-Box p-value)")
This metric assesses the autocorrelation of revisions up to a lag of 1 year using the Ljung-Box test. The null hypothesis of the Ljung-Box test is that there is no autocorrelation up to the specified lag. In other words, the revisions are independent of one another. The alternative hypothesis is that there is autocorrelation in the revisions.
- For quarterly data (4 observations per year), the test checks for autocorrelation up to 4 lags.
- For monthly data (12 observations per year), the test checks for autocorrelation up to 12 lags.
- For other frequencies, the test is skipped.
The Ljung-Box test statistic is computed using the following formula:
$$Q = n(n+2) \sum_{k=1}^{m} \frac{\hat{\rho}_k^2}{n-k}$$
where:
- $Q$ is the Ljung-Box statistic.
- $n$ is the number of observations in the time series.
- $\hat{\rho}_k$ is the sample autocorrelation at lag $k$. This measures the correlation between the revisions at time $t$ and time $t-k$.
- $m$ is the number of lags considered (which corresponds to the frequency of the data: 4 for quarterly data, 12 for monthly data).
The test statistic is used to determine whether there is significant autocorrelation. If the autocorrelation at a particular lag is large, it suggests that the revisions at that lag are not independent of each other. The larger the $Q$-statistic, the stronger the evidence that the revisions are autocorrelated.
These metrics collectively help evaluate the reliability, predictability, and potential biases in data revisions.
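The same quantities can be sketched with base R functions on a toy quarterly revision series (the data below is simulated for illustration and is not part of reviser):

```r
# Illustrative sketch (toy data, not the reviser implementation):
# first-order autocorrelation and the Ljung-Box test, with lag = 4
# as used for quarterly data.
set.seed(1)
r <- rnorm(40)  # stand-in for 10 years of quarterly revisions

rho1 <- acf(r, lag.max = 1, plot = FALSE)$acf[2]  # 1st-order autocorrelation
lb <- Box.test(r, lag = 4, type = "Ljung-Box")    # H0: no autocorrelation up to lag 4
lb$p.value
```

For white-noise revisions such as these, the Ljung-Box p-value should typically be large, i.e., no evidence of autocorrelation.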
results <- get_revision_analysis(
df,
final_release,
degree = 2
)
head(results)
#> # A tibble: 6 × 8
#> id release N Correlation Correlation (p-value…¹ Autocorrelation (1st…²
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 CHE release… 178 -0.209 0.00517 -0.0887
#> 2 CHE release… 177 -0.204 0.00656 -0.114
#> 3 EA release… 178 -0.315 0.0000187 -0.0922
#> 4 EA release… 177 -0.318 0.0000158 -0.157
#> 5 JP release… 178 -0.204 0.00626 -0.278
#> 6 JP release… 177 -0.241 0.00126 -0.292
#> # ℹ abbreviated names: ¹`Correlation (p-value)`, ²`Autocorrelation (1st)`
#> # ℹ 2 more variables: `Autocorrelation (1st p-value)` <dbl>,
#> # `Autocorrelation up to 1yr (Ljung-Box p-value)` <dbl>
Sign Switches
Sign switches in data revisions can occur when the direction of an economic indicator changes between its initial release and later revisions. We define two key metrics to assess the stability of these directional signals:
1. Fraction of sign changes
This metric ("Fraction of correct sign") evaluates how often the sign of the initially reported value $y_t^h$ agrees with the final revised value $y_t^f$. Mathematically, we compute:
$$\frac{1}{n} \sum_{t=1}^{n} \mathbb{1}\left\{ \text{sign}\left( y_t^h \right) = \text{sign}\left( y_t^f \right) \right\}$$
where $\mathbb{1}\{\cdot\}$ is an indicator function that equals 1 if the signs match and 0 otherwise.
2. Fraction of sign changes in the growth rate
This metric ("Fraction of correct growth rate change") assesses whether the direction of change in the variable remains consistent after revisions. Specifically, we compare the sign of the period-over-period differences:
$$\frac{1}{n-1} \sum_{t=2}^{n} \mathbb{1}\left\{ \text{sign}\left( \Delta y_t^h \right) = \text{sign}\left( \Delta y_t^f \right) \right\}$$
where $\Delta y_t^h$ and $\Delta y_t^f$ represent the first differences of the initially reported and final values, respectively.
A high fraction of incorrect signs in either metric suggests that early estimates may be unreliable in capturing the true direction of the variable.
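Both fractions reduce to one-line computations in base R (the toy vectors below are invented for illustration and are not part of reviser):

```r
# Illustrative sketch (toy data, not the reviser implementation):
# sign-agreement fractions computed by hand.
y_prelim <- c(0.5, -0.2, 0.3, -0.1, 0.4)
y_final  <- c(0.4,  0.1, 0.2, -0.3, 0.5)

# share of periods where the level has the same sign: 0.8
frac_sign <- mean(sign(y_prelim) == sign(y_final))

# share of periods where the period-over-period change has the same sign: 1
frac_growth <- mean(sign(diff(y_prelim)) == sign(diff(y_final)))
```

In this toy example, one level sign flips after revision (period 2), while every growth-rate direction is preserved.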
Hypothesis Tests
News and Noise Tests for Data Revisions
The news and noise tests analyze the properties of data revisions in relation to the final and preliminary releases of an economic variable. These tests help to evaluate whether the revisions are systematic, whether they contain new information, or whether they are simply noisy adjustments (Mankiw and Shapiro 1986; Aruoba 2008).
Final revisions can be classified into two categories:
Noise: The initial announcement is an observation of the final series, measured with error. This means that the revision is uncorrelated with the final value but correlated with the data available when the estimate is made.
News: The initial announcement is an efficient forecast that reflects all available information, and subsequent estimates reduce the forecast error by incorporating new information. The revision is correlated with the final value but uncorrelated with the data available when the estimate is made, i.e., unpredictable using the information set at the time of the initial announcement.
To test these categories, we run two regression tests:
1. The Noise Test
The noise test examines whether the revision can be predicted by the final value:
$$r_t^h = \alpha + \beta \cdot y_t^f + \epsilon_t$$
If the preliminary value were an efficient and unbiased estimate of the final value (i.e., incorporating all relevant information), revisions should be uncorrelated with the final value. The null hypothesis is:
$$H_0: \alpha = 0 \quad \text{and} \quad \beta = 0$$
- The function tests the joint hypothesis ("Noise joint test (p-value)") that both $\alpha = 0$ and $\beta = 0$ using a heteroskedasticity and autocorrelation consistent (HAC) covariance matrix to ensure robust inference.
- $\alpha = 0$ is tested ("Noise test Intercept (p-value)") to check whether the revisions have a systematic bias.
- $\beta = 0$ is tested ("Noise test Coefficient (p-value)") to check whether the revisions are correlated with the final value, indicating inefficiency.
2. The News Test
The news test examines whether the revision is predictable using the preliminary release:
$$r_t^h = \alpha + \beta \cdot y_t^h + \epsilon_t$$
Here, we test whether revisions are systematically related to the initial value. The null hypothesis is:
$$H_0: \alpha = 0 \quad \text{and} \quad \beta = 0$$
- The function tests the joint hypothesis ("News joint test (p-value)") that both $\alpha = 0$ and $\beta = 0$ using a HAC covariance matrix to ensure robust inference.
- $\alpha = 0$ is tested ("News test Intercept (p-value)") to check whether the revisions have a systematic bias.
- $\beta = 0$ is tested ("News test Coefficient (p-value)") to check whether the revisions are correlated with the initial value, indicating inefficiency.
Note: Instead of regressing $r_t^h$ on $y_t^f$ or $y_t^h$, one can similarly regress $$ \text{Noise: } \quad y_t^h = \alpha + \beta \cdot y_t^f + \epsilon_t \\ \text{News: } \quad y_t^f = \alpha + \beta \cdot y_t^h + \epsilon_t $$ and test $\alpha = 0$ and $\beta = 1$ for the presence of noise and news in the data revisions.
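The two regressions can be sketched with plain lm() on simulated data (the series below are invented for illustration; reviser itself additionally uses a HAC covariance matrix for inference, which plain OLS standard errors do not provide):

```r
# Illustrative sketch (toy data, not the reviser implementation):
# the noise and news regressions estimated with ordinary lm().
set.seed(42)
y_final  <- cumsum(rnorm(50))              # toy final series
y_prelim <- y_final + rnorm(50, sd = 0.5)  # toy preliminary series
r <- y_final - y_prelim                    # revisions

noise_fit <- lm(r ~ y_final)   # noise test: regress r_t on the final value
news_fit  <- lm(r ~ y_prelim)  # news test: regress r_t on the preliminary value

summary(noise_fit)$coefficients
summary(news_fit)$coefficients
```

A rejection in the noise regression points to inefficiency of the preliminary estimate; a rejection in the news regression indicates that revisions were predictable from the initial release.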
results <- get_revision_analysis(
df,
final_release,
degree = 3
)
head(results)
#> # A tibble: 6 × 17
#> id release N `News test Intercept` `News test Intercept (std.err)`
#> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 CHE release_0 178 0.149 0.0419
#> 2 CHE release_1 177 0.134 0.0440
#> 3 EA release_0 178 0.0741 0.0195
#> 4 EA release_1 177 0.0690 0.0180
#> 5 JP release_0 178 0.0585 0.0438
#> 6 JP release_1 177 0.0650 0.0407
#> # ℹ 12 more variables: `News test Intercept (p-value)` <dbl>,
#> # `News test Coefficient` <dbl>, `News test Coefficient (std.err)` <dbl>,
#> # `News test Coefficient (p-value)` <dbl>, `News joint test (p-value)` <dbl>,
#> # `Noise test Intercept` <dbl>, `Noise test Intercept (std.err)` <dbl>,
#> # `Noise test Intercept (p-value)` <dbl>, `Noise test Coefficient` <dbl>,
#> # `Noise test Coefficient (std.err)` <dbl>,
#> # `Noise test Coefficient (p-value)` <dbl>, …
Test of Seasonality in Revisions
To test whether seasonality is present in revisions, we employ a Friedman test. Note that this test is always performed on the first-differenced series.
Friedman Test
The Friedman test ("Seasonality (Friedman p-value)") is a non-parametric statistical test that examines whether the distribution of ranked data differs systematically across seasonal periods, such as months or quarters. It does not require distributional assumptions, making it a robust tool for detecting seasonality in revision series.
The test is applied by first transforming the revision series into a matrix where each row represents a year and each column represents a specific month or quarter. It is constructed as follows. Consider first the matrix of data $X = \{x_{ij}\}$ with $n$ rows (i.e., the number of years in the sample) and $k$ columns (i.e., either 12 months or 4 quarters, depending on the frequency of the data). The data matrix is then replaced by a new matrix $R = \{r_{ij}\}$, where the entry $r_{ij}$ is the rank of $x_{ij}$ within row $i$.
The test is executed using the stats::friedman.test() function. The null hypothesis is that the average ranking across columns is the same, indicating no seasonality in the revisions.
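The reshaping step and the test call can be sketched as follows (the series below is simulated for illustration and is not part of reviser):

```r
# Illustrative sketch (toy data, not the reviser implementation):
# reshape a quarterly revision series into a years-by-quarters matrix
# and apply the Friedman test.
set.seed(7)
rev_series <- rnorm(40)  # 10 years of quarterly revisions
d <- diff(rev_series)    # the test is run on the first-differenced series

# rows = years, columns = quarters (keep 9 complete years after differencing)
m <- matrix(d[1:36], ncol = 4, byrow = TRUE)

ft <- friedman.test(m)   # H0: no systematic ranking differences across quarters
ft$p.value
```

friedman.test() interprets the matrix columns as groups (quarters) and the rows as blocks (years), which matches the construction described above.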
Theil’s U Statistics
In the context of revision analysis, Theil’s inequality coefficient, or Theil’s U, provides a measure of the accuracy of initial estimates ($y_t^h$) relative to final or more refined values ($y_t^f$). Various definitions of Theil’s statistics exist, leading to different interpretations. This package considers two widely used variants: U1 and U2.
1. Theil’s U1 Statistic ("Theil's U1")
The first measure, U1, is given by:
$$U_1 = \frac{\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t^f - y_t^h\right)^2}}{\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t^f\right)^2} + \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t^h\right)^2}}$$
The U1 statistic is bounded between 0 and 1:
- A value of 0 implies perfect forecasting ($y_t^h = y_t^f$ for all $t$).
- A value closer to 1 indicates lower accuracy.
However, U1 suffers from several limitations, as highlighted by Granger and Newbold (1973). A critical issue arises when one of $y_t^h$ or $y_t^f$ equals zero in all periods. In such cases, the denominator and numerator become equal, leading to U1 = 1 even when preliminary estimates are close to the final values.
2. Theil’s U2 Statistic ("Theil's U2")
To address these shortcomings, Theil et al. (1966) proposed an alternative measure, U2, which is defined as:
$$U_2 = \sqrt{\frac{\sum_{t=1}^{n-1}\left(\frac{y_{t+1}^h - y_{t+1}^f}{y_t^f}\right)^2}{\sum_{t=1}^{n-1}\left(\frac{y_{t+1}^f - y_t^f}{y_t^f}\right)^2}}$$
where:
- The numerator captures the relative errors in the preliminary estimates.
- The denominator measures the magnitude of changes in the final values over time.
Unlike U1, U2 is not bounded between 0 and 1. Instead:
- A value of 0 implies perfect accuracy ($y_t^h = y_t^f$ for all $t$).
- $U_2 = 1$ means the accuracy of the preliminary estimates is equal to a naïve no-change forecast.
- $U_2 > 1$ suggests the preliminary estimates are less accurate than the naïve approach.
- $U_2 < 1$ indicates the preliminary estimates are more accurate than the naïve method.
Whenever possible (if $y_t^f \neq 0$ for all $t$), U2 is the preferred metric for evaluating preliminary estimates.
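Both statistics can be computed by hand in a few lines (the toy vectors below are invented for illustration and are not part of reviser; the formulas follow the definitions given above):

```r
# Illustrative sketch (toy data, not the reviser implementation):
# computing Theil's U1 and U2 by hand.
y_prelim <- c(1.0, 1.5, 0.8, 1.2, 0.9)
y_final  <- c(1.1, 1.4, 1.0, 1.1, 1.0)
n <- length(y_final)

# U1: RMSE of the preliminary estimates, scaled to lie in [0, 1]
u1 <- sqrt(mean((y_final - y_prelim)^2)) /
  (sqrt(mean(y_final^2)) + sqrt(mean(y_prelim^2)))

# U2: preliminary errors relative to a naive no-change forecast
num <- ((y_prelim[-1] - y_final[-1]) / y_final[-n])^2
den <- ((y_final[-1] - y_final[-n]) / y_final[-n])^2
u2 <- sqrt(sum(num) / sum(den))

c(U1 = u1, U2 = u2)
```

Here U2 < 1 would indicate that the preliminary estimates beat a naïve no-change forecast of the final series.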
results <- get_revision_analysis(
df,
final_release,
degree = 4
)
head(results)
#> # A tibble: 6 × 8
#> id release N Fraction of correct …¹ Fraction of correct …² `Theil's U1`
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 CHE releas… 178 0.803 0.652 0.262
#> 2 CHE releas… 177 0.802 0.644 0.250
#> 3 EA releas… 178 0.944 0.764 0.0818
#> 4 EA releas… 177 0.944 0.763 0.0797
#> 5 JP releas… 178 0.820 0.725 0.266
#> 6 JP releas… 177 0.802 0.712 0.262
#> # ℹ abbreviated names: ¹`Fraction of correct sign`,
#> # ²`Fraction of correct growth rate change`
#> # ℹ 2 more variables: `Theil's U2` <dbl>,
#> # `Seasonality (Friedman p-value)` <dbl>
In the above examples, we analyze GDP revisions by comparing the final release with the first and second releases, which correspond to the diagonal elements of the real-time data matrix. This approach provides insight into how initial estimates evolve over time.
Alternatively, we can compare GDP revisions between specific publication dates. In the example below, we analyze revisions from the GDP releases from April and July 2024, using the final release from October 2024 as the benchmark. The grouping of statistics is by default determined by the pub_date or release column. Alternatively, it can be specified via the optional function argument grouping_var.
df <- gdp %>%
filter(pub_date %in% c("2024-04-01", "2024-07-01"))
results <- get_revision_analysis(
df,
final_release,
degree = 5
)
head(results)
#> # A tibble: 6 × 39
#> id pub_date N Frequency `Bias (mean)` `Bias (p-value)`
#> <chr> <date> <dbl> <dbl> <dbl> <dbl>
#> 1 CHE 2024-04-01 176 4 0.00226 0.478
#> 2 CHE 2024-07-01 177 4 -0.00123 0.371
#> 3 EA 2024-04-01 176 4 0.00656 0.159
#> 4 EA 2024-07-01 177 4 0.00205 0.520
#> 5 JP 2024-04-01 176 4 -0.00323 0.490
#> 6 JP 2024-07-01 177 4 -0.00276 0.305
#> # ℹ 33 more variables: `Bias (robust p-value)` <dbl>, Minimum <dbl>,
#> # Maximum <dbl>, `10Q` <dbl>, Median <dbl>, `90Q` <dbl>, MAR <dbl>,
#> # `Std. Dev.` <dbl>, `Noise/Signal` <dbl>, Correlation <dbl>,
#> # `Correlation (p-value)` <dbl>, `Autocorrelation (1st)` <dbl>,
#> # `Autocorrelation (1st p-value)` <dbl>,
#> # `Autocorrelation up to 1yr (Ljung-Box p-value)` <dbl>, `Theil's U1` <dbl>,
#> # `Theil's U2` <dbl>, `Seasonality (Friedman p-value)` <dbl>, …
The get_revision_analysis() function requires a single time series in final_release and compares it to multiple time series in df for the same id. To always compare two consecutive publication dates, it is necessary to structure the analysis sequentially. The code below iterates over consecutive publication dates in the gdp dataset, comparing revisions between each pair. For each pair of consecutive dates, it extracts the corresponding data, runs get_revision_analysis() to analyze revisions, and then combines the results into a single data frame.
# Get unique sorted publication dates
pub_dates <- gdp %>%
distinct(pub_date) %>%
arrange(pub_date) %>%
pull(pub_date)
# Run the function for each pair of consecutive publication dates
results <- purrr::map_dfr(seq_along(pub_dates[-length(pub_dates)]), function(i)
{
df <- gdp %>%
filter(pub_date %in% pub_dates[i])
final_release <- gdp %>%
filter(pub_date %in% pub_dates[i + 1])
get_revision_analysis(df, final_release, degree = 5)
}
)
head(results)
#> # A tibble: 6 × 39
#> id pub_date N Frequency `Bias (mean)` `Bias (p-value)`
#> <chr> <date> <dbl> <dbl> <dbl> <dbl>
#> 1 CHE 2002-10-01 90 4 0.000632 0.634
#> 2 EA 2002-10-01 90 4 0.00106 0.512
#> 3 JP 2002-10-01 90 4 0.0231 0.629
#> 4 US 2002-10-01 90 4 0 NaN
#> 5 CHE 2003-01-01 91 4 -0.00384 0.563
#> 6 EA 2003-01-01 91 4 0.00192 0.750
#> # ℹ 33 more variables: `Bias (robust p-value)` <dbl>, Minimum <dbl>,
#> # Maximum <dbl>, `10Q` <dbl>, Median <dbl>, `90Q` <dbl>, MAR <dbl>,
#> # `Std. Dev.` <dbl>, `Noise/Signal` <dbl>, Correlation <dbl>,
#> # `Correlation (p-value)` <dbl>, `Autocorrelation (1st)` <dbl>,
#> # `Autocorrelation (1st p-value)` <dbl>,
#> # `Autocorrelation up to 1yr (Ljung-Box p-value)` <dbl>, `Theil's U1` <dbl>,
#> # `Theil's U2` <dbl>, `Seasonality (Friedman p-value)` <dbl>, …