
Advanced usage: baselines, TD candidates & decision rules
Source: vignettes/seasight-advanced.Rmd
This vignette digs deeper into three aspects of
seasight:
- Using baselines (current_model) to compare against existing production specifications.
- Providing trading-day (TD) / calendar candidates explicitly.
- Understanding the built-in decision rules around “DO_NOT_ADJUST” vs “adjust” and “keep current” vs “switch to new”.
We keep working with AirPassengers for reproducibility,
but the workflow is designed with official statistics in mind (National
Accounts, STS, etc.).
1. Baselines: comparing against your current production model
In many institutions the starting point is a long-standing “legacy” model (e.g. maintained in X-13, JDemetra+ or custom scripts). Changing this model is costly: it affects time series, documentation and possibly downstream forecasting systems.
seasight makes this explicit via the
current_model argument.
library(seasonal)
library(seasight)
x <- AirPassengers
# A somewhat richer baseline spec for illustration
m_current <- seas(
x,
x11 = "",
regression.variables = c("td", "easter[1]"),
arima.model = c(0, 1, 1, 0, 1, 1)
)
sa_tests_model(m_current)
We now run the automatic analysis with this baseline:
res <- auto_seasonal_analysis(
y = x,
current_model = m_current,
use_fivebest = TRUE,
include_easter = "auto",
engine = "auto"
)
names(res)
Key differences compared to the “no baseline” case:
- res$table includes an additional row for the current model (if it was not already part of the candidate grid).
- Diagnostics include distances vs current SA (e.g. dist_sa_L1, correlation of seasonal factors).
- The decision helper sa_should_switch() becomes meaningful.
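To make the distance diagnostics more concrete, here is a minimal base-R sketch of an L1-type distance between two adjusted series and of the correlation between their seasonal factors. It is only an illustration: the two decompositions are stand-ins for a "current" and a "new" model, and the formulas are assumptions, not necessarily the exact definitions behind dist_sa_L1 in seasight.

```r
# Stand-ins for a "current" and a "new" adjustment, built with base R only.
x <- log(AirPassengers)
seas_cur <- decompose(x, type = "additive")$seasonal
seas_new <- stl(x, s.window = "periodic")$time.series[, "seasonal"]

sa_cur <- x - seas_cur
sa_new <- x - seas_new

# L1-type distance: mean absolute difference between the two adjusted series.
dist_l1 <- mean(abs(as.numeric(sa_new) - as.numeric(sa_cur)))

# Correlation between the two sets of seasonal factors.
corr_seas <- cor(as.numeric(seas_new), as.numeric(seas_cur))

c(dist_l1 = dist_l1, corr_seas = corr_seas)
```

A small distance together with a high correlation suggests that switching models would change the published adjusted series only marginally.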
1.1 Visual comparison
You can use the full HTML report:
tmp_file <- tempfile(fileext = ".html")
sa_report_html(
y = x,
current_model = m_current,
title = "AirPassengers – baseline vs. seasight best",
outfile = tmp_file
)
tmp_file
Internally, the report:
- aligns the current and new SA series,
- compares levels and growth rates in two plots, and
- summarises differences in a small statistics table (mean, sd, min, max).
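The three steps listed above can be sketched in base R as follows. This is not the report's actual code: the two adjusted series are stand-ins with deliberately different spans, and the helper stat_row() is a hypothetical name introduced for the example.

```r
x <- AirPassengers

# Stand-ins for two adjusted series that cover different spans.
seas_mult <- decompose(x, type = "multiplicative")$seasonal
seas_stl  <- stl(log(x), s.window = "periodic")$time.series[, "seasonal"]
sa_cur <- window(x / seas_mult, end = c(1959, 12))
sa_new <- window(exp(log(x) - seas_stl), start = c(1950, 1))

# Step 1: align both series on their common time span.
both <- ts.intersect(cur = sa_cur, new = sa_new)

# Step 2: differences in levels and in month-on-month growth rates (percent).
d_level  <- as.numeric(both[, "new"] - both[, "cur"])
growth   <- 100 * diff(log(both))
d_growth <- as.numeric(growth[, "new"] - growth[, "cur"])

# Step 3: small summary table of the differences.
stat_row <- function(v) c(mean = mean(v), sd = sd(v), min = min(v), max = max(v))
diff_table <- rbind(level = stat_row(d_level), growth = stat_row(d_growth))
round(diff_table, 3)
```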
1.2 Numeric comparison: sa_top_candidates_table() and sa_compare()
For shorter outputs (e.g. in an R Markdown report) you can insert the “Top candidates” table directly:
sa_top_candidates_table(
res,
current_model = m_current,
y = x,
n = 5
)
- ✅ rows highlight the best model in green.
- ⭐ rows highlight the current model in blue (even if it is not in the top-5).
- The airline model ARIMA(0 1 1)(0 1 1) is appended in red if available.
For a programmatic comparison, use sa_compare():
cmp <- sa_compare(res, m_current)
cmp$decision
cmp$summary
2. Providing calendar regressors (trading days, moving holidays)
Eurostat guidelines recommend treating calendar effects (working-day, leap-year, Easter, moving holidays, etc.) explicitly, especially for STS indicators and National Accounts. ([European Commission][1])
seasight follows a simple rule:
You provide candidate regressors;
auto_seasonal_analysis() compares “no‑xreg” against
“with‑xreg” candidates and records what won.
This is done via the td_candidates argument: a
named list of ts (or ts-boxable)
regressors aligned to y.
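Before passing candidates, it can help to check that each regressor really is aligned to y. The helper below is a hypothetical sketch (is_aligned() is not part of seasight), and it assumes that "aligned" means same frequency, same start, and at least as many observations as y.

```r
# Hypothetical helper: TRUE if `xreg` has the same frequency and start as `y`
# and covers at least the span of `y`.
is_aligned <- function(xreg, y) {
  is.ts(xreg) &&
    frequency(xreg) == frequency(y) &&
    isTRUE(all.equal(tsp(xreg)[1], tsp(y)[1])) &&
    NROW(xreg) >= NROW(y)
}

y <- AirPassengers
good <- ts(rep(0, length(y)), start = start(y), frequency = frequency(y))
bad  <- ts(rep(0, length(y)), start = c(1950, 1), frequency = 12)  # starts a year late

candidates <- list(good = good, bad = bad)
aligned <- vapply(candidates, is_aligned, logical(1), y = y)
aligned
```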
2.1 Precomputed regressors (example: length-of-month)
For a proper trading-day setup you’d typically use weekday counts or contrasts. Here we build a simple length-of-month regressor just to show the mechanics.
days_in_month <- function(date) {
d0 <- as.Date(format(date, "%Y-%m-01"))
d1 <- seq(d0, by = "month", length.out = 2)[2]
as.integer(d1 - d0)
}
dates <- seq(from = as.Date("1949-01-01"), by = "month", length.out = length(x))
ndays <- vapply(dates, days_in_month, integer(1))
# centering is common in calendar regressors
td_len <- ts(ndays - mean(ndays), start = start(x), frequency = frequency(x))
plot(td_len, main = "Centered length-of-month regressor")
Run the grid with this regressor as a candidate:
res_td <- auto_seasonal_analysis(
y = x,
current_model = m_current,
td_candidates = list(lenmonth = td_len),
td_usertype = "td",
include_easter = "auto",
engine = "auto",
max_specs = 3
)
head(res_td$table)
Key columns:
- with_td – whether a calendar regressor is included,
- td_name – which regressor (list name) was used,
- td_p and td_sig – regressor significance diagnostics (min p-value over TD terms).
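Section 2.1 mentions that a proper trading-day setup would use weekday counts or contrasts rather than length-of-month. As a minimal sketch, the following base-R code builds a one-regressor working-day contrast (number of weekdays minus 5/2 times the number of weekend days). The name td_wd is an assumption for this example, and national holidays are ignored.

```r
# Month starts covering the AirPassengers sample.
month_starts <- seq(as.Date("1949-01-01"), by = "month",
                    length.out = length(AirPassengers))

wd_contrast <- vapply(seq_along(month_starts), function(i) {
  first <- month_starts[i]
  last  <- seq(first, by = "month", length.out = 2)[2] - 1
  days  <- seq(first, last, by = "day")
  dow   <- as.POSIXlt(days)$wday            # 0 = Sunday, 6 = Saturday
  n_weekend <- sum(dow %in% c(0, 6))
  n_week    <- length(days) - n_weekend
  n_week - 2.5 * n_weekend                  # working-day contrast
}, numeric(1))

td_wd <- ts(wd_contrast, start = start(AirPassengers),
            frequency = frequency(AirPassengers))
head(td_wd, 12)
```

Such a series could then be supplied alongside the length-of-month regressor, for example as td_candidates = list(lenmonth = td_len, workday = td_wd).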
2.2 Moving-holiday pulses via build_user_xreg() (Genhol-style)
If you have a moving holiday calendar (Diwali, Ramadan, Carnival, …)
you can build a pulse regressor via seasonal::genhol()
using build_user_xreg():
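# Illustrative dates only: they lie outside the AirPassengers sample (1949-1960),
# so supply holiday dates that overlap your own series in practice.
diwali_dates <- as.Date(c(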
"2018-11-07","2019-10-27","2020-11-14","2021-11-04",
"2022-10-24","2023-11-12","2024-11-01"
))
td_hol <- build_user_xreg(
y = x,
holidays = list(diwali = list(dates = diwali_dates, start = -1, end = 1)),
td_usertype = "holiday"
)
res_hol <- auto_seasonal_analysis(
y = x,
current_model = m_current,
td_candidates = td_hol,
include_easter = "auto",
engine = "auto"
)
2.3 SEATS and “future” xreg values (why padding exists)
When using SEATS, X-13 typically relies on an
internal forecast horizon (often ~3 years). If xreg is
used, X-13 requires those regressor values to exist over the same
horizon.
seasight therefore pads pulse-type holiday
regressors (like genhol() output) with zeros into
the forecast horizon automatically. This is correct for pulses.
For other calendar regressors (e.g. trading-day counts), the calendar
is known in advance—so you should ideally provide future values
yourself. If you cannot, consider running engine = "x11"
for those candidates.
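The padding idea can be illustrated with a few lines of base R. This is a sketch under stated assumptions: pad_with_zeros() is a hypothetical helper written for this example, not the function seasight uses internally, and the 36-month horizon simply mirrors the "about 3 years" mentioned above.

```r
# Hypothetical helper: extend a pulse-type regressor by `h` periods of zeros.
pad_with_zeros <- function(xreg, h) {
  stopifnot(is.ts(xreg), h >= 0)
  ts(c(as.numeric(xreg), rep(0, h)),
     start = start(xreg), frequency = frequency(xreg))
}

# Toy monthly pulse: 1 in November of each year, 0 elsewhere.
pulse <- ts(as.numeric(rep(1:12, times = 2) == 11),
            start = c(2020, 1), frequency = 12)

padded <- pad_with_zeros(pulse, h = 36)   # roughly three years ahead
end(pulse)
end(padded)
```

Zeros are a reasonable default only for pulses; for a calendar regressor such as length-of-month, the future values are known and non-zero, which is why the text recommends supplying them explicitly.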
3. Decision rules: existence of seasonality & switching models
3.1 “DO_NOT_ADJUST” vs “adjust”
The first decision is whether the series should be
seasonally adjusted at all. seasight exposes this via:
- res$seasonality$overall$call_overall, and
- the helper sa_is_do_not_adjust().
The exact thresholds follow Eurostat recommendations (QS on original / SA, M-statistics, IDS), but the logic is:
- If there is no material seasonality in the original series (QSori and M7 indicate “no seasonality”), then:
  - call_overall becomes "DO_NOT_ADJUST";
  - the report will keep the raw series as the preferred series;
  - sa_is_do_not_adjust() returns TRUE for the best candidate row.
- Otherwise, the series is treated as seasonal and an SA model is selected.
Example:
# For AirPassengers we expect clear seasonality
res_td$seasonality$overall
# Translate the table row for the best model into a DO_NOT_ADJUST flag
best_row <- res_td$table[1, ]
sa_is_do_not_adjust(best_row)
For series with borderline or unstable seasonality this rule helps to avoid spurious adjustment, as recommended by the Eurostat ESS Guidelines. ([European Commission][1])
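The logic of the rule in this section can be written down as a tiny function. This is a hypothetical sketch: the function name, the "ADJUST" label and both cut-offs (a 5% level for the QS p-value on the original series, and 1 for M7) are placeholders chosen for illustration, not the thresholds seasight actually applies.

```r
# Hypothetical sketch; cut-offs are placeholders, not seasight's actual values.
overall_call_sketch <- function(qs_ori_p, m7, alpha = 0.05, m7_cut = 1) {
  qs_says_none <- qs_ori_p >= alpha   # QS does not reject "no seasonality"
  m7_says_none <- m7 >= m7_cut        # M7 at or above the cut-off
  if (qs_says_none && m7_says_none) "DO_NOT_ADJUST" else "ADJUST"
}

call_seasonal <- overall_call_sketch(qs_ori_p = 0.00, m7 = 0.2)  # clearly seasonal
call_flat     <- overall_call_sketch(qs_ori_p = 0.60, m7 = 1.8)  # no evidence of seasonality
c(call_seasonal, call_flat)
```

Requiring both indicators to agree before declaring "no seasonality" keeps the rule conservative: a single borderline test does not by itself stop the adjustment.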
3.2 “Keep current model” vs “switch to new model”
Given that seasonal adjustment revisions can affect downstream users, Eurostat and central banks usually discourage frequent model changes unless there is a clear quality gain. ([European Commission][1])
seasight encodes this in sa_should_switch(res):
sa_should_switch(res_td)
The function returns a string, typically:
- "CHANGE_TO_NEW_MODEL" or
- "KEEP_CURRENT_MODEL".
Internally, it uses the best row of res$table.
The rule checks:
- Residual diagnostics: The new model must pass basic tests (QS on SA, Ljung–Box). If the best model fails badly (e.g. strong remaining seasonality, severe autocorrelation), the recommendation is to keep the current model.
- Similarity of seasonal pattern: The seasonal factors should remain sufficiently correlated with the incumbent when an incumbent comparison is available.
- Distance to incumbent: The candidate’s seasonally adjusted series should not be an outlying distance from the incumbent relative to the candidate set.
The broader ranking step in auto_seasonal_analysis()
also uses AICc, revision metrics for the top candidates,
distance-to-incumbent diagnostics and an engine preference penalty.
Those quantities determine which row is considered “best”;
sa_should_switch() then applies the narrower switching gate
to that row.
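The three checks described above can be sketched as a simple gate. Everything here is an assumption made for illustration: the function name, the field names of `best`, the default cut-offs, and in particular the use of the median candidate distance as the reference for "outlying" are not taken from seasight's implementation.

```r
# Hypothetical switching gate. `best` holds diagnostics of the best candidate;
# `cand_dist` holds the distances to the incumbent for all candidates.
switch_gate_sketch <- function(best, cand_dist,
                               thresholds = list(min_qs_p = 0.10,
                                                 min_lb_p = 0.05,
                                                 min_corr_seas = 0.90,
                                                 max_dist_sa_mult = 1.25)) {
  checks <- c(
    residuals_ok = best$qs_p >= thresholds$min_qs_p &&
      best$lb_p >= thresholds$min_lb_p,
    pattern_ok   = best$corr_seas >= thresholds$min_corr_seas,
    distance_ok  = best$dist_sa <= thresholds$max_dist_sa_mult * median(cand_dist)
  )
  if (all(checks)) "CHANGE_TO_NEW_MODEL" else "KEEP_CURRENT_MODEL"
}

cand_dist <- c(1.0, 1.2, 0.9, 1.1)
ok_row  <- list(qs_p = 0.40, lb_p = 0.30, corr_seas = 0.97, dist_sa = 1.0)
bad_row <- list(qs_p = 0.40, lb_p = 0.01, corr_seas = 0.97, dist_sa = 1.0)

decision_ok  <- switch_gate_sketch(ok_row, cand_dist)
decision_bad <- switch_gate_sketch(bad_row, cand_dist)
c(decision_ok, decision_bad)
```

The gate is deliberately asymmetric: any single failed check keeps the current model, which matches the preference for stability expressed at the start of this section.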
You can tune the thresholds via the thresholds argument, for example:
sa_should_switch(
res_td,
thresholds = list(
min_qs_p = 0.10,
min_lb_p = 0.05,
min_corr_seas = 0.90,
max_dist_sa_mult = 1.25
)
)
3.3 Putting it together: a minimal decision wrapper
In a production pipeline you might want a very compact decision summary per series. For example:
sa_decision <- function(y, current_model = NULL, td_candidates = NULL) {
res <- auto_seasonal_analysis(
y = y,
current_model = current_model,
td_candidates = td_candidates,
engine = "auto",
include_easter = "auto"
)
best_row <- res$table[1, ]
list(
call_overall = res$seasonality$overall$call_overall[1],
do_not_adjust = sa_is_do_not_adjust(best_row),
switch_decision = if (!is.null(current_model)) sa_should_switch(res) else "NO_BASELINE",
best_model = res$best,
diagnostics = sa_tests_model(res$best)
)
}
dec <- sa_decision(x, current_model = m_current, td_candidates = list(lenmonth = td_len))
dec$call_overall
dec$do_not_adjust
dec$switch_decision
You can then store e.g. dec$diagnostics and
sa_copyable_call(dec$best_model, x_expr = "…") in your
internal documentation, and regenerate the full HTML report only when
needed.
4. Further reading
The methodological choices in seasight build on the
standard references used in official statistics:
- Eurostat (2015): ESS Guidelines on Seasonal Adjustment. Practical framework for seasonal adjustment in the European Statistical System, including diagnostics, revision policies and documentation. ([European Commission][1])
- Eurostat (2018): Handbook on Seasonal Adjustment (2018 edition). Comprehensive treatment of X-13ARIMA-SEATS and TRAMO/SEATS methods, including M-statistics, QS tests, calendar effects and revision analysis.
- Gómez & Maravall (1996) and subsequent work on TRAMO/SEATS, for background on model-based decomposition and automatic model identification.
- Grudkowska (2015, 2016): JDemetra+ User Guide and Reference Manual, for an implementation of similar principles in an official ESS tool.
seasight aims to provide a lightweight, R-native
interface for these ideas, compatible with seasonal and
suitable for integration into reproducible forecasting and National
Accounts workflows.