How does an ARIMA model capture the effects of omitted causal variables?
Consider the following ...
and ...
thus we can substitute EQUATION B into EQUATION A ...
but since from EQUATION A ...
rearranging EQUATION D we get ...
thus we can substitute EQUATION E into EQUATION C ...
or the now familiar ARIMA model ...
Consider the following ...
A classic transfer function is as follows.
If we omit the X variable then we have to deal with ...
If the omitted variable is stochastic and has no internal time dependency
(white noise), its only effect is to increase the background variance, which biases
the tests of necessity and sufficiency downward. If, however, the omitted
series is stochastic and has some internal autocorrelation, that structure shows up
in the error process, where it can be identified as a regular phenomenon and modeled
as ARIMA structure. For example, if "degree days" is needed to forecast beer sales but is omitted, a seasonal ARIMA
structure will be identified and will act as a surrogate for the omitted variable.
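
A small simulation may make the two stochastic cases concrete. This is only a sketch under assumed values: the helper names (simulate_omitted, lag1_autocorr), the coefficient beta = 1, the AR(1) parameter 0.8 and the sample size are all made up for illustration and are not taken from the derivation above. The series y depends on x, x is then "forgotten", and we look at the lag-1 autocorrelation left in y.

```python
import numpy as np

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a 1-D array."""
    centered = series - series.mean()
    return float(np.dot(centered[1:], centered[:-1]) / np.dot(centered, centered))

def simulate_omitted(phi, n=5000, beta=1.0, seed=0):
    """Return y = beta * x + noise, where x is AR(1) with parameter phi.

    phi = 0 makes x white noise; the caller never sees x, which plays
    the role of the omitted causal variable. All values are illustrative.
    """
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + shocks[t]
    noise = rng.standard_normal(n)
    return beta * x + noise

# Omitted variable is white noise: y just gets noisier, no time structure.
acf_white = lag1_autocorr(simulate_omitted(phi=0.0))

# Omitted variable is autocorrelated: its structure leaks into y.
acf_ar = lag1_autocorr(simulate_omitted(phi=0.8))

print(f"lag-1 autocorrelation, white-noise x omitted: {acf_white:.3f}")
print(f"lag-1 autocorrelation, AR(1) x omitted:       {acf_ar:.3f}")
```

In the first case the leftover series looks like white noise with a larger variance; in the second the autocorrelation is clearly non-zero, and that is the structure an ARIMA identification step would pick up as a stand-in for x.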
If the omitted variable is deterministic and has no recurring pattern,
its effect may be identified via surrogate (intervention) series.
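
As a rough illustration of a surrogate intervention series, the sketch below simulates a series with a one-time level shift (standing in for an omitted deterministic cause) and searches for the step dummy that best explains it. The function name find_level_shift, the shift size, the break position and the least-squares search are assumptions chosen for the example; this is a simplified stand-in, not a full intervention-detection procedure.

```python
import numpy as np

def find_level_shift(y, min_seg=5):
    """Return the index t whose step dummy (0 before t, 1 from t on)
    gives the smallest residual sum of squares.

    Fitting a constant plus a step dummy by least squares is the same
    as fitting separate means before and after t.
    """
    n = len(y)
    best_t, best_sse = None, np.inf
    for t in range(min_seg, n - min_seg):
        before, after = y[:t], y[t:]
        sse = ((before - before.mean()) ** 2).sum() + ((after - after.mean()) ** 2).sum()
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t

# Illustrative data: white noise plus an unexplained jump at t = 60.
rng = np.random.default_rng(1)
n, true_break, shift = 120, 60, 5.0
y = rng.standard_normal(n)
y[true_break:] += shift

estimated_break = find_level_shift(y)
print("true break:", true_break, "estimated break:", estimated_break)
```

The step dummy starting at the estimated break then serves as the surrogate series for the omitted deterministic variable.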