AFS's Forecasting Philosophy

At AFS, our forecasting packages are based on two central beliefs.
The first is that the Box-Jenkins (BJ) approach to model identification,
estimation, diagnostic checking and forecasting, in both its univariate
(ARIMA) and transfer function (causal model) forms, provides the proper
framework for forecasting. The Box-Jenkins modeling paradigm is
very rich and in fact subsumes most other common numerical
forecasting techniques, such as regression and exponential smoothing.
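
One concrete case of this subsumption is simple exponential smoothing,
whose one-step forecasts coincide with those of an ARIMA(0,1,1) model
when the moving-average coefficient is theta = 1 - alpha. The sketch
below checks this numerically. It is an illustration only, not code from
any AFS product: the function names, the data values and the choice
alpha = 0.3 are all assumptions made for the example.

```python
def ses_forecasts(y, alpha, start):
    """One-step-ahead simple exponential smoothing forecasts."""
    forecasts = [start]
    for obs in y:
        forecasts.append(alpha * obs + (1 - alpha) * forecasts[-1])
    return forecasts


def arima011_forecasts(y, theta, start):
    """One-step forecasts from y[t] = y[t-1] + e[t] - theta * e[t-1]."""
    forecasts = [start]
    for obs in y:
        error = obs - forecasts[-1]            # one-step forecast error e[t]
        forecasts.append(obs - theta * error)  # next forecast: y[t] - theta * e[t]
    return forecasts


# Made-up demand figures, used only to exercise the two recursions.
demand = [102.0, 98.5, 105.2, 110.1, 107.4, 111.9, 109.3]
alpha = 0.3

ses = ses_forecasts(demand, alpha, start=demand[0])
bj = arima011_forecasts(demand, theta=1 - alpha, start=demand[0])

max_gap = max(abs(a - b) for a, b in zip(ses, bj))
print(f"largest gap between the two forecast paths: {max_gap:.2e}")
```

The two recursions are algebraically the same, so any gap is floating-point
rounding. Exponential smoothing fixes this model form in advance, whereas
the BJ approach lets the data decide whether that form is appropriate.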

The second belief is quite simple: procedures that consist of methods
applied in a consistent way are subject to automation. We strongly
believe that BJ forecasting techniques are one such set of methods.
So we developed a sophisticated forecasting engine that applies
the modeling philosophy of Box and Jenkins to time series and will,
at your option, build the models automatically. The result is a powerful
package that provides a set of tools for both beginners and experts.

We feel that it is crucial to understand our philosophy of forecasting
before evaluating our products. Given these philosophies, the elegance
and forethought in the design of these packages should be apparent.

Statistical methods can be very useful in summarizing information and
very powerful in testing hypotheses. As in all things, there can be
drawbacks. Before applying a statistical test, one has to validate the
assumptions under which the test is valid. When an assumption is
violated, it is often possible to remedy the violation, and it is then
safer to proceed. One of the most often violated assumptions is
that the observations are independent. Unfortunately, the real world
operates in ignorance of many statistical assumptions, which can
lead to problems in analysis. The good news is that these problems
may be easily overcome, so long as they are recognized and dealt
with correctly.

The assumption of independence of observations implies that the most
recent data point contains no more information or value than any
other data point, including the first one that was measured. In
practice, this means that the most recent reading has no special
effect on estimating the next reading or measurement; it provides
no information about the next reading. If this assumption,
i.e. independence of all readings, is true, then the overall mean or
average is the best estimate of future values. If, however, there is
some serial or autoprojective (autocorrelation) structure, the best
estimate of the next reading will depend on the recently observed
values. The exact form of this prediction is based on the observed
correlation structure. Stock market prices are an example of an
autocorrelated data set. Weather patterns move slowly, so daily
temperatures have been found to be reasonably described by an AR(2)
model, which implies that today's temperature is approximately a
weighted average of the last two days' temperatures. Try it and
see if it doesn't work!
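
To make the comparison between the overall mean and an AR(2) forecast
concrete, here is a minimal sketch on simulated data. Everything in it is
assumed for illustration: the "temperatures" are generated from an AR(2)
process with made-up coefficients (0.6 and 0.2), a made-up level of 60
and a made-up noise scale, and the fit uses the textbook Yule-Walker
equations rather than the estimation method of any particular package.

```python
import random
import statistics


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    m = statistics.fmean(x)
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def fit_ar2(x):
    """Yule-Walker estimates (level, phi1, phi2) for an AR(2) model."""
    r1, r2 = autocorr(x, 1), autocorr(x, 2)
    phi1 = r1 * (1 - r2) / (1 - r1 ** 2)
    phi2 = (r2 - r1 ** 2) / (1 - r1 ** 2)
    return statistics.fmean(x), phi1, phi2


def rmse(errors):
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


# Simulated daily "temperatures" (assumed AR(2) process, not real data).
rng = random.Random(42)
temps = [60.0, 60.0]
for _ in range(2000):
    dev1, dev2 = temps[-1] - 60.0, temps[-2] - 60.0
    temps.append(60.0 + 0.6 * dev1 + 0.2 * dev2 + rng.gauss(0.0, 3.0))

# Estimate on the first half, forecast one step ahead on the second half.
train, test = temps[:1000], temps[1000:]
level, phi1, phi2 = fit_ar2(train)

ar2_errors, mean_errors = [], []
for t in range(2, len(test)):
    ar2_pred = level + phi1 * (test[t - 1] - level) + phi2 * (test[t - 2] - level)
    ar2_errors.append(test[t] - ar2_pred)
    mean_errors.append(test[t] - level)  # forecast that ignores recent values

rmse_ar2, rmse_mean = rmse(ar2_errors), rmse(mean_errors)
print(f"phi1={phi1:.2f} phi2={phi2:.2f}")
print(f"RMSE using last two days: {rmse_ar2:.2f}")
print(f"RMSE using overall mean:  {rmse_mean:.2f}")
```

On data with this kind of serial structure, the forecast that uses the two
most recent readings has the smaller error. On truly independent data the
estimated coefficients would be near zero and the two forecasts would
nearly coincide.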

Statistical Process Control (SPC) in industry has proved to be a major
factor in cost savings and productivity improvement. The statistical
control chart developed by Walter Shewhart in 1931 is still the
most popular chart in use today, due to its simplicity and effectiveness.
Shewhart assumed that the observations are normally distributed
and statistically independent. If these assumptions are not met,
the standard formulas do not apply.

The need for the samples to be statistically independent is critical,
while the lack of normality is less serious. In many cases, statistical
independence cannot be assumed. Inertial elements in the process
frequently cause the observations to become positively autocorrelated:
that is, if X(t) is above the mean, it is likely that X(t+1) will also
be above the mean. In sales and marketing applications, by contrast,
high sales in one month might lead to low sales the following month.
This is sometimes known as "lumpy demand". In this case, there is
negative autocorrelation.
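
The sign of the lag-1 sample autocorrelation separates these two
situations. The short sketch below computes it for two made-up series: a
slowly drifting process reading and an alternating high/low monthly sales
pattern. The numbers are invented for illustration and the function name
is hypothetical.

```python
import statistics


def lag1_autocorr(x):
    """Lag-1 sample autocorrelation of a series."""
    m = statistics.fmean(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den


# Inertial process: readings drift slowly up and then down (made-up values).
inertial = [50, 52, 55, 57, 58, 56, 53, 50, 47, 45, 44, 46]

# "Lumpy demand": a high month tends to be followed by a low one (made-up values).
lumpy = [120, 60, 130, 55, 125, 65, 135, 50, 118, 62, 128, 58]

r_inertial = lag1_autocorr(inertial)
r_lumpy = lag1_autocorr(lumpy)
print(f"inertial series lag-1 autocorrelation: {r_inertial:+.2f}")
print(f"lumpy series lag-1 autocorrelation:    {r_lumpy:+.2f}")
```

The first value comes out positive and the second negative. Either sign
is a departure from independence.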

The mean of the observations also tends to meander or drift. This does
not mean that the process is "out-of-control" -- it actually
represents the inherent process variation. The application of a
Shewhart control chart in this case results in many false alarms,
leading to expensive and fruitless searches for assignable causes.
There is then a clear need to incorporate the autocorrelation effect
into the control charts.
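
The false-alarm problem can be reproduced on simulated data. The sketch
below generates an in-control but positively autocorrelated process (an
AR(1) series with an assumed coefficient of 0.8), applies a standard
individuals chart with 3-sigma limits based on the average moving range,
and then applies the same chart to the residuals left after removing the
estimated lag-1 dependence. Charting residuals is one common way to
account for autocorrelation; it is shown here as an assumption-laden
illustration, not as the procedure used by any specific product.

```python
import random
import statistics


def individuals_chart_alarms(x):
    """Count points outside the 3-sigma limits of an individuals chart."""
    center = statistics.fmean(x)
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    # 1.128 is the d2 constant for moving ranges of two observations.
    sigma = statistics.fmean(moving_ranges) / 1.128
    return sum(1 for v in x if abs(v - center) > 3 * sigma)


# In-control process with inertia: no assignable cause is ever introduced.
rng = random.Random(7)
phi = 0.8  # assumed autocorrelation, chosen for the example
x = [0.0]
for _ in range(999):
    x.append(phi * x[-1] + rng.gauss(0.0, 1.0))

# Estimate the lag-1 dependence and remove it to obtain residuals.
center = statistics.fmean(x)
num = sum((x[t] - center) * (x[t - 1] - center) for t in range(1, len(x)))
den = sum((v - center) ** 2 for v in x)
phi_hat = num / den
residuals = [
    (x[t] - center) - phi_hat * (x[t - 1] - center) for t in range(1, len(x))
]

raw_alarms = individuals_chart_alarms(x)
residual_alarms = individuals_chart_alarms(residuals)
print(f"alarms on raw readings: {raw_alarms} of {len(x)}")
print(f"alarms on residuals:    {residual_alarms} of {len(residuals)}")
```

Because consecutive readings are similar, the moving ranges understate the
true process spread and the limits on the raw chart are too tight. Every
alarm on the raw chart here is false by construction.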

Autoprojective tools or models are surrogates for omitted variables.
An ARIMA model is the ultimate case of an omitted variable or set of
variables. In most cases, as long as the true cause variables don't
change, history is prologue. However, one should always try to collect
and use information or data for the cause variables or potential cause
variables. The second approach can be contrasted with the first by
referring to autoprojective schemes as "rear-window driving", while
causative models or approaches are "front and rear-window driving",
because one often has good estimates of future values of the cause
variables (price, promotion schedule, occurrence of holidays,
competitive price, etc.).

There is no question that "ALL MODELS ARE WRONG" but, simply
stated, "SOME MODELS ARE USEFUL". When Box and Jenkins
codified the process for rigorous, systematic identification of models,
using (of all things) the observed data to aid this process, they
laid out the ground rules for model validation. Many researchers
failed to validate these assumptions for a myriad of reasons, such as
the cost of computers and the time needed to perform checks. Box and
Jenkins themselves took shortcuts and were often misled by the results.
Researchers point out that simple-minded differencing rather than
detrending can yield poor results. Very true! Autobox deals with
this issue.
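
The differencing-versus-detrending point can be seen on a simulated
series with a deterministic straight-line trend plus independent noise.
Differencing such a series removes the trend but leaves a spurious
negative lag-1 autocorrelation (about -0.5 in theory), while subtracting
a fitted line leaves residuals that are close to uncorrelated. The trend
slope, noise scale and sample size below are assumptions for the example,
and the sketch does not represent how Autobox itself chooses between the
two.

```python
import random
import statistics


def lag1_autocorr(x):
    """Lag-1 sample autocorrelation of a series."""
    m = statistics.fmean(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den


# Deterministic trend plus independent noise (simulated, assumed values).
rng = random.Random(2024)
n = 500
times = list(range(n))
y = [10.0 + 0.5 * t + rng.gauss(0.0, 1.0) for t in times]

# Option 1: first differences.
diffs = [b - a for a, b in zip(y, y[1:])]

# Option 2: subtract an ordinary least squares line.
t_bar, y_bar = statistics.fmean(times), statistics.fmean(y)
slope = (
    sum((t - t_bar) * (v - y_bar) for t, v in zip(times, y))
    / sum((t - t_bar) ** 2 for t in times)
)
intercept = y_bar - slope * t_bar
detrended = [v - (intercept + slope * t) for t, v in zip(times, y)]

r_diff = lag1_autocorr(diffs)
r_detrended = lag1_autocorr(detrended)
print(f"lag-1 autocorrelation after differencing: {r_diff:+.2f}")
print(f"lag-1 autocorrelation after detrending:   {r_detrended:+.2f}")
```

The reverse mistake also exists: fitting a line to a series whose level
wanders randomly leaves strongly autocorrelated residuals, which is why
the choice should be checked against the data rather than fixed in
advance.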

In all science there is one recurring theme: "a more comprehensive
approach will lead to better results". There is one caveat,
and that is "implemented correctly". The failure has not
been in the method, but in those implementing the method. AUTOBOX
tries to help the user avoid such pitfalls.