Ming Zhong
Doctoral Student
Faculty of Engineering, University of Regina
Regina, SK, Canada, S4S 0A2
Phone: (902) 496-8152
Fax: (902) 420-5035
Email: Ming.Zhong@stmarys.ca
Pawan Lingras*
Professor
Dept. of Mathematics and Computing Science
Saint Mary’s University
Halifax, NS, Canada, B3H 3C3
Phone: (902) 420-5798
Fax: (902) 420-5035
Email: Pawan.Lingras@stmarys.ca
and
Satish Sharma
Professor
Faculty of Engineering, University of Regina
Regina, SK, Canada, S4S 0A2
Phone: (306) 585-4553
Fax: (306) 585-4855
Email: Satish.Sharma@uregina.ca
ABSTRACT: The principle of Base Data
Integrity, addressed by both the American Association of State Highway and
Transportation Officials (AASHTO) and the American Society for Testing and Materials
(ASTM), recommends that missing values should not be imputed in the base data.
However, updating missing values may be necessary in data analysis and helpful
in establishing more cost-effective traffic data programs. Analyses of
data sets from two highway agencies show that, on average, over 50% of permanent
traffic counts (PTCs) have missing values. It would be difficult to eliminate
such a significant portion of the data from analysis. A literature review
indicates that the limited existing research uses factor or autoregressive integrated
moving average (ARIMA) models for predicting missing values. Factor-based
models tend to be less accurate, and ARIMA models use only the historical data. In
this study, genetically designed neural network and regression models, factor
models, and ARIMA models were developed to update pseudo-missing values of six
PTCs from Alberta, Canada. Both short-term prediction models and the models
based on data from before and after the failure were developed. Factor models
were used as benchmark models. It was found that genetically designed
regression models based on data from before and after the failure had the most
accurate results. Average errors for refined models were lower than 1% and the
95th percentile errors were below 2% for counts with stable
patterns. Even for counts with relatively unstable patterns, average errors
were lower than 3% in most cases, and the 95th percentile errors
were consistently below 9%. ARIMA models and genetically designed neural
network models also outperformed the benchmark factor models.
It is believed that the models proposed in this study would be helpful for
highway agencies in their traffic data programs.
Key words: Missing values, Traffic counts, Genetic algorithms, Time delay neural network, Locally weighted
regression, Autoregressive integrated moving average
INTRODUCTION
Highway agencies commit a
significant portion of their resources to data collection, summarization, and
analysis (Sharma et al. 1996). The
data is used in planning, design, control, operation, and management of traffic
and highway facilities. However, the presence of missing values makes the data
analysis difficult. Without proper imputation methods, traffic counts with
missing values are usually discarded and new counts have to be retaken.
This study analyzed missing values in data sets from two highway agencies in
North America. The first data set was from the Alberta Transportation Department, and
the other was from the Minnesota Department of Transportation (MnDOT). In
Alberta, over seven years, more than half of all counts had missing values.
During some years the percentage was as high as 70% to 90%. One year of data from
MnDOT shows more than 40% of counts having missing values. Williams et al. (1998) applied seasonal ARIMA and
exponential smoothing models to predict short-term traffic for two study sites
on an urban freeway near Washington, D.C. It was reported that approximately 20
percent of the data in the development and test sets of their study were missing.
Ramsey and Hayden (1994) introduced AutoCounts, a computer program used by the
Countywide Traffic and Accident Data Unit (TADU) in England to process
automatic traffic count data. It was found that infill models had to be used to
estimate average flows for more than 50 percent of months for many years at a
study site.
There
are increasing concerns about data imputation and Base Data Integrity. The
principle of Base Data Integrity is an important theme discussed in both
American Society for Testing and Materials (ASTM) Standard Practice E1442, Highway Traffic Monitoring Standards
(America 1991) and the American Association of State Highway and Transportation
Officials (AASHTO) Guidelines for Traffic
Data Programs (America 1992). The principle says that traffic measurements
must be retained without modification and adjustment. Missing values should not
be imputed in the base data. However, this does not prohibit imputing data at
the analysis stage. In some cases, traffic counts with missing values could be the
only data available for a certain purpose, and data imputation is then necessary for
further analysis. In accordance with the principle of Truth-in-Data, the AASHTO Guidelines (America 1992) also
recommend that highway agencies document the procedures they use for editing traffic
data.
For
the traffic counts with missing values, highway agencies usually either retake
the counts or estimate the missing values. Estimating missing values is known
as data imputation. Since sometimes retaking counts was impossible due to limited
resources and time, imputing the data became a popular method (Albright 1991a).
For example, it was reported that many highway agencies in the United States
estimated missing values for their traffic counts (New Mexico 1990). In Europe,
highway authorities in the Netherlands, France, and the United Kingdom all used
computer programs for data validation routines. Usually missing or invalid
data was replaced with historical data from the same site during the same
period (FHWA 1997). The experience with data from Alberta Transportation also
indicates that the agency used data imputation before 1995. The replaced values
of missing data were marked with minus signs for some years. Imputing data with
reasonable accuracy may help establish more cost-effective traffic data
programs. The analysis of Alberta data also shows that a significant percentage
(varying from 10% to 44% from year to year) of traffic counts have missing data
for several successive days or months. Usually these PTCs cannot be used to
calculate annual average daily traffic (AADT) or design hourly volume (DHV) due to the missing data. Such PTCs may be used as
seasonal traffic counts (STCs), short-period traffic counts (SPTCs), or just
discarded by highway agencies. However, the information contained in these PTCs
is certainly more than that from STCs and SPTCs. If missing data from PTCs can
be accurately updated, further analysis could be applied based on AADT or DHV.
A
review of literature indicates that little research has been done on missing
values. Most methods used by transportation practitioners were simple factor
approaches or moving average regression analyses (New Mexico 1990; FHWA 1997).
Two studies (Ahmed and Cook 1979; Nihan and Holmesland 1980) from the United
States used Box-Jenkins techniques to predict short-term traffic for urban freeways.
The models showed reasonable accuracy and can also be used to update
missing values for traffic counts. Models developed by Nihan and Holmesland
(1980) were able to predict average weekday volumes for two months in which
the entire month’s data was missing. A group of scholars at the University of Leeds,
England, tried to model outliers and missing values in traffic count time
series by employing exponentially weighted moving average, autocorrelation-based
influence function, and autoregressive integrated moving average (ARIMA)
models (Clark 1992; Redfern et al. 1993; Watson et al. 1993). It was
found that ARIMA models outperformed other models in detecting missing values
and outliers.
In this study, factor approaches, time series analysis, and genetically designed
neural network and regression models are tested on six permanent traffic counts
(PTCs) from Alberta, Canada to investigate their ability to update missing
values. This study also compares the models based on historical data with
models based on data from before and after the failure. The six PTCs belong to
different groups based on the trip purpose and trip length distributions. The
experiments presented in this paper illustrate how to use the proposed techniques
to update missing values of these PTCs. The techniques used in this study could
be applied not only to permanent traffic volume counts, but also to seasonal or
short-term traffic volume counts, vehicle classification counts, weight counts,
and speed counts.
LITERATURE REVIEW
There is a significant amount of research related to missing values (Little and Rubin
1987; Bole et al. 1990; Beveridge
1992; Wright 1993; Gupta and Lam 1996; Singh and Harmancioglu 1996). However,
limited research is available on how missing data are handled by transportation
practitioners. Southworth et al.
(1989) introduced a system called RTMAS for urban population evacuations in
times of threat. One subroutine of this system is AUTOBOX, which applies
a Box-Jenkins time series model to hourly or daily traffic count data. AUTOBOX allows complete autoregressive
integrated moving average (ARIMA) modeling. The example in their study
showed that the proposed ARIMA model was good at detecting unusual traffic profiles
and at predicting hourly counts. They used data from the past five days to
predict the 24 hourly volumes of the same day of the next week. It was found that
22 hourly volumes were within the 95% confidence level of the observed counts. The
other two were detected as outliers caused by an evacuation response to the
threat of Hurricane Elena. Such a system
can also be used to predict missing values for traffic counts.
In 1990, the New Mexico State
Highway and Transportation Department conducted a survey of traffic monitoring practice
(New Mexico 1990) in the United States. It was shown that when portable devices
failed, 13 states used some procedure to estimate the missing values and
complete the data set. When permanent devices failed, 23 states employed some
procedure to estimate the missing values (Albright 1991b). Various
methods were used for this purpose. For example, in Alabama, if fewer than 6
hours are missing, the data are estimated using the previous year or other data
from the month. If more than 6 hours are missing, the day is voided. In
Delaware, estimates of missing values are based on a straight line using the
data from the months before and after the failure. Most of these methods apply
simple factors to historical data to estimate missing values. In Kentucky, a
computer program was used to estimate and fill in the blanks (New Mexico 1990).
In 1997, the Federal Highway Administration (FHWA) conducted a study of traffic
monitoring programs and technologies in Europe (FHWA 1997). It was reported
that highway agencies in the Netherlands, France, and the United Kingdom used
computer programs for data validation routines. For example, a software system,
INTENS, was used in the Netherlands for data analysis and validation. The software
used a “smart” linear interpolation process between locations from which data
were available to estimate missing traffic volumes. In France, a system called MELODIE
was used for data validation. Data validation was conducted visually by the
system operator. Invalid data were replaced with the previous month’s data. Several
data validation schemes were used in the United Kingdom. One of them was used
by the Central Transport Group (CTG) to validate permanent recorder data. Invalid
data were replaced with data extracted from the valid data of the previous week
collected from that site. No research has been found that assesses the accuracy
of such imputations.
A series of studies (Clark 1992; Redfern et al. 1993; Watson et al.
1993) were carried out by a group of scholars at the University of Leeds, England,
in the early 1990s. Redfern et al.
(1993) tested four types of models on four traffic time series supplied by
the Department of Transportation (DOT) in London. These models were an exponentially
weighted moving average, an autocorrelation-based influence function, an ARIMA model
using large residuals, and an ARIMA model using the Tsay likelihood ratio
diagnostics. It was reported that the estimation of replacement values for both
extreme and missing values was most efficiently done using the parametric
ARIMA(1,0,0)(0,1,1)$_7$ model. However, it was also reported that the
estimated replacements of the missing values showed considerable variation
(Redfern et al. 1993). The study also
mentioned concerns about Base Data Integrity.
A survey of practical solutions used by consultancies and local
authorities in England (Redfern et al.
1993) reported that there were two broad categories of solutions. One is the
“by-eye” method and the other is computerized packages (Redfern et al. 1993). Most automated practical
solutions to patching were based upon simple averages, moving averages, exponentially weighted
moving averages, or their variants. For
example, DOT in London employed an exponentially weighted moving average model
to update missing values. The process involved validating new traffic count
data against old data from the same site collected over the previous weeks at
the same time. The following equation was used to estimate missing or rejected
data at time t:
(1)
where $x_{t-1,s}, x_{t-2,s}, \ldots, x_{t-n,s}$ represent
the observations for that particular site and vehicle category at the same
times for weeks 1, 2, …, n before the
current observation, and the weighting parameter is a constant strictly between 0 and 1.
A value of 0.7 was typically used for this parameter.
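The idea behind such an exponentially weighted infill can be sketched in Python. This is only an illustration under stated assumptions: the exact form of equation (1) is not reproduced here, so the normalized weighting, the function name `ewma_infill`, the parameter name `theta`, and the sample volumes are all hypothetical. Only the constraint that the constant lies between 0 and 1 and the typical value of 0.7 come from the text.

```python
def ewma_infill(previous_weeks, theta=0.7):
    """Estimate a missing count from same-time counts in earlier weeks.

    previous_weeks[0] is the count one week before the gap,
    previous_weeks[1] two weeks before, and so on.
    Weights theta**0, theta**1, ... are normalized to sum to 1
    (an assumed form, not necessarily the agency's equation).
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if not previous_weeks:
        raise ValueError("need at least one earlier observation")
    weights = [theta ** i for i in range(len(previous_weeks))]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, previous_weeks)) / total


# Hypothetical volumes at the same hour in the four preceding weeks.
history = [520.0, 500.0, 510.0, 480.0]
estimate = ewma_infill(history)
```

Because the weights decay geometrically, the most recent week dominates the estimate, while older weeks still damp the effect of a single unusual observation.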
The Countywide Traffic and Accident Data Unit (TADU) used
AutoCounts to validate collected data and infill missing values from automatic
traffic counts (Ramsey and Hayden 1994). The agency needs monthly five- and
seven-day flow averages for trend and yearly analysis within AutoCounts.
Usually these statistics can be obtained directly from the validated data that
have been flagged as typical. However, when there are no typical data the
infill model is applied. The model
estimates weekly flows, and starts with a seasonal profile where all weeks are
considered to be equal. Then, considering the data in ascending order of age by
year, the profile is modified each year. As a starting point the previous
year’s profile is calculated as follows:
(2)
The model is applied on a week-to-week basis for w = 1 to 53 and
= w + 1. Here $FW_w$ is the actual weekly flow for week w; $f_w$
is the estimated weekly seasonal factor for week w; $f_{42}$ (w = 42, mid-October) is always
1.0. The model is applied iteratively
either a maximum of 50 times or until no improvement in fit is achieved. The
output of the process is a full 53-week flow profile for the year under
consideration. No evaluations were made on the accuracy of such models (Ramsey
and Hayden 1994).
This section provides a brief review of factor
approaches, time series analysis, regression analysis, neural networks, and
genetic algorithms used in the present study.
Factor
approaches may be the most popular data imputation or prediction methods.
Factor approaches usually involve developing a set of factors from historical
data set and then applying these factors to new data for predictions. For
example, a set of hourly factors (HF), daily factors (DF), and monthly factors
(MF) can be developed based on data from permanent traffic counts. Traffic
parameters, such as AADT and DHV, then could be predicted by applying these
factors to short-period traffic counts (Garber and Hoel 1999). The virtue of
such methods is their simplicity. However, the results are usually less
accurate than more sophisticated models.
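A minimal Python sketch of a factor approach follows. It assumes the monthly factor is defined as MF = MADT/AADT and that a short-period daily average is expanded by dividing by the factor of the month in which it was taken. The function names and the two-month toy data are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def monthly_factors(daily_volumes_by_month):
    """Return (AADT, {month: MF}) with MF = MADT / AADT.

    daily_volumes_by_month maps a month number to the list of daily
    volumes observed in that month at a permanent count.
    """
    madt = {m: mean(v) for m, v in daily_volumes_by_month.items()}
    all_days = [x for v in daily_volumes_by_month.values() for x in v]
    aadt = mean(all_days)
    return aadt, {m: madt[m] / aadt for m in madt}


def estimate_aadt(short_count_daily_mean, month, factors):
    """Expand a short-period daily average to AADT using that month's factor."""
    return short_count_daily_mean / factors[month]


# Toy "year" with only two months, chosen so the numbers are easy to check.
toy_year = {1: [800.0] * 31, 7: [1200.0] * 31}
aadt, factors = monthly_factors(toy_year)
# A July short-period count averaging 1260 vehicles per day at a similar site.
expanded = estimate_aadt(1260.0, 7, factors)
```

In this toy case the July factor is 1.2, so the July count of 1260 expands to an AADT of about 1050. The simplicity is evident, as is the weakness: the estimate is only as good as the assumption that the new site follows the same seasonal pattern.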
A time series is a
chronological sequence of observations on a particular variable. Time series
data are often examined in hopes of discovering a historical pattern that can
be exploited in the forecast. Time series modeling is based on the assumption
that the historical values of a variable provide an indication of its value in
the future (Box and Jenkins 1970).
Many techniques are available for modeling univariate
time series, such as exponential smoothing, the Holt-Winters procedure, and
the Box-Jenkins procedure. Exponential smoothing should only be used for
non-seasonal time series showing little visible trend. Exponential smoothing
may easily be generalized to deal with time series containing trend and
seasonal variation. The resulting procedure is usually referred to as the
Holt-Winters procedure. The Box-Jenkins procedure is one of the most popular tools for time
series analysis. The procedure builds an autoregressive integrated moving
average (ARIMA) model using the Box-Jenkins methodology. Both autoregressive
and moving average components are considered in these models. Such a model is
called an integrated model because
the stationary model that is fitted to the differenced data has to be summed or
integrated to provide a model for the non-stationary data (Chatfield 1989). The
general autoregressive integrated moving average process is of the form:
Given
$W_t = \nabla^d X_t$ (3)
$W_t = a_1 W_{t-1} + \cdots + a_p W_{t-p} + Z_t + b_1 Z_{t-1} + \cdots + b_q Z_{t-q}$ (4)
where $X_t$ is a non-stationary process;
$W_t$ is a stationary process;
$\nabla$ is the differencing operator;
$Z_t$ is white noise;
$a_i$ and $b_j$ are constants;
p, d, q are the orders of the autoregressive, differencing, and moving average components.
The above ARIMA process describing the dth
differences of the data is said to be of order (p, d, q), usually referred to as ARIMA(p, d, q). An ARIMA model considering seasonality in the data is
often represented by ARIMA(p, d, q)(P, D, Q)$_s$, where P,
D, and Q are the orders of the seasonal autoregressive, differencing, and
moving average components, and s is the seasonal period: the seasonal pattern
repeats every s observations.
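The role of differencing, and of the summation that gives the "integrated" model its name, can be illustrated with a short Python sketch. The helper names and the toy series (a weekly pattern plus a linear trend) are hypothetical. The sketch covers only the differencing step, not the fitting of the autoregressive and moving average terms.

```python
def difference(series, lag=1):
    """Apply the differencing operator once: w[t] = x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(first_values, diffs, lag=1):
    """Invert one round of differencing by summing the differences back.

    first_values holds the first `lag` values of the original series.
    """
    out = list(first_values)
    for w in diffs:
        out.append(out[-lag] + w)
    return out


# Toy daily series: a repeating weekly pattern on top of a linear trend.
pattern = [5, 3, 4, 6, 8, 9, 7]
series = [t + pattern[t % 7] for t in range(28)]

seasonal = difference(series, lag=7)      # removes the weekly pattern
stationary = difference(seasonal, lag=1)  # removes the remaining trend
restored = undifference(series[:7], seasonal, lag=7)
```

Here one seasonal difference with s = 7 leaves a constant series, one further ordinary difference leaves zeros, and summing the seasonal differences back recovers the original values.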
A
variant of regression analysis called locally weighted regression was used in
this study. Locally weighted regression is a form of instance-based (or
memory-based) algorithm for learning continuous mappings from real-valued input
vectors to real-valued output vectors. Local methods assign a weight to each
training observation that regulates its influence on the training process. The
weight depends upon the location of the training point in the input variable
space relative to that of the point to be predicted. Training observations
closer to the prediction point generally receive higher weights (Friedman
1995). The locally weighted regression program used in this study can be
downloaded from a web site (Locally 2001).
Model-based methods, such as neural networks and the mixture of Gaussians, use the data to build a parameterized model. After training, the model is used for predictions and the data are generally discarded. In contrast, “memory-based” methods are non-parametric approaches that explicitly retain the training data, and use it each time a prediction needs to be made. Locally weighted regression is a memory-based method that performs regression around a point of interest using only training data that are “local” to that point. One recent study demonstrated that locally weighted regression was suitable for real-time control by constructing a locally-weighted-regression-based system that learned a difficult juggling task (Schaal and Atkeson 1994).
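The following Python sketch shows the memory-based idea for a scalar input. It is not the program used in this study. The Gaussian kernel, the `bandwidth` parameter, and the function name `lwr_predict` are assumptions made for illustration. Each prediction refits a weighted straight line using all stored training points, with weights that decay with distance from the query.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=1.0):
    """Locally weighted linear regression for a scalar input.

    Training points near `query` receive larger Gaussian weights, and a
    weighted least-squares line is fitted and evaluated at `query`.
    """
    weights = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (query - mx)


# The training data are retained and reused for every query.
xs = [0.1 * i for i in range(101)]
ys = [x * x for x in xs]
prediction = lwr_predict(xs, ys, query=5.0, bandwidth=0.3)
```

A small bandwidth follows local curvature closely but is more sensitive to noise; a large bandwidth approaches an ordinary global regression.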
The neural networks used in this study consist of three layers: input, hidden, and output. The input layer receives data from the outside world. The input layer neurons send information to the hidden layer neurons. The hidden neurons are all the neurons between the input and output layers. They are part of the internal abstract pattern, which represents the neural network’s solution to the problem. The hidden layer neurons feed their output to the output layer neurons, which provide the neural network’s response to the input data.
The variant of neural network used in this study is called time delay
neural network (TDNN) (Hecht-Nielsen 1990). Figure 1 shows an example of a
TDNN. Such networks are particularly useful for time series analysis. The neurons in a
given layer can receive delayed input from other neurons in the same layer. For
example, the network in Figure 1 receives a single input from the external
environment. The remaining nodes in the input layer get their input from the
neuron on the left delayed by one time interval. The input layer at any time
will hold a part of the time series. Such delays can also be incorporated in
other layers.
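The delayed input layer behaves like a shift register. A minimal Python sketch, with the class name `DelayLine` chosen here for illustration, is:

```python
from collections import deque


class DelayLine:
    """Input layer of a TDNN viewed as a tapped delay line.

    The leftmost tap holds the newest external input; every other tap
    holds the value its left neighbour held one time step earlier.
    """

    def __init__(self, size):
        self.taps = deque([0.0] * size, maxlen=size)

    def push(self, value):
        # With maxlen set, appendleft discards the oldest value on the right.
        self.taps.appendleft(value)
        return list(self.taps)


line = DelayLine(3)
for volume in [1.0, 2.0, 3.0, 4.0]:
    window = line.push(volume)
```

After the four pushes the layer holds the three most recent values, newest first, which is the part of the time series the network sees at that time step.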
Neurons process input and produce output. Each
neuron takes in the output from many other neurons. Actual output from a neuron
is calculated using a transfer function. In this study, a sigmoid transfer
function is chosen because it produces a continuous value between 0 and 1. A neuron $n_j$ in a given layer is
connected to neurons $n_1, n_2, \ldots, n_m$ in the previous layer. The
connection from $n_i$ to $n_j$ has the weight $w_{ij}$. The weights of the
connections are initially assigned an arbitrary value between 0 and 1. The
appropriate weights are determined during the training phase. Input to
$n_j$ is obtained using the following equation:
$\mathrm{input}_j = \sum_{i=1}^{m} w_{ij} \, \mathrm{output}_i$ (5)
Output from $n_j$ is calculated using a sigmoid transfer function as:
$\mathrm{output}_j = \dfrac{1}{1 + e^{-\mathrm{input}_j}}$ (6)
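The weighted-sum input and sigmoid output of a single neuron can be sketched in Python as follows. The function names and the sample weights are illustrative assumptions, and the plain logistic form without a gain term is assumed.

```python
import math


def sigmoid(x):
    """Logistic transfer function; its value lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(weights, previous_outputs):
    """Weighted sum of the previous layer's outputs passed through a sigmoid."""
    net_input = sum(w * o for w, o in zip(weights, previous_outputs))
    return sigmoid(net_input)


# Hypothetical neuron with three incoming connections.
out = neuron_output([0.2, 0.5, 0.1], [0.9, 0.4, 0.7])
```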
It is necessary to train a neural network model
on a set of examples called the training set so that it adapts to the system it
is trying to simulate. Supervised learning is the most common form of
adaptation. In supervised learning, the correct output for the output layer is
known. Output neurons are told what the ideal response to input signals should
be. In the training phase, the network constructs an internal representation
that captures the regularities of the data in a distributed and generalized
way. The network attempts to adjust the weights of connections between neurons
to produce the desired output. The back-propagation method is used to adjust
the weights, in which errors from the output are fed back through the network,
altering weights as it goes, to prevent the repetition of the error.
The origin of genetic algorithms (GAs) is attributed to Holland’s work
(Holland 1975) on cellular automata. There has been significant interest in GAs
over the last two decades (Buckles and Petry 1994). The genetic algorithm is a
model of machine learning, which derives its behavior from a metaphor of the
processes of evolution in nature. This is done by the creation within a machine
of a population of individuals represented by chromosomes, in essence a set of
character strings that are analogous to the base-4 chromosomes in human DNA.
The individuals in the population then go through a process of evolution.
In practice, the evolutionary model of
computation can be implemented by having arrays of bits or characters to
represent the chromosomes $c = (c_1, c_2, \ldots, c_n)$, where $c_i$
is called a gene. Simple bit manipulation
operations allow the implementation of crossover, mutation and other
operations. When genetic algorithms are implemented, they are usually done in a
manner that involves the following cycle: “Evaluate the fitness of all of the
individuals in the population; Create a new population by performing operations
such as crossover, fitness-proportionate reproduction and mutation on the
individuals whose fitness has just been measured; Discard the old population
and iterate using the new population.”
The
first generation (generation 0) of this process operates on a population of
randomly generated individuals. From there on, the genetic operations, in
concert with the fitness measure, operate to improve the population.
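The cycle described above can be sketched in Python for bit-string chromosomes. This is a generic illustration rather than the implementation used in this study. The function name `evolve`, the single-point crossover, the default rates, and the toy fitness (the number of 1 bits) are assumptions, and the fitness must be non-negative for the proportionate selection to work.

```python
import random


def evolve(fitness, population, generations=50, crossover_rate=0.9,
           mutation_rate=0.01, rng=None):
    """Generational GA: evaluate, reproduce, discard the old population."""
    if rng is None:
        rng = random.Random(0)
    best = max(population, key=fitness)
    for _ in range(generations):
        # Small offset keeps selection valid if every fitness is zero.
        weights = [fitness(c) + 1e-9 for c in population]
        offspring = []
        while len(offspring) < len(population):
            # Fitness-proportionate choice of two parents.
            a, b = rng.choices(population, weights=weights, k=2)
            child = list(a)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, len(a))
                child = list(a[:cut]) + list(b[cut:])
            # Flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            offspring.append(child)
        population = offspring
        # Remember the best chromosome seen in any generation.
        best = max(population + [best], key=fitness)
    return best, population


rng = random.Random(42)
start = [[rng.randint(0, 1) for _ in range(16)] for _ in range(20)]
best, final = evolve(sum, start, generations=30, rng=rng)
```

Tracking the best chromosome across generations means the returned solution is never worse than the best member of generation 0, even though each old population is discarded.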
Genetic Algorithms for Designing Neural Networks
Many
researchers have used GAs to determine neural network architectures. Harp, et al. (1989)
and Miller, et al. (1989)
used GAs to determine the best connections among network units. Montana and
Davis (1989) used GAs for training the neural networks. Chalmers (1991)
developed learning rules for neural networks using GAs.
Hansen, et al. (1999)
used GAs to design time delay neural networks (TDNN), which included the
determination of important features such as number of inputs, the number of
hidden layers, and the number of hidden neurons in each hidden layer. Hansen, et al. (1999)
applied their networks to model chemical process concentration, chemical
process temperatures, and Wolfer sunspot numbers. Their results clearly showed
advantages of using TDNN configured by GAs over other techniques including
conventional autoregressive integrated moving average (ARIMA) methodology as
described by Box and Jenkins (1970).
Hansen
et al.’s approach (1999) consisted of
building neural networks based on the architectures indicated by the fittest
chromosome. The objective of the evolution was to minimize the training error.
Such an approach is computationally expensive. Another possibility that is used
in this study is to choose the architecture of the input layer using genetic
algorithms.
Lingras
and Mountford (2001) proposed the maximization of linear correlation between
input variables and the output variable as the objective for selecting the
connections between input and hidden layers. Since such an optimization is not
computationally feasible for large input layers, GAs were used to search for a
near optimal solution. It should be
noted here that since the input layer has a section of time series, it is not
possible to eliminate intermediate input neurons. They are necessary to
preserve their time delay connections. However, it is possible to eliminate
their feedforward connections. Lingras and Mountford (2001) achieved superior
performance using GA-designed neural networks for the prediction of inter-city
traffic. The present study uses the same objective function for the development of
regression and neural network models. The developed models were used to update
missing values of traffic counts.
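A correlation-based fitness for a set of candidate input lags might be sketched in Python as below. How the correlations of the individual inputs are combined into one objective is not spelled out here, so the mean of absolute Pearson correlations is an assumption, as are the function names and the sinusoidal test series with a 24-hour period.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lag_fitness(series, lags):
    """Mean absolute correlation between lagged inputs and the output.

    Each lag k pairs the input series[t - k] with the output series[t].
    The averaging rule is an assumed way of combining the correlations.
    """
    start = max(lags)
    target = series[start:]
    total = 0.0
    for lag in lags:
        lagged = series[start - lag:len(series) - lag]
        total += abs(pearson(lagged, target))
    return total / len(lags)


# Toy hourly series with a pure 24-hour cycle.
series = [math.sin(2.0 * math.pi * t / 24.0) for t in range(240)]
daily = lag_fitness(series, [24])
quarter = lag_fitness(series, [6])
```

On this toy series the 24-hour lag scores close to 1 and the 6-hour lag close to 0, so a search guided by this fitness would keep the connection from the same hour of the previous day and drop the other.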
STUDY DATA
Currently, Alberta Transportation employs about 350 permanent traffic
counters (PTCs) to monitor its highway networks. The hierarchical grouping method
proposed by Sharma and Werner (1981) was used to classify these PTCs into
groups. The ratios of monthly average daily traffic (MADT) to annual average daily
traffic (AADT) (known as monthly factor MF
= MADT/AADT) were used to represent the highway sections monitored by these
PTCs during the classification. After studying group patterns from 1996 to
2000, five groups seemed appropriate to represent study data. These groups are
labeled as commuter, regional commuter, rural long-distance, summer
recreational, and winter recreational groups. Figure 2 shows the grouping
results. It can be seen that the commuter group has a flat yearly pattern due to
stable traffic flows across the year. The regional commuter and rural long-distance
groups show higher peaks in the summer and lower troughs in the winter.
The summer recreational group has the sharpest pattern and the highest peak in the summer. The
largest monthly factor (in August) is about 6 times the smallest monthly factor
(in January) for this group. The winter recreational group shows an
interesting yearly pattern: its peak occurs in the winter season (from December
to March).
Six counts were selected from various groups: two from the commuter
group, two from the regional commuter group, one from the rural long-distance,
and one from the recreational group. Due to insufficient data in winter
recreational group, no counts were selected from that group. Table 1 shows PTCs
selected from different groups, their functional classes, AADT values, and
training and test data used in this study.
Figure 3 shows daily patterns for these counts. For the commuter group counts (C011145t
and C002181t) there are two peaks in a day: one in the morning and the
other in the afternoon. The regional commuter count C022161t also has two
peaks in a day, but they are smaller than those of the commuter counts. Even though C003061t
was classified into the regional commuter group based on its yearly pattern, its
daily pattern is very similar to that of the rural long-distance count
C001025t. The daily patterns of both
C003061t and C001025t have two very small peaks. However, the first peak
occurs close to noon instead of in the early morning. The recreational count
C093001t has only one peak, occurring close to noon. The majority of recreational
travel took place within a few hours in the afternoon.
For each count, four or five years of data were used in the experiments, as shown in
Table 1. Five years of data were used for counts from the groups other than
recreational. Since there was a large number of missing values in the 1999 data for
C093001t, only four years of data are available for that count. There are no missing values in the
experimental data. The data are in the form of hourly traffic volumes for both
directions.
STUDY MODELS, RESULTS, AND DISCUSSION
STUDY MODELS
The models were trained and tested by assuming that a certain portion
of the data was missing. Various models were applied to estimate missing values
from six PTCs. This section gives a brief description of the models developed
in this study.
Genetically Designed Regression and Neural Network Models
Two types of genetically designed models were developed in this study. The first type
consisted of short-term prediction models, which used only the data before the
failure as the input. For this type of model, one week of hourly volumes (7 × 24 = 168) before the first
missing value was used as the candidate inputs. The second type of model used
the data from before and after the failure as the input. For these models,
one week of hourly volumes from
each side of the occurrence of the missing value(s) was used as the candidate
inputs. In total, 168 × 2 = 336 hourly volumes were
presented to GAs for selecting 24 final inputs.
Genetically designed regression and neural network models were applied to estimate missing
values from traffic counts. If only one hourly volume was missing, the models were
applied once to update that missing value. If more than one successive
value was missing, the models were applied recursively to estimate the missing values.
Figure 4 shows the prototype of the models used in this study.
First, assuming there were one or more successive missing values in the traffic
counts, the candidate inputs of the models were presented to GAs for selecting 24 final
input variables. These 24 hourly
volumes were chosen because they have the maximum correlation with the traffic
volume of the next hour among all the combinations of 24 variables from the candidate
inputs. The next hour here is the hour whose volume will be predicted based on
the 24 GA-selected inputs. The GA-selected variables were used to train the
neural network and regression models for traffic prediction of the next hour.
The trained neural network or regression models were used to estimate the missing
traffic volume of the first hour, P1. If
there was more than one successive missing value, the same technique was
used to predict the second missing value, P2.
However, at this stage, the candidate pattern presented to GAs for selecting
final inputs included the estimated volume of the first hour, P1, as shown in Figure 4. P1
may or may not be chosen as a final input because there are different input
selection schemas for different hourly models. Figure 5 shows a TDNN model with
inputs selected from a week-long hourly-volume time series. The corresponding
regression model used the same inputs for prediction.
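The recursive use of earlier estimates can be sketched in Python. The function `fill_gap` and the stand-in predictor below are hypothetical. In the study the predictor would be a trained neural network or regression model with GA-selected inputs, whereas the stand-in simply adds last week's hourly change to the previous hour.

```python
def fill_gap(series, gap_start, gap_length, predict_next):
    """Fill consecutive missing hours one at a time.

    predict_next(history) returns an estimate for the hour that follows
    `history`. Each estimate is written back into the series, so it is
    available as a candidate input when the next hour is predicted.
    """
    filled = list(series)
    for k in range(gap_length):
        t = gap_start + k
        filled[t] = predict_next(filled[:t])
    return filled


def hourly_change_predictor(history):
    """Stand-in model: previous hour plus the change seen one week earlier."""
    return history[-1] + (history[-168] - history[-169])


# Toy hourly series with an exact weekly (168-hour) pattern.
known = [float(t % 168) for t in range(400)]
with_gap = list(known)
for t in range(200, 203):
    with_gap[t] = None  # three successive missing hours
filled = fill_gap(with_gap, 200, 3, hourly_change_predictor)
```

The estimate for the second missing hour depends on the estimate for the first, so errors can accumulate over long gaps. This is one motivation for the models that also use data from after the failure.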
A top-down model design (Zhong et al. 2002) was used to search for
models with reasonable accuracy. First, 24-hour universal models were
established and tested; they were then further split into 12-hour
universal models, single-hour models, seasonal single-hour models, day-hour
models, and seasonal day-hour models. The characteristics of these models are
as follows:
1. 24-hour universal models: This approach involved a single TDNN and a single
regression model for all the hours across the year. The advantage of such
models is the simplicity of implementation during the operational phase.
2. 12-hour universal models: In order to improve the models’ accuracy, 12-hour
universal models were built based on 12-hour (from 8:00 a.m. to 8:00 p.m.)
observations. In other words, the observations from late evenings and early
mornings were eliminated from the models.
3. Single-hour models: 12-hour universal models were further split into 12
single-hour models. That is, every hour had a separate model.
4. Seasonal single-hour models: Seasons have a definite impact on travel, so
further dividing single-hour models into seasonal models may improve the
models’ accuracy. Yearly single-hour models were further split into May-August
single-hour models and July-August single-hour models.
5. Day-hour models: Travel patterns also vary by the day of the week. Further
classification of observations into groups for different days (e.g., Wednesday
or Saturday) may improve the models’ accuracy.
6. Seasonal day-hour models: Day-hour models were further split into seasonal
day-hour models (e.g., July-August day-hour models) to explore models with
higher accuracy.
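One way to picture this hierarchy is as a rule that maps each hourly observation to the model responsible for it. The sketch below is an assumption-laden illustration rather than the authors' code: `model_key`, the scheme names, and the reading of "8:00 a.m. to 8:00 p.m." as the hours starting at 8 through 19 are all hypothetical, and the season defaults to July-August although the study also used May-August models.

```python
from datetime import datetime
from typing import Optional, Tuple

DAY_START, DAY_END = 8, 20  # assumed: hours starting 08:00 through 19:00

SCHEMES = (
    "24-hour universal", "12-hour universal", "single-hour",
    "seasonal single-hour", "day-hour", "seasonal day-hour",
)


def model_key(ts: datetime, scheme: str,
              season_months: Tuple[int, ...] = (7, 8)) -> Optional[Tuple]:
    """Return the key of the model that handles the hour starting at `ts`,
    or None if the observation is excluded under the given scheme."""
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    if scheme == "24-hour universal":
        return ("all",)
    # Every other scheme drops late-evening and early-morning hours.
    if not DAY_START <= ts.hour < DAY_END:
        return None
    if scheme == "12-hour universal":
        return ("daytime",)
    # Seasonal schemes keep only observations from the chosen months.
    if scheme.startswith("seasonal") and ts.month not in season_months:
        return None
    if scheme.endswith("single-hour"):
        return (ts.hour,)
    # Day-hour schemes: one model per (day of week, hour); Monday is 0.
    return (ts.weekday(), ts.hour)
```

Grouping training observations by this key yields one data set per model. Each step down the hierarchy gives smaller but more homogeneous groups.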
The model refinement studied by Zhong et al. (2002) was further
extended in this study. Genetic algorithms were used to identify the 24
variables from the input time series that have the highest linear correlation
with the output variable. Depending upon their position in the input time
series, the candidate input variables were labeled from 1 to 168 for models
that used historical data, or from 1 to 336 for models based on data from
before and after the failure. Each chromosome had 24 genes, and each gene was
allowed to take a value from 1 to 168 (or from 1 to 336). The chromosomes with
higher values of linear correlation were selected for creating the next
generation. The population size was set to 110, and the genetic algorithms were
allowed to evolve for 1000 generations. The crossover rate was set to 90%, and
the probability of mutation was set to 1%. The best chromosome found over the
1000 generations was used as the final solution of the search. The connections
selected by the genetic algorithms were used to design and implement the neural
network and regression models. That is, in the regression or neural network
models, the coefficients or weights of the 24 selected input variables were
nonzero, and those of all the other input variables in the time series were set
to zero.
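The search described above can be illustrated with a small genetic algorithm. The sketch below is a simplified stand-in, not the procedure used in the study. The fitness function (the sum of absolute Pearson correlations of the distinct selected inputs with the output), the binary tournament selection, and the one-point crossover are assumptions, since the text does not specify them. The names `pearson` and `select_inputs` are hypothetical, and indices are 0-based rather than 1 to 168.

```python
import random
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def select_inputs(columns: Sequence[Sequence[float]], target: Sequence[float],
                  n_genes: int = 24, pop_size: int = 110,
                  generations: int = 1000, crossover_rate: float = 0.9,
                  mutation_rate: float = 0.01, seed: int = 0) -> List[int]:
    """Return the distinct column indices of the best chromosome found."""
    rng = random.Random(seed)
    n = len(columns)
    corr = [abs(pearson(col, target)) for col in columns]  # computed once

    def fitness(chrom: List[int]) -> float:
        # Assumed fitness: duplicates of the same input count only once.
        return sum(corr[g] for g in set(chrom))

    pop = [[rng.randrange(n) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def pick() -> List[int]:
        a, b = rng.sample(pop, 2)  # binary tournament (assumed)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Each gene mutates independently to a random input label.
            child = [rng.randrange(n) if rng.random() < mutation_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop + [best], key=fitness)  # keep the best seen so far
    return sorted(set(best))
```

The default arguments mirror the settings reported in the text (24 genes, population of 110, 1000 generations, 90% crossover, 1% mutation). The returned indices identify the inputs whose weights or coefficients would be kept nonzero in the neural network and regression models.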
Factor Models
Various factor models were developed for the six experimental counts. These
models were:
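1. Average-history model: Historical data of the same hour from the training
set were averaged to update missing values in the test set. It is assumed that
there is no change in traffic volume from year to year. The replacement values
of missing data were calculated as:
\[ \mathrm{Value}_{test} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{Value}_{i}^{training} \qquad (7) \]
Where: $\mathrm{Value}_{i}^{training}$ is the volume of the same hour in year
i of the training set;
N is the number of years in the training set.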
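2. Factor-history models: These models add growth factors to equation (7). The
replacement values were calculated as a weighted average of historical data,
with growth factors used as the weights. The growth factors (GF) were
calculated as the ratio of the AADT of the test year to the AADTs of the years
in the training set. The replacement values of missing data were calculated as:
\[ GF_{i} = \frac{AADT_{test}}{AADT_{i}^{training}} \qquad (8) \]
\[ \mathrm{Value}_{test} = \frac{1}{N} \sum_{i=1}^{N} GF_{i} \times \mathrm{Value}_{i}^{training} \qquad (9) \]
Where: $AADT_{i}^{training}$ is the AADT of year i in the training set;
$AADT_{test}$ is the AADT of the test year;
$\mathrm{Value}_{i}^{training}$ is the volume of the same hour in year i of the
training set;
N is the number of years in the training set.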
3. Single-hour monthly factor models: Monthly factors from the training set
were used in the test set to calculate the replacement values. The monthly
factors from the months before and after the failure month were used in the
calculations as:
(10)
Where: $mf_{i}$ is the average monthly factor of the failure month calculated
from the training set;
$mf_{i-1}$ and $mf_{i+1}$ are the average monthly factors of the months before
and after the failure month respectively, which are also calculated from the
training set;
$\mathrm{Value}_{i-1}$ and $\mathrm{Value}_{i+1}$ are the hourly volumes of the
same hour in the months before and after the failure month respectively.
4. Day-hour monthly factor models: The previous model does not incorporate the
weekday impact. The day-hour monthly factor models used the same equation as
the single-hour monthly factor models. However, $\mathrm{Value}_{i-1}$ and
$\mathrm{Value}_{i+1}$ were hourly volumes from the same hour and day of the
week.
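The first two factor models can be illustrated with a short sketch. It follows one reading of the descriptions above and should be treated as an assumption: the growth factor for each training year is taken to be the test-year AADT divided by that year's AADT, and the replacement value is the mean of the growth-adjusted historical volumes. The function names are hypothetical. The monthly factor models are not sketched because their equation is not recoverable from the text.

```python
from typing import Sequence


def average_history(history: Sequence[float]) -> float:
    """Average-history model: mean of the same-hour volumes from the
    training years (assumes no change in volume from year to year)."""
    return sum(history) / len(history)


def factor_history(history: Sequence[float],
                   aadt_training: Sequence[float],
                   aadt_test: float) -> float:
    """Factor-history model: same-hour volumes scaled by growth factors
    (assumed to be test-year AADT / training-year AADT), then averaged."""
    if len(history) != len(aadt_training):
        raise ValueError("one AADT value is needed per training year")
    growth = [aadt_test / aadt for aadt in aadt_training]
    return sum(g * v for g, v in zip(growth, history)) / len(history)
```

For example, with same-hour volumes of 100 and 110 in two training years whose AADTs are 1000 and 1100, and a test-year AADT of 1210, the average-history estimate is 105 while the factor-history estimate is about 121, reflecting the traffic growth.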
Autoregressive Integrated Moving Average (ARIMA) Models
Two types of ARIMA models were developed in this study. The characteristics of
these models were as follows:
1. Seasonal ARIMA models for updating one-week-long missing values: Data from
the previous 8 weeks were used to estimate the 168 missing hourly volumes of
the 9th week.
2. July-August day-hour ARIMA models for updating 12 successive missing hourly
values: Data from the previous 8 days (same day of the week) were used to
predict the 12 missing hourly volumes of the 9th day (from 8:00 a.m. to 8:00
p.m.).
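All the models were trained and tested. Depending on the model, the number of
patterns or observations varied. The absolute percentage error was calculated as:
\[ \text{absolute percentage error} = \frac{\left| \text{actual volume} - \text{estimated volume} \right|}{\text{actual volume}} \times 100 \qquad (11) \]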
The key evaluation parameters consisted of the average and the 50th, 85th, and
95th percentiles of the absolute percentage error. These statistics give a
clear profile of a model’s error distribution.
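These evaluation statistics can be computed as in the sketch below. It assumes that the absolute percentage error is taken relative to the actual volume and that percentiles are obtained by linear interpolation between order statistics; the text does not state which percentile definition was used, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def absolute_percentage_errors(actual: Sequence[float],
                               estimated: Sequence[float]) -> List[float]:
    """|actual - estimated| / actual * 100 for each pair (actual must be nonzero)."""
    return [abs(a - e) / a * 100.0 for a, e in zip(actual, estimated)]


def percentile(values: Sequence[float], p: float) -> float:
    """p-th percentile with linear interpolation (an assumed definition)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def error_profile(actual: Sequence[float],
                  estimated: Sequence[float]) -> Dict[str, float]:
    """Average, 50th, 85th, and 95th percentile absolute percentage errors."""
    errors = absolute_percentage_errors(actual, estimated)
    return {
        "average": sum(errors) / len(errors),
        "p50": percentile(errors, 50),
        "p85": percentile(errors, 85),
        "p95": percentile(errors, 95),
    }
```

Reporting the upper percentiles alongside the average shows whether a low mean error hides occasional large misses.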
RESULTS AND DISCUSSION
The models described in the previous section were tested on data from six
PTCs. This paper presents only some of the key results for illustration. For
example, the results from the initial models with low accuracy are summarized
for only one count, whereas the results from the refined models are presented
for all six counts.
The present study uses the best refinement suggested by Zhong et al. (2002)
to develop genetically designed models based on data from before and after the
failure. Factor models were used as benchmark models. ARIMA models were also
developed for comparison purposes. The results for the commuter count
C002181t are used here for illustration.
Factor models were used as benchmark models, since they typically represent
current practice. These models included average-history models, factor-history
models, single-hour monthly factor models, and day-hour monthly factor models.
Both yearly and seasonal factor models were developed. Table 2 shows the errors
of the four factor models for updating missing values in July and August 2000
for C002181t. Average errors for the average-history models are usually more
than 10%. For the factor-history and single-hour monthly factor models, the
average errors are usually between 5% and 10%. Most average errors for the
day-hour monthly factor models are less than 7%, and nearly half of them are
below 5%. Most 95th percentile errors for the average-history models are more
than 20%. For the factor-history models, half of the 95th percentile errors are
more than 20%. Most 95th percentile errors for the single-hour monthly factor
models are less than 20%. The day-hour monthly factor models have the most
accurate results among the factor models: most of their 95th percentile errors
are less than 12%.
Time series analysis models were also developed for comparison. Two types of
autoregressive integrated moving average (ARIMA) models were developed. The
first type was a seasonal ARIMA model, which was used to update one-week-long
missing values with data from the previous two months. Figure 6 shows how
Winter’s multiplicative ARIMA model updates one-week-long missing values for
C002181t in July 2000. Average errors for the ARIMA model are usually between
10% and 15%. The 95th percentile errors are less than 30% for 7 out of 24 hours.
July-August day-hour ARIMA models were also developed for updating 12-hour
missing values on the 9th day using data from the previous 8 days. Figure 7
shows how the ARIMA(0,1,1)(0,1,1) model updates missing values on the 9th
Wednesday in July and August 2000 for C002181t. Average errors are usually
between 3% and 5%, and the 95th percentile errors are usually between 7% and
10%.
A set of genetically designed models was developed. For the 24-hour and
12-hour universal models, GAs used the same input selection schema for all
hours, and there was only one set of weights or coefficients for the neural
network and regression models. Universal neural network and regression models
therefore respond to all input patterns with the same computation strategy,
which led to high prediction errors. For example, for the 24-hour universal
neural network model, the highest 95th percentile error was 171.58%, and even
the lowest 95th percentile error was around 56%. In most cases, the regression
model performed better than the neural network model. However, nearly all the
95th percentile prediction errors were higher than 25% for the 24-hour
universal regression model.
Training pat