9
Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2013, Article ID 418303, 8 pages http://dx.doi.org/10.1155/2013/418303 Research Article Accurate Multisteps Traffic Flow Prediction Based on SVM Zhang Mingheng, 1 Zhen Yaobao, 2 Hui Ganglong, 3 and Chen Gang 4 1 School of Automotive Engineering, Dalian University of Technology, Dalian 116024, China 2 School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, China 3 School of Navigation, Dalian Maritime University, Dalian 116026, China 4 Department of Mechanical and Manufacturing Engineering, Aalborg University, 169220 Aalborg, Denmark Correspondence should be addressed to Zhang Mingheng; [email protected] Received 29 August 2013; Accepted 12 September 2013 Academic Editor: Rui Mu Copyright © 2013 Zhang Mingheng et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Accurate traffic flow prediction is prerequisite and important for realizing intelligent traffic control and guidance, and it is also the objective requirement for intelligent traffic management. Due to the strong nonlinear, stochastic, time-varying characteristics of urban transport system, artificial intelligence methods such as support vector machine (SVM) are now receiving more and more attentions in this research field. Compared with the traditional single-step prediction method, the multisteps prediction has the ability that can predict the traffic state trends over a certain period in the future. From the perspective of dynamic decision, it is far important than the current traffic condition obtained. us, in this paper, an accurate multi-steps traffic flow prediction model based on SVM was proposed. In which, the input vectors were comprised of actual traffic volume and four different types of input vectors were compared to verify their prediction performance with each other. Finally, the model was verified with actual data in the empirical analysis phase and the test results showed that the proposed SVM model had a good ability for traffic flow prediction and the SVM-HPT model outperformed the other three models for prediction. 1. Introduction 1.1. Backgrounds. Accurate traffic flow prediction is an im- portant research content in intelligent transportation system (ITS). e traffic flow information predicted rapidly and accurately is essential for traffic control, guidance, and pro- viding traffic information services to the public. Especially in recent years, urban traffic congestion are now becoming the major problems in the traffic management all over the world, which produce a series influence hindering the social sustainable development and people’s daily work. e transit agencies also have realized that the rapid and accurate traffic information can help them efficiently make reasonable and effective traffic guidance strategy. us, there is a growing demands in providing accurate traffic flow duration and diffusion prediction over a period of time. e urban transport system has the characteristics such as nonlinear, stochastic, and time-varying. In order to achieve accurate prediction, some scholars applied traffic flow model to explain the dynamic changes and evolution of traffic flow. However, the traffic flow model is relatively complex in construction and has some difficulties with practical appli- cation. erefore, the models based undetermined predictive method encounters the problems with model construction and solving. In contrast, nonmathematical models such as nonparametric regression, neural networks, and SVM are now widely applied to the traffic flow prediction due to their characteristics of self-learning and without complicated mathematical model construction. On the other hand, most of the traditional prediction is based on the single-step method, which can only provide the current or single-step traffic parameters with obtained traffic data. e information provided is not enough for the public or traffic agencies’ making decision. us, multisteps traffic flow prediction is essential for obtaining the traffic state trends over a certain period in the future. From the perspective of dynamic decision, the development trends of traffic states within a certain time in the future are more important than the current traffic state. For example, the traffic congestion is considered occurred with the significant

Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2013 Article ID 418303 8 pageshttpdxdoiorg1011552013418303

Research ArticleAccurate Multisteps Traffic Flow Prediction Based on SVM

Zhang Mingheng1 Zhen Yaobao2 Hui Ganglong3 and Chen Gang4

1 School of Automotive Engineering Dalian University of Technology Dalian 116024 China2 School of Mechanical Engineering Dalian University of Technology Dalian 116024 China3 School of Navigation Dalian Maritime University Dalian 116026 China4Department of Mechanical and Manufacturing Engineering Aalborg University 169220 Aalborg Denmark

Correspondence should be addressed to Zhang Mingheng gloriazhang163com

Received 29 August 2013 Accepted 12 September 2013

Academic Editor Rui Mu

Copyright copy 2013 Zhang Mingheng et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

Accurate traffic flow prediction is prerequisite and important for realizing intelligent traffic control and guidance and it is also theobjective requirement for intelligent traffic management Due to the strong nonlinear stochastic time-varying characteristics ofurban transport system artificial intelligence methods such as support vector machine (SVM) are now receiving more and moreattentions in this research field Compared with the traditional single-step prediction method the multisteps prediction has theability that can predict the traffic state trends over a certain period in the future From the perspective of dynamic decision it isfar important than the current traffic condition obtained Thus in this paper an accurate multi-steps traffic flow prediction modelbased on SVM was proposed In which the input vectors were comprised of actual traffic volume and four different types of inputvectors were compared to verify their prediction performance with each other Finally the model was verified with actual data inthe empirical analysis phase and the test results showed that the proposed SVMmodel had a good ability for traffic flow predictionand the SVM-HPT model outperformed the other three models for prediction

1 Introduction

11 Backgrounds Accurate traffic flow prediction is an im-portant research content in intelligent transportation system(ITS) The traffic flow information predicted rapidly andaccurately is essential for traffic control guidance and pro-viding traffic information services to the public Especiallyin recent years urban traffic congestion are now becomingthe major problems in the traffic management all over theworld which produce a series influence hindering the socialsustainable development and peoplersquos daily work The transitagencies also have realized that the rapid and accurate trafficinformation can help them efficiently make reasonable andeffective traffic guidance strategy Thus there is a growingdemands in providing accurate traffic flow duration anddiffusion prediction over a period of time

The urban transport system has the characteristics suchas nonlinear stochastic and time-varying In order to achieveaccurate prediction some scholars applied traffic flow modelto explain the dynamic changes and evolution of traffic flow

However the traffic flow model is relatively complex inconstruction and has some difficulties with practical appli-cationTherefore the models based undetermined predictivemethod encounters the problems with model constructionand solving In contrast nonmathematical models such asnonparametric regression neural networks and SVM arenow widely applied to the traffic flow prediction due totheir characteristics of self-learning and without complicatedmathematical model construction

On the other hand most of the traditional prediction isbased on the single-step method which can only providethe current or single-step traffic parameters with obtainedtraffic data The information provided is not enough for thepublic or traffic agenciesrsquo making decision Thus multistepstraffic flow prediction is essential for obtaining the trafficstate trends over a certain period in the future From theperspective of dynamic decision the development trends oftraffic states within a certain time in the future are moreimportant than the current traffic state For example thetraffic congestion is considered occurred with the significant

2 Mathematical Problems in Engineering

differences between the obtained traffic data and predictiontrends So we can estimate the range of possible spread ofcongestion and duration according to the traffic parametersof multistep prediction results

However accurate prediction of traffic flow is very dif-ficult due to many stochastic variables involved (eg trafficconditions) Therefore the deployment of traffic flow predic-tion model is a challenging task

12 Literature Review Accurate real-time traffic flow pre-diction is prerequisite and key to realize intelligent trafficcontrol and guidance and it is also the objective requirementto intelligent traffic management Over the past decadesvarious sophisticated techniques and algorithms have beendeveloped for traffic flow prediction These methods can beroughly categorized as prediction methods based on mathe-matics and physics including the historical average modeltime series model Kalman filter model and exponentialsmoothing model the other methods based on nonmath-ematical models such as artificial neural network (ANN)nonparametric regression (NPR) and SVM Here only abrief introduction about the typical method is made moredetailed information is found in their literature respectively

121 Kalman Filter Models The Kalman filter also knownas linear quadratic estimation (LQE) is an efficient recursiveprocedure that estimates the future states of dependentvariables It is originated from the state-space representationsin modern control theory Wang et al [1] developed a trafficstate estimator model based on stochastic macroscopic trafficflow modeling and extended Kalman filtering One majorinnovative feature of the traffic state estimator is the onlinejoint estimation of important model parameters (free speedcritical density and capacity) and traffic flow variables (flowsmean speeds and densities) Hage et al [2] proposed andvalidated an algorithm for estimating the urban links traveltimes based on an unscented Kalman filter (UKF) The algo-rithm integrates stochastically the vehicle count data fromunderground loop detectors at the end of every link and thetravel times fromprobe vehiclesTheir results showed that theproposedmethodology can be used for estimating travel timein real time Yuan et al [3] proposed a state estimator basedon the EKF technique in which observation models for bothEulerian and Lagrangian sensor data (from loop detectorsand vehicle trajectories resp) are incorporated The resultsindicate that the Lagrangian estimator is significantly moreaccurate and offers computational and theoretical benefitsover the Eulerian approach

122 Support Vector Machine Models Support vector ma-chine a novel supervised learning method used for clas-sification and regression has been recently proved to bea promising tool for both data classification and patternrecognition [4ndash6] SVM shows very resistant to the over-fitting problem achieving high generalization performancein solving various time series forecasting problems whichhas been applied in prediction of time series [7 8] Thesesuccessful applications motivate researchers to apply SVM

in the intelligent traffic system (ITS) Yang and Yu [9]proposed a freeway dynamic traffic flow forecasting modelbases on SMO support vector machine With the analysisof the freeway macroscopic dynamic traffic flow model andcarrying out detail research on selecting parameter of SMOsupport vector machines a test experiment was carried outand the result showed that the average relative error offorecasting was less than 384 Yu et al [6 10] developedthe SVM-based models to predict bus arrival time In themodels travel speeds of the preceding buses of the samebus route were used to estimate traffic conditions Theirresults showed that the SVM-based models outperformedthe ANN and historic mean prediction models and SVMseemed to be a powerful alternative for bus arrival timeprediction With versatile parallel distributed structures andadaptive learning processes ANN and SVM have recentlybeen gaining popularity in bus travelarrival time predictionYu et al [11] applied the SVM to predicting bus travel timesIn which a decay factor was introduced to adjust the weightsbetween new and old data and the test results showed that theSVM with the decay factor and the adaptive algorithm hadbetter prediction accuracy and dynamic performance thanother existing algorithms

In summary due to the characteristics of urban trafficstate such as uncertainty nonlinearity and complexity someresearchers use traffic flow model to illustrate the dynamicchanges of traffic state to predict the evolution of the trafficflow and then achieved the short-term traffic flow predictionmodel However the structure of the traffic flow model iscomplex relatively which brings difficulty to the practicalapplication The method based on mathematical model ishard to meet the practical real-time requirements and theneed for accuracy due to its difficulties in model building andsolving In contrast the method based on nonmathematicalmodel does not need complexmodel building the predictionaccuracy can meet the requirements of the intelligent trans-portation systemTherefore these methods have been widelyapplied to the traffic flow prediction

13 Contributions Urban road traffic conditions are not onlyclosely related to historical period conditions but also to theupstream and downstream road state At present most ofthe accurate traffic flow prediction methods are focusing onthe relation analysis between the traffic conditions and thehistorical traffic data from the time dimension of viewWhilethe spatial dimension such as the upstream and downstreamtraffic condition and the traffic mode changes daily areignored In addition most of the current traffic flow pre-dictions are concentrated on the single-step prediction lessin the multisteps prediction In other words the predictionconcentrates more on the coming traffic flow estimation fora specific time point not concerns with the traffic conditionchanging trends in the next certain time periods But infact from the perspective of dynamic decision making thetraffic condition changing trends aremore important than thecurrent traffic situation for the trafficmanagementThereforeit is absolutely essential to establish a multisteps model forthe traffic flow prediction which can be used for estimating

Mathematical Problems in Engineering 3

the road traffic status trends accurately and the predictionresult can be applied in improving rationality of the trafficmanagement and the travel decisionmaking

This paper seeks to make two contributions to theliterature Firstly it attempts to develop the models to predictmultisteps traffic flow with multiple steps using real-worlddata It is expected to help the transit agencies efficientlymakereasonable and effective traffic guidance strategy Secondlyin order to improve the prediction accuracy not only thehistorical traffic data but also the daily space-time sequencesdata are taken into consideration during the input state vectorconstructing The performance of the proposed model canprovide valuable insight for researchers as well as practition-ers

The structure of this paper is organized as followsSection 2 provides the accurate real-time prediction modelfor predicting traffic flow with multisteps Section 3 presentsa case study together with results and analysis includingperformance evaluation of the proposedmodel and lastly theconclusions are given in Section 4 together with suggestionsfor further study

2 Methodologies

21 SupportVectorMachineModels SVM is a type of learningalgorithm based on statistical learning theory which canbe adjusted to map the complex input-output relationshipfor the nonlinear system without dependent on the specificfunctions Unlike other nonlinear optimization methodsthe solution of SVM always can achieve the global optimalsolution without limitation to a local minimum point and itshows the strong resistance to the over-fitting problem andthe high generalization performance

Given the samples (1199091 1199101) (1199092 1199102) (119909

119894 119910119894) (119909119894isin 119883 sube

119877119899 119910119894isin 119884 sube 119877) SVM make use of a nonlinear mapping

120601 to map 119909 into a high-dimensional feature space 119867 inwhich linear approximation is conducted to find themappingfunction so that we can get a better approximation for thegiven data samples Based on the statistical learning theorywe can get the function as follows

119891 (119909) = 120596120601 (119909) + 119887 (1)

Regression can be defined as a problem that minimized therisks for a loss function The optimal regression functionis the minimum and regularized generic function 119876 undercertain constraints

119876 =1

21205962+

119862

119897

119897

sum119894=1

119871120576(119910119894 119891 (119909119894)) (2)

where 120596 is a standard vector The first item is named asthe regularized term which makes the function flat andimprove its generalization ability the second item is namedas the experience risk generic functional which can bedetermined by different loss functions 119862 is used to balancethe relationship between structure risks and experience risks(119862 gt 0) When the insensitive loss function is defined with

119871120576(119910119894 119891 (119909119894)) = max (1003816100381610038161003816119910119894 minus 119891 (119909

119894)1003816100381610038161003816 minus 120576 0) (3)

0200400600800

100012001400160018002000

000

120

240

400

520

640

800

920

104

012

00

132

014

40

160

017

20

184

020

00

212

022

40

000

Volu

me (

veh

h)

Time of the dayMondayWednesdayFriday

SaturdaySunday

Figure 1 Traffic volume diurnal variation trends

The minimization of (2) is a convex quadratic optimizationproblem and then the problem can be inferred with theLagrange multipliers 119886

119894and 119886lowast119894 shown as following

120596 minus

119897

sum119894=1

(120572119894minus 120572lowast

119894) 119909119894= 0 (4)

Then the SVM decision function is

119891 (119909) = sgn(

119897

sum119894=1

(120572119894minus 120572lowast

119894) 120601 (119909119894) sdot 0 (119909) + 119887) (5)

The following equation can be obtained

119891 (119909) =

119897

sum119894=1

(120572119894minus 120572lowast

119894)119870 (119909

119894 119909119895) + 119887 (6)

where 119870(119909119894 119909119895) = 120601(119909

119894) denotes the inner product of the

vector in the feature space All of the kernel functions can bedirectly computed in the input space

22 Multisteps Prediction Methods Based on SVM Trafficflow data are time-series data which have the characteristicssuch as autocorrelation and historical changes with similarityFigure 1 shows the traffic volume changes for a certain roadsection during a few days in a week In which during theweek days the traffic volume changes graduallyThat is to saythe traffic volume data have the autocorrelation propertieswhich can be demonstrated with the relationships betweenthe traffic flow data at time 119905 and before Therefore theprediction of the traffic flow can be conducted according withthis autocorrelation properties

Furthermore other some cues can be inferred Firstlythe traffic volume changes with similarity each day due tothe inhabitant regular daily traveling The data waveformpresents like a saddle and the peakvalley of the waveappeared in the same time In this paper this regularchanging pattern is named as the historical change of sim-ilarity which can be used for establishing the traffic flowchanging model to predict future multisteps traffic state

4 Mathematical Problems in Engineering

Secondly although the daily traffic volume has the samechanging trends the models have different properties fromeach other For example the traffic volume curve are similarfrom Monday to Friday but on Saturday and Sunday thecurve presents different pattern from the week-days thepeakvalley of traffic volume curve is not obvious for distin-guishing

Therefore a historical data model for prediction shouldbe constructed which can be used for predicting the trafficflow historical models for every day (from Monday toSunday) When the database is not enough for predicting wealso can establish the models for week-days and week endsrespectively In addition the trends of traffic flow data alsocan be influenced by weather or holidays

Besides the above demonstration about the traffic flowproperties in time domain the data are also correlated inspace domain The traffic flow data has some correlationsbetween the upstream and downstream for a certain roadsection The upstream traffic flow will reach the downstreamafter a while therefore we can predict the future traffic flowcondition of the downstream based on this correlation

Given the traffic volume 119902119894119895(119905) of road section 119894 time 119905

and day 119895 the observation volume sequence 119876119894119895

= 119902119894119895(1)

119902119894119895(2) 119902

119894119895(119905) the prediction of the traffic volume

119902119894119895(119905) then the forecasting traffic volume sequence of road

section in the future 119899 steps are expressed as 119876119894119895

= 119902119894119895(119905)

119902119894119895(119905+1) 119902

119894119895(119905+119899)The historical pattern data sequence

is defined as 119867119876119894119896

= ℎ119902119894119896(1) ℎ119902

119894119896(2) ℎ119902

119894119896(119905) in

which ℎ119902119894119896(119905) = (1119908)sum 119902

119894119895(119905) is the historical data of section

119894 on the 119896th day (from Monday to Sunday) and 119908 is thenumber of weeks for the data included

Based on the description above and the space-timecharacteristics of the traffic flow data in order to predict thetraffic flow for the 119899 steps (119905 + 1 119905 + 119899) at section 119894 time119905 three input feature vectors are designed for the predictionmodel

(1) autocorrelation time feature vector119876119894119895

= 119902119894119895(119905minus119898)

119902119894119895(119905)

(2) historical model feature vector119867119876119894119896

= ℎ119902119894119896(1)

ℎ119902119894119896(119905 minus 1) ℎ119902

119894119896(119905) ℎ119902

119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) 119896 =

1 2 7 which represents fromMonday to Sundayrespectively

(3) space correlation feature vector119876119894minus1119895

= 119902119894minus1119895

(119905minus119898)

119902119894minus1119895

(119905) 119876119894minus2119895

= 119902119894minus2119895

(119905 minus 119898) 119902119894minus2119895

(119905) The number of time sequence119898 is determined by thetime consumption from the upstream to the currentroad section

The relationship between each input vector and outputvector is shown in Figure 2

From the aspect of theoretical analysis for the single-step mode accurate results can be obtained with only con-sideration of the time state and space state vectors due to thegradient traffic flow characteristics On the contrary for themultisteps mode the traffic flow changing trends play a moreimportant role in prediction therefore more accurate resultscan be achieved by taking the historical data into account

Based on the above analysis the combination of threestate vectors is used as the input data for SVM which are asfollows

(1) the combination of one-dimensional data for thespecified road section in the previous intervals (T)119876119894119895

= 119902119894119895(119905 minus 119898) 119902

119894119895(119905) which describes the

traffic volume changing property with the time inter-vals this property is similar to the single-step predic-tion method

(2) the combination ofmultidimensional space-time dataof the upstream and the specified road section (PT)119876119894minus1119895

= 119902119894minus1119895

(119905minus119898) 119902119894minus1119895

(119905)119876119894minus2119895

= 119902119894minus2119895

(119905minus

119898) 119902119894minus2119895

(119905) which describes the correlation ofthe traffic flow changing in time domain especiallythe correlations between the upstream and down-stream for a certain road section

(3) the combination of the time-series data and thehistorical pattern data for the specified road sectionin the current interval (HT) 119876

119894119895= 119902119894119895(119905 minus 119898)

119902119894119895(119905)119867119876

119894119896= ℎ119902119894119896(1) ℎ119902

119894119896(119905 minus 1) ℎ119902

119894119896(119905)

ℎ119902119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) which describes the

historical similarity of the traffic volume changingproperty demonstrated in the Figure 1

(4) the combination of the time-series data for theupstream and the specified road section in the currentinterval and the historical pattern data for the targetroad section (HPT)

Each of the combination vectors above are used as the inputvariable of the SVM model and 119876

119894119895= 119902119894119895(119905) 119902119894119895(119905 + 1)

119902119894119895(119905 + 119899) as the output variable the SVM-T SVM-TP

SVM-TH and SVM-TPH can be achieved respectively duringthe SVM training process

3 Case Study

31 Evaluation of Model Performance As an indicator toreflect the accuracy and availability of the prediction modelerror plays an important role in evaluating the predictionmodel Common error indictors include absolute predictionerror and relative prediction error which can be calculated asfollows respectively

119864119886(119905) = 119902

119894(119905) minus 119902

119894(119905)

119864119903(119905) =

119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

(7)

where 119864119886(119905) denotes the absolute prediction error at time 119905

119864119903(119905) denotes the relative prediction error 119902

119894(119905) denotes the

measured value and 119902119894(119905) denotes the prediction value 119864

119886(119905)

and119864119903(119905) are the errors of the single-step prediction while for

the dynamic multisteps prediction both the value and errorcan be achieved during each prediction process which can bedescribed with 119902

119894(119905 + 1) 119902

119894(119905 + 2) 119902

119894(119905 + 119899) 119902

119894(119905 + 1) is the

prediction value of (119905 + 1) at the current time 119905 where 119899 is theprediction step size The greater the 119899 the more prediction

Mathematical Problems in Engineering 5

Historical data

qiminus2j (t minus m)

qiminus2j (t minus 1)

qiminus2j (t)

qiminus1j (t minus m)

qiminus1j (t minus 1)

qiminus1j (t)

qij (t minus m)

qij (t minus 1)

qij (t)

Target road

1

(00)

HQik

Time of day

qij(t) qij(t + 1) middot middot middot qij(t+

Roadiminus2 Roadiminus1 Roadi Roadi+1

Figure 2 Prediction mechanism based on the traffic flow space-time characteristics

error we can achieve But a maximum theoretical limit 119899maxexists for avoiding unlimited prediction loop

In order to obtain an objective and accurate evaluation ofthe prediction effect themean value of all the errors is used asthe evaluation index for multisteps prediction the formulasare as follows

119872119864119886(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905))

119872119864119903(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

)

(8)

where 119872119864119886(119905) is the mean value of the absolute error for

multisteps prediction at time 119905 119872119864119903(119905) is the mean value of

the relative error In fact for multisteps prediction the time-series synthetical error is more important for the predictionresult Therefore 119872119864

119886(119905) and 119872119864

119903(119905) are the direct sum

rather than the sum of absolute values of 119864119886(119905) and 119864

119903(119905)

32 Data Collection and Processing The presented SVMmodel for traffic flow prediction was tested with the actualsurvey data of Gaoerji Road in Dalian China dated fromMay 14 (Monday) to 20 (Sunday) 2012 Gaoerji road is aone-way lane with several importsexports and goes fromZhongshan road to Yierjiu street through the city center witha distance of 38 km The traffic state is highly congestedin the morning and afternoon peaks In this research theactual survey started from 700 in the morningThe collecteddata was obtained with SCOOT (Split Cycle and OffsetOptimization Technology) and consist of the traffic volumeduring the peak time (PT 0700ndash0830 h) and off-peak time(OT 1000ndash1200 h) with recording data once every 5 minutes

There are some influence factors including missing errorand random noise in the raw datasets which is obtainedwith SCOOTTherefore the data must be identified repairedand smoothed firstly Then the time-series data is achieved

Table 1 Four popular kernel functions

Kernel function Expression CommentLinear kernel 119870(119909

119894 119909119895) = 119909119879119894119909119895

Polynomialkernel 119870(119909

119894 119909119895) = (120574119909119879

119894119909119895+ 119903)119889

120574 gt 0

Radial basisfunction (RBF)kernel

119870(119909119894 119909119895) = 119890minus120574119909119894minus119909119895

2

120574 gt 0

Sigmoid kernel 119870(119909119894 119909119895) = tanh (120574119909119879

119894119909119895+ 119903)119889

119909119894 119909119895 are input vectors and 120574 119903 and 119889 are kernel parameters

with normalizationmethodThemain advantage of data nor-malization is to accelerate convergence velocity of iterationduring the SVM training Another advantage is to facilitatesubsequent data processing Singular value sample datamightcause numerical problems because the kernel values usuallydepend on the inner products of feature vectors such asthe linear kernel or the polynomial kernel Therefore it isrecommended that each attribute should be linearly scaled tothe range [minus1 +1] or [0 +1] In this research the data sets werescaled to the range between 0 and 1

33 Model Identification In Section 22 demonstrated abovethe input vectors for traffic flow prediction model have beendiscussed and constructed Here we only give an introduc-tion to SVM briefly More detailed descriptions can be foundin [12] Given a training set of instance-label pairs the inputvectors can be mapped implicitly and hence very efficientlyinto high-dimensional space (maybe infinite) by the kernelfunction Φ and SVM finds a linear separating hyper planewith the maximal margin in this higher dimensional spaceAt present new kernel functions are proposed by researcherssome popular kernel functions are as Table 1

The previous researches [13 14] suggested that radialbasis function (RBF) kernel had less numerical difficultieswith application and was efficient for traffic state and traffic

6 Mathematical Problems in Engineering

V1

V2

V3

Vs

K(x1 x)

K(x2 x)

K(xs x)

1205971y1

1205972y2

120597sys

+ f(x)

Figure 3 Structure of the SVMs model for traffic flow prediction

flow prediction Thus RBF kernel function is used for theSVM model in this study Before the RBF kernel is beingused three parameters associated should be determined(119862 120576 120574) Parameter 119862 is a regularization constant or penaltyparameter which is assigned in advance before each SVMmodel training process This parameter allows striking abalance between the two competing criteria of marginmaximization and error minimization whereas the slackvariables 120576 indicate the distance of the incorrectly classifiedpoints from the optimal hyper plane [15] The larger the119862 value the higher the penalty associated to misclassifiedsamples Parameter 120574 is the parameter of RBF function kernelIn summary the entire SVM model training process is aprocess that the parameters are optimized and determinedby simulation Kavzoglu and Colkesen [16] pointed outthat the parameters determination was a practical way toidentify good parameters Thus in this research they arecalibrated by grid-search method During the search processall possible combinations of (119862 120576 120574) are tested and the onewith the best performance is chosen After conducting thegrid search on the training dataset the optimal parameterswere selected as (2

minus2 2minus5 122) and a cross validated onvalidation dataset is conductedThe various SVMs are imple-mented with Libsvm (National Taiwan University availableat httpwwwcsientuedutwsimcjlinlibsvm) on aMicrosoftWindows platform

The structure of the SVMs model for SVM-T SVM-PT SVM-HT and SVM-HPT is shown as Figure 3 In thisstructure the feature vector V is composed with the differentdimensions of each SVMmodel respectively For example inSVM-T model V = (119881

1 1198812 119881

119904) denotes the traffic volume

of road section 119894 and day 119895 at different times 119902119894119895(119905 minus 119904 minus 1)

119902119894119895(119905) The features vector definitions for each SVMs

model are demonstrated in Section 22To determine the SVMrsquos inputs vector for practical appli-

cation some comparison tests have been conducted betweenthe models with different input variables (T PT HT andHPT) which have been calibrated based on the data collectedThe comparison of prediction errors about the four modelson off-peak period and peak period prediction error resultsare shown as Figure 4

During the whole prediction process the data processingwas divided into three steps respectively for training cross-validation and testing Firstly about 10 of samples datawere set as testing data Then 70 of the remaining samples

0000

0020

0040

0060

0080

0100

0120

0140

SVM_T SVM_PT SVM_HT SVM_HPT

n = 1

n = 2

n = 3

Figure 4 Prediction error results of the SVM model with differentinput vectors

data were assigned to training and the others to cross-validation In particular the training and testing processeswere conducted with the same datasets for the four modelsin order to have the same basis of comparison

34 Results Analysis From the comparison results of thefour models (SVM-T SVM-PT SVM-HT and s-HPT) inSection 33 some conclusions can be drawn as follows

(1) Firstly in the case of single-step prediction the num-ber of prediction steps 119899 = 1 The prediction errorcurves in Figure 4 indicate that the results of theprediction error are similar with each other for thefour models and SVM PT has more accuracy thanSVM TThe reason of this case is that the upstream ofthe traffic volume can reach the current road sectionafter a while so a more accurate prediction result canbe obtained with the introduction of upstream trafficvolume into SVM-T

(2) Secondly in the case of multisteps prediction thenumber of prediction steps 119899 gt 1 For the case of 119899 =

2 the prediction error has the similar characteristicsto single-stepmodel For the case of 119899 gt 2 the predic-tion error of SVM T increases significantly with thesteps 119899 increasingThe reason for this is that in SVM Tmodel the traffic volume data is time-series data thedistance between the predicted points and observingdata has the different impact effect for the predictionaccuracy the more closely the more important forprediction result Furthermore for the multistepsprediction the SVM HT and SVM HPT have a goodprediction performance compared with the SVM Twhich only uses the time-series data for predictingThe reason for this is mainly that the historical datacan represent the traffic trends changing mode withtraffic volume for the current road section so theprediction result can be better with the introduction

Mathematical Problems in Engineering 7

0100200300400500600700

000

224

448

712

936

120

0

142

4

164

8

191

2

213

6

000

Volu

me (

veh)

TimeObserving dataPredict valueActual value

Figure 5 Example of dynamic multisteps prediction results ofSVM PT

of upstream traffic volume data and historical datainto SVM-T

(3) It is important to note that the execution performanceof SVM HPT and SVM HT is different due to thecomplicated input vector dimensions in upstreamtraffic volume data and historical data Thereforein practical application the SVM HT model is apriority selection for prediction under the conditionof meeting the requirement prediction accuracy inorder to reduce the time-consumption of the wholesystem

In order to verify the actual performance of the predictionmodel presented above a test was conducted with SVM HTand the result was shown as Figure 5 In this test a certainwhole day traffic volume data were used for predicting thetraffic trends and the sample interval was assigned to 20minutes so as to simplify the computation process From theresult it can be seen that the prediction curves agree wellwith the actual observing data Test of the prediction errorfound that the traffic volume mean error is 168 vehicles andthe mean relative error is 128Thus it can be seen that thereseems to be a strong potential effectiveness to be used to thepractical application

4 Conclusions

The urban transport system has the characteristics such asnonlinear stochastic and time-varying Therefore artificialintelligence methods are now receiving more and moreattentions in ITS In order to predict traffic flow withmultisteps accurately this paper proposed a SVM model forthe prediction In the present research the traffic volume datawith actual observation surveys in urban area of Dalian wasused to predict the traffic flow by SVM models In orderto obtain the prediction effect with different input vectorssome comparison tests are conducted with different inputvariables (T PT HT and HPT) The test results showedthat the proposed SVM model had a good ability for trafficflow prediction and the comparison of different input vectorsindicated that the SVM-HPT model outperformed the otherthree models for prediction

In this paper only the traffic volume data was used toestimate the current traffic conditions Further studywill con-sider more factors analysis such as the input vectors dimen-sions and the prediction steps so as to enhance the perfor-mance of the proposed prediction models

Acknowledgments

This work was jointly supported by grants from the Human-ities and Social Sciences Foundation of the Ministry ofEducation in China (Project no 12YJCZH280) the PhDPrograms Foundation of Ministry of Education of China(Project no 20112125120004) and National Natural ScienceFoundation of China (Project no 11272075)

References

[1] YWangM Papageorgiou andAMessmer ldquoReal-time freewaytraffic state estimation based on extended Kalman filter adap-tive capabilities and real data testingrdquo Transportation ResearchA vol 42 no 10 pp 1340ndash1358 2008

[2] R M Hage D Betaille F Peyret and D Meizel ldquoUnscentedKalman filter for urban network travel time estimationrdquoProcediamdashSocial and Behavioral Sciences vol 54 no 4 pp1047ndash1057 2012

[3] Y Yuan J W C van Lint R E Wilson F van Wageningen-Kessels and S P Hoogendoorn ldquoReal-time lagrangian trafficstate estimator for freewaysrdquo IEEE Transactions on IntelligentTransportation Systems vol 13 no 1 pp 59ndash70 2012

[4] V N Vapnik ldquoAn overview of statistical learning theoryrdquo IEEETransactions on Neural Networks vol 10 no 5 pp 988ndash9991999

[5] B Yu Z Z Yang and B Z Yao ldquoBus arrival time predictionusing support vector machinesrdquo Journal of Intelligent Trans-portation Systems vol 10 no 4 pp 151ndash158 2006

[6] B YuWHK Lam andM L Tam ldquoBus arrival time predictionat bus stopwithmultiple routesrdquoTransportation Research C vol19 no 6 pp 1157ndash1170 2011

[7] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[8] B Z Yao C Y Yang J B Yao and J Sun ldquoTunnel surroundingrock displacement prediction using support vector machinerdquoInternational Journal of Computational Intelligence Systems vol3 no 6 pp 843ndash852 2010

[9] J H Yang and X N Yu ldquoThe support vector machinesprediction model of freeway dynamic traffic flowrdquo Journal ofXirsquoan Technological University no 3 pp 280ndash284 2009

[10] B Yu Z Z Yang K Chen and B Yu ldquoHybrid model forprediction of bus arrival times at next stationrdquo Journal ofAdvanced Transportation vol 44 no 3 pp 193ndash204 2010

[11] B Yu T Ye XM Tian G BNing and SQ Zhong ldquoBus travel-time prediction with forgetting factorrdquo Journal of Computing inCivil Engineering 2012

[12] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[13] V Tyagi S Kalyanaraman and R Krishnapuram ldquoVehiculartraffic density state estimation based on cumulative road acous-ticsrdquo IEEE Transactions on Intelligent Transportation Systemsvol 13 no 3 pp 1156ndash1166 2012

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

2 Mathematical Problems in Engineering

differences between the obtained traffic data and predictiontrends So we can estimate the range of possible spread ofcongestion and duration according to the traffic parametersof multistep prediction results

However accurate prediction of traffic flow is very dif-ficult due to many stochastic variables involved (eg trafficconditions) Therefore the deployment of traffic flow predic-tion model is a challenging task

12 Literature Review Accurate real-time traffic flow pre-diction is prerequisite and key to realize intelligent trafficcontrol and guidance and it is also the objective requirementto intelligent traffic management Over the past decadesvarious sophisticated techniques and algorithms have beendeveloped for traffic flow prediction These methods can beroughly categorized as prediction methods based on mathe-matics and physics including the historical average modeltime series model Kalman filter model and exponentialsmoothing model the other methods based on nonmath-ematical models such as artificial neural network (ANN)nonparametric regression (NPR) and SVM Here only abrief introduction about the typical method is made moredetailed information is found in their literature respectively

121 Kalman Filter Models The Kalman filter also knownas linear quadratic estimation (LQE) is an efficient recursiveprocedure that estimates the future states of dependentvariables It is originated from the state-space representationsin modern control theory Wang et al [1] developed a trafficstate estimator model based on stochastic macroscopic trafficflow modeling and extended Kalman filtering One majorinnovative feature of the traffic state estimator is the onlinejoint estimation of important model parameters (free speedcritical density and capacity) and traffic flow variables (flowsmean speeds and densities) Hage et al [2] proposed andvalidated an algorithm for estimating the urban links traveltimes based on an unscented Kalman filter (UKF) The algo-rithm integrates stochastically the vehicle count data fromunderground loop detectors at the end of every link and thetravel times fromprobe vehiclesTheir results showed that theproposedmethodology can be used for estimating travel timein real time Yuan et al [3] proposed a state estimator basedon the EKF technique in which observation models for bothEulerian and Lagrangian sensor data (from loop detectorsand vehicle trajectories resp) are incorporated The resultsindicate that the Lagrangian estimator is significantly moreaccurate and offers computational and theoretical benefitsover the Eulerian approach

122 Support Vector Machine Models Support vector ma-chine a novel supervised learning method used for clas-sification and regression has been recently proved to bea promising tool for both data classification and patternrecognition [4ndash6] SVM shows very resistant to the over-fitting problem achieving high generalization performancein solving various time series forecasting problems whichhas been applied in prediction of time series [7 8] Thesesuccessful applications motivate researchers to apply SVM

in the intelligent traffic system (ITS) Yang and Yu [9]proposed a freeway dynamic traffic flow forecasting modelbases on SMO support vector machine With the analysisof the freeway macroscopic dynamic traffic flow model andcarrying out detail research on selecting parameter of SMOsupport vector machines a test experiment was carried outand the result showed that the average relative error offorecasting was less than 384 Yu et al [6 10] developedthe SVM-based models to predict bus arrival time In themodels travel speeds of the preceding buses of the samebus route were used to estimate traffic conditions Theirresults showed that the SVM-based models outperformedthe ANN and historic mean prediction models and SVMseemed to be a powerful alternative for bus arrival timeprediction With versatile parallel distributed structures andadaptive learning processes ANN and SVM have recentlybeen gaining popularity in bus travelarrival time predictionYu et al [11] applied the SVM to predicting bus travel timesIn which a decay factor was introduced to adjust the weightsbetween new and old data and the test results showed that theSVM with the decay factor and the adaptive algorithm hadbetter prediction accuracy and dynamic performance thanother existing algorithms

In summary due to the characteristics of urban trafficstate such as uncertainty nonlinearity and complexity someresearchers use traffic flow model to illustrate the dynamicchanges of traffic state to predict the evolution of the trafficflow and then achieved the short-term traffic flow predictionmodel However the structure of the traffic flow model iscomplex relatively which brings difficulty to the practicalapplication The method based on mathematical model ishard to meet the practical real-time requirements and theneed for accuracy due to its difficulties in model building andsolving In contrast the method based on nonmathematicalmodel does not need complexmodel building the predictionaccuracy can meet the requirements of the intelligent trans-portation systemTherefore these methods have been widelyapplied to the traffic flow prediction

13 Contributions Urban road traffic conditions are not onlyclosely related to historical period conditions but also to theupstream and downstream road state At present most ofthe accurate traffic flow prediction methods are focusing onthe relation analysis between the traffic conditions and thehistorical traffic data from the time dimension of viewWhilethe spatial dimension such as the upstream and downstreamtraffic condition and the traffic mode changes daily areignored In addition most of the current traffic flow pre-dictions are concentrated on the single-step prediction lessin the multisteps prediction In other words the predictionconcentrates more on the coming traffic flow estimation fora specific time point not concerns with the traffic conditionchanging trends in the next certain time periods But infact from the perspective of dynamic decision making thetraffic condition changing trends aremore important than thecurrent traffic situation for the trafficmanagementThereforeit is absolutely essential to establish a multisteps model forthe traffic flow prediction which can be used for estimating

Mathematical Problems in Engineering 3

the road traffic status trends accurately and the predictionresult can be applied in improving rationality of the trafficmanagement and the travel decisionmaking

This paper seeks to make two contributions to theliterature Firstly it attempts to develop the models to predictmultisteps traffic flow with multiple steps using real-worlddata It is expected to help the transit agencies efficientlymakereasonable and effective traffic guidance strategy Secondlyin order to improve the prediction accuracy not only thehistorical traffic data but also the daily space-time sequencesdata are taken into consideration during the input state vectorconstructing The performance of the proposed model canprovide valuable insight for researchers as well as practition-ers

The structure of this paper is organized as followsSection 2 provides the accurate real-time prediction modelfor predicting traffic flow with multisteps Section 3 presentsa case study together with results and analysis includingperformance evaluation of the proposedmodel and lastly theconclusions are given in Section 4 together with suggestionsfor further study

2 Methodologies

21 SupportVectorMachineModels SVM is a type of learningalgorithm based on statistical learning theory which canbe adjusted to map the complex input-output relationshipfor the nonlinear system without dependent on the specificfunctions Unlike other nonlinear optimization methodsthe solution of SVM always can achieve the global optimalsolution without limitation to a local minimum point and itshows the strong resistance to the over-fitting problem andthe high generalization performance

Given the samples (1199091 1199101) (1199092 1199102) (119909

119894 119910119894) (119909119894isin 119883 sube

119877119899 119910119894isin 119884 sube 119877) SVM make use of a nonlinear mapping

120601 to map 119909 into a high-dimensional feature space 119867 inwhich linear approximation is conducted to find themappingfunction so that we can get a better approximation for thegiven data samples Based on the statistical learning theorywe can get the function as follows

119891 (119909) = 120596120601 (119909) + 119887 (1)

Regression can be defined as a problem that minimized therisks for a loss function The optimal regression functionis the minimum and regularized generic function 119876 undercertain constraints

119876 =1

21205962+

119862

119897

119897

sum119894=1

119871120576(119910119894 119891 (119909119894)) (2)

where 120596 is a standard vector The first item is named asthe regularized term which makes the function flat andimprove its generalization ability the second item is namedas the experience risk generic functional which can bedetermined by different loss functions 119862 is used to balancethe relationship between structure risks and experience risks(119862 gt 0) When the insensitive loss function is defined with

119871120576(119910119894 119891 (119909119894)) = max (1003816100381610038161003816119910119894 minus 119891 (119909

119894)1003816100381610038161003816 minus 120576 0) (3)

0200400600800

100012001400160018002000

000

120

240

400

520

640

800

920

104

012

00

132

014

40

160

017

20

184

020

00

212

022

40

000

Volu

me (

veh

h)

Time of the dayMondayWednesdayFriday

SaturdaySunday

Figure 1 Traffic volume diurnal variation trends

The minimization of (2) is a convex quadratic optimizationproblem and then the problem can be inferred with theLagrange multipliers 119886

119894and 119886lowast119894 shown as following

120596 minus

119897

sum119894=1

(120572119894minus 120572lowast

119894) 119909119894= 0 (4)

Then the SVM decision function is

119891 (119909) = sgn(

119897

sum119894=1

(120572119894minus 120572lowast

119894) 120601 (119909119894) sdot 0 (119909) + 119887) (5)

The following equation can be obtained

119891 (119909) =

119897

sum119894=1

(120572119894minus 120572lowast

119894)119870 (119909

119894 119909119895) + 119887 (6)

where 119870(119909119894 119909119895) = 120601(119909

119894) denotes the inner product of the

vector in the feature space All of the kernel functions can bedirectly computed in the input space

22 Multisteps Prediction Methods Based on SVM Trafficflow data are time-series data which have the characteristicssuch as autocorrelation and historical changes with similarityFigure 1 shows the traffic volume changes for a certain roadsection during a few days in a week In which during theweek days the traffic volume changes graduallyThat is to saythe traffic volume data have the autocorrelation propertieswhich can be demonstrated with the relationships betweenthe traffic flow data at time 119905 and before Therefore theprediction of the traffic flow can be conducted according withthis autocorrelation properties

Furthermore other some cues can be inferred Firstlythe traffic volume changes with similarity each day due tothe inhabitant regular daily traveling The data waveformpresents like a saddle and the peakvalley of the waveappeared in the same time In this paper this regularchanging pattern is named as the historical change of sim-ilarity which can be used for establishing the traffic flowchanging model to predict future multisteps traffic state

4 Mathematical Problems in Engineering

Secondly although the daily traffic volume has the samechanging trends the models have different properties fromeach other For example the traffic volume curve are similarfrom Monday to Friday but on Saturday and Sunday thecurve presents different pattern from the week-days thepeakvalley of traffic volume curve is not obvious for distin-guishing

Therefore a historical data model for prediction shouldbe constructed which can be used for predicting the trafficflow historical models for every day (from Monday toSunday) When the database is not enough for predicting wealso can establish the models for week-days and week endsrespectively In addition the trends of traffic flow data alsocan be influenced by weather or holidays

Besides the above demonstration about the traffic flowproperties in time domain the data are also correlated inspace domain The traffic flow data has some correlationsbetween the upstream and downstream for a certain roadsection The upstream traffic flow will reach the downstreamafter a while therefore we can predict the future traffic flowcondition of the downstream based on this correlation

Given the traffic volume 119902119894119895(119905) of road section 119894 time 119905

and day 119895 the observation volume sequence 119876119894119895

= 119902119894119895(1)

119902119894119895(2) 119902

119894119895(119905) the prediction of the traffic volume

119902119894119895(119905) then the forecasting traffic volume sequence of road

section in the future 119899 steps are expressed as 119876119894119895

= 119902119894119895(119905)

119902119894119895(119905+1) 119902

119894119895(119905+119899)The historical pattern data sequence

is defined as 119867119876119894119896

= ℎ119902119894119896(1) ℎ119902

119894119896(2) ℎ119902

119894119896(119905) in

which ℎ119902119894119896(119905) = (1119908)sum 119902

119894119895(119905) is the historical data of section

119894 on the 119896th day (from Monday to Sunday) and 119908 is thenumber of weeks for the data included

Based on the description above and the space-timecharacteristics of the traffic flow data in order to predict thetraffic flow for the 119899 steps (119905 + 1 119905 + 119899) at section 119894 time119905 three input feature vectors are designed for the predictionmodel

(1) autocorrelation time feature vector119876119894119895

= 119902119894119895(119905minus119898)

119902119894119895(119905)

(2) historical model feature vector119867119876119894119896

= ℎ119902119894119896(1)

ℎ119902119894119896(119905 minus 1) ℎ119902

119894119896(119905) ℎ119902

119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) 119896 =

1 2 7 which represents fromMonday to Sundayrespectively

(3) space correlation feature vector119876119894minus1119895

= 119902119894minus1119895

(119905minus119898)

119902119894minus1119895

(119905) 119876119894minus2119895

= 119902119894minus2119895

(119905 minus 119898) 119902119894minus2119895

(119905) The number of time sequence119898 is determined by thetime consumption from the upstream to the currentroad section

The relationship between each input vector and outputvector is shown in Figure 2

From the aspect of theoretical analysis for the single-step mode accurate results can be obtained with only con-sideration of the time state and space state vectors due to thegradient traffic flow characteristics On the contrary for themultisteps mode the traffic flow changing trends play a moreimportant role in prediction therefore more accurate resultscan be achieved by taking the historical data into account

Based on the above analysis the combination of threestate vectors is used as the input data for SVM which are asfollows

(1) the combination of one-dimensional data for thespecified road section in the previous intervals (T)119876119894119895

= 119902119894119895(119905 minus 119898) 119902

119894119895(119905) which describes the

traffic volume changing property with the time inter-vals this property is similar to the single-step predic-tion method

(2) the combination ofmultidimensional space-time dataof the upstream and the specified road section (PT)119876119894minus1119895

= 119902119894minus1119895

(119905minus119898) 119902119894minus1119895

(119905)119876119894minus2119895

= 119902119894minus2119895

(119905minus

119898) 119902119894minus2119895

(119905) which describes the correlation ofthe traffic flow changing in time domain especiallythe correlations between the upstream and down-stream for a certain road section

(3) the combination of the time-series data and thehistorical pattern data for the specified road sectionin the current interval (HT) 119876

119894119895= 119902119894119895(119905 minus 119898)

119902119894119895(119905)119867119876

119894119896= ℎ119902119894119896(1) ℎ119902

119894119896(119905 minus 1) ℎ119902

119894119896(119905)

ℎ119902119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) which describes the

historical similarity of the traffic volume changingproperty demonstrated in the Figure 1

(4) the combination of the time-series data for theupstream and the specified road section in the currentinterval and the historical pattern data for the targetroad section (HPT)

Each of the combination vectors above are used as the inputvariable of the SVM model and 119876

119894119895= 119902119894119895(119905) 119902119894119895(119905 + 1)

119902119894119895(119905 + 119899) as the output variable the SVM-T SVM-TP

SVM-TH and SVM-TPH can be achieved respectively duringthe SVM training process

3 Case Study

31 Evaluation of Model Performance As an indicator toreflect the accuracy and availability of the prediction modelerror plays an important role in evaluating the predictionmodel Common error indictors include absolute predictionerror and relative prediction error which can be calculated asfollows respectively

119864119886(119905) = 119902

119894(119905) minus 119902

119894(119905)

119864119903(119905) =

119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

(7)

where 119864119886(119905) denotes the absolute prediction error at time 119905

119864119903(119905) denotes the relative prediction error 119902

119894(119905) denotes the

measured value and 119902119894(119905) denotes the prediction value 119864

119886(119905)

and119864119903(119905) are the errors of the single-step prediction while for

the dynamic multisteps prediction both the value and errorcan be achieved during each prediction process which can bedescribed with 119902

119894(119905 + 1) 119902

119894(119905 + 2) 119902

119894(119905 + 119899) 119902

119894(119905 + 1) is the

prediction value of (119905 + 1) at the current time 119905 where 119899 is theprediction step size The greater the 119899 the more prediction

Mathematical Problems in Engineering 5

Historical data

qiminus2j (t minus m)

qiminus2j (t minus 1)

qiminus2j (t)

qiminus1j (t minus m)

qiminus1j (t minus 1)

qiminus1j (t)

qij (t minus m)

qij (t minus 1)

qij (t)

Target road

1

(00)

HQik

Time of day

qij(t) qij(t + 1) middot middot middot qij(t+

Roadiminus2 Roadiminus1 Roadi Roadi+1

Figure 2 Prediction mechanism based on the traffic flow space-time characteristics

error we can achieve But a maximum theoretical limit 119899maxexists for avoiding unlimited prediction loop

In order to obtain an objective and accurate evaluation ofthe prediction effect themean value of all the errors is used asthe evaluation index for multisteps prediction the formulasare as follows

119872119864119886(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905))

119872119864119903(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

)

(8)

where 119872119864119886(119905) is the mean value of the absolute error for

multisteps prediction at time 119905 119872119864119903(119905) is the mean value of

the relative error In fact for multisteps prediction the time-series synthetical error is more important for the predictionresult Therefore 119872119864

119886(119905) and 119872119864

119903(119905) are the direct sum

rather than the sum of absolute values of 119864119886(119905) and 119864

119903(119905)

32 Data Collection and Processing The presented SVMmodel for traffic flow prediction was tested with the actualsurvey data of Gaoerji Road in Dalian China dated fromMay 14 (Monday) to 20 (Sunday) 2012 Gaoerji road is aone-way lane with several importsexports and goes fromZhongshan road to Yierjiu street through the city center witha distance of 38 km The traffic state is highly congestedin the morning and afternoon peaks In this research theactual survey started from 700 in the morningThe collecteddata was obtained with SCOOT (Split Cycle and OffsetOptimization Technology) and consist of the traffic volumeduring the peak time (PT 0700ndash0830 h) and off-peak time(OT 1000ndash1200 h) with recording data once every 5 minutes

There are some influence factors including missing errorand random noise in the raw datasets which is obtainedwith SCOOTTherefore the data must be identified repairedand smoothed firstly Then the time-series data is achieved

Table 1 Four popular kernel functions

Kernel function Expression CommentLinear kernel 119870(119909

119894 119909119895) = 119909119879119894119909119895

Polynomialkernel 119870(119909

119894 119909119895) = (120574119909119879

119894119909119895+ 119903)119889

120574 gt 0

Radial basisfunction (RBF)kernel

119870(119909119894 119909119895) = 119890minus120574119909119894minus119909119895

2

120574 gt 0

Sigmoid kernel 119870(119909119894 119909119895) = tanh (120574119909119879

119894119909119895+ 119903)119889

119909119894 119909119895 are input vectors and 120574 119903 and 119889 are kernel parameters

with normalizationmethodThemain advantage of data nor-malization is to accelerate convergence velocity of iterationduring the SVM training Another advantage is to facilitatesubsequent data processing Singular value sample datamightcause numerical problems because the kernel values usuallydepend on the inner products of feature vectors such asthe linear kernel or the polynomial kernel Therefore it isrecommended that each attribute should be linearly scaled tothe range [minus1 +1] or [0 +1] In this research the data sets werescaled to the range between 0 and 1

33 Model Identification In Section 22 demonstrated abovethe input vectors for traffic flow prediction model have beendiscussed and constructed Here we only give an introduc-tion to SVM briefly More detailed descriptions can be foundin [12] Given a training set of instance-label pairs the inputvectors can be mapped implicitly and hence very efficientlyinto high-dimensional space (maybe infinite) by the kernelfunction Φ and SVM finds a linear separating hyper planewith the maximal margin in this higher dimensional spaceAt present new kernel functions are proposed by researcherssome popular kernel functions are as Table 1

The previous researches [13 14] suggested that radialbasis function (RBF) kernel had less numerical difficultieswith application and was efficient for traffic state and traffic

6 Mathematical Problems in Engineering

V1

V2

V3

Vs

K(x1 x)

K(x2 x)

K(xs x)

1205971y1

1205972y2

120597sys

+ f(x)

Figure 3 Structure of the SVMs model for traffic flow prediction

flow prediction Thus RBF kernel function is used for theSVM model in this study Before the RBF kernel is beingused three parameters associated should be determined(119862 120576 120574) Parameter 119862 is a regularization constant or penaltyparameter which is assigned in advance before each SVMmodel training process This parameter allows striking abalance between the two competing criteria of marginmaximization and error minimization whereas the slackvariables 120576 indicate the distance of the incorrectly classifiedpoints from the optimal hyper plane [15] The larger the119862 value the higher the penalty associated to misclassifiedsamples Parameter 120574 is the parameter of RBF function kernelIn summary the entire SVM model training process is aprocess that the parameters are optimized and determinedby simulation Kavzoglu and Colkesen [16] pointed outthat the parameters determination was a practical way toidentify good parameters Thus in this research they arecalibrated by grid-search method During the search processall possible combinations of (119862 120576 120574) are tested and the onewith the best performance is chosen After conducting thegrid search on the training dataset the optimal parameterswere selected as (2

minus2 2minus5 122) and a cross validated onvalidation dataset is conductedThe various SVMs are imple-mented with Libsvm (National Taiwan University availableat httpwwwcsientuedutwsimcjlinlibsvm) on aMicrosoftWindows platform

The structure of the SVMs model for SVM-T SVM-PT SVM-HT and SVM-HPT is shown as Figure 3 In thisstructure the feature vector V is composed with the differentdimensions of each SVMmodel respectively For example inSVM-T model V = (119881

1 1198812 119881

119904) denotes the traffic volume

of road section 119894 and day 119895 at different times 119902119894119895(119905 minus 119904 minus 1)

119902119894119895(119905) The features vector definitions for each SVMs

model are demonstrated in Section 22To determine the SVMrsquos inputs vector for practical appli-

cation some comparison tests have been conducted betweenthe models with different input variables (T PT HT andHPT) which have been calibrated based on the data collectedThe comparison of prediction errors about the four modelson off-peak period and peak period prediction error resultsare shown as Figure 4

During the whole prediction process the data processingwas divided into three steps respectively for training cross-validation and testing Firstly about 10 of samples datawere set as testing data Then 70 of the remaining samples

0000

0020

0040

0060

0080

0100

0120

0140

SVM_T SVM_PT SVM_HT SVM_HPT

n = 1

n = 2

n = 3

Figure 4 Prediction error results of the SVM model with differentinput vectors

data were assigned to training and the others to cross-validation In particular the training and testing processeswere conducted with the same datasets for the four modelsin order to have the same basis of comparison

34 Results Analysis From the comparison results of thefour models (SVM-T SVM-PT SVM-HT and s-HPT) inSection 33 some conclusions can be drawn as follows

(1) Firstly in the case of single-step prediction the num-ber of prediction steps 119899 = 1 The prediction errorcurves in Figure 4 indicate that the results of theprediction error are similar with each other for thefour models and SVM PT has more accuracy thanSVM TThe reason of this case is that the upstream ofthe traffic volume can reach the current road sectionafter a while so a more accurate prediction result canbe obtained with the introduction of upstream trafficvolume into SVM-T

(2) Secondly in the case of multisteps prediction thenumber of prediction steps 119899 gt 1 For the case of 119899 =

2 the prediction error has the similar characteristicsto single-stepmodel For the case of 119899 gt 2 the predic-tion error of SVM T increases significantly with thesteps 119899 increasingThe reason for this is that in SVM Tmodel the traffic volume data is time-series data thedistance between the predicted points and observingdata has the different impact effect for the predictionaccuracy the more closely the more important forprediction result Furthermore for the multistepsprediction the SVM HT and SVM HPT have a goodprediction performance compared with the SVM Twhich only uses the time-series data for predictingThe reason for this is mainly that the historical datacan represent the traffic trends changing mode withtraffic volume for the current road section so theprediction result can be better with the introduction

Mathematical Problems in Engineering 7

0100200300400500600700

000

224

448

712

936

120

0

142

4

164

8

191

2

213

6

000

Volu

me (

veh)

TimeObserving dataPredict valueActual value

Figure 5 Example of dynamic multisteps prediction results ofSVM PT

of upstream traffic volume data and historical datainto SVM-T

(3) It is important to note that the execution performanceof SVM HPT and SVM HT is different due to thecomplicated input vector dimensions in upstreamtraffic volume data and historical data Thereforein practical application the SVM HT model is apriority selection for prediction under the conditionof meeting the requirement prediction accuracy inorder to reduce the time-consumption of the wholesystem

In order to verify the actual performance of the predictionmodel presented above a test was conducted with SVM HTand the result was shown as Figure 5 In this test a certainwhole day traffic volume data were used for predicting thetraffic trends and the sample interval was assigned to 20minutes so as to simplify the computation process From theresult it can be seen that the prediction curves agree wellwith the actual observing data Test of the prediction errorfound that the traffic volume mean error is 168 vehicles andthe mean relative error is 128Thus it can be seen that thereseems to be a strong potential effectiveness to be used to thepractical application

4 Conclusions

The urban transport system has the characteristics such asnonlinear stochastic and time-varying Therefore artificialintelligence methods are now receiving more and moreattentions in ITS In order to predict traffic flow withmultisteps accurately this paper proposed a SVM model forthe prediction In the present research the traffic volume datawith actual observation surveys in urban area of Dalian wasused to predict the traffic flow by SVM models In orderto obtain the prediction effect with different input vectorssome comparison tests are conducted with different inputvariables (T PT HT and HPT) The test results showedthat the proposed SVM model had a good ability for trafficflow prediction and the comparison of different input vectorsindicated that the SVM-HPT model outperformed the otherthree models for prediction

In this paper only the traffic volume data was used toestimate the current traffic conditions Further studywill con-sider more factors analysis such as the input vectors dimen-sions and the prediction steps so as to enhance the perfor-mance of the proposed prediction models

Acknowledgments

This work was jointly supported by grants from the Human-ities and Social Sciences Foundation of the Ministry ofEducation in China (Project no 12YJCZH280) the PhDPrograms Foundation of Ministry of Education of China(Project no 20112125120004) and National Natural ScienceFoundation of China (Project no 11272075)

References

[1] YWangM Papageorgiou andAMessmer ldquoReal-time freewaytraffic state estimation based on extended Kalman filter adap-tive capabilities and real data testingrdquo Transportation ResearchA vol 42 no 10 pp 1340ndash1358 2008

[2] R M Hage D Betaille F Peyret and D Meizel ldquoUnscentedKalman filter for urban network travel time estimationrdquoProcediamdashSocial and Behavioral Sciences vol 54 no 4 pp1047ndash1057 2012

[3] Y Yuan J W C van Lint R E Wilson F van Wageningen-Kessels and S P Hoogendoorn ldquoReal-time lagrangian trafficstate estimator for freewaysrdquo IEEE Transactions on IntelligentTransportation Systems vol 13 no 1 pp 59ndash70 2012

[4] V N Vapnik ldquoAn overview of statistical learning theoryrdquo IEEETransactions on Neural Networks vol 10 no 5 pp 988ndash9991999

[5] B Yu Z Z Yang and B Z Yao ldquoBus arrival time predictionusing support vector machinesrdquo Journal of Intelligent Trans-portation Systems vol 10 no 4 pp 151ndash158 2006

[6] B YuWHK Lam andM L Tam ldquoBus arrival time predictionat bus stopwithmultiple routesrdquoTransportation Research C vol19 no 6 pp 1157ndash1170 2011

[7] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[8] B Z Yao C Y Yang J B Yao and J Sun ldquoTunnel surroundingrock displacement prediction using support vector machinerdquoInternational Journal of Computational Intelligence Systems vol3 no 6 pp 843ndash852 2010

[9] J H Yang and X N Yu ldquoThe support vector machinesprediction model of freeway dynamic traffic flowrdquo Journal ofXirsquoan Technological University no 3 pp 280ndash284 2009

[10] B Yu Z Z Yang K Chen and B Yu ldquoHybrid model forprediction of bus arrival times at next stationrdquo Journal ofAdvanced Transportation vol 44 no 3 pp 193ndash204 2010

[11] B Yu T Ye XM Tian G BNing and SQ Zhong ldquoBus travel-time prediction with forgetting factorrdquo Journal of Computing inCivil Engineering 2012

[12] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[13] V Tyagi S Kalyanaraman and R Krishnapuram ldquoVehiculartraffic density state estimation based on cumulative road acous-ticsrdquo IEEE Transactions on Intelligent Transportation Systemsvol 13 no 3 pp 1156ndash1166 2012

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

Mathematical Problems in Engineering 3

the road traffic status trends accurately and the predictionresult can be applied in improving rationality of the trafficmanagement and the travel decisionmaking

This paper seeks to make two contributions to theliterature Firstly it attempts to develop the models to predictmultisteps traffic flow with multiple steps using real-worlddata It is expected to help the transit agencies efficientlymakereasonable and effective traffic guidance strategy Secondlyin order to improve the prediction accuracy not only thehistorical traffic data but also the daily space-time sequencesdata are taken into consideration during the input state vectorconstructing The performance of the proposed model canprovide valuable insight for researchers as well as practition-ers

The structure of this paper is organized as followsSection 2 provides the accurate real-time prediction modelfor predicting traffic flow with multisteps Section 3 presentsa case study together with results and analysis includingperformance evaluation of the proposedmodel and lastly theconclusions are given in Section 4 together with suggestionsfor further study

2 Methodologies

21 SupportVectorMachineModels SVM is a type of learningalgorithm based on statistical learning theory which canbe adjusted to map the complex input-output relationshipfor the nonlinear system without dependent on the specificfunctions Unlike other nonlinear optimization methodsthe solution of SVM always can achieve the global optimalsolution without limitation to a local minimum point and itshows the strong resistance to the over-fitting problem andthe high generalization performance

Given the samples (1199091 1199101) (1199092 1199102) (119909

119894 119910119894) (119909119894isin 119883 sube

119877119899 119910119894isin 119884 sube 119877) SVM make use of a nonlinear mapping

120601 to map 119909 into a high-dimensional feature space 119867 inwhich linear approximation is conducted to find themappingfunction so that we can get a better approximation for thegiven data samples Based on the statistical learning theorywe can get the function as follows

119891 (119909) = 120596120601 (119909) + 119887 (1)

Regression can be defined as a problem that minimized therisks for a loss function The optimal regression functionis the minimum and regularized generic function 119876 undercertain constraints

119876 =1

21205962+

119862

119897

119897

sum119894=1

119871120576(119910119894 119891 (119909119894)) (2)

where 120596 is a standard vector The first item is named asthe regularized term which makes the function flat andimprove its generalization ability the second item is namedas the experience risk generic functional which can bedetermined by different loss functions 119862 is used to balancethe relationship between structure risks and experience risks(119862 gt 0) When the insensitive loss function is defined with

119871120576(119910119894 119891 (119909119894)) = max (1003816100381610038161003816119910119894 minus 119891 (119909

119894)1003816100381610038161003816 minus 120576 0) (3)

0200400600800

100012001400160018002000

000

120

240

400

520

640

800

920

104

012

00

132

014

40

160

017

20

184

020

00

212

022

40

000

Volu

me (

veh

h)

Time of the dayMondayWednesdayFriday

SaturdaySunday

Figure 1 Traffic volume diurnal variation trends

The minimization of (2) is a convex quadratic optimizationproblem and then the problem can be inferred with theLagrange multipliers 119886

119894and 119886lowast119894 shown as following

120596 minus

119897

sum119894=1

(120572119894minus 120572lowast

119894) 119909119894= 0 (4)

Then the SVM decision function is

119891 (119909) = sgn(

119897

sum119894=1

(120572119894minus 120572lowast

119894) 120601 (119909119894) sdot 0 (119909) + 119887) (5)

The following equation can be obtained

119891 (119909) =

119897

sum119894=1

(120572119894minus 120572lowast

119894)119870 (119909

119894 119909119895) + 119887 (6)

where 119870(119909119894 119909119895) = 120601(119909

119894) denotes the inner product of the

vector in the feature space All of the kernel functions can bedirectly computed in the input space

22 Multisteps Prediction Methods Based on SVM Trafficflow data are time-series data which have the characteristicssuch as autocorrelation and historical changes with similarityFigure 1 shows the traffic volume changes for a certain roadsection during a few days in a week In which during theweek days the traffic volume changes graduallyThat is to saythe traffic volume data have the autocorrelation propertieswhich can be demonstrated with the relationships betweenthe traffic flow data at time 119905 and before Therefore theprediction of the traffic flow can be conducted according withthis autocorrelation properties

Furthermore other some cues can be inferred Firstlythe traffic volume changes with similarity each day due tothe inhabitant regular daily traveling The data waveformpresents like a saddle and the peakvalley of the waveappeared in the same time In this paper this regularchanging pattern is named as the historical change of sim-ilarity which can be used for establishing the traffic flowchanging model to predict future multisteps traffic state

4 Mathematical Problems in Engineering

Secondly although the daily traffic volume has the samechanging trends the models have different properties fromeach other For example the traffic volume curve are similarfrom Monday to Friday but on Saturday and Sunday thecurve presents different pattern from the week-days thepeakvalley of traffic volume curve is not obvious for distin-guishing

Therefore a historical data model for prediction shouldbe constructed which can be used for predicting the trafficflow historical models for every day (from Monday toSunday) When the database is not enough for predicting wealso can establish the models for week-days and week endsrespectively In addition the trends of traffic flow data alsocan be influenced by weather or holidays

Besides the above demonstration about the traffic flowproperties in time domain the data are also correlated inspace domain The traffic flow data has some correlationsbetween the upstream and downstream for a certain roadsection The upstream traffic flow will reach the downstreamafter a while therefore we can predict the future traffic flowcondition of the downstream based on this correlation

Given the traffic volume 119902119894119895(119905) of road section 119894 time 119905

and day 119895 the observation volume sequence 119876119894119895

= 119902119894119895(1)

119902119894119895(2) 119902

119894119895(119905) the prediction of the traffic volume

119902119894119895(119905) then the forecasting traffic volume sequence of road

section in the future 119899 steps are expressed as 119876119894119895

= 119902119894119895(119905)

119902119894119895(119905+1) 119902

119894119895(119905+119899)The historical pattern data sequence

is defined as 119867119876119894119896

= ℎ119902119894119896(1) ℎ119902

119894119896(2) ℎ119902

119894119896(119905) in

which ℎ119902119894119896(119905) = (1119908)sum 119902

119894119895(119905) is the historical data of section

119894 on the 119896th day (from Monday to Sunday) and 119908 is thenumber of weeks for the data included

Based on the description above and the space-timecharacteristics of the traffic flow data in order to predict thetraffic flow for the 119899 steps (119905 + 1 119905 + 119899) at section 119894 time119905 three input feature vectors are designed for the predictionmodel

(1) autocorrelation time feature vector119876119894119895

= 119902119894119895(119905minus119898)

119902119894119895(119905)

(2) historical model feature vector119867119876119894119896

= ℎ119902119894119896(1)

ℎ119902119894119896(119905 minus 1) ℎ119902

119894119896(119905) ℎ119902

119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) 119896 =

1 2 7 which represents fromMonday to Sundayrespectively

(3) space correlation feature vector119876119894minus1119895

= 119902119894minus1119895

(119905minus119898)

119902119894minus1119895

(119905) 119876119894minus2119895

= 119902119894minus2119895

(119905 minus 119898) 119902119894minus2119895

(119905) The number of time sequence119898 is determined by thetime consumption from the upstream to the currentroad section

The relationship between each input vector and outputvector is shown in Figure 2

From the aspect of theoretical analysis for the single-step mode accurate results can be obtained with only con-sideration of the time state and space state vectors due to thegradient traffic flow characteristics On the contrary for themultisteps mode the traffic flow changing trends play a moreimportant role in prediction therefore more accurate resultscan be achieved by taking the historical data into account

Based on the above analysis the combination of threestate vectors is used as the input data for SVM which are asfollows

(1) the combination of one-dimensional data for thespecified road section in the previous intervals (T)119876119894119895

= 119902119894119895(119905 minus 119898) 119902

119894119895(119905) which describes the

traffic volume changing property with the time inter-vals this property is similar to the single-step predic-tion method

(2) the combination ofmultidimensional space-time dataof the upstream and the specified road section (PT)119876119894minus1119895

= 119902119894minus1119895

(119905minus119898) 119902119894minus1119895

(119905)119876119894minus2119895

= 119902119894minus2119895

(119905minus

119898) 119902119894minus2119895

(119905) which describes the correlation ofthe traffic flow changing in time domain especiallythe correlations between the upstream and down-stream for a certain road section

(3) the combination of the time-series data and thehistorical pattern data for the specified road sectionin the current interval (HT) 119876

119894119895= 119902119894119895(119905 minus 119898)

119902119894119895(119905)119867119876

119894119896= ℎ119902119894119896(1) ℎ119902

119894119896(119905 minus 1) ℎ119902

119894119896(119905)

ℎ119902119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) which describes the

historical similarity of the traffic volume changingproperty demonstrated in the Figure 1

(4) the combination of the time-series data for theupstream and the specified road section in the currentinterval and the historical pattern data for the targetroad section (HPT)

Each of the combination vectors above are used as the inputvariable of the SVM model and 119876

119894119895= 119902119894119895(119905) 119902119894119895(119905 + 1)

119902119894119895(119905 + 119899) as the output variable the SVM-T SVM-TP

SVM-TH and SVM-TPH can be achieved respectively duringthe SVM training process

3 Case Study

31 Evaluation of Model Performance As an indicator toreflect the accuracy and availability of the prediction modelerror plays an important role in evaluating the predictionmodel Common error indictors include absolute predictionerror and relative prediction error which can be calculated asfollows respectively

119864119886(119905) = 119902

119894(119905) minus 119902

119894(119905)

119864119903(119905) =

119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

(7)

where 119864119886(119905) denotes the absolute prediction error at time 119905

119864119903(119905) denotes the relative prediction error 119902

119894(119905) denotes the

measured value and 119902119894(119905) denotes the prediction value 119864

119886(119905)

and119864119903(119905) are the errors of the single-step prediction while for

the dynamic multisteps prediction both the value and errorcan be achieved during each prediction process which can bedescribed with 119902

119894(119905 + 1) 119902

119894(119905 + 2) 119902

119894(119905 + 119899) 119902

119894(119905 + 1) is the

prediction value of (119905 + 1) at the current time 119905 where 119899 is theprediction step size The greater the 119899 the more prediction

Mathematical Problems in Engineering 5

Historical data

qiminus2j (t minus m)

qiminus2j (t minus 1)

qiminus2j (t)

qiminus1j (t minus m)

qiminus1j (t minus 1)

qiminus1j (t)

qij (t minus m)

qij (t minus 1)

qij (t)

Target road

1

(00)

HQik

Time of day

qij(t) qij(t + 1) middot middot middot qij(t+

Roadiminus2 Roadiminus1 Roadi Roadi+1

Figure 2 Prediction mechanism based on the traffic flow space-time characteristics

error we can achieve But a maximum theoretical limit 119899maxexists for avoiding unlimited prediction loop

In order to obtain an objective and accurate evaluation ofthe prediction effect themean value of all the errors is used asthe evaluation index for multisteps prediction the formulasare as follows

119872119864119886(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905))

119872119864119903(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

)

(8)

where 119872119864119886(119905) is the mean value of the absolute error for

multisteps prediction at time 119905 119872119864119903(119905) is the mean value of

the relative error In fact for multisteps prediction the time-series synthetical error is more important for the predictionresult Therefore 119872119864

119886(119905) and 119872119864

119903(119905) are the direct sum

rather than the sum of absolute values of 119864119886(119905) and 119864

119903(119905)

32 Data Collection and Processing The presented SVMmodel for traffic flow prediction was tested with the actualsurvey data of Gaoerji Road in Dalian China dated fromMay 14 (Monday) to 20 (Sunday) 2012 Gaoerji road is aone-way lane with several importsexports and goes fromZhongshan road to Yierjiu street through the city center witha distance of 38 km The traffic state is highly congestedin the morning and afternoon peaks In this research theactual survey started from 700 in the morningThe collecteddata was obtained with SCOOT (Split Cycle and OffsetOptimization Technology) and consist of the traffic volumeduring the peak time (PT 0700ndash0830 h) and off-peak time(OT 1000ndash1200 h) with recording data once every 5 minutes

There are some influence factors including missing errorand random noise in the raw datasets which is obtainedwith SCOOTTherefore the data must be identified repairedand smoothed firstly Then the time-series data is achieved

Table 1 Four popular kernel functions

Kernel function Expression CommentLinear kernel 119870(119909

119894 119909119895) = 119909119879119894119909119895

Polynomialkernel 119870(119909

119894 119909119895) = (120574119909119879

119894119909119895+ 119903)119889

120574 gt 0

Radial basisfunction (RBF)kernel

119870(119909119894 119909119895) = 119890minus120574119909119894minus119909119895

2

120574 gt 0

Sigmoid kernel 119870(119909119894 119909119895) = tanh (120574119909119879

119894119909119895+ 119903)119889

119909119894 119909119895 are input vectors and 120574 119903 and 119889 are kernel parameters

with normalizationmethodThemain advantage of data nor-malization is to accelerate convergence velocity of iterationduring the SVM training Another advantage is to facilitatesubsequent data processing Singular value sample datamightcause numerical problems because the kernel values usuallydepend on the inner products of feature vectors such asthe linear kernel or the polynomial kernel Therefore it isrecommended that each attribute should be linearly scaled tothe range [minus1 +1] or [0 +1] In this research the data sets werescaled to the range between 0 and 1

33 Model Identification In Section 22 demonstrated abovethe input vectors for traffic flow prediction model have beendiscussed and constructed Here we only give an introduc-tion to SVM briefly More detailed descriptions can be foundin [12] Given a training set of instance-label pairs the inputvectors can be mapped implicitly and hence very efficientlyinto high-dimensional space (maybe infinite) by the kernelfunction Φ and SVM finds a linear separating hyper planewith the maximal margin in this higher dimensional spaceAt present new kernel functions are proposed by researcherssome popular kernel functions are as Table 1

The previous researches [13 14] suggested that radialbasis function (RBF) kernel had less numerical difficultieswith application and was efficient for traffic state and traffic

6 Mathematical Problems in Engineering

V1

V2

V3

Vs

K(x1 x)

K(x2 x)

K(xs x)

1205971y1

1205972y2

120597sys

+ f(x)

Figure 3 Structure of the SVMs model for traffic flow prediction

flow prediction Thus RBF kernel function is used for theSVM model in this study Before the RBF kernel is beingused three parameters associated should be determined(119862 120576 120574) Parameter 119862 is a regularization constant or penaltyparameter which is assigned in advance before each SVMmodel training process This parameter allows striking abalance between the two competing criteria of marginmaximization and error minimization whereas the slackvariables 120576 indicate the distance of the incorrectly classifiedpoints from the optimal hyper plane [15] The larger the119862 value the higher the penalty associated to misclassifiedsamples Parameter 120574 is the parameter of RBF function kernelIn summary the entire SVM model training process is aprocess that the parameters are optimized and determinedby simulation Kavzoglu and Colkesen [16] pointed outthat the parameters determination was a practical way toidentify good parameters Thus in this research they arecalibrated by grid-search method During the search processall possible combinations of (119862 120576 120574) are tested and the onewith the best performance is chosen After conducting thegrid search on the training dataset the optimal parameterswere selected as (2

minus2 2minus5 122) and a cross validated onvalidation dataset is conductedThe various SVMs are imple-mented with Libsvm (National Taiwan University availableat httpwwwcsientuedutwsimcjlinlibsvm) on aMicrosoftWindows platform

The structure of the SVMs model for SVM-T SVM-PT SVM-HT and SVM-HPT is shown as Figure 3 In thisstructure the feature vector V is composed with the differentdimensions of each SVMmodel respectively For example inSVM-T model V = (119881

1 1198812 119881

119904) denotes the traffic volume

of road section 119894 and day 119895 at different times 119902119894119895(119905 minus 119904 minus 1)

119902119894119895(119905) The features vector definitions for each SVMs

model are demonstrated in Section 22To determine the SVMrsquos inputs vector for practical appli-

cation some comparison tests have been conducted betweenthe models with different input variables (T PT HT andHPT) which have been calibrated based on the data collectedThe comparison of prediction errors about the four modelson off-peak period and peak period prediction error resultsare shown as Figure 4

During the whole prediction process the data processingwas divided into three steps respectively for training cross-validation and testing Firstly about 10 of samples datawere set as testing data Then 70 of the remaining samples

0000

0020

0040

0060

0080

0100

0120

0140

SVM_T SVM_PT SVM_HT SVM_HPT

n = 1

n = 2

n = 3

Figure 4 Prediction error results of the SVM model with differentinput vectors

data were assigned to training and the others to cross-validation In particular the training and testing processeswere conducted with the same datasets for the four modelsin order to have the same basis of comparison

34 Results Analysis From the comparison results of thefour models (SVM-T SVM-PT SVM-HT and s-HPT) inSection 33 some conclusions can be drawn as follows

(1) Firstly in the case of single-step prediction the num-ber of prediction steps 119899 = 1 The prediction errorcurves in Figure 4 indicate that the results of theprediction error are similar with each other for thefour models and SVM PT has more accuracy thanSVM TThe reason of this case is that the upstream ofthe traffic volume can reach the current road sectionafter a while so a more accurate prediction result canbe obtained with the introduction of upstream trafficvolume into SVM-T

(2) Secondly in the case of multisteps prediction thenumber of prediction steps 119899 gt 1 For the case of 119899 =

2 the prediction error has the similar characteristicsto single-stepmodel For the case of 119899 gt 2 the predic-tion error of SVM T increases significantly with thesteps 119899 increasingThe reason for this is that in SVM Tmodel the traffic volume data is time-series data thedistance between the predicted points and observingdata has the different impact effect for the predictionaccuracy the more closely the more important forprediction result Furthermore for the multistepsprediction the SVM HT and SVM HPT have a goodprediction performance compared with the SVM Twhich only uses the time-series data for predictingThe reason for this is mainly that the historical datacan represent the traffic trends changing mode withtraffic volume for the current road section so theprediction result can be better with the introduction

Mathematical Problems in Engineering 7

0100200300400500600700

000

224

448

712

936

120

0

142

4

164

8

191

2

213

6

000

Volu

me (

veh)

TimeObserving dataPredict valueActual value

Figure 5 Example of dynamic multisteps prediction results ofSVM PT

of upstream traffic volume data and historical datainto SVM-T

(3) It is important to note that the execution performanceof SVM HPT and SVM HT is different due to thecomplicated input vector dimensions in upstreamtraffic volume data and historical data Thereforein practical application the SVM HT model is apriority selection for prediction under the conditionof meeting the requirement prediction accuracy inorder to reduce the time-consumption of the wholesystem

In order to verify the actual performance of the predictionmodel presented above a test was conducted with SVM HTand the result was shown as Figure 5 In this test a certainwhole day traffic volume data were used for predicting thetraffic trends and the sample interval was assigned to 20minutes so as to simplify the computation process From theresult it can be seen that the prediction curves agree wellwith the actual observing data Test of the prediction errorfound that the traffic volume mean error is 168 vehicles andthe mean relative error is 128Thus it can be seen that thereseems to be a strong potential effectiveness to be used to thepractical application

4 Conclusions

The urban transport system has the characteristics such asnonlinear stochastic and time-varying Therefore artificialintelligence methods are now receiving more and moreattentions in ITS In order to predict traffic flow withmultisteps accurately this paper proposed a SVM model forthe prediction In the present research the traffic volume datawith actual observation surveys in urban area of Dalian wasused to predict the traffic flow by SVM models In orderto obtain the prediction effect with different input vectorssome comparison tests are conducted with different inputvariables (T PT HT and HPT) The test results showedthat the proposed SVM model had a good ability for trafficflow prediction and the comparison of different input vectorsindicated that the SVM-HPT model outperformed the otherthree models for prediction

In this paper only the traffic volume data was used toestimate the current traffic conditions Further studywill con-sider more factors analysis such as the input vectors dimen-sions and the prediction steps so as to enhance the perfor-mance of the proposed prediction models

Acknowledgments

This work was jointly supported by grants from the Human-ities and Social Sciences Foundation of the Ministry ofEducation in China (Project no 12YJCZH280) the PhDPrograms Foundation of Ministry of Education of China(Project no 20112125120004) and National Natural ScienceFoundation of China (Project no 11272075)

References

[1] YWangM Papageorgiou andAMessmer ldquoReal-time freewaytraffic state estimation based on extended Kalman filter adap-tive capabilities and real data testingrdquo Transportation ResearchA vol 42 no 10 pp 1340ndash1358 2008

[2] R M Hage D Betaille F Peyret and D Meizel ldquoUnscentedKalman filter for urban network travel time estimationrdquoProcediamdashSocial and Behavioral Sciences vol 54 no 4 pp1047ndash1057 2012

[3] Y Yuan J W C van Lint R E Wilson F van Wageningen-Kessels and S P Hoogendoorn ldquoReal-time lagrangian trafficstate estimator for freewaysrdquo IEEE Transactions on IntelligentTransportation Systems vol 13 no 1 pp 59ndash70 2012

[4] V N Vapnik ldquoAn overview of statistical learning theoryrdquo IEEETransactions on Neural Networks vol 10 no 5 pp 988ndash9991999

[5] B Yu Z Z Yang and B Z Yao ldquoBus arrival time predictionusing support vector machinesrdquo Journal of Intelligent Trans-portation Systems vol 10 no 4 pp 151ndash158 2006

[6] B YuWHK Lam andM L Tam ldquoBus arrival time predictionat bus stopwithmultiple routesrdquoTransportation Research C vol19 no 6 pp 1157ndash1170 2011

[7] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[8] B Z Yao C Y Yang J B Yao and J Sun ldquoTunnel surroundingrock displacement prediction using support vector machinerdquoInternational Journal of Computational Intelligence Systems vol3 no 6 pp 843ndash852 2010

[9] J H Yang and X N Yu ldquoThe support vector machinesprediction model of freeway dynamic traffic flowrdquo Journal ofXirsquoan Technological University no 3 pp 280ndash284 2009

[10] B Yu Z Z Yang K Chen and B Yu ldquoHybrid model forprediction of bus arrival times at next stationrdquo Journal ofAdvanced Transportation vol 44 no 3 pp 193ndash204 2010

[11] B Yu T Ye XM Tian G BNing and SQ Zhong ldquoBus travel-time prediction with forgetting factorrdquo Journal of Computing inCivil Engineering 2012

[12] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[13] V Tyagi S Kalyanaraman and R Krishnapuram ldquoVehiculartraffic density state estimation based on cumulative road acous-ticsrdquo IEEE Transactions on Intelligent Transportation Systemsvol 13 no 3 pp 1156ndash1166 2012

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

4 Mathematical Problems in Engineering

Secondly although the daily traffic volume has the samechanging trends the models have different properties fromeach other For example the traffic volume curve are similarfrom Monday to Friday but on Saturday and Sunday thecurve presents different pattern from the week-days thepeakvalley of traffic volume curve is not obvious for distin-guishing

Therefore a historical data model for prediction shouldbe constructed which can be used for predicting the trafficflow historical models for every day (from Monday toSunday) When the database is not enough for predicting wealso can establish the models for week-days and week endsrespectively In addition the trends of traffic flow data alsocan be influenced by weather or holidays

Besides the above demonstration about the traffic flowproperties in time domain the data are also correlated inspace domain The traffic flow data has some correlationsbetween the upstream and downstream for a certain roadsection The upstream traffic flow will reach the downstreamafter a while therefore we can predict the future traffic flowcondition of the downstream based on this correlation

Given the traffic volume 119902119894119895(119905) of road section 119894 time 119905

and day 119895 the observation volume sequence 119876119894119895

= 119902119894119895(1)

119902119894119895(2) 119902

119894119895(119905) the prediction of the traffic volume

119902119894119895(119905) then the forecasting traffic volume sequence of road

section in the future 119899 steps are expressed as 119876119894119895

= 119902119894119895(119905)

119902119894119895(119905+1) 119902

119894119895(119905+119899)The historical pattern data sequence

is defined as 119867119876119894119896

= ℎ119902119894119896(1) ℎ119902

119894119896(2) ℎ119902

119894119896(119905) in

which ℎ119902119894119896(119905) = (1119908)sum 119902

119894119895(119905) is the historical data of section

119894 on the 119896th day (from Monday to Sunday) and 119908 is thenumber of weeks for the data included

Based on the description above and the space-timecharacteristics of the traffic flow data in order to predict thetraffic flow for the 119899 steps (119905 + 1 119905 + 119899) at section 119894 time119905 three input feature vectors are designed for the predictionmodel

(1) autocorrelation time feature vector119876119894119895

= 119902119894119895(119905minus119898)

119902119894119895(119905)

(2) historical model feature vector119867119876119894119896

= ℎ119902119894119896(1)

ℎ119902119894119896(119905 minus 1) ℎ119902

119894119896(119905) ℎ119902

119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) 119896 =

1 2 7 which represents fromMonday to Sundayrespectively

(3) space correlation feature vector119876119894minus1119895

= 119902119894minus1119895

(119905minus119898)

119902119894minus1119895

(119905) 119876119894minus2119895

= 119902119894minus2119895

(119905 minus 119898) 119902119894minus2119895

(119905) The number of time sequence119898 is determined by thetime consumption from the upstream to the currentroad section

The relationship between each input vector and outputvector is shown in Figure 2

From the aspect of theoretical analysis for the single-step mode accurate results can be obtained with only con-sideration of the time state and space state vectors due to thegradient traffic flow characteristics On the contrary for themultisteps mode the traffic flow changing trends play a moreimportant role in prediction therefore more accurate resultscan be achieved by taking the historical data into account

Based on the above analysis the combination of threestate vectors is used as the input data for SVM which are asfollows

(1) the combination of one-dimensional data for thespecified road section in the previous intervals (T)119876119894119895

= 119902119894119895(119905 minus 119898) 119902

119894119895(119905) which describes the

traffic volume changing property with the time inter-vals this property is similar to the single-step predic-tion method

(2) the combination ofmultidimensional space-time dataof the upstream and the specified road section (PT)119876119894minus1119895

= 119902119894minus1119895

(119905minus119898) 119902119894minus1119895

(119905)119876119894minus2119895

= 119902119894minus2119895

(119905minus

119898) 119902119894minus2119895

(119905) which describes the correlation ofthe traffic flow changing in time domain especiallythe correlations between the upstream and down-stream for a certain road section

(3) the combination of the time-series data and thehistorical pattern data for the specified road sectionin the current interval (HT) 119876

119894119895= 119902119894119895(119905 minus 119898)

119902119894119895(119905)119867119876

119894119896= ℎ119902119894119896(1) ℎ119902

119894119896(119905 minus 1) ℎ119902

119894119896(119905)

ℎ119902119894119896(119905 + 1) ℎ119902

119894119896(119905 + 119899) which describes the

historical similarity of the traffic volume changingproperty demonstrated in the Figure 1

(4) the combination of the time-series data for theupstream and the specified road section in the currentinterval and the historical pattern data for the targetroad section (HPT)

Each of the combination vectors above are used as the inputvariable of the SVM model and 119876

119894119895= 119902119894119895(119905) 119902119894119895(119905 + 1)

119902119894119895(119905 + 119899) as the output variable the SVM-T SVM-TP

SVM-TH and SVM-TPH can be achieved respectively duringthe SVM training process

3 Case Study

31 Evaluation of Model Performance As an indicator toreflect the accuracy and availability of the prediction modelerror plays an important role in evaluating the predictionmodel Common error indictors include absolute predictionerror and relative prediction error which can be calculated asfollows respectively

119864119886(119905) = 119902

119894(119905) minus 119902

119894(119905)

119864119903(119905) =

119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

(7)

where 119864119886(119905) denotes the absolute prediction error at time 119905

119864119903(119905) denotes the relative prediction error 119902

119894(119905) denotes the

measured value and 119902119894(119905) denotes the prediction value 119864

119886(119905)

and119864119903(119905) are the errors of the single-step prediction while for

the dynamic multisteps prediction both the value and errorcan be achieved during each prediction process which can bedescribed with 119902

119894(119905 + 1) 119902

119894(119905 + 2) 119902

119894(119905 + 119899) 119902

119894(119905 + 1) is the

prediction value of (119905 + 1) at the current time 119905 where 119899 is theprediction step size The greater the 119899 the more prediction

Mathematical Problems in Engineering 5

Historical data

qiminus2j (t minus m)

qiminus2j (t minus 1)

qiminus2j (t)

qiminus1j (t minus m)

qiminus1j (t minus 1)

qiminus1j (t)

qij (t minus m)

qij (t minus 1)

qij (t)

Target road

1

(00)

HQik

Time of day

qij(t) qij(t + 1) middot middot middot qij(t+

Roadiminus2 Roadiminus1 Roadi Roadi+1

Figure 2 Prediction mechanism based on the traffic flow space-time characteristics

error we can achieve But a maximum theoretical limit 119899maxexists for avoiding unlimited prediction loop

In order to obtain an objective and accurate evaluation ofthe prediction effect themean value of all the errors is used asthe evaluation index for multisteps prediction the formulasare as follows

119872119864119886(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905))

119872119864119903(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

)

(8)

where 119872119864119886(119905) is the mean value of the absolute error for

multisteps prediction at time 119905 119872119864119903(119905) is the mean value of

the relative error In fact for multisteps prediction the time-series synthetical error is more important for the predictionresult Therefore 119872119864

119886(119905) and 119872119864

119903(119905) are the direct sum

rather than the sum of absolute values of 119864119886(119905) and 119864

119903(119905)

32 Data Collection and Processing The presented SVMmodel for traffic flow prediction was tested with the actualsurvey data of Gaoerji Road in Dalian China dated fromMay 14 (Monday) to 20 (Sunday) 2012 Gaoerji road is aone-way lane with several importsexports and goes fromZhongshan road to Yierjiu street through the city center witha distance of 38 km The traffic state is highly congestedin the morning and afternoon peaks In this research theactual survey started from 700 in the morningThe collecteddata was obtained with SCOOT (Split Cycle and OffsetOptimization Technology) and consist of the traffic volumeduring the peak time (PT 0700ndash0830 h) and off-peak time(OT 1000ndash1200 h) with recording data once every 5 minutes

There are some influence factors including missing errorand random noise in the raw datasets which is obtainedwith SCOOTTherefore the data must be identified repairedand smoothed firstly Then the time-series data is achieved

Table 1 Four popular kernel functions

Kernel function Expression CommentLinear kernel 119870(119909

119894 119909119895) = 119909119879119894119909119895

Polynomialkernel 119870(119909

119894 119909119895) = (120574119909119879

119894119909119895+ 119903)119889

120574 gt 0

Radial basisfunction (RBF)kernel

119870(119909119894 119909119895) = 119890minus120574119909119894minus119909119895

2

120574 gt 0

Sigmoid kernel 119870(119909119894 119909119895) = tanh (120574119909119879

119894119909119895+ 119903)119889

119909119894 119909119895 are input vectors and 120574 119903 and 119889 are kernel parameters

with normalizationmethodThemain advantage of data nor-malization is to accelerate convergence velocity of iterationduring the SVM training Another advantage is to facilitatesubsequent data processing Singular value sample datamightcause numerical problems because the kernel values usuallydepend on the inner products of feature vectors such asthe linear kernel or the polynomial kernel Therefore it isrecommended that each attribute should be linearly scaled tothe range [minus1 +1] or [0 +1] In this research the data sets werescaled to the range between 0 and 1

33 Model Identification In Section 22 demonstrated abovethe input vectors for traffic flow prediction model have beendiscussed and constructed Here we only give an introduc-tion to SVM briefly More detailed descriptions can be foundin [12] Given a training set of instance-label pairs the inputvectors can be mapped implicitly and hence very efficientlyinto high-dimensional space (maybe infinite) by the kernelfunction Φ and SVM finds a linear separating hyper planewith the maximal margin in this higher dimensional spaceAt present new kernel functions are proposed by researcherssome popular kernel functions are as Table 1

The previous researches [13 14] suggested that radialbasis function (RBF) kernel had less numerical difficultieswith application and was efficient for traffic state and traffic

6 Mathematical Problems in Engineering

V1

V2

V3

Vs

K(x1 x)

K(x2 x)

K(xs x)

1205971y1

1205972y2

120597sys

+ f(x)

Figure 3 Structure of the SVMs model for traffic flow prediction

flow prediction Thus RBF kernel function is used for theSVM model in this study Before the RBF kernel is beingused three parameters associated should be determined(119862 120576 120574) Parameter 119862 is a regularization constant or penaltyparameter which is assigned in advance before each SVMmodel training process This parameter allows striking abalance between the two competing criteria of marginmaximization and error minimization whereas the slackvariables 120576 indicate the distance of the incorrectly classifiedpoints from the optimal hyper plane [15] The larger the119862 value the higher the penalty associated to misclassifiedsamples Parameter 120574 is the parameter of RBF function kernelIn summary the entire SVM model training process is aprocess that the parameters are optimized and determinedby simulation Kavzoglu and Colkesen [16] pointed outthat the parameters determination was a practical way toidentify good parameters Thus in this research they arecalibrated by grid-search method During the search processall possible combinations of (119862 120576 120574) are tested and the onewith the best performance is chosen After conducting thegrid search on the training dataset the optimal parameterswere selected as (2

minus2 2minus5 122) and a cross validated onvalidation dataset is conductedThe various SVMs are imple-mented with Libsvm (National Taiwan University availableat httpwwwcsientuedutwsimcjlinlibsvm) on aMicrosoftWindows platform

The structure of the SVMs model for SVM-T SVM-PT SVM-HT and SVM-HPT is shown as Figure 3 In thisstructure the feature vector V is composed with the differentdimensions of each SVMmodel respectively For example inSVM-T model V = (119881

1 1198812 119881

119904) denotes the traffic volume

of road section 119894 and day 119895 at different times 119902119894119895(119905 minus 119904 minus 1)

119902119894119895(119905) The features vector definitions for each SVMs

model are demonstrated in Section 22To determine the SVMrsquos inputs vector for practical appli-

cation some comparison tests have been conducted betweenthe models with different input variables (T PT HT andHPT) which have been calibrated based on the data collectedThe comparison of prediction errors about the four modelson off-peak period and peak period prediction error resultsare shown as Figure 4

During the whole prediction process the data processingwas divided into three steps respectively for training cross-validation and testing Firstly about 10 of samples datawere set as testing data Then 70 of the remaining samples

0000

0020

0040

0060

0080

0100

0120

0140

SVM_T SVM_PT SVM_HT SVM_HPT

n = 1

n = 2

n = 3

Figure 4 Prediction error results of the SVM model with differentinput vectors

data were assigned to training and the others to cross-validation In particular the training and testing processeswere conducted with the same datasets for the four modelsin order to have the same basis of comparison

34 Results Analysis From the comparison results of thefour models (SVM-T SVM-PT SVM-HT and s-HPT) inSection 33 some conclusions can be drawn as follows

(1) Firstly in the case of single-step prediction the num-ber of prediction steps 119899 = 1 The prediction errorcurves in Figure 4 indicate that the results of theprediction error are similar with each other for thefour models and SVM PT has more accuracy thanSVM TThe reason of this case is that the upstream ofthe traffic volume can reach the current road sectionafter a while so a more accurate prediction result canbe obtained with the introduction of upstream trafficvolume into SVM-T

(2) Secondly in the case of multisteps prediction thenumber of prediction steps 119899 gt 1 For the case of 119899 =

2 the prediction error has the similar characteristicsto single-stepmodel For the case of 119899 gt 2 the predic-tion error of SVM T increases significantly with thesteps 119899 increasingThe reason for this is that in SVM Tmodel the traffic volume data is time-series data thedistance between the predicted points and observingdata has the different impact effect for the predictionaccuracy the more closely the more important forprediction result Furthermore for the multistepsprediction the SVM HT and SVM HPT have a goodprediction performance compared with the SVM Twhich only uses the time-series data for predictingThe reason for this is mainly that the historical datacan represent the traffic trends changing mode withtraffic volume for the current road section so theprediction result can be better with the introduction

Mathematical Problems in Engineering 7

0100200300400500600700

000

224

448

712

936

120

0

142

4

164

8

191

2

213

6

000

Volu

me (

veh)

TimeObserving dataPredict valueActual value

Figure 5 Example of dynamic multisteps prediction results ofSVM PT

of upstream traffic volume data and historical datainto SVM-T

(3) It is important to note that the execution performanceof SVM HPT and SVM HT is different due to thecomplicated input vector dimensions in upstreamtraffic volume data and historical data Thereforein practical application the SVM HT model is apriority selection for prediction under the conditionof meeting the requirement prediction accuracy inorder to reduce the time-consumption of the wholesystem

In order to verify the actual performance of the predictionmodel presented above a test was conducted with SVM HTand the result was shown as Figure 5 In this test a certainwhole day traffic volume data were used for predicting thetraffic trends and the sample interval was assigned to 20minutes so as to simplify the computation process From theresult it can be seen that the prediction curves agree wellwith the actual observing data Test of the prediction errorfound that the traffic volume mean error is 168 vehicles andthe mean relative error is 128Thus it can be seen that thereseems to be a strong potential effectiveness to be used to thepractical application

4 Conclusions

The urban transport system has the characteristics such asnonlinear stochastic and time-varying Therefore artificialintelligence methods are now receiving more and moreattentions in ITS In order to predict traffic flow withmultisteps accurately this paper proposed a SVM model forthe prediction In the present research the traffic volume datawith actual observation surveys in urban area of Dalian wasused to predict the traffic flow by SVM models In orderto obtain the prediction effect with different input vectorssome comparison tests are conducted with different inputvariables (T PT HT and HPT) The test results showedthat the proposed SVM model had a good ability for trafficflow prediction and the comparison of different input vectorsindicated that the SVM-HPT model outperformed the otherthree models for prediction

In this paper only the traffic volume data was used toestimate the current traffic conditions Further studywill con-sider more factors analysis such as the input vectors dimen-sions and the prediction steps so as to enhance the perfor-mance of the proposed prediction models

Acknowledgments

This work was jointly supported by grants from the Human-ities and Social Sciences Foundation of the Ministry ofEducation in China (Project no 12YJCZH280) the PhDPrograms Foundation of Ministry of Education of China(Project no 20112125120004) and National Natural ScienceFoundation of China (Project no 11272075)

References

[1] YWangM Papageorgiou andAMessmer ldquoReal-time freewaytraffic state estimation based on extended Kalman filter adap-tive capabilities and real data testingrdquo Transportation ResearchA vol 42 no 10 pp 1340ndash1358 2008

[2] R M Hage D Betaille F Peyret and D Meizel ldquoUnscentedKalman filter for urban network travel time estimationrdquoProcediamdashSocial and Behavioral Sciences vol 54 no 4 pp1047ndash1057 2012

[3] Y Yuan J W C van Lint R E Wilson F van Wageningen-Kessels and S P Hoogendoorn ldquoReal-time lagrangian trafficstate estimator for freewaysrdquo IEEE Transactions on IntelligentTransportation Systems vol 13 no 1 pp 59ndash70 2012

[4] V N Vapnik ldquoAn overview of statistical learning theoryrdquo IEEETransactions on Neural Networks vol 10 no 5 pp 988ndash9991999

[5] B Yu Z Z Yang and B Z Yao ldquoBus arrival time predictionusing support vector machinesrdquo Journal of Intelligent Trans-portation Systems vol 10 no 4 pp 151ndash158 2006

[6] B YuWHK Lam andM L Tam ldquoBus arrival time predictionat bus stopwithmultiple routesrdquoTransportation Research C vol19 no 6 pp 1157ndash1170 2011

[7] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[8] B Z Yao C Y Yang J B Yao and J Sun ldquoTunnel surroundingrock displacement prediction using support vector machinerdquoInternational Journal of Computational Intelligence Systems vol3 no 6 pp 843ndash852 2010

[9] J H Yang and X N Yu ldquoThe support vector machinesprediction model of freeway dynamic traffic flowrdquo Journal ofXirsquoan Technological University no 3 pp 280ndash284 2009

[10] B Yu Z Z Yang K Chen and B Yu ldquoHybrid model forprediction of bus arrival times at next stationrdquo Journal ofAdvanced Transportation vol 44 no 3 pp 193ndash204 2010

[11] B Yu T Ye XM Tian G BNing and SQ Zhong ldquoBus travel-time prediction with forgetting factorrdquo Journal of Computing inCivil Engineering 2012

[12] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[13] V Tyagi S Kalyanaraman and R Krishnapuram ldquoVehiculartraffic density state estimation based on cumulative road acous-ticsrdquo IEEE Transactions on Intelligent Transportation Systemsvol 13 no 3 pp 1156ndash1166 2012

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

Mathematical Problems in Engineering 5

Historical data

qiminus2j (t minus m)

qiminus2j (t minus 1)

qiminus2j (t)

qiminus1j (t minus m)

qiminus1j (t minus 1)

qiminus1j (t)

qij (t minus m)

qij (t minus 1)

qij (t)

Target road

1

(00)

HQik

Time of day

qij(t) qij(t + 1) middot middot middot qij(t+

Roadiminus2 Roadiminus1 Roadi Roadi+1

Figure 2 Prediction mechanism based on the traffic flow space-time characteristics

error we can achieve But a maximum theoretical limit 119899maxexists for avoiding unlimited prediction loop

In order to obtain an objective and accurate evaluation ofthe prediction effect themean value of all the errors is used asthe evaluation index for multisteps prediction the formulasare as follows

119872119864119886(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905))

119872119864119903(119905) =

1

119899

119899

sum119905=119905+1

(119902119894(119905) minus 119902

119894(119905)

119902119894(119905)

)

(8)

where 119872119864119886(119905) is the mean value of the absolute error for

multisteps prediction at time 119905 119872119864119903(119905) is the mean value of

the relative error In fact for multisteps prediction the time-series synthetical error is more important for the predictionresult Therefore 119872119864

119886(119905) and 119872119864

119903(119905) are the direct sum

rather than the sum of absolute values of 119864119886(119905) and 119864

119903(119905)

32 Data Collection and Processing The presented SVMmodel for traffic flow prediction was tested with the actualsurvey data of Gaoerji Road in Dalian China dated fromMay 14 (Monday) to 20 (Sunday) 2012 Gaoerji road is aone-way lane with several importsexports and goes fromZhongshan road to Yierjiu street through the city center witha distance of 38 km The traffic state is highly congestedin the morning and afternoon peaks In this research theactual survey started from 700 in the morningThe collecteddata was obtained with SCOOT (Split Cycle and OffsetOptimization Technology) and consist of the traffic volumeduring the peak time (PT 0700ndash0830 h) and off-peak time(OT 1000ndash1200 h) with recording data once every 5 minutes

There are some influence factors including missing errorand random noise in the raw datasets which is obtainedwith SCOOTTherefore the data must be identified repairedand smoothed firstly Then the time-series data is achieved

Table 1 Four popular kernel functions

Kernel function Expression CommentLinear kernel 119870(119909

119894 119909119895) = 119909119879119894119909119895

Polynomialkernel 119870(119909

119894 119909119895) = (120574119909119879

119894119909119895+ 119903)119889

120574 gt 0

Radial basisfunction (RBF)kernel

119870(119909119894 119909119895) = 119890minus120574119909119894minus119909119895

2

120574 gt 0

Sigmoid kernel 119870(119909119894 119909119895) = tanh (120574119909119879

119894119909119895+ 119903)119889

119909119894 119909119895 are input vectors and 120574 119903 and 119889 are kernel parameters

with normalizationmethodThemain advantage of data nor-malization is to accelerate convergence velocity of iterationduring the SVM training Another advantage is to facilitatesubsequent data processing Singular value sample datamightcause numerical problems because the kernel values usuallydepend on the inner products of feature vectors such asthe linear kernel or the polynomial kernel Therefore it isrecommended that each attribute should be linearly scaled tothe range [minus1 +1] or [0 +1] In this research the data sets werescaled to the range between 0 and 1

33 Model Identification In Section 22 demonstrated abovethe input vectors for traffic flow prediction model have beendiscussed and constructed Here we only give an introduc-tion to SVM briefly More detailed descriptions can be foundin [12] Given a training set of instance-label pairs the inputvectors can be mapped implicitly and hence very efficientlyinto high-dimensional space (maybe infinite) by the kernelfunction Φ and SVM finds a linear separating hyper planewith the maximal margin in this higher dimensional spaceAt present new kernel functions are proposed by researcherssome popular kernel functions are as Table 1

The previous researches [13 14] suggested that radialbasis function (RBF) kernel had less numerical difficultieswith application and was efficient for traffic state and traffic

6 Mathematical Problems in Engineering

V1

V2

V3

Vs

K(x1 x)

K(x2 x)

K(xs x)

1205971y1

1205972y2

120597sys

+ f(x)

Figure 3 Structure of the SVMs model for traffic flow prediction

flow prediction Thus RBF kernel function is used for theSVM model in this study Before the RBF kernel is beingused three parameters associated should be determined(119862 120576 120574) Parameter 119862 is a regularization constant or penaltyparameter which is assigned in advance before each SVMmodel training process This parameter allows striking abalance between the two competing criteria of marginmaximization and error minimization whereas the slackvariables 120576 indicate the distance of the incorrectly classifiedpoints from the optimal hyper plane [15] The larger the119862 value the higher the penalty associated to misclassifiedsamples Parameter 120574 is the parameter of RBF function kernelIn summary the entire SVM model training process is aprocess that the parameters are optimized and determinedby simulation Kavzoglu and Colkesen [16] pointed outthat the parameters determination was a practical way toidentify good parameters Thus in this research they arecalibrated by grid-search method During the search processall possible combinations of (119862 120576 120574) are tested and the onewith the best performance is chosen After conducting thegrid search on the training dataset the optimal parameterswere selected as (2

minus2 2minus5 122) and a cross validated onvalidation dataset is conductedThe various SVMs are imple-mented with Libsvm (National Taiwan University availableat httpwwwcsientuedutwsimcjlinlibsvm) on aMicrosoftWindows platform

The structure of the SVMs model for SVM-T SVM-PT SVM-HT and SVM-HPT is shown as Figure 3 In thisstructure the feature vector V is composed with the differentdimensions of each SVMmodel respectively For example inSVM-T model V = (119881

1 1198812 119881

119904) denotes the traffic volume

of road section 119894 and day 119895 at different times 119902119894119895(119905 minus 119904 minus 1)

119902119894119895(119905) The features vector definitions for each SVMs

model are demonstrated in Section 22To determine the SVMrsquos inputs vector for practical appli-

cation some comparison tests have been conducted betweenthe models with different input variables (T PT HT andHPT) which have been calibrated based on the data collectedThe comparison of prediction errors about the four modelson off-peak period and peak period prediction error resultsare shown as Figure 4

During the whole prediction process the data processingwas divided into three steps respectively for training cross-validation and testing Firstly about 10 of samples datawere set as testing data Then 70 of the remaining samples

0000

0020

0040

0060

0080

0100

0120

0140

SVM_T SVM_PT SVM_HT SVM_HPT

n = 1

n = 2

n = 3

Figure 4 Prediction error results of the SVM model with differentinput vectors

data were assigned to training and the others to cross-validation In particular the training and testing processeswere conducted with the same datasets for the four modelsin order to have the same basis of comparison

34 Results Analysis From the comparison results of thefour models (SVM-T SVM-PT SVM-HT and s-HPT) inSection 33 some conclusions can be drawn as follows

(1) Firstly in the case of single-step prediction the num-ber of prediction steps 119899 = 1 The prediction errorcurves in Figure 4 indicate that the results of theprediction error are similar with each other for thefour models and SVM PT has more accuracy thanSVM TThe reason of this case is that the upstream ofthe traffic volume can reach the current road sectionafter a while so a more accurate prediction result canbe obtained with the introduction of upstream trafficvolume into SVM-T

(2) Secondly in the case of multisteps prediction thenumber of prediction steps 119899 gt 1 For the case of 119899 =

2 the prediction error has the similar characteristicsto single-stepmodel For the case of 119899 gt 2 the predic-tion error of SVM T increases significantly with thesteps 119899 increasingThe reason for this is that in SVM Tmodel the traffic volume data is time-series data thedistance between the predicted points and observingdata has the different impact effect for the predictionaccuracy the more closely the more important forprediction result Furthermore for the multistepsprediction the SVM HT and SVM HPT have a goodprediction performance compared with the SVM Twhich only uses the time-series data for predictingThe reason for this is mainly that the historical datacan represent the traffic trends changing mode withtraffic volume for the current road section so theprediction result can be better with the introduction

Mathematical Problems in Engineering 7

0100200300400500600700

000

224

448

712

936

120

0

142

4

164

8

191

2

213

6

000

Volu

me (

veh)

TimeObserving dataPredict valueActual value

Figure 5 Example of dynamic multisteps prediction results ofSVM PT

of upstream traffic volume data and historical datainto SVM-T

(3) It is important to note that the execution performanceof SVM HPT and SVM HT is different due to thecomplicated input vector dimensions in upstreamtraffic volume data and historical data Thereforein practical application the SVM HT model is apriority selection for prediction under the conditionof meeting the requirement prediction accuracy inorder to reduce the time-consumption of the wholesystem

In order to verify the actual performance of the predictionmodel presented above a test was conducted with SVM HTand the result was shown as Figure 5 In this test a certainwhole day traffic volume data were used for predicting thetraffic trends and the sample interval was assigned to 20minutes so as to simplify the computation process From theresult it can be seen that the prediction curves agree wellwith the actual observing data Test of the prediction errorfound that the traffic volume mean error is 168 vehicles andthe mean relative error is 128Thus it can be seen that thereseems to be a strong potential effectiveness to be used to thepractical application

4 Conclusions

The urban transport system has the characteristics such asnonlinear stochastic and time-varying Therefore artificialintelligence methods are now receiving more and moreattentions in ITS In order to predict traffic flow withmultisteps accurately this paper proposed a SVM model forthe prediction In the present research the traffic volume datawith actual observation surveys in urban area of Dalian wasused to predict the traffic flow by SVM models In orderto obtain the prediction effect with different input vectorssome comparison tests are conducted with different inputvariables (T PT HT and HPT) The test results showedthat the proposed SVM model had a good ability for trafficflow prediction and the comparison of different input vectorsindicated that the SVM-HPT model outperformed the otherthree models for prediction

In this paper only the traffic volume data was used toestimate the current traffic conditions Further studywill con-sider more factors analysis such as the input vectors dimen-sions and the prediction steps so as to enhance the perfor-mance of the proposed prediction models

Acknowledgments

This work was jointly supported by grants from the Human-ities and Social Sciences Foundation of the Ministry ofEducation in China (Project no 12YJCZH280) the PhDPrograms Foundation of Ministry of Education of China(Project no 20112125120004) and National Natural ScienceFoundation of China (Project no 11272075)

References

[1] YWangM Papageorgiou andAMessmer ldquoReal-time freewaytraffic state estimation based on extended Kalman filter adap-tive capabilities and real data testingrdquo Transportation ResearchA vol 42 no 10 pp 1340ndash1358 2008

[2] R M Hage D Betaille F Peyret and D Meizel ldquoUnscentedKalman filter for urban network travel time estimationrdquoProcediamdashSocial and Behavioral Sciences vol 54 no 4 pp1047ndash1057 2012

[3] Y Yuan J W C van Lint R E Wilson F van Wageningen-Kessels and S P Hoogendoorn ldquoReal-time lagrangian trafficstate estimator for freewaysrdquo IEEE Transactions on IntelligentTransportation Systems vol 13 no 1 pp 59ndash70 2012

[4] V N Vapnik ldquoAn overview of statistical learning theoryrdquo IEEETransactions on Neural Networks vol 10 no 5 pp 988ndash9991999

[5] B Yu Z Z Yang and B Z Yao ldquoBus arrival time predictionusing support vector machinesrdquo Journal of Intelligent Trans-portation Systems vol 10 no 4 pp 151ndash158 2006

[6] B YuWHK Lam andM L Tam ldquoBus arrival time predictionat bus stopwithmultiple routesrdquoTransportation Research C vol19 no 6 pp 1157ndash1170 2011

[7] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[8] B Z Yao C Y Yang J B Yao and J Sun ldquoTunnel surroundingrock displacement prediction using support vector machinerdquoInternational Journal of Computational Intelligence Systems vol3 no 6 pp 843ndash852 2010

[9] J H Yang and X N Yu ldquoThe support vector machinesprediction model of freeway dynamic traffic flowrdquo Journal ofXirsquoan Technological University no 3 pp 280ndash284 2009

[10] B Yu Z Z Yang K Chen and B Yu ldquoHybrid model forprediction of bus arrival times at next stationrdquo Journal ofAdvanced Transportation vol 44 no 3 pp 193ndash204 2010

[11] B Yu T Ye XM Tian G BNing and SQ Zhong ldquoBus travel-time prediction with forgetting factorrdquo Journal of Computing inCivil Engineering 2012

[12] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[13] V Tyagi S Kalyanaraman and R Krishnapuram ldquoVehiculartraffic density state estimation based on cumulative road acous-ticsrdquo IEEE Transactions on Intelligent Transportation Systemsvol 13 no 3 pp 1156ndash1166 2012

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

6 Mathematical Problems in Engineering

V1

V2

V3

Vs

K(x1 x)

K(x2 x)

K(xs x)

1205971y1

1205972y2

120597sys

+ f(x)

Figure 3 Structure of the SVMs model for traffic flow prediction

flow prediction Thus RBF kernel function is used for theSVM model in this study Before the RBF kernel is beingused three parameters associated should be determined(119862 120576 120574) Parameter 119862 is a regularization constant or penaltyparameter which is assigned in advance before each SVMmodel training process This parameter allows striking abalance between the two competing criteria of marginmaximization and error minimization whereas the slackvariables 120576 indicate the distance of the incorrectly classifiedpoints from the optimal hyper plane [15] The larger the119862 value the higher the penalty associated to misclassifiedsamples Parameter 120574 is the parameter of RBF function kernelIn summary the entire SVM model training process is aprocess that the parameters are optimized and determinedby simulation Kavzoglu and Colkesen [16] pointed outthat the parameters determination was a practical way toidentify good parameters Thus in this research they arecalibrated by grid-search method During the search processall possible combinations of (119862 120576 120574) are tested and the onewith the best performance is chosen After conducting thegrid search on the training dataset the optimal parameterswere selected as (2

minus2 2minus5 122) and a cross validated onvalidation dataset is conductedThe various SVMs are imple-mented with Libsvm (National Taiwan University availableat httpwwwcsientuedutwsimcjlinlibsvm) on aMicrosoftWindows platform

The structure of the SVMs model for SVM-T SVM-PT SVM-HT and SVM-HPT is shown as Figure 3 In thisstructure the feature vector V is composed with the differentdimensions of each SVMmodel respectively For example inSVM-T model V = (119881

1 1198812 119881

119904) denotes the traffic volume

of road section 119894 and day 119895 at different times 119902119894119895(119905 minus 119904 minus 1)

119902119894119895(119905) The features vector definitions for each SVMs

model are demonstrated in Section 22To determine the SVMrsquos inputs vector for practical appli-

cation some comparison tests have been conducted betweenthe models with different input variables (T PT HT andHPT) which have been calibrated based on the data collectedThe comparison of prediction errors about the four modelson off-peak period and peak period prediction error resultsare shown as Figure 4

During the whole prediction process the data processingwas divided into three steps respectively for training cross-validation and testing Firstly about 10 of samples datawere set as testing data Then 70 of the remaining samples

0000

0020

0040

0060

0080

0100

0120

0140

SVM_T SVM_PT SVM_HT SVM_HPT

n = 1

n = 2

n = 3

Figure 4 Prediction error results of the SVM model with differentinput vectors

data were assigned to training and the others to cross-validation In particular the training and testing processeswere conducted with the same datasets for the four modelsin order to have the same basis of comparison

34 Results Analysis From the comparison results of thefour models (SVM-T SVM-PT SVM-HT and s-HPT) inSection 33 some conclusions can be drawn as follows

(1) Firstly in the case of single-step prediction the num-ber of prediction steps 119899 = 1 The prediction errorcurves in Figure 4 indicate that the results of theprediction error are similar with each other for thefour models and SVM PT has more accuracy thanSVM TThe reason of this case is that the upstream ofthe traffic volume can reach the current road sectionafter a while so a more accurate prediction result canbe obtained with the introduction of upstream trafficvolume into SVM-T

(2) Secondly in the case of multisteps prediction thenumber of prediction steps 119899 gt 1 For the case of 119899 =

2 the prediction error has the similar characteristicsto single-stepmodel For the case of 119899 gt 2 the predic-tion error of SVM T increases significantly with thesteps 119899 increasingThe reason for this is that in SVM Tmodel the traffic volume data is time-series data thedistance between the predicted points and observingdata has the different impact effect for the predictionaccuracy the more closely the more important forprediction result Furthermore for the multistepsprediction the SVM HT and SVM HPT have a goodprediction performance compared with the SVM Twhich only uses the time-series data for predictingThe reason for this is mainly that the historical datacan represent the traffic trends changing mode withtraffic volume for the current road section so theprediction result can be better with the introduction

Mathematical Problems in Engineering 7

0100200300400500600700

000

224

448

712

936

120

0

142

4

164

8

191

2

213

6

000

Volu

me (

veh)

TimeObserving dataPredict valueActual value

Figure 5 Example of dynamic multisteps prediction results ofSVM PT

of upstream traffic volume data and historical datainto SVM-T

(3) It is important to note that the execution performanceof SVM HPT and SVM HT is different due to thecomplicated input vector dimensions in upstreamtraffic volume data and historical data Thereforein practical application the SVM HT model is apriority selection for prediction under the conditionof meeting the requirement prediction accuracy inorder to reduce the time-consumption of the wholesystem

In order to verify the actual performance of the predictionmodel presented above a test was conducted with SVM HTand the result was shown as Figure 5 In this test a certainwhole day traffic volume data were used for predicting thetraffic trends and the sample interval was assigned to 20minutes so as to simplify the computation process From theresult it can be seen that the prediction curves agree wellwith the actual observing data Test of the prediction errorfound that the traffic volume mean error is 168 vehicles andthe mean relative error is 128Thus it can be seen that thereseems to be a strong potential effectiveness to be used to thepractical application

4 Conclusions

The urban transport system has the characteristics such asnonlinear stochastic and time-varying Therefore artificialintelligence methods are now receiving more and moreattentions in ITS In order to predict traffic flow withmultisteps accurately this paper proposed a SVM model forthe prediction In the present research the traffic volume datawith actual observation surveys in urban area of Dalian wasused to predict the traffic flow by SVM models In orderto obtain the prediction effect with different input vectorssome comparison tests are conducted with different inputvariables (T PT HT and HPT) The test results showedthat the proposed SVM model had a good ability for trafficflow prediction and the comparison of different input vectorsindicated that the SVM-HPT model outperformed the otherthree models for prediction

In this paper only the traffic volume data was used toestimate the current traffic conditions Further studywill con-sider more factors analysis such as the input vectors dimen-sions and the prediction steps so as to enhance the perfor-mance of the proposed prediction models

Acknowledgments

This work was jointly supported by grants from the Human-ities and Social Sciences Foundation of the Ministry ofEducation in China (Project no 12YJCZH280) the PhDPrograms Foundation of Ministry of Education of China(Project no 20112125120004) and National Natural ScienceFoundation of China (Project no 11272075)

References

[1] YWangM Papageorgiou andAMessmer ldquoReal-time freewaytraffic state estimation based on extended Kalman filter adap-tive capabilities and real data testingrdquo Transportation ResearchA vol 42 no 10 pp 1340ndash1358 2008

[2] R M Hage D Betaille F Peyret and D Meizel ldquoUnscentedKalman filter for urban network travel time estimationrdquoProcediamdashSocial and Behavioral Sciences vol 54 no 4 pp1047ndash1057 2012

[3] Y Yuan J W C van Lint R E Wilson F van Wageningen-Kessels and S P Hoogendoorn ldquoReal-time lagrangian trafficstate estimator for freewaysrdquo IEEE Transactions on IntelligentTransportation Systems vol 13 no 1 pp 59ndash70 2012

[4] V N Vapnik ldquoAn overview of statistical learning theoryrdquo IEEETransactions on Neural Networks vol 10 no 5 pp 988ndash9991999

[5] B Yu Z Z Yang and B Z Yao ldquoBus arrival time predictionusing support vector machinesrdquo Journal of Intelligent Trans-portation Systems vol 10 no 4 pp 151ndash158 2006

[6] B YuWHK Lam andM L Tam ldquoBus arrival time predictionat bus stopwithmultiple routesrdquoTransportation Research C vol19 no 6 pp 1157ndash1170 2011

[7] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[8] B Z Yao C Y Yang J B Yao and J Sun ldquoTunnel surroundingrock displacement prediction using support vector machinerdquoInternational Journal of Computational Intelligence Systems vol3 no 6 pp 843ndash852 2010

[9] J H Yang and X N Yu ldquoThe support vector machinesprediction model of freeway dynamic traffic flowrdquo Journal ofXirsquoan Technological University no 3 pp 280ndash284 2009

[10] B Yu Z Z Yang K Chen and B Yu ldquoHybrid model forprediction of bus arrival times at next stationrdquo Journal ofAdvanced Transportation vol 44 no 3 pp 193ndash204 2010

[11] B Yu T Ye XM Tian G BNing and SQ Zhong ldquoBus travel-time prediction with forgetting factorrdquo Journal of Computing inCivil Engineering 2012

[12] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[13] V Tyagi S Kalyanaraman and R Krishnapuram ldquoVehiculartraffic density state estimation based on cumulative road acous-ticsrdquo IEEE Transactions on Intelligent Transportation Systemsvol 13 no 3 pp 1156ndash1166 2012

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

Mathematical Problems in Engineering 7

0100200300400500600700

000

224

448

712

936

120

0

142

4

164

8

191

2

213

6

000

Volu

me (

veh)

TimeObserving dataPredict valueActual value

Figure 5 Example of dynamic multisteps prediction results ofSVM PT

of upstream traffic volume data and historical datainto SVM-T

(3) It is important to note that the execution performanceof SVM HPT and SVM HT is different due to thecomplicated input vector dimensions in upstreamtraffic volume data and historical data Thereforein practical application the SVM HT model is apriority selection for prediction under the conditionof meeting the requirement prediction accuracy inorder to reduce the time-consumption of the wholesystem

In order to verify the actual performance of the predictionmodel presented above a test was conducted with SVM HTand the result was shown as Figure 5 In this test a certainwhole day traffic volume data were used for predicting thetraffic trends and the sample interval was assigned to 20minutes so as to simplify the computation process From theresult it can be seen that the prediction curves agree wellwith the actual observing data Test of the prediction errorfound that the traffic volume mean error is 168 vehicles andthe mean relative error is 128Thus it can be seen that thereseems to be a strong potential effectiveness to be used to thepractical application

4 Conclusions

The urban transport system has the characteristics such asnonlinear stochastic and time-varying Therefore artificialintelligence methods are now receiving more and moreattentions in ITS In order to predict traffic flow withmultisteps accurately this paper proposed a SVM model forthe prediction In the present research the traffic volume datawith actual observation surveys in urban area of Dalian wasused to predict the traffic flow by SVM models In orderto obtain the prediction effect with different input vectorssome comparison tests are conducted with different inputvariables (T PT HT and HPT) The test results showedthat the proposed SVM model had a good ability for trafficflow prediction and the comparison of different input vectorsindicated that the SVM-HPT model outperformed the otherthree models for prediction

In this paper only the traffic volume data was used toestimate the current traffic conditions Further studywill con-sider more factors analysis such as the input vectors dimen-sions and the prediction steps so as to enhance the perfor-mance of the proposed prediction models

Acknowledgments

This work was jointly supported by grants from the Human-ities and Social Sciences Foundation of the Ministry ofEducation in China (Project no 12YJCZH280) the PhDPrograms Foundation of Ministry of Education of China(Project no 20112125120004) and National Natural ScienceFoundation of China (Project no 11272075)

References

[1] YWangM Papageorgiou andAMessmer ldquoReal-time freewaytraffic state estimation based on extended Kalman filter adap-tive capabilities and real data testingrdquo Transportation ResearchA vol 42 no 10 pp 1340ndash1358 2008

[2] R M Hage D Betaille F Peyret and D Meizel ldquoUnscentedKalman filter for urban network travel time estimationrdquoProcediamdashSocial and Behavioral Sciences vol 54 no 4 pp1047ndash1057 2012

[3] Y Yuan J W C van Lint R E Wilson F van Wageningen-Kessels and S P Hoogendoorn ldquoReal-time lagrangian trafficstate estimator for freewaysrdquo IEEE Transactions on IntelligentTransportation Systems vol 13 no 1 pp 59ndash70 2012

[4] V N Vapnik ldquoAn overview of statistical learning theoryrdquo IEEETransactions on Neural Networks vol 10 no 5 pp 988ndash9991999

[5] B Yu Z Z Yang and B Z Yao ldquoBus arrival time predictionusing support vector machinesrdquo Journal of Intelligent Trans-portation Systems vol 10 no 4 pp 151ndash158 2006

[6] B YuWHK Lam andM L Tam ldquoBus arrival time predictionat bus stopwithmultiple routesrdquoTransportation Research C vol19 no 6 pp 1157ndash1170 2011

[7] L J Cao and F E H Tay ldquoSupport vector machine withadaptive parameters in financial time series forecastingrdquo IEEETransactions on Neural Networks vol 14 no 6 pp 1506ndash15182003

[8] B Z Yao C Y Yang J B Yao and J Sun ldquoTunnel surroundingrock displacement prediction using support vector machinerdquoInternational Journal of Computational Intelligence Systems vol3 no 6 pp 843ndash852 2010

[9] J H Yang and X N Yu ldquoThe support vector machinesprediction model of freeway dynamic traffic flowrdquo Journal ofXirsquoan Technological University no 3 pp 280ndash284 2009

[10] B Yu Z Z Yang K Chen and B Yu ldquoHybrid model forprediction of bus arrival times at next stationrdquo Journal ofAdvanced Transportation vol 44 no 3 pp 193ndash204 2010

[11] B Yu T Ye XM Tian G BNing and SQ Zhong ldquoBus travel-time prediction with forgetting factorrdquo Journal of Computing inCivil Engineering 2012

[12] C Cortes and V Vapnik ldquoSupport-vector networksrdquo MachineLearning vol 20 no 3 pp 273ndash297 1995

[13] V Tyagi S Kalyanaraman and R Krishnapuram ldquoVehiculartraffic density state estimation based on cumulative road acous-ticsrdquo IEEE Transactions on Intelligent Transportation Systemsvol 13 no 3 pp 1156ndash1166 2012

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

8 Mathematical Problems in Engineering

[14] Q Li ldquoShort-time traffic flow volume prediction based onsupport vector machine with time-dependent structurerdquo inProceedings of the IEEE Intrumentation and Measurement Tech-nology Conference (I2MTC rsquo09) pp 1730ndash1733 May 2009

[15] TOommenDMisra N K C Twarakavi A Prakash B Sahooand S Bandopadhyay ldquoAn objective analysis of support vectormachine based classification for remote sensingrdquoMathematicalGeosciences vol 40 no 4 pp 409ndash424 2008

[16] T Kavzoglu and I Colkesen ldquoA kernel functions analysis forsupport vector machines for land cover classificationrdquo Interna-tional Journal of Applied EarthObservation andGeoinformationvol 11 no 5 pp 352ndash359 2009

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Accurate Multisteps Traffic Flow …downloads.hindawi.com/journals/mpe/2013/418303.pdfAccurate Multisteps Traffic Flow Prediction Based on SVM ZhangMingheng, 1 ZhenYaobao,

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of