Modeling Cyclists’ Route Choice Based on GPS Data

Jeffrey M. Casello
Associate Professor
School of Planning and Department of Civil and Environmental Engineering
University of Waterloo
200 University Ave. West
Waterloo, ON Canada N2L 3G1
[email protected]
(1) 519 888 4567 ext. 37538
Corresponding author

Vladimir Usyukov
Masters Candidate
Department of Civil and Environmental Engineering
University of Waterloo
200 University Ave. West
Waterloo, ON Canada N2L 3G1
[email protected]

Word Count:
Abstract: 190
Body: 4341
Tables + Figures: 10x250=2500
Total: 7031

TRB 2014 Annual Meeting Paper revised from original submittal.


ABSTRACT
With increased emphasis on sustainable transportation, advancements are necessary in the technical methods used in the planning and engineering of investments for non-motorized modes. In this paper, we utilize GPS data on cyclists’ activities to estimate a utility or generalized cost function that reflects cyclists’ evaluation of path alternatives. For 724 cycling trips, we compare the path attributes of the observed cycling path to those of four feasible but un-chosen alternatives. Using two logit formulations, we estimate the relative importance of statistically significant path parameters – length, auto speed, grade and the presence (or absence) of bike lanes. We then test the predictive powers of our models on 181 trips that were observed in the same data set, but were not used to calibrate the model. In the best case, our model correctly predicted the actual path for 65% of these trips; for an additional 13% of trips, the difference in probabilities of selecting the best alternative path and the actual path was less than 5%. We interpret these results to mean that relatively robust path choice (and ultimately mode choice) models may be generated and included in enhanced multimodal travel forecasting models.


1. INTRODUCTION
In the past decade, transportation planning and engineering has increasingly focused on achieving balanced transportation – the provision and operation of transport systems that allow convenient travel by multiple modes (1). These systems may have lower infrastructure and operating costs, better reliability for users and lower environmental impacts. Moreover, cities with these systems are also seen as more economically vibrant and livable. Despite this focus on balanced transportation, technical methods to plan and design for non-motorized modes remain under-developed compared to those available for assessing auto or transit investments. Quantitative, behavioral models have been used for decades to estimate the utilization of proposed roadways or transit facilities. Only in the past few years have such models been developed to predict cyclists’ behavior (2, 3, 4) such that cycling investments may be appropriately evaluated.

In this paper, we develop, calibrate and validate a cycling route choice model for Waterloo, Ontario. We test the predictive power of our model on a subset of the data set that was withheld from calibration. In forthcoming work, we also test the predictive power of our models in a second region – Peel Region, Ontario – to assess the transferability of the models.

Conceptually, the research takes the following approach. Using GPS data from a 2011 study of cyclists’ activities (5), we are able to observe origins, destinations, and path choices. Using GIS, we identify several other possible paths that the cyclist did not choose. We can then quantify those characteristics of both the chosen and un-chosen paths that the literature suggests are of importance to cyclists: path length, auto volumes and speeds on shared facilities, elevation changes and the presence or absence of cycling facilities (bike lanes, etc.). Using two logit formulations, we estimate the relative importance of the path characteristics. The outputs are two models that take the form of a linear sum of significant parameters – essentially a generalized cost (utility) representation for cyclists. The models are validated by predicting the path choice for a series of trips from the same data set that were not used in the model calibration. The models generated demonstrate reasonably strong predictive power with relatively modest data requirements.

The remainder of the paper is organized as follows. The next section reviews the literature in two primary areas: those factors influencing cyclists’ behavior and previous efforts to generate cycling generalized cost functions as well as path choice. We then describe our data and our modeling efforts in more detail. Next, we present the models generated and assess their predictive powers. Finally, we assess the limitations of our methods and suggest further research opportunities.

2. LITERATURE REVIEW
Most studies focusing on route choice behaviour can be categorized by the data collection method - either stated preference (SP) or revealed preference (RP) surveys. SP surveys regarding cycling behavior are ubiquitous in the literature. Table 1 presents a substantive list of research on factors influencing cycling behavior. We are not the first to generate this kind of summary. Similarly comprehensive literature reviews were published in (6, 7).


Table 1 Stated preference studies relating bicycle characteristics to propensity to cycle

| Factor | Reference(s) |
|---|---|
| Facility characteristics | |
| Type of facility (whether mixed with traffic, bike lane, or bike path) | 4, 8, 9, 16, 17, 18, 21, 22, 23, 24, 25, 26, 27 |
| Nature of shared roadway, including road class, sight distances, turning radii, lane/median configurations | 4, 18, 19, 20, 25, 27 |
| Existence of on-street parking | 19, 20, 27, 28 |
| Pavement surface type or/and quality | 8, 16, 17, 19, 20, 25 |
| Gradients | 6, 16, 17, 19, 28 |
| Intersection spacing and/or configuration | 4, 19, 20 |
| Cycling treatments at signals, including timing and detection | 18 |
| Completeness and directness of cycling infrastructure | 18 |
| Availability of showers at origin or/and destination | 9, 22 |
| Availability of secure parking for bicycle at origin or/and destination | 9, 18, 22, 27 |
| Continuity of cycling facilities | 28 |
| Non-cycle traffic characteristics | |
| Motor vehicle speeds and driver behaviour | 16, 19, 20, 25, 27, 29 |
| Volume or mix of motor vehicle types, including proportion of trucks | 8, 16, 17, 19, 20, 25, 27 |
| Pedestrian interaction | 27 |
| Individual and trip characteristics | |
| Age | 4, 9, 16 |
| Gender | 4, 9, 16 |
| Income | 9 |
| Private vehicle ownership | 28 |
| Safety concerns | 16, 23, 24, 26, 27 |
| Level of cycling experience | 16, 17, 29 |
| Trip length by time or distance | 6, 8, 22, 28 |
| Environmental/situational characteristics | |
| Weather, season, temperature, rain | 28 |
| Sweeping/Snowplowing | 18 |
| Nature of abutting land uses | 19, 20, 25 |
| Aesthetics along route | 16 |
| Degree of political and public support for cycling | 18 |
| Education and enforcement regarding cycling | 16 |
| Cost and other disincentives to use other modes | 9 |

The well-known weakness of SP surveys is that we may not know if stated responses correspond to actual choices of travelers in a similar situation. To improve the likelihood of observing “actual” traveler behavior, some researchers (6, 8, 9) have employed simulation techniques where travelers are immersed in a situation and their behaviors are observed.


Still, observations of travelers’ behavior in the environment are generally considered most robust. A pioneering RP study (4) used actual trips to model cyclists’ route choice. That study attempted to relate a number of variables - road type (arterial, collector, minor), bike facility, speed limit, volume, grade, and a range of socio-economic characteristics – to route choice. Actual trip data were recorded using hand-drawn maps. Choice sets were processed using a multinomial logit framework. The major findings were that cyclists tend to avoid gradients, grade-separated railway crossings and high-activity areas. In addition, the study was able to capture two types of behaviour: one group of cyclists (experienced) preferred travelling along the shortest route; the second group (inexperienced) preferred travelling longer routes through residential neighbourhoods, which is consistent with our perception of safety. Since this study was one of the first to use RP data, it had certain limitations. The gradient variable did not stand out strongly because the direction of travel was not recorded during the survey; most of the calibrated models had weak statistical quality; and the predictive power of the models was not offered for peer review.

A more recent study was performed by a group of researchers in Switzerland and is recognized as the "first route choice model for cyclists estimated from a large sample of GPS observations" (3). Cyclists were found to be sensitive to trip length, presence of a cycling facility and gradient. The study explored non-linear parameters for the multinomial family of models, for which a Box-Cox transformation was necessary. The significance of model parameters was found to be quite high and the elasticity of variables with respect to trip length was evaluated. However, certain data limitations precluded the researchers from adding a road volume variable to the model structure, which we think is important.

The third study was performed in Portland and its findings are currently implemented in the region’s travel forecasting model (2). Several findings were consistent with other studies, as cyclists were sensitive to path length, gradient, traffic volume, presence of a cycling facility and vehicle turn frequency.

3. METHODOLOGY
As noted above, the purpose of our research is to generate a generalized cost representation to further inform models of cyclists’ path and mode choice behavior. We also validate the predictive power of our models. In this section, we describe the methods applied in this study.

3.1 Data Needs and trip properties
This research builds upon a bicycling study conducted in Waterloo, Ontario from February 2010 to March 2011. More than 400 cyclists were given low-cost GPS units and asked to record their cycling activity over a two-week period. More than 2000 individual trips were made. The GPS units stored location and time data (x,y,z,t) every three seconds. The raw data were downloaded and “cleaned” to eliminate suspect points. (The details of the study design and data cleaning efforts can be found in (5).)

For this research, the GPS data from that study are stored as individual trips. The points associated with each trip are projected using GIS onto links contained in one of two transportation networks employed in this study – roadways and trails (accessible only by pedestrians and cyclists). Once a set of points is associated with a network link, we use the sum of each link’s attributes to quantify the characteristics of the cyclist’s full path. Based on a review of the literature, and on the availability of data, for on-road paths, we elected to quantify:


1. The length of each link;
2. The posted auto speed of each link;
3. The auto volume of each link;
4. The gradient (elevation change) of each link; and
5. The presence or absence of a cycling lane.

For links on the trail network, we quantified length and gradient. In both cases, the gradient was obtained by performing a projection of a digital elevation model (DEM) onto the nodes of the networks. The horizontal and vertical accuracy of the DEM data was stated at ±0.5 m (1σ level), with a data density of 1 point per 10 m².

To generate a path attribute from a series of link attributes we used a length-weighted sum of each link’s values. Figure 1 demonstrates this conceptually for posted speed. The same method was used for auto volume. For links on trails, we assumed that these auto-related attributes – speed and volume – were 0.

| Link | Link ID | Length [km] | Posted Speed [km/hr] |
|---|---|---|---|
| AB | 1 | 1.8 | 60 |
| BC | 2 | 1.0 | 50 |
| CD | 3 | 2.4 | 40 |

The total length of this path, $L_{AD}$, is given by:

$$L_{AD} = \sum_i L_i = 1.8 + 1.0 + 2.4 = 5.2 \text{ km}$$

The weighted posted speed, $S_{AD}$, is given by:

$$S_{AD} = \frac{\sum_i L_i S_i}{\sum_i L_i} = \frac{1.8 \times 60 + 1.0 \times 50 + 2.4 \times 40}{5.2} \approx 48.85 \text{ km/hr}$$

Figure 1 Method to convert link properties to path properties (path through nodes A, B, C and D)

We only had data that indicated whether or not a cycling facility (bike lane) existed on each link. No descriptive data for cycling facilities (width, method of segregation, etc.) were available. As such, for each path we were only able to calculate the fraction of total length on which a cycling lane is present. Thus, the cycling facility variable takes continuous values between 0 and 1.

For gradient, we considered only positive grades (uphill segments) at the link level; we considered methods to quantify positive utility for cyclists traveling on negative grades (downhill), but our a priori assumption is that (non-recreational) cyclists tend to experience more disutility in climbing hills than the utility gained in traveling downhill. So, in Figure 1, if the elevation at node B was greater than the elevation at node A, we computed the elevation change over that link. We then translated that elevation change to a percent using simple trigonometry. If a link had a negative grade, we set the grade to 0. The full path grade change is the weighted sum (again by length) of these link grades.
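The link-to-path aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' GIS workflow: the `Link` class, its field names and the `path_attributes` helper are hypothetical, the grade is approximated as rise over horizontal link length, and only the link lengths and posted speeds are taken from Figure 1 (the volumes, elevations and bike lane flags are made up).

```python
from dataclasses import dataclass


@dataclass
class Link:
    length_km: float      # horizontal link length
    speed_kmh: float      # posted auto speed (0 on trails)
    volume_vph: float     # auto volume (0 on trails)
    elev_start_m: float   # elevation at the upstream node
    elev_end_m: float     # elevation at the downstream node
    has_bike_lane: bool


def uphill_grade_pct(link):
    """Percent grade in the direction of travel; downhill links count as 0."""
    rise_m = link.elev_end_m - link.elev_start_m
    return max(0.0, rise_m / (link.length_km * 1000.0) * 100.0)


def path_attributes(links):
    """Aggregate link attributes into path attributes using length weights."""
    total_km = sum(link.length_km for link in links)

    def weighted(value_of):
        return sum(link.length_km * value_of(link) for link in links) / total_km

    return {
        "length_km": total_km,
        "speed_kmh": weighted(lambda l: l.speed_kmh),
        "volume_vph": weighted(lambda l: l.volume_vph),
        "grade_pct": weighted(uphill_grade_pct),
        "bike_lane_share": weighted(lambda l: 1.0 if l.has_bike_lane else 0.0),
    }


# Lengths and speeds follow Figure 1; all other values are illustrative only.
path_ad = [
    Link(1.8, 60, 400, 320.0, 329.0, False),  # AB, uphill
    Link(1.0, 50, 250, 329.0, 325.0, True),   # BC, downhill -> grade 0
    Link(2.4, 40, 100, 325.0, 325.0, True),   # CD, flat
]
print(path_attributes(path_ad))
```

Treating the bike lane flag as a 0/1 value and weighting it by length yields the share of the path with a cycling lane, so one helper covers all five attributes.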

3.2 Generating alternative paths
For a given Origin-Destination (OD) pair, there exists one chosen path with a vector of attributes and an infinite set of alternatives with a wide range of attribute vector values. In the absence of established methods for alternative path generation, we base our approach on the following a priori assumption. We assume that there is an inherent trade-off for cyclists between a path’s directness of travel (shorter length or travel time is preferred) and its safety (travel separated from autos, or adjacent to lower speed, lower volume travel is preferred). As such, to determine the relative importance of attributes, we generate four alternative, un-chosen paths such that:

• two of these paths (where possible) are more direct than the chosen path but, presumably, have poorer safety attributes – higher auto volumes, higher posted speeds, or no dedicated cycling space.
• two of these paths are less direct paths but typically have better safety characteristics, including greater use of trails.

Based on these principles, we automated the alternative generation in GIS. To identify more direct paths (where they existed), we employed built-in shortest path algorithms to find feasible paths from a path set that had no restrictions, except to preclude travel on freeways. We also wished to use the shortest path functionality of GIS to identify less direct paths. To this end, we introduced artificial travel penalties on the three previously identified paths – the two more direct and the actual path. This forced the GIS shortest path algorithm to find a shortest path that was both longer than the three previous paths, and also independent of the previous paths. This independence of alternatives is necessary to satisfy the logit model requirements.
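The penalty step can be illustrated outside of GIS with a plain Dijkstra search. The sketch below is an assumption-laden stand-in for the built-in GIS routine: the graph format, the function names and the penalty multiplier of 10 are all hypothetical, since the paper does not report which tool or penalty values were used.

```python
import heapq


def shortest_path(graph, origin, dest, penalty=None):
    """Dijkstra search over graph = {node: {neighbour: length_km}}.

    penalty maps an undirected edge, frozenset((a, b)), to a cost multiplier.
    Returns the node sequence of the cheapest path, or None if unreachable.
    """
    penalty = penalty or {}
    best = {origin: 0.0}
    prev = {}
    heap = [(0.0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dest:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, length in graph[node].items():
            new_cost = cost + length * penalty.get(frozenset((node, nxt)), 1.0)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if dest not in best:
        return None
    path = [dest]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return path[::-1]


def penalized_alternatives(graph, origin, dest, known_paths, count=2, factor=10.0):
    """Find up to `count` paths that avoid links used by already-known paths."""
    penalty = {}

    def penalize(path):
        for a, b in zip(path, path[1:]):
            penalty[frozenset((a, b))] = factor

    for path in known_paths:
        penalize(path)
    found = []
    for _ in range(count):
        path = shortest_path(graph, origin, dest, penalty)
        if path is None or path in found or path in known_paths:
            break  # no further independent path exists
        found.append(path)
        penalize(path)
    return found


# Toy network: three parallel routes between O and D (lengths in km).
toy = {
    "O": {"A": 1.0, "B": 1.5, "C": 2.0},
    "A": {"O": 1.0, "D": 1.0},
    "B": {"O": 1.5, "D": 1.5},
    "C": {"O": 2.0, "D": 2.0},
    "D": {"A": 1.0, "B": 1.5, "C": 2.0},
}
print(penalized_alternatives(toy, "O", "D", known_paths=[["O", "A", "D"]]))
```

A multiplicative penalty discourages reuse of a link without forbidding it, so a path is still returned where the network offers no fully independent route; in that case the overlap would need to be checked separately.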

The output of the alternative generation is a table of five paths – the actual path and four un-chosen paths – and a vector of attributes that describe each path. Figure 2 shows both the chosen path and alternatives generated for a given OD pair. The path characteristics for the alternatives and the chosen path are shown in Table 2.

Figure 2 An example OD pair with actual path and four alternative paths

Table 2 Path attributes for chosen route and alternatives

| Path Attribute | Alternative 1 | Alternative 2 | Chosen path | Alternative 3 | Alternative 4 |
|---|---|---|---|---|---|
| Length (km) | 7.19 | 7.42 | 7.45 | 9.16 | 9.91 |
| Auto speed (km/h) | 33.2 | 48.2 | 35.8 | 49.8 | 41.5 |
| Auto volume (veh/h) | 245.1 | 357.4 | 240.0 | 219.8 | 90.2 |
| Grade | 0.52 | 0.55 | 0.39 | 0.83 | 0.64 |
| Presence of bike lane | 0.49 | 0.16 | 0.46 | 0.02 | 0.22 |

While all the alternatives have similar speeds, in this case, the shorter paths (alternatives 1 and 2) both expose the cyclist to higher auto volumes and more challenging grades. Similarly, only 16% of path 2 has a bicycle facility compared to 46% of the chosen alternative. The longer paths have lower auto volumes, but higher grades, and a smaller percentage of the path with cycling facilities.

The 2011 study produced more than 2000 individual trips. From this total set, we were forced to exclude many records because:

1. Many of the trips were recreational which implies a very different path choice framework;
2. Many of the trips were very short – less than 300 meters – which precluded the generation of meaningful alternatives;
3. Some attribute data were missing from the GIS network which precluded the generation of alternative and chosen path characteristic tables (like Table 2).

As a result of these limitations, we reduced our total data set to 905 trips. This sample size is still sufficiently large to provide meaningful results. By generating this kind of comparison for more than 900 O-D pairs, with the knowledge of the chosen alternative, it is possible to estimate quantitatively the relative value that cyclists in our study place on each of the path attributes.

3.3 Model estimation
The framework to model cyclists’ route choice is based upon discrete choice theory first developed in (10, 11, 12, 13). We employ a multinomial logit model (MNL) of the form:

$$P(i) = \frac{e^{U_i}}{\sum_{j \in n} e^{U_j}} \qquad \text{(eq. 1)}$$

This equation can be interpreted as follows. The probability of choosing alternative i amongst all alternatives n is a function of the utility derived ($U_i$) from the attributes of choice i, relative to the utility derived from the attributes of all alternatives. For our work, we employ this framework to identify the coefficient for each attribute in a utility function that maximizes the likelihood that the chosen alternative has the highest probability of being selected. In other words, we use this framework to establish the values for the coefficients $\beta$ in equation 2 that maximize the number of times the chosen path has the highest probability of being selected.

$$U_i = \beta_{length}\,\text{Length}_i + \beta_{speed}\,\text{Speed}_i + \beta_{volume}\,\text{Volume}_i + \beta_{grade}\,\text{Grade}_i + \beta_{bikelane}\,\text{BikeLane}_i \qquad \text{(eq. 2)}$$

Note that in equation 2 the coefficients of length, speed, volume and elevation difference are expected to be negative – implying decreased utility or a cost. On the other hand, the presence of a bike lane creates positive utility.
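As a worked illustration of equations 1 and 2, the sketch below evaluates a linear utility and the resulting logit probabilities for the five paths of Table 2, using the Model 1 coefficients reported later in Table 4. The dictionary keys and function names are hypothetical, and the calculation assumes that the attribute units in Table 2 match those used in calibration, which the paper does not state explicitly.

```python
import math

# Model 1 coefficients as reported in Table 4 (volume is statistically insignificant).
MODEL_1 = {"length_km": -0.1818, "bike_lane": 4.3081, "volume": 0.0001, "grade": -1.4864}

# Path attributes from Table 2.
TABLE_2 = {
    "Alternative 1": {"length_km": 7.19, "bike_lane": 0.49, "volume": 245.1, "grade": 0.52},
    "Alternative 2": {"length_km": 7.42, "bike_lane": 0.16, "volume": 357.4, "grade": 0.55},
    "Chosen path":   {"length_km": 7.45, "bike_lane": 0.46, "volume": 240.0, "grade": 0.39},
    "Alternative 3": {"length_km": 9.16, "bike_lane": 0.02, "volume": 219.8, "grade": 0.83},
    "Alternative 4": {"length_km": 9.91, "bike_lane": 0.22, "volume": 90.2, "grade": 0.64},
}


def utility(attrs, beta):
    """Equation 2: linear sum of coefficient times attribute."""
    return sum(beta[name] * attrs[name] for name in beta)


def mnl_probabilities(choice_set, beta):
    """Equation 1: logit probability of each path in the choice set."""
    utils = {name: utility(attrs, beta) for name, attrs in choice_set.items()}
    top = max(utils.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(u - top) for name, u in utils.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


probs = mnl_probabilities(TABLE_2, MODEL_1)
for name, p in probs.items():
    print(f"{name}: {p:.3f}")

# Contribution of this OD pair to the log-likelihood maximized during estimation.
print("log-likelihood term:", math.log(probs["Chosen path"]))
```

Under these assumptions the chosen path receives the highest utility, narrowly ahead of Alternative 1. Estimation software searches for the coefficients that maximize the sum of such log-likelihood terms over all calibration trips.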

3.4 Analysis of alternatives
As with any modeling effort, it is necessary to evaluate the form and interdependence of independent variables. Figure 3 shows scatter plots generated for the attributes of the chosen paths. In separate work (14), we analyze these data and determine that a linear representation of the utility function is appropriate, and that the a priori hypotheses regarding route tradeoffs are supported by these diagrams. Here, we make one further observation. There appears to be a strong correlation between path speed and the presence of a bike lane – note the lack of dispersion in data values in the two highlighted boxes in Figure 3. In order to avoid a situation where this correlation produces unwanted outcomes, we do not consider utility functions that contain both speed and the presence of a bike lane.

Figure 3 Scatter plots of the relationships between independent variables

4. MODELS GENERATED
To account for the dependence between speed and bike lane, we began with two model formulations. The first included bike lane and excluded speed; the second included speed and excluded bike lane. The format of the two models is shown in Table 3.

Table 3 General format of the models generated

| Model | Utility function |
|---|---|
| Model 1 | $U = \beta_{length}\,\text{Length} + \beta_{bikelane}\,\text{BikeLane} + \beta_{volume}\,\text{Volume} + \beta_{grade}\,\text{Grade}$ |
| Model 2 | $U = \beta_{length}\,\text{Length} + \beta_{speed}\,\text{Speed} + \beta_{volume}\,\text{Volume} + \beta_{grade}\,\text{Grade}$ |

We then randomly chose a set of 724 trips (80% of the total data set) to estimate model parameters, reserving the remaining 20% of trips for validation. Utilizing commercial software (EasyLogitModeler (15)), we estimate the model parameters and goodness of fit. The model parameters are shown in Table 4 along with t-test results that evaluate statistical significance. The Abs. ratio in the table is a metric of the relative importance per unit of measurement of each independent variable.

For model 1, all signs are consistent with our a priori expectations. Cyclists perceive -0.1818 units of utility (or 0.1818 units of cost) for every km of distance traveled; cyclists also experience disutility associated with positive (uphill) grades. In contrast, the model estimates very strong positive utility for increased percentages of paths with cycling lanes. In model 1, the auto volume is statistically insignificant.


In model 2, the signs for length, speed and gradient are consistent with expectation. Increases in each of these variables generate higher cycling costs. Volume, on the other hand, has the opposite sign; model 2 suggests that increases in auto volume lower cyclists’ perception of cost. The Abs. ratios in model 2 are also very large, suggesting that model 2 may have some undesirable qualities. This is reflected in its predictive power, discussed below.

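Table 4 Parameter estimates for models 1 and 2

| Estimated parameter | Model 1 Value | Model 1 t-test | Model 1 Abs. Ratio | Model 2 Value | Model 2 t-test | Model 2 Abs. Ratio |
|---|---|---|---|---|---|---|
| Length | -0.1818 | -5.1864 | 1 | -0.083 | -2.2037 | 36 |
| Bike lane | 4.3081 | 11.7626 | 24 | - | - | - |
| Auto volume | 0.0001 | 0.371 | - | 0.0023 | 6.4307 | 1 |
| Auto speed | - | - | - | -0.7025 | -15.4666 | 305 |
| Grade | -1.4864 | -6.4719 | 8 | -0.5009 | -2.0703 | 218 |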
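The Abs. ratio column can be reproduced from the coefficient values, assuming (the paper does not define it formally) that it is the absolute value of each coefficient divided by the absolute value of a reference coefficient: length in model 1 and auto volume in model 2. The short check below uses that assumed definition; the variable and function names are hypothetical.

```python
# Statistically reported coefficients from Table 4. Model 1 auto volume is
# left out because the table reports no Abs. ratio for it.
model_1 = {"length": -0.1818, "bike_lane": 4.3081, "grade": -1.4864}
model_2 = {"length": -0.083, "volume": 0.0023, "speed": -0.7025, "grade": -0.5009}


def abs_ratios(coefficients, reference):
    """Absolute size of each coefficient relative to the reference coefficient."""
    base = abs(coefficients[reference])
    return {name: round(abs(value) / base) for name, value in coefficients.items()}


print(abs_ratios(model_1, reference="length"))
print(abs_ratios(model_2, reference="volume"))
```

Read this way, the model 1 ratios can be interpreted as distance equivalents: a path that is entirely on bike lanes is valued like a reduction of roughly 24 km in length, which reflects how strongly the bike lane share enters the utility.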

The goodness of fit statistics can be found in (14). We think a more appropriate assessment of the models’ performance can be completed by using these cost functions to predict path choice for OD pairs not used in the model formulation. To this end, we solved equation 1 using the two models for five alternative paths between 181 OD pairs. Recall that the outputs of equation 1 are the probabilities of each alternative being chosen. We analyzed the results in three ways.

First, we calculated the number of times the actual path was estimated by equation 1 to have the highest probability of being chosen amongst the five alternatives; this metric is simply how many times the models predict the correct path. Second, in the cases where an alternative path had a higher probability, we recorded the rank of the actual path amongst the five choices. Finally, we calculated the difference in probabilities between the highest probability path and the chosen path. When this difference is very small, the model can be perceived to be providing a good estimate; when this difference is very large, the model is not performing well.

Table 5 shows sample results for five OD pairs using model 1. For trips 917 – 920, the model correctly identified the chosen path as having the highest probability of being selected. For trip 921, the model predicted the probability of choosing the actual path as least likely, fifth, amongst all choices. Also for trip 921, alternative 2 is the most likely path with a probability of 0.292 while the actual path only had a probability of 0.043. In this case, a substantial difference in probabilities – 0.249 – is observed, meaning that for this OD pair the model performs poorly.

Table 5 Sample probabilities of selecting paths (Model 1)

| Trip_ID | Pr (actual) | Pr (alt.1) | Pr (alt.2) | Pr (alt.3) | Pr (alt.4) |
|---|---|---|---|---|---|
| 917 | 0.530 | 0.093 | 0.146 | 0.114 | 0.117 |
| 918 | 0.513 | 0.109 | 0.174 | 0.100 | 0.104 |
| 919 | 0.307 | 0.288 | 0.154 | 0.207 | 0.045 |
| 920 | 0.857 | 0.028 | 0.044 | 0.035 | 0.036 |
| 921 | 0.043 | 0.193 | 0.292 | 0.228 | 0.244 |
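The three validation metrics (correct prediction, rank of the actual path, and probability difference) can be computed directly from rows like those in Table 5. The sketch below is illustrative only; the `evaluate` function and the data layout are hypothetical, with the actual path stored first in each row.

```python
# Table 5 rows: [Pr(actual), Pr(alt.1), Pr(alt.2), Pr(alt.3), Pr(alt.4)]
table_5 = {
    917: [0.530, 0.093, 0.146, 0.114, 0.117],
    918: [0.513, 0.109, 0.174, 0.100, 0.104],
    919: [0.307, 0.288, 0.154, 0.207, 0.045],
    920: [0.857, 0.028, 0.044, 0.035, 0.036],
    921: [0.043, 0.193, 0.292, 0.228, 0.244],
}


def evaluate(probabilities):
    """Score one OD pair; the first entry is the probability of the actual path."""
    actual = probabilities[0]
    rank = 1 + sum(p > actual for p in probabilities)
    return {
        "correct": rank == 1,
        "rank": rank,
        "difference": max(probabilities) - actual,
    }


results = {trip: evaluate(row) for trip, row in table_5.items()}
hits = sum(r["correct"] for r in results.values())
print(f"correct predictions: {hits} of {len(results)}")
for trip, r in results.items():
    print(trip, r["rank"], round(r["difference"], 3))
```

Applied to all 181 validation trips, the same three values would give the hit rate, the rank counts of Table 6 and the cumulative distribution of Figure 4.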


In our tests, model 1 correctly predicted the chosen path in 118 out of 181 cases, or 65% of the time. Model 2 was correct for 84 trips, or 46% of the time. For those trips in which the actual path was not the highest probability choice, Table 6 shows the number of times the actual path was ranked second through fifth.

Table 6 Rankings of actual path amongst five choices

| Rank of Actual Path | Model 1 | Model 2 |
|---|---|---|
| 2nd | 19 | 19 |
| 3rd | 26 | 47 |
| 4th | 13 | 19 |
| 5th | 5 | 12 |

Figure 4 shows the cumulative distribution of observations as a function of the error term. Model 1 predicted the actual path as the highest probability path (therefore producing a difference of 0) 65% of the time. Again for model 1, the difference in probability was less than 0.05 for an additional 24 OD pairs, or 13%. Thus, in 78% of the cases, model 1 either predicts the chosen path correctly or estimates that its probability is within 0.05, a value one may consider sufficiently similar to the highest probability alternative. Figure 4 also shows the number of OD pairs for which the models perform poorly. For model 1, in about 16% of cases, the difference in probabilities exceeds 0.10; for model 2, the difference is greater than 0.10 more than 30% of the time. Thus, for both models, significant “outliers” – situations where cyclists chose unconventional paths – exist.

Figure 4 Models' performance as measured by probability difference (x-axis: upper bound on probability difference; y-axis: percent of observations with error less than the upper bound; series: Model 1, Model 2)

5. ANALYSIS AND LIMITATIONS
There are several limits to our methods. In our study, nearly all participants described themselves as expert or very experienced cyclists. As such, the tradeoffs between distance and safety may be less pronounced in our data set than if data were collected from a broader population base. We were also forced to include only the presence or absence of bike lanes in our analysis. We believe that there is a hierarchy of cycling facilities – with well designed off-road trails being generally most desirable. Similarly, the satisfaction or desirability of on-road cycling paths varies with the width and degree of separation from vehicular traffic. Future research may include a measure of cycling facility quality as well as quantity.

We also recognize that it is likely that a cyclist considers both vehicle speeds and the presence or absence of a bike lane in making a path choice. The statistical correlation between these two features in the set of alternatives we generate precludes the inclusion of both in a single model. An improved approach to developing feasible, un-chosen alternatives would be to ensure that some paths included cycling lanes, while others excluded links containing cycling facilities. This would accomplish two objectives. First, the statistical correlation between the two factors would likely be significantly reduced. Second, this approach would allow us to quantify more rigorously the value of a cycling facility.

In testing the predictive power of our models, we examined the likelihood of selecting the chosen path amongst a choice set of five paths (the chosen plus four alternatives). A more robust test of the predictive power would be to code our generalized cost formulations into GIS or any shortest path software to determine how often the chosen path is the highest probability path amongst the full set of feasible paths. Despite these limitations, we find the results presented here to be positive and, more importantly, repeatable.

381 

6. CONCLUSIONS AND POTENTIAL APPLICATIONS 382 Using GPS traces of cycling activity, and developing feasible alternative routes, we are able to 383 

estimate the relative cost perceptions of cyclists in Waterloo Region. The two models developed 384 

perform reasonably well in predicting cyclists’ path choice, though model 1 – calibrated as a 385 

function of length, grade and bike lane – performs significantly better than model 2. 386 

We believe that this research has significant potential to influence how cycling facilities are planned and designed. In forthcoming work we demonstrate the following approach. We select a set of destinations that are expected to have significant cycling demand (in our case, several university campuses). Using the utility functions we present in this paper, we estimate the most likely path from all origins to the set of destinations. For each predicted trip, we then compare the predicted path to the current shortest possible path to calculate “excess travel.” Those OD pairs with the largest excess travel are trips for which investments in cycling facilities along the shortest path may produce the greatest return on investment. Using this method, we are also able to identify key links that, if upgraded, would reduce the user cost of cycling for multiple OD pairs. More generally, we are working to extend the methods described here to create a low-cost, reasonably robust generalized cost representation for cyclists that can be integrated into a multimodal travel forecasting framework to predict utilization and, ultimately, return on investment.
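The excess travel calculation can be illustrated with a small sketch. It finds the minimum generalized cost path and the minimum length path for each OD pair using Dijkstra's algorithm, then reports the difference in length. The four-node network, the `LANE_DISCOUNT` factor and the form of `generalized_cost` are assumptions made for illustration; they are not the calibrated cost functions of this paper.

```python
import heapq

# Hypothetical undirected links: (node_u, node_v, length_km, has_bike_lane).
LINKS = [
    ("A", "B", 1.0, False),
    ("B", "D", 1.0, False),
    ("A", "C", 1.2, True),
    ("C", "D", 1.2, True),
]

LANE_DISCOUNT = 0.7  # assumed: a link with a bike lane is perceived as 30% shorter


def build_network(links):
    """Return an adjacency list and a link-length lookup for both directions."""
    adjacency, lengths = {}, {}
    for u, v, length_km, has_lane in links:
        adjacency.setdefault(u, []).append((v, length_km, has_lane))
        adjacency.setdefault(v, []).append((u, length_km, has_lane))
        lengths[(u, v)] = lengths[(v, u)] = length_km
    return adjacency, lengths


def length_only(length_km, has_lane):
    return length_km


def generalized_cost(length_km, has_lane):
    return length_km * (LANE_DISCOUNT if has_lane else 1.0)


def best_path(adjacency, origin, destination, weight):
    """Dijkstra's algorithm; returns the node sequence of the cheapest path."""
    dist, prev = {origin: 0.0}, {}
    heap = [(0.0, origin)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == destination:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, length_km, has_lane in adjacency.get(node, []):
            nd = d + weight(length_km, has_lane)
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    if destination not in dist:
        raise ValueError("destination is not reachable from origin")
    path = [destination]
    while path[-1] != origin:
        path.append(prev[path[-1]])
    return path[::-1]


def path_length(path, lengths):
    return sum(lengths[(u, v)] for u, v in zip(path, path[1:]))


def excess_travel(adjacency, lengths, origin, destination):
    """Length of the predicted path minus length of the shortest path (km)."""
    predicted = best_path(adjacency, origin, destination, generalized_cost)
    shortest = best_path(adjacency, origin, destination, length_only)
    return path_length(predicted, lengths) - path_length(shortest, lengths)


adjacency, lengths = build_network(LINKS)
od_pairs = [("A", "D"), ("B", "D")]
# Rank OD pairs from largest to smallest excess travel.
ranked = sorted(od_pairs, key=lambda od: excess_travel(adjacency, lengths, *od),
                reverse=True)
print(ranked)
```

In this toy network the predicted path from A to D detours along the links with bike lanes, so that OD pair ranks first; upgrading the direct links A-B and B-D would be the candidate investment.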

ACKNOWLEDGEMENTS

The authors are grateful to the Region of Waterloo for funding the project through which the GPS data were gathered. This research was also funded in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). The Easy Logit (15) software used for model estimation was provided by Jeffrey Newman (Northwestern University). The authors also acknowledge the helpful and constructive comments from the anonymous reviewers.

REFERENCES

1. Vuchic, V. (1999), Transportation for liveable cities. Center for Urban Policy Research, New Jersey.
2. Broach, J., Gliebe, J., Dill, J. (2011), Bicycle route choice model developed from revealed-preference data. Transportation Research Board, Paper #11-3901.
3. Menghini, G., Carrasco, N., Schussler, N., Axhausen, K.W. (2010), Route choice of cyclists in Zurich. Transportation Research Part A 44, 754–765.
4. Aultman-Hall, L.M. (1996), Commuter bicycle route choice: analysis of major determinants and safety implications (PhD Dissertation).
5. Rewa, K. (2012), An analysis of stated and revealed preference cycling behaviour: a case study of the regional municipality of Waterloo (MSc Thesis), http://hdl.handle.net/10012/6910
6. Hunt, J.D., Abraham, J.E. (2006), Influences on bicycle use. Transportation (2007) 34, 453–470.


7. Heinen, E., van Wee, B., Maat, K. (2010), Commuting by bicycle: an overview of the literature. Transport Reviews: A Transnational Transdisciplinary Journal 30(1), 59–96.
8. Bradley, M.A., Bovy, P.H.L. (1984), A stated-preference analysis of bicyclist route choice. Proceedings - PTRC Annual Meeting, London, pp. 39–53.
9. Taylor, D., Mahmassani, H. (1996), Analysis of stated preferences for intermodal bicycle-transit interfaces. Transportation Research Record 1556.
10. Ben-Akiva, M., Lerman, S. (1985), Discrete Choice Analysis: Theory and Applications to Travel Demand. MIT Press Series in Transportation Studies.
11. McFadden, D. (1976), The mathematical theory of demand models. In Behavioural Travel Demand Models, P. Stopher and A. Meyburg, eds., North Holland, Amsterdam, pp. 75–96 (Nobel Prize Laureate).
12. McFadden, D. (1974), Conditional logit analysis of qualitative choice behaviour. In Frontiers in Econometrics, P. Zarembka, ed., Academic Press, New York, pp. 105–142 (Nobel Prize Laureate).
13. McFadden, D., Tye, W., Train, K. (1977), An application of diagnostic tests for the irrelevant alternatives property of the multinomial logit model. Transportation Research Record 637, 39–46.
14. Usyukov, V. (2013), Modeling of route choice of cyclists in the Region of Waterloo, Canada. Transferability of route choice models to the Region of Peel, Canada (MSc Thesis).
15. Easy Logit Software, available from http://elm.newman.me/downloads
16. Antonakos, C. (1994), Environmental and travel preferences of cyclists. Transport. Res. Record 1438, 25–33.
17. Axhausen, K.W., Smith, R.L. (1986), Bicyclist link evaluation: a stated-preference approach. Transport. Res. Record 1085, 7–15.
18. Copley, J.D., Pelz, D. (1995), The City of Davis experience: what works. Am. Soc. Civil. Eng. Transport. Congr. 2, 1116–1125.
19. Davis, W.J. (1995), Bicycle test route evaluation for urban road conditions. Am. Soc. Civil. Eng. Transport. Congr. 2, 1063–1076.
20. Epperson, B. (1994), Evaluating suitability of roadways for bicycle use: towards a cycling level-of-service standard. Transport. Res. Record 1438, 9–16.
21. Goldsmith, S. (1996), Estimating the Effect of Bicycle Facilities on VMT and Emissions. City of Seattle Engineering Department, Seattle, WA.
22. Guttenplan, M., Patten, R. (1995), Off-road but on track. Transport. Res. News 178(3), 7–11.
23. Kroll, B., Ramey, M. (1977), Effect of bike lanes on driver and bicyclist behavior. J. Transport. Eng. Div., Am. Soc. Civil. Eng. 103(TE2), 243–256.
24. Kroll, B., Sommer, R. (1976), Bicyclist response to urban bikeways. J. Am. Inst. Planners 42, 42–51.
25. Landis, B.W., Vattikuti, V.R. (1997), Real-time human perceptions: towards a bicycle level of service. Presented at the 1997 Transportation Research Board Annual Conference, Washington DC, January.
26. Lott, D.Y., Tardiff, T., Lott, D.F. (1978), Evaluation by experienced riders of a new bicycle lane in an established bikeway system. Transport. Res. Record 683, 40–46.


27. Mars, J.H., Kyriakides, M. (1986), Riders, Reasons and Recommendations: A Profile of Adult Cyclists in Toronto. City of Toronto Planning and Development Department, Toronto ON.
28. Stinson, M., Bhat, C. (2003), Commuter bicyclist route choice: analysis using a stated preference survey. Transportation Research Record 1828, Paper No. 03-3301.
29. Sorton, A., Walsh, T. (1994), Bicycle stress level as a tool to evaluate urban and suburban bicycle compatibility. Transport. Res. Record 1438, 17–24.
