Mining Regular Route from GPS Data for Ridesharing Recommendation 2012.09.19



Mining Regular Route from GPS Data for Ridesharing Recommendation

2012.09.19

Authors

Wen He: Tsinghua University, Beijing, China and Xi'an Communication Institute, Xi'an, China
Deyi Li: Tsinghua University, Beijing, China and Chinese Institute of Electronic System Engineering, Beijing, China
Tianlei Zhang: Tsinghua University, Beijing, China
Lifeng An: Tsinghua University, Beijing, China
Mu Guo: Tsinghua University, Beijing, China
Guisheng Chen: Chinese Institute of Electronic System Engineering, Beijing, China

Outline

Introduction
Related Work
Architecture
The Mining of Regular Routes
Ridesharing Recommendations
Experiments Discussion
Conclusion

Introduction

Why do this research?
To ease the traffic problem in Beijing, China.
Ridesharing is still in short supply due to:
Concerns about journey reliability.
Ridesharing being seen as less desirable.

Current situation: more people are likely to record their daily trajectories.
Current challenge: a regular route (RR) is difficult to distinguish from a frequent route.

Contribution

Make ridesharing recommendations based on a group of users' regular routes (RRs).
Propose a method to split mixed user trajectories into individual routes.
Propose a frequency-based regular route mining method to infer users' RRs.
Improve the accuracy of distinguishing travel modes between public transport and private driving.
Evaluate the method on a large GPS dataset provided by GeoLife, which contains the GPS trajectories of 178 real users over a period of four years.

Related Work (1/2)

Past Ridesharing Recommendation
A location-based cab-sharing service to help reduce cab fare costs and make effective use of available cabs.
Improving the quality of ridesharing by increasing the driver's income.
Ridesharing with multiple riders, proposed based on the Bee Colony Optimization metaheuristic.

Related Work (2/2)

Mining Route History
Our recommendations apply not only to drivers, but also to users who take public transport as one of their common travel modes.
Our sharable routes are not generated directly from one day's trajectory log.
A flexible time interval is used: it is difficult to find two cars that always stay synchronized, even if they start at the same time and run on the same road.
The difference between a personal route and a regular route (RR) is that a personal route does not consider the time factor and is not a complete route.

Architecture

Three components:
Routes Processing
Regular Routes Mining
Ridesharing Recommendation

User-based components only need to be performed once, when a user submits his/her logs to the system.

Routes Processing

Stay Regions Subtracting: a stay region is definitely not part of a regular route.
Grids Mapping: combine the time information with grids to build a series of temporal grids.
Routes Splitting: segment the trajectory into individual routes.

Regular Routes Mining

Routes Grouping: group together the routes that happened at similar times of day.
Regular Routes Finding: a frequency-based regular route mining algorithm is proposed.
Travel Modes Recognizing: a fixed stop rate (FSR) feature is used to recognize the travel mode of an RR.

Ridesharing Recommendations

Grid-based Routes Table Building: a regular route generated by public transport is recorded only at its starting and ending grids.

Routes Matching: with the grid-based routes table, search for two routes that appear in pairs and have similar time properties.

THE MINING OF REGULAR ROUTES

Routes Processing

A sequence of GPS points should be split in two cases:
The user arrives at his destination; when he leaves, a new route begins.
The GPS device was shut down or lost the satellite signal for longer than a certain time.
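The second split case can be illustrated with a minimal Python sketch. It is not taken from the slides: the function name `split_on_gaps` and the 20-minute value for `max_gap` are assumptions made for illustration, since the slides only say "over a certain time".

```python
from datetime import datetime, timedelta

# Hypothetical threshold: the slides do not give a concrete value.
MAX_GAP = timedelta(minutes=20)


def split_on_gaps(points, max_gap=MAX_GAP):
    """Split time-ordered (timestamp, lat, lon) tuples into routes.

    A new route starts whenever two consecutive points are more than
    max_gap apart, i.e. the device was off or had lost its signal.
    """
    routes = []
    current = []
    for point in points:
        if current and point[0] - current[-1][0] > max_gap:
            routes.append(current)
            current = []
        current.append(point)
    if current:
        routes.append(current)
    return routes
```

The first split case (arrival at a destination) is not covered here; it depends on stay regions, which the following slides describe.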

Routes Processing Steps(1/3)

Stay Region Subtracting
Pm is the ending point of the route that enters a stay region.
Pn+1 is the starting point of the next route, when the user departs from the stay region.
We do not denote this region by a single point, but by a pair of indicators (Pm, Pn), where Pm and Pn are the beginning and ending points of the stay region.
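A rough Python sketch of this step follows. The slides do not state how a stay region is detected, so the detection rule is an assumption: a run of points that stays within `max_dist_m` metres of its first point for at least `min_stay_s` seconds. The function names and both default values are hypothetical. Only the (Pm, Pn) pair representation and the cut at Pm and Pn+1 come from the slide.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def find_stay_regions(points, max_dist_m=200.0, min_stay_s=1200.0):
    """Return stay regions as (m, n) index pairs into points.

    points is a time-ordered list of (t_seconds, lat, lon).
    Assumed rule: points[m..n] all lie within max_dist_m of points[m]
    and span at least min_stay_s seconds.
    """
    regions = []
    i = 0
    while i < len(points):
        j = i + 1
        while (j < len(points)
               and haversine_m(points[i][1:], points[j][1:]) <= max_dist_m):
            j += 1
        # points[i..j-1] are all within max_dist_m of points[i]
        if points[j - 1][0] - points[i][0] >= min_stay_s:
            regions.append((i, j - 1))
            i = j
        else:
            i += 1
    return regions


def subtract_stay_regions(points, regions):
    """Cut the trajectory so that one route ends at P_m and the next
    starts at P_(n+1), dropping the points strictly inside the region."""
    routes, start = [], 0
    for m, n in regions:
        routes.append(points[start:m + 1])
        start = n + 1
    routes.append(points[start:])
    return [r for r in routes if r]
```

Keeping the pair (m, n) rather than a single centroid preserves where the user entered and left the region, which matches the representation described on the slide.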

Routes Processing Steps(2/3)

Grids Mapping
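The figure for this slide is not preserved, so the following Python sketch only illustrates the idea stated earlier: combining time information with grids to build a series of temporal grids. The function names are hypothetical, and reading the "10 sec" grid size from the experiments as 10 arc-seconds is an assumption.

```python
import math


def to_grid(lat, lon, cell_arcsec=10.0):
    """Map a coordinate to integer (row, col) indices of a square grid.

    cell_arcsec is the cell size in arc-seconds (an assumed unit).
    """
    cell_deg = cell_arcsec / 3600.0
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def to_temporal_grids(points, cell_arcsec=10.0):
    """Convert time-ordered (t, lat, lon) points into (grid, t_enter) pairs.

    Consecutive points that fall into the same cell are collapsed, and
    the time of the first point in the cell is kept as the entry time.
    """
    grids = []
    for t, lat, lon in points:
        cell = to_grid(lat, lon, cell_arcsec)
        if not grids or grids[-1][0] != cell:
            grids.append((cell, t))
    return grids
```

Collapsing consecutive points into one cell is what makes the representation compact: a route becomes a short sequence of cells instead of a long sequence of raw GPS fixes.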

Routes Processing Steps(3/3)

Routes Splitting

Regular Routes Mining(1/2)

An RR is a complete route that a user frequently travels at approximately the same time of day.

Two parameters are used: the first decides the frequency of a route, and the second decides what counts as a similar time.

Regular Routes Mining(2/2)

Routes Grouping
It is difficult to extract RRs from all routes directly.
But an RR should always happen at a similar time of day.
So routes are grouped not only by the time of day but also by the day of the week.
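A minimal Python sketch of such grouping is shown below. The exact grouping rule is not given in the slides, so the choices here are assumptions: fixed time slots of `slot_minutes` minutes, and a weekday/weekend split as the "day of the week" criterion. The function name `group_routes` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def group_routes(routes, slot_minutes=60):
    """Group routes by (day type, time-of-day slot) of their start time.

    routes is a list of (route_id, start_datetime) pairs.
    """
    groups = defaultdict(list)
    for route_id, start in routes:
        day_type = "weekend" if start.weekday() >= 5 else "weekday"
        slot = (start.hour * 60 + start.minute) // slot_minutes
        groups[(day_type, slot)].append(route_id)
    return dict(groups)
```

Fixed slots are the simplest option but split routes that start just before and just after a slot boundary; a sliding window around each start time would avoid that at the cost of overlapping groups.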

Regular routes Finding(1/7)

After grids mapping, the trajectories are formed as R1, R2 and R3 in Figure 4(b).

Regular routes Finding(2/7)

We say a DE is an FDE if DE.num is larger than a threshold.
An RR is a route that is frequently traveled by several complete routes, not just by some parts of a route. This means we should not use FDEs directly to represent an RR.

Regular routes Finding(3/7)

In a set of t-Routes, FDEs may exist without an RR. But if there is an RR, the RR will have large parts in common with the FDEs.

Regular routes Finding(4/7)

A frequency-based regular route mining method

1. Calculate frequent coefficient (FC) of each route

The frequent coefficient is defined as fc(R) = m/n, where n is the number of DEs in the route R and m is the number of FDEs in the route R.

2. Find frequent routes
A route with fc(R) > fc_thresh is deemed a frequent route.

3. Calculate regular coefficient (RC) of each FDE

Regular routes Finding(5/7)

A frequency-based regular route mining method

4. Find Regular FDEs (RFDE)

5. Use RFDEs instead of FDEs to repeat step 2 to 4.

Regular routes Finding(6/7)

In Figure 5(a), both R2 and R4 pass the DE (JS->JT), but since R4 is not a frequent route, it makes no contribution to an RR, so DE (JS->JT) cannot be an RFDE.
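Steps 1 to 5 and the example above can be sketched in Python as follows. Several parts are assumptions, because the slide text does not define them: a DE is taken to be a pair of consecutive grids, DE.num is taken to be the number of routes passing the DE, and the regular coefficient rc(DE) is taken to be the number of frequent routes passing the DE, compared against the same threshold. The names `mine_regular_edges`, `min_count` and `fc_thresh` are hypothetical.

```python
from collections import Counter


def edges(route):
    """Distinct directed edges (g_i, g_i+1) of a grid sequence."""
    return set(zip(route, route[1:]))


def fc(route, fdes):
    """Frequent coefficient fc(R) = m / n (n DEs in R, m of them FDEs)."""
    e = edges(route)
    return len(e & fdes) / len(e) if e else 0.0


def mine_regular_edges(routes, min_count, fc_thresh):
    """routes maps a route name to its grid sequence.

    Returns (regular edges, names of the routes that support them).
    """
    # Initial FDEs: DE.num larger than the threshold.
    counts = Counter(e for r in routes.values() for e in edges(r))
    current = {e for e, c in counts.items() if c > min_count}
    while True:
        # Steps 1-2: routes with fc(R) > fc_thresh are frequent routes.
        support = {name for name, r in routes.items()
                   if fc(r, current) > fc_thresh}
        # Step 3 (assumed definition): count only frequent routes per edge.
        rc = Counter(e for name in support for e in edges(routes[name]))
        # Step 4: keep the edges that are still supported often enough.
        regular = {e for e in current if rc[e] > min_count}
        # Step 5: repeat with RFDEs in place of FDEs until stable.
        if regular == current:
            return current, support
        current = regular
```

The loop terminates because the edge set can only shrink from one pass to the next. With routes built like the Figure 5(a) case, an edge shared by one frequent route and one non-frequent route is an FDE at first but is dropped in step 4, because the non-frequent route no longer counts toward it.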

Regular routes Finding(7/7)

Add a time property (ts, td) to each RR, where ts and td denote the start time and the duration of the route, respectively.

where n is the number of the support routes of an RR.
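The formula itself is not preserved in the slide text. One plausible reading, assumed here and not confirmed by the source, is that ts and td are arithmetic means over the n support routes. The function name `time_property` is hypothetical.

```python
def time_property(support_routes):
    """Return (ts, td) for an RR as means over its support routes.

    support_routes is a list of (start_seconds_since_midnight,
    duration_seconds) pairs. Averaging is an assumed aggregation.
    """
    n = len(support_routes)
    if n == 0:
        raise ValueError("an RR needs at least one support route")
    ts = sum(start for start, _ in support_routes) / n
    td = sum(duration for _, duration in support_routes) / n
    return ts, td
```

A plain mean of start times breaks down for routes that straddle midnight; that case would need a circular mean and is left out of this sketch.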

Travel Modes Recognizing (1/3)

For the purpose of making ridesharing recommendations, there are two transport modes:
public transport
private driving

Travel Modes Recognizing (2/3)

Distinguishing the transport modes: which one is public transport?
Public transport stops frequently at fixed regions such as bus stops or subway stations.
An RFDE with SP lower than a threshold is then a stop region.

Travel Modes Recognizing (3/3)

Fixed stop rate (FSR): the number of stop regions within a certain distance.
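As a small Python illustration, FSR can be computed as stop regions per kilometre and compared against a threshold. The normalization by route length, the threshold value `fsr_thresh`, and the function names are assumptions; the slides only state that public transport stops frequently at fixed regions, which suggests that a higher FSR points to public transport.

```python
def fixed_stop_rate(num_stop_regions, route_length_km):
    """Stop regions per kilometre of route (assumed normalization)."""
    if route_length_km <= 0:
        raise ValueError("route length must be positive")
    return num_stop_regions / route_length_km


def travel_mode(num_stop_regions, route_length_km, fsr_thresh=0.5):
    """Classify an RR by its FSR; fsr_thresh is a hypothetical value."""
    fsr = fixed_stop_rate(num_stop_regions, route_length_km)
    return "public transport" if fsr >= fsr_thresh else "private driving"
```

In practice the threshold would have to be tuned on labeled routes, as the accuracy figure reported in the experiments implies.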

RIDESHARING RECOMMENDATIONS

Grid-based Routes Table Building
If a route is generated by public transportation, it is recorded only in its origin and destination grids.

RIDESHARING RECOMMENDATIONS

Routes Matching

Two kinds of car sharing:
Public transportation
Private driving

If a query route is generated by public transport, only routes generated by driving can be recommended.
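The table building and matching rules can be sketched together in Python. The data layout, the function names, the start-time tolerance `max_start_diff`, and the direction check are assumptions for illustration. The two rules taken from the slides are that public transport routes are recorded only at their origin and destination grids, and that a public transport query is matched only against driving routes.

```python
from collections import defaultdict


def build_routes_table(routes):
    """Build a grid -> set of route names index.

    routes maps a name to {"grids": [...], "mode": "public" or
    "driving", "ts": start time in seconds since midnight}.
    """
    table = defaultdict(set)
    for name, r in routes.items():
        grids = r["grids"]
        # Public transport routes are indexed at origin and destination only.
        cells = (grids[0], grids[-1]) if r["mode"] == "public" else grids
        for g in cells:
            table[g].add(name)
    return table


def match(query, routes, table, max_start_diff=900):
    """Return the names of routes that could share a ride with query."""
    q = routes[query]
    origin, dest = q["grids"][0], q["grids"][-1]
    candidates = (table.get(origin, set()) & table.get(dest, set())) - {query}
    result = []
    for name in sorted(candidates):
        c = routes[name]
        if q["mode"] == "public" and c["mode"] != "driving":
            continue  # rule from the slides
        if abs(c["ts"] - q["ts"]) > max_start_diff:
            continue  # assumed notion of "similar time"
        if c["grids"].index(origin) >= c["grids"].index(dest):
            continue  # assumed: candidate must travel in the same direction
        result.append(name)
    return result
```

Looking up only the origin and destination grids of the query keeps matching cheap: it is two set lookups and an intersection, regardless of route length.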

RIDESHARING RECOMMENDATIONS

Flow chart of the process

EXPERIMENTS DISCUSSION

The dataset consists of 178 users' real trips over a period of 4 years (from 2007 to 2011).
Most of the time, we treat all mined RRs as different users' RRs.

Experiment Result

Influence of grid size
The smaller the grid size, the more storage space is needed.
Too large a grid size loses some details.
10 sec was chosen as the final grid size in our experiment.

Experiment Result

Of 9 routes at a similar time, 3 are support routes of the RR.
The method is robust to slight disturbance.
Comparison with the ANTrip trajectory.

Experiment Result

The trajectories are generated by bus.

Only two one RR for the user (according to three bus routes(a))

Experiment Result

Some routes are too short to make ridesharing worthwhile.
RRs are dense in the north of Beijing, around Microsoft Research.

Experiment Result

FSR is used to distinguish travel modes between public transportation and driving.
The accuracy can reach 0.876.

Experiment Result

Routes matching.
(a) and (c) are public transportation.
(b) and (d) are driving.

Experiment Result

The storage requirement of the proposed method.
The first row is the number of records.
The second row is the storage ratio relative to the number of records in the original dataset.
The storage requirement is quite lightweight.

A frequency-based regular route mining algorithm is proposed:
Each part of a regular route must be visited frequently.
A regular route should be frequently visited by some complete routes, called support routes.
Most parts of a support route must pass through the frequently visited regions.

A feature (FSR) is identified to distinguish travel modes between public transportation and individual driving.
The method is evaluated on a real-world GPS dataset, which consists of 178 users over a period of 4 years.

Conclusion

Future work
More flexible ridesharing strategies will be considered in our future work.
Find a route that reaches the user's nearest subway station.

BACKUP

Presented by Ivan Chiou