TRACLASS: TRAJECTORY CLASSIFICATION USING HIERARCHICAL REGION-BASED AND TRAJECTORY-BASED CLUSTERING JAE-GIL LEE, JIAWEI HAN, XIAOLEI LI, HECTOR GONZALEZ UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN VLDB 2008



TRACLASS: Trajectory Classification Using Hierarchical Region-Based and Trajectory-Based Clustering

Jae-Gil Lee, Jiawei Han, Xiaolei Li, Hector Gonzalez
University of Illinois at Urbana-Champaign

VLDB 2008 (slides dated 2008-08-28)

Outline
  • Motivation
  • TraClass: Trajectory Feature Generation
  • Trajectory Partitioning
  • Region-Based Clustering
  • Trajectory-Based Clustering
  • Classification Strategy
  • Performance Evaluation
  • Related Work
  • Conclusions

Classification
  • Feature Generation

  [Diagram labels: Classifier; Class label; Training data; Features; Prediction; Unseen data (Jeff, Professor, 4, ?); Tenured = Yes; Scope of this paper]

Trajectory Data
  • A trajectory is a sequence of the location and timestamp of a moving object

  [Figure examples: Hurricanes, Turtles, Vessels, Vehicles]

Trajectory Classification
  • Definition: The process of predicting the class labels of moving objects based on their trajectories and other features

  • Applications: Homeland security, weather forecasting, law enforcement, etc.
  • Example: Detection of vessel types (e.g., container ships, tankers, and fishing boats) from satellite images

Previous Studies
  • Several trajectory classification methods have been proposed, mainly in the fields of pattern recognition, bioengineering, and video surveillance

  • A common characteristic of earlier methods is that they use the shapes of whole trajectories to do classification, e.g., by using the HMM
  • Note: Although a few methods partition trajectories, the purpose of their partitioning is just to approximate or smooth trajectories

Problem Statement and Observations
  • Problem Statement: Given a set of labeled trajectories, generate discriminative trajectory features that make a specific class distinguishable from other classes

  • Observations:
      (1) Discriminative features are likely to appear at parts of trajectories, not at whole trajectories
      (2) Discriminative features appear not only as common movement patterns, but also as regions

Motivating Example
  • Observation 1: Parts of trajectories near the container port and near the refinery enable us to distinguish between container ships and tankers even if they share common long paths
  • Observation 2: Those in the fishery enable us to recognize fishing boats even if they have no common path there

  [Figure labels: Region; Sub-trajectory]

Limitations of Earlier Methods
  • The classification accuracy of earlier methods might not be high since the overall shapes of whole trajectories are similar to each other
  • Our framework TraClass aims at discovering both region and sub-trajectory features

  [Figure label: Overall shape]

Overall Procedure of TraClass
  • Extract features in a top-down fashion, first by region-based clustering and then by trajectory-based clustering

  [Diagram labels: Region-Based Clustering; Trajectory-Based Clustering; Trajectory partitions in non-homogeneous regions; Region-based and Trajectory-based clusters; Trajectory partitions; Recursively quantize non-homogeneous regions; Repeatedly find finer-granularity clusters]

Our Contributions
  • Achieve high classification accuracy owing to the collaboration between the two types of clustering
  • Region features: Region-based clustering
  • Sub-trajectory features: Trajectory partitioning and trajectory-based clustering

Where We Are Now
  [Overall procedure diagram repeated]

Class-Conscious Trajectory Partitioning
  1. Trajectories are partitioned based on their shapes as in the partition-and-group framework [12]
  2. Trajectory partitions are further partitioned by the class labels
  • The real interest here is to guarantee that trajectory partitions do not span the class boundaries
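One way to test whether a trajectory partition spans a class boundary is to compare the most prevalent class near each of its endpoints. The following is a minimal sketch of that check, not the authors' implementation. The names `prevalent_class` and `needs_further_partitioning`, the circular neighborhood of a given `radius`, and the use of plain Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from math import dist

def prevalent_class(point, labeled_points, radius):
    """Most common class label among labeled points within `radius` of `point`.

    Returns None when no labeled point falls inside the neighborhood.
    """
    nearby = [label for xy, label in labeled_points if dist(point, xy) <= radius]
    if not nearby:
        return None
    return Counter(nearby).most_common(1)[0][0]

def needs_further_partitioning(start, end, labeled_points, radius):
    """True when the prevalent classes around the two endpoints differ."""
    a = prevalent_class(start, labeled_points, radius)
    b = prevalent_class(end, labeled_points, radius)
    return a is not None and b is not None and a != b

# Hypothetical data: class A dominates near x = 0, class B near x = 10.
points = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "B"),
          ((10, 0), "B"), ((9, 0), "B"), ((10, 1), "A")]

print(needs_further_partitioning((0, 0), (10, 0), points, radius=2))
print(needs_further_partitioning((0, 0), (1, 0), points, radius=2))
```

The first call describes a partition running from a mostly-A area to a mostly-B area, so it is flagged for further partitioning; the second stays inside the mostly-A area and is left alone.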

  [Figure labels: Additional partitioning points; Non-discriminative; Discriminative; Class A; Class B]

Partitioning Condition
  • If the most prevalent class around one endpoint is different from that around the other endpoint, further partition it
  • Example: [Figure labels: Class A; Class B; Prevalent class = Class A; Prevalent class = Class B; Need to be further partitioned]

Where We Are Now
  [Overall procedure diagram repeated]

Region-Based Clustering
  • Discover regions that have trajectories mostly of one class regardless of their movement patterns
  • The region-based cluster is a set of trajectory partitions of the same class within a rectangular region regardless of their movement patterns

  [Figure panels: (1), (2)]

Desirable Properties of Region-Based Clustering
  • Homogeneity: The class distribution in each region should be as homogeneous as possible
  • Conciseness: The number of regions should be as small as possible
  • Note: The two properties contradict each other

  • Need to find a good tradeoff between the properties
  [Figure labels: One large region; Many small regions; homogeneity; conciseness]

Translation into MDL Optimization
  • The minimum description length (MDL) cost consists of the description cost and the code cost
  • The former measures conciseness, and the latter homogeneity

  • The best hypothesis is the one that minimizes the sum of the description cost and the code cost
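Written in the usual MDL notation, the objective looks as follows. The symbols are assumptions introduced here, since the slides do not show them: H is a candidate quantization (the hypothesis), D is the class-labeled trajectory data, L(H) is the description cost, and L(D | H) is the code cost.

```latex
\mathrm{cost}(H) \;=\; \underbrace{L(H)}_{\text{description cost (conciseness)}}
\;+\; \underbrace{L(D \mid H)}_{\text{code cost (homogeneity)}},
\qquad
H^{*} \;=\; \arg\min_{H} \bigl[\, L(H) + L(D \mid H) \,\bigr]
```

Adding regions raises L(H) but can lower L(D | H) by making each region purer, which is how the single sum captures the tradeoff between the two properties.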

  • Finding a good quantization translates to finding the best hypothesis using the MDL principle

Region-Based Clustering Algorithm
  • Progressively find a better partitioning alternately for the X axis and for the Y axis as long as the MDL cost decreases
  • Select the partition that has the maximum code cost and divide it into two parts in order to decrease the MDL cost
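The greedy loop described above can be sketched roughly as follows. This is a simplified illustration under several assumptions that are not taken from the slides: the code cost of a cell is taken to be n times the entropy of its class labels, the description cost is a fixed `bits_per_cut` per axis cut, a strip is split at the midpoint of its occupied range, and all function names are hypothetical. The paper's actual cost formulas may differ.

```python
from bisect import bisect_right
from collections import Counter, defaultdict
from math import log2

def code_cost(labels):
    """Bits to encode the class labels of one cell: n * entropy (assumed form)."""
    n = len(labels)
    return -sum(c * log2(c / n) for c in Counter(labels).values()) if n else 0.0

def cells(points, cuts):
    """Group labels by grid cell; cuts = {"x": [...], "y": [...]} with sorted cut positions."""
    grid = defaultdict(list)
    for x, y, label in points:
        grid[(bisect_right(cuts["x"], x), bisect_right(cuts["y"], y))].append(label)
    return grid

def mdl_cost(points, cuts, bits_per_cut):
    """Description cost (grows with the number of cuts) plus total code cost."""
    description = bits_per_cut * (len(cuts["x"]) + len(cuts["y"]))
    return description + sum(code_cost(v) for v in cells(points, cuts).values())

def try_split(points, cuts, axis, bits_per_cut):
    """Split the strip with the largest code cost along `axis`; keep the cut only if MDL drops."""
    idx = 0 if axis == "x" else 1
    strip_cost, strip_coords = defaultdict(float), defaultdict(list)
    for cell, labels in cells(points, cuts).items():
        strip_cost[cell[idx]] += code_cost(labels)
    for p in points:
        strip_coords[bisect_right(cuts[axis], p[idx])].append(p[idx])
    worst = max(strip_cost, key=strip_cost.get)
    lo, hi = min(strip_coords[worst]), max(strip_coords[worst])
    if lo == hi:                      # nothing to separate in this strip
        return False
    candidate = dict(cuts, **{axis: sorted(cuts[axis] + [(lo + hi) / 2])})
    if mdl_cost(points, candidate, bits_per_cut) < mdl_cost(points, cuts, bits_per_cut):
        cuts[axis] = candidate[axis]
        return True
    return False

def quantize(points, bits_per_cut=2.0):
    """Alternate between the X and Y axes until neither split lowers the MDL cost."""
    cuts = {"x": [], "y": []}
    improved = True
    while improved:
        improved = False
        for axis in ("x", "y"):
            if try_split(points, cuts, axis, bits_per_cut):
                improved = True
    return cuts

# Hypothetical data: class A lies left of x = 5, class B right of it.
sample = [(1, 1, "A"), (2, 5, "A"), (3, 9, "A"),
          (7, 1, "B"), (8, 5, "B"), (9, 9, "B")]
print(quantize(sample))
```

On this toy input a single cut on the X axis already makes both regions pure, so any further cut would only add description cost and is rejected.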

  [Figure panels: (1), (2), (3), (4)]

Where We Are Now
  [Overall procedure diagram repeated]

Trajectory-Based Clustering
  • Discover sub-trajectories that indicate common movement patterns of each class
  • The trajectory-based cluster is a set of trajectory partitions of the same class which share a common movement pattern

  [Figure panels: (3), (4)]

Trajectory-Based Clustering Algorithm
  • Similar to our trajectory clustering algorithm [12], but incorporates the class labels into clustering
  • The algorithm is based on DBSCAN [5]
  • If an ε-neighborhood contains trajectory partitions mostly of the same class, it is used for clustering; otherwise, it is discarded immediately
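The class-aware neighborhood filter can be illustrated with a small sketch. It makes assumptions that go beyond the slides: distance between two trajectory partitions is approximated by the Euclidean distance between segment midpoints (a stand-in for the line-segment distance of [12]), "mostly of the same class" is modeled by a hypothetical `min_purity` threshold, and the function names are invented for this example. Only the filtering step is shown, not the full DBSCAN-style cluster expansion.

```python
from collections import Counter
from math import dist

def midpoint(segment):
    """Midpoint of a line segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = segment
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def epsilon_neighborhood(i, segments, eps):
    """Indices of segments whose midpoint lies within eps of segment i's midpoint (includes i)."""
    center = midpoint(segments[i])
    return [j for j, s in enumerate(segments) if dist(center, midpoint(s)) <= eps]

def is_homogeneous(neighborhood, labels, min_purity=0.8):
    """True when the majority class makes up at least `min_purity` of the neighborhood."""
    counts = Counter(labels[j] for j in neighborhood)
    return counts.most_common(1)[0][1] / len(neighborhood) >= min_purity

def usable_neighborhoods(segments, labels, eps, min_purity=0.8):
    """Map each segment index to its neighborhood, discarding non-homogeneous ones."""
    kept = {}
    for i in range(len(segments)):
        hood = epsilon_neighborhood(i, segments, eps)
        if is_homogeneous(hood, labels, min_purity):
            kept[i] = hood
    return kept

# Hypothetical data: three class-A segments close together, plus a mixed A/B pair far away.
segments = [((0, 0), (1, 0)), ((0, 0.2), (1, 0.2)), ((0, 0.4), (1, 0.4)),
            ((10, 0), (11, 0)), ((10, 0.2), (11, 0.2))]
labels = ["A", "A", "A", "A", "B"]
print(usable_neighborhoods(segments, labels, eps=1.0))
```

Only the neighborhoods of the three nearby class-A segments survive; the mixed pair is dropped before any cluster is grown from it.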

  [Figure labels: Non-homogeneous ε-neighborhood; Homogeneous ε-neighborhood; L1; L2; X; O]

Selection of Trajectory-Based Clusters
  • After trajectory-based clusters are found, discriminative clusters are selected for effective classification
  • If the average distance to other clusters of different classes is high, the discriminative power of the cluster is high
  • Example:

  [Figure labels: C1; C2; Class A; Class B]
  C1 is more discriminative than C2

Generation of Cluster Links
  • A cluster link is a sequence of connectable (i.e., consecutive) trajectory-based clusters
  • Two clusters are connectable if they share enough trajectories (more formally, the ratio of common trajectories is higher than a threshold)
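As a rough illustration of connectability, the sketch below chains clusters into links. Several details are assumptions rather than the paper's definitions: each cluster is represented only by the set of trajectory IDs it contains, the ratio of common trajectories is taken relative to the smaller cluster, clusters are assumed to be non-empty and already ordered along the direction of movement, and the names `common_ratio` and `cluster_links` are hypothetical.

```python
def common_ratio(a, b):
    """Share of trajectories common to both clusters, relative to the smaller one (assumed definition)."""
    return len(a & b) / min(len(a), len(b))

def cluster_links(ordered_clusters, threshold):
    """Chain neighbouring clusters into maximal links.

    `ordered_clusters` is a list of (name, set_of_trajectory_ids) pairs.
    A chain of length 1 is not reported as a link.
    """
    links, current = [], []
    for name, members in ordered_clusters:
        if current and common_ratio(current[-1][1], members) > threshold:
            current.append((name, members))      # connectable: extend the current chain
        else:
            if len(current) > 1:
                links.append([n for n, _ in current])
            current = [(name, members)]          # start a new chain
    if len(current) > 1:
        links.append([n for n, _ in current])
    return links

# Hypothetical clusters identified by the trajectory IDs passing through them.
clusters = [("C1", {1, 2, 3, 4}), ("C2", {2, 3, 4, 5}),
            ("C3", {4, 6, 7}), ("C4", {6, 7, 8})]
print(cluster_links(clusters, threshold=0.5))
```

Here C1 and C2 share three of four trajectories and C3 and C4 share two of three, so both pairs are linked, while C2 and C3 share only one trajectory and break the chain.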

  • The benefit of cluster links is that they also yield whole-trajectory features
  • Cluster links are added to the set of trajectory-based clusters for use in classification

Classification Strategy
  1. Partition trajectories by considering the class labels
  2. Perform region-based clustering
  3. Perform trajectory-based clustering
  4. Select discriminative trajectory-based clusters
  5. Find cluster links from trajectory-based clusters
  6. Convert each trajectory into a feature vector
      • Each feature is either a region-based cluster or a trajectory-based cluster
      • The i-th entry of a feature vector is the frequency with which the i-th feature occurs in the trajectory
  7. Feed the feature vectors to the SVM

Experimental Setting (1/2)
  • Use three real trajectory data sets
      • Animal movement data set
          • Movements of elk, deer, and cattle for the years 1993 through 1996
          • Three classes: Elk, Deer, and Cattle
          • Number of trajectories (points): 38 (7117), 30 (4333), and 34 (3540)
      • Vessel navigation data set
          • Navigation paths of two vessels in August 2000
          • Two classes: Point Lobos and Point Sur
          • Number of trajectories (points): 600 (65500) and 550 (125750)
      • Hurricane track data set
          • Atlantic hurricanes for the years 1950 through 2006
          • Two classes: Category 2 and Category 3
          • Number of trajectories (points): 61 (2459) and 72 (3126)

  • Randomly select 20% of trajectories for the test set

Experimental Setting (2/2)
  • Measure classification accuracy, training time, and prediction time for the three data sets

  • Compare two versions of the algorithm
      • TB-ONLY: Perform trajectory-based clustering only
      • RB-TB: Perform both types of clustering
  • TB-ONLY is expected to be no worse than earlier methods since it also discovers whole-trajectory features by cluster-link generation
  • Classification accuracy = (# of test trajectories correctly classified) / (total # of test trajectories)

Overall Results

  Data Set               Animal              Vessel              Hurricane
  Version                TB-ONLY   RB-TB     TB-ONLY   RB-TB     TB-ONLY   RB-TB
  Accuracy (%)           50.0      83.3      84.4      98.2      65.4      73.1
  Training Time (ms)     3542      2406      44683     22902     331       317
  Prediction Time (ms)   104       98        722       608       48        46

  • The classification accuracy of RB-TB is much higher than that of TB-ONLY
  • The training time of RB-TB is much shorter than that of TB-ONLY
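Step 6 of the classification strategy and the accuracy measure used in these results can be sketched together. The sketch assumes that each trajectory has already been matched against the selected clusters, so that it arrives as a list of feature IDs (one entry per occurrence). The IDs, the function names, and the class labels in the usage lines are hypothetical, and the SVM of step 7 is not shown.

```python
from collections import Counter

def to_feature_vector(trajectory_feature_hits, feature_ids):
    """Frequency vector: entry i counts how often feature_ids[i] occurs in the trajectory."""
    counts = Counter(trajectory_feature_hits)
    return [counts[f] for f in feature_ids]

def accuracy(predicted, actual):
    """Fraction of test trajectories whose predicted class matches the true class."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical features: two region-based clusters (R1, R2) and two trajectory-based clusters (T1, T2).
feature_ids = ["R1", "R2", "T1", "T2"]
hits = ["R1", "T1", "T1"]          # the trajectory touches R1 once and T1 twice
print(to_feature_vector(hits, feature_ids))

# Hypothetical predictions for three test trajectories.
print(accuracy(["elk", "deer", "elk"], ["elk", "deer", "cattle"]))
```

Each trajectory becomes a fixed-length vector regardless of how many points it has, which is what lets a standard classifier such as an SVM consume trajectories of different lengths.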

Features for the Animal Data

  • Data: Three classes
  • Features: 10 region-based clusters, 37 trajectory-based clusters
  [Figure legend: Red: Elk; Blue: Deer; Black: Cattle]
  • Accuracy = 83.3%

Features for the Hurricane Data

  [Figure label: Gulf of Mexico]
  • 1 region-based cluster
  • 15 trajectory-based clusters
  [Figure legend: Red: Category 2; Blue: Category 3]
  • Stronger hurricanes tend to go further than weaker ones
  • These hurricanes entered the Gulf of Mexico and thus stayed longer at sea before landfall than others; they are likely to get strong because hurricanes gain energy from the evaporation of warm ocean water

Results for Synthetic Data
  • Effect of region-based clustering

  • Effect of the data size (scalability test)

Related Work
  • Pattern recognition [1], e.g., speech, handwriting, signature, and gesture recognition
      • Classifying human motion trajectories
      • Employing the hidden Markov model (HMM)
  • Bioengineering [16]
      • Classifying biological motion trajectories
  • Video surveillance [15]
      • Detecting suspicious behaviors of pedestrians
  • Time-series classification [20,21]
  • Moving-object anomaly detection [14]

Conclusions
  • A novel and comprehensive feature generation framework for trajectories has been proposed

  • The primary advantage is the high classification accuracy owing to the collaboration between the two types of clustering

  • Various real-world applications, e.g., vessel classification, can benefit from our framework

Thank You!

Sheet1

  NAME   RANK             YEARS   TENURED
  Mike   Assistant Prof   3       no
  Mary   Assistant Prof   7       yes
  Bill   Professor        2       yes
  Jim    Associate Prof   7       yes
  Dave   Assistant Prof   6       no
  Anne   Associate Prof   3       no