Heterogeneous Data Fusion with Multiple Kernel Growing Self Organizing Maps

Maheshakya Wijewardena[1], Thimal Kempitiya[1], Thilina Rathnayake[1], Kevin Ratnasekera[1], Thushan Ganegedara[1], Amal Perera[1], Damminda Alahakoon[2]

[1] Department of Computer Science and Engineering, University of Moratuwa
[2] La Trobe University, Bundoora, Victoria


Overview

● About data fusion - what, why, how?
● Data fusion in unsupervised learning
● Introduction to Growing Self Organizing Maps
● Kernel methods in Self Organizing Maps
● Data fusion with Multiple Kernel Growing Self Organizing Maps
● Road traffic visual analysis with heterogeneous sources
● Conclusion

What is Data Fusion and why?

● Data fusion is the process of integrating multiple data sources and knowledge concerning the same real-world object in order to obtain information that is more:
  ○ consistent
  ○ accurate
  ○ robust
  ○ descriptive

“Combination of multiple sources to get improved information; i.e. less expensive, higher quality or more relevant information.”

Data Fusion in Unsupervised Learning

● Data fusion in supervised learning has become routine:
  ○ Kernel based methods
  ○ Model ensembles
  ○ …
● Unsupervised learning aims to discover hidden structure in unlabeled data
  ○ Clustering is an ill-posed problem: every solution violates at least one of the common assumptions of scale invariance, richness, and cluster consistency
  ○ Different solutions may look equally plausible without prior knowledge about the underlying data distributions

Goals

● Investigate how data fusion can be applied to unsupervised learning.
● Employ data fusion in a self-adapting structure that is capable of mutating according to the input data.
● Build a single representation for multiple sources of heterogeneous data.
● Improve road traffic visual analysis with data fusion to identify the different levels of congestion more thoroughly.

Growing Self Organizing Maps

● SOM is a vector quantization method and a dimensionality reduction method: it produces a fixed-size 2D grid
● GSOM: a dynamically growing SOM whose structure adapts to the input data
  ○ Initialize the 2D map with a set of nodes (usually 4) with random weights.
  ○ Grow new nodes when the aggregated error in an existing node exceeds a specified tolerance level.
  ○ Stop growing new nodes and smooth the map.
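The growing phase described on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `TinyGSOM`, the fixed learning rate, the omission of neighborhood updates and of the smoothing phase, and the choice to copy the winner's weights into new nodes are all simplifications. The growth threshold GT = -D ln(SF), with D the input dimensionality and SF a spread factor, follows the usual GSOM formulation.

```python
import math
import random


def growth_threshold(dim, spread_factor):
    """Growth threshold GT = -D * ln(SF) used in the usual GSOM formulation."""
    return -dim * math.log(spread_factor)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


class TinyGSOM:
    """Hypothetical, heavily simplified GSOM: growing phase only."""

    def __init__(self, dim, spread_factor=0.5, lr=0.3, seed=0):
        rng = random.Random(seed)
        self.gt = growth_threshold(dim, spread_factor)
        self.lr = lr
        # Start with 4 nodes on a 2x2 grid with random weights.
        self.weights = {
            (gx, gy): [rng.random() for _ in range(dim)]
            for gx in (0, 1)
            for gy in (0, 1)
        }
        self.errors = {pos: 0.0 for pos in self.weights}

    def winner(self, x):
        """Grid position of the node whose weight vector is closest to x."""
        return min(self.weights, key=lambda p: sq_dist(self.weights[p], x))

    def train_step(self, x):
        win = self.winner(x)
        # Accumulate the quantization error at the winner.
        self.errors[win] += sq_dist(self.weights[win], x)
        # Move the winner towards the input (neighbors are ignored here).
        w = self.weights[win]
        self.weights[win] = [wi + self.lr * (xi - wi) for wi, xi in zip(w, x)]
        if self.errors[win] > self.gt:
            self._grow(win)

    def _grow(self, pos):
        """Add nodes in the free 4-neighbor grid positions around pos."""
        px, py = pos
        for nb in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
            if nb not in self.weights:
                # Assumption: new nodes start as copies of the parent weights.
                self.weights[nb] = list(self.weights[pos])
                self.errors[nb] = 0.0
        # The full algorithm spreads the error of interior nodes to their
        # neighbors; this sketch simply resets it.
        self.errors[pos] = 0.0
```

A smaller spread factor gives a larger threshold and therefore a coarser map; a value close to 1 lets the map grow more readily.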

Kernel Self Organizing Maps

What is a kernel?

A kernel is a function $K$ such that for all examples $x$ and $z$ in an input space $X \subset \mathbb{R}^d$:

$K(x, z) = \langle \phi(x), \phi(z) \rangle$

where $\phi$ is a linear or nonlinear mapping from the input space $X$ to the feature space $F$, and $\langle \cdot, \cdot \rangle$ is an inner product.

● Two methods to employ kernels in SOM:
  ○ Type I Kernel
    ■ Depends on the size of the map and the size of the input data
  ○ Type II Kernel
    ■ Can support online learning; no dependency on the size of the map
● These two types have been shown to be equivalent.
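To make the kernel definition concrete, the sketch below checks it numerically for one case where the feature map can be written out explicitly. The function names are illustrative only; the quadratic kernel is chosen simply because its feature map is small enough to spell out, and it is not one of the kernels used later in the slides.

```python
import math


def linear_kernel(x, z):
    """K(x, z) = <x, z>; the feature map is the identity."""
    return sum(xi * zi for xi, zi in zip(x, z))


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; its feature space is infinite-dimensional."""
    return math.exp(-gamma * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))


def poly2_kernel(x, z):
    """Homogeneous quadratic kernel K(x, z) = <x, z>^2."""
    return linear_kernel(x, z) ** 2


def poly2_feature_map(x):
    """Explicit feature map of poly2_kernel for 2-D inputs:
    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


x, z = [1.0, 2.0], [3.0, -1.0]
via_kernel = poly2_kernel(x, z)
via_feature_space = linear_kernel(poly2_feature_map(x), poly2_feature_map(z))
# Both routes give the same inner product, up to floating-point rounding.
print(via_kernel, via_feature_space)
```

The point of the kernel is that the left-hand computation never builds $\phi(x)$, which matters when the feature space is very large or, as with the Gaussian kernel, infinite-dimensional.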

Data Fusion with Multiple Kernel GSOM

● For a single kernel:
  ○ Distance calculation
  ○ Determining the winner node
  ○ Updating the error
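The formulas for these three steps are not recoverable from the slide text, so the following is only a plausible sketch, not the authors' exact equations. It assumes node weights are kept in the input space and uses the standard kernel-trick identity $\|\phi(x) - \phi(w)\|^2 = K(x, x) - 2K(x, w) + K(w, w)$ for the distance; the helper names and the dictionary layout are hypothetical.

```python
import math


def linear_kernel(x, z):
    return sum(xi * zi for xi, zi in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    return math.exp(-gamma * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))


def kernel_sq_distance(kernel, x, w):
    """Squared feature-space distance between phi(x) and phi(w),
    computed from kernel evaluations only."""
    return kernel(x, x) - 2.0 * kernel(x, w) + kernel(w, w)


def find_winner(kernel, x, weights):
    """Winner = node whose weight is closest to x in feature space.
    `weights` maps a node id to its weight vector."""
    return min(weights, key=lambda n: kernel_sq_distance(kernel, x, weights[n]))


def accumulate_error(kernel, x, weights, errors):
    """Add the winner's feature-space distance to its accumulated error."""
    win = find_winner(kernel, x, weights)
    errors[win] = errors.get(win, 0.0) + kernel_sq_distance(kernel, x, weights[win])
    return win


weights = {"a": [0.0, 0.0], "b": [1.0, 1.0]}
errors = {}
winner = accumulate_error(rbf_kernel, [0.9, 1.0], weights, errors)
print(winner, errors)
```

With the linear kernel the distance reduces to the ordinary squared Euclidean distance, so the usual GSOM behaviour is recovered as a special case.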

Data Fusion with Multiple Kernel GSOM

● Employing multiple kernels:
  ○ Node vectors have the aggregated dimensionality of all input data sources
  ○ A function $K_d : G_d \times G_d \to \mathbb{R}$ for each data source $d$, such that $K_d$ is a symmetric and positive kernel.
  ○ A convex combination of kernels gives a global kernel
  ○ Update rule of kernel coefficients: a stochastic gradient descent procedure
  ○ where V(t) is a standard rate that decreases with time t, x is the input vector, w is the node weight, LR is a predefined learning rate and,
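A convex combination of kernels has the form $K = \sum_d \alpha_d K_d$ with $\alpha_d \ge 0$ and $\sum_d \alpha_d = 1$. The sketch below shows this combination together with one plausible gradient step on the coefficients. The exact update rule from the slide is not recoverable, so the following are assumptions: the gradient of the fused distance $\sum_d \alpha_d\, d_d$ with respect to $\alpha_d$ is taken to be the per-source distance $d_d$, `rate` stands in for the decreasing rate V(t), and the clip-and-renormalize step is a simple way to stay on the simplex rather than an exact Euclidean projection.

```python
import math


def linear_kernel(x, z):
    return sum(xi * zi for xi, zi in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    return math.exp(-gamma * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))


def global_kernel(kernels, alphas, xs, zs):
    """K(x, z) = sum_d alpha_d * K_d(x_d, z_d).
    xs and zs hold one vector per data source."""
    return sum(a * k(x, z) for a, k, x, z in zip(alphas, kernels, xs, zs))


def per_source_sq_distances(kernels, xs, ws):
    """Feature-space squared distance between input and node, per source."""
    return [k(x, x) - 2.0 * k(x, w) + k(w, w) for k, x, w in zip(kernels, xs, ws)]


def update_alphas(alphas, grads, rate):
    """One gradient step on the kernel coefficients, then clip negative
    values and renormalize so the result is again a convex combination."""
    stepped = [max(a - rate * g, 0.0) for a, g in zip(alphas, grads)]
    total = sum(stepped)
    if total == 0.0:
        # Degenerate case: fall back to uniform coefficients.
        return [1.0 / len(alphas)] * len(alphas)
    return [s / total for s in stepped]


kernels = [linear_kernel, rbf_kernel]
alphas = [0.5, 0.5]
x_parts = [[1.0, 2.0], [0.2, 0.1]]   # one vector per data source
w_parts = [[0.8, 1.5], [0.2, 0.4]]   # matching slices of a node vector
grads = per_source_sq_distances(kernels, x_parts, w_parts)
alphas = update_alphas(alphas, grads, rate=0.1)
print(alphas)
```

Under these assumptions, a source whose kernel places the input far from the winning node loses weight relative to the others, and the coefficients always sum to one.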

A Framework for Multiple Kernel GSOM Based Clustering of Multiple Data Sources

Data Fusion of Road Traffic Video Data with Multiple Kernel GSOM

● UCSD highway traffic video data set: recorded over two days by a stationary camera overlooking I-5 in Seattle, Washington, totaling 20 minutes of video.
● 3 congestion levels:
  ○ Heavy: 44
  ○ Medium: 45
  ○ Low: 165

Data Fusion of Road Traffic Video Data with Multiple Kernel GSOM

● Video feature extraction: motion
  ○ Optical flow based features - Histogram of Optical Flow (HOF) descriptor
  ○ Change direction based features - Frame difference (FD)
● HOF - 400 to 1600 dimensions - Linear Kernel
● Frame difference - 40 dimensions - Gaussian (RBF) Kernel

Data fusion in MK-GSOM involves both HOF features with a Linear Kernel and Frame difference features with a Gaussian Kernel.
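The slides do not say how the two descriptors were computed, so the sketch below only illustrates the basic building blocks under assumptions: a single orientation histogram over given flow vectors, weighted by magnitude, and a mean absolute difference between two grayscale frames. How these were arranged into the 400 to 1600 and 40 dimensional descriptors is not stated, and the function names and the default of 8 bins are hypothetical.

```python
import math


def hof_histogram(flow, n_bins=8):
    """Orientation histogram of optical flow vectors (dx, dy), weighted by
    flow magnitude and normalized to sum to 1."""
    hist = [0.0] * n_bins
    for dx, dy in flow:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % (2.0 * math.pi)  # map to [0, 2*pi)
        # min() guards against a rounding edge case at exactly 2*pi.
        b = min(int(angle / (2.0 * math.pi) * n_bins), n_bins - 1)
        hist[b] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0.0 else hist


def frame_difference(prev, curr):
    """Mean absolute intensity change between two equally sized grayscale
    frames, each given as a list of rows."""
    count = 0
    acc = 0.0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            acc += abs(c - p)
            count += 1
    return acc / count if count else 0.0


flow = [(1.0, 0.1), (-1.0, 0.1)]   # mostly rightward and mostly leftward
print(hof_histogram(flow))
print(frame_difference([[0, 0], [0, 0]], [[2, 0], [0, 2]]))
```

A longer descriptor could be obtained, for example, by computing such values per spatial cell and concatenating them, but that layout is a guess rather than something the slides confirm.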

Data Fusion of Road Traffic Video Data with Multiple Kernel GSOM

After training MK-GSOM, K-Means clustering is applied to the map in order to identify clusters.

[Figure panels: HOF, FD]
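As a rough illustration of this post-processing step, the sketch below runs plain Lloyd-style K-Means over a list of vectors standing in for the trained node weights. The slides do not say which K-Means variant or initialization was used; taking the first k points as initial centroids is an assumption made only to keep the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm. Returns (labels, centroids).
    Assumption: the first k points serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Stand-in for node weight vectors of a trained map (hypothetical values).
node_weights = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels, centroids = kmeans(node_weights, k=2)
print(labels, centroids)
```

Clustering the node weights rather than the raw inputs keeps this step cheap, since a trained map usually has far fewer nodes than there are input samples.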

Data Fusion of Road Traffic video Data with Multiple Kernel GSOM

Clusters formed by HOF features with Linear Kernel

Clusters formed by FD features with Gaussian Kernel

Clusters formed by Fusion of HOF features with Linear Kernel and FD features with Gaussian Kernel


Conclusions

● Data fusion mechanisms can significantly improve the quality of the results of unsupervised learning algorithms.
● Multiple kernel algorithms provide a means of smoothly integrating fusion techniques into existing learning algorithms.
● MK-GSOM successfully adopts data fusion into its adaptive structure.
● The experimental results on road traffic video data demonstrate the effectiveness of this approach in the application of heterogeneous data fusion.
● Ability to identify novel clusters which otherwise could not have been identified with individual data sources.
● Challenges: Tuning hyperparameters of GSOM and selecting appropriate kernels.
● Limitations:
  ○ When a large number of data sources are available, training time can grow rapidly.


Q&A

Thank you...
