A Parallel GPU Solution to the Maximal Clique Enumeration Problem for CBIR

Dr. Christopher Henry, P. Eng.
Dept. of Applied Computer Science
University of Winnipeg
515 Portage Ave.
Winnipeg, MB, Canada R3B 2E9

GTC 2014

Introduction
• Goal: Quantify the nearness (or apartness) of sets of objects based on their descriptions
• Tolerance near set theory provides a framework for this assessment
• Approach is dependent on finding all the tolerance classes on a set of objects
• The problem of finding tolerance classes has recently been mapped to performing Maximal Clique Enumeration (MCE)
• Focus: An efficient method for finding all tolerance classes on a set of objects
• Application: Content-Based Image Retrieval

Tolerance Near Sets
• Goal: Determine the perceptual nearness of disjoint sets of objects
• Near sets provide:
  • Framework
  • A systematic method to formally answer the question “Are these sets similar?”
  • To what extent
• Traces its origins to contributions of Weber, Fechner, Poincaré, Zeeman, Sossinsky, Riesz, Pawlak, Orłowska, and Peters

Tolerance Relation and Tolerance Classes
• Tolerance near sets are defined by a description-based tolerance relation
• Tolerance relations provide a view of the world without transitivity
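
The lack of transitivity can be seen in a small sketch. It assumes the common form of a tolerance relation, in which two objects are related when the Euclidean distance between their feature descriptions is at most ε. The function name `tolerant`, the value of `eps`, and the feature values are illustrative and do not come from the slides.

```python
def tolerant(desc_x, desc_y, eps):
    """True when two descriptions lie within eps of each other (Euclidean distance)."""
    dist = sum((a - b) ** 2 for a, b in zip(desc_x, desc_y)) ** 0.5
    return dist <= eps

# Hypothetical one-feature descriptions of three objects.
x, y, z = (0.00,), (0.08,), (0.16,)
eps = 0.1

near_xy = tolerant(x, y, eps)  # distance 0.08, within eps
near_yz = tolerant(y, z, eps)  # distance 0.08, within eps
near_xz = tolerant(x, z, eps)  # distance 0.16, outside eps

print(near_xy, near_yz, near_xz)
```

Here x is near y and y is near z, yet x is not near z, so the relation is reflexive and symmetric but not transitive.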

Nearness Measure
• The nearness measure was created out of a need to determine the degree that near sets resemble each other
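
The formula itself did not survive in this transcript. One form that appears in the near-set literature weights each tolerance class by its size and scores it by how evenly its members split between the two sets. The sketch below assumes that form; it may not match the exact measure used in the talk, and the names `nearness`, `classes`, `X`, and `Y` are illustrative.

```python
def nearness(classes, X, Y):
    """Size-weighted average of min/max overlap ratios over tolerance classes."""
    total = sum(len(c) for c in classes)
    if total == 0:
        return 0.0
    score = 0.0
    for c in classes:
        in_x, in_y = len(c & X), len(c & Y)
        larger = max(in_x, in_y)
        if larger:
            # A class split evenly between X and Y contributes its full size.
            score += len(c) * min(in_x, in_y) / larger
    return score / total

# Hypothetical objects and tolerance classes on the union of X and Y.
X = {"x1", "x2", "x3"}
Y = {"y1", "y2"}
classes = [{"x1", "y1"}, {"x2", "x3", "y2"}, {"x3"}]
value = nearness(classes, X, Y)
print(value)
```

Under this assumed form, a value of 1 means every class is evenly split between the two sets, and a value of 0 means no class contains objects from both.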

Application: Content-Based Image Retrieval
• Goal: Retrieve images based on content of an image
  • Rather than on semantic string or keyword associated with the image

Perceptual Image Analysis
• Subimages are the perceptual objects used in this work
• Aim: find tolerance classes (sets) contained in the union of subimages from two images
• MCE is the approach used to find these tolerance classes

Image Dataset
• Results are based on images from the SIMPLIcity image dataset

Maximal Clique Enumeration
• Observation: The problem of finding classes can be mapped to the Maximal Clique Enumeration Problem
  • Classes can be found using an algorithm with reduced complexity
  • Based in graph theory
• MCE problem consists of finding all the maximal cliques in an undirected graph
  • Let G=(V,E) denote an undirected graph
    • V: set of vertices
    • E: set of edges that connect pairs of distinct vertices from V
• A clique is a set of vertices where each pair of vertices in the clique is connected by an edge in E
• A maximal clique in G is a clique whose vertices are not all contained in some larger clique
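
The two definitions can be checked directly on a small graph. This is a brute-force sketch for illustration only; the graph and the helper names `is_clique` and `is_maximal_clique` are assumptions, not part of the talk.

```python
from itertools import combinations

def is_clique(vertices, edges):
    """Every pair of vertices must be joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(vertices, 2))

def is_maximal_clique(vertices, all_vertices, edges):
    """A clique is maximal when no outside vertex can be added to it."""
    if not is_clique(vertices, edges):
        return False
    return not any(is_clique(vertices | {v}, edges)
                   for v in all_vertices - vertices)

# Hypothetical graph: a triangle 1-2-3 with a pendant edge 3-4.
V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}

print(is_clique({1, 2}, E))                # a clique ...
print(is_maximal_clique({1, 2}, V, E))     # ... but contained in {1, 2, 3}
print(is_maximal_clique({1, 2, 3}, V, E))
print(is_maximal_clique({3, 4}, V, E))
```

In this graph {1, 2} is a clique but not a maximal one, while {1, 2, 3} and {3, 4} are the two maximal cliques.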

Bron-Kerbosch MCE Algorithm
• First serial algorithm for MCE was developed by Harary and Ross
• Since then, two main approaches have been established to solve the MCE problem
  • Greedy approach by Bron-Kerbosch
    • Concurrently discovered by Akkoyunlu
  • Output-sensitive approaches
• General idea: find maximal cliques through a depth-first search
  • Branches are formed based on candidate cliques
  • Backtracking occurs once a maximal clique has been discovered
• Algorithm essentially marks new nodes and processes them
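
The depth-first idea can be sketched with the basic Bron-Kerbosch recursion (no pivoting). This is the textbook serial form, not the GPU variant described later. The third set is named `excluded` because `not` is a reserved word in Python, and the example graph is hypothetical.

```python
def bron_kerbosch(clique, cands, excluded, adj, out):
    """Extend `clique` depth-first; report it when nothing can extend it."""
    if not cands and not excluded:
        out.append(set(clique))  # maximal clique found, backtrack
        return
    for v in list(cands):
        # Branch: add v and keep only vertices adjacent to v.
        bron_kerbosch(clique | {v}, cands & adj[v], excluded & adj[v], adj, out)
        # v is finished: move it from the candidates to the excluded set.
        cands = cands - {v}
        excluded = excluded | {v}

# Hypothetical graph as adjacency sets: triangle 1-2-3 plus edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
found = []
bron_kerbosch(set(), set(adj), set(), adj, found)
print(sorted(sorted(c) for c in found))
```

Each recursive call corresponds to one node of the search tree, and the three sets it carries are the per-node state.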

Maximal Clique Enumeration

Visualization of Tree Structure

GPU Algorithm
• Each block of threads processes one pair of images
  • Performs comparison between query image and candidate image
• Each thread executes MCE algorithm on 1 node in the tree
  • i.e. each thread finds all child nodes for a given node
• One level of the tree is processed per iteration
• Threads and blocks are arranged in one dimension
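
The level-by-level scheme can be imitated serially. In the sketch below, `expand` stands in for the work of one thread (producing all children of one node) and the outer loop stands in for one kernel iteration per tree level. The stopping rule used here, an empty level, is an assumption, as are the function names and the example graph.

```python
def expand(node, adj):
    """Work of one thread: children of one node, plus a maximal clique if found."""
    clique, cands, excluded = node
    if not cands:
        # Leaf: a maximal clique only when no excluded vertex could extend it.
        return [], ([clique] if not excluded else [])
    children = []
    for v in sorted(cands):
        children.append((clique | {v}, cands & adj[v], excluded & adj[v]))
        cands = cands - {v}
        excluded = excluded | {v}
    return children, []

def enumerate_by_level(adj):
    level = [(frozenset(), frozenset(adj), frozenset())]  # root node
    cliques = []
    while level:                      # assumed stopping condition: no nodes left
        next_level = []
        for node in level:            # on the GPU: one thread per node
            children, found = expand(node, adj)
            next_level.extend(children)
            cliques.extend(found)
        level = next_level            # output nodes become the next input
    return cliques

adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
result = enumerate_by_level(adj)
print(sorted(sorted(c) for c in result))
```

Because each node carries its own three sets, the nodes of one level can be expanded independently, which is what makes a thread-per-node mapping possible.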

GPU Algorithm

GPU Algorithm
• Each block is assigned memory space for processing a different pair of images
• Stopping condition

Algorithm Iteration

Figure panels: Block-level Sorting, Device-level Sorting

Node Data Structure
• Recall, each node contains three data structures
  1. clique: A list of vertices in the current search path
  2. cands: A list of candidate vertices that are not in clique
     • But, connected to every vertex in clique
  3. not: A list of vertices that are connected to every vertex in clique
     • But, would form a redundant path if combined with vertices in clique
• Each of these is stored as a bit set
  • Contained in an array of type unsigned char
• Bit set representation allows for efficient intersection and counting operations
• These nodes are stored in global memory
  • Also too large to store in shared memory
  • CGMA (compute to global memory access) ratio is 1 since node is only used by 1 thread
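
A byte-array bit set of this kind can be sketched as follows, with a Python `bytearray` standing in for the array of `unsigned char`. The bit layout (vertex v in bit v % 8 of byte v // 8) and all function names are assumptions; the slides do not specify them.

```python
def make_bitset(n_vertices, vertices=()):
    """One bit per vertex, packed eight to a byte."""
    bits = bytearray((n_vertices + 7) // 8)
    for v in vertices:
        bits[v // 8] |= 1 << (v % 8)
    return bits

def intersect(a, b):
    """Set intersection is a byte-wise AND."""
    return bytearray(x & y for x, y in zip(a, b))

def count(bits):
    """Set size is the number of 1 bits (a population count)."""
    return sum(bin(byte).count("1") for byte in bits)

def to_list(bits):
    return [8 * i + j for i, byte in enumerate(bits)
            for j in range(8) if (byte >> j) & 1]

# Hypothetical 12-vertex graph: candidates of a node and neighbours of vertex v.
cands = make_bitset(12, [1, 3, 9, 10])
neighbours = make_bitset(12, [3, 4, 9, 11])
new_cands = intersect(cands, neighbours)  # candidates of the child node
print(to_list(new_cands), count(new_cands))
```

Forming a child node then reduces to two AND passes (one for cands, one for not) and setting one bit in clique.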

Global Memory
• The input and output of each kernel iteration are nodes
• Each thread can generate multiple nodes during each kernel iteration
  • The number of output nodes is not known a priori
• After each iteration a radix sort is used to move the output to contiguous input locations
  • Sort is performed on offsets using CUB library
• Each thread is also updating its portion of the nearness measure after each iteration
  • Can be calculated as a weighted average
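
One way sorting can compact a sparsely filled output buffer is sketched below. It assumes each slot carries a key equal to its offset when it holds a node and a large sentinel when it is unused, so that sorting by key brings the valid slots to the front. This convention, the `EMPTY` sentinel, and the function name `compact` are assumptions; Python's built-in sort stands in for the CUB radix sort, which is not called here.

```python
EMPTY = 2**32 - 1  # assumed sentinel key for an unused output slot

def compact(slots):
    """Move used slots to the front, preserving their relative order."""
    keys = [i if node is not None else EMPTY for i, node in enumerate(slots)]
    # Sort slot indices by key; unused slots sink to the end.
    order = sorted(range(len(slots)), key=keys.__getitem__)
    n_valid = sum(k != EMPTY for k in keys)
    return [slots[i] for i in order[:n_valid]]

# Hypothetical output buffer: two threads with three slots each.
# Thread 0 produced one node, thread 1 produced two.
output = ["a", None, None, "b", "c", None]
next_input = compact(output)
print(next_input)
```

The compacted list is then the contiguous input for the next iteration, so thread i can simply read node i.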

Texture Memory
• Each node requires the adjacency matrix during MCE
• For this CBIR application the adjacency matrix is ~26KB
  • Too large for shared and constant memory
• Also not possible to know adjacency matrix access patterns a priori
• Thus, adjacency matrix is stored in texture memory
  • One for each block (each pair of images)

Timing Results

ε      CPU MCE (sec)    GPU MCE (sec)       GPU MCE (sec)    K20 Memory (MB)
       i7-930           GeForce GTX 460     K20
0.1    0.85             0.1                 0.009            585
0.2    0.84             0.06                0.023            724
0.3    2.45             1.42                0.035            862
0.4    28.71            3.39                0.078            1277

CBIR Results

Conclusion
• The algorithm presented here is tailored to a specific dataset
• Approach can be adapted for use on larger graphs
• The key constraint is the upper limit on the number of nodes each thread can generate
  • Enough storage must be available for these nodes

Acknowledgements
• This research has been supported by:
  • The Natural Sciences and Engineering Research Council of Canada (NSERC) grant 418413
  • NVIDIA Corporation
• Special thanks to:
  • Keith Massey
  • Tariq Alusaifeer
  • Sheela Ramanna
  • James F. Peters

References
• J. F. Peters and P. Wasilewski, “Foundations of near sets,” Information Sciences, vol. 179, no. 18, pp. 3091–3109, 2009.
• C. J. Henry, “Near Sets: Theory and Applications,” Ph.D. dissertation, University of Manitoba, CAN, 2010. Available at: https://mspace.lib.umanitoba.ca/handle/1993/4267.
• C. J. Henry and S. Ramanna, “Maximal clique enumeration in finding near neighbourhoods,” Transactions on Rough Sets, vol. LNCS 7736, pp. 103–124, 2013.
• T. Alusaifeer, S. Ramanna, C. J. Henry, and J. Peters, “GPU implementation of MCE approach to finding near neighbourhoods,” Lecture Notes in Artificial Intelligence, vol. LNAI 8171, pp. 251–262, 2013.
• G. T. Fechner, Elements of Psychophysics, vol. I. London, UK: Holt, Rinehart & Winston, 1966, H. E. Adler’s trans. of Elemente der Psychophysik, 1860.
• H. Poincaré, Science and Hypothesis. Brock University: The Mead Project, 1905, L. G. Ward’s translation.
• E. C. Zeeman, “The topology of the brain and the visual perception,” in Topology of 3-manifolds and selected topics, K. M. Fort, Ed. New Jersey: Prentice Hall, 1965, pp. 240–256.
• A. B. Sossinsky, “Tolerance space theory and some applications,” Acta Applicandae Mathematicae: An International Survey Journal on Applying Mathematics and Mathematical Applications, vol. 5, no. 2, pp. 137–167, 1986.
• F. Riesz, “Stetigkeitsbegriff und abstrakte Mengenlehre,” Atti del IV Congresso Internazionale dei Matematici, vol. II, pp. 18–24, 1908.
• Z. Pawlak, “Classification of objects by means of attributes,” Institute for Computer Science, Polish Academy of Sciences, Tech. Rep. PAS 429, 1981.
• E. Orłowska, “Semantics of vague concepts. Applications of rough sets,” Institute for Computer Science, Polish Academy of Sciences, Tech. Rep. 469, 1982.
• J. F. Peters, “Near sets. General theory about nearness of objects,” Applied Mathematical Sciences, vol. 1, no. 53, pp. 2609–2629, 2007.
• J. F. Peters and P. Wasilewski, “Tolerance spaces: Origins, theoretical aspects and applications,” Information Sciences, vol. 195, no. 0, pp. 211–225, 2012.
• C. J. Henry, “Perceptual indiscernibility, rough sets, descriptively near sets, and image analysis,” Transactions on Rough Sets, vol. LNCS 7255, pp. 41–121, 2012.
• J. Z. Wang, J. Li, and G. Wiederhold, “SIMPLIcity: Semantics-sensitive integrated matching for picture libraries,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947–963, 2001.
• F. Harary and I. C. Ross, “A procedure for clique detection using the group matrix,” Sociometry, vol. 20, no. 3, pp. 205–215, 1957.
• M. C. Schmidt, N. F. Samatova, K. Thomas, and P. Byung-Hoon, “A scalable, parallel algorithm for maximal clique enumeration,” Journal of Parallel and Distributed Computing, vol. 69, pp. 417–428, 2009.
• C. Bron and J. Kerbosch, “Algorithm 457: finding all cliques of an undirected graph,” Communications of the ACM, vol. 16, no. 9, pp. 575–577, 1973.
• E. A. Akkoyunlu, “The enumeration of maximal cliques of large graphs,” SIAM Journal on Computing, vol. 2, no. 1, pp. 1–6, 1973.
• S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, pp. 505–517, 1977.
• K. Makino and T. Uno, “New algorithms for enumerating all maximal cliques,” Algorithm Theory: SWAT 2004, LNCS, vol. 3111, pp. 260–272, 2004.

Thank You