HIERARCHICAL CELLULAR TREE (HCT)
AN EFFICIENT INDEXING METHOD FOR CONTENT-BASED RETRIEVAL ON MULTIMEDIA DATABASES

By Serkan KIRANYAZ, TUT, Finland


Indexing Trees

• Static Clustering Techniques: MST, k-means, etc.
• Spatial Access Methods (SAMs): KD-Tree, R-tree, R*-tree, R+-tree, TV-tree, X-tree, SS-tree, SR-tree, S²-Tree, Hybrid-Tree, A-tree, IQ-tree, Pyramid Tree, NB-tree, etc.
• Metric Access Methods (MAMs):
  o Static: VP-Tree, MVP-Tree, GNAT, SOM, etc.
  o Dynamic: M-Tree, M+ Tree, Slim-Tree.

What to Use for Indexing MM Databases?

• Static Clustering Techniques: Infeasible!
  o $O(n^2)$ computational cost (e.g. MST)
  o Static construction.
  o Parameter dependent (e.g. kNN: k=?)
• SAMs: Infeasible!
  o Work in a single feature space.
  o Loss of efficiency in high dimensions (e.g. d>10)
• Static MAMs: Infeasible!
  o Static construction.
  o Top-down construction (e.g. SOM)
  o Dependency on good vantage object selection (e.g. vp-tree, mvp-tree, SOM)
• Dynamic MAMs: Feasible, but Efficiency?
  o Parameter dependency? (e.g. M-tree, M=?)
  o Performance on large databases? (Robustness against corruption)

(Indexing) Requirements for Large MM Databases

• Dynamic MAM-based:
  o Dynamic (Incremental) Construction.
  o Hierarchical Structure
  o $O(n \log n)$ computational cost.
  o Bottom-up Construction (no vantage point selection is needed)
  o Works with multiple features (over the similarity distance between objects)
• (Significant) Parameter Independence (no k, M, ...)
• Robustness to Corruption:
  o Due to the "crowd effect" in large databases.
  o Due to deficiencies of the low-level features (e.g. limited discrimination, noisy behaviour, etc.)
• Efficient (Hierarchic) Clustering:
  o The earliest possible retrieval of the most relevant items in a query operation.
  o Convenient Browsing and Navigation (i.e. provides a "mental picture")

Hierarchical Cellular Tree: HCT

Definition: HCT is a dynamic, parameter independent and flexible cell (node) sized indexing structure, which is optimized to achieve as focused cells as possible using the (sub-optimum) visual and aural descriptors.

[Figure: example HCT body with three levels (Level 2 = Top Level, Level 1, Level 0 = Ground Level); ground-level items a to s are grouped into cells A to F, whose nucleus items populate the upper levels.]

[Figure: Sample HCT Construction: "Color Balls" (step-by-step insertion of items into levels L0 and L1).]

Basic HCT Features

• By means of the flexible cell size property, one cell (or a minimum number of cells) is used to store a group of similar items, which in effect reduces the degradations caused by the "crowd effect" within the HCT body.
• During their lifetime the cells are under the close surveillance of their levels in order to enhance compactness, using mitosis operations whenever necessary to get rid of the dissimilar item(s) in the cell. Furthermore, for item insertions, an optimum cell search technique (Pre-emptive) is used to find the most suitable (similar) cell in each level.
• HCT is also intrinsically dynamic, meaning that the cell and level parameters and primitives are subject to continuous update operations to provide the most reliable environment. For example, a cell nucleus item is changed whenever a better candidate is available, and once a new nucleus item is assigned, its owner cell in the upper level is found by a cell search instead of reusing the old one's owner cell. Such dynamic internal behavior keeps the HCT body intact by preventing the potential sources of corruption.
• By means of the MST within each cell, the optimum nucleus item can be assigned whenever necessary and at no cost. Furthermore, optimum split management can be done when the mitosis operation is performed. Most important of all, the MST provides a reliable compactness measure via "cell similarity" for any item, instead of relying only on a single (nucleus) item.

HCT Overview: Cell Structure

• Flexible Size
• Maturity Level (e.g. $N_M \geq 6$)
• In-built MST
• 2 Black Boxes:
  o (Dis-)Similarity Distance Calculation
  o Compactness Feature (only for mature cells)
• Dynamic Events:
  o Mitosis (Cell Split)
  o (Incremental) MST Construction
  o Nucleus Assignment

Cell Structure: Dynamic MST Formation

There are several MST construction algorithms, such as Kruskal's and Prim's. Those algorithms are, however, static algorithms; that is, all the items with their relative (similarity) distances with respect to each other should be known beforehand. The construction of the MST requires $O(N^2)$ computational cost, where N is the number of items.

Dynamic Cell Nucleus Assignment Rule: The item with the maximum number of connections (branches) is assigned (updated) as the (new) cell nucleus. Yet old nucleus items have the priority.
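The nucleus assignment rule can be sketched as follows. This is a minimal illustration, not the MUVIS implementation: the static Prim construction is only there to produce an MST for the example (the incremental construction used by HCT is not spelled out on the slide), the function names `prim_mst` and `assign_nucleus` are hypothetical, and "old nucleus items have the priority" is read here as a tie-breaking rule, which is an assumption.

```python
from collections import Counter

def prim_mst(dist):
    """Static O(N^2) Prim MST over a symmetric distance matrix.
    Returns a list of (i, j, weight) branches."""
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known link into the tree
    parent = [-1] * n
    best[0] = 0.0
    branches = []
    for _ in range(n):
        # pick the cheapest item that is not in the tree yet
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        if parent[u] >= 0:
            branches.append((parent[u], u, dist[parent[u]][u]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return branches

def assign_nucleus(branches, old_nucleus=None):
    """Pick the item with the most MST branches as the cell nucleus.
    Assumption: the old nucleus wins whenever it ties for the maximum."""
    degree = Counter()
    for i, j, _ in branches:
        degree[i] += 1
        degree[j] += 1
    if not degree:
        return old_nucleus
    top = max(degree.values())
    if old_nucleus is not None and degree.get(old_nucleus, 0) == top:
        return old_nucleus
    return min(k for k, d in degree.items() if d == top)

# Star-shaped cell: item 0 is at distance 1 from all others.
star = [[0, 1, 1, 1], [1, 0, 2, 2], [1, 2, 0, 2], [1, 2, 2, 0]]
# Chain-shaped cell: items placed at positions 0, 1, 2, 3 on a line.
chain = [[abs(i - j) for j in range(4)] for i in range(4)]
print(assign_nucleus(prim_mst(star), old_nucleus=1))
print(assign_nucleus(prim_mst(chain), old_nucleus=2))
```

In the star cell the center replaces the old nucleus because it has strictly more branches; in the chain cell items 1 and 2 tie, so the old nucleus is kept.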

[Figure: incremental MST update on item insertion, showing the MST before insertion and the MST after insertion, with the similarity distances (SD) between nodes.]

Cell Compactness Feature

This is the feature that represents the compactness of the cell items, i.e. how tight (focused) the clustering of the items within the cell is. Furthermore, the implementation of the regularization function for the calculation of the cell compactness feature is a black box for HCT. Once a cell reaches maturity (a prerequisite for the compactness feature calculation), a regularization function, f, can be expressed using the following statistical cell parameters:

$CF_C = f(\mu_C, \sigma_C, r_C, \max(w_C), N_C) = K \, \mu_C \, \sigma_C \, r_C \, \max(w_C) \, N_C$

$CF_C > CThr_L \rightarrow$ cell mitosis granted.

Similar to the continuous updates for the nucleus item, the $CF_C$ value is also updated (recalculated) each time an operation is performed over the cell C.
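A minimal sketch of one possible regularization function of the product form $K \mu \sigma r \max(w) N$, together with the mitosis test against the level threshold. The slide does not define the symbols, so the reading used here is an assumption: `branch_weights` are the MST branch weights of the cell (giving the mean, the standard deviation and the maximum), `covering_radius` is the distance from the nucleus to the farthest item, and `K` is a normalization constant. The names `compactness_feature`, `mitosis_granted` and `maturity_size` are hypothetical.

```python
from statistics import mean, pstdev

def compactness_feature(branch_weights, covering_radius, n_items, K=1.0):
    """Product-form compactness feature (assumed symbol meanings, see text).
    Larger values mean a less focused cell."""
    mu = mean(branch_weights)
    sigma = pstdev(branch_weights)
    return K * mu * sigma * covering_radius * max(branch_weights) * n_items

def mitosis_granted(cf, level_threshold, n_items, maturity_size=6):
    """Mitosis is only considered for mature cells whose compactness
    feature exceeds the threshold of their level."""
    return n_items > maturity_size and cf > level_threshold

cf = compactness_feature([1.0, 3.0], covering_radius=2.0, n_items=3)
print(cf, mitosis_granted(cf, level_threshold=10.0, n_items=7))
```

Because the feature is a black box for HCT, any other function that grows as the cell becomes less focused could be substituted without touching the mitosis test.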

Cell Mitosis Process

• Happens only if the cell is mature and not focused (enough).
• No cost due to the presence of the MST.
• The parent cell (along with its cell representative nucleus) ceases to exist; instead, 2 newborn cells emerge.
• A sample mitosis operation:

[Figure: a sample mitosis operation, showing the parent cell C before mitosis and the 2 child cells C1 and C2 after mitosis; one MST branch (marked X) is broken.]
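The mitosis step can be sketched as cutting one MST branch and collecting the two resulting sub-trees as child cells. Which branch is broken is not stated on the slide; cutting the heaviest branch is an assumption made for this illustration, and the function name `mitosis` and the `(i, j, weight)` branch layout are hypothetical.

```python
def mitosis(items, branches):
    """Split a cell into two child cells by removing one MST branch.
    Assumption: the heaviest branch is the one that gets broken."""
    cut = max(branches, key=lambda b: b[2])
    kept = [b for b in branches if b is not cut]
    adjacency = {item: [] for item in items}
    for i, j, _ in kept:
        adjacency[i].append(j)
        adjacency[j].append(i)
    # Flood-fill from one end of the broken branch to collect the first child.
    child1 = set()
    stack = [cut[0]]
    while stack:
        u = stack.pop()
        if u in child1:
            continue
        child1.add(u)
        stack.extend(adjacency[u])
    child2 = set(items) - child1
    return child1, child2, cut

cell_items = [0, 1, 2, 3, 4]
cell_mst = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 5.0), (3, 4, 1.0)]
print(mitosis(cell_items, cell_mst))
```

No new distance computations are needed for the split, since the MST already holds every weight that is used, which is consistent with the "no cost" remark above.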

HCT Overview: Level Structure

• The top level contains a single cell, and when this cell splits, a new top level is created above this level.
• Each level is responsible for taking logs about the operations performed in it, such as the number of mitosis operations, the statistics about the compactness feature of the cells, etc.
• Within a period of time (i.e. during a number of insertions or after some number of mitosis operations occur), each level updates its compactness threshold according to the compactness feature statistics of the mature cells into which an item is inserted. Therefore, the $CThr_L$ value for a particular level L can be estimated as follows:

$CThr_L = k_0 \, \mu_{CF}^{P} = \frac{k_0}{P} \sum_{C \in S_C} CF_C, \quad \forall C \in S_C,\ N_C > N_M$
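A minimal sketch of the periodic threshold update: the threshold of a level is a scaled mean of the compactness features of the mature cells that received insertions during the period. The normalization by the number of qualifying cells (a plain mean), the meaning of `k0` as a scaling factor, and the names `update_level_threshold` and `maturity_size` are assumptions for illustration.

```python
def update_level_threshold(cells, k0, maturity_size=6):
    """cells: iterable of (n_items, cf) pairs, one per cell that received
    an insertion in the current period. Only mature cells contribute.
    Returns None when no mature cell is available (threshold unchanged)."""
    mature_cf = [cf for n_items, cf in cells if n_items > maturity_size]
    if not mature_cf:
        return None
    return k0 * sum(mature_cf) / len(mature_cf)

# Two mature cells (8 and 7 items) and one immature cell (3 items).
period_cells = [(8, 10.0), (3, 100.0), (7, 20.0)]
print(update_level_threshold(period_cells, k0=0.5))
```

The immature cell is ignored even though its feature value is large, so a few scattered small cells cannot inflate the threshold of the level.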

Primary HCT Operations: "Incremental Item Insertions"

Insert (nextItem, levelNo)
  Let top level number: topLevelNo and the single cell in top level: cell-T
  If (levelNo > topLevelNo) then do:
    o Create a new top level: level-T with number = topLevelNo+1
    o Create a new cell in level-T: cell-T
    o Append nextItem into cell-T.
    o Return.
  Let the Owner (target) cell in level levelNo: cell-O
  If (levelNo = topLevelNo) then do:
    o Assign cell-O = cell-T
  Else do:
    o Create a cell array for Pre-emptive cell search: ArrayCS[], put cell-T into it
    o Assign cell-O = PreEmptiveCellSearch (ArrayCS[], nextItem, topLevelNo)
  Append nextItem into cell-O.
  Check cell-O for Post-Processing:
    o If cell-O is split then do:
        Let item-O, item-N1 and item-N2 be the old nucleus item (parent) and the new nucleus items (2 children)
        Remove (item-O, levelNo+1)
        Insert (item-N1, levelNo+1)
        Insert (item-N2, levelNo+1)
    o Else if nucleus item is changed within cell-O then do:
        Let item-O and item-N be the old and new nucleus items.
        Remove (item-O, levelNo+1)
        Insert (item-N, levelNo+1)
  Return.

Incremental Item Insertions: Pre-emptive Cell Search

Used for finding the optimum (owner) cell in the level where the insertion occurs. The traditional cell search technique, MS-Nucleus, which is used in M-Tree and its derivatives, depends on a simple heuristic: it assumes that the closest nucleus (routing) object yields the best sub-tree during descent and, finally, the best (owner) cell to append to.

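[Figure: the two cell search cases (Case 1 and Case 2) for an item O and two cells $C_1$, $C_2$ with nucleus items $O_N^1$, $O_N^2$, covering radii $r(O_N^1)$, $r(O_N^2)$ and distances $d_1$, $d_2$. Annotations: $d_2 < r(O_N^2) \Rightarrow O \rightarrow C_2$ and $\Delta_2 < \Delta_1 \Rightarrow O \rightarrow C_2$.]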

Pre-emptive vs. MS-Nucleus Cell Search Techniques

• MS-Nucleus Heuristics:
1. If no nucleus item exists for which $d(O, O_N^i) \leq r(O_N^i), \forall C_i$, the choice is made so as to minimize the increase of the covering radius, i.e. $\Delta_i = d(O, O_N^i) - r(O_N^i), \forall C_i$, among all the nucleus objects that are in the owner cell C.
2. If there exists a nucleus item for which $d(O, O_N^i) \leq r(O_N^i), \forall C_i$, then its sub-tree is tracked in the lower level. If multiple sub-trees (nucleus objects) with this property exist, then the one to which the object O is the closest is chosen.

• Pre-emptive Heuristics:
1. If no nucleus item exists for which $d(O, O_N^i) \leq r(O_N^i), \forall C_i$, then fetch all the nucleus items whose cells in the lower level may provide the closest object, i.e. $\Delta_i = d(O, O_N^i) - r(O_N^i) \leq d_{min}, \forall C_i$, among all the nucleus objects that are in the owner cell C.
2. If there exist one or more nucleus item(s) for which $d(O, O_N^i) \leq r(O_N^i), \forall C_i$, then fetch all of them, since their owner cells in the lower level may provide the closest object.

Any nucleus item that satisfies the Case 2 condition also satisfies the Case 1 condition ($d \leq r$ gives $\Delta_i \leq 0 \leq d_{min}$), so the Case 1 criterion alone can be used as the one and only criterion to fetch all the nucleus items for tracking.
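The difference between the two heuristics at a single level can be sketched as follows. The inputs are the distances `d[i]` from the item O to each nucleus item and the covering radii `r[i]` of those nucleus items. The slide does not define $d_{min}$; taking it as the distance to the closest nucleus item is an assumption, and the function names are hypothetical.

```python
def ms_nucleus_select(d, r):
    """MS-Nucleus: follow exactly one sub-tree.
    If some covering radius already contains O, take the closest such
    nucleus; otherwise take the one whose radius grows the least."""
    covering = [i for i in range(len(d)) if d[i] <= r[i]]
    if covering:
        return min(covering, key=lambda i: d[i])
    return min(range(len(d)), key=lambda i: d[i] - r[i])

def pre_emptive_select(d, r):
    """Pre-emptive: fetch every nucleus whose cell may still hold the
    closest object, i.e. d[i] - r[i] <= d_min.
    Assumption: d_min is the distance to the closest nucleus item."""
    d_min = min(d)
    return [i for i in range(len(d)) if d[i] - r[i] <= d_min]

d = [2.0, 3.0, 9.0]   # distances from O to three nucleus items
r = [1.0, 2.5, 1.0]   # covering radii of the three nucleus items
print(ms_nucleus_select(d, r), pre_emptive_select(d, r))
```

With these numbers MS-Nucleus commits to a single sub-tree, while the Pre-emptive criterion keeps two candidates alive for the next level and only discards the nucleus that cannot contain a closer object.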

Primary HCT Operations: "Item Removals"

Remove (ArrayIR[], levelNo)
  Let top level number: topLevelNo and the single cell in top level: cell-T
  Let the Owner (target) cell in level levelNo: cell-O
  Remove items in ArrayIR within cell-O
  Check cell-O for Post-Processing:
    o If cell-O is depleted (cell-death) then do:
        If (levelNo = topLevelNo) then do:
          • Remove cell-O = cell-T
          • Remove the top level from HCT body
        Else do:
          • Let item-O be the old nucleus item
          • Remove (item-O, levelNo+1)
    o Else if cell-O is split then do:
        Let item-O, item-N1 and item-N2 be the old nucleus item and the two new nucleus items.
        Remove (item-O, levelNo+1)
        Insert (item-N1, levelNo+1)
        Insert (item-N2, levelNo+1)
    o Else if nucleus item is changed within cell-O then do:
        Let item-O and item-N be the old and new nucleus items.
        Remove (item-O, levelNo+1)
        Insert (item-N, levelNo+1)
  Return.

Primary HCT Operations: "(Periodic) Fitness Check"

• The goal is to reduce the number of immature cells that make the level "crowded", whilst respecting the real minority cases.
• So the idea is to remove all the immature cells and feed their items back to the system, expecting that a mature cell might now capture and own them.
• An optional operation.
• Applicable to all the levels (except the top level), in top-to-bottom order.

HCT vs. M-tree

• M-tree is a generic MAM, designed to achieve a balanced tree with a low I/O cost on large data sets. HCT, on the other hand, is designed for indexing multimedia databases, where the content variation is seldom balanced, and it is especially optimized for compactness (highly focused cells).
• M-tree works over nodes with a maximum (fixed size) capacity M. Therefore the performance depends on a "good" choice of this parameter with respect to the database size, and thus M-tree construction varies significantly with it. HCT, on the other hand, has no limit for the cell size as long as the cell keeps a definite "compactness" measure. So HCT will not drastically suffer from the "crowd effect" and the resultant corruption: it clusters similar objects into one cell (or a minimum number of cells) and hence provides an equal representation chance for both minor and major groups of items on the higher levels.
• In M-tree the cell compactness is only measured with respect to the distance from the routing (nucleus) object to the farthest object, the so-called covering radius. HCT uses all the items and their minimum distances to the cell (instead of a single nucleus item alone) to come up with a regularization function that represents a dynamic model for the cell compactness. During the lifetime of the HCT body (i.e. with incoming item insertions, removals and internal transfers, events, etc.) this function dynamically updates the current cell compactness feature.

HCT vs. M-tree (cont.)

• M-tree performs a split operation only when the cell size reaches M, without paying attention to the current status (i.e. compactness) of the cell. For the split operation, M-tree first tries to find suitable nucleus (routing) objects and then forms the child cells around them. HCT first performs the mitosis operation to split the cell into two child cells and afterwards assigns the most suitable nucleus items to them. Due to the presence of the MST formation within each cell, there is no cost for the mitosis operation, since the MST is used to decide at which branch the partition should be executed.
• M-tree uses the MS-Nucleus cell search. It is the primary source of corruption due to the "crowd effect". Furthermore, the incremental construction of an M-tree could lead to different structures depending on the order of the item insertions. HCT uses the optimized Pre-emptive cell search. In this way, along with the mitosis operation, this search algorithm further improves the compactness factor of the cells at each level.
• M-tree has a conservative structure that might cause degradations in due time. On the contrary, HCT has a totally dynamic approach. Any operation (insertion, removal or mitosis) can change the current cell nucleus to a new (better) one, in which case the old nucleus is removed and the new one is inserted into the most suitable cell in the upper level.

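Statistics (Level 0) | Cell Search Alg. | Fitness Check | Real World Video | Real World Audio | Sports Image | Sports Video | Corel1K | Corel10K | Shape | Texture
Mature Cell % | MS-Nucleus | Before FC | 7.246 | 9.052 | 15.929 | 0.000 | 15.962 | 18.228 | 13.694 | 31.783
Mature Cell % | MS-Nucleus | After FC | 17.526 | 14.286 | 28.261 | 15.686 | 19.324 | 34.654 | 50.877 | 44.444
Mature Cell % | Pre-emptive | Before FC | 5.848 | 6.792 | 12.308 | 1.389 | 16.393 | 13.937 | 12.760 | 28.114
Mature Cell % | Pre-emptive | After FC | 9.938 | 12.389 | 25.510 | 10.909 | 24.631 | 25.618 | 19.048 | 41.048
Item % in Mature Cells | MS-Nucleus | Before FC | 24.062 | 21.303 | 44.040 | 0.000 | 39.700 | 47.865 | 41.143 | 64.091
Item % in Mature Cells | MS-Nucleus | After FC | 37.748 | 30.451 | 61.818 | 32.000 | 44.600 | 65.116 | 75.214 | 72.898
Item % in Mature Cells | Pre-emptive | Before FC | 23.179 | 20.802 | 40.404 | 3.500 | 44.200 | 40.729 | 42.357 | 61.705
Item % in Mature Cells | Pre-emptive | After FC | 36.865 | 30.952 | 63.030 | 27.000 | 54.400 | 56.465 | 52.571 | 70.000
Average Compactness | MS-Nucleus | Before FC | 173.328 | 2299.667 | 255.417 | NaN | 152.672 | 289.694 | 23.636 | 0.038
Average Compactness | MS-Nucleus | After FC | 192.968 | 2687.994 | 304.267 | 10.989 | 145.581 | 384.193 | 157.738 | 0.048
Average Compactness | Pre-emptive | Before FC | 112.148 | 2172.990 | 134.299 | 11.586 | 82.440 | 65.020 | 12.097 | 0.011
Average Compactness | Pre-emptive | After FC | 128.164 | 2096.814 | 195.721 | 9.378 | 99.905 | 77.587 | 14.104 | 0.015
Average Cover. Radius | MS-Nucleus | Before FC | 1.238 | 2.448 | 1.089 | NaN | 1.049 | 1.206 | 0.580 | 0.127
Average Cover. Radius | MS-Nucleus | After FC | 1.229 | 2.449 | 1.156 | 0.588 | 1.042 | 1.293 | 0.817 | 0.133
Average Cover. Radius | Pre-emptive | Before FC | 1.068 | 2.357 | 0.961 | 0.501 | 0.935 | 0.863 | 0.504 | 0.098
Average Cover. Radius | Pre-emptive | After FC | 1.093 | 2.514 | 1.077 | 0.505 | 0.986 | 0.925 | 0.539 | 0.107
Average Broken Branch Weight | MS-Nucleus | Before FC | 1.015 | 2.351 | 1.025 | 0.558 | 1.014 | 1.200 | 0.668 | 0.147
Average Broken Branch Weight | MS-Nucleus | After FC | 1.034 | 2.387 | 1.119 | 0.585 | 1.031 | 1.255 | 0.707 | 0.149
Average Broken Branch Weight | Pre-emptive | Before FC | 0.890 | 2.278 | 0.915 | 0.517 | 0.861 | 0.794 | 0.540 | 0.109
Average Broken Branch Weight | Pre-emptive | After FC | 0.883 | 2.281 | 0.980 | 0.531 | 0.872 | 0.805 | 0.549 | 0.109
Average Cell Size | MS-Nucleus | Before FC | 3.283 | 3.440 | 4.381 | 3.226 | 4.695 | 4.619 | 4.459 | 6.822
Average Cell Size | MS-Nucleus | After FC | 4.670 | 3.931 | 5.380 | 3.922 | 4.831 | 6.231 | 8.187 | 8.148
Average Cell Size | Pre-emptive | Before FC | 2.649 | 3.011 | 3.808 | 2.778 | 4.098 | 4.097 | 4.154 | 6.263
Average Cell Size | Pre-emptive | After FC | 2.814 | 3.531 | 5.051 | 3.636 | 4.926 | 5.321 | 5.128 | 7.686
HCT Const. Time (sec) | MS-Nucleus | Before FC | 168.079 | 4359.849 | 1.847 | 79.600 | 2.310 | 46.162 | 23.365 | 3.000
HCT Const. Time (sec) | MS-Nucleus | After FC | 286.906 | 7674.978 | 3.034 | 158.287 | 4.147 | 72.456 | 31.020 | 3.989
HCT Const. Time (sec) | Pre-emptive | Before FC | (see note) | (see note) | 1.925 | 193.918 | 3.054 | 450.105 | 67.631 | 3.196
HCT Const. Time (sec) | Pre-emptive | After FC | (see note) | (see note) | 3.868 | 394.462 | 5.673 | 926.525 | 117.509 | 4.403

Note: in the two Pre-emptive rows of HCT Const. Time, the Real World Audio and Real World Video values run together in the source as 15110.94541.073 (Before FC) and 31708.911548.58 (After FC); the boundary between the two values cannot be recovered.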

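PQ over HCT

[Figure: Progressive Query (PQ) overview. The MM database is divided into sub-sets 1, 2, 3, ..., N; a periodic sub-query is run over one sub-set per period $t_p$ (at t, 2t, 3t, 4t), and the periodic sub-query results are merged by sub-query fusion into the progressive sub-query result (1, 1+2, 1+2+3, ...).]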

PQ over HCT (cont.)

The PQ operation over HCT is executed synchronously over two parallel processes: the HCT tracer and a generic process for PSQ formation using the latest QP segment. The HCT tracer is a recursive algorithm, which traces among the HCT levels in order to form a QP (segment) for the next PSQ update. When the time allocated for this operation is completed, this process is paused and the next PSQ retrieval result is formed and presented to the user.

• QP Formation from HCT: Starting from the top level, the HCT tracer algorithm recursively traces among the levels and their cells according to the similarity of the cell nuclei.

HCTtracer (ArrayQP[], levelNo, item-MS)
  Let cell-MS be the owner cell of item-MS.
  If (levelNo = 0) then do: // if this is ground level
    o Append all items in cell-MS into ArrayQP[].
    o Return.
  Else do: // if this is an intermediate level
    o Create the priority queue of cell-MS: queue-MS.
    o For $\forall O_N^i \in$ queue-MS, do: // for all sorted (nucleus) items
        HCTtracer (ArrayQP[], levelNo-1, $O_N^i$)
  Return.
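The recursive tracer can be sketched in a few lines. The data layout is a simplification chosen for illustration, not the MUVIS one: a cell above the ground level is a dict that maps each nucleus item to the cell it represents in the level below, a ground-level cell is a list of items, and the cell is passed directly instead of being looked up from `item-MS`. The priority queue is modelled by sorting the nucleus items by their distance to the query; `hct_tracer`, `dist` and the toy numbers are hypothetical.

```python
def hct_tracer(query_path, level_no, cell, dist, query):
    """Depth-first trace that visits the most similar nucleus items first
    and appends ground-level items to query_path in visiting order."""
    if level_no == 0:
        # Ground level: the cell holds database items.
        query_path.extend(cell)
        return
    # Intermediate level: rank the nucleus items by similarity to the query.
    for nucleus in sorted(cell, key=lambda item: dist(query, item)):
        hct_tracer(query_path, level_no - 1, cell[nucleus], dist, query)

# Toy HCT body over numbers, with |a - b| as the dissimilarity distance.
top_cell = {
    5: {2: [1, 2, 3], 6: [5, 6, 7]},
    20: {20: [19, 20, 22]},
}
qp = []
hct_tracer(qp, 2, top_cell, lambda a, b: abs(a - b), query=6)
print(qp)
```

The resulting query path starts with the ground-level cell whose nucleus is closest to the query, which is what allows the most relevant items to be retrieved in the earliest sub-query periods.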

PQ over HCT: QP Formation

[Figure: QP formation over the sample three-level HCT body (Level 2 = Top Level, Level 1, Level 0 = Ground Level) for a query item Q, showing the ranking of the nucleus items at each level and the resulting query path QP(Q) over the ground-level items, which the PQ operation consumes in segments.]

PQ over HCT: Sample QP Plots

[Figure: sample QP plots.]

PQ over HCT

• Once the QP segments are formed, the PQ operation that is executed over the HCT body becomes similar to the sequential PQ. There are two main differences: each database sub-set should now be replaced by a QP segment created by the HCT tracer process

PQoverHCT (HCTfile, )
  Load the HCTfile to activate the HCT body of the database.
  Create a timer, which signals to this process every millisecond.
  Create a process (thread) for HCT tracer.
  Set q = 0.
  While ( timer< > ticks ) do:
    o Pause HCT tracer process.
    o Retrieve QP segment as a periodic sub-query result.
    o Fuse the periodic sub-query result with the last PSQ result to form the next PSQ update.
    o Render the next PSQ update to the user.
    o Update the $t_p^q$ value for the next (q+1st) PSQ period as given in Eq. 3. Reset the timer < >.
    o Set q = q+1.
    o Re-activate HCT tracer process.
  End loop.

PQ over HCT (cont.)

Table I: Retrieval times (in msec) for 10 visual and aural query operations performed per query type.

Query Genre | Query Type | Q10 ... Q1 (the ten values run together in the source)
Visual | NQ | 70394954779972906774777659225938940719624
Visual | Seq. PQ | 30022001200360024004600360003997597412007
Visual | PQ over HCT | 2001150420033002150125013139200220033003
Aural | NQ | 61080849215889748553399055486936652319772982037675
Aural | Seq. PQ | 57057829982400330023180504500336023180062100521004
Aural | PQ over HCT | 9002120009000300212033150015011200330003001

Browsing in MM Databases

Definition: Browsing is a loose process, which usually requires continuous interaction and feedback from the user; it is therefore a kind of free navigation and exploration among the items of a database.

Yet browsing does not lack a purpose: it is to access a set of items in an efficient way, even though the definition of the set may not be clear, or may be rather vague. So it is the browsing algorithm's responsibility to organize the database in such a way that the "unknown" parameters of any browsing action can be resolved as efficiently as possible. For guiding the user, it is essential to provide an organized (perhaps hierarchical) map of the entire database along with the current status of the user (e.g. a "You are here!" sign) during the browsing process.

HCT Browsing

• Supports 3 types of navigation:
  o Inter-level
  o Inter-cellular
  o Random Access
• The user is guided at each level by the nucleus items, and several hierarchic levels of summarization help the user to have a "mental picture" of the entire database.
• Each inter-cellular trip is a "summarisation" of the database, and the level of summarisation is simply the "height" of the navigation (i.e. the owner level).

HCT Browsing in MUVIS: MBrowser

[Figure: MBrowser user interface, with the nucleus item, cell items, level controls, cell controls, HCT info, cell MST info and item navigator buttons (Prev, Next).]

HCT Browsing (cont.)

[Figure: two browsing examples across Level 3, Level 2, Level 1 and Level 0.]

Conclusion

• HCT is a dynamic, parameter independent and flexible cell (node) sized indexing structure, which is optimized to achieve as focused cells as possible using the sub-optimum visual and aural descriptors.
• By means of the flexible cell size property, one cell (or a minimum number of cells) is used to store a group of similar items, which in effect reduces the degradations caused by the "crowd effect" within the HCT body.
• During their lifetime the cells are under the close surveillance of their levels in order to enhance compactness, using mitosis operations whenever necessary to get rid of the dissimilar item(s) in the cell. Furthermore, for item insertions, an optimum cell search technique (Pre-emptive) is used to find the most suitable (similar) cell in each level.
• HCT is also intrinsically dynamic, meaning that the cell and level parameters and primitives are subject to continuous update operations to provide the most reliable environment. For example, a cell nucleus item is changed whenever a better candidate is available, and once a new nucleus item is assigned, its owner cell in the upper level is found by a cell search instead of reusing the old one's owner cell. Such dynamic internal behavior keeps the HCT body intact by preventing the potential sources of corruption.

Conclusion (cont.)

• By means of the MST within each cell, the optimum nucleus item can be assigned whenever necessary and at no cost. Furthermore, optimum split management can be done when the mitosis operation is performed (again at no cost). Most important of all, the MST provides a reliable compactness measure via "cell similarity" for any item, instead of relying only on a single (nucleus) item. In this way a better judgment can be made on whether or not a particular item is suitable for a mature cell.
• HCT is particularly designed to work with PQ in order to provide the earliest possible retrievals of the relevant items.
• Finally, the HCT indexing body can be used for efficient browsing and navigation among the database items. The user is guided at each level by the nucleus items, and several hierarchic levels of summarization help the user to have a "mental picture" of the entire database.
• HCT (with PQ) is freely available within MUVIS v1.7. Check: http://muvis.cs.tut.fi

The Future Work

• Optimisations over HCT:
  o Speed
  o Better scheme for compactness
  o Improved Fitness Check.
• HCT-based implementations:
  o Video Summarisation
  o Key-Framing for Audio Indexing.
  o Relevance Feedback (user feedback during indexing and retrieval (via PQ) phases)
• Other Potential Uses:
  o Testing the efficiency of new FeX/AFeX methods.
  o Feature dimension reduction techniques, e.g. PCA.