An Efficient Algorithm for Incremental Mining of Association Rules

Chin-Chen Chang, Yu-Chiang Li, Jung-San Lee

RIDE-SDMA’05

Speaker: 董原賓

Advisor: 柯佳伶

Introduction

Previous incremental mining algorithms:

FUP (Fast Update Algorithm)

FUP2

Negative border

※ All of them have to rescan the original database.

Problem

Publication-like databases

Examples: publication databases, web log records, etc.

The original database is normally much larger than the incremental database.

Solution

NFUP (New Fast Update Algorithm)

Definition

DB: the original database

db: the set of newly added transactions

DB+: DB + db

n, Pn: db is divided into n partitions, db = P1 ∪ P2 ∪ … ∪ Pn-1 ∪ Pn

db^{m,n} = Pm ∪ Pm+1 ∪ … ∪ Pn-1 ∪ Pn
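As a minimal sketch of this notation, assume each transaction is a Python set and each partition is a list of transactions. The helper name `suffix_db` and the toy data are illustrative and do not come from the slides. Under those assumptions, the suffix database formed by Pm through Pn is simply the concatenation of those partitions:

```python
def suffix_db(partitions, m):
    """Return all transactions in partitions P_m .. P_n (m is 1-based)."""
    return [t for part in partitions[m - 1:] for t in part]


# Hypothetical incremental database split into n = 3 partitions.
partitions = [
    [{"A", "B"}, {"B", "C"}],      # P1 (oldest)
    [{"A", "C"}],                  # P2
    [{"A", "B", "C"}, {"C"}],      # P3 (latest)
]

db_2_3 = suffix_db(partitions, 2)  # P2 and P3 together
db_1_3 = suffix_db(partitions, 1)  # the whole incremental database db
```

Because the partitions are ordered by time, a larger m always yields a more recent and smaller slice of db.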

Definition

α set: itemsets that are frequent in DB+

β set: itemsets that are frequent in db^{m,n} (m ≤ n) but infrequent in db^{m-1,n}

γ set: itemsets that are frequent in db^{m,m} but infrequent in db^{m+1,n}

X.count: the occurrence count of itemset X

X.start: the number of the partition in which X becomes frequent

X.type: one of the three types α, β, and γ

FUP (Fast Update Algorithm)

In case 2, the itemset's updated count is easily calculated.

In case 3, FUP needs to rescan the original database.
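The case numbers appear to refer to FUP's usual four-way comparison of an itemset's status in DB and in db, which the slide's missing figure presumably showed. The sketch below works under that assumption; the function names and the sample numbers in the usage note are illustrative only.

```python
def fup_case(frequent_in_DB, frequent_in_db):
    """Classify an itemset by its status in the original and incremental data."""
    if frequent_in_DB and frequent_in_db:
        return 1  # frequent in both: it stays frequent
    if frequent_in_DB:
        return 2  # count in DB is already stored, so only db must be counted
    if frequent_in_db:
        return 3  # count in DB was never stored, so DB must be rescanned
    return 4      # infrequent in both: it stays infrequent


def case2_still_frequent(count_DB, count_db, size_DB, size_db, min_sup):
    """Case 2 check: add the stored DB count to the count found in db."""
    return count_DB + count_db >= min_sup * (size_DB + size_db)
```

For instance, with `min_sup = 0.5`, an itemset stored with count 60 in a DB of 100 transactions that occurs once in a db of 10 transactions stays frequent, since 61 >= 55. Case 3 is the expensive one that NFUP is designed to avoid.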

NFUP (New Fast Update Algorithm)

A backward method that only requires scanning the incremental database.

An itemset that is frequent in the incremental database is also important, even if it is infrequent in the updated database.

The incremental database (db) is partitioned by time interval.

NFUP

The set of frequent itemsets of DB is known in advance.

NFUP scans the partitions backward: the last partition is scanned first.

Within each partition, the process works like Apriori.

Scan from Pn to P1 and find the α, β, and γ itemsets in db.

After P1 is scanned, the occurrence counts are accumulated with those of the itemsets of DB.

The latest partition is scanned first: initialize the variables and accumulate the occurrence counts.

Still frequent in Pm: accumulate the count.

Still frequent in db^{m,n}: accumulate the count.

Frequent only in db^{m+1,n}: remove the itemset from the α set and add it to the β set.

In no set yet and frequent in Pm: check whether Pm is the latest partition. Yes: add it to the α set. No: add it to the γ set.
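The backward scan can be sketched as follows. This is a simplified illustration, not the authors' implementation. It assumes the per-partition itemset counts have already been produced by the Apriori-like pass, treats an itemset missing from a partition's counts as having count 0, and leaves itemsets already in the β or γ set untouched, because the slides do not fully spell out how those are updated. The names `Entry` and `nfup_scan` are hypothetical. The counts are the ones listed on the example slides, with three transactions per partition.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    start: int   # X.start: partition number recorded for the itemset
    count: int   # X.count: accumulated occurrence count


def nfup_scan(partition_counts, partition_sizes, min_sup):
    """Scan partitions P_n .. P_1 and sort itemsets into alpha, beta, gamma.

    partition_counts[m - 1] maps an itemset to its count in P_m.
    """
    n = len(partition_counts)
    alpha, beta, gamma = {}, {}, {}
    scanned = 0  # number of transactions in P_m .. P_n so far
    for m in range(n, 0, -1):
        counts = partition_counts[m - 1]
        size = partition_sizes[m - 1]
        scanned += size
        # Itemsets already in alpha: accumulate, or demote to beta.
        for x in list(alpha):
            total = alpha[x].count + counts.get(x, 0)  # assumption: missing = 0
            if total >= min_sup * scanned:
                alpha[x] = Entry(m, total)
            else:
                beta[x] = alpha.pop(x)  # keeps its old start and count
        # Itemsets not in any set yet and frequent within P_m.
        for x, c in counts.items():
            known = x in alpha or x in beta or x in gamma
            if not known and c >= min_sup * size:
                target = alpha if m == n else gamma
                target[x] = Entry(m, c)
    return alpha, beta, gamma


# Counts taken from the example slides (min sup = 50%).
p2 = {"A": 2, "B": 2, "C": 3, "D": 1, "E": 1, "F": 2,
      "AB": 2, "AC": 2, "AF": 1, "BC": 2, "BF": 1, "CF": 2, "ABC": 2}
p1 = {"A": 1, "B": 3, "C": 2, "D": 1, "E": 3, "F": 0,
      "AB": 1, "AC": 0, "BC": 2, "BE": 3, "CE": 2}
alpha, beta, gamma = nfup_scan([p1, p2], [3, 3], 0.5)
```

On these counts the sketch ends with A, B, C, AB, and BC in the α set; F, AC, CF, and ABC in the β set; and E, BE, and CE in the γ set, which agrees with the tables in the example.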

Example

Min sup = 50%

Scan P2: 1-itemsets

{A: 2} {B: 2} {C: 3} {D: 1} {E: 1} {F: 2}

Threshold: 3 × 0.5 = 1.5

For each itemset, check whether it belongs to the α set. Otherwise, check that it does not belong to any set and that its count is >= 1.5, then check whether P2 is the latest partition: if yes, add it to the α set; if no, add it to the γ set.

Run Apriori-gen and scan P2: 2-itemsets

{AB: 2} {AC: 2} {AF: 1} {BC: 2} {BF: 1} {CF: 2}

The same checks are applied to the 2-itemsets.

Scan P2: 3-itemsets

{ABC: 2}

α set after scanning P2 (the β and γ sets are empty):

Itemset  Start  Count
{A}      2      2
{B}      2      2
{C}      2      3
{F}      2      2
{AB}     2      2
{AC}     2      2
{BC}     2      2
{CF}     2      2
{ABC}    2      2
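Apriori-gen is the candidate-generation step of Apriori: it joins frequent k-itemsets that share their first k-1 items and then prunes any candidate that has an infrequent k-subset. A minimal sketch, assuming itemsets are represented as sorted tuples (the function name `apriori_gen` and this representation are illustrative choices):

```python
from itertools import combinations


def apriori_gen(frequent_k):
    """Generate candidate (k+1)-itemsets from frequent k-itemsets.

    Each itemset is a sorted tuple of items.
    """
    frequent_k = set(frequent_k)
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            # Join step: same first k-1 items, last items in increasing order.
            if a[:-1] == b[:-1] and a[-1] < b[-1]:
                cand = a + (b[-1],)
                # Prune step: every k-subset must itself be frequent.
                if all(s in frequent_k for s in combinations(cand, len(a))):
                    candidates.add(cand)
    return candidates


# Frequent itemsets found in P2 in the example.
c2 = apriori_gen({("A",), ("B",), ("C",), ("F",)})
c3 = apriori_gen({("A", "B"), ("A", "C"), ("B", "C"), ("C", "F")})
```

With the frequent 1-itemsets A, B, C, and F of P2, this yields the six 2-itemset candidates counted in the example (AB, AC, AF, BC, BF, CF), and the frequent 2-itemsets yield ABC as the only 3-itemset candidate.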

Example

Min sup = 50%

Scan P1: 1-itemsets

{A: 1} {B: 3} {C: 2} {D: 1} {E: 3} {F: 0}

Threshold: 3 × 0.5 = 1.5

α set carried over from P2:

Itemset  Start  Count
{A}      2      2
{B}      2      2
{C}      2      3
{F}      2      2
{AB}     2      2
{AC}     2      2
{BC}     2      2
{CF}     2      2
{ABC}    2      2

For each itemset, check whether it belongs to the α set. If yes, accumulate its count; if the count < s × |db^{m,n}| = 0.5 × 6 = 3, move the itemset to the β set. Otherwise, check that the itemset does not belong to any set and that its count is >= 1.5, then check whether P1 is the latest partition: if yes, add it to the α set; if no, add it to the γ set.

Run Apriori-gen and scan P1: 2-itemsets

{AB: 1} {AC: 0} {BC: 2} {BE: 3} {CE: 2}

The same checks are applied to the 2-itemsets.

β set after scanning P1:

Itemset  Start  Count
{F}      2      2
{AC}     2      2
{CF}     2      2
{ABC}    2      2

γ set after scanning P1:

Itemset  Start  Count
{E}      1      3
{BE}     1      3
{CE}     1      2

Example

α set

Itemset  Start  Count
{A}      1      3
{B}      1      5
{C}      1      5
{AB}     1      3
{BC}     1      4

γ set

Itemset  Start  Count
{E}      1      3
{BE}     1      3
{CE}     1      2

β set

Itemset  Start  Count
{F}      2      2
{AC}     2      2
{CF}     2      2
{ABC}    2      2

{AB} 1 3

{BC} 1 4

{ABC} 2 2

{AE} 0 3

Experiment

Intel Pentium IV 1.5GHz CPU, 640 MB main memory

Microsoft Windows 2000 Professional

Synthetic datasets:
