8/9/2019 Association rule mining to Remotely sensed data
Presented by
Madhusmita Sahu
(CSE, 950014)
Contents
• Introduction
• Apriori Algorithm
• Mining Rules to Imagery Data
  - Problem definition
  - Partitioning quantitative attributes
  - Finding large itemsets from imagery data
• New pruning techniques for fast data mining
  - Technique one
  - Technique two
• An example of applying the new algorithm
• Conclusion
• References
REMOTE SENSING
Remote sensing is the science of acquiring information about the Earth's surface without actually being in contact with it.
• It works by recording reflected energy.
• Images are collected in multiple bands of the electromagnetic spectrum.
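Association Rule Mining
• Associations
• Simple rules in categorical data
• Sample applications
  - Market Basket Analysis
      Buys(Milk) ⇒ Buys(Eggs)
  - Transaction Processing
      Income(Hi) & Single(Y) ⇒ Owns(Computer)
• Search for Strong Rules
  - Support: $\mathrm{support}(A \Rightarrow B) = P(A \cup B)$, the fraction of transactions that contain every item of both A and B
  - Confidence: $\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = P(A \cup B) / P(A)$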
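The two measures above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `support` and `confidence` and the toy `baskets` data are assumptions made for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """P(consequent | antecedent), estimated from the transactions."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market-basket data, used only for illustration.
baskets = [frozenset(b) for b in (
    {"milk", "eggs"},
    {"milk", "eggs", "bread"},
    {"milk"},
    {"bread"},
)]

# Rule: Buys(Milk) => Buys(Eggs)
rule_support = support(baskets, {"milk", "eggs"})
rule_confidence = confidence(baskets, {"milk"}, {"eggs"})
```

Here two of the four baskets contain both milk and eggs, and two of the three baskets with milk also contain eggs.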
The Apriori Algorithm: Pseudo-code
• Join step: C_k is generated by joining L_{k-1} with itself.
• Prune step: any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset.
• Pseudo-code:
    C_k: candidate itemsets of size k
    L_k: frequent itemsets of size k

    L_1 = {frequent items};
    for (k = 1; L_k != ∅; k++) do begin
        C_{k+1} = candidates generated from L_k;
        for each transaction t in database do
            increment the count of all candidates in C_{k+1} that are contained in t;
        L_{k+1} = candidates in C_{k+1} with min_support;
    end
    return ∪_k L_k;
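The pseudo-code above can be turned into a short runnable Python sketch. It is an illustrative version only: the function name `apriori`, the use of an absolute support count instead of a percentage, and the toy transactions are assumptions, and no attempt is made at efficiency.

```python
from itertools import combinations


def apriori(transactions, min_support_count):
    """Return {frequent itemset: support count} for all itemset sizes."""
    transactions = [frozenset(t) for t in transactions]

    # L1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support_count}
    frequent = dict(current)

    k = 1
    while current:
        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k + 1:
                continue
            # Prune step: every k-subset of a candidate must be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k)):
                candidates.add(union)

        # Count the candidates contained in each transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: n for s, n in counts.items() if n >= min_support_count}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical transactions, used only for illustration.
demo = apriori(
    [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
    min_support_count=3,
)
```

With these five transactions and a minimum count of 3, every single item and every pair is frequent, while {a, b, c} appears only twice and is dropped.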
MINING ASSOCIATION RULES FROM IMAGERY DATA
• Problem definition
• Partitioning Quantitative Attributes
• Finding Large Itemsets from Imagery Data
NEW PRUNING TECHNIQUES FOR FAST DATA MINING
• Technique one
  Lemma 1: A pixel value cannot belong to two different intervals from the same band.
  Lemma 2: Any combination of k intervals (k > 1) from the same band has support zero.
C_k: candidate k-itemsets
L_k: large k-itemsets
*: the concatenation operation
|C_k|: number of itemsets in the candidate k-itemsets
R_j: number of intervals in band j
|L_k|: number of itemsets in the large k-itemsets

1. According to the Apriori algorithm: Apriori uses L_1 * L_1 to generate the candidate set of 2-itemsets C_2.
   $|C_2|_{\mathrm{apriori}} = \frac{|L_1| (|L_1| - 1)}{2} = \binom{|L_1|}{2}$
2. According to the new algorithm: assume $|L_1| = R_1 + R_2 + \dots + R_n$.
   $|C_2|_{\mathrm{new}} = R_1 (R_2 + R_3 + \dots + R_n) + R_2 (R_3 + R_4 + \dots + R_n) + \dots + R_{n-2} (R_{n-1} + R_n) + R_{n-1} R_n = \sum_{i=1}^{n-1} R_i \sum_{j=i+1}^{n} R_j$
The number of candidate 2-itemsets generated by the new algorithm is much smaller than the number generated by Apriori.
$|C_2|_{\mathrm{prune1}} = |C_2|_{\mathrm{apriori}} - |C_2|_{\mathrm{new}} = \sum_{j=1}^{n} \frac{R_j (R_j - 1)}{2}$
When n is large and R_j is large, $|C_2|_{\mathrm{prune1}}$ becomes an extremely large number.
For example: if the imagery data has 8 bands and each band has 16 intervals, the number of pruned candidate 2-itemsets is 8 × 16 × (16 - 1) / 2 = 960. This sharply reduces the processing cost.
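The counting argument for technique one can be checked numerically. The sketch below is an illustration under the slides' assumption that every interval of every band is a large 1-itemset; the function names `c2_apriori`, `c2_new` and `c2_prune1` are made up for this example.

```python
from itertools import combinations


def c2_apriori(intervals_per_band):
    """Candidate 2-itemsets when every pair of large 1-itemsets is joined."""
    n_items = sum(intervals_per_band)
    return n_items * (n_items - 1) // 2


def c2_new(intervals_per_band):
    """Candidate 2-itemsets when both intervals must come from different bands."""
    return sum(ri * rj for ri, rj in combinations(intervals_per_band, 2))


def c2_prune1(intervals_per_band):
    """Pairs pruned by technique one: two intervals from the same band."""
    return sum(r * (r - 1) // 2 for r in intervals_per_band)


# The example from the text: 8 bands with 16 intervals each.
bands = [16] * 8
apriori_count = c2_apriori(bands)
new_count = c2_new(bands)
pruned = c2_prune1(bands)
```

For this input the difference between the Apriori count and the new count equals the pruned count of 960 given in the text.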
Technique two
• Allowing user interaction with the mining engine during the mining process, and using the user's prior knowledge, helps to speed up the mining algorithms by restricting the search space.
• Consider only one band, bandN, in the output. The association rule has the form band1 ∧ ... ∧ band(N-1) ⇒ bandN. The number of candidate 2-itemsets is
  $|C_2|_{\mathrm{new}} = \sum_{i=1}^{N-1} R_i \sum_{j=i+1}^{N} R_j$
• We are not interested in itemsets that do not contain bandN, so we prune every candidate itemset in which none of the intervals is chosen from bandN. The number of pruned candidate 2-itemsets is
  $|C_2|_{\mathrm{prune2}} = \sum_{i=1}^{N-2} R_i \sum_{j=i+1}^{N-1} R_j$
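• Apply the new pruning technique described in technique one:
  $|C_2|_{\mathrm{prune1}} = \sum_{j=1}^{N} \frac{R_j (R_j - 1)}{2}$
• The total number of pruned candidate 2-itemsets is
  $|C_2|_{\mathrm{prune}} = |C_2|_{\mathrm{prune1}} + |C_2|_{\mathrm{prune2}} = \sum_{j=1}^{N} \frac{R_j (R_j - 1)}{2} + \sum_{i=1}^{N-2} R_i \sum_{j=i+1}^{N-1} R_j$
• The remaining steps are the same as in the Apriori algorithm.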
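The combined effect of the two techniques for a single output band can be sketched as follows. This is an illustrative calculation, not code from the source: the function name `prune_counts` is hypothetical, the last entry of the list is assumed to be the output band, and every interval is assumed to be a large 1-itemset.

```python
from itertools import combinations


def prune_counts(intervals_per_band):
    """Return (prune1, prune2, remaining) candidate 2-itemset counts.

    The last band in the list is treated as the output band (assumption).
    """
    # Technique one: both intervals come from the same band.
    prune1 = sum(r * (r - 1) // 2 for r in intervals_per_band)

    # Technique two: cross-band pairs that do not involve the output band.
    input_bands = intervals_per_band[:-1]
    prune2 = sum(ri * rj for ri, rj in combinations(input_bands, 2))

    n_items = sum(intervals_per_band)
    apriori_total = n_items * (n_items - 1) // 2
    remaining = apriori_total - prune1 - prune2
    return prune1, prune2, remaining


# 8 bands with 16 intervals each, the last band being the output band.
prune1, prune2, remaining = prune_counts([16] * 8)
```

The candidates that remain are exactly the pairs of one output-band interval with one interval from another band, here 16 × (7 × 16) = 1792.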
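• If there are (N-M) bands in the output, the rule has the form
  band1 ∧ ... ∧ bandM ⇒ band(M+1) ∧ ... ∧ bandN
  The total number of pruned candidate 2-itemsets is
  $|C_2|_{\mathrm{prune}} = |C_2|_{\mathrm{prune1}} + |C_2|_{\mathrm{prune2}} = \sum_{j=1}^{N} \frac{R_j (R_j - 1)}{2} + \sum_{i=1}^{M-1} R_i \sum_{j=i+1}^{M} R_j$
  The remaining steps are the same as in the Apriori algorithm.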
Steps
• Step 1: Choose one of the partition methods (equal depth, uneven depth, or discontinuous partition) to determine the intervals.
• Step 2: From the large 1-itemsets, apply the new pruning techniques (technique one and technique two) to generate the candidate 2-itemsets.
• Step 3: Apply the remaining steps of the Apriori algorithm.
An example of applying the new algorithm
Assume the user selects equal depth partitioning, with diameter two for band1 and band4 and diameter three for band2 and band3.

Pixel  Band1  Band2  Band3  Band4
1      40     140    200    240
2      50     130    210    250
3      45     135    210    190
4      100    180    50     100
5      110    170    40     120

Intervals for band1 and band4 (diameter two):
       [0,63]  [64,127]  [128,191]  [192,255]
band1  b11     b12       b13        b14
band4  b41     b42       b43        b44

Intervals for band2 and band3 (diameter three):
       [0,31]  [32,63]  [64,95]  [96,127]  [128,159]  [160,191]  [192,223]  [224,255]
band2  b21     b22      b23      b24       b25        b26        b27        b28
band3  b31     b32      b33      b34       b35        b36        b37        b38
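The partitioning step for this example can be sketched in Python. This is a minimal illustration that assumes 8-bit pixel values (0 to 255) split into equally sized intervals, as in the tables above; the names `interval_label` and `INTERVALS_PER_BAND` are hypothetical.

```python
# Number of intervals per band, read off the interval tables (assumption:
# diameter two gives 4 intervals, diameter three gives 8 intervals).
INTERVALS_PER_BAND = {1: 4, 2: 8, 3: 8, 4: 4}


def interval_label(band, value, max_value=255):
    """Map a pixel value to its interval label, e.g. band 2, value 140 -> 'b25'."""
    width = (max_value + 1) // INTERVALS_PER_BAND[band]
    index = value // width + 1  # interval indices start at 1
    return f"b{band}{index}"


# Pixel values from the example table: (band1, band2, band3, band4).
pixels = [
    (40, 140, 200, 240),
    (50, 130, 210, 250),
    (45, 135, 210, 190),
    (100, 180, 50, 100),
    (110, 170, 40, 120),
]

# One transaction (set of interval labels) per pixel.
transactions = [
    frozenset(interval_label(band, value) for band, value in enumerate(px, start=1))
    for px in pixels
]
```

Each pixel yields exactly one interval per band, which is what Lemma 1 relies on.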
An example of partitioning the values into intervals
• After selecting the partition method, map each value in the pixel table into its interval.

Pixel  b11  b12  b13  b14  b21  b25  b26  b28  b31  b32  b37  b38  b41  b42  b43  b44
1      1    0    0    0    0    1    0    0    0    0    1    0    0    0    0    1
2      1    0    0    0    0    1    0    0    0    0    1    0    0    0    0    1
3      1    0    0    0    0    1    0    0    0    0    1    0    0    0    1    0
4      0    1    0    0    0    0    1    0    0    1    0    0    0    1    0    0
5      0    1    0    0    0    0    1    0    0    1    0    0    0    1    0    0
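The binary table can be produced from the per-pixel interval labels with a short sketch like the one below. It is illustrative only: `COLUMNS` lists just the sixteen columns shown on the slide, and `pixel_intervals` restates the mapping of the five example pixels.

```python
# Columns in the order shown on the slide (a subset of all 24 intervals).
COLUMNS = [
    "b11", "b12", "b13", "b14",
    "b21", "b25", "b26", "b28",
    "b31", "b32", "b37", "b38",
    "b41", "b42", "b43", "b44",
]

# Interval labels of the five example pixels.
pixel_intervals = [
    {"b11", "b25", "b37", "b44"},
    {"b11", "b25", "b37", "b44"},
    {"b11", "b25", "b37", "b43"},
    {"b12", "b26", "b32", "b42"},
    {"b12", "b26", "b32", "b42"},
]

# One 0/1 row per pixel: 1 if the pixel falls in that interval.
rows = [[1 if col in labels else 0 for col in COLUMNS] for labels in pixel_intervals]

# Column sums give the support count of each interval (1-itemset).
column_counts = dict(zip(COLUMNS, (sum(col) for col in zip(*rows))))
```

Every row sums to 4, one interval per band, and the column sums are the support counts used in the next step.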
• Apply the new pruning techniques for candidate 2-itemset generation. Assume minsup = 40% and minconf = 60%.
• Candidate 1-itemsets:
  {b11, b12, b13, b14, b21, b22, b23, b24, b25, b26, b27, b28, b31, b32, b33, b34, b35, b36, b37, b38, b41, b42, b43, b44}
• Large 1-itemsets:
  {b11(3), b12(2), b25(3), b26(2), b32(2), b37(3), b42(2), b44(2)}
• Candidate 2-itemsets (with band4 as the output band):
  {{b42,b11}, {b42,b12}, {b42,b25}, {b42,b26}, {b42,b32}, {b42,b37}, {b44,b11}, {b44,b12}, {b44,b25}, {b44,b26}, {b44,b32}, {b44,b37}}
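The generation of large 1-itemsets and pruned candidate 2-itemsets for this example can be sketched as below. It is an illustration under stated assumptions: band 4 is taken as the output band, the band number is read from the second character of each label, and names such as `band_of` and `OUTPUT_BAND` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Interval labels of the five example pixels.
transactions = [
    {"b11", "b25", "b37", "b44"},
    {"b11", "b25", "b37", "b44"},
    {"b11", "b25", "b37", "b43"},
    {"b12", "b26", "b32", "b42"},
    {"b12", "b26", "b32", "b42"},
]
MIN_SUPPORT = 0.4
OUTPUT_BAND = 4  # assumption: band4 is the band of interest


def band_of(label):
    """Band number encoded in a label such as 'b25' (band 2)."""
    return int(label[1])


# Large 1-itemsets: intervals whose support reaches minsup.
counts = Counter(item for t in transactions for item in t)
large_1 = sorted(
    item for item, c in counts.items() if c / len(transactions) >= MIN_SUPPORT
)

# Candidate 2-itemsets after both pruning techniques:
#  - technique one: the two intervals must come from different bands;
#  - technique two: one of them must come from the output band.
candidates_2 = [
    (a, b)
    for a, b in combinations(large_1, 2)
    if band_of(a) != band_of(b) and OUTPUT_BAND in (band_of(a), band_of(b))
]
```

Of the 28 pairs Apriori would form from the 8 large 1-itemsets, 12 remain after pruning.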
• Applying pruning technique one: $|C_2|_{\mathrm{prune1}}$ = 1 + 1 + 1 + 1 = 4
• Applying pruning technique two, with band4 as the output band: $|C_2|_{\mathrm{prune2}}$ = 2 × (2 + 2) + 2 × 2 = 12
• The total number of pruned candidate 2-itemsets is 12 + 4 = 16.
• Applying the Apriori algorithm, the number of candidate 2-itemsets is $|C_2|_{\mathrm{apriori}}$ = (8 × 7) / 2 = 28.
• The percentage of pruning is 16/28 ≈ 57%, so the execution efficiency of the mining process is improved.
• The remaining steps are the same as in the Apriori algorithm.
Conclusion
• In this seminar, we defined a new data mining problem: mining association rules from imagery data, with its application in precision agriculture.
• Since the efficiency of a mining algorithm is a very important issue in data mining, we proposed two simple and effective pruning techniques for candidate 2-itemset generation.
• By exploiting the nature of the problem and the characteristics of imagery data, we can prune a significant number of unnecessary candidate itemsets during the very early phase of the mining process.
References
Jianning Dong, William Perrizo, Qin Ding and Jingkai Zhou, "Association rule mining to Remotely sensed data," North Dakota State University, Fargo, ND 58105.
Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques (Hardcover, Mar 2006).
J. Zhang, H. Wynne, M. L. Lee, "Image mining: issues, frameworks, and techniques," in Proceedings of 2nd International Workshop on Multimedia Data Mining, San Francisco, Aug 2001, pp. 13-20.
J. Li and R. M. Narayanan, "Integrated spectral and spatial information mining in remote sensing," IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 3, pp. 673-685, March 2004.
Thank You!!