Nearest Neighbor Algorithm
Zaffar Ahmed Shaikh
Topics
• Introduction – Memory-based algorithms
• K-nearest neighbor (KNN) algorithm
• How KNN works?
• KNN Example
• Different types of KNN
Introduction
• Memory-based algorithms use the entire user-item database to generate a prediction. They find a set of users, known as neighbors, that have a history of agreeing with the target user. Once a neighborhood of users is formed, the preferences of the neighbors are combined to produce a prediction or top-K recommendation for the target user.
K-nearest neighbor (KNN)
• The nearest neighbor algorithm measures the distance dE(Xi, Xj) between a query point Xi and each training sample Xj, then classifies the new object by the majority category Y among its K nearest training samples.
Query point Xi = (x1, x2, x3, ..., xn)
Training sample Xj = (x1, x2, x3, ..., xn)
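$$\mathrm{Dist}(c_1, c_2) = \sqrt{\sum_{i=1}^{N} \bigl(\mathrm{attr}_i(c_1) - \mathrm{attr}_i(c_2)\bigr)^2}$$

$$k\text{-NearestNeighbors} = k\text{-MIN}\bigl(\mathrm{Dist}(c_i, c_{test})\bigr)$$

$$\mathrm{prediction}_{test} = \frac{1}{k}\sum_{i=1}^{k} \mathrm{class}_i \quad \left(\text{or } \frac{1}{k}\sum_{i=1}^{k} \mathrm{value}_i\right)$$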
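The distance and averaging formulas can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function names `euclidean_distance` and `mean_prediction` and the sample inputs are assumptions made for this example.

```python
import math


def euclidean_distance(c1, c2):
    """Dist(c1, c2): square root of the summed squared attribute differences."""
    if len(c1) != len(c2):
        raise ValueError("both cases need the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def mean_prediction(neighbor_values):
    """prediction_test for numeric targets: the mean of the k neighbor values."""
    if not neighbor_values:
        raise ValueError("at least one neighbor value is required")
    return sum(neighbor_values) / len(neighbor_values)


# Illustrative inputs only (not taken from the slides).
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
print(mean_prediction([2.0, 4.0, 6.0]))    # 4.0
```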
How KNN works?
1. Determine K (no. of nearest neighbors)
2. Calculate distance (Euclidean, Manhattan)
3. Determine K-minimum distance neighbors
4. Gather category Y values of nearest neighbors
5. Use simple majority of nearest neighbors to predict value of query instance
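The five steps above map onto a short Python function. This is a sketch under assumptions: the name `knn_classify`, its parameters, and the toy data are hypothetical, and ties between categories are resolved by whichever category was seen first among the nearest neighbors, a choice the slides do not specify.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_classify(query, samples, labels, k, distance=euclidean):
    # Step 1: K is chosen by the caller; it must fit the training set.
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    # Step 2: distance from the query to every training sample.
    scored = [(distance(query, s), label) for s, label in zip(samples, labels)]
    # Step 3: keep the K samples with the smallest distance.
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    # Step 4: gather the category Y of each nearest neighbor.
    votes = Counter(label for _, label in nearest)
    # Step 5: simple majority decides the predicted category.
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration.
samples = [(1, 1), (1, 2), (8, 8), (9, 8)]
labels = ["a", "a", "b", "b"]
print(knn_classify((2, 1), samples, labels, k=3))                      # a
print(knn_classify((2, 1), samples, labels, k=3, distance=manhattan))  # a
```

Passing the distance as a parameter keeps the Euclidean and Manhattan variants from step 2 interchangeable.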
KNN Example
• Predict who will win today's cricket match between India and Pakistan, based on user ratings and the results of previous matches played between the two teams.
Matches/Teams  Who will win? Pakistan?  Who will win? India?  Neutral  Y (Winner)
1              7                        2                     1        +
2              3                        5                     2        -
3              2                        6                     2        +
4              6                        3                     1        +
5              4                        4                     2        -
6              7                        2                     1        -
7              2                        3                     4        +
8              4                        3                     3        ?
1. Determine K
1. Determine value of K
Suppose K = 3
2. Calculate distance
Coordinates of query instance are (4, 3, 3)
Coordinates of training instance (1) are (7, 2, 1)
D = SQRT((7-4)^2 + (2-3)^2 + (1-3)^2) = SQRT(14) = 3.741657
2. Calculate distance
Matches/Teams  Who will win? Pakistan?  Who will win? India?  Neutral  Y (Winner)  distance
1              7                        2                     1        +           3.741657
2              3                        5                     2        -           2.44949
3              2                        6                     2        +           3.741657
4              6                        3                     1        +           2.828427
5              4                        4                     2        -           1.414214
6              7                        2                     1        -           3.741657
7              2                        3                     4        +           2.236068
8              4                        3                     3        ?
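The distance column can be reproduced with a few lines of Python. This is a sketch that assumes Python 3.8 or later for `math.dist`; the variable names are chosen for this example.

```python
import math

# Query instance (match 8) and the seven training matches:
# (votes for Pakistan, votes for India, neutral votes).
query = (4, 3, 3)
training = {
    1: (7, 2, 1),
    2: (3, 5, 2),
    3: (2, 6, 2),
    4: (6, 3, 1),
    5: (4, 4, 2),
    6: (7, 2, 1),
    7: (2, 3, 4),
}

# Euclidean distance from each training match to the query instance.
distances = {match: math.dist(row, query) for match, row in training.items()}

for match, d in distances.items():
    print(match, round(d, 6))
```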
3. Determine K-minimum distance neighbors
K = 3
Matches/Teams  Who will win? Pakistan?  Who will win? India?  Neutral  Y (Winner)  distance  Rank
1              7                        2                     1        +           3.741657
2              3                        5                     2        -           2.44949   (3)
3              2                        6                     2        +           3.741657
4              6                        3                     1        +           2.828427
5              4                        4                     2        -           1.414214  (1)
6              7                        2                     1        -           3.741657
7              2                        3                     4        +           2.236068  (2)
8              4                        3                     3        ?
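Picking the K smallest distances can be sketched with the standard-library `heapq.nsmallest`. The distances below are copied from the table; the variable names are assumptions made for this example.

```python
import heapq

# Distance of each training match to the query instance (from the table).
distances = {
    1: 3.741657,
    2: 2.44949,
    3: 3.741657,
    4: 2.828427,
    5: 1.414214,
    6: 3.741657,
    7: 2.236068,
}
k = 3

# The k (match, distance) pairs with the smallest distance, closest first.
nearest = heapq.nsmallest(k, distances.items(), key=lambda item: item[1])

for rank, (match, d) in enumerate(nearest, start=1):
    print(f"rank {rank}: match {match}, distance {d}")
```

For this data the ranking is match 5, then match 7, then match 2, which agrees with the (1), (2), (3) marks in the table.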
4. Gather category Y values of nearest neighbors
Matches/Teams  Who will win? Pakistan?  Who will win? India?  Neutral  Y (Winner)
2              3                        5                     2        -
5              4                        4                     2        -
7              2                        3                     4        +
8              4                        3                     3        ?
5. Use simple majority of nearest neighbors to predict value of query instance
• Among the 3 nearest neighbors, India won 2 matches (two "-" signs) and Pakistan won 1 match (one "+" sign)
• By simple majority, we predict that India will win today's match
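The majority vote can be written out with `collections.Counter`. This sketch hard-codes the three nearest neighbors from the example; the sign-to-team mapping follows the slide ("-" for India, "+" for Pakistan), and the variable names are assumptions.

```python
from collections import Counter

# Y (Winner) of the three nearest neighbors: matches 5, 7 and 2.
neighbor_labels = {5: "-", 7: "+", 2: "-"}
sign_to_team = {"-": "India", "+": "Pakistan"}

# Count the signs and take the most frequent one.
votes = Counter(neighbor_labels.values())
winner_sign, count = votes.most_common(1)[0]
team = sign_to_team[winner_sign]

print(f"{team} is predicted to win ({count} of {len(neighbor_labels)} votes)")
```

With an odd K and two categories a tie cannot occur, which is one reason K = 3 is a convenient choice here.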
Matches/Teams  Who will win? Pakistan?  Who will win? India?  Neutral  Y (Winner)
1              7                        2                     1        +
2              3                        5                     2        -
3              2                        6                     2        +
4              6                        3                     1        +
5              4                        4                     2        -
6              7                        2                     1        -
7              2                        3                     4        +
8              4                        3                     3        (-)
Different types of KNN
• KNN for Classification
• KNN for Prediction
• KNN for Smoothing
Thank you