View
233
Download
0
Tags:
Embed Size (px)
Citation preview
© Eric Xing @ CMU, 2006-2010
Machine LearningMachine Learning
Introduction & Introduction &
Nonparametric ClassifiersNonparametric Classifiers
Eric XingEric Xing
Lecture 1, August 12, 2010
Reading:
© Eric Xing @ CMU, 2006-2010
Where does it come from: http://www.cs.cmu.edu/~epxing/Class/10701/ http://www.cs.cmu.edu/~epxing/Class/10708/
Machine Learning
© Eric Xing @ CMU, 2006-2010
Logistics
Text book Chris Bishop, Pattern Recognition and Machine Learning (required) Tom Mitchell, Machine Learning
David Mackay, Information Theory, Inference, and Learning Algorithms Daphnie Koller and Nir Friedman, Probabilistic Graphical Models
Class resource http://bcmi.sjtu.edu.cn/ds/
Host: Lu Baoliang, Shanghai JiaoTong University Xue Xiangyang, Fudan University
Instructors:
© Eric Xing @ CMU, 2006-2010
Local Hosts and co-instructors
© Eric Xing @ CMU, 2006-2010
Apoptosis + Medicine
What is Learning
Grammatical rulesManufacturing proceduresNatural laws…
Inference
Learning is about seeking a predictive and/or executable understanding of natural/artificial subjects, phenomena, or activities from …
© Eric Xing @ CMU, 2006-2010
Machine Learning
© Eric Xing @ CMU, 2006-2010
Fetching a stapler from inside an office --- the Stanford STAIR robot
© Eric Xing @ CMU, 2006-2010
What is Machine Learning?
Machine Learning seeks to develop theories and computer systems for
representing; classifying, clustering and recognizing; reasoning under uncertainty; predicting; and reacting to …
complex, real world data, based on the system's own experience with data,
and (hopefully) under a unified model or mathematical framework, that
can be formally characterized and analyzed can take into account human prior knowledge can generalize and adapt across data and domains can operate automatically and autonomously and can be interpreted and perceived by human.
© Eric Xing @ CMU, 2006-2010
Where Machine Learning is being used or can be useful?
Speech recognitionSpeech recognition
Information retrievalInformation retrieval
Computer visionComputer vision
Robotic controlRobotic control
PlanningPlanning
GamesGames
EvolutionEvolution
PedigreePedigree
© Eric Xing @ CMU, 2006-2010
Natural language processing and speech recognition
Now most pocket Speech Recognizers or Translators are running on some sort of learning device --- the more you play/use them, the smarter they become!
© Eric Xing @ CMU, 2006-2010
Object Recognition
Behind a security camera, most likely there is a computer that is learning and/or checking!
© Eric Xing @ CMU, 2006-2010
Robotic Control I
The best helicopter pilot is now a computer! it runs a program that learns how to fly and make acrobatic maneuvers by itself! no taped instructions, joysticks, or things like …
A. Ng 2005
© Eric Xing @ CMU, 2006-2010
Text Mining
Reading, digesting, and categorizing a vast text database is too much for human!
We want:
© Eric Xing @ CMU, 2006-2010
Bioinformatics
tgtatcccagcatattttacgtaaaaacaaaacggtaatgcgaacataacttatttattggggcccggaccgcaaaccggccaaacgcgtttgcacccataaaaacataagggcaacaaaaaaattgttaagctgttgtttatttttgcaatcgaaaatcgaaagcgctcaaatagctgcgatcactcgggagcagggtaaagtcgcctcgaaacaggaagctgaagcatcttctataaatacactcaaagcgatcattccgaggcgagtctggttagaaatttacatggactgcaaaaaggtatagccccacaaactgcgtctccacatcgctgcgtttcggcagctaattgccttttagaaattattttcccatttcgagaaactcgtgtgggatgccggatgcggctttcaatcacttctggcccgggatcggattgggtcacattgtctgcgggctctattgtctcgatccgcggatttaaggcgcagttcgcgtgcttagcggtcagaaaggcagagattcggttcggattgatgcgctggcagcagggcacaaagatctaatgactggcaaatcgctacaaataaattaaagtccggcggctaattaatgagcggactgaagccactttggcggagttcattaaccaaaaaacagcagataaacaaaaacggcaaagaaaattgccacagagttgtcacgctttgttgcacaaacatttgtgcagaaaagtgaaaagcttttagccattattaagtttttcctcagctcgctggcagcacttgcgaatgtaaattgaaactgatgttcctcataaatgaaaattaatgtttgctctacgctccaccgaactcgcttgtttgggggattggctggctaatcgcggctagatcccaggcggtataaccttttcgcttcatcagttgtgaaaccagatggctggtgttttggcaacgacagccagcggactcccctcgaacgctctcgaaatcaagtggctttccagccggcccgctgggccgctcgcccactggaccggtattcccaggccaggccacactgtaccgcaccgcataatcctcgccagactcggcgctgataaggcccaatgtcaagaccgcactccgcaggcgtctatttatgccaaggaccgttcttcttcagctttcggctcgagtatttgttgtgccatgttggttacgatgccaatcgcggtacagttatgcaaatgagcagcgaataccgctcactgacaatgaacggcgtcttgtcaaccattcatattcatgctgacattcatattcattcctttggttttttgtcttcgacggactgaaaagtgcggagagaaacccaaaaacagaagcgcgcaaagcgccgttaatatgcgaactcagcgaactcattgaagttatcacaacaccatatccataaccatacccatatccatatcaatatcaatatcgctattattaacgatcatgctctgctgatcaagtattcagcgctgcgctagattcgacagattgaatcgagctcaatagactcaacagactccactcgacagatgcgcaatgccaaggacaattgccgtgtgtaagtggagtaaacgaggcgtatgcgcaacctgcacctggcggacgcggcgtatgcgcaatgtgcaattcgcttaccttctcgttgcgggtcaggaactcccagatgggaatggccgatgacgagctgatctgaatgtggaaggcgcccagcaggcagcaggccaagattactttcgccgcagtcgtcatggtgtcgttgctgcttttatgttgcgtactccgcactacacggagagttcaggggattcgtgctccgtgatctgtgatccgtgttccgtgggtcaattgcacggttcggttgtgtaaccttcgtgtggcgactttctttttttttagggcccaataaaagcgcttttgtggcggcttgatagattatcacttggtttcggtggctagccaagtggctttcttctgtccgacgcacttaattgaattaaccaaacaacgagcgtggccaattcgtattatcgctgttttcgggtgtacgtgtgtctcagcttgaaacgcaaaagcttgtttcacacatcggtttctcggcaagatgggggagtcagtcggtctagggagaggggcgcccaccagtcgatcacgaaaacggcgaattccaagcgaaacggaaacggagcgagcactataatcttgcagtactatgtcgaacaaccgatcgcggcgatgtcagtgagtcgtcttcggacagcgctggcgctccacacgtatttaagctctgagatcggctttgggagagcgcagagagcgccatcgcacggcagagcgaaagcggcagtgagcgaaagcgagagggagagcggcagcgggtgggggatcgggagccccccgaaaaaaacagaggcgcacgtcgatgccatcggggaattggaacctcaatgtgtgggaatgtttaaatattctgtgttaggtagtgtagtttcatagactatagattctcatacagattgtatctgagagtccttcgagccgattatacacgacagcaaaatatttcagtcgcgcttgggcaaaaggcttaagcacgactcccagtccccccttacatttgtcttcctaagcccctggagccactatcaaacttgttctacgcttgcactgaaaatagaaattaaaaaccaaagtaaacaatcaaaaagaccaaaaacaataacaaccagcaccgagtcgaacatcagtgaggcattgcaaaaatttcaaagtcaagtttgcgtcgtcatcgcgtctgagtccgatcaagccgggcttgtaattgaagttgttgatgagccgcgggattactggattgtggcgaattctggtcagcatacttaacagcagcccgctaattaagcaaaataaacatatcaaattccagaatgcgacggcgccatcatcctgtttgggaattcaattcgcgggcagatcgtttaattcaattaaaaggtagaagtcggtaaaagggagcagaagaatgcgatcgctggaatttcctaacatcacggaccccataaatttgataagcccgagctcgctgcgttgagtcagccaccccacatccccaaatccccgccaaaagaagacagctgggttgttgactcgccagattgaggtgcggattgcagtggagtggacctggtcaaagaagcaccgttaatgtgctgattccattcgattccatccgggaatgcgataaagaaaggctctgatccaagcaactgcaatccggatttcgattttctctttccatttggttttgtatttacgtacttatgggaaagcattctaatgaagacttggagaagacttacgttatattcagaccatcgtgcgatagaggatgagtcatttccatatggccgaaatttattatgtttactatcgtttttagaggtgttttttggacttaccaaaagaggcatttgttttctgggattattcaactgaaaagatatttaaattttttcttggaccattttcaaggttccggatatatttgaaacacactagctagcagtgttggtaagttacatgtatttctataatgtcatattcctttgtccgtattcaaatcgaatactccacatctcacaggatgttgtacttgaggaattggcgatcgtagcgatttcccccgccgtaaagttcctgatcctcgttgtttttgtacatcataaagtccggattctgctcgtcgccgaagatgggaacgaagctgccaaagctgagagtctgcttgaggtgctggtccggacggcgtcccagctggataaccttgctgtacagatcggcatctgcctggagggcacgatcgaaatccttccagtggacgaacttcacctgctcgctgggaatagcgttgttgtcaagcagctcaaggagcgtattcgagttgacgggctgcaccacgttaagttcctgctccttcgctggggattcccctgcgggtaagcgccgcttgcttggactcgtttccaaatcccatagccacgccagcagaggagtaacagagctcwhereisthegenetgattaaaaatatcctttaagaaagcccatgggtataacttacgtgctcactgcgtcctatgcgaggaatggtctttaggttctttatggcaaagttctcgcctcgcttgcccagccgcggtacgttcttggtgatctttaggaagaatcctggactactgtcgtctgcctggcttatggccacaagacccaccaagagcgaaacagacaggactgttatgattctcatgctgatgcgactgaagcttcacctgactcctgctccacaattggtggcctttatatagcgagatccacccgcatcttgcgtggaatagaaatgcgggtgactccaggaattagcattatcgatcggaaagtgcaaagcagataaaactgaactaacctgacctaaatgcctggccataattaagtgcatacatacacattacattacttacatttgtataagaactaaattttatagtacataccacttgcgtatgtaaatgcttgtcttttctcttatatacgttttataaatttgtatcccagcatattttacgtaaaaacaaaacggtaatgcgaacataacttatttattggggcccggaccgcaaaccggccaaacgcgtttgcacccataaaaacataagggcaacaaaaaaattgttaagctgttgtttatttttgcaatcgaaaatcgaaagcgctcaaatagctgcgatcactcgggagcagggtaaagtcgcctcgaaacaggaagctgaagcatcttctataaatacactcaaagcgatcattccgaggcgagtctggttagaaatttacatggactgcaaaaaggtatagccccacaaactgcgtctccacatcgctgcgtttcggcagctaattgccttttagaaattattttcccatttcgagaaactcgtgtgggatgccggatgcggctttcaatcacttctggcccgggatcggattgggtcacattgtctgcgggctctattgtctcgatccgcggatttaaggcgcagttcgcgtgcttagcggtcagaaaggcagagattcggttcggattgatgcgctggcagcagggcacaaagatctaatgactggcaaatcgctacaaataaattaaagtccggcggctaattaatgagcggactgaagccactttggcggagttcattaaccaaaaaacagcagataaacaaaaacggcaaagaaaattgccacagagttgtcacgctttgttgcacaaacatttgtgcagaaaagtgaaaagcttttagccattattaagtttttcctcagctcgctggcagcacttgcgaatgtaaattgaaactgatgttcctcataaatgaaaattaatgtttgctctacgctccaccgaactcgcttgtttgggggattggctggctaatcgcggctagatcccaggcggtataaccttttcgcttcatcagttgtgaaaccagatggctggtgttttggcaacgacagccagcggactcccctcgaacgctctcgaaatcaagtggctttccagccggcccgctgggccgctcgcccactggaccggtattcccaggccaggccacactgtaccgcaccgcataatcctcgccagactcggcgctgataaggcccaatgtcaagaccgcactccgcaggcgtctatttatgccaaggaccgttcttcttcagctttcggctcgagtatttgttgtgccatgttggttacgatgccaatcgcggtacagttatgcaaatgagcagcgaataccgctcactgacaatgaacggcgtcttgtcaaccattcatattcatgctgacattcatattcattcctttggttttttgtcttcgacggactgaaaagtgcggagagaaacccaaaaacagaagcgcgcaaagcgccgttaatatgcgaactcagcgaactcattgaagttatcacaacaccatatccataaccatacccatatccatatcaatatcaatatcgctattattaacgatcatgctctgctgatcaagtattcagcgctgcgctagattcgacagattgaatcgagctcaatagactcaacagactccactcgacagatgcgcaatgccaaggacaattgccgtgtgtaagtggagtaaacgaggcgtatgcgcaacctgcacctggcggacgcggcgtatgcgcaatgtgcaattcgcttaccttctcgttgcgggtcaggaactcccagatgggaatggccgatgacgagctgatctgaatgtggaaggcgcccagcaggcagcaggccaagattactttcgccgcagtcgtcatggtgtcgttgctgcttttatgttgcgtactccgcactacacggagagttcaggggattcgtgctccgtgatctgtgatccgtgttccgtgggtcaattgcacggttcggttgtgtaaccttcgtgtggcgactttctttttttttagggcccaataaaagcgcttttgtggcggcttgatagattatcacttggtttcggtggctagccaagtggctttcttctgtccgacgcacttaattgaattaaccaaacaacgagcgtggccaattcgtattatcgctgttttcgggtgtacgtgtgtctcagcttgaaacgcaaaagcttgtttcacacatcggtttctcggcaagatgggggagtcagtcggtctagggagaggggcgcccaccagtcgatcacgaaaacggcgaattccaagcgaaacggaaacggagcgagcactataatcttgcagtactatgtcgaacaaccgatcgcggcgatgtcagtgagtcgtcttcggacagcgctggcgctccacacgtatttaagctctgagatcggctttgggagagcgcagagagcgccatcgcacggcagagcgaaagcggcagtgagcgaaagcgagagggagagcggcagcgggtgggggatcgggagccccccgaaaaaaacagaggcgcacgtcgatgccatcggggaattggaacctcaatgtgtgggaatgtttaaatattctgtgttaggtagtgtagtttcatagactatagattctcatacagattgtatctgagagtccttcgagccgattatacacgacagcaaaatatttcagtcgcgcttgggcaaaaggcttaagcacgactcccagtccccccttacatttgtcttcctaagcccctggagccactatcaaacttgttctacgcttgcactgaaaatagaaattaaaaaccaaagtaaacaatcaaaaagaccaaaaacaataacaaccagcaccgagtcgaacatcagtgaggcattgcaaaaatttcaaagtcaagtttgcgtcgtcatcgcgtctgagtccgatcaagccgggcttgtaattgaagttgttgatgagccgcgggattactggattgtggcgaattctggtcagcatacttaacagcagcccgctaattaagcaaaataaacatatcaaattccagaatgcgacggcgccatcatcctgtttgggaattcaattcgcgggcagatcgtttaattcaattaaaaggtagaagtcggtaaaagggagcagaagaatgcgatcgctggaatttcctaacatcacggaccccataaatttgataagcccgagctcgctgcgttgagtcagccaccccacatccccaaatccccgccaaaagaagacagctgggttgttgactcgccagattgaggtgcggattgcagtggagtggacctggtcaaagaagcaccgttaatgtgctgattccattcgattccatccgggaatgcgataaagaaaggctctgatccaagcaactgcaatccggatttcgattttctctttccatttggttttgtatttacgtacttatgggaaagcattctaatgaagacttggagaagacttacgttatattcagaccatcgtgcgatagaggatgagtcatttccatatggccgaaatttattatgtttactatcgtttttagaggtgttttttggacttaccaaaagaggcatttgttttctgggattattcaactgaaaagatatttaaattttttcttggaccattttcaaggttccggatatatttgaaacacactagctagcagtgttggtaagttacatgtatttctataatgtcatattcctttgtccgtattcaaatcgaatactccacatctcacaggatgttgtacttgaggaattggcgatcgtagcgatttcccccgccgtaaagttcctgatcctcgttgtttttgtacatcataaagtccggattctgctcgtcgccgaagatgggaacgaagctgccaaagctgagagtctgcttgaggtgctggtccggacggcgtcccagctggataaccttgctgtacagatcggcatctgcctggagggcacgatcgaaatccttccagtggacgaacttcacctgctcgctgggaatagcgttgttgtcaagcagctcaaggagcgtattcgagttgacgggctgcaccacgttaagttcctgctccttcgctggggattcccctgcgggtaagcgccgcttgcttggactcgtttccaaatcccatagccacgccagcagaggagtaacagagctctgaaaacagttcatggtttaaaaatatcctttaagaaagcccatgggtataacttacgtgctcactgcgtcctatgcgaggaatggtctttaggttctttatggcaaagttctcgcctcgcttgcccagccgcggtacgttcttggtgatctttaggaagaatcctggactactgtcgtctgcctggcttatggccacaagacccaccaagagcgaaacagacaggactgttatgattctcatgctgatgcgactgaagcttcacctgactcctgctccacaattggtggcctttatatagcgagatccacccgcatcttgcgtggaatagaaatgcgggtgactccaggaattagcattatcgatcggaaagtgcaaagcagataaaactgaactaacctgacctaaatgcctggccataattaagtgcatacatacacattacattacttacatttgtataagaactaaattttatagtacataccacttgcgtatgtaaatgcttgtcttttctcttatatacgttttataaatt
Where is the gene?Where is the gene?
© Eric Xing @ CMU, 2006-2010
Paradigms of Machine Learning
Supervised Learning Given , learn , s.t.
Unsupervised Learning Given , learn , s.t.
Reinforcement Learning Given
learn , s.t.
Active Learning Given , learn , s.t.
iiD YX , ii XY f :)f( jjD YX new
iD X ii XY f :)f( jjD YX new
game trace/realsimulator/ rewards,,actions,envD
rea
are
,:utility
,:policy 321 aaa ,,game real new,env
)(G~ D jD Y policy, ),(G' all )f( and )(G'~new D
© Eric Xing @ CMU, 2006-2010
Elements of Learning Here are some important elements to consider before you start:
Task: Embedding? Classification? Clustering? Topic extraction? …
Data and other info: Input and output (e.g., continuous, binary, counts, …) Supervised or unsupervised, of a blend of everything? Prior knowledge? Bias?
Models and paradigms: BN? MRF? Regression? SVM? Bayesian/Frequents ? Parametric/Nonparametric?
Objective/Loss function: MLE? MCLE? Max margin? Log loss, hinge loss, square loss? …
Tractability and exactness trade off: Exact inference? MCMC? Variational? Gradient? Greedy search? Online? Batch? Distributed?
Evaluation: Visualization? Human interpretability? Perperlexity? Predictive accuracy?
It is better to consider one element at a time!
© Eric Xing @ CMU, 2006-2010
Theories of Learning
For the learned F(; )
Consistency (value, pattern, …) Bias versus variance Sample complexity Learning rate Convergence Error bound Confidence Stability …
© Eric Xing @ CMU, 2006-2010
Classification
Representing data:
Hypothesis (classifier)
© Eric Xing @ CMU, 2006-2010
Decision-making as dividing a high-dimensional space
Classification-specific Dist.: P(X|Y)
Class prior (i.e., "weight"): P(Y)
),;(
)|(
111
1
Xp
YXp
),;(
)|(
222
2
Xp
YXp
© Eric Xing @ CMU, 2006-2010
The Bayes Rule
What we have just did leads to the following general expression:
This is Bayes Rule
)(
)()|()|(
XP
YpYXPXYP
© Eric Xing @ CMU, 2006-2010
The Bayes Decision Rule for Minimum Error
The a posteriori probability of a sample
Bayes Test:
Likelihood Ratio:
Discriminant function:
)()|(
)|(
)(
)()|()|( Xq
iYXp
iYXp
Xp
iYPiYXpXiYP i
i ii
ii
)(Xh
)(X
© Eric Xing @ CMU, 2006-2010
Example of Decision Rules
When each class is a normal …
We can write the decision boundary analytically in some cases … homework!!
© Eric Xing @ CMU, 2006-2010
Bayes Error
We must calculate the probability of error the probability that a sample is assigned to the wrong class
Given a datum X, what is the risk?
The Bayes error (the expected risk):
© Eric Xing @ CMU, 2006-2010
More on Bayes Error
Bayes error is the lower bound of probability of classification error
Bayes classifier is the theoretically best classifier that minimize probability of classification error
Computing Bayes error is in general a very complex problem. Why? Density estimation:
Integrating density function:
© Eric Xing @ CMU, 2006-2010
Learning Classifier
The decision rule:
Learning strategies
Generative Learning
Discriminative Learning
Instance-based Learning (Store all past experience in memory) A special case of nonparametric classifier
© Eric Xing @ CMU, 2006-2010
Supervised Learning
K-Nearest-Neighbor Classifier: where the h(X) is represented by all the data, and by an algorithm
© Eric Xing @ CMU, 2006-2010
Recall: Vector Space Representation
Doc 1 Doc 2 Doc 3 ...
Word 1 3 0 0 ...
Word 2 0 8 1 ...
Word 3 12 1 10 ...
... 0 1 3 ...
... 0 0 0 ...
Each document is a vector, one
component for each term (= word).
Normalize to unit length. High-dimensional vector space:
Terms are axes, 10,000+ dimensions, or even 100,000+ Docs are vectors in this space
© Eric Xing @ CMU, 2006-2010
Test Document = ?
Sports
Science
Arts
© Eric Xing @ CMU, 2006-2010
1-Nearest Neighbor (kNN) classifier
Sports
Science
Arts
© Eric Xing @ CMU, 2006-2010
2-Nearest Neighbor (kNN) classifier
Sports
Science
Arts
© Eric Xing @ CMU, 2006-2010
3-Nearest Neighbor (kNN) classifier
Sports
Science
Arts
© Eric Xing @ CMU, 2006-2010
K-Nearest Neighbor (kNN) classifier
Sports
Science
Arts
Voting kNN
© Eric Xing @ CMU, 2006-2010
Classes in a Vector Space
Sports
Science
Arts
© Eric Xing @ CMU, 2006-2010
kNN Is Close to Optimal
Cover and Hart 1967 Asymptotically, the error rate of 1-nearest-neighbor
classification is less than twice the Bayes rate [error rate of classifier knowing model that generated data]
In particular, asymptotic error rate is 0 if Bayes rate is 0.
Where does kNN come from? Nonparametric density estimation
© Eric Xing @ CMU, 2006-2010
Nearest-Neighbor Learning Algorithm
Learning is just storing the representations of the training examples in D.
Testing instance x: Compute similarity between x and all examples in D. Assign x the category of the most similar example in D.
Does not explicitly compute a generalization or category prototypes.
Also called: Case-based learning Memory-based learning Lazy learning
© Eric Xing @ CMU, 2006-2010
kNN is an instance of Instance-Based Learning
What makes an Instance-Based Learner?
A distance metric
How many nearby neighbors to look at?
A weighting function (optional)
How to relate to the local points?
© Eric Xing @ CMU, 2006-2010
Euclidean Distance Metric
Or equivalently,
Other metrics: L1 norm: |x-x'|
L∞ norm: max |x-x'| (elementwise …)
Mahalanobis: where is full, and symmetric Correlation Angle Hamming distance, Manhattan distance …
i
iii xxxxD 22 )'()',(
)'()'()',( xxxxxxD T
© Eric Xing @ CMU, 2006-2010
Case Study:kNN for Web Classification
Dataset 20 News Groups (20 classes) Download :(http://people.csail.mit.edu/jrennie/20Newsgroups/) 61,118 words, 18,774 documents Class labels descriptions
© Eric Xing @ CMU, 2006-2008
© Eric Xing @ CMU, 2006-2010
Results: Binary Classes
© Eric Xing @ CMU, 2006-2008
alt.atheism vs.
comp.graphics
rec.autos vs.
rec.sport.baseball
comp.windows.x
vs. rec.motorcycles
k
Accuracy
© Eric Xing @ CMU, 2006-2010
Results: Multiple Classes
© Eric Xing @ CMU, 2006-2008
k
Accuracy
Random select 5-out-of-20 classes, repeat 10 runs and average
All 20 classes
© Eric Xing @ CMU, 2006-2010
Is kNN ideal?
© Eric Xing @ CMU, 2006-2010
Is kNN ideal? … more later
© Eric Xing @ CMU, 2006-2010
Effect of Parameters
Sample size The more the better Need efficient search algorithm for NN
Dimensionality Curse of dimensionality
Density How smooth?
Metric The relative scalings in the distance metric affect region shapes.
Weight Spurious or less relevant points need to be downweighted
K
© Eric Xing @ CMU, 2006-2010
Summary Machine Learning is Cool and Useful!!
Paradigms of Machine Learning. Design elements learning Theories on learning
Fundamental theory of classification Bayes optimal classifier Instance-based learning: kNN – a Nonparametric classifier A nonparametric method does not rely on any assumption concerning the structure of the underlying
density function. Very little “learning” is involved in these methods
Good news: Simple and powerful methods; Flexible and easy to apply to many problems. kNN classifier asymptotically approaches the Bayes classifier, which is theoretically the best classifier that
minimizes the probability of classification error.
Bad news: High memory requirements Very dependant on the scale factor for a specific problem.