Introduction to Machine Learning
CAP 4630
Xingquan (Hill) Zhu
Outline
• What is Machine Learning?
  – What is Pattern Recognition?
  – Typical Pattern Recognition Systems
  – Resource & References
• Decision Trees
• Neural Networks
Face Detection Demo
• http://demo.pittpatt.com/
  – http://graphics8.nytimes.com/images/2008/03/17/us/17bush-600.jpg
  – http://img.timeinc.net/time/daily/special/photo/siralec/faces.jpg
  – http://demo.pittpatt.com/detection_demo/
More Complicated Examples
• Given the following examples
• Which of the following people belong to this small group?
Machine Learning
Yes or No!
How much confidence?
1st: Yes, 90%
2nd: No, 70%
3rd: No, 90%
4th: No, 80%
Neural Networks Applications
ALVINN drives 70mph on highways
CMU
More Complicated Example: Regression
• Example: Price of a used car
• x: car attributes
  y: price
  y = g(x | θ)
  g( ): model, θ: parameters
• Linear model: y = wx + w0
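A minimal sketch of how the parameters of y = wx + w0 could be estimated from examples. The slides do not name a fitting method; ordinary least squares is assumed here, and the car ages and prices are made-up illustration data.

```python
def fit_line(xs, ys):
    """Fit y = w*x + w0 by ordinary least squares (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    w0 = mean_y - w * mean_x
    return w, w0

# Hypothetical data: car age in years (x) and price in $1000s (y).
ages = [1, 2, 3, 4, 5]
prices = [18.0, 16.0, 14.0, 12.0, 10.0]

w, w0 = fit_line(ages, prices)   # theta = (w, w0)
predicted = w * 6 + w0           # g(x | theta) for a 6-year-old car
print(w, w0, predicted)
```

Here θ is the pair (w, w0), and prediction for a new car is just evaluating g(x | θ) with the fitted values.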
http://www.theparticle.com/applets/ml/index.html
What is Machine Learning?
• Humans have developed highly sophisticated skills for sensing their environment and taking actions according to what they observe, e.g.,
  – Recognizing a face
  – Understanding spoken words
  – Reading handwriting
  – Distinguishing fresh food by its smell
• We would like to give similar capabilities to machines
What is Machine Learning
• Programming computers to use example data or past experience
  – Needed in cases where we cannot directly write a computer program but have example data
• Learning is used when:
  – Human expertise does not exist (navigating on Mars)
  – Humans are unable to explain their expertise (speech recognition)
  – Solution changes in time (routing on a computer network)
  – Solution needs to be adapted to particular cases (user biometrics)
• Are all problems learnable?
“Learning”…
• Learning general models from the data of particular examples
  – Data is usually cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce.
  – When data is scarce, learning is still possible, but the resulting knowledge is less reliable.
• Example in retail: from customer transactions to consumer behavior:
  People who bought “Da Vinci Code” also bought “The Five People You Meet in Heaven” (www.amazon.com)
• Build a model that is a good and useful approximation to the data.
Machine Learning: Creating a Classifier Adaptively
• Supervised learning
  – Decision Trees
  – Feedforward neural network and backpropagation
• Unsupervised learning
  – Clustering
    • Grouping similar instances
  – Association analysis
    • People who bought “Da Vinci Code” also bought “The Five People You Meet in Heaven” (www.amazon.com)
• Reinforcement learning
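As a rough illustration of association analysis, the sketch below estimates how often one book appears in a basket given that another does (the "confidence" of a rule A → B). The baskets and the extra title "Atlas" are invented for the example; real systems work on far larger transaction logs.

```python
def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_a = [t for t in transactions if antecedent in t]
    if not with_a:
        return 0.0
    with_both = [t for t in with_a if consequent in t]
    return len(with_both) / len(with_a)

# Hypothetical purchase baskets.
baskets = [
    {"Da Vinci Code", "The Five People You Meet in Heaven"},
    {"Da Vinci Code", "The Five People You Meet in Heaven", "Atlas"},
    {"Da Vinci Code"},
    {"Atlas"},
]

rule_conf = confidence(
    baskets, "Da Vinci Code", "The Five People You Meet in Heaven"
)
print(rule_conf)
```

No labels are involved: the rule is discovered from co-occurrence alone, which is why this sits under unsupervised learning.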
Machine Learning Output
• The output of machine learning
  – Patterns
    • Decision trees, data summarization, data generative models
  – Discriminative machines
    • Neural networks (explicit rules are not available)
• The requirements of the output crucially determine the underlying learning models
  – Class category
  – Confidence (probability)
  – The comprehensibility of the decision model
• The process of learning is the process of pattern discovery
What is Pattern Recognition
• A pattern is an entity, vaguely defined, that could be given a name, e.g.,
  – fingerprint image
  – handwritten word
  – human face
  – speech signal
  – DNA sequence
• Pattern recognition is the study of how machines can
  – observe the environment
  – learn to distinguish patterns of interest
  – make sound and reasonable decisions about the categories of the patterns
A Simple Example
Typical Decision Process
• Fish Face Recognition?
• Salmon tastes better?
• What kind of information can distinguish one species from the others?
  – length, width, weight, number and shape of fins, tail shape, etc.
• What can cause problems during sensing?
  – lighting conditions, position of fish on the conveyor belt, camera noise, etc.
• What are the steps in the process?
  – capture image → isolate fish → take measurements → make decision
Example: Feature Selection
• Assume a fisherman (domain knowledge) told us that a sea bass is generally longer than a salmon.
• We can use length as a feature and decide between sea bass and salmon according to a threshold on length.
• How can we choose this threshold?
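One simple answer, sketched below, is to try candidate thresholds on labeled examples and keep the one with the fewest mistakes. The rule "sea bass if length > threshold" follows the fisherman's hint; the lengths and labels are invented, and counting training errors is only one of several possible criteria.

```python
def best_threshold(lengths, labels):
    """Pick the length threshold with the fewest training errors.

    Rule assumed here: predict "sea bass" if length > threshold,
    otherwise "salmon".
    """
    values = sorted(set(lengths))
    # Candidate thresholds: midpoints between consecutive distinct lengths.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best, best_errors = None, len(lengths) + 1
    for t in candidates:
        errors = sum(
            ("sea bass" if x > t else "salmon") != y
            for x, y in zip(lengths, labels)
        )
        if errors < best_errors:
            best, best_errors = t, errors
    return best, best_errors

# Hypothetical measurements; one sea bass is shorter than one salmon.
lengths = [10, 11, 12, 13, 14, 15, 16, 17]
labels = ["salmon", "salmon", "salmon", "sea bass",
          "salmon", "sea bass", "sea bass", "sea bass"]

threshold, errors = best_threshold(lengths, labels)
print(threshold, errors)
```

Because the two species overlap in length, even the best threshold leaves at least one fish misclassified.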
Example: Feature Selection
• Even though sea bass are longer than salmon on average, there are many fish for which this does not hold.
• Try another feature: average lightness of the fish scales.
Example: Feature Selection
Example: Multiple Features
• Assume we also observed that sea bass are typically wider than salmon.
• We can use two features in our decision:
  – lightness: x1
  – width: x2
• Each fish image is now represented as a point (feature vector) in a two-dimensional feature space
Example: Multiple Features
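$$
q(\mathbf{x}) =
\begin{cases}
H & \text{if } (\mathbf{w} \cdot \mathbf{x}) + b \ge 0 \\
J & \text{if } (\mathbf{w} \cdot \mathbf{x}) + b < 0
\end{cases}
$$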
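A small sketch of a linear decision rule of this kind: compute w·x + b and return class H when the score is non-negative, class J otherwise. The weights, the bias, and the reading of the two features as (lightness, width) are assumptions chosen only to make the example concrete.

```python
def linear_classifier(x, w, b):
    """Return "H" if w.x + b >= 0, otherwise "J"."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "H" if score >= 0 else "J"

# Hypothetical parameters; features are (lightness, width).
w = [1.0, -0.5]
b = -2.0

first = linear_classifier([6.0, 4.0], w, b)   # score = 6 - 2 - 2 = 2
second = linear_classifier([2.0, 4.0], w, b)  # score = 2 - 2 - 2 = -2
print(first, second)
```

The set of points where the score is exactly zero is a straight line in the two-dimensional feature space; it is the decision boundary between the two classes.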
Example: Multiple Features
Pattern: a feature vector $\mathbf{x}$ together with its hidden state $y$

Feature vector $\mathbf{x} \in X$, $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$
- A vector of observations (measurements).
- $\mathbf{x}$ is a point in feature space $X$.
Hidden state $y \in Y$
- Cannot be directly measured.
- Patterns with equal hidden state belong to the same class.
Task
- To design a classifier (decision rule) $q: X \to Y$ which decides about a hidden state based on an observation.
Example: Multiple Features
• Does adding more features always improve the results?
  – Avoid unreliable features.
  – Be careful about correlations with existing features.
  – Be careful about measurement costs.
  – Be careful about noise in the measurements.
• Is there some curse for working in very high dimensions?
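One common way to picture the difficulty with high dimensions (an illustration chosen here, not taken from the slides) is to compare the volume of a unit hypercube with the volume of the ball inscribed in it. The sketch uses the standard n-ball volume formula with radius 0.5.

```python
import math

def inscribed_ball_fraction(n):
    """Fraction of the unit hypercube's volume that lies inside
    its inscribed ball (radius 0.5) in n dimensions."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * 0.5 ** n

for n in (2, 3, 10, 20):
    print(n, inscribed_ball_fraction(n))
```

The fraction shrinks rapidly as n grows, so in high dimensions almost all of the cube lies in its corners, far from the center. A fixed number of training examples therefore covers the feature space ever more sparsely as features are added.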
Example: Decision Boundaries
Overfitting and underfitting
[Figure: three example fits, labeled underfitting, good fit, and overfitting]
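The difference between the three cases can be sketched by comparing training and test error for models of increasing flexibility. Everything below is an assumption made for illustration: the data follow y = 2x with ±1 noise on the training targets, the "too simple" model is a constant, and the "too flexible" model memorizes the training set and returns the target of the nearest training point.

```python
def mse(model, xs, ys):
    """Mean squared error of a model (a function x -> y) on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_constant(xs, ys):
    """Too simple: always predict the mean training target."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Least-squares line y = w*x + w0."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    w0 = mean_y - w * mean_x
    return lambda x: w * x + w0

def fit_memorizer(xs, ys):
    """Too flexible: return the target of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

# Made-up data: true relation y = 2x; training targets carry +/-1 noise,
# test targets are noise-free.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]

results = {}
for name, fit in [("constant", fit_constant),
                  ("linear", fit_linear),
                  ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(name, results[name])
```

On this data the constant model has high error on both sets (underfitting), the memorizer has zero training error but a clearly worse test error than the line (overfitting), and the line does well on both.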
Example: Decision Boundaries
Example: Cost of Error
• We should also consider costs of different errors we make in our decisions. For example, if the fish packing company knows that:
  – Customers who buy salmon will object vigorously if they see sea bass in their cans.
  – Customers who buy sea bass will not be unhappy if they occasionally see some expensive salmon in their cans.
• How does this knowledge affect our decision?
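One way to see the effect is to pick the length threshold that minimizes total cost rather than the raw number of errors. In the sketch below the rule is "sea bass if length > threshold", and the cost values (5 for sea bass packed as salmon, 1 for salmon packed as sea bass) as well as the data are assumptions for illustration only.

```python
def cheapest_threshold(lengths, labels, cost_bass_as_salmon, cost_salmon_as_bass):
    """Return the threshold with the lowest total misclassification cost."""
    values = sorted(set(lengths))
    # Candidate thresholds: midpoints between consecutive distinct lengths.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def total_cost(t):
        cost = 0
        for x, y in zip(lengths, labels):
            predicted = "sea bass" if x > t else "salmon"
            if y == "sea bass" and predicted == "salmon":
                cost += cost_bass_as_salmon   # sea bass ends up in a salmon can
            elif y == "salmon" and predicted == "sea bass":
                cost += cost_salmon_as_bass   # salmon ends up in a sea bass can
        return cost

    best = min(candidates, key=total_cost)
    return best, total_cost(best)

# Hypothetical measurements with one unusually short sea bass.
lengths = [10, 11, 12, 13, 14, 15, 16, 17]
labels = ["salmon", "sea bass", "salmon", "salmon",
          "salmon", "sea bass", "sea bass", "sea bass"]

equal = cheapest_threshold(lengths, labels, 1, 1)
asymmetric = cheapest_threshold(lengths, labels, 5, 1)
print(equal, asymmetric)
```

With equal costs the best threshold sits high and accepts one sea bass in the salmon cans. Once that mistake is made five times as expensive, the threshold moves down: more salmon are labeled sea bass, but no sea bass is labeled salmon.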
The actual forms of the decision model (Patterns)
• Decision Rules
  – IF lightness>8 AND length>14 THEN Salmon
• Decision Trees
  – A tree structure to make decision
• Probability Models
  – Gaussian models
  – Bayesian decision models
• Weight functions
  – Neural Networks
  – Linear Discriminant Analysis
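The decision-rule form is the most directly readable of these. A minimal sketch of the rule IF lightness>8 AND length>14 THEN Salmon follows; the slide gives no ELSE branch, so returning "sea bass" otherwise is an assumption.

```python
def classify(lightness, length):
    """Decision rule: IF lightness > 8 AND length > 14 THEN salmon."""
    if lightness > 8 and length > 14:
        return "salmon"
    # Assumed default: the slide does not state what happens otherwise.
    return "sea bass"

print(classify(9, 15))
print(classify(9, 12))
```

Written as nested `if` statements (first test lightness, then length), the same rule becomes a two-level decision tree, which is why rules and trees are often treated as closely related model forms.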
Interesting Demo
Story picturing: http://alipr.com/spe/
Resources: Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• UCI KDD Archive: http://kdd.ics.uci.edu/summary.data.application.html
• Statlib: http://lib.stat.cmu.edu/
• Delve: http://www.cs.utoronto.ca/~delve/
Resources: Journals
• Journal of Machine Learning Research
• Machine Learning
• Neural Computation
• Neural Networks
• IEEE Transactions on Neural Networks
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• Annals of Statistics
• Journal of the American Statistical Association
• IEEE Transactions on Knowledge and Data Engineering
• Data Mining and Knowledge Discovery
• ...
Resources: Conferences
• International Joint Conference on Artificial Intelligence (IJCAI)
• International Conference on Machine Learning (ICML)
• Neural Information Processing Systems (NIPS)
• American Association for Artificial Intelligence (AAAI)
• Uncertainty in Artificial Intelligence (UAI)
• International Conference on Neural Networks (Europe)
• ACM Knowledge Discovery and Data Mining (KDD)
• IEEE International Conference on Data Mining (ICDM)
• ...
Outline
• What is Machine Learning?
  – What is Pattern Recognition?
  – Typical Pattern Recognition Systems
  – Resource & References
• Decision Trees
• Neural Networks