Classification is Supervised Learning
(we tell the system the classifications)
Clustering is Unsupervised Learning
(the data determines the groupings, which we then name)
Observations
an Observation can be described by a fixed set of quantifiable properties called Explanatory Variables or Features
For example, a Doctor visit could result in the following Features:
• Weight
• Male/Female
• Age
• White Cell Count
• Mental State (bad, neutral, good, great)
• Blood Pressure
• etc.
Text Documents will have a set of Features that defines the number of occurrences of each Word or n-gram in the corpus of documents
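As a rough illustration (the values and vocabulary below are made up), both kinds of Observation reduce to a fixed-length vector of numbers:

```python
# Doctor-visit observation: each position is one Feature.
# [weight_kg, is_male, age, white_cell_count, mental_state, blood_pressure]
# mental_state encoded ordinally: bad=0, neutral=1, good=2, great=3
doctor_visit = [82.0, 1, 45, 6.1, 2, 120]

# Text document: bag-of-words counts over a small (made-up) corpus vocabulary.
vocabulary = ["machine", "learning", "classifier", "cluster"]
document = "machine learning uses a classifier to classify machine data"
bag_of_words = [document.split().count(word) for word in vocabulary]
print(bag_of_words)  # [2, 1, 1, 0]
```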
Classifier
a Machine Learning Algorithm or Mathematical Function that maps input data to a category is known as a Classifier
Examples:
• Linear Classifiers
• Quadratic Classifiers
• Support Vector Machines
• K-Nearest Neighbours
• Neural Networks
• Decision Trees
Most algorithms are best applied to Binary Classification.
If you want multiple classes (tags), combine multiple Binary Classifiers instead (e.g. one-vs-rest), as sketched below
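A minimal sketch of the one-vs-rest idea, assuming scikit-learn and its bundled Iris data purely for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)   # 3 classes (tags)

# One binary classifier is trained per class: "this class" vs "everything else".
model = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

# For each row, the binary classifier with the highest score wins.
print(model.predict(X[:5]))
```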
Training
A Classifier has a set of variables that need to be set (trained). Different classifiers have different algorithms to optimize this process.
Of course there are many ways we can define Best Performance…
Accuracy
Sensitivity
Specificity
F1 Score
Likelihood
Cumulative Gain
Mean Reciprocal Rank
Average Precision
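A minimal sketch of a few of these metrics, computed from hypothetical binary confusion-matrix counts:

```python
# Illustrative confusion-matrix counts (true/false positives and negatives).
tp, fp, fn, tn = 40, 10, 5, 45

accuracy    = (tp + tn) / (tp + tn + fp + fn)   # fraction of correct predictions
sensitivity = tp / (tp + fn)                    # a.k.a. recall / true positive rate
specificity = tn / (tn + fp)                    # true negative rate
precision   = tp / (tp + fp)
f1_score    = 2 * precision * sensitivity / (precision + sensitivity)

print(accuracy, sensitivity, specificity, f1_score)
# 0.85  0.888...  0.818...  0.842...
```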
k-Nearest Neighbor
Cousin of k-Means Clustering
Algorithm:
1) In feature space, find the k closest neighbors, often using Euclidean distance (straight-line geometry)
2) Assign the majority class from those neighbors
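A minimal sketch of that algorithm in plain Python; the training points, labels, and k below are illustrative assumptions:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    # 1) Find the k closest training points in feature space (Euclidean distance).
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # 2) Assign the majority class among those k neighbours.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
train_y = ["healthy", "healthy", "sick", "sick"]
print(knn_predict(train_X, train_y, (1.1, 1.0)))  # -> "healthy"
```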
Decision Trees
Can generate multiple decision trees to improve accuracy (Random Forest)
Can be learned by consecutively splitting the data on an attribute pair using Recursive Partitioning
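A minimal sketch contrasting a single tree with a Random Forest, assuming scikit-learn and its bundled Iris data for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

tree   = DecisionTreeClassifier(random_state=0)                     # one recursively partitioned tree
forest = RandomForestClassifier(n_estimators=100, random_state=0)   # many trees, majority vote

print(cross_val_score(tree, X, y, cv=5).mean())
print(cross_val_score(forest, X, y, cv=5).mean())   # typically at least as accurate
```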
Linear Classifier
A Linear Classifier computes a Linear Combination of the Feature Vector and a Weight Vector.
Can think of it as splitting a high-dimensional input space with a hyperplane
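A minimal sketch of that idea; the Weight Vector and bias below are arbitrary illustrative values:

```python
# A linear combination of the Feature Vector x and Weight Vector w, plus a bias b.
def linear_classify(x, w, b):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b   # w . x + b
    return 1 if score >= 0 else 0                      # which side of the hyperplane?

w = [0.4, -0.2, 0.1]   # one weight per feature (illustrative values)
b = 0.5                # bias term shifts the hyperplane away from the origin
print(linear_classify([1.0, 2.0, 0.5], w, b))  # score =  0.55 -> class 1
print(linear_classify([0.0, 4.0, 0.0], w, b))  # score = -0.3  -> class 0
```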
Determining the Weight Vector
Can use either Generative or Discriminative models to determine the Weight Vector
Generative models attempt to model the conditional probability function of an Observation Vector given a Classification.
Examples include:
• LDA (Gaussian density)
• Naive Bayes Classifier (Multinomial or Bernoulli event models)

Discriminative models attempt to maximize the quality of the output on a training set through an optimization algorithm.
Examples include:
• Logistic Regression (maximum likelihood estimation, assuming the training set was generated by a binomial model)
• Support Vector Machine (attempts to maximize the margin between the decision hyperplane and the examples in the training set)
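A minimal sketch of the contrast, assuming scikit-learn for illustration (Gaussian Naive Bayes is used here instead of a Multinomial/Bernoulli event model, since the example features are continuous):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)

models = {
    # Generative: model P(features | class), then invert with Bayes' rule.
    "LDA (generative)": LinearDiscriminantAnalysis(),
    "Naive Bayes (generative)": GaussianNB(),
    # Discriminative: optimize the decision boundary on the training set directly.
    "Logistic Regression (discriminative)": make_pipeline(StandardScaler(), LogisticRegression()),
    "Support Vector Machine (discriminative)": make_pipeline(StandardScaler(), LinearSVC()),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```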
Functional Imperative
functionalimperative.com (647) 405-8994 @func_i