Classification is Supervised Learning
(we tell the system the classifications)
Clustering is Unsupervised Learning
(the data determines the groupings, which we then name)
Observations
an Observation can be described by a fixed set of quantifiable properties called Explanatory Variables or Features
For example, a Doctor visit could result in the following Features:
• Weight
• Male/Female
• Age
• White Cell Count
• Mental State (bad, neutral, good, great)
• Blood Pressure
• etc.
Text Documents will have a set of Features that defines the number of occurrences of each Word or n-gram in the corpus of documents
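As a rough illustration (the values and vocabulary below are made up), both kinds of Observation reduce to a fixed-length vector of numbers:

```python
# Doctor-visit observation: each position is one Feature.
# [weight_kg, is_male, age, white_cell_count, mental_state, blood_pressure]
# mental_state encoded ordinally: bad=0, neutral=1, good=2, great=3
doctor_visit = [82.0, 1, 45, 6.1, 2, 120]

# Text document: bag-of-words counts over a small (made-up) corpus vocabulary.
vocabulary = ["machine", "learning", "classifier", "cluster"]
document = "machine learning uses a classifier to classify machine data"
bag_of_words = [document.split().count(word) for word in vocabulary]
print(bag_of_words)  # [2, 1, 1, 0]
```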
Classifier
a Machine Learning Algorithm or Mathematical Function that maps input data to a category is known as a Classifier
Examples:
• Linear Classifiers
• Quadratic Classifiers
• Support Vector Machines
• K-Nearest Neighbours
• Neural Networks
• Decision Trees
Most algorithms are best applied to Binary Classification.
If you want multiple classes (tags), combine multiple Binary Classifiers instead (e.g. one-vs-rest), as sketched below
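A minimal sketch of the one-vs-rest idea, assuming scikit-learn and its bundled Iris data purely for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)   # 3 classes (tags)

# One binary classifier is trained per class: "this class" vs "everything else".
model = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

# For each row, the binary classifier with the highest score wins.
print(model.predict(X[:5]))
```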
Training
A Classifier has a set of variables that need to be set (trained). Different classifiers have different algorithms to optimize this process.
Of course there are many ways we can define Best Performance…
Accuracy
Sensitivity
Specificity
F1 Score
Likelihood
Cumulative Gain
Mean Reciprocal Rank
Average Precision
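A minimal sketch of a few of these metrics, computed from hypothetical binary confusion-matrix counts:

```python
# Illustrative confusion-matrix counts (true/false positives and negatives).
tp, fp, fn, tn = 40, 10, 5, 45

accuracy    = (tp + tn) / (tp + tn + fp + fn)   # fraction of correct predictions
sensitivity = tp / (tp + fn)                    # a.k.a. recall / true positive rate
specificity = tn / (tn + fp)                    # true negative rate
precision   = tp / (tp + fp)
f1_score    = 2 * precision * sensitivity / (precision + sensitivity)

print(accuracy, sensitivity, specificity, f1_score)
# 0.85  0.888...  0.818...  0.842...
```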
k-Nearest Neighbor
Cousin of k-Means Clustering
Algorithm:
1) In feature space, find the k closest neighbors, often using Euclidean distance (straight-line geometry)
2) Assign the majority class from those neighbors
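A minimal sketch of that algorithm in plain Python; the training points, labels, and k below are illustrative assumptions:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    # 1) Find the k closest training points in feature space (Euclidean distance).
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # 2) Assign the majority class among those k neighbours.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
train_y = ["healthy", "healthy", "sick", "sick"]
print(knn_predict(train_X, train_y, (1.1, 1.0)))  # -> "healthy"
```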
Decision Trees
Can generate multiple decision trees to improve accuracy (Random Forest)
Can be learned by consecutively splitting the data on an attribute pair using Recursive Partitioning
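A minimal sketch contrasting a single tree with a Random Forest, assuming scikit-learn and its bundled Iris data for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

tree   = DecisionTreeClassifier(random_state=0)                     # one recursively partitioned tree
forest = RandomForestClassifier(n_estimators=100, random_state=0)   # many trees, majority vote

print(cross_val_score(tree, X, y, cv=5).mean())
print(cross_val_score(forest, X, y, cv=5).mean())   # typically at least as accurate
```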
Linear Classifier
A Linear Classifier computes a Linear Combination of the Feature Vector and a Weight Vector.
Can think of it as splitting a high-dimensional input space with a hyperplane
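A minimal sketch of that idea; the Weight Vector and bias below are arbitrary illustrative values:

```python
# A linear combination of the Feature Vector x and Weight Vector w, plus a bias b.
def linear_classify(x, w, b):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b   # w . x + b
    return 1 if score >= 0 else 0                      # which side of the hyperplane?

w = [0.4, -0.2, 0.1]   # one weight per feature (illustrative values)
b = 0.5                # bias term shifts the hyperplane away from the origin
print(linear_classify([1.0, 2.0, 0.5], w, b))  # score =  0.55 -> class 1
print(linear_classify([0.0, 4.0, 0.0], w, b))  # score = -0.3  -> class 0
```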
Determining the Weight Vector
Can use either Generative or Discriminative models to determine the Weight Vector
Generative models attempt to model the conditional probability function of an Observation Vector given a Classification.
Examples include:
• LDA (Gaussian density)
• Naive Bayes Classifier (Multinomial or Bernoulli event models)

Discriminative models attempt to maximize the quality of the output on a training set through an optimization algorithm.
Examples include:
• Logistic Regression (maximum likelihood estimation, assuming the training set was generated by a binomial model)
• Support Vector Machine (attempts to maximize the margin between the decision hyperplane and the examples in the training set)
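A minimal sketch of the contrast, assuming scikit-learn for illustration (Gaussian Naive Bayes is used here instead of a Multinomial/Bernoulli event model, since the example features are continuous):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)

models = {
    # Generative: model P(features | class), then invert with Bayes' rule.
    "LDA (generative)": LinearDiscriminantAnalysis(),
    "Naive Bayes (generative)": GaussianNB(),
    # Discriminative: optimize the decision boundary on the training set directly.
    "Logistic Regression (discriminative)": make_pipeline(StandardScaler(), LogisticRegression()),
    "Support Vector Machine (discriminative)": make_pipeline(StandardScaler(), LinearSVC()),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```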
Functional Imperative
functionalimperative.com (647) 405-8994 @func_i