Intro to Logistic Regression


Slide 1: Logistic Regression
Jacquelyn Victoria & Tamer Wahba

Slide 2: Slide Ownership
Jacquelyn Victoria: slides 3 to 9
Tamer Wahba: slides 10 to 15

Slide 3: Regression Analysis + Classification

How can we predict a nominal class using regression analysis?

Consider a binary class:

Each instance x is a vector of feature values

Our output values or class labels are restricted to 0 or 1, i.e. f(x) ∈ {0, 1}

We need an h(x) where: 0 < h(x) < 1

We need a function which exhibits this behavior

Slide 4: Logistic Functions (the Sigmoid Function σ(x))

σ(x) = 1 / (1 + e^(−x))

Asymptotes at y = 1 and y = 0

Easy to specify threshold (σ(0) = .5)

The output is interpreted as P(y = 1)

As a result:

hθ(x) = σ(θᵀx) = 1 / (1 + e^(−θᵀx))

where θ is a vector of weights
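As a quick sketch (not from the slides; the function names are our own), the hypothesis in NumPy:

import numpy as np

def sigmoid(z):
    # logistic (sigmoid) function: maps any real z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    # hypothesis h_theta(x) = sigmoid(theta . x); returns P(y = 1 | x)
    return sigmoid(np.dot(theta, x))

print(sigmoid(0.0))  # 0.5, the threshold value noted above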

Slide 5: Cost Function

Need to find an hθ(x), a logistic function that represents our data
Need to find θ to fit our data

Per-example cost: −log(hθ(x)) if y = 1; −log(1 − hθ(x)) if y = 0

J(θ) = −(1/m) Σ_i [ y^(i) log hθ(x^(i)) + (1 − y^(i)) log(1 − hθ(x^(i))) ]

[Plot: the two cost branches, −log(x) and −log(1 − x)]
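A minimal NumPy sketch of this cost, vectorized over all m training examples (our own naming, assuming a design matrix X and a 0/1 label vector y):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = -(1/m) * sum(y*log(h) + (1 - y)*log(1 - h))
    p = sigmoid(X @ theta)  # h_theta(x) for every instance at once
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))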

Slide 6: Gradient Descent

To find the minimum, we can follow the partial derivatives of J(θ):

do {
    θ_j := θ_j − α (1/m) Σ_i (hθ(x^(i)) − y^(i)) x_j^(i)   (simultaneously for all j)
} until θ converges

where α is the learning rate (almost always between 0 and 1; .1 to .3 is usually a good range)
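Written out as a NumPy sketch (treating "until θ converges" as a test on the change in θ, which is one common choice, not the only one):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, tol=1e-7):
    # X: (m, n) design matrix (include a column of 1s for the intercept)
    # y: length-m vector of 0/1 labels
    m, n = X.shape
    theta = np.zeros(n)
    while True:
        grad = X.T @ (sigmoid(X @ theta) - y) / m  # dJ/dtheta_j for all j
        theta_new = theta - alpha * grad           # simultaneous update
        if np.max(np.abs(theta_new - theta)) < tol:
            return theta_new
        theta = theta_new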

Slide 7: Maximum Likelihood Estimation

Alternatively, maximize the log-likelihood ℓ(θ) directly:

do {
    θ_j := θ_j + α Σ_i (y^(i) − hθ(x^(i))) x_j^(i)   (simultaneously for all j)
} until θ converges
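For reference, differentiating ℓ(θ) (a standard derivation, not shown on the slide) gives the gradient used in the update above:

\ell(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]

\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}

Since J(θ) = −ℓ(θ)/m, gradient ascent on ℓ is the same computation as gradient descent on J, up to the 1/m scaling.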

θ can also be calculated using Iteratively Reweighted Least Squares (IRLS)
Multinomial data (more than two classes) uses Softmax Regression (sketches of both below)
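A sketch of IRLS as it is usually presented (each step is a Newton update; assumes XᵀWX is invertible; names are ours):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def irls(X, y, n_iter=20):
    # Newton step: theta += (X^T W X)^{-1} X^T (y - p), W = diag(p(1 - p))
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        W = p * (1 - p)                 # diagonal of the weight matrix
        H = X.T @ (W[:, None] * X)      # X^T W X
        theta += np.linalg.solve(H, X.T @ (y - p))
    return theta

And a minimal sketch of the softmax hypothesis for K classes, where Theta becomes an (n, K) matrix with one weight column per class:

import numpy as np

def softmax(Z):
    # row-wise softmax: each row of Z becomes a probability distribution
    E = np.exp(Z - Z.max(axis=1, keepdims=True))  # shift for stability
    return E / E.sum(axis=1, keepdims=True)

def predict(Theta, X):
    # P(y = k | x) is column k of softmax(X @ Theta); take the argmax
    return np.argmax(softmax(X @ Theta), axis=1)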

Slide 8: Interpreting the Hypothesis

Recall that σ(0) = .5 and that hθ(x) = σ(θᵀx)
So θᵀx = 0 is the decision boundary: predict y = 1 when θᵀx > 0 and y = 0 when θᵀx < 0

[Plot: decision boundary separating the two classes in the (x1, x2) feature plane]

Slide 9: Interpreting hθ

I want to create a model to give me the probability that I will pass a test given how many hours I have studied

Hours  0.50  0.75  1.00  1.25  1.50  1.75  1.75  2.00  2.25  2.50  2.75  3.00  3.25  3.50  4.00  4.25  4.50  4.75  5.00  5.50
Pass   0     0     0     0     0     0     1     0     1     0     1     0     1     0     1     1     1     1     1     1

Using this generated model, calculate my probability of passing given that I have studied 3 hours:

P(pass | study time = 3) = .61

(source)
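The .61 can be reproduced end to end; here is a sketch that fits θ to the table above using the gradient-descent loop from slide 6 (the data is small enough that plain batch descent converges; the fitted weights come out near θ ≈ (−4.08, 1.50)):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hours = np.array([0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50,
                  2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50])
passed = np.array([0, 0, 0, 0, 0, 0, 1, 0, 1, 0,
                   1, 0, 1, 0, 1, 1, 1, 1, 1, 1])

X = np.column_stack([np.ones_like(hours), hours])  # intercept + hours

theta = np.zeros(2)
for _ in range(200_000):  # batch gradient descent, alpha = 0.1
    theta -= 0.1 * X.T @ (sigmoid(X @ theta) - passed) / len(passed)

print(theta)                        # roughly [-4.08, 1.50]
print(sigmoid(theta @ [1.0, 3.0]))  # P(pass | 3 hours of study) = 0.61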

Slide 10: Logistic Regression Compared to Other Classifiers

Naive Bayes
Support Vector Machines
Decision Trees

Slide 11: vs Decision Trees

Assumptions:

DT: decision boundaries parallel to axes

LR: one smooth boundary

Decision trees can be used when there are multiple decision boundaries

Slide 12: vs Naive Bayes

Feature weights:

NB: each weight is set independently, based on the feature's distribution within each class
LR: weights are set together, so that the decision function tends to be high for positive examples and low for negative examples
As a result, correlated features do not get double-counted in logistic regression: LR can split the weight between them


Slide 13: vs Support Vector Machines

Both attempt to find a hyperplane separating the training samples

SVM: find the solution with maximum margin

LR: find any solution that separates the instances

SVM is a hard classifier, while LR is probabilistic

Slide 14: Advantages and Disadvantages

Advantages:
Works well with diagonal decision boundaries
Does not give undue weight to correlated features
Probabilistic outcomes

Disadvantages:
Requires a large sample size for stable results

Slide 15: Use Cases

Categorical outcomes
Large sample data
Minimal preprocessing
