CS6140: Machine Learning, Spring 2017
Instructor: Lu Wang, College of Computer and Information Science, Northeastern University
Webpage: www.ccs.neu.edu/home/luwang
Email: [email protected]
Time and Location
• Time: Thursdays from 6:00pm – 9:00pm
• Location: Forsyth Building 129
Course Webpage
• http://www.ccs.neu.edu/home/luwang/courses/cs6140_sp2017.html
Prerequisites
• Programming
– Able to write code proficiently in some programming language (e.g., Python, Java, C/C++, Matlab)
• Courses
– Algorithms
– Probability and statistics
– Linear algebra
• A quiz:
– 22 simple questions, 20 of them True/False (relevant to probability, statistics, and linear algebra)
– The purpose of this quiz is to indicate the expected background of students.
– 80% of the questions should be easy to answer.
– Not counted in your final score!
Textbook and References
• Main Textbooks
– Kevin Murphy, "Machine Learning: A Probabilistic Perspective", MIT Press, 2012.
– Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
• Other textbooks
– Tom Mitchell, "Machine Learning", McGraw Hill, 1997.
• Machine learning lectures
Content of the Course
• Regression: linear regression, logistic regression
• Dimensionality Reduction: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis
• Probabilistic Models: Naive Bayes, maximum likelihood estimation
• Statistical Learning Theory: VC dimension
• Kernels: Support Vector Machines (SVMs), kernel tricks, duality
• Sequential Models and Structural Models: Hidden Markov Model (HMM), Conditional Random Fields (CRFs)
• Clustering: spectral clustering, hierarchical clustering
• Latent Variable Models: K-means, mixture models, expectation-maximization (EM) algorithms, Latent Dirichlet Allocation (LDA), representation learning
• Deep Learning: feedforward neural network, restricted Boltzmann machine, autoencoders, recurrent neural network, convolutional neural network
• Reinforcement Learning: Markov decision processes, Q-learning
• And others, including advanced topics for machine learning in natural language processing and text analysis
The Goal
• Scientific understanding of machine learning models
• How to apply and design learning methods for novel problems
• Not only what, but also why!
Grading
• Assignments
– 3 assignments, 10% each
• Quizzes
– 10 in-class tests, 1% each
• Exam
– 1 exam, 30%
• Project
– 1 project, 27%
• Participation: 3%
– Classes
– Piazza
Exam
• Open book
• April 20, 2017
Course Project
• A machine learning relevant research project
• 2-3 students as a team
Topics
• Machine learning relevant
– Natural language processing
– Computer vision
– Robotics
– Bioinformatics
– Health informatics
– …
Course Project Grading
• We want to see novel and interesting projects!
– The problem needs to be well-defined, novel, useful, and practical
– Appropriate use of machine learning techniques
– Reasonable results and observations
Projects from Last Year
• Predicting Follow-back Behavior in Instagram Users
• Predicting Grasp Points Using Convolutional Neural Networks
• Artificial Neural Networks for Drug Response Prediction in Tailored Therapy
• Threat Detection from Twitter
• Player Ranking in Popular Games
Course Project Grading
• Three reports
– Proposal (2%)
– Progress, with code (10%)
– Final, with code (10%)
• One presentation
– In class (5%)
Submission and Late Policy
• Each assignment or report, both electronic copy and hard copy, is due at the beginning of class on the corresponding due date.
• Programming language
– Python, Java, C/C++, Matlab
• Electronic version
– On Blackboard
• Hard copy
– In class
• An assignment or report turned in late will be charged 10 points (out of 100) for each late day (i.e., 24 hours).
• Each student has a budget of 5 late days throughout the semester before the late penalty is applied.
How to Find Us?
• Course webpage:
– http://www.ccs.neu.edu/home/luwang/courses/cs6140_sp2017.html
• Office hours
– Lu Wang: Thursdays from 4:30pm to 5:30pm, or by appointment, 448 WVH
– Rui Dong (TA): Tuesdays from 4:00pm to 5:00pm, or by appointment, 466B WVH
• Piazza
– http://piazza.com/northeastern/spring2017/cs614002
– All course-relevant questions go here
What is Machine Learning?
• "A set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data, or to perform other kinds of decision making under uncertainty."
Real World Applications
Relations with Other Areas
• Natural Language Processing
• Computer Vision
• Robotics
• A lot of other areas…
Today's Outline
• Basic concepts in machine learning
• K-nearest neighbors
• Linear regression
• Ridge regression
Supervised vs. Unsupervised Learning
• Supervised learning
– Training set, training sample, gold-standard label
– Classification, if the label is categorical
– Regression, if the label is numerical
Supervised Learning
• Goal:
– Generalizable to new input samples
– Overfitting vs. underfitting
– One solution: we use probabilistic models
• Typical setup:
– Step 1: Features
– Step 2: Training set, test set, development set
– Step 3: Evaluation
• Regression examples
– Predicting stock price
– Predicting temperature
– Predicting revenue
– …
Supervised vs. Unsupervised Learning
• Unsupervised learning
• More about "knowledge discovery"
Unsupervised Learning
• Dimension reduction
– Principal component analysis
• Clustering (e.g., graph mining)
– [Figure from "RolX: Role Extraction and Mining in Large Networks", Henderson et al., 2011]
• Topic modeling
Parametric vs. Non-parametric Models
• Fixed number of parameters?
– If yes, a parametric model
• Number of parameters grows with the amount of training data?
– If yes, a non-parametric model
• Computational tractability
Today's Outline
• Basic concepts in machine learning
• K-nearest neighbors
– Supervised learning
– A non-parametric classifier
• Linear regression
• Ridge regression
A Non-parametric Classifier: K-Nearest Neighbors (KNN)
• Basic idea: memorize all the training samples
– The more you have in training data, the more the model has to remember
• Nearest neighbor (or 1-nearest neighbor):
– Testing phase: find the closest training sample, and return its label
• K-nearest neighbors:
– Testing phase: find the K nearest neighbors, and return the majority vote of their labels
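The testing phase described above can be sketched in a few lines of plain Python (an illustrative sketch only, not course code; the data, function names, and choice of Euclidean distance are assumptions):

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Return the majority-vote label of the k training samples closest to x."""
    # Squared Euclidean distance from x to every memorized training sample.
    dists = [sum((a - b) ** 2 for a, b in zip(xi, x)) for xi in train_X]
    # Indices of the k nearest neighbors.
    nearest = sorted(range(len(train_X)), key=lambda i: dists[i])[:k]
    # Majority vote over their labels; k = 1 gives the nearest-neighbor rule.
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = ["no", "no", "yes", "yes"]
print(knn_predict(X, y, (0.9, 0.9), 3))  # -> yes
```

Note that there is no training step beyond storing the data, which is exactly why kNN is non-parametric.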
About K
• K = 1: just piecewise-constant labeling
• K = N: global majority vote (always predicts the most frequent class)
Problems of kNN
• Can be slow when training data is big
– Searching for the neighbors takes time
• Needs lots of memory to store the training data
• Needs to tune K and the distance function
• Not a probability distribution
• Choice of distance function
– Euclidean distance
– Mahalanobis distance: weights on components
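The two distance functions can be sketched with NumPy (illustrative; `S` stands for an assumed covariance matrix supplying the per-component weights):

```python
import numpy as np

def euclidean(x, y):
    # Unweighted: every component contributes equally.
    return np.sqrt(np.sum((x - y) ** 2))

def mahalanobis(x, y, S):
    # S^{-1} re-weights (and de-correlates) the components before
    # measuring distance; S = identity recovers the Euclidean distance.
    d = x - y
    return np.sqrt(d @ np.linalg.inv(S) @ d)

x, y = np.array([1.0, 0.0]), np.array([0.0, 0.0])
print(euclidean(x, y))                         # 1.0
print(mahalanobis(x, y, np.diag([4.0, 1.0])))  # 0.5: first axis down-weighted
```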
Probabilistic kNN
• We prefer a probabilistic output because sometimes we may get an "uncertain" result
– 1 sample says "yes", 199 samples say "no" → ?
– 99 samples say "yes", 101 samples say "no" → ?
• Probabilistic kNN: return the fraction of each class among the K neighbors,
– p(y = c | x) = (1/K) Σ_{i ∈ neighbors(x)} I(y_i = c)
Probabilistic kNN
• [Figure: 3-class synthetic training data]
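A minimal sketch of the probabilistic output: the predicted probability of each class is simply its fraction among the K retrieved neighbors (function and variable names are mine, not the lecture's):

```python
from collections import Counter

def knn_probs(neighbor_labels, classes):
    """Empirical class distribution over the K retrieved neighbors."""
    counts = Counter(neighbor_labels)
    k = len(neighbor_labels)
    return {c: counts[c] / k for c in classes}

# The two cases from the slide, with K = 200 neighbors:
confident = knn_probs(["yes"] * 1 + ["no"] * 199, ["yes", "no"])
uncertain = knn_probs(["yes"] * 99 + ["no"] * 101, ["yes", "no"])
print(confident)  # {'yes': 0.005, 'no': 0.995}
print(uncertain)  # {'yes': 0.495, 'no': 0.505}
```

A hard majority vote returns "no" in both cases; the probabilities make clear that only the first answer is confident.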
Smoothing
• Class 1: 3, class 2: 0, class 3: 1
• Original probability:
– P(y=1) = 3/4, P(y=2) = 0/4, P(y=3) = 1/4
• Add-1 smoothing:
– Class 1: 3+1, class 2: 0+1, class 3: 1+1
– P(y=1) = 4/7, P(y=2) = 1/7, P(y=3) = 2/7
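The worked example above can be reproduced directly (a sketch; the function name is mine):

```python
def add_one_smoothing(counts):
    """Add 1 to every class count, then renormalize to probabilities."""
    smoothed = {c: n + 1 for c, n in counts.items()}
    total = sum(smoothed.values())  # 4 + 1 + 2 = 7 for the slide's example
    return {c: n / total for c, n in smoothed.items()}

probs = add_one_smoothing({1: 3, 2: 0, 3: 1})
print(probs)  # {1: 4/7, 2: 1/7, 3: 2/7} -- no class has zero probability
```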
Softmax
• Class 1: 3, class 2: 0, class 3: 1
• Original probability:
– P(y=1) = 3/4, P(y=2) = 0/4, P(y=3) = 1/4
• Redistribute probability mass into different classes
– Define a softmax as softmax(a)_c = exp(a_c) / Σ_{c'} exp(a_{c'})
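A sketch of the standard softmax, here applied to the raw neighbor counts as the scores (that application is my assumption for illustration; the lecture's definition may include a temperature parameter):

```python
import math

def softmax(scores):
    """softmax(a)_c = exp(a_c) / sum over c' of exp(a_c')."""
    m = max(scores)  # subtracting the max avoids overflow; result unchanged
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

p = softmax([3, 0, 1])  # counts for classes 1, 2, 3
print(p)  # roughly [0.84, 0.04, 0.11]: every class receives some mass
```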
Today's Outline
• Basic concepts in machine learning
• K-nearest neighbors
• Linear regression
– Supervised learning
– A parametric model
• Ridge regression
A Parametric Model: Linear Regression
• Assumption: the response is a linear function of the inputs:
– y(x) = wᵀx + ε
– wᵀx: inner product between input sample x and weight vector w
– ε: residual error, the difference between the prediction and the true label
• Assume the residual error has a normal distribution:
– ε ~ N(0, σ²), equivalently p(y | x, θ) = N(y | wᵀx, σ²)
• We can further assume a basis function expansion:
– p(y | x, θ) = N(y | wᵀφ(x), σ²), where φ(x) is a nonlinear mapping of the inputs
• [Figure: fitted surface; vertical axis: temperature, horizontal axes: location within a room]
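As one concrete instance of basis function expansion (a hypothetical polynomial basis; the lecture's φ may differ), the model stays linear in w even though it is nonlinear in x:

```python
import numpy as np

def poly_features(x, degree):
    """phi(x) = [1, x, x^2, ..., x^degree] for a vector of scalar inputs."""
    return np.vstack([x ** d for d in range(degree + 1)]).T

x = np.array([0.0, 1.0, 2.0])
print(poly_features(x, 2))
# [[1. 0. 0.]
#  [1. 1. 1.]
#  [1. 2. 4.]]
```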
Learning with Maximum Likelihood Estimation (MLE)
• Maximum Likelihood Estimation (MLE):
– θ̂ = argmax_θ log p(D | θ)
• Log-likelihood:
– ℓ(θ) = Σ_{i=1}^N log p(y_i | x_i, θ)
• Maximizing the log-likelihood is equivalent to minimizing the negative log-likelihood (NLL)
• With our normal distribution assumption:
– NLL(w) = (1/(2σ²)) Σ_{i=1}^N (y_i − wᵀx_i)² + const
– Σ_i (y_i − wᵀx_i)² is the residual sum of squares (RSS) → we want to minimize it!
Derivation of MLE for Linear Regression
• Rewrite our objective function as:
– NLL(w) = (1/2) wᵀ(XᵀX)w − wᵀ(Xᵀy), dropping terms that do not depend on w
• Get the derivative (or gradient):
– g(w) = XᵀXw − Xᵀy
• Set the derivative to 0:
– XᵀXw = Xᵀy
• Ordinary least squares solution:
– ŵ_OLS = (XᵀX)⁻¹Xᵀy
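The closed-form solution can be checked numerically (a sketch on synthetic data; the true weights, noise level, and sizes are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: y = 1 + 2*x1 - 3*x2 + small Gaussian noise.
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])  # bias column + 2 features
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w + 0.01 * rng.normal(size=100)

# Normal equations: solve (X^T X) w = X^T y rather than forming the inverse.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(w_hat)  # close to [1, 2, -3]

# np.linalg.lstsq solves the same least-squares problem, more stably.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(w_hat, w_lstsq))
```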
Overfitting
• Feature weights w: [figure omitted]
A Prior on the Weights
• Zero-mean Gaussian prior:
– p(w) = Π_j N(w_j | 0, τ²)
• New objective function (maximize the log-likelihood plus the log-prior):
– Σ_{i=1}^N log N(y_i | w₀ + wᵀx_i, σ²) + Σ_{j=1}^D log N(w_j | 0, τ²)
Today's Outline
• Basic concepts in machine learning
• K-nearest neighbors
• Linear regression
• Ridge regression
Ridge Regression
• We want to minimize:
– J(w) = (1/N) Σ_{i=1}^N (y_i − (w₀ + wᵀx_i))² + λ‖w‖₂²
• New estimation for the weights:
– ŵ_ridge = (λI_D + XᵀX)⁻¹Xᵀy
• The ‖w‖₂² penalty is called L2 regularization
• The proof is left for Assignment 1!
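The ridge estimate can be verified numerically as well (a sketch; unlike the objective above, this version has no separate bias term w₀, so every weight is penalized):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """w = (lam * I + X^T X)^{-1} X^T y, computed via a linear solve."""
    d = X.shape[1]
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

w_ols = ridge_fit(X, y, 0.0)    # lam = 0 recovers the OLS solution
w_reg = ridge_fit(X, y, 10.0)   # larger lam shrinks the weights toward zero
print(np.linalg.norm(w_ols), np.linalg.norm(w_reg))
```

As lam grows, the weight norm shrinks toward zero, which is the shrinkage effect L2 regularization is designed to produce.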
What We Learned
• Basic concepts in machine learning
• K-nearest neighbors
• Linear regression
• Ridge regression
Homework
• Reading: Murphy ch. 1, ch. 2, and ch. 7 (only the sections covered in the lecture)
• Sign up at Piazza
– http://piazza.com/northeastern/spring2017/cs614002
• Start thinking about the course project and find a team!
– Project proposal due Jan 26
• Next Time: Logistic Regression, Decision Trees, Generative Models (Naive Bayes)
– Reading: Murphy Ch. 3, 8.1-8.3, 8.6, 16.2