32
Sem2 2017 Lecturer: Trevor Cohn Lecture 1. Introduction. Probability Theory COMP90051 Machine Learning Adapted from slides provided by Ben Rubinstein

Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

Embed Size (px)

Citation preview

Page 1: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

Sem22017Lecturer:TrevorCohn

Lecture1.Introduction.ProbabilityTheoryCOMP90051MachineLearning

AdaptedfromslidesprovidedbyBenRubinstein

Page 2: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

WhyLearnLearning?

2

Page 3: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Motivation

• “Wearedrowningininformation,butwearestarvedforknowledge”

- JohnNaisbitt,Megatrends

• Data=rawinformation

• Knowledge=patternsormodelsbehindthedata

3

Page 4: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Solution:MachineLearning• Hypothesis:pre-existingdatarepositoriescontainalotofpotentiallyvaluableknowledge

• Missionoflearning:findit

• Definitionoflearning:

(semi-)automaticextractionofvalid,novel,usefulandcomprehensibleknowledge– intheformofrules,regularities,patterns,constraintsormodels– fromarbitrarysetsofdata

4

Page 5: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

ApplicationsofMLareDeepandPrevalent• Onlineadselectionandplacement

• Riskmanagementinfinance,insurance,security

• High-frequencytrading

• Medicaldiagnosis

• Miningandnaturalresources

• Malwareanalysis

• Drugdiscovery

• Searchengines…

5

Page 6: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

DrawsonManyDisciplines

• ArtificialIntelligence• Statistics• Continuousoptimisation• Databases• InformationRetrieval• Communications/informationtheory• SignalProcessing• ComputerScienceTheory• Philosophy• Psychologyandneurobiology

…6

Page 7: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Job$

7

ManycompaniesacrossallindustrieshireMLexperts:

DataScientistAnalyticsExpertBusinessAnalystStatisticianSoftwareEngineerResearcher…

Page 8: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

AboutthisSubject

8

(refertosubjectoutlineongithub formoreinformation– linkedfromLMS)

Page 9: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

VitalStatistics

9

Lecturers:Weeks1;9-12

Weeks2-8

TrevorCohn(DMD8.,[email protected])A/Prof&FutureFellow,Computing&InformationSystemsStatisticalMachineLearning,NaturalLanguageProcessing

Andrey Kan ([email protected])ResearchFellow,WalterandEliza HallInstituteML,Computationalimmunology,Medicalimageanalysis

Tutors: YasmeenGeorge([email protected])Nitika Mathur ([email protected])YuanLi([email protected])

Contact: Weeklyyoushouldattend2xLectures,1xWorkshop

OfficeHours Thursdays1-2pm,7.03DMDBuilding

Website: https://trevorcohn.github.io/comp90051-2017/

Page 10: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

AboutMe(Trevor)

10

• PhD2007– UMelbourne

• 10yearsabroadUK* EdinburghUniversity,inLanguagegroup* SheffieldUniversity,inLanguage&Machinelearninggroups

• Expertise:Basicresearchinmachinelearning;Bayesianinference;graphicalmodels;deeplearning;applicationstostructuredproblemsintext(translation,sequencetagging,structuredparsing,modellingtimeseries)

Page 11: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

SubjectContent

• Thesubjectwillcovertopicsfrom

Foundationsofstatisticallearning,linearmodels,non-linearbases,kernelapproaches,neuralnetworks,Bayesianlearning,probabilisticgraphicalmodels(BayesNets,MarkovRandomFields),clusteranalysis,dimensionalityreduction,regularisationandmodelselection

• Wewillgainhands-onexperiencewithallofthisviaarangeoftoolkits,workshoppracs,andprojects

11

Page 12: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

SubjectObjectives

• Developanappreciationfortheroleofstatisticalmachinelearning,bothintermsoffoundationsandapplications

• GainanunderstandingofarepresentativeselectionofMLtechniques

• Beabletodesign,implementandevaluateMLsystems

• BecomeadiscerningMLconsumer

12

Page 13: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Textbooks

• Primarilyreferencesto* Bishop(2007)PatternRecognitionandMachineLearning

• Othergoodgeneralreferences:* Murphy(2012)MachineLearning:AProbabilisticPerspective[readfreeebookusing‘ebrary’athttp://bit.ly/29SHAQS]

* Hastie,Tibshirani,Friedman(2001)TheElementsofStatisticalLearning:DataMining,InferenceandPrediction [freeathttp://www-stat.stanford.edu/~tibs/ElemStatLearn]

13

Page 14: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Textbooks

• ReferencesforPGMcomponent* Koller,Friedman(2009)ProbabilisticGraphicalModels:PrinciplesandTechniques

14

Page 15: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

AssumedKnowledge(Week2WorkshoprevisesCOMP90049)

• Programming* Required:proficiencyatprogramming,ideallyinpython* Ideal:exposuretoscientificlibrariesnumpy,scipy,matplotlib etc.(similarinfunctionalitytomatlab &aspectsofR.)

• Maths* Familiaritywithformalnotation* Familiaritywithprobability(Bayesrule,marginalisation)* Exposuretooptimisation(gradientdescent)

• ML:decisiontrees,naïveBayes,kNN,kMeans

15

𝐏𝐫 𝒙 =% 𝐏𝐫(𝒙, 𝒚)�

𝒚

Page 16: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Assessment

• Assessmentcomponents* Twoprojects– onereleasedearly(w3-4),onelate(w7-8);willhave~3weekstocomplete• Firstprojectfairlystructured(20%)• Secondprojectincludescompetitioncomponent(30%)

* FinalExam

• Breakdown* 50%Exam* 50%Projectwork

• 50%Hurdleappliestobothexam andongoingassessment

16

Page 17: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

MachineLearningBasics

17

Page 18: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Terminology

• Inputtoamachinelearningsystemcanconsistof

* Instance:measurementsaboutindividualentities/objectsaloanapplication

* Attribute(akaFeature,explanatoryvar.):componentoftheinstancestheapplicant’ssalary,numberofdependents,etc.

* Label(akaResponse,dependentvar.):anoutcomethatiscategorical,numeric,etc.forfeitvs.paidoff

* Examples:instancecoupledwithlabel<(100k,3),“forfeit”>

* Models:discoveredrelationshipbetweenattributesand/orlabel18

Page 19: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Supervisedvs UnsupervisedLearning

19

Data Modelusedfor

Supervisedlearning Labelled Predictlabelsonnew

instances

Unsupervisedlearning Unlabelled

Cluster relatedinstances;Projecttofewerdimensions;Understandattributerelationships

Page 20: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

ArchitectureofaSupervisedLearner

20

Testdata

Traindata Learner

Model

Evaluation

Examples

Instances

Labels

Labels

Page 21: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Evaluation(SupervisedLearners)

• Howyoumeasurequalitydependsonyourproblem!

• Typicalprocess* Pickanevaluationmetriccomparinglabelvsprediction* Procureanindependent,labelledtestset* “Average”theevaluationmetricoverthetestset

• Exampleevaluationmetrics* Accuracy,Contingencytable,Precision-Recall,ROCcurves

• Whendatapoor,cross-validate

21

Page 22: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Dataisnoisy(almostalways)

22

• Example:* givenmarkforKnowledgeTechnologies(KT)

* predictmarkforMachineLearning(ML)

KTmark

MLm

ark

*syntheticdata:)

Trainingdata*

Page 23: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Typesofmodels

23

𝑦- = 𝑓 𝑥

KTmarkwas95,MLmarkispredictedto

be95

𝑃 𝑦 𝑥

KTmarkwas95,MLmarkislikelytobein

(92,97)

𝑃(𝑥, 𝑦)

probabilityofhaving(𝐾𝑇 = 𝑥,𝑀𝐿 = 𝑦)

𝑥

Page 24: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

ProbabilityTheory

24

Briefrefresher

Page 25: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

BasicsofProbabilityTheory

• Aprobabilityspace:* SetW ofpossibleoutcomes

* SetF ofevents(subsetsofoutcomes)

* ProbabilitymeasureP:Fà R

• Example:adieroll* {1,2,3,4,5,6}

* {j,{1},…,{6},{1,2},…,{5,6},…,{1,2,3,4,5,6}}

* P(j)=0,P({1})=1/6,P({1,2})=1/3,…

25

Page 26: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

AxiomsofProbability

1. 𝑃(𝑓) ≥ 0 foreveryeventf inF

2. 𝑃 ⋃ 𝑓�8 = ∑ 𝑃(𝑓)�8 forallcollections*ofpairwise

disjointevents

3. 𝑃 Ω = 1

26

*Wewon’tdelvefurtherintoadvancedprobabilitytheory,whichstartswithmeasuretheory.Buttobeprecise,additivityisovercollectionsofcountably-manyevents.

Page 27: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

RandomVariables(r.v.’s)

• ArandomvariableX isanumericfunctionofoutcome𝑋(𝜔) ∈ 𝑹

• 𝑃 𝑋 ∈ 𝐴 denotestheprobabilityoftheoutcomebeingsuchthatX fallsintherangeA

• Example:X winningson$5betonevendieroll* Xmaps1,3,5to-5Xmaps2,4,6to5

* P(X=5)=P(X=-5)=½

27

Page 28: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Discretevs.ContinuousDistributions

• Discretedistributions* Governr.v.takingdiscretevalues

* Describedbyprobabilitymassfunctionp(x)whichisP(X=x)

* 𝑃 𝑋 ≤ 𝑥 = ∑ 𝑝(𝑎)DEFGH

* Examples:Bernoulli,Binomial,Multinomial,Poisson

• Continuousdistributions* Governreal-valuedr.v.

* CannottalkaboutPMFbutratherprobabilitydensityfunctionp(x)

* 𝑃 𝑋 ≤ 𝑥 = ∫ 𝑝 𝑎 𝑑𝑎DGH

* Examples:Uniform,Normal,Laplace,Gamma,Beta,Dirichlet

28

Page 29: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

-4 -2 0 2 4

0.0

0.1

0.2

0.3

0.4

x

p(x)

Expectation

• ExpectationE[X]isther.v.X’s“average”value* Discrete:𝐸 𝑋 = ∑ 𝑥𝑃(𝑋 = 𝑥)�

D

* Continuous:𝐸 𝑋 = ∫ 𝑥𝑝 𝑥 𝑑𝑥D

• Properties* Linear:𝐸 𝑎𝑋 + 𝑏 = 𝑎𝐸 𝑋 + 𝑏

𝐸 𝑋 + 𝑌 = 𝐸 𝑋 + 𝐸 𝑌* Monotone:𝑋 ≥ 𝑌 ⇒ 𝐸 𝑋 ≥ 𝐸 𝑌

• Variance:𝑉𝑎𝑟 𝑋 = 𝐸[ 𝑋 − 𝐸 𝑋 T]

29

Page 30: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

IndependenceandConditioning

• X,Y areindependent if* 𝑃 𝑋 ∈ 𝐴, 𝑌 ∈ 𝐵 =𝑃 𝑋 ∈ 𝐴 𝑃(𝑌 ∈ 𝐵)

* Similarlyfordensities:𝑝W,X 𝑥, 𝑦 = 𝑝W(𝑥)𝑝X(𝑦)

* Intuitively:knowingvalueofY revealsnothingaboutX

* Algebraically:thejointonX,Yfactorises!

• Conditionalprobability

* 𝑃 𝐴 𝐵 = Y(Z∩\)Y(\)

* Similarlyfordensities𝑝 𝑦 𝑥 = ](D,^)

](D)

* Intuitively:probabilityeventA willoccurgivenweknoweventB hasoccurred

* X,Yindependentequiv to𝑃 𝑌 = 𝑦 𝑋 = 𝑥 = 𝑃(𝑌 = 𝑦)

30

Page 31: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

InvertingConditioning:Bayes’Theorem

• IntermsofeventsA,B* 𝑃 𝐴 ∩ 𝐵 = 𝑃 𝐴 𝐵 𝑃 𝐵 = 𝑃 𝐵 𝐴 𝑃 𝐴

* 𝑃 𝐴 𝐵 = Y 𝐵 𝐴 Y(Z)Y(\)

• Simplerulethatletsusswapconditioningorder

• Bayesianstatisticalinferencemakesheavyuse* Marginals: probabilitiesofindividualvariables* Marginalisation:summingawayallbutr.v.’s ofinterest

31

Bayes

Page 32: Lecture 1. Introduction. Probability Theory 1. Introduction. Probability Theory COMP90051 Machine Learning ... • Risk management in finance, insurance, security • High-frequency

COMP90051MachineLearning(S22017) L1

Summary

• Whystudymachinelearning?

• Machinelearningbasics

• Reviewofprobabilitytheory

32