Predictive Modeling; RNN, Cognitive Computing; Text Analytics

Gur Saran Adhar Hon. Visiting Professor DEI

Professor, Univ. Of North Carolina Wilmington, US

adharg@uncw.edu

Just for example:

1.  To predict if the cells observed under the microscope indicate a malignant or a benign tumor (level of confidence)?

2.  Given the wind direction, pressure, humidity, and temperature changes over the South-East US, predict the landfall of an incoming hurricane (level of confidence).

3.  Given the history of batting performance by the Indian cricket team on soft wickets, predict how the Indian team will perform in Natal, South Africa.

4.  Given the payment history of a client, predict the risk associated with his loan request.

To evaluate a Predictive Model

Metrics for Performance Evaluation
•  Limitations of Accuracy as Performance Evaluation
•  Overcoming Limitations of Accuracy Measure

Precision, Sensitivity, Specificity

Example: Predicting a (malignant) tumor from lab test

TP (True Positive): Prediction that the tumor is malignant is consistent with what is discovered in surgery.

TN (True Negative): Prediction that it is not malignant (benign) is consistent with what is discovered in surgery.

FP (False Positive): Prediction that the tumor is malignant, however it turns out to be benign in surgery. (cost…? Needless surgery)

FN (False Negative): Prediction that the tumor is not malignant, but it is indeed malignant. (cost…? Life threatening)

Accuracy as a measure

Evaluating the Predictive Model
In other words, how good is our prediction?
Accuracy as Evaluation tool
Situation: Positive cases 990; Negative 10
If the model predicts everything to be Positive, accuracy is 99%
BUT: The model fails on negative cases. What if negative cases are really important and costly to overlook?
For example, predicting a tumor as benign when it is indeed malignant. Or predicting that it is malignant when it is indeed benign.

Precision, Sensitivity, Specificity as a measure

Precision = TP / (TP + FP)
Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
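The ratios above can be checked against the 990/10 situation from the accuracy slide. The sketch below is only illustrative: the function name `confusion_metrics` is made up here, and accuracy is taken in its usual sense of (TP + TN) / (all cases), which the slides use but do not spell out.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, sensitivity and specificity from the four counts."""
    def ratio(num, den):
        # A metric is undefined when its denominator is empty; report None then.
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }

# 990 positive and 10 negative cases; the model predicts "positive" for everything.
metrics = confusion_metrics(tp=990, tn=0, fp=10, fn=0)
print(metrics)
```

Accuracy and precision both come out at 0.99 and sensitivity at 1.0, while specificity is 0.0: the model never recognizes a negative case, which accuracy alone hides.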

A bit on Modeling techniques

Common Modeling techniques for Prediction

•  Supervised Learning
Describes and distinguishes classes for future prediction (on future data) based on training data.
Common methods: Decision trees, Regression, Nearest Neighbors, Neural networks

•  Unsupervised Learning
Analyzes data where labels are unknown to create groups or classes for objects that are similar to each other (within the group) but are dissimilar to objects in other groups (clusters).
Cluster Analysis
Common methods: K-means, Hierarchical

Supervised Learning Technique – a bit more detail

Classification: Constructs a classification model based on a training set, and uses it for classifying new data. For example, classification of cells in a tissue.

Prediction: Predicting class labels. For example, whether the tissue sample has malignant tumor cells. Models continuous variables and predicts unknown or missing values.

Common methods: Decision trees; Regression; Nearest Neighbors; Neural networks.

…cont… Common Modeling techniques

Association: Analyzing data for events and instances that occur together.

Sampling Data in classification

Simple Samples: Not appropriate for unbalanced data (e.g., 1000 positive and 100 negative cases)

Complex Samples
Clustered samples: used to sample groups or clusters rather than individual units.

Stratified Samples: used to select samples independently within non-overlapping subgroups of the population. For example, take a sample which represents every socio-economic group in an urban population. For example, Men and Women are sampled in equal proportion.
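A minimal sketch of stratified sampling on the unbalanced case mentioned above (1000 positive, 100 negative). The function name, the `fraction` parameter and the toy records are assumptions made for illustration; the slides do not prescribe an implementation.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction independently from each subgroup (stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # keep at least one per stratum
        sample.extend(rng.sample(group, k))
    return sample

# Unbalanced toy data: 1000 positive and 100 negative cases (invented records).
data = [("pos", i) for i in range(1000)] + [("neg", i) for i in range(100)]
sample = stratified_sample(data, key=lambda r: r[0], fraction=0.1)
counts = {label: sum(1 for r in sample if r[0] == label) for label in ("pos", "neg")}
print(counts)
```

Because each stratum is sampled separately, the 10:1 class ratio of the population is preserved in the sample, which a small simple random sample would not guarantee.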

Classification – Training and testing

Splitting the data into Training and testing: Approximately 66-75% for training and 34-25% for testing
Training the Model: On the data with existing classes
Testing the Model: On the data that was not used for training
Evaluating the model: Comparing the accuracy of the model on training and testing sets

Using the Model: Classifying future or unknown objects
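The split, train, test and evaluate steps can be sketched as follows. Everything here is a toy assumption: the data, the 70% training fraction (inside the 66-75% range quoted above) and the one-threshold "model" stand in for a real data set and classifier.

```python
import random

def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and cut it into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, labelled):
    """Fraction of (feature, label) pairs that the model classifies correctly."""
    return sum(1 for x, y in labelled if model(x) == y) / len(labelled)

# Hypothetical toy data: one numeric feature; the label is 1 when it exceeds 50.
data = [(x, int(x > 50)) for x in range(100)]
train, test = train_test_split(data, train_fraction=0.7)

# "Training" here just picks a threshold using the training set alone.
ones = [x for x, y in train if y == 1]
zeros = [x for x, y in train if y == 0]
threshold = (min(ones) + max(zeros)) / 2

def model(x):
    return int(x > threshold)

print(len(train), len(test))
print(accuracy(model, train), accuracy(model, test))
```

A large gap between the two printed accuracies would suggest that the model has memorized the training data rather than learned something that carries over to unseen objects.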

Unsupervised Learning – a bit more

Unsupervised Learning: Analyzes data where labels are unknown to create groups/classes for objects that are similar to each other (within the group) but dissimilar to objects in other clusters.

Cluster Analysis (example: classification of cells into plant cells and skin cells based on morphology)

Common Methods: K-means; Hierarchical; two-step
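To make the K-means idea concrete, here is a minimal one-dimensional sketch: assign each value to its nearest centroid, recompute each centroid as the mean of its group, and repeat. The measurements, the starting centroids and the fixed iteration count are assumptions chosen for illustration, not values from the slides.

```python
def k_means_1d(values, centroids, iterations=10):
    """Plain k-means on numbers: assign to nearest centroid, then recompute means."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical cell sizes (arbitrary units) forming two obvious groups.
sizes = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centroids, clusters = k_means_1d(sizes, centroids=[0.0, 6.0])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the two groups emerge from similarity alone, which is the point of unsupervised learning.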

What is a Neural Network

Popular Neural Networks

RNN (recurrent neural networks)

What is sequence learning?

For example, the autocomplete feature of Google, predicting the next word or phrase.

Recurrent Neural Network

Is a type of Artificial Neural Network designed to recognize patterns in sequences of data such as text, genomes, handwriting, the spoken word, or numerical time series data emanating, for example, from sensors and stock markets
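What makes a network "recurrent" is that each step sees both the current input and the state left over from the previous step. The sketch below uses a single hidden unit with a tanh activation; the scalar weights `w_x`, `w_h`, the bias `b` and the tanh choice are assumptions for illustration, since the slides do not give an architecture.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit recurrent network over a sequence; return every hidden state."""
    h = h0
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# A single pulse followed by silence.
sequence = [1.0, 0.0, 0.0, 0.0]
states = rnn_forward(sequence, w_x=1.0, w_h=0.5, b=0.0)
print(states)
```

The hidden state stays non-zero after the input has dropped to zero: the network carries a fading memory of the first element through the rest of the sequence.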

What is RNN?

RNN

Training RNN

Uses the Backpropagation algorithm for training, but it is applied at every time step, commonly called Backpropagation thru time (BPTT)

Issues with Backpropagation

Vanishing gradient
Exploding gradient

Problems involving context: Long-term dependencies lead to the gradient becoming very small, or very large. Loss of information thru time.
Consequences:
-- Long training time
-- Poor performance
-- Bad accuracy
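Why the gradient shrinks or blows up can be seen in the simplest possible case. For a linear recurrence h_t = w_h * h_{t-1} + x_t (an assumed simplification, not the network from the slides), the chain rule multiplies the gradient by w_h once per time step, so after T steps it is w_h to the power T.

```python
def gradient_through_time(w_h, steps):
    """Size of d h_T / d h_0 for the linear recurrence h_t = w_h * h_{t-1} + x_t."""
    grad = 1.0
    for _ in range(steps):
        grad *= w_h  # chain rule: one factor of w_h per time step
    return grad

for w_h in (0.5, 1.0, 1.5):
    print(w_h, gradient_through_time(w_h, steps=50))
```

With a recurrent weight below 1 the gradient over 50 steps is practically zero (vanishing); above 1 it grows to hundreds of millions (exploding). Real networks add non-linearities and weight matrices, but the repeated-multiplication effect is the same.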

Overcoming these challenges

LSTM (long short-term memory)

Use case

Use Case

Cluster Analysis

Association

Analyzing data for events and instances that occur together. For example, people who buy coffee also buy a cinnamon twist. Association rules
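An association rule such as "coffee -> cinnamon twist" is usually scored by its support (how often both items occur together) and its confidence (how often the second item occurs given the first). These two measures are standard but are not named on the slide, and the baskets below are invented for illustration.

```python
def rule_stats(baskets, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    with_a = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_a if consequent in b]
    support = len(with_both) / len(baskets)
    confidence = len(with_both) / len(with_a) if with_a else 0.0
    return support, confidence

# Hypothetical purchase baskets.
baskets = [
    {"coffee", "cinnamon twist"},
    {"coffee", "cinnamon twist", "juice"},
    {"coffee"},
    {"tea", "cinnamon twist"},
    {"tea"},
]
support, confidence = rule_stats(baskets, "coffee", "cinnamon twist")
print(support, confidence)
```

Here two of five baskets contain both items (support 0.4), and two of the three coffee baskets also contain a cinnamon twist (confidence about 0.67).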

Intro. To Data Mining

What can data mining tell
•  Association rules: For example, people who buy coffee also buy a cinnamon twist.

•  Classification: Finding a model that describes data and classifies it into a set of categories. For example, drivers with high insurance premium also drink.

•  Segmentation: Grouping objects by similarity. For example, customers are grouped into families with children, college students, urban empty nesters

Intro. To Data Mining

Process of discovering
•  Insights (descriptive, business intelligence)
•  Patterns
•  Relationships

Intro. To Data Mining

What knowledge can be extracted
•  Descriptive: what has happened and why.
•  Predictive: what is likely to happen next

Data Mining + Predictive Modeling

Data Mining Algorithms: Create predictive models by analyzing data automatically to look for patterns.

Predictive Models: Contain the patterns that have been found, and use them to make predictions.

Example predictions:
•  Cancer diagnosis
•  Credit risk score
•  Legitimacy of Transaction

Cognitive Computing - Watson Deep Analytics

Cognitive Computing

Mimics certain aspects of Cognition
Learns from data how to predict.
Relies on two main ideas:
-- Machine learning at the core of prediction (predictive modeling)
-- Natural language processing (computational linguistics)

Cognitive Computing

The way humans decide
- Evidence based decisions
- Finds answers and insights locked in data (Physician; Wealth Manager; Metallurgist)
- Puts volumes of (unstructured) information into context
Enhance human expertise

Cognitive Computing

Mirrors some of the key elements of Human Cognitive Capabilities:
1. Observe: visible phenomena and bodies of evidence
2. Interpret: and generate hypotheses
3. Evaluate: which hypotheses are right or wrong (based on evidence).
4. Decide: Choosing the option with a level of confidence

How Watson works?

Watson becomes an expert by going thru a similar four steps (processes) at tremendous speed and scale.
Unlike conventional computing, which handles structured information (databases), Watson can understand unstructured data: information created by humans and meant for other humans.

Text-Analytics

Text-analytics applications extract some kind of useful information from text. Literature, blogs, posts, articles, Wiki posts, tweets, images...

Text-Analytics begins with linguistic annotation
Linguistic annotations are notes about linguistic features of the annotated text which give information about the words and sentences of the text. For example, a part-of-speech tagger adds an annotation to a word to say that the word is a noun, a verb, or some other part of speech.
Linguistic annotations are used by subsequent applications. For example, a text-to-speech application.
The economy will contract next year. (verb)
They will read the contract tomorrow. (noun)
They read the contract yesterday.

What Annotation looks like

Example Application: Named entity recognition

The value of linguistic annotation is seen in applications such as named entity recognition, which identifies, for example, people, job titles, companies, URLs, and phone numbers. The word "general" can be a job title (noun), as in "highest ranking general", but in "the general opinion" it is not a job title (adjective). If a part-of-speech tagger has already added these annotations, the named entity recognizer can improve its precision.

Example: Entity Extraction

•  Company names
•  Dates and times
•  Domain-specific names such as names of diseases in pharmaceutical data
•  Monetary amounts
•  People’s names and social network handles
•  Phrases, negative or positive
•  Product names

Example App. of Text Analytics: Sentiment Analysis

Sentiment analysis is the process of determining a sentiment score from text.
Respond quickly to negative sentiment to minimize its impact. United Airlines.
Product managers want to understand any problems with newly released products and services, so they can fix them quickly. GM Ignition switch problem.
Finance departments want to take customer sentiment into account when doing financial planning. Wells Fargo.

Process of Text Analytics

Tokenizing the text, that is, breaking it down into words and phrases
Detecting term boundaries
Detecting sentence boundaries
Tagging parts of speech: words such as nouns and verbs
Tagging named entities so that they are identified, for example, a person, a company, a place, a gene, a disease, a product …
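The first steps, sentence boundary detection and tokenizing, can be sketched with regular expressions. This is a deliberately naive illustration: the two function names are made up, and the rules would mis-split abbreviations such as "Dr." that a production tokenizer has to handle.

```python
import re

def split_sentences(text):
    """Naive sentence boundary detection: break after ., ! or ? plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

text = "The economy will contract next year. They will read the contract tomorrow."
sentences = split_sentences(text)
for sentence in sentences:
    print(tokenize(sentence))
```

Tagging parts of speech and named entities would then operate on these token lists, for instance to decide whether each "contract" is a verb or a noun.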

The Process of Text Analytics

Parsing, for example, extracting facts and entities from the tagged text
Extracting knowledge to understand concepts such as a personal injury within an accident claim

Grammar based processing

Machine Learning Approach to Classification

1. Train a statistical model using examples which have already been correctly classified (recall supervised learning).

2. Test the trained model using a similar number of examples which have been correctly classified but not used during training.

3. Deploy the trained and tested model to classify new cases, performing the NLP task for which the model was developed.

Algorithms: Naive-Bayes classifier; Decision tree classifier; K nearest neighbor classifier; Maximum Entropy Classifier.
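The train, test and deploy steps can be sketched with the first algorithm in the list, a Naive-Bayes classifier, written here from scratch with add-one smoothing. The example sentences and labels are invented for illustration, and the test set is far smaller than the "similar number of examples" the procedure calls for.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Count documents and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary

def classify(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled examples.
training = [
    ("great flight friendly crew", "positive"),
    ("love the new service", "positive"),
    ("lost my bag terrible service", "negative"),
    ("flight delayed terrible crew", "negative"),
]
testing = [
    ("friendly service love it", "positive"),
    ("terrible delayed bag", "negative"),
]

model = train_naive_bayes(training)                          # step 1: train
correct = sum(classify(model, t) == y for t, y in testing)   # step 2: test
print(correct, "of", len(testing), "test examples correct")
print(classify(model, "great crew"))                         # step 3: deploy
```

The testing examples are kept out of training, so the score in step 2 reflects how the model handles text it has not seen.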

Text analytics understands text

Relies on Natural Language
● Governed by Rules of grammar
● Context, Culture
● Not just key word matches, but reads it by breaking down
● Structurally
● Grammatically
● Relationally

Text-Analytics is Deployed in Watson Inference Engine

Different from simple speech recognition or iPhone Siri, which converts human speech to a set of sentences.
Watson extracts logical responses and draws inferences to potential answers, relying on a broad range of linguistic models and algorithms.

How Watson Learns?

For a domain
● Learn the language of the domain
● Terminology (jargon)
● Mode of thought of the domain

For Example: Cancer treatment

● Many types of cancer
● Each has symptoms
● Symptoms can also occur in patients without cancer
● Side effects of treatments
● Factors affecting treatment
● Watson evaluates Standard Care Practices and thousands of pages of research

Curating the Content

Acquiring literacy in a field
Loading corpus of knowledge in the field into Watson (with help from humans)

Ingestion

Pre-Processing to create indices and metadata (for efficiency).
Knowledge graph to answer precise questions

Training by humans

Training Data: Experts in the field upload question/answer pairs
Learns linguistic patterns of meaning in a domain
Continues learning by interaction between users and Watson, as new research information becomes available

Ready to answer

Gives recommendations (and the confidence)
Metallurgist looking for New Alloys; Researcher looking to Develop Effective Drugs
Uncover new possibilities in data, based on evidence

Evaluate

Generate hypotheses
Look for evidence to support or refute hypotheses
Gives weight to each hypothesis (weighted evidence scores)
Answer with a level of confidence
