Prescriptive Analytics Part I
Nick Gonzalez, 2/10/14
“It is change, continuing change, inevitable change, that is the dominant factor in society today. No sensible decision can be made any longer without taking into account not only the world as it is, but the world as it will be.”
- Isaac Asimov
Topics Covered
Reference automated prescriptive analytics system
Automated algorithm selection
Distributed algorithm development
Covered in future presentations
Ontology creation and extraction
Representing solutions using ontologies
Business optimization
Everything else…
Prescriptive Analytics
Scalable
Automated understanding
Automated predictive analytics
Actionable
Closed loop
Example: Video Games
[Diagram spanning user space and analytics space. Node labels: game, game server, metrics, learning process, predictive models, rules, simulations, deploy. Arrow labels: write, start, understanding, build / update models, modify, generate, copy to production.]
Goals
Remove the human element from analysis phases
Generate accurate, actionable, predictive models
Combine predictive models and simulation to solve problems
Guiding Principle
Big data with simple algorithms will outperform
sampled data with complex algorithms.
Process
[Diagram labels, in slide order: Data; Data Engineering & Understanding; Modeling; Prep; Simulation; Actionable Deployment]
1. Automated Understanding
Find the data representation that is best suited
to the problem you are trying to solve.
Automated Understanding
[Diagram labels: Clean Data; Stats; meta; Representation A; Representation B; Representation C; A.1 …; A.2 …]
2. Automated Algorithm Selection
Find the algorithm that performs best against the
problem you are trying to solve, while meeting all
criteria.
Automated Algorithm Selection
Choose algorithms best suited for this type of problem.
Consider the data, types, sparsity, size, and desired outcome
Try multiple algorithms
Calculate the Root Mean Squared Error or some other appropriate measure.
Consider problem domain.
Use cross validation.
Do not just compare the average RMSE
Choose the algorithm(s) that perform the best
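The selection steps above can be sketched in a few lines. This is a minimal illustration and not the system described in the slides: the two candidate models (`fit_mean`, `fit_line`), the synthetic data, and the choice to report the standard deviation of the per-fold RMSE next to its average are all assumptions made for the example. It uses only the Python standard library.

```python
import math
import random
import statistics


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def fit_mean(xs, ys):
    """Baseline candidate (hypothetical): always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate (hypothetical): least-squares line y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cross_validate(fit, xs, ys, k=5):
    """Return the RMSE measured on each of k held-out folds."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(rmse([ys[i] for i in test], [model(xs[i]) for i in test]))
    return scores


def compare(candidates, xs, ys, k=5):
    """Summarize each candidate by the mean and spread of its fold RMSEs."""
    summary = {}
    for name, fit in candidates.items():
        scores = cross_validate(fit, xs, ys, k)
        summary[name] = (statistics.fmean(scores), statistics.stdev(scores))
    return summary


# Synthetic data for illustration only: a noisy linear trend.
rng = random.Random(0)
xs = [float(i) for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 5) for x in xs]

results = compare({"mean": fit_mean, "line": fit_line}, xs, ys)
for name, (avg, spread) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name}: mean RMSE {avg:.2f}, std dev {spread:.2f}")
```

Reporting the spread alongside the average is one way to avoid comparing averages alone; a candidate with a slightly lower mean but much larger fold-to-fold variation may not be the better choice.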
Bottom Up Approach
Hardware
Assembly Language
C, Pascal
C++, Java
Design Patterns, Algorithms
Programmer
Top Down
Problem Solver
Problem Representation
Distributed System Abstractions
Functional Languages
Hardware
Building Distributed Algorithms
Identify the simplest concepts that describe data processing
Collections
Collection processing
Problem Solver
Problem Representation
Distributed System Abstractions
Functional Languages
Hardware
Single “Box”
Evolution of thought
[Diagram labels: Data and Algorithm (each shown twice); Data, Collection, Collection Processing]
No “Box”
Coming together
[Diagram labels: k-means, density, random forest, gradient boost, …; map, mapcat, reduce, filter, sort, group; Hadoop, Single PC, MPI, …]
Distributed Processing Interface
Simple concept
Focus on building algorithms
Many ways to implement this concept
Works with both shared memory systems and distributed memory systems
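One way to picture such an interface is sketched below. It is an illustration in Python and not the interface from the slides: the class names `LocalCollection` and `ThreadedCollection`, their methods, and the sample algorithm are hypothetical. The point is that the algorithm is written once against the collection operations, and the backend that executes them can be swapped. A distributed-memory backend such as Hadoop is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class LocalCollection:
    """Hypothetical single-process backend: data held in a Python list."""

    def __init__(self, items):
        self._items = list(items)

    def map(self, fn):
        return LocalCollection(fn(x) for x in self._items)

    def filter(self, pred):
        return LocalCollection(x for x in self._items if pred(x))

    def reduce(self, fn, initial):
        return reduce(fn, self._items, initial)

    def to_list(self):
        return list(self._items)


class ThreadedCollection(LocalCollection):
    """Hypothetical shared-memory backend: map runs on a thread pool."""

    def __init__(self, items, workers=4):
        super().__init__(items)
        self._workers = workers

    def map(self, fn):
        # The constructor materializes the results before the pool shuts down.
        with ThreadPoolExecutor(max_workers=self._workers) as pool:
            return ThreadedCollection(pool.map(fn, self._items), self._workers)

    def filter(self, pred):
        return ThreadedCollection((x for x in self._items if pred(x)), self._workers)


def sum_of_even_squares(collection):
    """Algorithm written only against the collection operations."""
    return (collection
            .map(lambda x: x * x)
            .filter(lambda x: x % 2 == 0)
            .reduce(lambda acc, x: acc + x, 0))


data = range(10)
local_result = sum_of_even_squares(LocalCollection(data))
threaded_result = sum_of_even_squares(ThreadedCollection(data))
print(local_result, threaded_result)
```

Both calls run the same algorithm code; only the collection passed in differs.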
Implementation
Functional language - Clojure
Reusable functions as callbacks
Hadoop drivers written on top of Cascalog
Data location and type are abstracted as “collection”
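The idea of reusable functions passed as callbacks to collection operations can be illustrated with a single k-means update step. The slides name Clojure and Cascalog; the sketch below is in Python for illustration only, and every name in it (`group_by`, `nearest_index`, `add_points`, `kmeans_step`) is hypothetical, as is the tiny in-memory data set.

```python
from collections import defaultdict
from functools import reduce


def group_by(key_fn, collection):
    """Collection operation: bucket items by the key a callback returns."""
    groups = defaultdict(list)
    for item in collection:
        groups[key_fn(item)].append(item)
    return dict(groups)


def nearest_index(centroids):
    """Build a reusable callback mapping a point to its closest centroid index."""
    def callback(point):
        distances = [sum((p - c) ** 2 for p, c in zip(point, centroid))
                     for centroid in centroids]
        return distances.index(min(distances))
    return callback


def add_points(a, b):
    """Reusable reduce callback: component-wise sum of two points."""
    return tuple(x + y for x, y in zip(a, b))


def kmeans_step(points, centroids):
    """One k-means update expressed as group, then reduce, then map."""
    clusters = group_by(nearest_index(centroids), points)
    updated = list(centroids)  # centroids with no members keep their position
    for index, members in clusters.items():
        total = reduce(add_points, members)
        updated[index] = tuple(map(lambda s: s / len(members), total))
    return updated


# Illustrative data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
for _ in range(3):
    centroids = kmeans_step(points, centroids)
print(centroids)
```

Because `kmeans_step` only iterates over `points` and hands callbacks to the grouping and reducing steps, where the points live and how they are stored stays outside the algorithm.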