Practical Data Science: The WPC Healthcare Strategy for Delivering Meaningful Data Science Projects

  • Published on
    14-Jan-2017

Transcript

2015 Healthcare Data Science
Practical Data Science: The WPC Healthcare Strategy for Delivering Meaningful Data Science Projects
Damian Mingle

@OPENDATASCI

We primarily focus on clinical, financial, and operational data. We work with both Payers and Providers in Healthcare.

Representative Clients

A sample of the clients that we work with.

What's the Problem?
A Common Scenario: Johnny Data Scientist

  • Does not like working with others
  • Too much black magic, not enough explanation
  • His process is always different
  • Uses multiple languages
  • Hates producing presentations
  • Project status is constantly unclear
  • Doesn't capture business needs
  • Models aren't production quality

In a business context, don't be a Johnny Data Scientist.

Why Jupyter?
Interactive Computing Environment

  • Notebook Web Application: Writing and running code interactively
  • Kernels: Over 40 programming languages
  • Notebook Documents: Self-contained documents which include live code, interactive widgets, plots, narrative text, equations, images, and video

Some benefits:

  • It gets things out into the open
  • It serves as a great placeholder
  • All your analysis can be in a single place, even if you work with other languages

Why a Data Science Methodology?
Data Science Projects Involve Risk

  • Strategically: Provides confidence to the business that Data Science projects can be delivered profitably
  • Tactically: Management can understand status assessments
  • Operationally: Empowers the Data Science team to do the right thing, the right way, the first time

Some benefits:

  • Having a method allows you to be creative
  • Having a method allows a business-only person to follow along
  • Having a method allows you to capture learnings that you can reuse in future data science projects

Business Understanding
Uncover Important Factors at the Start

  • Determine business objectives
  • Assess situation
  • Determine data science goals
  • Produce project plan

Understand the Data Science project objectives and requirements from a business perspective. Then convert this knowledge into a Data Science problem definition and a preliminary plan designed to achieve the objective.

Exercise 1

https://github.com/drmingle/Boston-Data-Festival-2015/tree/master/Exercises

Data Understanding
Become Familiar with the Data

  • Collect initial data
  • Describe data
  • Explore data
  • Verify data quality

Identify data quality problems, discover first insights into the data, and/or detect interesting subsets to form hypotheses regarding hidden information.
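The describe and verify steps above can be sketched in a few lines of Python, the kind of cell one might run in a Jupyter notebook. This is a minimal sketch and not the code from the exercises: the sample records, the field names (age, charge), and the helper profile_column are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical sample records; None marks a missing value.
records = [
    {"patient_id": 1, "age": 34, "charge": 1200.0},
    {"patient_id": 2, "age": None, "charge": 860.0},
    {"patient_id": 3, "age": 51, "charge": None},
    {"patient_id": 4, "age": 47, "charge": 2300.0},
]


def profile_column(rows, column):
    """Describe one numeric column and count its missing values."""
    present = [row[column] for row in rows if row[column] is not None]
    return {
        "count": len(present),
        "missing": len(rows) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
    }


for name in ("age", "charge"):
    print(name, profile_column(records, name))
```

A summary like this makes missing values and suspicious ranges visible early, before any modeling effort is spent on the data.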

Exercise 2

https://github.com/drmingle/Boston-Data-Festival-2015/tree/master/Exercises

Data Preparation
Construct the Final Dataset

  • Select data
  • Clean data
  • Construct data
  • Integrate data
  • Format data

Data Science tasks in this phase involve selecting tables, records, and attributes, as well as transforming and cleaning the data.
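The five preparation tasks above can be illustrated with a small Python sketch. It is only an example under stated assumptions: the two raw tables (visits, payers), their fields, the senior cutoff of 50, and the function prepare are all hypothetical, and dropping records with a missing age is just one possible cleaning rule.

```python
# Hypothetical raw tables; names and values are illustrative only.
visits = [
    {"patient_id": 1, "age": "34", "charge": "1200.50", "notes": "n/a"},
    {"patient_id": 2, "age": "", "charge": "860.00", "notes": "follow-up"},
    {"patient_id": 3, "age": "51", "charge": "2300.00", "notes": ""},
]
payers = {1: "Payer A", 2: "Payer B", 3: "Payer A"}


def prepare(visit_rows, payer_by_patient):
    """Build the final dataset from the raw tables."""
    prepared = []
    for row in visit_rows:
        # Clean data: skip records whose age is missing.
        if not row["age"]:
            continue
        # Format data: convert text fields to numbers.
        age = int(row["age"])
        charge = float(row["charge"])
        prepared.append({
            # Select data: keep only the needed attributes; "notes" is dropped.
            "patient_id": row["patient_id"],
            "age": age,
            "charge": charge,
            # Construct data: derive a new attribute (assumed cutoff of 50).
            "is_senior": age >= 50,
            # Integrate data: bring in the payer from a second table.
            "payer": payer_by_patient[row["patient_id"]],
        })
    return prepared


final_dataset = prepare(visits, payers)
for row in final_dataset:
    print(row)
```

Keeping every step inside one function means the same preparation can be rerun unchanged whenever the raw data is refreshed.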

Exercise 3

https://github.com/drmingle/Boston-Data-Festival-2015/tree/master/Exercises

Modeling
Various Modeling Techniques Are Selected

  • Select modeling technique
  • Generate test design
  • Build model
  • Assess model

In this phase, calibrating parameters is important. Some techniques may require the Data Scientist to go back to the data preparation phase.
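As a rough illustration of the test design, build, and assess steps above, the sketch below holds out part of a dataset, fits a straight line by ordinary least squares, and scores it on the held-out part. Everything here is an assumption made for the example: the synthetic data, the 75/25 split, the choice of a one-variable linear model, and the use of mean absolute error as the assessment metric. It is not the technique used in the exercises.

```python
import random
from statistics import mean

# Hypothetical data: y is roughly 2 * x + 1 with a little noise.
random.seed(0)
data = [(x, 2 * x + 1 + random.uniform(-0.5, 0.5)) for x in range(20)]

# Generate test design: shuffle, then hold out 25% of the data.
random.shuffle(data)
split = int(len(data) * 0.75)
train, test = data[:split], data[split:]


def fit_line(points):
    """Build model: ordinary least squares for y = slope * x + intercept."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_absolute_error(points, slope, intercept):
    """Assess model: average absolute error on the given points."""
    return mean(abs(y - (slope * x + intercept)) for x, y in points)


slope, intercept = fit_line(train)
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("held-out MAE:", round(mean_absolute_error(test, slope, intercept), 3))
```

Scoring on data the model never saw during fitting is what makes the assessment meaningful; a poor held-out score is one signal that it may be time to return to data preparation.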

Exercise 4

https://github.com/drmingle/Boston-Data-Festival-2015/tree/master/Exercises

Evaluation
Review Your Steps with Certainty

  • Evaluate results
  • Review process
  • Determine next steps

At the end of this phase, a decision on the use of the data science results should be reached.

Exercise 5

https://github.com/drmingle/Boston-Data-Festival-2015/tree/master/Exercises

Deployment
Make Use of the Model

  • Plan deployment
  • Plan monitoring and maintenance
  • Produce final report
  • Review project

This phase can be as simple as generating a report or as complex as implementing a repeatable data science process across the enterprise.

Exercise 6

https://github.com/drmingle/Boston-Data-Festival-2015/tree/master/Exercises

Data Scientist 2.0
Lead Your Organization Analytically

  • Use Jupyter to document your process in real time, using whatever language you want!
  • Establish a Data Science Methodology that is comprehensive
  • Provide insights that help the organization make better decisions to solve its business problems

Have Questions?

  • E-mail: dmingle@wpchealthcare.com
  • Twitter: @damianmingle
  • LinkedIn: DamianRMingle
