Practical Data Science: The WPC Healthcare Strategy for Delivering Meaningful Data Science Projects


2015 Healthcare Data Science
Practical Data Science: The WPC Healthcare Strategy for Delivering Meaningful Data Science Projects
Damian Mingle


We primarily focus on clinical, financial, and operational data. We work with both Payers and Providers in Healthcare.

Representative Clients

A sample of the clients we work with.

What's the Problem?
A Common Scenario: Johnny Data Scientist
  • Does not like working with others
  • Too much black magic, not enough explanation
  • His process is always different
  • Uses multiple languages
  • Hates producing presentations
  • Project status is constantly unclear
  • Doesn't capture business needs
  • Models aren't production quality

In a business context, don't be a Johnny Data Scientist.

Why Jupyter?
Interactive Computing Environment
  • Notebook Web Application: Writing and running code interactively
  • Kernels: Over 40 programming languages
  • Notebook Documents: Self-contained documents which include live code, interactive widgets, plots, narrative text, equations, images, and video

Some benefits:
  • It gets things out into the open
  • It serves as a great placeholder
  • All your analysis can be in a single place, even if you work with other languages
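To make the "self-contained document" idea concrete, here is a minimal sketch that builds a notebook document as plain JSON using only the Python standard library. It assumes the version 4 notebook layout (a top-level `cells` list plus `metadata`, `nbformat`, and `nbformat_minor`); the helper names `markdown_cell`, `code_cell`, and `make_notebook` are hypothetical and exist only for this illustration.

```python
import json

def markdown_cell(text):
    """Narrative text cell (assumed nbformat 4 layout)."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}

def code_cell(code):
    """Live code cell that has not been executed yet, so it has no outputs."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }

def make_notebook(cells):
    """Wrap the cells in the top-level notebook structure."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}

notebook = make_notebook([
    markdown_cell("# Business Understanding\nObjectives and assumptions go here."),
    code_cell("print('analysis starts here')"),
])

# Narrative and code travel together in one JSON document.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Saving `notebook_json` to a file with an `.ipynb` extension should give a document that the notebook application can open, although a real project would normally create notebooks through the application itself.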

Why a Data Science Methodology?
Data Science Projects Involve Risk
  • Strategically: Provides confidence to the business that Data Science projects can be delivered profitably
  • Tactically: Management can understand status assessments
  • Operationally: Empowers the Data Science team to do the right thing, the right way, the first time

Some benefits:
  • Having a method allows you to be creative
  • Having a method allows a business-only person to follow along
  • Having a method allows you to create learnings that you can reuse for future data science projects

Business Understanding
Uncover Important Factors at the Start
  • Determine business objectives
  • Assess situation
  • Determine data science goals
  • Produce project plan
Understand the Data Science project objectives and requirements from a business perspective. Then convert this knowledge into a Data Science problem definition and a preliminary plan designed to achieve the objectives.


Exercise 1

Data Understanding
Become Familiar with the Data
  • Collect initial data
  • Describe data
  • Explore data
  • Verify data quality
Identify data quality problems, discover first insights into the data, and/or detect interesting subsets to form hypotheses regarding hidden information.
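As a rough illustration of the describe and verify steps above, the following sketch summarizes one field and flags simple quality problems using only the standard library. The records, the field names (`patient_id`, `age`, `length_of_stay`), and the validity rules are all invented for this example and do not come from the presentation.

```python
from statistics import mean

# Hypothetical records, invented purely for illustration.
records = [
    {"patient_id": 1, "age": 34, "length_of_stay": 2},
    {"patient_id": 2, "age": None, "length_of_stay": 5},
    {"patient_id": 3, "age": 61, "length_of_stay": -1},
    {"patient_id": 4, "age": 47, "length_of_stay": 3},
]

def describe(rows, field):
    """Describe data: count, missing, min, max, and mean for one numeric field."""
    values = [r[field] for r in rows if r[field] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }

def quality_problems(rows):
    """Verify data quality: return (patient_id, problem) pairs for assumed rules."""
    problems = []
    for r in rows:
        if r["age"] is None:
            problems.append((r["patient_id"], "missing age"))
        if r["length_of_stay"] is not None and r["length_of_stay"] < 0:
            problems.append((r["patient_id"], "negative length_of_stay"))
    return problems

age_summary = describe(records, "age")
issues = quality_problems(records)
print(age_summary)
print(issues)
```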

Exercise 2


Data Preparation
Construct the Final Dataset
  • Select data
  • Clean data
  • Construct data
  • Integrate data
  • Format data
Data Science tasks in this phase involve selecting tables, records, and attributes, as well as transforming and cleaning the data.
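The five preparation tasks above can be sketched in a few lines of standard-library Python. This is only one possible arrangement under assumed data: the `visits` and `patients` tables, their fields, and the `prepare` function are hypothetical.

```python
# Hypothetical source tables, invented for illustration.
visits = [
    {"patient_id": 1, "admit_day": 10, "discharge_day": 12, "notes": "n/a"},
    {"patient_id": 2, "admit_day": 3, "discharge_day": None, "notes": "n/a"},
    {"patient_id": 3, "admit_day": 7, "discharge_day": 15, "notes": "n/a"},
]
patients = {1: {"age": 34}, 3: {"age": 61}}

def prepare(visit_rows, patient_lookup):
    """Build the final dataset from the two hypothetical tables."""
    dataset = []
    for v in visit_rows:
        # Clean: drop visits with a missing discharge day.
        if v["discharge_day"] is None:
            continue
        # Integrate: join each visit to its patient record, skipping unmatched visits.
        p = patient_lookup.get(v["patient_id"])
        if p is None:
            continue
        dataset.append({
            # Select: keep only the attributes needed for modeling (notes is dropped).
            "patient_id": v["patient_id"],
            # Format: store numeric attributes as floats.
            "age": float(p["age"]),
            # Construct: derive length of stay from the two day fields.
            "length_of_stay": float(v["discharge_day"] - v["admit_day"]),
        })
    return dataset

final_dataset = prepare(visits, patients)
print(final_dataset)
```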

Exercise 3


Modeling
Various Modeling Techniques Are Selected
  • Select modeling technique
  • Generate test design
  • Build model
  • Assess model
In this phase, calibrating parameters is important. Some techniques may require the Data Scientist to go back to the data preparation phase.
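A toy sketch of the four modeling steps, with parameter calibration, might look like the following. The technique chosen here (a single-feature threshold rule) is a stand-in picked for brevity, not the method used in the presentation, and the data is synthetic.

```python
# Synthetic (feature, label) pairs, invented for illustration.
data = [(x, int(x >= 6)) for x in range(12)]

# Generate test design: hold out every third row for assessment.
test_rows = data[::3]
train_rows = [row for i, row in enumerate(data) if i % 3 != 0]

def predict(threshold, x):
    """Select modeling technique: a one-parameter threshold rule (assumed)."""
    return int(x >= threshold)

def accuracy(threshold, rows):
    """Share of rows where the threshold rule matches the label."""
    hits = sum(predict(threshold, x) == label for x, label in rows)
    return hits / len(rows)

# Build model: calibrate the threshold on the training rows only.
candidates = range(12)
best_threshold = max(candidates, key=lambda t: accuracy(t, train_rows))

# Assess model: measure performance on rows the calibration never saw.
test_accuracy = accuracy(best_threshold, test_rows)
print(best_threshold, test_accuracy)
```

Keeping calibration on the training rows and assessment on the held-out rows is what makes the assessment meaningful. With real data the same split would normally be handled by a modeling library.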

Exercise 4


Evaluation
Review Your Steps with Certainty
  • Evaluate results
  • Review process
  • Determine next steps
At the end of this phase, a decision on the use of the data science results should be reached.
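One way to make that decision explicit is to compare model results against success criteria agreed with the business. The sketch below assumes such criteria exist as minimum metric values. The metric names, the numbers, and the `evaluate` function are hypothetical.

```python
# Hypothetical success criteria agreed during Business Understanding.
success_criteria = {"recall": 0.80, "precision": 0.60}

# Hypothetical results from assessing the model.
model_results = {"recall": 0.85, "precision": 0.55}

def evaluate(results, criteria):
    """Return a decision plus any metrics that fall short of their minimum."""
    shortfalls = {
        name: (results.get(name), minimum)
        for name, minimum in criteria.items()
        if results.get(name) is None or results[name] < minimum
    }
    decision = "proceed to deployment" if not shortfalls else "iterate"
    return decision, shortfalls

decision, shortfalls = evaluate(model_results, success_criteria)
print(decision, shortfalls)
```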

Exercise 5


Deployment
Make Use of the Model
  • Plan deployment
  • Plan monitoring and maintenance
  • Produce final report
  • Review project
This phase can be as simple as generating a report or as complex as implementing a repeatable data science process across the enterprise.
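For the monitoring and maintenance item, a minimal sketch of one common check is shown below: flag drift when recent model scores move too far from a baseline. The function name `drift_alert`, the two-standard-deviation limit, and the score values are assumptions made for illustration only.

```python
from statistics import mean, stdev

def drift_alert(baseline, recent, max_shift=2.0):
    """Return True when the recent mean is more than max_shift baseline
    standard deviations away from the baseline mean.

    Assumes the baseline has at least two values that are not all identical.
    """
    shift = abs(mean(recent) - mean(baseline)) / stdev(baseline)
    return shift > max_shift

# Hypothetical model scores, invented for illustration.
baseline_scores = [0.20, 0.25, 0.30, 0.22, 0.28]
stable_scores = [0.24, 0.26, 0.23]
shifted_scores = [0.60, 0.65, 0.70]

stable_flag = drift_alert(baseline_scores, stable_scores)
shifted_flag = drift_alert(baseline_scores, shifted_scores)
print(stable_flag, shifted_flag)
```

A check like this could run on a schedule, with an alert prompting the team to revisit data preparation or modeling.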

Exercise 6


Data Scientist 2.0
Lead Your Organization Analytically
  • Use Jupyter to document your process in real time, using whatever language you want!
  • Establish a Data Science Methodology that is comprehensive
  • Provide insights that help the organization make better decisions to solve its business problems


Have Questions?
E-mail: dmingle@wpchealthcare.com
Twitter: @damianmingle
LinkedIn: DamianRMingle

