Demystifying the Data Scientist
Dan McClary, Ph.D. Big Data Product Management Oracle
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Data Scientists: By The Numbers
1. What’s a Data Scientist?
2. Do I need a Data Scientist?
3. How do I grow my own Data Scientist?
What’s a Data Scientist?
What’s Data Science?
Buzzword or Essential Discipline?
• The buzz around “Data Science” is growing
• But isn’t it a bit like saying “chai tea”?
• What is a functional definition for data science?
A Working Definition
• Data Science seeks to
– Extract meaning from data
– Create “data products”
– Use all available data to tell a valuable story to non-practitioners
• So what makes a Data Scientist?
Anatomy of a Data Scientist
Statistical analysis
Scientific training
PhD in Computer Science? Statistics? Physics? Biology?!
Production-grade programmer in Java? Python? SQL?
Business sensibility
Visualization
IT Operations
Databases
Design Sensibility
Published researcher
BI Tools
Machine Learning
Pattern Recognition
Competitive Intelligence
Hadoop
Big Data
Excellent Communicator/Presenter
JavaScript
Anatomy of a Data Scientist
Does anyone like that even exist?
Anatomy of a Data Scientist: Revised
A person who has some degree of experience in each of:
• Business
– Value Proposition
– Goals
– Communicate Results
• Analytics
– Techniques
– Interpretation
– Model Requirements
• Data
– Integration
– Manipulation
– Quality Assurance
Do I Need a Data Scientist?
Do You Need A Data Scientist?
• Do you need an army of PhDs to solve machine learning problems?
– Probably not
• Could you find more value in the data you collect, or could collect?
– Undoubtedly
• Do you need people to find that value?
– Almost certainly
Fitting for Data Scientists
Who? Where? How Many?
• Where?
– Kaggle.com – a community for Data Science
• 100,000+ members
– KDnuggets – forum for Data Mining and Data Science
• Who do I hire?
– Some call themselves “data scientists,” but most call themselves
• Mathematicians
• Scientists
• Researchers
• Physicists
Fitting for Data Scientists
How many do I need?
• Most organizations will benefit from a few seasoned data scientists
– Help transition to a more data-driven business
– Direct efforts to integrate analytics more tightly with lines of business (LoBs)
– Bring a good understanding of how to tackle new problems
• Data scientists can be grown at home
– Leverage the existing workforce
– Provide growth opportunities for employees
How do I grow Data Scientists?
Step #1: Find Motivated Individuals
Sources for Good Candidates
• Developers who want to
– Become more statistically oriented
– Better understand business challenges
• Business Analysts who
– Have some programming ability
– Want to grow their technical capabilities
• All candidates should
– Possess tremendous curiosity
– Be able to self-manage
Step #2: Find Low-Hanging Fruit
Analytically Important, Not Impossible
• Find a project that
– Has a high ROI
– Has a limited, defined scope
– Isn’t impossible
• Define
– The business value
– The time to invest
[Chart: Value to Business (vertical axis) vs. Time to Answer (horizontal axis)]
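One way to make the value-versus-time trade-off concrete is to score candidate projects. The sketch below is only an illustration, not a method from this talk: the `Project` class, the `low_hanging_fruit` helper, the value-per-week ratio, and the sample projects and numbers are all hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """A candidate project; all fields are illustrative estimates."""
    name: str
    business_value: float   # estimated value, in any consistent unit
    weeks_to_answer: float  # estimated time to a first usable answer


def low_hanging_fruit(projects, max_weeks):
    """Keep projects answerable within max_weeks, best value per week first."""
    feasible = [p for p in projects if 0 < p.weeks_to_answer <= max_weeks]
    return sorted(
        feasible,
        key=lambda p: p.business_value / p.weeks_to_answer,
        reverse=True,
    )


# Hypothetical candidates, invented purely for illustration.
candidates = [
    Project("Churn early-warning report", business_value=80, weeks_to_answer=6),
    Project("Real-time pricing engine", business_value=300, weeks_to_answer=52),
    Project("Duplicate customer cleanup", business_value=30, weeks_to_answer=2),
]

for project in low_hanging_fruit(candidates, max_weeks=12):
    print(project.name)
```

The time cap stands in for "limited, defined scope" and "isn't impossible"; the ratio is one simple proxy for ROI, and a real team would likely weigh other factors too.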
Step #3: Combine
• Add your data science team
• And the well-defined project
– Add a seasoned data scientist for best results
• Watch the team grow new skills
• Evaluate the outcome
– For the team members
– For the business
And Iterate
Step #4: Publish and Promote
Share Data Science Results as a Service
[Diagram labels: Data Scientist, Spark, Useful Derived Dataset, Anyone]
Summary
• What is a Data Scientist?
– Someone who can help drive value through data
• Do you need one?
– Possibly
• Can you grow a data scientist?
– Absolutely