UX & Social
Wearables & Internet of things
Mobile
Quality Engineering
Gaming
Product Innovation
Consumer Experience
Cloud Computing & Infra
Digital Content
Big Data & High Performance
After Going Live
Enterprise Consumerization
Mobile Product Dev / Native & Hybrid
Cloud Computing/ Managed Services / Information Security
E-Commerce / Travel / Front End Engineering
Product Innovation / Innovation as a Service / Product Landing
Design Thinking / User Experience Design / Visual Design / Service Design / Creative Workshops
Test Automation / Mobile Testing / Game QA
Hardware Design & Integration / Native Wearable / Wearable App Usability / Interface Design
Digital Platforms / Game Development / Graphic Engineering
Content Management / Digital Marketing / Video Content Production / E-Learning
Software Archaeology / Software Maintenance
Data Architecture / Data Science / Data Visualization
Collaboration Solutions / Process Engineering Tools
PODs
Wisdom Nuggets from practicing Data Science in the “Real World”
Different people have different ideas about what Data Science actually is
● Common definition as key success factor
● Interdisciplinary approach
● Effective decision-making models & Production Environment
The intended use of the output may differ from a given technique’s purpose
● It all begins with a question
● Predicted Values, Decision Logic, Feature Relevance, Variable Impact
● Predictive vs. Explicative
Model complexity is a wolf in sheep’s clothing
● Simplest Model that gets the job done
● More degrees of freedom mean a higher risk of overfitting, more chances of unintended consequences, less stability, and more false discoveries
● Well Done usually looks “Simple”
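As a rough illustration of the overfitting risk, the sketch below compares a two-parameter straight line with a model that simply memorizes every training point (a 1-nearest-neighbour lookup). The data-generating process, the noise level, and all function names are assumptions made up for this example; they do not come from the slides.

```python
import random

def make_data(n, rng):
    """Hypothetical process: y = 2x + Gaussian noise (sd = 3)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 3.0) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for a straight line: 2 free parameters."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """1-nearest-neighbour lookup: effectively one parameter per training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def compare(seed=0):
    """Fit both models on training data and score them on fresh data."""
    rng = random.Random(seed)
    train = make_data(100, rng)
    test = make_data(500, rng)
    line = fit_line(*train)
    memorizer = fit_memorizer(*train)
    return {
        "line_train": mse(line, *train),
        "line_test": mse(line, *test),
        "memorizer_train": mse(memorizer, *train),
        "memorizer_test": mse(memorizer, *test),
    }

if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.2f}")
```

The memorizer fits the training data perfectly because it reproduces the noise, and that is exactly why it does worse than the simpler line on data it has not seen.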
Never-ending discovery
● Highly Iterative, not chaotic
● Frameworks as communication tools (CRISP-DM)
● Refining the original question as much as actually finding an answer
Image: Kenneth Jensen
It’s just Semantics
● Getting an actual piece of quality data may be the most difficult part
● Mocking data: integration and visibility tasks, like UI, dashboards, visualization, pipeline
● Simulation: insights about mechanics, not patterns
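A minimal sketch of mocking data for integration and visibility work, assuming a hypothetical "orders" schema (the field names, regions, and value ranges are invented for illustration). The values are random on purpose: mock rows exercise the pipeline and the dashboard, but they contain no patterns worth modelling.

```python
import random
from datetime import date, timedelta

REGIONS = ["north", "south", "east", "west"]  # hypothetical categories

def mock_orders(n, seed=0):
    """Generate n fake order rows that match the assumed schema."""
    rng = random.Random(seed)  # seeded so the mock set is repeatable
    start = date(2020, 1, 1)
    rows = []
    for i in range(n):
        rows.append({
            "order_id": i + 1,
            "order_date": (start + timedelta(days=rng.randrange(365))).isoformat(),
            "region": rng.choice(REGIONS),
            "amount": round(rng.uniform(5.0, 500.0), 2),
        })
    return rows

def total_by_region(rows):
    """Stand-in for a downstream step, e.g. the query behind a dashboard tile."""
    totals = {region: 0.0 for region in REGIONS}
    for row in rows:
        totals[row["region"]] += row["amount"]
    return totals

if __name__ == "__main__":
    print(total_by_region(mock_orders(1000)))
```

Once real data arrives, only the generator needs replacing; the downstream steps it unblocked should keep working if the schema assumption held.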
Big data volumes breed a different kind of beast, even when it’s not “Big Data”
● False discovery rate upon many trials
● Statistical significance must relate to actual impact to be meaningful
● Cross-validation schemes, regularization, minimizing degrees of freedom (parsimony), variable selection strategies, and analysis of the magnitude and importance of variables
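The false-discovery point can be sketched with a small simulation: test many variables that have no real effect and count how many look "significant" anyway. The z-test with known variance, the 0.05 threshold, and the use of the Benjamini-Hochberg procedure as a correction are assumptions chosen for this example, not something the slides prescribe.

```python
import random
from statistics import NormalDist

def null_p_values(n_tests, n_samples, seed=0):
    """Two-sided z-test p-values for variables that have NO real effect."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_tests):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        # Known sigma = 1, so z = sample mean * sqrt(n)
        z = (sum(sample) / n_samples) * n_samples ** 0.5
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value sits under its own threshold
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

if __name__ == "__main__":
    p = null_p_values(n_tests=1000, n_samples=30)
    naive = sum(x <= 0.05 for x in p)
    corrected = sum(benjamini_hochberg(p))
    print(f"naive 'discoveries': {naive}, after correction: {corrected}")
```

With a 0.05 threshold, roughly 5% of the pure-noise variables pass the naive test, so the more variables are tried, the more spurious findings appear unless the threshold accounts for the number of trials.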
Outliers
● Every outlier has a cause behind it; weigh it against the business logic, the reason for measuring, the reason for using the model, and the requirements of the objective
● Both extreme and middle-dwelling values may be “normal” or outliers
● Clean for the sake of clean is sterile data
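One way to act on this is to flag suspicious values for review instead of deleting them. The sketch below uses Tukey's interquartile-range fences, which is a technique swapped in for illustration; the slides do not name a method, and the readings are made up.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Mark values outside [Q1 - k*IQR, Q3 + k*IQR]; nothing is removed."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [not (low <= v <= high) for v in values]

readings = [10, 12, 11, 13, 12, 11, 10, 95]  # made-up measurements
flags = flag_outliers(readings)

# Keep the flagged values next to the data so someone can ask why they occurred
flagged = [v for v, is_outlier in zip(readings, flags) if is_outlier]
print(flagged)
```

A rule like this only catches extreme values. A middle-dwelling value that is wrong for business reasons needs domain knowledge, not a fence.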
Reproducible, or it didn’t happen
● Build the processes rather than the analysis
● Visibility, peer review, disconnections between a conclusion and expected outcome
● Inputs, parameters, scripts, environments, library versions, fixed random seeds
● Trust is lost but once
image: http://blog.f1000research.com/2014/04/04/reproducibility-tweetchat-recap/
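The items listed above (inputs, parameters, environments, library versions, fixed random seeds) can be captured in a small run manifest stored next to each result. This is a minimal sketch using only the standard library; the manifest fields and the function names `build_manifest` and `run_analysis` are hypothetical, and the bootstrap mean merely stands in for a real analysis.

```python
import hashlib
import json
import platform
import random
import sys
from importlib import metadata

def library_versions(names):
    """Look up installed versions; None means the package is not installed."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

def build_manifest(input_bytes, params, seed, libraries=()):
    """Record what is needed to rerun the analysis later."""
    return {
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "params": params,
        "seed": seed,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "libraries": library_versions(libraries),
    }

def run_analysis(data, seed):
    """Stand-in analysis: a bootstrap mean driven by a seeded generator."""
    rng = random.Random(seed)
    resample = [rng.choice(data) for _ in data]
    return sum(resample) / len(resample)

if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    manifest = build_manifest(b"1.0,2.0,3.0,4.0", {"resamples": 1}, seed=42)
    print(json.dumps(manifest, indent=2))
    # Same inputs and same seed give the same number on every run
    print(run_analysis(data, seed=42) == run_analysis(data, seed=42))
```

Passing the seed explicitly, rather than relying on global state, keeps the randomness visible in the manifest and easy to review.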
A data science “deliverable” is not necessarily completed features
● Counting completed features is an incomplete measure
● Committable strategies and steps to achieve a set objective
● Progress includes all the knowledge gained about the problem
The role of the Anthropovangelist
● Get to know the customer’s culture from within
● Get the client and the team to trust and empower the solving strategy
● From “so what” to “what if”