Lessons Learned from Testing Machine Learning Software


Christian Ramírez

@chrix2 @formiik

About me
• Computer Engineer
• M.Sc. (Astronomy)
• Ex-Googler
• Python lover
• Currently a formiiker (principal researcher at formiik and formiiklabs)

About formiik and how we use ML
Formiik is a platform designed to improve the productivity of on-site staff.
– Route optimization
– Workload balancing
– Ranking
– Time series anomaly detection
– Image recognition (deep learning)

Lesson 0

• Corollary number 2 by @dx
• “In theory, there is no difference between theory and practice. But, in practice, there is.”
Yogi Berra

Introduction
• In machine learning, computers apply statistical learning techniques to automatically identify patterns in data.
• A core objective of a learner is to generalize from its experience.
• WHAT?

Lesson 1
• Forget all you know about testing

Traditional software programming
• In general, we do this:
– Start from a specification of a function
» f(x)
– Implement the function to meet the specification
» f(x)=y

Traditional software programming
• How we do testing, basically:
– Inputs a=[X1,X2,…,Xn]
– Expected results b=[Y1,Y2,…,Yn]
– We use assertions to validate the specification
– f(Xi)==Yi
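The assertion-based style above can be sketched in a few lines of Python. The function f and its specification (return the square of x) are hypothetical, chosen only to illustrate the pattern of exact checks f(Xi)==Yi.

```python
def f(x: int) -> int:
    """Hypothetical function under test; the assumed specification is f(x) = x squared."""
    return x * x


# Inputs a = [X1, ..., Xn] and expected results b = [Y1, ..., Yn]
inputs = [0, 1, 2, 3, -4]
expected = [0, 1, 4, 9, 16]


def check_specification() -> int:
    """Assert f(Xi) == Yi for every pair and return how many pairs were checked."""
    for x, y in zip(inputs, expected):
        assert f(x) == y, f"f({x}) returned {f(x)}, expected {y}"
    return len(inputs)


checked = check_specification()
```

Every check is an exact equality, which works because the specification fully determines the output for each input.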

Machine learning software programming
• In most cases
• We give examples
– Pairs (Xi,Yi)
– Induce f() such that
• y≈f(x) for given pairs, and generalizes well for unseen x
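One way to picture this: f() is induced from example pairs, so a test can only ask whether y≈f(x) holds within a tolerance on pairs the learner has not seen. The sketch below is an illustration under assumptions that are not in the slides: synthetic data from y = 2x + 1 plus noise, a least-squares line as the learner, and an arbitrary tolerance of 0.5.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, xs, ys):
    """Average |f(x) - y| over the given pairs."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic pairs (Xi, Yi): an assumed ground truth y = 2x + 1 with small noise
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 * x + 1 + rng.gauss(0, 0.1) for x in xs]

# Hold out pairs the learner never sees, to check generalization
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

model = fit_line(train_x, train_y)
test_error = mean_abs_error(model, test_x, test_y)

# The "assertion" is a tolerance on unseen data, not an exact equality
assert test_error < 0.5
```

The tolerance is a judgment call: too tight and the test fails on noise, too loose and it stops catching regressions.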

Machine learning software programming
• In other cases
• We give examples
– Input (Xi)
– No Yi are given to the learning algorithm, leaving it on its own to find structure in its input
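With no Yi there is nothing to compare outputs against, so a test can only check properties of the structure that was found. The sketch below is a minimal illustration, assuming a toy one-dimensional two-cluster k-means (the two_means helper and the data are made up for this example): the property tested is that points from two obviously separated groups are never mixed.

```python
def two_means(points, iterations=20):
    """Tiny 1-D k-means with k=2; centers start at the min and max of the data."""
    centers = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center
        labels = [
            0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            for p in points
        ]
        # Move each center to the mean of its members (skip empty clusters)
        for c in (0, 1):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Input (Xi) only: two well-separated groups, no labels supplied
points = [1.0, 1.2, 0.8, 1.1, 9.0, 9.3, 8.7, 9.1]
labels, centers = two_means(points)

# Property checks instead of expected outputs
assert len(set(labels[:4])) == 1   # the low group stays together
assert len(set(labels[4:])) == 1   # the high group stays together
assert labels[0] != labels[4]      # and the two groups are kept apart
```

Which group gets label 0 or 1 is an implementation detail, so the checks avoid depending on it.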

Testing (ML) software
• Traditional software is modular
– We can decompose it and understand it
– Each module has inputs and outputs that can be defined and isolated

Testing (ML) software
• Machine learning systems appear to be monolithic
– Everything depends on everything else
– Changing any one thing changes everything else

Lesson 2
• Learn about machine learning
– Kinds of learning
– Coding

Lesson 3
• The models will learn what you teach them to learn.
– Test models, not requirements
– Adversarial examples
– Algorithms don't align with reality
– Spurious correlations
• Corollary number 4 “human beings are so fools to programming” by @dx
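The spurious-correlation point can be made concrete with a toy learner. Everything in this sketch is assumed for illustration: rows are (signal, shortcut, label) tuples, the signal matches the label 90% of the time, the shortcut matches it perfectly in the training data only, and the "model" simply picks whichever single feature best matches the training labels.

```python
def accuracy(feature_index, rows):
    """Fraction of rows where the chosen binary feature equals the label."""
    hits = sum(1 for *features, label in rows if features[feature_index] == label)
    return hits / len(rows)


def train_stump(rows):
    """'Learn' by picking the feature that best matches the training labels."""
    n_features = len(rows[0]) - 1
    return max(range(n_features), key=lambda i: accuracy(i, rows))


def make_rows(n, shortcut_matches_label):
    """Build (signal, shortcut, label) rows; the signal is wrong on every 10th row."""
    rows = []
    for i in range(n):
        label = i % 2
        signal = label if i % 10 != 0 else 1 - label
        shortcut = label if shortcut_matches_label else 1 - label
        rows.append((signal, shortcut, label))
    return rows


train_rows = make_rows(100, shortcut_matches_label=True)
shifted_rows = make_rows(100, shortcut_matches_label=False)  # correlation broken

chosen = train_stump(train_rows)               # the learner prefers the shortcut
train_acc = accuracy(chosen, train_rows)       # looks perfect on training data
shifted_acc = accuracy(chosen, shifted_rows)   # collapses once the shortcut flips
signal_acc = accuracy(0, shifted_rows)         # the real signal still holds up
```

The learner did exactly what the data taught it, which is why a test set drawn from the same distribution as the training set would not have revealed the problem.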

Lesson 4
• No matter what you have read on blogs, you need a good level of mathematical knowledge
• Remember again corollary number 2

Lesson 5
• Build your own ecosystem
– Frameworks
– GPU or CPU?

Lesson 6
• Identify how much data you really need
– More data or better models?

Lesson 7
• You need a new toolset
– Experimental data scientist

Lesson 8
• Be a storyteller
– Have a goal
– Manage expectations: this is not magic

Summary

Questions?

Thanks

@chrix2
christian.ramirez@formiik.com
