Leveraging open technologies pragmatically within a traditionally closed ecosystem
Dharhas Pothina, US Army Engineer Research and Development Center

Leveraging Open Technologies Pragmatically Within a Traditionally Closed Ecosystem | AnacondaCON 2017

Some History
• 5 years university research
• 10 years state government
• 3 years federal government

Started out with mostly in-house codebases plus proprietary tools and some scripting for automation / data transformation

my workflow circa 2008
• bash
• perl
• awk/sed
• fortran
• c

artisanal data science workflows are fragile and ineffective

Image credit: Quilted Northern April Fools

Why Python?

Transitioning was easy
• I could understand the programs I read
• Had the scientific libraries I needed
• Could interoperate with everything in my processing pipeline
• Had powerful data structures and language features
• Great community support

I tried learning Java 3 times in my career. Python was nicer.

Python Scales

Easy things are easy
Complex things are sensible
Hard things are possible

Non-Technical User/Analyst
Data Scientist/Engineer
Software Developer

PYTHON IS OPTIMIZED FOR HUMAN PRODUCTIVITY RATHER THAN MACHINE PRODUCTIVITY

Image credit: Sonny Abesamis (CC BY 2.0)

Closed Ecosystems are Resource Limited
• Limited staff
• Limited time
• Limited expertise
• Limited funding

so stop building your own machine learning library and use your limited resources on mission-critical activities instead

Reduce License Friction*
• impacts development speed
• impacts agility/trying new things
• impacts deployment
• impacts scaling

whenever possible avoid proprietary tools*

* If you work for state/federal agencies, or anywhere with a long procurement process

internal teams cannot match the resources of the open data science community (neither can commercial vendors)

Build a layer, not an internal platform

Internal Software

or you will own that puppy…

Image credit: Marcos Leal (CC BY 2.0)
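
One way to picture "a layer, not a platform" is a small adapter that translates an in-house data format into plain records and then hands the analysis to an existing library. The sketch below is only an illustration: the semicolon-delimited export, the `-9999` missing-value sentinel, and the names `read_in_house` and `mean_depth` are all hypothetical, and the standard library's `statistics` module stands in for whatever community package would do the real work.

```python
import csv
import io
import statistics

# Hypothetical in-house export: semicolon-delimited, with a site-specific
# sentinel (-9999) marking missing readings. Both details are assumptions.
IN_HOUSE_EXPORT = """station;depth_m
A1;2.5
A2;-9999
A3;3.5
"""

MISSING = -9999.0


def read_in_house(text):
    """Thin internal layer: turn the in-house format into plain records."""
    rows = csv.DictReader(io.StringIO(text), delimiter=";")
    records = []
    for row in rows:
        depth = float(row["depth_m"])
        records.append({
            "station": row["station"],
            # Replace the sentinel with None so downstream tools can skip it.
            "depth_m": None if depth == MISSING else depth,
        })
    return records


def mean_depth(records):
    """Delegate the analysis to an existing library instead of rewriting it."""
    values = [r["depth_m"] for r in records if r["depth_m"] is not None]
    return statistics.mean(values)
```

The only code the team owns here is the format translation. Everything after `read_in_house` can be swapped for a community-maintained library without touching the internal layer.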

Risks

Be very selective
• Bus Factor + Code Complexity
• Software Ecosystem
• Code Quality
• Python 3 compatibility
• Continuous Integration
• Cross Platform Compatibility
• License – BSD, MIT, Apache

understand your dependencies
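
The license item on the checklist above can be partly automated. The following is a rough sketch, not a vetted compliance tool: it reads the license metadata that installed Python packages declare about themselves and flags anything that does not mention BSD, MIT, or Apache. The keyword list and the function names (`is_permissive`, `declared_license`, `audit_installed`) are assumptions made for illustration, and self-declared metadata can be missing or wrong, so a flagged package means "look closer", nothing more.

```python
import re
from importlib import metadata

# Assumption: the allowlist mirrors the slide's "BSD, MIT, Apache" bullet.
# Word boundaries avoid false hits such as "limited" or "permitted".
PERMISSIVE = re.compile(r"\b(bsd|mit|apache)\b", re.IGNORECASE)


def is_permissive(license_text):
    """Return True if the text mentions one of the allowlisted licenses."""
    return bool(PERMISSIVE.search(license_text or ""))


def declared_license(dist):
    """Gather the license hints a distribution declares in its metadata."""
    meta = dist.metadata
    hints = [meta.get("License") or "", meta.get("License-Expression") or ""]
    classifiers = meta.get_all("Classifier") or []
    hints += [c for c in classifiers if c.startswith("License ::")]
    return " ".join(hints)


def audit_installed():
    """Map each installed distribution name to True (looks permissive) or False."""
    report = {}
    for dist in metadata.distributions():
        if dist.metadata is None:  # skip installs with unreadable metadata
            continue
        name = dist.metadata.get("Name") or "unknown"
        report[name] = is_permissive(declared_license(dist))
    return report
```

Calling `audit_installed()` in the environment you plan to ship gives a dictionary whose `False` entries form a short list for manual review.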

Packaging is hard (we use )

Packaged by Continuum

Packaged by Community

Internal, Secret & Export Restricted

Should you make internal code open?
• Can (but may not) gain you external contributors
• Takes effort
  • Refactoring/Clean Up
  • Documentation
  • Legal Review
  • Tests/Continuous Integration
  • Social contract
• Gains you the open infrastructure ecosystem – CI, GitHub, conda-forge, etc.

most of the steps you need to convert a tool to be open are the same ones you need to make it useful across your own organization