Dev Environments: The Next Generation

thieman

Travis Thieman

THE NEXT GENERATION

DEV ENVS

Talk through the outline on this slide.

• What is a dev environment?

• Current state of the world (running on local and VMs)

• Ways we can make this better (containers)

• Whether we can realize this benefit with the tools available (Dusty)

What are development environments? It’s all of this crap.

From the code you’re running, to the filesystem, to the OS, to the hardware, to the bits of electricity pinging around inside the machine, all of this affects what you’re actually trying to do, which is write and run software.

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

Here’s a different model for thinking about all the crap inside our machine that affects how we run and write software. There’s a *lot* going on.

Here’s yet another way of looking at a typical dev env that encompasses an entire machine. Dev envs are inherently complex, and our tiny human brains have no chance of being able to comprehend them. We need to create tooling to help us do this.

Get our code to work at all

Reproduce our results

Share our results

Test our code properly

Change our dev environments

Getting development environments right is hugely important, and the consequences of screwing it up affect almost everything we do.

“[T]he alternative [to testing with SQLite] is not testing at all because there is limited time for testing and setting up a proper database for this is so much more trouble.

The choices are not good test vs bad test.

They are test-with-issues vs. no test.”

- jbb555

Hacker News, August 4th, 2015

We are failing at managing the complexity of our dev environments.

This is from a Hacker News thread on testing code against SQLite instead of the database you actually run in production.

Clearly we have a lot of work to do here.

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

Let’s start thinking through the problem by imagining a simple Python app and getting it in the right environment to run correctly. If we want to share that code or reproduce our results on another machine, we need to make sure that our circles of Hell are compatible all the way down.

Most of the time, in Python-land, we are dealing with the three layers at the top. In this example…

Your Code

requests 2.6.0

Python 2.7

Let’s say our Python code needs a specific version of requests and Python to run. If we have these set up, our code will probably run. So we give our awesome script to our friend and tell her to install requests 2.6.0.

pip install requests==2.6.0

This goes fine, but it turns out she already had requests 2.5.0 installed to run some other important app on her machine. When we install 2.6.0…

pip install requests==2.6.0

requests 2.5.0

…her old version gets unceremoniously uninstalled by pip.

Your Code + Friend’s Code

requests 2.5.0

Python 2.7

requests 2.6.0

So we have a conflict here. A Python environment keeps only one installed version of a given package in its set of libraries, so these two versions cannot live harmoniously side by side. Since they refuse to get along, we’ll have to separate them.
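To make the conflict concrete, here is a toy model of one shared environment. It is only a sketch of the idea, not how pip works internally, and the `Environment` class is made up for illustration.

```python
class Environment:
    """Toy model of a single Python environment's installed packages."""

    def __init__(self):
        # One slot per package name, so only one version can be present.
        self.installed = {}

    def install(self, name, version):
        """Install a version and return the version it replaced, if any."""
        replaced = self.installed.get(name)
        self.installed[name] = version
        return replaced


shared = Environment()
shared.install("requests", "2.5.0")             # what our friend's app needs
replaced = shared.install("requests", "2.6.0")  # what our script needs
print("replaced:", replaced)
print("now installed:", shared.installed["requests"])
```

With one slot per package name, whoever installs last wins, and the other app is left without the version it needs.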

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

This is where we’re going to see the concept of isolation come into play. What if we cut our circles of Hell above the system library level? We’ll keep *one* total copy of everything below the cut, but we’ll let ourselves create multiple, isolated versions of everything above the cut.

Userland

OS / Kernel

Hardware

Your Code

requests 2.6.0

Runtime

Friend’s Code

requests 2.5.0

Runtime

This means we can have a structure that looks like this. Two separate, isolated stacks of Python components. In one, our awesome Riker code and requests 2.6.0. In the other, our friend’s app.

Hey, guess what, this is actually a thing! Venvs give us scalable isolation over our Python programs. This lets us resolve simple version conflicts, but scalable isolation has a bigger, more general impact on our development process as well.
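As a rough sketch of what that separation looks like on disk, the snippet below creates two environments and shows that each gets its own library directory. It assumes Python 3 and the standard library `venv` module rather than the `virtualenv` tool you would use with Python 2.7, and the environment names are made up.

```python
import tempfile
import venv
from pathlib import Path


def create_env(root, name):
    """Create a bare virtual environment and return its directory."""
    env_dir = Path(root) / name
    # with_pip=False keeps this fast and offline; a real setup would want pip.
    venv.create(env_dir, with_pip=False)
    return env_dir


def site_packages(env_dir):
    """Locate the environment's own site-packages directory."""
    candidates = list(env_dir.glob("lib/python*/site-packages"))
    candidates += list(env_dir.glob("Lib/site-packages"))  # Windows layout
    return candidates[0]


with tempfile.TemporaryDirectory() as root:
    ours = site_packages(create_env(root, "our-code"))
    theirs = site_packages(create_env(root, "friends-code"))
    print(ours.relative_to(root))
    print(theirs.relative_to(root))
    # Each environment has its own place to install libraries.
    separate = ours != theirs and ours.is_dir() and theirs.is_dir()
```

Installing requests 2.6.0 into one of these directories leaves the other untouched, which is exactly the cut described above.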

Isolation is also really useful for keeping the number of things we need to think about smaller, which helps us approach and reason about complicated tasks. Without isolation, we have to consider all parts of the system all at once, all of the time.

If we have a problem in the middle of this mess, we’re going to have to wade through the whole thing to fix it. This is really hard.

With scalable isolation (like venvs!) we can break the problem down into smaller, isolated components.

Now, if we have a problem in a specific area…

…we can focus our cognitive effort entirely on that unit of isolation, which is much smaller and easier to reason about. Scalable isolation helps us identify problems and takes away a lot of the cognitive burden we otherwise face when trying to solve them.

Venvs are not enough

• Need C extensions?

• Conflicting system libraries?

• A language other than Python?

• Run a database? Multiple databases?

However, virtualenvs don’t go far enough. We need a more general solution.

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

So we need our unit of isolation to penetrate deeper into the circles of Hell. Let’s recall what this entire thing is equivalent to…

…a machine, right? And one machine is one big stacked Hell metaphor. Let’s restrict ourselves to running on one physical machine.

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

Beyond that restriction, let’s go wild. What if we use a unit of isolation that encompasses everything BUT the hardware?

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

It’d look like this, right? We’re definitely going to get some great isolation out of this.

Hardware

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

This is called a virtual machine. It is, also, totally a thing. There’s a big ecosystem around these, because a lot of people use them for their dev environments. Up until recently, they were the latest greatest thing in dev environments.

VMs have one major flaw that leads to a whole host of problems: they’re big and slow. Turns out, running a whole ‘nother kernel takes a lot of clever engineering and the end result doesn’t scale well. As a result, we *can’t* run a bunch of VMs. Maybe two, maybe three, probably not five, definitely not 100.

Hardware

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Host Machine Guest VM

This leaves us basically where we started. All of our stuff ends up in the same unit of isolation. Whether it’s on one Hell stack or in one VM running in parallel with our host machine, it’s effectively the same. VMs give us a way to jumpstart the process, but not much more. You end up with most of the same problems you had running locally. We need something better.

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

Let’s review. Venvs are lightweight, pretty simple, but they don’t offer enough isolation.

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

Virtual machines offer great isolation, but they’re so heavyweight that we can’t run many of them. And, really, do we need a separate OS and kernel to run a Python app, or even a database? Generally, no.

Your Code

Language Libraries

Language Runtime

Userland

OS / Kernel

Hardware

What if we learned something from Goldilocks and created a unit of isolation above the OS?

OS / Kernel

Hardware

Your Code

Language Libraries

Language Runtime

Userland

Your Code

Language Libraries

Language Runtime

Userland

We know from VMs that running multiple kernels is really hard and expensive, so what if we split the circles of Hell right above that and created a new unit of isolation? This could let us run our two apps with different library dependencies pretty easily.

OS / Kernel

Hardware

This isolation is incredibly powerful. Now we *only* need the dependencies in that unit of isolation that let us run our code. So we can make them pretty small and MUCH easier to reason about!

OS / Kernel

Hardware

Taking this approach to its natural conclusion, we can create a bunch of these small units of isolation for everything we need to run as part of our development environments.

There are numerous benefits here. Smaller, easier to reason about. Actually scalable, unlike VMs.

This unit of isolation already exists, and it’s called a container!

If you’ve heard about the Docker project, that’s the leader in implementing containers right now. Docker provides a core engine for managing containers, and has done a lot of work to build an ecosystem around it.

So we have this new shiny scalable unit of isolation called a container. What can we do with it?

Get our code to work at all

Reproduce our results

Share our results

Test our code properly

Change our dev environments

?????

To review, let’s go back to our scary checklist from earlier. This is what we *want* our dev environments to help us do. Let’s see how containers can achieve this.

First, we need to get our code to work at all. Because our containers only need to have the minimum amount of stuff to run our app, this is usually pretty simple! The isolation keeps the amount of stuff we need to think about pretty low. Let’s just throw in exactly what we need.

OS / Kernel

Hardware

When we throw this on top of the non-isolated parts of our stack, we end up with a running container! It’s small, isolated, and runs our code just fine.

Get our code to work at all

Reproduce our results

Share our results

Test our code properly

Change our dev environments

????

One of the biggest value adds of the Docker ecosystem is the ability to easily reproduce and share containers. You can either…

My Awesome Python Container PyGotham 2015

…take a snapshot of the container once you’ve got it the way you like. Quick and dirty.

FROM debian:jessie

RUN apt-get update && apt-get install -y python python-pip

ADD . /my-python-app

RUN pip install -r /my-python-app/reqs.txt

CMD python /my-python-app/awesome-app.py

You can also formalize the instructions on how to build the container, which makes it actually reproducible.
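Once the instructions live in a Dockerfile, building and running the container comes down to two commands. The sketch below only assembles and prints them. The image tag `my-python-app` is an assumption, and the commands are printed rather than executed so the snippet runs without Docker installed.

```python
import shlex


def docker_build_command(tag, context_dir="."):
    """Return the argv that builds an image from the Dockerfile in context_dir."""
    return ["docker", "build", "-t", tag, context_dir]


def docker_run_command(tag):
    """Return the argv that runs the image's CMD in a throwaway container."""
    return ["docker", "run", "--rm", tag]


build = docker_build_command("my-python-app")
run = docker_run_command("my-python-app")
print(shlex.join(build))
print(shlex.join(run))
```

Anyone with the same Dockerfile and source tree can run the same two commands, which is what makes this route reproducible in a way the snapshot is not.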

Get our code to work at all

Reproduce our results

Share our results

Test our code properly

Change our dev environments

??

OS / Kernel

Hardware

Containers enable an awesome testing flow. Let’s say we’re running our Python app against a Postgres database. We take advantage of the scalable isolation of containers to create a totally separate copy of the app container running against a fresh Postgres container. We can even use a separate container with our test requirements already installed. Afterwards we can throw those temporary containers away.
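Here is one way that flow could be spelled out as plain Docker commands. It is a sketch under assumptions: the app image name, the `db` link alias, and the test command are invented, `--link` is Docker's older way of connecting containers, and the password variable is there because current versions of the official `postgres` image refuse to start without one. The commands are printed, not executed.

```python
import shlex
import uuid


def plan_test_run(app_image, test_command, run_id=None):
    """Return the docker commands for one isolated test run, in order."""
    run_id = run_id or uuid.uuid4().hex[:8]
    db_name = f"test-db-{run_id}"
    return [
        # Start a fresh, empty Postgres container in the background.
        ["docker", "run", "-d", "--name", db_name,
         "-e", "POSTGRES_PASSWORD=test", "postgres"],
        # Run the tests in a throwaway app container that can reach it as "db".
        ["docker", "run", "--rm", "--link", f"{db_name}:db", app_image]
        + shlex.split(test_command),
        # Throw the database container away, along with its volumes.
        ["docker", "rm", "-f", "-v", db_name],
    ]


plan = plan_test_run("my-python-app", "python -m pytest", run_id="demo")
for command in plan:
    print(shlex.join(command))
```

Because every run gets its own uniquely named database container, two test runs can go at the same time without stepping on each other.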

Get our code to work at all

Reproduce our results

Share our results

Test our code properly

Change our dev environments?

OS / Kernel

Hardware

Scalable isolation is what makes it easy for us to change our development environments. No matter how many containers we add, this Python container will still work just fine, because it’s not really affected by external forces. We could visit an apocalypse on all the other containers individually, but our Python container won’t be affected by anything that happens inside of them.

Get our code to work at all

Reproduce our results

Share our results

Test our code properly

Change our dev environments

Great, so everything is sunshine and rainbows, right? Talk over, goodbye.

NOW WE GOT

PROBLEMS

Not quite. Using containers for everything is going to introduce some problems we didn’t have before.

OS / Kernel

Hardware

Isolation makes normal everyday stuff like navigating through a space in a shell, copying files around, or accessing processes through ports more difficult. Maybe you’re starting out in this space over here, but your Python app is over here in a separate unit of isolation, and your database is separate from both of those.

OS / Kernel

Hardware

Wrangling your vast menagerie of containers can also get overwhelming, even with relatively small stacks. When you add in test containers with short lifecycles, the problem gets worse. The solution is still *simple*, but it’s also *complicated* (as opposed to complex). Some tooling and automation could go a long way here.

Another problem: how do you share your crazy setup with all these containers floating around? How do you share how to run tests on them? How do you share common tasks like a database migration that needs to be aware that it’s running in a containerized environment?

Here’s another big issue…

…if you run Linux, you’re in luck. You can use containers out of the box. Did I mention this whole concept is more accurately called “Linux containers”?

If you use something else like Mac or Windows, you’re going to need to run a Linux VM in order to utilize containers. This adds another level of isolation/abstraction that we constantly need to navigate. This adds a lot of cognitive overhead unless we can automate it away via tooling.

Docker Machine Docker Compose

The Docker ecosystem provides us tools to solve some of these problems. Machine gives us a way to get up and running on non-Linux systems. Compose lets us wrangle sets of containers into a working stack in a reproducible and shareable way. But these are ultimately fairly low-level tools, and they can’t meet the specialized and complex needs of running a development environment specifically.

• Usability

• Sharing code

• Restarting when code changes

• Sharing tests

• Navigating through containerland

• Mix and Match

Other dev env problems

Dusty is our attempt to use containers to solve the problems with dev environments. It’s written in Python and uses Docker and existing tools from the Docker ecosystem.

Dusty provides a solution which is:

1. Isolated. It uses containers to achieve the right level of isolation, keeping things simple.

2. Scalable. The 100th app you run in Dusty is as easy as the 1st.

3. Specialized. Dusty solves problems specific to a dev environment, like running tests against isolated database containers.

• Usability

• Sharing code

• Restarting when code changes

• Sharing tests

• Navigating through containerland

• Mix and Match

Dusty allows us to define groups of containers which can be toggled on or off separately. We can tell Dusty how to run hundreds of services, but if we only need one right now, it only bothers running the containers for that one. This mix and match lets us be efficient with our resources and scale out Dusty to support stacks of any size.
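The mix-and-match idea can be sketched in a few lines. This is an illustration only: the group names, container names, and data layout below are invented and are not Dusty's actual spec format.

```python
# Hypothetical stack description: each group lists the containers it needs.
GROUPS = {
    "web": ["website", "postgres"],
    "search": ["search-api", "elasticsearch", "postgres"],
    "reports": ["report-worker", "postgres", "redis"],
}


def containers_to_run(groups, active):
    """Return the sorted, de-duplicated containers needed by the active groups."""
    needed = set()
    for name in active:
        needed.update(groups[name])
    return sorted(needed)


only_web = containers_to_run(GROUPS, ["web"])
web_and_search = containers_to_run(GROUPS, ["web", "search"])
print(only_web)
print(web_and_search)
```

Everything is described up front, but only the containers behind the groups that are switched on get started, and a container shared by two groups still runs once.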

When Dusty runs your containers, the plumbing needed to navigate the layers of isolation between your normal operating space (your local filesystem, etc) and the containers is set up for you. Your code gets seamlessly mounted into the container. You can make a request in your browser and Dusty will pipe it through the VM and into the container just the way you need.

We’re developers, we think in code as often as we do in processes. To reflect this, Dusty doesn’t just know about your containers, it also knows about your repos. If you make a change in a repo, you can ask Dusty to make sure all containers using that repo get restarted to pick up the new code. This is also really easy to automate with a tool like watchdog.
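To show the shape of that automation without extra dependencies, here is a minimal sketch that detects repo changes by polling file modification times with the standard library. It stands in for watchdog, which reacts to filesystem events instead of polling, and what you do on a change (asking Dusty to restart the affected containers) is left as a placeholder.

```python
import os
import tempfile
from pathlib import Path


def snapshot(repo_dir):
    """Map each file under repo_dir to its last-modified time in nanoseconds."""
    return {
        path: path.stat().st_mtime_ns
        for path in Path(repo_dir).rglob("*")
        if path.is_file()
    }


def changed_files(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }


with tempfile.TemporaryDirectory() as repo:
    source = Path(repo) / "app.py"
    source.write_text("print('v1')\n")
    before = snapshot(repo)

    source.write_text("print('v2')\n")
    # Bump the mtime explicitly so the demo does not depend on clock resolution.
    stat = source.stat()
    os.utime(source, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))

    changed = changed_files(before, snapshot(repo))
    # Placeholder: this is the point where a restart would be requested.
    needs_restart = bool(changed)
    print("restart containers using this repo:", needs_restart)
```

A real watcher would take snapshots in a loop, or better, subscribe to change events, and trigger the restart only for containers that mount the changed repo.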

Dusty has first class support for the awesome container testing flow we talked about earlier. You can define a test script, give it a set of service containers to run against, and Dusty will handle rigging up the test harness for you. When it’s done, all the containers involved are cleaned up. Each time, you start fresh. This whole process takes about 5 seconds right now, and we’re still working to make it even faster.

Dusty, along with the other tools in the Docker ecosystem, is going to help us manage away the pain of using containers and help us realize all the promise we saw earlier in the talk.

Outro on technical pieces.

Containers give us an opportunity to address a lot of the problems with existing local-only and VM-based dev environments. Docker and other container ecosystems are emerging and maturing, making it easier to get to a workable solution. Dusty and other projects are creating end-to-end solutions to make the dream of non-Hellish dev environments a reality. But we aren’t there yet.

The night is dark, we have problems and a lot of work to do, and the Python community can help to solve this problem. Call to action for contributors and for people to start their own projects trying to solve the problem.

dusty.gc.com

thieman

Docker-NYC