Constructing Open Source SDKs for Ops Teams with REST and GraphQL

Chris Wahl

Chief Technologist @ Rubrik

Author of Networking for VMware Administrators

Open Source Enabler @ Rubrik Build

ex Datanauts Podcast host

🥑 he/him

Twitter: @ChrisWahl

GitHub: chriswahl

LinkedIn: /wahlchris

Blog: Wahl Network

@ChrisWahl | #DevWeek2019 3

https://twitter.com/AxolotlCure/status/1136284938830045184

This is a story about toil

And a lot of learning through triumph and mistakes

The kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows

- Toil

Life of an operator

• At the end of the release cycle

• “Here’s a thing, make it work, keep it working”

• Myriad of systems to understand and maintain while being short staffed

I need a one-liner or script to accomplish this task so I can copy and paste it into my environment, solve my problem, and get back to putting out a hundred other fires

- Systems Administrators

Abuse from Crude Tools

Tools like AutoIt

• Script GUI actions using a DSL

• The ultimate “sad panda”

Key Ingredients

RESTful API

Operator Audience

Free Time

SDK

Initial Research

• Our audience preferred Microsoft PowerShell

• Auto generation of SDK was ugly

• Our swagger specification was non-standard

• Decided to craft a bespoke SDK

The Mission

• Give operators a familiar tool to manage our product and remove toil

• Use my background as an operator to control the UX

• Selfishly: Learn how to build an SDK

Project Plan

• Everything in GitHub as an open source project

• MIT licensing (Legal 👍 )

• One project per repository

• Official product support for projects

• Unit tests for new features

• External CI: AppVeyor, Azure Pipelines

• Internal CI: CircleCI

• Integration of Jira and GitHub via Zapier

People use this thing?

The mysterious tale of unloved APIs

Our API’s Original Purpose

• Distributed systems to chat with each other

• Supply the GUI with an interface

This created friction

• There were no API versions

• Breaking changes were normal

• Standards for model, params, enums, etc. did not exist

• The product surface area was rapidly expanding

We Made Versions!

• Internal: meant for testing and developing new features, and for providing command and control endpoints for the software itself.

• Versioned (Vn): meant for public consumption, with a declaration on versioning, deprecation, and when breaking changes would be introduced.

API versioning does not prevent breaking changes. It just helps control when, where, and how the break occurs. Someone must still update their code.

- Me

More Cleanup

• Placed major integrations at the parent (root) level

• Leveraged HTTP methods to simplify workflows

• Used Boolean field naming conventions

Ugly: POST to “/add_node” and “/remove_node/{id}”

Pretty: POST to “/node” and DELETE to “/node/{id}”

Start with ‘has’, ‘is’ or ‘should’ to make it clear that it is a Boolean field

Examples: ‘hasRootAccess’, ‘isAdmin’ and ‘shouldDoSomething’
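
The two conventions above can be sketched in a few lines of Python. This is only an illustration, not the product's actual routing code: the route table, the action names, and the helper functions `resolve` and `is_boolean_name` are all hypothetical.

```python
import re

# Hypothetical route table: a single "/node" resource, with the HTTP method
# (rather than a verb baked into the path) carrying the intent.
ROUTES = {
    ("POST", "/node"): "add_node",
    ("DELETE", "/node/{id}"): "remove_node",
}

# Boolean fields start with 'has', 'is' or 'should', followed by a capital letter.
BOOLEAN_PREFIX = re.compile(r"^(has|is|should)[A-Z]")


def resolve(method: str, path_template: str) -> str:
    """Return the action for a method and path template; raises KeyError if unknown."""
    return ROUTES[(method.upper(), path_template)]


def is_boolean_name(field: str) -> bool:
    """Check whether a field name follows the Boolean naming convention."""
    return bool(BOOLEAN_PREFIX.match(field))


if __name__ == "__main__":
    print(resolve("DELETE", "/node/{id}"))
    for name in ("hasRootAccess", "isAdmin", "shouldDoSomething", "rootAccess"):
        print(name, is_boolean_name(name))
```

A check like `is_boolean_name` could run as a lint step over an API specification, so naming drift is caught before an endpoint ships.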

The sooner you start to code, the longer the program will take.

- Roy Carlson

Internal Became the Hypnotoad

• No incentives for versioning

• Over 95% of the API resided in Internal

The Universal Solvent

Embracing our audience further

Too Much Complexity

• Each function within the SDK was a closed loop

• The community found it too difficult to contribute

• A new architecture was needed

SDK Design Goal

API File

• Gather information for each supported endpoint

• Supply the SDK with methods, params, status codes, etc.

• Version the data for backwards compatibility

Generic Functions

• Functions look at the API File to understand their purpose

• Functions can alter their state based on the target product version
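
A minimal sketch of this design, written in Python for brevity (the talk's SDK targeted PowerShell). Everything named here is an assumption for illustration: the `API_FILE` layout, the `get_node` entry, the URIs, and the 1.0 and 5.0 thresholds are hypothetical and do not reproduce the real API file.

```python
# Hypothetical API file: for each SDK function, a list of endpoint definitions
# keyed by the minimum product version they apply to (sorted ascending).
API_FILE = {
    "get_node": [
        ((1, 0), {"method": "GET", "uri": "/api/internal/node", "success": 200}),
        ((5, 0), {"method": "GET", "uri": "/api/v1/node", "success": 200}),
    ],
}


def parse_version(text):
    """Turn '5.0.1' into (5, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def lookup(function_name, product_version):
    """Pick the newest endpoint definition the target product version supports."""
    target = parse_version(product_version)
    chosen = None
    for minimum, definition in API_FILE[function_name]:
        if target >= minimum:
            chosen = definition
    if chosen is None:
        raise LookupError(f"{function_name} is not supported on {product_version}")
    return chosen


if __name__ == "__main__":
    for version in ("4.2", "5.1"):
        endpoint = lookup("get_node", version)
        print(version, endpoint["method"], endpoint["uri"])
```

Because the generic function only reads data, supporting a new endpoint or a new product version in this sketch means adding an entry to the API file rather than writing new request logic, which is the kind of change a community contributor can make.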

Product versions 1.0+

Product versions 5.0+

Enablement and Communication

Too focused on the technology

Not enough focus on the hygiene

Lots of questions from our customers

General fear of GitHub and coding

More was needed

Choose Your Own Adventure

Educational Workshops for Operators

Communication Efforts

The rules of versioning and deprecation.

Future deprecation of endpoints / resources.

New or updated endpoints / resources.

And then GraphQL appeared

There goes the neighborhood

You haven't mastered a tool until you understand when it should not be used.

- Kelsey Hightower

Initial Research in 2017

• Dramatic speed improvements for the GUI

• As more objects are added, REST continues to fall behind

• Simple to query all objects and use cursor / pagination

• More flexibility with our returned values
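
To illustrate the cursor / pagination point, here is a small Python sketch. The schema is an assumption: a hypothetical `objects` field using the common connection pattern (`edges`, `pageInfo`, `endCursor`, `hasNextPage`), not the product's real schema. `fake_execute` stands in for a GraphQL client so the loop runs without a server.

```python
# Hypothetical query: the client asks only for the fields it needs (id, name).
QUERY = """
query Objects($first: Int!, $after: String) {
  objects(first: $first, after: $after) {
    edges { node { id name } }
    pageInfo { endCursor hasNextPage }
  }
}
"""


def fetch_all(execute, page_size=2):
    """Yield every node by following endCursor until hasNextPage is false."""
    cursor = None
    while True:
        data = execute(QUERY, {"first": page_size, "after": cursor})
        connection = data["objects"]
        for edge in connection["edges"]:
            yield edge["node"]
        if not connection["pageInfo"]["hasNextPage"]:
            return
        cursor = connection["pageInfo"]["endCursor"]


def fake_execute(query, variables):
    """Stand-in transport: serves five in-memory objects, using the id as cursor."""
    items = [{"id": str(i), "name": f"object-{i}"} for i in range(5)]
    start = 0 if variables["after"] is None else int(variables["after"]) + 1
    page = items[start:start + variables["first"]]
    return {
        "objects": {
            "edges": [{"node": node} for node in page],
            "pageInfo": {
                "endCursor": page[-1]["id"] if page else None,
                "hasNextPage": start + len(page) < len(items),
            },
        }
    }


if __name__ == "__main__":
    for node in fetch_all(fake_execute):
        print(node["id"], node["name"])
```

In a real client, `execute` would POST the query and variables to the GraphQL endpoint. The pagination loop stays the same.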

Stress-tested load times

• 95th percentile load time with GraphQL: 3.256 seconds

• 95th percentile load time with REST: 6.619 seconds
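
For readers less familiar with percentile metrics, the sketch below shows one common way to compute a 95th percentile (the nearest-rank method) and compares the two reported figures. The slides do not say which percentile method or test harness was used, so `percentile_nearest_rank` is an assumption for illustration. Only the two timings come from the talk.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# The two figures reported in the talk, in seconds.
GRAPHQL_P95 = 3.256
REST_P95 = 6.619

speedup = REST_P95 / GRAPHQL_P95        # how many times faster GraphQL was
reduction = 1 - GRAPHQL_P95 / REST_P95  # fraction of load time removed

if __name__ == "__main__":
    print(f"speedup: {speedup:.2f}x, reduction: {reduction:.1%}")
```

On these numbers GraphQL's 95th percentile load time is roughly half of REST's, a speedup of about 2.03 times.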

Since Then

Added GraphQL to our on-premises product.

o Reporting

o Dashboards

o Various other components

Constructed a SaaS platform with GraphQL as the standard API

o Started from scratch

o Using what we learned

o Lots of tweaking

Challenges

Schema is in flux

There are no versions

Documentation holy wars

We’re all still learning GraphQL

Graph-Que-What?

Current State

• Schema tools (Voyager, GraphiQL) for visualization

• Internal construction of new SDKs

• Existing auth methods (e.g. tokens) are valid globally
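
One practical consequence of globally valid auth is that an SDK can share a single credential helper between its REST and GraphQL calls. The Python sketch below only builds the requests and does not send them. The host name, the `/api/v1/node` and `/graphql` paths, and the bearer-token scheme are assumptions for illustration, not details from the talk.

```python
import json
from urllib.request import Request

BASE_URL = "https://cluster.example.com"  # hypothetical host


def auth_headers(token):
    """One header builder shared by both API styles (bearer scheme assumed)."""
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}


def rest_request(token, path):
    """Build a REST GET request for a resource path."""
    return Request(f"{BASE_URL}{path}", headers=auth_headers(token), method="GET")


def graphql_request(token, query, variables=None):
    """Build a GraphQL POST request; the query and variables travel in the JSON body."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return Request(
        f"{BASE_URL}/graphql", data=body, headers=auth_headers(token), method="POST"
    )


if __name__ == "__main__":
    rest = rest_request("example-token", "/api/v1/node")
    gql = graphql_request("example-token", "{ objects { edges { node { id } } } }")
    print(rest.get_method(), rest.full_url)
    print(gql.get_method(), gql.full_url)
```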

• Base platform will continue with REST and GraphQL

• SaaS platform will remain entirely GraphQL

• Using GitHub private repos for development

SDK Development

Let use cases drive stack-ranking

Mimic a near-identical UX

Educate and enable in parallel

Invite early-adopters and give them checklists

Takeaways

A bit of navel gazing

If we could do it all over again

• Increase collaboration with engineering and support

• Create incentives to document and polish the API

• Make documentation a top priority

• Educate internal stakeholders on API usage

• Bring (more) operators into the SDK build process

Use cases, UX, testing, feedback