WHAT WE TALK ABOUT WHEN WE TALK ABOUT DEVOPS
Ricard Clau - GeeksHubs @ Numa Barcelona
WHO AM I?
• Currently working as CTO at Holaluz
• Ex Wonga, Hailo, SocialPoint, Ulabox, Privalia…
• Developer for many years, been automating things for a while, DevOps before it was trendy!
• Open-source contributor & occasional speaker
AGENDA
• Problems most companies have
• What is DevOps about?
• Tactical patterns to tackle problems
• Introduction & examples: Packer, Ansible & Terraform
THEORY
What does DevOps try to help with?
WHY THIS TALK?
• Most companies misunderstand DevOps
• Most teams don't know how to get started
• Not every project is green field
• Automation and DevOps quickly add value
• Tools work for Windows as well! No excuses!
COMMON PROBLEMS
• Hard to integrate new features
• Deployments are an event
• Environments are completely different
• Poor applications monitoring
• Weak DR and painful error recovery
WRONG MINDSETS
• Devs think their work ends when it works locally
• Ops don't want to change things, for the sake of stability
• C-levels often don't get it, or see it as a project
• Bad dynamics reduce time to rethink processes
• Tools have a learning curve, need to invest
USUAL FRUSTRATIONS
• Devs don't feel empowered
• Ops don't trust Devs (generally speaking)
• C-levels and POs don't understand these dependencies
• Legacy architectures and code don't help
• Little time to improve if prod constantly breaks
TIME FOR A CHANGE!
Stop the suffering!
DEVOPS IS NOT...
• A separate team or a job title
• Some tool / process you can buy
• A silver bullet to solve all your problems
• Devs with root access / Ops writing Ruby
• A threat to existing Ops
DEVOPS IS…
• Devs and Ops working together to deliver value
• Empowering teams, reducing hard dependencies
• Communication, Integration, Collaboration
• Boosting productivity, making life easier!
• Automation, CI/CD, Infrastructure as code…
ENABLE THE BUSINESS!
That's what they pay us for!
TACTICAL PATTERNS
Gradual introduction, like in Holaluz
CI / CD / DEPLOYMENTS
If anything, start with this!
DEPLOYMENTS
• 1 click deploy / rollback. No excuses
• Start with a tool like Capistrano / Ansistrano and a simple rsync / git strategy (GitHub dependency)
• Generate artifacts in your CI/CD system
• Consider whether this is enough, or go the extra mile with immutable infrastructure
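
A minimal sketch of what "1 click deploy / rollback" can boil down to, assuming one directory per build under releases/ plus a current symlink that the web server points at (a layout that Capistrano-style tools commonly use). The function names, directory names and release names are hypothetical.

```python
import os


def activate(root, release):
    """Point <root>/current at <root>/releases/<release> with a single rename."""
    target = os.path.join(root, "releases", release)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    current = os.path.join(root, "current")
    staging = current + ".tmp"
    # Remove a leftover staging link from an interrupted run, if any.
    if os.path.lexists(staging):
        os.remove(staging)
    os.symlink(target, staging)
    # Renaming over the old link swaps releases in one step on POSIX systems.
    os.replace(staging, current)
    return target


def rollback(root):
    """Re-activate the release that sorts just before the active one."""
    releases = sorted(os.listdir(os.path.join(root, "releases")))
    active = os.path.basename(os.readlink(os.path.join(root, "current")))
    index = releases.index(active)
    if index == 0:
        raise RuntimeError("no earlier release to roll back to")
    return activate(root, releases[index - 1])
```

Because old releases stay on disk, a rollback is the same cheap operation as a deploy, which is what makes it safe to offer as a button.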
CONTINUOUS INTEGRATION
• Git flow or trunk development?
• Having Jenkins in the stack is not CI
• Run tests automatically every time you push
• Keep the build quick, green and gradually increase test coverage
CONTINUOUS DELIVERY
• Logical evolution of CI: after the build stage our code is prepared to go to Test / Prod
• Not the same as Continuous Deployment
• Smaller and faster releases, less risk, fewer bugs, boosted productivity, sense of progress
• Definition of done: Deployed to Production
CONFIG MANAGEMENT
Stop having snowflakes! Envs all the same!
WHY THESE TOOLS?
• We used to do shell commands to build servers
• Nobody remembers all that was executed!
• Your servers WILL fail. It is not an IF question, but a WHEN question. And you need to rebuild them
• Bonus: Local, Test and Prod are exactly the same
PRODUCTION
• “Production” is a config hashmap, with the exact same components as test, just less power
• It often ends up being some mythological place nobody is able to constantly rebuild
• It is painful to apply to existing infra, but totally worth the investment
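
The "config hashmap" idea can be sketched as follows: each environment is a dictionary with the same component keys, and only the sizing differs. All names and values here are hypothetical, chosen only to illustrate the shape.

```python
# Hypothetical environment definitions: same components, different sizing.
ENVIRONMENTS = {
    "test": {
        "web": {"instances": 1, "size": "small"},
        "db": {"instances": 1, "size": "small"},
    },
    "prod": {
        "web": {"instances": 4, "size": "large"},
        "db": {"instances": 2, "size": "large"},
    },
}


def component_drift(environments):
    """Return, per environment, the components it lacks compared to the others."""
    all_components = set()
    for config in environments.values():
        all_components |= set(config)
    drift = {}
    for name, config in environments.items():
        missing = all_components - set(config)
        if missing:
            drift[name] = sorted(missing)
    return drift
```

An empty result from component_drift(ENVIRONMENTS) means no environment has turned into a snowflake at the component level.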
PUSH VS PULL MODELS
• Push: a Control Machine connects to N servers (SSH or WinRM) and pushes changes
• Pull: servers have “agents” installed that pull updates from a Master
PROS & CONS
• Push model is easier to introduce gradually, but it can get tricky to keep track of what was executed and when
• Pull model requires maturity as you can cause massive disasters. It also presents some scale issues
IMAGES CREATION
• Many platforms allow the creation of “images”
• Or we can create Docker images as well
• Servers are built much quicker if we bake high!
• Packer can orchestrate all this and integrates with all config management tools
LOGS, TIME SERIES, MONITOR
What is happening in my apps & infra?
MEANINGFUL LOGS
• Get to know the standard logging levels
• Send them to a common place where you can view them in real time and query them (ELK, Splunk, …). No more grep / tail PLEASE!
• Add context and apply “grok” filters
• Bonus: Remember to enable logrotate!
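
A minimal sketch of "levels plus context", using Python's standard logging module to emit one JSON object per line, which is easy for a central log store to parse and query. The formatter class, the build_logger helper and the context attribute are hypothetical names for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # "context" is a hypothetical attribute supplied via extra=...
            "context": getattr(record, "context", {}),
        }
        return json.dumps(payload, sort_keys=True)


def build_logger(name, stream):
    """Create a logger that writes INFO and above as JSON lines to stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

A call such as log.warning("payment retry", extra={"context": {"order_id": 42}}) then carries the order id as a queryable field instead of burying it in free text.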
TIME-SERIES DATA
• Evolution of metrics over time
• Both Infrastructure and Business metrics
• Grafana + InfluxDB / ElasticSearch / Cloudwatch…
• Crucial for Internet of Things monitoring
• Identify patterns, forecast, intervention analysis…
MONITORING / ALERTING
• It is all about setting thresholds and taking actions if we go over / below them
• Cloudwatch + SNS, Zabbix, Pagerduty, Sensu…
• Take out alerts that get ignored: NOISE
• Better basic monitoring than nothing at all
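
Threshold-based alerting can be sketched in a few lines. This is an illustration only, not how any of the tools above implement it; the Threshold class, the evaluate function and the metric names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    metric: str
    low: Optional[float] = None   # alert when the value drops below this
    high: Optional[float] = None  # alert when the value rises above this


def evaluate(threshold, value):
    """Return an alert message when value is outside the band, else None."""
    if threshold.high is not None and value > threshold.high:
        return f"{threshold.metric}={value} is above {threshold.high}"
    if threshold.low is not None and value < threshold.low:
        return f"{threshold.metric}={value} is below {threshold.low}"
    return None
```

Returning None for healthy values keeps the noise out: only a message that someone should act on ever reaches the notification channel.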
EXTRA THOUGHTS
• Try to have the same setup in all envs
• There are too many tools, it is hard to standardise, and we all have our preferences!
• Many devs don't see value in this… until they are on-call and cannot see what is going on!
SOME TOOLS I USE
Packer, Ansible & Terraform
BUILD AUTOMATED MACHINE IMAGES
CONCEPTS
• Builders: Platforms you build images in. It is all about what you start from!
• Provisioners: Installs and configures
• Post-processors: Optional final steps
DEMO TIME!
• Virtualbox and AWS examples for Ubuntu 16 and Windows Server 2012R2
• Check these Packer scripts at https://github.com/ricardclau/geekshubsbcn/tree/master/packer
AWS EBS BUILDER
• Start from an existing AMI
• Packer creates a temporary key pair (in Windows it retrieves the admin password)
• Provision box
• Store instance as new AMI
VIRTUALBOX / VMWARE
• Start from an ISO or existing image
• Need to bypass the GUI for OS installation using boot_command / Autounattend.xml
• Provision box
• Store as new image
WHAT I LIKE
• Builds for multiple platforms from a single source configuration
• VERY Easy to understand
• Works (and can provision) in Win, Mac, Linux
• Easy to share provisioning scripts or use Puppet / Ansible recipes
CAVEATS
• Need to be very prescriptive or you end up with multiple very similar templates
• A bit hard to go with a DRY approach
• Some things are hard to destroy / replace with new images
ANSIBLE
Automation for everyone
SHOW TIME!
• Let's explore some Holaluz playbooks!
• We combine Galaxy roles with our own stuff!
BASIC CONCEPTS
• Inventories -> Group of servers
• Tasks -> Actions to execute
• Roles -> Reusable sets of tasks
• Playbook -> Tasks + roles applied to a part of an inventory
PLAYBOOKS
• Group we target (from the inventory) -> hosts
• We connect with a remote_user
• And we can “become” another user
• For Windows we need to set communication mode to WinRM and port to 5985 or 5986
ROLES
• Reusable tasks, customised by changing variables
• Folders: defaults, tasks, handlers, templates…
• Many open-source roles in Ansible Galaxy
• Sometimes tricky to make your Ansible code reusable by other people
INVENTORIES
• We can create one “by hand” for a small setup
• They can also be dynamic
• ec2.py -> creates groups by different AWS concepts (EC2 Name, tags, ASGs…) we can use in playbooks as targets
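
The idea behind a dynamic inventory can be sketched as turning instance metadata into named groups. This is a simplified illustration, not the actual ec2.py script; the input shape, the group_by_tags function and the group naming scheme are assumptions made for the example.

```python
from collections import defaultdict


def group_by_tags(instances):
    """Build inventory-style groups: one group per (tag key, tag value) pair."""
    groups = defaultdict(list)
    for instance in instances:
        for key, value in sorted(instance["tags"].items()):
            # Hypothetical naming scheme for the generated groups.
            groups[f"tag_{key}_{value}"].append(instance["address"])
    return {name: sorted(hosts) for name, hosts in groups.items()}
```

A playbook can then target a generated group by name, so newly launched servers with the right tags are picked up without anyone editing an inventory file.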
WHAT I LIKE
• Relatively low learning curve
• Easy to gradually introduce
• No need for agents, only need SSH / WinRM
• Plays nicely with Windows servers
• Decent community roles in Ansible Galaxy
CAVEATS
• Many bugs, BC breaks and questionable changes
• Tricky to know when we last ran some playbook in a big setup (Ansible Tower can help)
• Tricky to make it fully idempotent
• Windows support has room for improvement
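
Idempotency means that running the same task twice leaves the system in the same state, with the second run reporting no change. A minimal sketch on a list of config lines (ensure_line is a hypothetical helper, not an Ansible API):

```python
def ensure_line(lines, line):
    """Return (new_lines, changed), appending only when the line is absent."""
    if line in lines:
        return list(lines), False
    return list(lines) + [line], True
```

A plain "append this line" command would add a duplicate on every run; checking the current state first is what makes the operation safe to repeat.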
WRITE, PLAN AND CREATE INFRASTRUCTURE AS CODE
CONCEPTS
• Provider: Platform we are automating
• Resources: Automatable things in the Provider
• Modules: Reusable set of resources
• State: Used to diff the desired state against the existing one. Can be stored remotely and supports distributed locking
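
The role of state can be sketched as a dictionary diff: compare what the code declares with what was recorded last time, and derive the actions. This is a conceptual illustration only, far simpler than what Terraform actually does; the plan function and the resource names are hypothetical.

```python
def plan(desired, existing):
    """Diff desired resources against recorded state and list the actions."""
    desired_names = set(desired)
    existing_names = set(existing)
    return {
        "create": sorted(desired_names - existing_names),
        "update": sorted(
            name
            for name in desired_names & existing_names
            if desired[name] != existing[name]
        ),
        "destroy": sorted(existing_names - desired_names),
    }
```

This also hints at why console changes hurt: anything changed outside the code makes the recorded state disagree with reality, and the diff is only as good as its inputs.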
DEMO TIME!
• Let's build a test and prod VPC with Apache servers under ELB!
• Check this Terraform code at https://github.com/ricardclau/geekshubsbcn/tree/master/terraform
VPC (10.161.0.0/16) - Region: eu-west-1
• AZ eu-west-1a: DMZ1 (10.161.0.0/24), APP1 (10.161.4.0/24)
• AZ eu-west-1b: DMZ2 (10.161.1.0/24), APP2 (10.161.5.0/24)
• AZ eu-west-1c: DMZ3 (10.161.2.0/24), APP3 (10.161.6.0/24)
• Components in the diagram: BASTION, NAT1, NAT2, NAT3, APP PUBLIC ELB, APP1, APP2, APP3
• Diagram labels: PUBLIC IPS, ONLY PRIVATE IPS
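
A quick sanity check of the addressing plan, using Python's standard ipaddress module: every subnet should sit inside the VPC range and no two subnets should overlap. The CIDRs are the ones from the diagram; the validate function is a hypothetical helper.

```python
import ipaddress

VPC = ipaddress.ip_network("10.161.0.0/16")
SUBNETS = {
    "DMZ1": "10.161.0.0/24",
    "DMZ2": "10.161.1.0/24",
    "DMZ3": "10.161.2.0/24",
    "APP1": "10.161.4.0/24",
    "APP2": "10.161.5.0/24",
    "APP3": "10.161.6.0/24",
}


def validate(vpc, subnets):
    """Return (subnets outside the VPC, pairs of overlapping subnets)."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    outside = sorted(name for name, net in nets.items() if not net.subnet_of(vpc))
    names = sorted(nets)
    overlapping = [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]
    return outside, overlapping
```

Running a check like this before applying infrastructure code catches addressing mistakes while they are still cheap to fix.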
WHAT I LIKE
• Can integrate with anything that has an API
• Easy to extend, contribute and really quick to add new features. Excellent GitHub community
• Existing resources can be imported (PAIN)
• Have used it for 18 months with multiple providers; rarely hit a bug, and it was always fixed quickly
CAVEATS
• Once you go Terraform, STOP using Console
• Some providers don't have nice update support
• Terraform modules feel a bit hacky
• Sometimes state needs manual editing (getting much better but beware new providers)
THANKS TO…
• Ex-colleagues Hailo & Wonga - Stephen Tan, Nico Engelen, Chris Hoolihan, Álex Hernández
• Peter Mounce ex-Just Eat - Windows
• London DevOps meetup organisers
• All of you for coming!
RECOMMENDED BOOKS
• The Phoenix Project - Gene Kim, Kevin Behr, George Spafford
• The DevOps Handbook - Gene Kim, Patrick Debois
• The Logstash Book - James Turnbull
• Ansible for DevOps - Jeff Geerling
• Terraform: Up and Running - Yevgeniy Brikman
QUESTIONS? CONTACT?
• Email: [email protected]
• Twitter: @ricardclau
• GitHub: https://github.com/ricardclau
• If you think these techniques help your company, let's talk!