Continuous Release with CircleCI and Houston
Combining CircleCI, GKE and Houston to get code to customers simply, safely, and quickly.
Overview
Engineering teams don’t exist to manually execute release processes or stare nervously at production metrics; they exist to build and deliver products to customers. We develop release processes to ensure the products we’ve built are good, understanding that the time spent on this assurance comes at a price: time spent thinking about releasing is time spent not thinking about product. At Turbine Labs, we spend a lot of time thinking about how to improve this balance, and it’s why we built Houston. Here we’ll show you how to create a high-quality, low-overhead continuous release pipeline for web applications using a combination of CircleCI, GKE and Houston.
The goal of a release process is to get new code to customers quickly and safely. In an ideal world, this process is instantaneous and risk-free. In the real world, every release comes with some risk, and all processes take time. This is a delicate balance. Spend too much time increasing your confidence in a release and your execution pace slows to a crawl. Release with too much gusto and product quality suffers, which also grinds execution to a halt.
Decades ago, the code/build/verify/deploy/release cycle was largely manual: most of the steps in the process were performed by humans, and the process was slow. The rise of continuous integration systems automated unit testing and packaging. More recently, cloud and container-orchestration infrastructure has made it easier than ever to automatically push code to hardware. But continuous deployment can add significant customer-facing risk when applied without a separate release process.
The stress and caution involved in a traditional release process are due to several factors:
- cost of releasing new code
- cost of reverting a failed release
- difficulty determining the quality of a release
- broad scope of released defects
The right tools can help build a workflow that addresses each of these points. Releases that are cheap to execute, cheap to revert, simple to inspect, and limited in their impact dramatically accelerate engineering teams.
We’ll walk through a concrete example using CircleCI for continuous integration, GKE for continuous deployment, and Houston for high confidence releases. All code from this example is available at https://github.com/turbinelabs/circle-ci-integration/. You can fork this repo and follow the instructions in the README to work through examples yourself.
In our workflow, developers push branches that are built, tested and packaged by CircleCI, which automatically pushes them to GKE as deployments. Engineers can test these deployments before releasing them to a broad audience, verifying code in production without exposing it to customers. Release engineers push tags that are also packaged by CircleCI and pushed to GKE. These deployments can be incrementally released to customers, with behavior compared against existing code. If the new deployment looks bad, it can be instantly turned off. The total time from a git push to code showing up in production is about 2 minutes.
GKE Configuration
This example builds on our Kubernetes integration guide. We have a simple service deployed to GKE, using Houston to manage ingress traffic. This setup allows us to separate deploy from release, enabling incremental blue-green deploys, verification in production, and easy comparison of app behavior across different code versions. On its own, this addresses two of our challenges: Houston’s customer-focused approach to measuring application health makes it straightforward to determine whether new code is behaving well, and Houston’s release control makes reverting a release as simple as flipping a switch.
Requirements
We want it to be easy to deploy and release code, but we don’t necessarily want to push every commit that makes it to git. In this example we use the following conventions:
- Every commit has automated tests run.
- Branches that match a naming scheme of /server-dev-.*/ are pushed to GKE, with each branch getting its own deployment. As modifications are made to the branch, the corresponding deployment is updated. This lets developers iterate on branches in progress without polluting the GKE environment.
- Tags that match a naming scheme of /server-prod-.*/ are pushed to GKE, with each tag getting its own deployment.
CircleCI 2.0 provides Workflows, which let you codify different steps to perform for different git activities. In our config.yml we have three Workflows defined:
- The “build” Workflow executes on every commit, and just runs automated tests.
- The “dev_deploy” Workflow is triggered on updates to branches that match the regex /server-dev-.*/. It runs automated tests, builds a Docker image, pushes it to GCR, and then creates a Kubernetes deployment labeled with a stage of “development”.
- The “prod_deploy” Workflow is triggered on tag pushes of the form /server-prod-.*/. It tests, packages and pushes images to GCR just like dev_deploy. However, it also includes an approval step before creating the deployment, and it labels its deployments with a stage of “production”.
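To make the naming conventions concrete, here is a minimal shell sketch of how a git ref maps to a deployment stage. The function name deploy_stage_for_ref is hypothetical and is not part of our repository; in the real pipeline this routing is expressed as Workflow filters in config.yml. The shell glob server-dev-* is used as a stand-in for the regex /server-dev-.*/.

```sh
# Hypothetical helper: prints the stage label a ref would be deployed with,
# or "none" when the ref only gets the automated test run.
# Usage: deploy_stage_for_ref <branch|tag> <ref-name>
deploy_stage_for_ref() {
  ref_kind=$1
  ref_name=$2
  case "$ref_kind:$ref_name" in
    branch:server-dev-*) echo "development" ;;  # handled by dev_deploy
    tag:server-prod-*)   echo "production" ;;   # handled by prod_deploy
    *)                   echo "none" ;;         # tests only, no deployment
  esac
}
```

For example, deploy_stage_for_ref tag server-prod-v1.1 prints production, while deploy_stage_for_ref branch master prints none.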
Continuous Integration
Our example project is a simple node app, so integration is just executing an automated test suite. CircleCI 2.0 introduced the ability to use custom container images, which dramatically improved build/test times by allowing users to start with a pre-configured image, thus saving the cost of downloading and installing required software on every build. Instead of starting with a stock Linux install, we start with the official node:8.4.0 image, and run npm install and npm test.
Continuous Delivery
Continuous delivery is more involved, as we need to configure CircleCI to work with GKE. We use project environment variables to keep secrets out of our repository, setting the following values:
1. GCLOUD_CLUSTER_NAME: can be found with gcloud container clusters list
2. GCLOUD_COMPUTE_ZONE: can also be found with gcloud container clusters list
3. GCLOUD_PROJECT_ID: described here
4. GOOGLE_AUTH: described here
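A build that starts without one of these values tends to fail late and confusingly, so it can help to check for them up front. The sketch below is an illustration rather than something taken from our repository: missing_gke_vars is a hypothetical name, and it assumes the third variable is spelled GCLOUD_PROJECT_ID.

```sh
# Hypothetical helper: prints the name of each required variable that is
# unset or empty, one per line. Prints nothing when all four are present.
missing_gke_vars() {
  for name in GCLOUD_CLUSTER_NAME GCLOUD_COMPUTE_ZONE GCLOUD_PROJECT_ID GOOGLE_AUTH; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# One way to fail fast at the top of a build step:
#   missing=$(missing_gke_vars)
#   [ -z "$missing" ] || { echo "missing variables: $missing" >&2; exit 1; }
```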
The intro tutorial on using CircleCI with GKE provides a good overview, but it starts with a fresh Ubuntu image on each build, requiring each build to download, install, and configure the Google Cloud SDK and Docker. We’ve wrapped that up into an image with the software pre-installed and configured from environment variables. This saves significant time during each build. With this in hand, our push-dev-server job runs docker build, tags the built image, and pushes it to GCR.
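As a rough sketch of the build, tag, and push step, the shell below composes a GCR image reference and then builds and pushes it. The function names, the DRY_RUN switch, and the image name and tag passed in are all hypothetical; the sketch also assumes Docker has already been authenticated against GCR, which the real job handles inside the build image.

```sh
# Composes an image reference of the form gcr.io/<project>/<name>:<tag>.
gcr_image_ref() {
  echo "gcr.io/$1/$2:$3"
}

# Builds the image from the current directory and pushes it to GCR.
# With DRY_RUN=1 the commands are printed instead of executed, which is
# useful for checking the image naming without touching Docker.
# Usage: build_and_push <image-name> <tag>
build_and_push() {
  image=$(gcr_image_ref "$GCLOUD_PROJECT_ID" "$1" "$2")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker build -t $image ."
    echo "docker push $image"
  else
    docker build -t "$image" . && docker push "$image"
  fi
}
```

Tagging with the git sha (available in CircleCI as CIRCLE_SHA1) is one reasonable choice, since it ties every image back to an exact commit.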
Continuous Deployment
Now that our code is packaged and pushed to GCR, we can create deployments in GKE. We’ve built YAML templates for dev and prod deploys, with replacement strings for the branch, tag, git sha, stage (dev or prod), and version of the deploy. A simple shell script uses information provided by the CircleCI environment to translate the template to a valid Kubernetes spec. We use the same gcloud-build-base image to invoke kubectl, and our deployment gets created or updated in GKE.
Note that this will deploy code, not release it.
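The template translation can be sketched in a few lines of shell. This is an illustration under stated assumptions, not the script from our repository: the __BRANCH__ style placeholders and the function name are hypothetical, while CIRCLE_BRANCH, CIRCLE_TAG, and CIRCLE_SHA1 are variables that CircleCI sets for each build.

```sh
# Hypothetical helper: reads a deployment template on stdin and writes the
# rendered spec to stdout.
# Usage: render_deploy_template <stage> <version> < template.yaml
# Assumes the substituted values contain no "|" or "&" characters, since
# those are special in the sed expressions below.
render_deploy_template() {
  stage=$1
  version=$2
  sed \
    -e "s|__BRANCH__|${CIRCLE_BRANCH:-}|g" \
    -e "s|__TAG__|${CIRCLE_TAG:-}|g" \
    -e "s|__SHA__|${CIRCLE_SHA1:-}|g" \
    -e "s|__STAGE__|$stage|g" \
    -e "s|__VERSION__|$version|g"
}

# The rendered spec could then be piped straight to kubectl, for example:
#   render_deploy_template development "$CIRCLE_BRANCH" < template.yaml | kubectl apply -f -
```

Because kubectl apply creates the deployment if it is missing and updates it otherwise, the same command covers both the first push of a branch and every later push.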
Verifying in Production
Now that our Workflow is in place, we can create a development branch, make a change, push it, and see a new deployment in production in ~2 minutes. Once deployed, we can use Houston to preview new changes and verify them before they go out to customers. To streamline this process, our frontend developers created a Chrome plugin (source also available on GitHub) that lets you select a version of a service to view.
We first configure Houston to look for cookies named “Tbn-All-in-one-server-Version”, and if such a cookie exists, route traffic to a service instance whose version label matches the value of the cookie. With that in place, our plugin can ask the Houston API for a list of deployed service versions, and set a cookie that pins the browser session to that version of code. This lets you share pushed code with other team members with no risk of disrupting customers. If you need to make changes, just make changes on your branch and push it. The CI pipeline builds, packages and pushes the new code.
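The routing rule itself is simple: if the request carries the cookie, its value names the version to serve. The shell sketch below only illustrates that rule by pulling the pinned version out of a Cookie header value; it is not how Houston implements routing, and pinned_version is a hypothetical name.

```sh
# Hypothetical helper: prints the value of the Tbn-All-in-one-server-Version
# cookie from a Cookie header value, or nothing when the cookie is absent.
pinned_version() {
  printf '%s\n' "$1" \
    | tr ';' '\n' \
    | sed -n 's/^ *Tbn-All-in-one-server-Version=//p' \
    | head -n 1
}
```

Outside the browser, the same pinning can be exercised by sending the cookie by hand, for example with curl --cookie "Tbn-All-in-one-server-Version=server-dev-login" followed by your site’s URL. The version value here is a placeholder and depends on how your deployments are labeled.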
Releasing
Production releases are slightly different. When you want to create a releasable version, create a tag and push it to your git origin.
git tag server-prod-v1.1
git push origin server-prod-v1.1
This will trigger the prod_deploy Workflow in CircleCI, which follows the same steps as the dev_deploy Workflow but includes an additional approval step before a deployment is created in GKE. The approval step isn’t required, but if you want a lightweight gate before production instances are created, CircleCI provides a nice solution.
Once the deployment is created you can use Houston to incrementally release the new version to customers. Start with a small percentage to minimize impact if a defect slips through. Then, compare the latency and success rates of the old and new versions. If anything goes wrong, simply turn off the new release, and customers will all be back on the previous version.
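The comparison can be as simple as checking that the new version’s success rate has not fallen meaningfully below the old one’s. The sketch below shows one such check; the function name, the tolerance, and the sample numbers are hypothetical, and in practice the rates would come from whatever metrics you are watching during the release.

```sh
# Hypothetical check: succeeds (exit status 0) when the new version's success
# rate is within <tolerance> percentage points of the old version's.
# Usage: release_looks_healthy <old-rate> <new-rate> <tolerance>
release_looks_healthy() {
  awk -v old="$1" -v new="$2" -v tol="$3" 'BEGIN { exit !(new + tol >= old) }'
}

# Example decision during an incremental release (numbers are made up):
#   if release_looks_healthy 99.9 99.8 0.5; then
#     echo "safe to increase traffic to the new version"
#   else
#     echo "turn off the new release"
#   fi
```

The same shape works for latency by comparing in the other direction, so that the new version passes when it is not meaningfully slower than the old one.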
Wrapup
With this pipeline in place, we can build, test, deploy, verify, and release software easily and safely. Creating a deploy is as simple as pushing a branch or tag. These deploys can be released to individual developers, or incrementally to a slice of the user base, dramatically reducing the impact of defects. When incidents do occur, reverting to the previous state is as simple as flipping a light switch. All this means engineering teams spend less time stressing about making changes, and more time focused on building products.