What is canary deployment? When and how to use it.
What is canary deployment?
Canary deployment is a software deployment technique where a new feature or version is released to a small subset of users in production before it is released to a larger subset or to all users. It is also sometimes called a phased rollout or incremental release. By design it is low-risk, because the new feature initially reaches only a small, defined subset of users. It also provides a way to test a new version in production with minimal harm if anything goes wrong. In this post, we will discuss when and how to use canary deployment, its benefits, and its relationship with feature flags. Let's get rolling!
When to use a canary release
A canary deployment or canary release can and should be used in many situations. Before going into when to use it, let’s discover how it got the name “canary”. Until the 1980s, coal miners in the UK, Australia, Canada and the US used canaries as an early warning system for harmful gases like carbon monoxide and methane. The birds would show visible distress in the presence of gas, alerting the miners to danger before they could recognize it themselves.
Similar to the canary birds, the early sub-segment of users that you expose a new feature to can warn you if something isn’t right. That canary subset can be, for example, 0.25% of the customer base, chosen randomly or with some logic, who get to try out the new version before a bigger population of customers does. If anything breaks, the release can be rolled back. If things are all dandy, the 0.25% can be pushed up to 1%, then gradually to 5% and more, while monitoring the logs, errors, and overall health of the software.
As for when to use canary deployment, we see it as a must for any change in the critical path. It can also be used for non-critical features or to A/B test a new idea as an experiment. The benefit of canary releases is greatest when the stakes are high, as in the example below.
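To make the idea of a 0.25% subset concrete, here is a minimal Python sketch of one common way to pick it: hash each user ID into a stable position between 0 and 100 and compare it with the rollout percentage. The function names and the salt value are our own assumptions for illustration, not part of any particular tool.

```python
import hashlib


def bucket(user_id: str, salt: str = "new-checkout") -> float:
    """Map a user to a stable position in the range [0, 100)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Treat the first 8 hex digits (32 bits) as an integer and scale it to [0, 100).
    return int(digest[:8], 16) / 0x100000000 * 100


def in_canary(user_id: str, percent: float, salt: str = "new-checkout") -> bool:
    """True if the user falls inside the first `percent` of the bucket range."""
    return bucket(user_id, salt) < percent


# Hypothetical user IDs, only for demonstration.
users = [f"user-{i}" for i in range(100_000)]
for percent in (0.25, 1, 5):
    count = sum(in_canary(u, percent) for u in users)
    print(f"{percent}% rollout -> {count} of {len(users)} users")
```

Because a user's position never changes, everyone who was in the 0.25% group stays in the canary when the percentage is raised to 1% or 5%, so nobody flips back and forth between versions during the rollout.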
Example of a payment gateway for an e-commerce company
As an example, let’s say an e-commerce company is using Braintree for processing all its payments. For flexible payment options and to integrate with other payment providers, the company decides to switch to Stripe as the payment gateway. To make this switch, our hypothetical company should use a canary release to minimize the risk. Bear in mind that most if not all of the income for an e-commerce company comes from the checkout and collecting money from customers at checkout. It is the most vital part of the critical path, and any glitch on checkout or payment gateway level is a real disaster for our little e-commerce venture.
Let’s suppose the company has thousands if not millions of customers and also thousands of orders a day. Once the new payment gateway integration has been developed to a production-ready state, the first release should be to the team responsible for its development.
The canary release allows the code to be deployed to production while being released only to the team who developed the Stripe integration. This is the textbook "deploy is not release" philosophy in practice. Of course, at this point, and for the foreseeable future, both the Braintree and Stripe gateways will work on the checkout page. After this first release and ironing out any bugs, the next stage could be to release it to all the staff of the company, identified by their email address or the internal IP of the office.
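The first two stages described above can be sketched as a simple targeting rule. This is only an illustration in Python: the email addresses, the `example-shop.test` domain, and the `Stage` and `gateway_for` names are hypothetical, and a real setup would more likely keep these lists in a feature flag segment than in code.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    DEV_TEAM = 1  # only the engineers who built the Stripe integration
    STAFF = 2     # everyone working at the company


# Hypothetical values, used only for this example.
DEV_TEAM_EMAILS = {"asha@example-shop.test", "liam@example-shop.test"}
STAFF_EMAIL_DOMAIN = "example-shop.test"


@dataclass(frozen=True)
class User:
    email: str


def gateway_for(user: User, stage: Stage) -> str:
    """Pick the payment gateway for a user at the given rollout stage."""
    email = user.email.lower()
    if email in DEV_TEAM_EMAILS:
        return "stripe"  # the dev team sees Stripe at every stage
    if stage >= Stage.STAFF and email.endswith("@" + STAFF_EMAIL_DOMAIN):
        return "stripe"  # staff join once the rollout reaches the second stage
    return "braintree"   # everyone else stays on the existing gateway


print(gateway_for(User("asha@example-shop.test"), Stage.DEV_TEAM))
print(gateway_for(User("sam@example-shop.test"), Stage.DEV_TEAM))
print(gateway_for(User("sam@example-shop.test"), Stage.STAFF))
```

The code for both gateways is deployed the whole time; only the rule deciding who reaches Stripe changes between stages.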
From 1% to 5% then eventually to 100% customers using Stripe
If the second release is successful, the next step can be to release the new payment gateway to 1% of the customers and see how that works out. At this stage, 1% of the customers will use Stripe as the payment gateway, and 99% will still be using the tried and tested previous integration with Braintree. Gradually, over days or weeks, the balance can be shifted to 50/50, where 50% of the customers use Stripe and the other half use Braintree.
At each step, any bugs affect only the percentage of users on Stripe and can be fixed before the next increase. How quickly the balance shifts depends on the company's risk appetite and its reasons for switching to a new payment gateway. Eventually, 100% of the traffic/customers will use the Stripe payment gateway and, after weeks or months of inactivity, the Braintree integration can be removed from the code as needed. At this point, the Stripe integration will be generally bug-free and ready for prime time, even from an infrastructure point of view. Visually, it would look like this:
The above is just one example, but canary releases can be used in a multitude of other scenarios. For instance, Facebook uses canary deployment for mobile app releases. Recently, when we started seeing time insights on Google Calendar that show the breakdown of time spent in meetings, some of my friends saw the feature a week or so ahead of me. As you can see, even billion-dollar companies use canary deployments to release at the scale of millions, if not billions, of users.
Regardless of the scale, the idea of canary deployment is very powerful. Even if you have only hundreds or thousands of users, testing out new features in production with a small set of real users can be very beneficial. In the next section, we will look at the numerous benefits of canary deployment.
Benefits of canary deployment
Without a doubt, canary deployment minimizes the risk of releasing a new version of the software. There are other benefits of canary deployment too. Some notable ones are:
We can generally predict the resources needed for a new system and estimate a range for how much traffic it will get. This is important when we want to introduce a new microservice that replaces an existing older system. With a canary release, we can divert, for example, 1% of the production traffic and test our assumptions about the resources the service will need. Any performance issue or bottleneck can be identified earlier and therefore solved faster. With the safety valve of a canary release, we can turn the traffic on the new system back to 0% while issues are being fixed.
Even though we thoroughly test new systems or features in staging, we cannot say with certainty how the feature will behave under the full production traffic load. There may be edge cases that only show up when the feature is used in a production environment. This is where a canary release helps: we get early feedback from a very small subset of the production traffic without affecting the rest. With early feedback, we can change the feature if need be and make it even better for the next set of users who will get it.
Canary deployment is also about having a safety net to roll back to a working system while the new feature/version is being introduced. In the example of the payment gateway, let’s say we saw a major bug in the Stripe implementation when we released the new payment gateway to 1% of the traffic. The rollback would be very easy: we simply update a feature flag, changing its value from 1% of traffic to a segment containing only the software engineers who developed the feature. This ease and high level of control over a new feature or version of the software is the power of canary releases.
A/B testing, though its widespread adoption is recent, is a 100-year-old method for comparing two versions of something to find out which one performs better. In software teams using canary deployment, we naturally serve two different versions of the software as we introduce the new or updated feature. This gives us an opportunity to review users' reactions to the update and learn whether it is working better. Generally, A/B testing is done for a specific period of time with a control group and a variation group, and the version to keep is chosen based on the data collected. Canary releases are a great way to incorporate A/B testing into your releases and get the most out of them.
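As a rough illustration of that rollback, the sketch below checks the error rate seen by the canary group and, if it is too high, shrinks the audience back to the development team. The `Rollout` structure, the `guard` function, and the 2% threshold are assumptions made for this example; in practice the same change would usually be made by editing the feature flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rollout:
    percent: float  # share of customers routed to the new payment gateway
    audience: str   # "customers" or "dev_team"


def guard(rollout: Rollout, canary_errors: int, canary_requests: int,
          max_error_rate: float = 0.02) -> Rollout:
    """Return the rollout to use next, rolling back if the canary looks unhealthy."""
    if canary_requests == 0:
        return rollout  # no canary traffic yet, so there is nothing to judge
    if canary_errors / canary_requests > max_error_rate:
        # Roll back: no customers on the new gateway, only the dev team segment.
        return Rollout(percent=0.0, audience="dev_team")
    return rollout


current = Rollout(percent=1.0, audience="customers")
print(guard(current, canary_errors=3, canary_requests=1000))   # healthy, unchanged
print(guard(current, canary_errors=80, canary_requests=1000))  # unhealthy, rolled back
```

The important point is that the rollback is a data change, not a redeploy: the old gateway never stopped serving the other 99% of traffic.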
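To show what comparing a control and a variation group can look like in numbers, here is a small Python sketch that computes conversion rates and a two-proportion z-score. The visitor and conversion counts are made up for illustration, and the z-test is just one common way to judge such a comparison, not something specific to canary releases.

```python
import math


def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who completed the goal, for example a purchase."""
    return conversions / visitors


def z_score(control_conv: int, control_n: int,
            variant_conv: int, variant_n: int) -> float:
    """Two-proportion z-score; positive when the variant converts better."""
    p_control = control_conv / control_n
    p_variant = variant_conv / variant_n
    # Pooled rate under the assumption that both groups behave the same.
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    return (p_variant - p_control) / std_error


# Made-up numbers: 95% of traffic on the old version, 5% on the canary.
control_conversions, control_visitors = 285, 9500
variant_conversions, variant_visitors = 20, 500

print("control rate:", conversion_rate(control_conversions, control_visitors))
print("variant rate:", conversion_rate(variant_conversions, variant_visitors))
z = z_score(control_conversions, control_visitors,
            variant_conversions, variant_visitors)
# |z| above 1.96 corresponds to significance at the usual 5% level (two-sided).
print("z-score:", z, "significant:", abs(z) > 1.96)
```

A small canary group means a small sample, so a difference that looks promising may not yet be statistically meaningful. That is one more reason to widen the rollout gradually before drawing conclusions.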
There are other benefits but the above mentioned are the main ones. Now that you know all the reasons to use a canary deployment, let’s discuss how to put the concept into practice.
How to use a canary deployment
Without delving deep into the technical details, canary releases can mainly be done in two ways. Below we discuss both: first enabling canary deployment at the infrastructure level, and then at the code level.
Canary deployments from the infrastructure level
The first way to manage a canary deployment is to control the traffic flow to the new version or feature at the infrastructure level. One product that lets us do this without going into the nitty-gritty of load balancers or service meshes is Google Cloud Run. It gives the user the ability to route x% of traffic to a newly deployed version. It also allows splitting traffic with tags for multiple versions, truly handling canary releases at the infrastructure level without the hassle of understanding the details underneath. AWS Elastic Beanstalk also supports traffic splitting, as detailed in their docs.
Obviously, unless we have a great DevOps/SRE team who enables us to do this with our existing infrastructure, this is more of a pipe dream than a reality. We would actually need to dive deep into load balancing, blue-green deployment, rolling deployment, and the like to put this into practice. This brings us to the next way to do it: with pure code and feature flags.
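Conceptually, percentage-based traffic splitting boils down to a weighted choice per request. The Python sketch below is a toy model of that idea only; it is not how Cloud Run or Elastic Beanstalk implement it, and the revision names are hypothetical.

```python
import random
from collections import Counter


def pick_revision(weights: dict[str, int], rng: random.Random) -> str:
    """Choose which deployed revision serves a request, in proportion to its weight."""
    revisions = list(weights)
    return rng.choices(revisions, weights=[weights[r] for r in revisions], k=1)[0]


# Hypothetical revision names: 99% of requests to the old version, 1% to the new one.
weights = {"checkout-v1": 99, "checkout-v2": 1}
rng = random.Random(42)  # seeded so that repeated runs give the same split

hits = Counter(pick_revision(weights, rng) for _ in range(10_000))
print(hits)
```

Note that this splits requests, not users: the same customer may land on different revisions across requests unless the platform adds some form of session affinity. That is one practical difference from the user-based targeting that feature flags give us.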
Canary releases enabled by code and feature flags
The second way to enable canary deployment and releases is to have the feature code always deployed, and control the reach of that new feature code to a certain set of users with code and conditions. This is exactly where feature flags shine. Using feature flags, we don’t need a large SRE team to enable us to do canary or phased rollouts. Software engineers can do it themselves with the clever use of feature flags and Flagsmith. Combining the release with segments and identities adds that needed level of options and flexibility for canary releases.
With optimal use of feature flags or multivariate flags, we can send 5% of the customers to Stripe and 95% to Braintree, as we saw in the above example. Below is a sample screenshot of an example multivariate flag for the case with 5% Stripe and 95% Braintree usage:
Furthermore, the change in the canary size and who can make the change becomes ultra-easy. With feature flags, it is simply going to an interface and changing the values from 1 and 99 to 5 and 95 and now 5% of the traffic can see the new version or feature, as shown above. This particular change can be done by someone nontechnical like a product manager. This is much easier compared to the earlier discussed infrastructure option where each change in the canary size might need a new deployment or change to the infrastructure, which is far riskier.
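To illustrate why changing the canary size needs no new deployment, here is a minimal in-memory stand-in for a multivariate flag, written in Python. It is deliberately not the Flagsmith SDK: the `MultivariateFlag` class and its methods are hypothetical and only model the behavior described above, where the weights are data that can change at runtime while the checkout code stays the same.

```python
import hashlib


class MultivariateFlag:
    """Toy multivariate flag: each value gets a percentage weight."""

    def __init__(self, name: str, weights: dict[str, float]) -> None:
        self.name = name
        self.set_weights(weights)

    def set_weights(self, weights: dict[str, float]) -> None:
        """Change the split at runtime, e.g. from 1/99 to 5/95."""
        if abs(sum(weights.values()) - 100) > 1e-9:
            raise ValueError("weights must add up to 100")
        self.weights = dict(weights)

    def value_for(self, user_id: str) -> str:
        """Return the flag value for a user, stable for a given set of weights."""
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode("utf-8")).hexdigest()
        position = int(digest[:8], 16) / 0x100000000 * 100  # stable spot in [0, 100)
        upper = 0.0
        for value, weight in self.weights.items():
            upper += weight
            if position < upper:
                return value
        return value  # guard against floating-point rounding at the top edge


flag = MultivariateFlag("payment_gateway", {"stripe": 1, "braintree": 99})
users = [f"user-{i}" for i in range(20_000)]  # hypothetical user IDs

before = {u for u in users if flag.value_for(u) == "stripe"}
flag.set_weights({"stripe": 5, "braintree": 95})  # the "change the values" step
after = {u for u in users if flag.value_for(u) == "stripe"}

print(len(before), "users on Stripe at 1%,", len(after), "at 5%")
```

Listing the new value first means that raising its weight only ever adds users to the Stripe group, so customers who already paid through Stripe keep seeing it as the rollout grows.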
As with other things in software engineering, it is always good to follow best practices for feature flags. Depending on your language and framework of choice, you can try out feature flags on Flagsmith with Node.js, React.js, Flutter, and even iOS. Please check out all of our SDKs on our GitHub profile.
As we have seen, canary deployment and releases give us finer control over the users we want to release the new feature or version to. They reduce the risk of a potentially buggy feature reaching the whole customer base and give the business more confidence, as the new feature will first be made available to only a small subset of users. Use the safety valve provided by canary releases and be confident in releasing experiments and even not-fully-baked features to a very small percentage of users to get early and super valuable feedback. Canary releases can be your secret weapon to move fast and break things for only 0.1% of the users. Always release software responsibly with user impact in mind!