What is Canary Deployment? When and How To Use It

By Geshan Manandhar on January 15, 2025

What is canary deployment?

Canary deployment is a software deployment technique where a new feature or version is released to a small subset of users in production prior to releasing to a larger subset or all users. It’s also sometimes called a phased rollout or incremental release. By design, it reduces risk by exposing new features only to defined subsets of users and gradually ramping up from there. In addition to reducing the risk of accidentally releasing buggy code, it provides a path to test out a new version of a feature in production to see how users respond. In this post, we will discuss when and how to use canary deployment, its benefits, and how it relates to feature flags. Let’s get rolling!

canary deployment

Photo by Joshua J. Cotten on Unsplash. Image Credit: https://unsplash.com/photos/q77K0zIDTmI

When to use a canary release

Until the 1980s, coal miners in the UK, Australia, Canada, and the US used canaries as an early warning system for harmful gases like carbon monoxide and methane. These birds would show visible distress in the presence of gas, alerting the miners of danger before they could recognise it themselves.

Today, no canaries are harmed and canary releases help engineers safely release new features and updates. As teams grow more wary of “big bang” releases, for good reason, canary deployments have been popularised and have many use cases.

In a canary deployment, the early sub-segment of users that you expose to a new feature can give you a warning if something isn’t right. That canary sub-segment can be, say, 1% of your customer base chosen randomly. The subset could also be a segment of your users who have self-identified as Beta testers or could follow some other logic that you apply. The idea is that you will expose the new version to chosen users before ramping up to a larger section of your whole user base. If everything goes well with the initial deployment, the percentage of users exposed to the feature can be increased gradually, all while monitoring the logs, errors, and overall health of the software.
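To make the idea of a randomly chosen 1% concrete, here is a minimal sketch of one common approach: hashing the user ID into a stable bucket. The function names, the salt value, and the beta tester set are assumptions made up for this illustration, not part of any particular tool.

```python
import hashlib


def bucket(user_id: str, salt: str = "new-checkout") -> float:
    """Map a user ID to a stable number in the range [0, 100)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Take the first 8 hex digits (32 bits) and scale them to a percentage.
    return int(digest[:8], 16) / 0x100000000 * 100


def in_canary(user_id: str, percent: float, beta_testers: frozenset = frozenset()) -> bool:
    """Return True if this user should see the new version."""
    if user_id in beta_testers:
        return True  # self-identified beta testers always get the canary
    return bucket(user_id) < percent
```

Because the bucket depends only on the user ID and the salt, a user who is in the 1% group stays in the canary when the percentage is raised to 5%, so nobody flips back and forth between versions during the ramp-up.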

As for when to use a canary deployment, we see it as a must for any change in the critical path. It can also be used for other non-critical features or to A/B test a new idea as an experiment. The biggest benefit of canary releases is realised when the stakes are high, as in the example below.

Example of a payment gateway for an e-commerce company

Let’s say an e-commerce company is using Braintree for processing all its payments. For flexible payment options and to integrate with other payment providers, the company decides to switch to Stripe as the payment gateway. To make this switch, our hypothetical company should use a canary release to minimise the risk. Bear in mind that most, if not all, of the income for an e-commerce company comes from the checkout and collecting money from customers at checkout. It’s the most vital part of the critical path, and any glitch during checkout or at the payment gateway level is a real disaster for this e-commerce venture.

Let’s suppose the company has thousands, if not millions, of customers and thousands of orders a day. Once the new payment gateway integration has been developed to a production-ready state, the first release should go to the team responsible for its development.

A canary release will allow the code to be deployed to production, but it will be released only to the team that developed the Stripe integration. This separates deployment from release and frees the team to test in production. Of course, at this point, and for the foreseeable future, both the Braintree and Stripe gateways will work on the checkout page. After this first release and ironing out any bugs, the next stage could be to release it to all the staff of the company, identified by their email address or the internal IP of the office.
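The first two stages (development team, then all staff) can be sketched as a simple targeting rule. Everything in this snippet is hypothetical: the email addresses, the staff domain, the office network range, and the stage names are placeholders chosen for illustration.

```python
import ipaddress

# Hypothetical values, for illustration only.
DEV_TEAM = {"dev1@example.com", "dev2@example.com"}
STAFF_DOMAIN = "example.com"
OFFICE_NETWORK = ipaddress.ip_network("10.0.0.0/8")


def sees_stripe(email: str, ip: str, stage: str) -> bool:
    """Decide whether a user gets the Stripe checkout.

    stage is "dev-team" (first release) or "staff" (second release).
    An invalid IP string raises ValueError from the ipaddress module.
    """
    email = email.strip().lower()
    if email in DEV_TEAM:
        return True  # the developers see it in both stages
    if stage == "staff":
        is_staff_email = email.endswith("@" + STAFF_DOMAIN)
        is_office_ip = ipaddress.ip_address(ip) in OFFICE_NETWORK
        return is_staff_email or is_office_ip
    return False  # everyone else stays on Braintree
```

For example, `sees_stripe("sales@example.com", "203.0.113.7", "staff")` is true because of the staff email domain, while the same call with the stage set to "dev-team" is false.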

From 1% to 5%, then eventually to 100% of customers using Stripe

If the second release is successful, the next step can be to release the new payment gateway to 1% of the customer base and see how this works out. At this stage, 1% of the customers will use Stripe as the payment gateway and 99% will still be using the previous integration with Braintree. Gradually the balance can be shifted to 50/50, where 50% of the customers use Stripe and the other half use Braintree.

Similarly, any bugs can be fixed and will still only affect the users using Stripe. Depending on the company’s appetite for risk, as well as the reasons for switching to a new payment gateway, this may take days, weeks, or months. Eventually, 100% of the traffic/customers will be routed to the Stripe payment gateway and after weeks/months of inactivity, the Braintree integration can be removed from the code. At this point, the Stripe integration will be generally bug-free and ready for prime time even from an infrastructure point of view.
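The gradual ramp described above could be written down as a small plan. This is only a sketch: the step values come from the percentages mentioned in this example, and the `healthy` flag stands in for whatever checks (logs, error rates, payment success) a team would actually use.

```python
# Percentage of customers routed to Stripe at each step of the rollout.
RAMP_STEPS = [1, 5, 50, 100]


def next_percentage(current: int, healthy: bool) -> int:
    """Move to the next step when the canary looks healthy, otherwise hold."""
    if not healthy:
        return current  # stay put while bugs are being fixed
    for step in RAMP_STEPS:
        if step > current:
            return step
    return RAMP_STEPS[-1]  # already at 100%
```

Holding at the current step rather than advancing keeps any bug confined to the customers who are already on Stripe.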

The above is just one example, but canary releases can be used in a multitude of other scenarios. 

For instance, Facebook uses canary deployment for mobile app releases. Recently, when we started seeing time insights on Google Calendar to show the breakdown of time spent on meetings, some of my friends saw the feature a week or so ahead of me. As you can see, even billion-dollar companies utilise canary deployments to release at the scale of millions of users if not billions of users.

Regardless of the scale, the idea of canary deployment is very powerful. Even if you have hundreds or thousands of users, testing out new features in production with a small set of real users can be very beneficial. In the next section, we’ll look at some of the drawbacks of canary deployments.

Downsides of canary deployments—and how to resolve them with feature flags

Canary deployments, despite their many virtues, can introduce complexity to your setup. Luckily, this complexity is easily mitigated by running canary deployments with feature flags (and feature flag software). Let’s look at a few of the potential drawbacks.

Additional infrastructure & increased complexity

Canary deployments usually necessitate additional infrastructure to manage the code and can introduce more complexity to your codebase as well. But this infrastructure is provided (and made simple) with feature flag tools like Flagsmith. You can run a canary deployment in a straightforward UI and change things like user percentages without needing to redeploy. 

Management overhead and technical team overhead

Canary deployments generally require more management, particularly by engineering teams who might need to set up the deployment for product teams.  

With a feature flag tool, the deployment becomes simpler to manage and can be managed by non-technical teams and product managers without needing engineering support.  Essentially, the person managing the canary deployment doesn’t need to also manage the code. 

Benefits of canary deployment

Without a doubt, canary deployment minimises the risk of releasing a new version of your software. There are other benefits too. Some notable ones are:

Capacity testing

We can generally predict the resources needed for a new system and give a range of how much traffic it will get. This is important when we want to introduce a new microservice that is replacing an existing older system. With a canary release, we can divert 1% of the production traffic, for example, and test our assumptions about the resources the service will need. Any performance issue or bottleneck can be identified earlier and therefore solved faster. With the safety valve of a canary release, we can turn the traffic back to 0% on the new system while issues are being fixed.
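As a rough worked example of testing resource assumptions, the sketch below extrapolates from what the canary observes at a small traffic share. It assumes load scales linearly with traffic, which real systems often do not, and all the numbers in the usage note are invented for illustration.

```python
import math


def estimate_instances(canary_rps: float, canary_share: float,
                       rps_per_instance: float, headroom: float = 0.3) -> int:
    """Estimate instances needed at 100% traffic from canary measurements.

    canary_rps: requests per second the canary currently receives
    canary_share: fraction of production traffic sent to the canary (0.01 = 1%)
    rps_per_instance: throughput one instance handled comfortably in the canary
    headroom: extra capacity kept in reserve (0.3 = 30%)
    """
    if not 0 < canary_share <= 1:
        raise ValueError("canary_share must be in (0, 1]")
    full_rps = canary_rps / canary_share  # naive linear extrapolation
    return math.ceil(full_rps * (1 + headroom) / rps_per_instance)
```

If the canary receives 12 requests per second at a 1% share and one instance copes with 200 requests per second, `estimate_instances(12, 0.01, 200)` returns 8. A figure like this is a starting point to check against at each later ramp step, not a guarantee.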

Early feedback

Even though we thoroughly test new systems or features in staging, it cannot be said with certainty how the feature will turn out with the full production traffic load. There might be edge cases that are only discovered when the feature is used in a production environment. This is where a canary release helps us get critical early feedback from a very small subset of the production traffic without affecting most users. With early feedback, we can change the feature if need be and make it even better for the next set of users who will get the new feature.

Easy rollback

Canary deployment is also about having a safety net to roll back to a working system while the new feature/version is being introduced. In the example of the payment gateway, let’s say we saw a major bug in the Stripe implementation when we released the new payment gateway to 1% of the traffic. The rollback would be very easy: we simply update the feature flag, changing its targeting from 1% of customers back to the segment of software engineers who developed the feature. This ease and high level of control over a new feature or version of the software is the power of canary releases.
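A rollback decision like this can also be guarded by a simple check. The sketch below is one possible rule, not a standard: the thresholds, the flag dictionary shape, and the segment name are assumptions for illustration.

```python
def should_roll_back(canary_errors: int, canary_requests: int,
                     baseline_errors: int, baseline_requests: int,
                     max_ratio: float = 2.0, min_error_rate: float = 0.01,
                     min_requests: int = 100) -> bool:
    """Return True when the canary's error rate is clearly worse than the baseline's."""
    if canary_requests < min_requests:
        return False  # not enough canary traffic to judge yet
    canary_rate = canary_errors / canary_requests
    baseline_rate = baseline_errors / max(baseline_requests, 1)
    # Roll back only if the canary exceeds both a relative and an absolute threshold.
    return canary_rate > max(baseline_rate * max_ratio, min_error_rate)


def apply_rollback(flag: dict) -> dict:
    """Return a copy of the flag targeted back to the development team only."""
    return {**flag, "customer_percent": 0, "segments": ["dev-team"]}
```

With 30 failures in 1,000 Stripe checkouts against 50 failures in 10,000 Braintree checkouts, `should_roll_back(30, 1000, 50, 10000)` is true, and `apply_rollback` shrinks the release back to the developers without a redeploy.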

A/B testing

A/B testing, though only recently widely adopted, is a 100-year-old method for comparing two versions of something to find out which one performs better. In software teams utilising canary deployment, we naturally serve two different versions of the software to users as we introduce the new or updated feature. This gives us an opportunity to review users’ reactions to the update and learn whether it works better. Generally speaking, A/B testing is done for a specific period of time with a control group and a variation group, and the data collected decides which version to choose. Canary releases are a great way to incorporate A/B testing into your releases and get the most out of them.
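One common way to compare a control and a variation group is a two-proportion z-test on conversion rates. The sketch below uses only the standard library; it is a simplified illustration and leaves out practical concerns such as sample size planning and stopping rules. The function name and the example counts are made up.

```python
import math


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    Group A is the control (current version), group B is the variation (canary).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no detectable difference
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For instance, `two_proportion_z(100, 1000, 150, 1000)` compares a 10% conversion rate with a 15% one and yields a z-score above 3, which suggests the difference is unlikely to be chance alone.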

There are other benefits but the above-mentioned are the main ones. Now that you know all the reasons to use a canary deployment, let’s discuss how to put the concept into practice.

How to use a canary deployment

Without delving too deep into the technical details, canary releases can mainly be done in two ways. Below, we discuss both ways to enable canary deployment—first from the infrastructure level and second from the code level.

Canary deployments from the infrastructure level

The first way to manage a canary deployment is to control the traffic flow to the new version or feature from the infrastructure level. Google Cloud Run is one product that allows this without going into the nitty-gritty of load balancers or service meshes. It gives the user the ability to route x% of traffic to a newly deployed version. It also allows splitting traffic with tags for multiple versions, truly handling canary releases from the infrastructure level without the hassle of understanding the details underneath. AWS Elastic Beanstalk also supports traffic splitting, as detailed in its docs.
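Conceptually, percentage-based traffic splitting boils down to a weighted choice per request. The simulation below is a toy model of that idea and does not reflect how Cloud Run or Elastic Beanstalk implement it; the revision names are hypothetical.

```python
import random
from collections import Counter


def pick_revision(weights: dict, rng: random.Random) -> str:
    """Pick the revision that serves one request, given percentage weights."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must add up to 100")
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights: dict, requests: int, seed: int = 0) -> Counter:
    """Count how many of the simulated requests each revision receives."""
    rng = random.Random(seed)
    return Counter(pick_revision(weights, rng) for _ in range(requests))
```

Calling `simulate({"checkout-v1": 95, "checkout-v2": 5}, 10000)` sends roughly 500 requests to the new revision. Note that this toy model decides per request, so without some form of session affinity the same user could see both versions.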

Obviously, unless we have a great DevOps/SRE team that enables us to do this with our existing infrastructure, this is more of a pipe dream than a reality. We will actually need to dive deep into load balancing and blue-green deployment, rolling deployment, and the like to get this into practice. This brings us to the next way to do it, from pure code and feature flags.

Canary releases enabled by code and feature flags

The second way to enable canary deployment and releases is to have the feature code always deployed and control the reach of that new feature code to a certain set of users with code and conditions. This is exactly where feature flags shine. Using feature flags, we don’t need a large SRE team to enable us to do canary or phased rollouts. Software engineers can do it themselves with the clever use of feature flags and a feature flag platform like Flagsmith. Combining the release with segments and identities adds that needed level of options and flexibility for canary releases.

With optimal use of feature flags or multivariate flags, we can send 5% of the customers to Stripe and 95% to Braintree as we saw in the above example. Below is a sample screenshot of a multivariate flag configured for 5% Stripe and 95% Braintree usage:

Feature flag in Flagsmith being used for canary rollout with 5% weight on the new feature

Furthermore, changing the canary size becomes very easy, and so does deciding who can make that change. With feature flags, it is simply a matter of going to an interface and changing the values from 1 and 99 to 5 and 95; now 5% of the traffic sees the new version or feature, as shown above. This particular change can be done by someone non-technical like a product manager. This is much easier than the infrastructure option discussed earlier, where each change in the canary size might need a new deployment or a change to the infrastructure, which is far riskier.
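To show why changing the weights needs no code change, here is a minimal sketch of how a weighted multivariate flag might be evaluated. It is not Flagsmith's actual algorithm or SDK; the function name, flag name, and hashing scheme are assumptions for illustration. In a real setup the weights would come from the feature flag platform rather than being passed in by hand.

```python
import hashlib


def choose_variant(identity: str, flag_name: str, weights: dict) -> str:
    """Pick a variant for this identity, given integer percentage weights.

    The result is deterministic per identity, so a customer keeps the same
    payment gateway between visits as long as the weights are unchanged.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must add up to 100")
    digest = hashlib.sha256(f"{flag_name}:{identity}".encode("utf-8")).hexdigest()
    point = int(digest[:8], 16) / 0x100000000 * 100  # stable value in [0, 100)
    cumulative = 0
    for variant, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return variant
    raise AssertionError("unreachable when weights add up to 100")
```

Moving from `{"stripe": 1, "braintree": 99}` to `{"stripe": 5, "braintree": 95}` only changes the data passed in. Because "stripe" is listed first, customers who were already on Stripe at 1% stay on Stripe at 5%.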

As with other things in software engineering, it is always good to follow best practices for feature flags. Depending on your language and framework of choice, you can try out feature flags on Flagsmith with Node.js, React.js, Flutter, and even iOS. Please check out all of our SDKs on our GitHub profile.

Conclusion

Canary deployments and releases provide us with fine-grained control of the users we want to release a new feature or version to. They reduce the probability of a potentially buggy feature being released to your whole customer base and add confidence for your business, as the new feature will first be made available to only a small subset of users. Utilise the safety net provided by canary releases and be confident in releasing experiments and even not-fully-baked features to a very small percentage of users to get early and super valuable feedback.

Canary releases can help you release safely and with more agility, letting you collect valuable real-world feedback before making any big changes. Always release software responsibly with user impact in mind!
