Build vs. Buy for Feature Flags: My Experience as a CTO with a 20+ Engineer Team
Our need for a feature flag system
I was the co-founder and CTO of BibliU, an EdTech SaaS with more than 100k monthly users. Our team had over 20 engineers, and we hit a point where we needed a robust feature flag system.
Feature flagging was critical for rolling out changes to a few users at a time and testing them in the real world without having to launch them fully. As CTO, this was super important to me, particularly after the rollout issues we'd had with feature releases in the past.
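To make "a few users at a time" concrete, here is a minimal sketch of a percentage rollout. It is not the code we ran at BibliU; the function name `in_rollout` and the idea of hashing the flag name together with the user ID are assumptions chosen for illustration. Hashing gives each user a stable bucket, so the same user keeps seeing the same behaviour as the percentage is raised.

```python
import hashlib


def in_rollout(flag_name: str, user_id: str, percentage: float) -> bool:
    """Return True if this user falls inside the rollout percentage (0-100) for a flag.

    Hypothetical helper for illustration only.
    """
    # Hash flag + user so each flag gets its own stable, independent bucketing.
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex characters onto a bucket from 0 to 9999.
    bucket = int(digest[:8], 16) % 10_000
    # A percentage of 5 admits buckets 0..499, i.e. roughly 5% of users.
    return bucket < percentage * 100
```

Because the bucket depends only on the flag name and user ID, raising the percentage only ever adds users to the rollout; nobody who already has the feature loses it.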
The criticality increased as more teams were involved in releases and our needs shifted from just rolling out new features to comparing the success of the changes we were releasing.
Back then, my job involved overseeing a number of engineering teams working across various platforms, including web, native desktop and mobile apps, and a web-based admin portal. It also involved creating a coordinated relationship between our product and engineering departments.
We ended up building feature flags in-house. In this article, I'll talk about how we weighed up build vs. buy, in-house vs. off-the-shelf, and the challenges we faced building our own feature management system (some weren't clear until we actually got into it).
In-house or off-the-shelf?
First up, we had to choose between making our own feature flagging tool or using something from outside. We went with building it ourselves.
Here's why:
- Easiest path: At BibliU, we dealt with universities and had strict data control agreements. Every time we brought in an outside data processor, it meant getting permissions from customers, which was a slow process. So, making stuff in-house often appeared easier from the outset.
- Thinking we were unique: In retrospect, I did this a lot earlier in my career, thinking our needs were so special that off-the-shelf solutions wouldn't cut it. For instance, we thought we needed a way to turn on features for specific university classes and assumed it'd be too hard with a third-party tool. We could've used something like Flagsmith's identity traits, but we didn't look into it enough.
- The way engineers get promoted: Often, engineers get promoted for creating new things rather than implementing existing tools. This mindset can lead to choosing sub-optimal solutions overall in exchange for giving engineers additional material to discuss at performance review and promotion meetings. At the time, I didn't see it as a big issue, but looking back, it might've played a part.
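To make the class-level targeting from the second point above concrete, here is a minimal, generic sketch of trait-based flag evaluation. This is not Flagsmith's SDK or API, and it is not what we built; the names `Flag`, `FlagRule`, `is_enabled` and the `class_id` trait are all hypothetical. The idea is simply that a user is described by a set of key/value traits and a flag turns on when a trait matches.

```python
from dataclasses import dataclass, field


@dataclass
class FlagRule:
    """Hypothetical rule: the flag is on when a trait matches an allowed value."""
    trait: str
    allowed_values: set[str]


@dataclass
class Flag:
    name: str
    enabled_by_default: bool = False
    rules: list[FlagRule] = field(default_factory=list)


def is_enabled(flag: Flag, traits: dict[str, str]) -> bool:
    """Evaluate a flag for one user, described only by their traits."""
    for rule in flag.rules:
        # A missing trait yields None, which never matches an allowed value.
        if traits.get(rule.trait) in rule.allowed_values:
            return True
    return flag.enabled_by_default


# Usage: enable a feature for two specific classes only (illustrative IDs).
new_reader = Flag("new_reader", rules=[FlagRule("class_id", {"BIO-101", "CHEM-200"})])
student = {"university": "example-u", "class_id": "BIO-101"}
print(is_enabled(new_reader, student))  # True
```

Seen this way, "turn on a feature for a specific class" is an ordinary targeting rule rather than a special requirement, which is roughly what we failed to check before deciding to build.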
The challenges we faced with in-house feature flags
Cross-team development
We had different teams with their own responsibilities structured around customer types, but to have a consistent release procedure we needed the entire engineering team on board. We decided to assign the full-stack team that was responsible for our student apps to feature flags, as they needed the capability the most and were comfortable with our entire tech stack.
However, when we started planning, we discovered our first issue. The team building the in-house feature flags held most of the requirements themselves and worked together daily, but they also had to account for the completely different needs of other teams.
Additionally, fitting this new system into our existing setup was tricky. We’d built complex logic for grouping users that needed to align with the feature flag system. And to make things more complicated, another team was responsible for this user grouping system.
This led to a fairly lengthy and time-consuming planning and development stage, but thanks to all of the extensive scoping work up front it was not as bad as it could have been.
Ease of use for other teams
We took a major shortcut in the scope of the system by making the feature flag changes only accessible through manual database edits instead of building a fully featured user interface.
Initially, this was fine since the engineers who built the system were its first users. But as the system was used more widely, our product managers and other teams needed to use it too. This led to a reliance on engineers to make changes, which, due to the lack of input validation, had to be done extremely carefully. This slowed release cycles down and resulted in a higher operating cost.
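Even without a UI, a small validation step in front of the database would have removed much of that risk. The sketch below is an illustration rather than our actual schema: the field names `name`, `enabled` and `rollout_percentage`, and the function `validate_flag_change`, are assumptions.

```python
ALLOWED_KEYS = {"name", "enabled", "rollout_percentage"}  # hypothetical schema


def validate_flag_change(change: dict) -> list[str]:
    """Return a list of problems; an empty list means the change looks safe to write."""
    errors = []

    unknown = set(change) - ALLOWED_KEYS
    if unknown:
        errors.append(f"unknown fields: {sorted(unknown)}")

    name = change.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")

    if not isinstance(change.get("enabled"), bool):
        errors.append("enabled must be true or false")

    pct = change.get("rollout_percentage", 100)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(pct, bool) or not isinstance(pct, (int, float)) or not 0 <= pct <= 100:
        errors.append("rollout_percentage must be a number between 0 and 100")

    return errors
```

A script that refuses to write a change unless this function returns an empty list is far cheaper than a full UI, and it turns "be extremely careful" into something the system enforces.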
We thought about building our own UI or using a tool like Retool, but both had issues. Building it ourselves would take time and require maintenance. Using Retool without an API meant allowing a third party to have direct database access, which could mess with our GDPR/data protection approvals, effectively negating one of the primary reasons we’d decided to build in-house in the first place.
Another piece of work that took a lot more effort than I had anticipated once the system was completed was documentation, so that other engineering teams could use the system as well as the people who had built it. This required a fair bit of ongoing work to outline all of the potential use cases, and the manual database edits made it harder still.
Some hidden benefits
That said, building the system in-house rather than getting something off the shelf had its upsides. We didn’t have all the features a third-party system would have had, but I knew it would meet our requirements exactly. The team who built the system was also already very well trained in its operation! Additionally, if we wanted an extra feature added or an existing feature changed, we had full control over the system and knew exactly how it worked.
Maintenance and Reflections
Even though building our own system worked out, it had its maintenance costs. As the feature flags were tightly coupled with our user and user grouping logic, the system needed to be extremely well monitored and maintained. Although we didn’t have any major outages due to our system, it still increased the complexity of the platform.
Applying Lessons to HowdyGo
Looking back, making our own feature flag tools was a good move, but now, seeing feature flagging software out there like Flagsmith, I think we might’ve made some different decisions.
When you run a build vs. buy analysis, you should consider the following decision criteria:
- Is it a core competency of your company? If it’s core and fundamental to your product offering, as well as a key differentiator from your competition, having greater control by building it in-house can be extremely useful down the road.
- How critical is it? Is downtime acceptable or would it cause serious issues? Does the off-the-shelf solution have a well-regarded and professional engineering organization that you consider better than your own, or is it the other way around?
- Does the cost of the off-the-shelf solution have a scalable structure that will expand with usage and value, or is it high-cost immediately? If you build it yourself, what is the ROI of having an engineer build and maintain this in perpetuity?
- Is the build and maintenance confined primarily to one team, or are multiple teams going to be impacted by the build, changes and maintenance of such a product? As your customer base, team, and volume of requests grow, home-grown systems can become strained.
- Are there lots of potential stakeholders? Regardless of how technically competent they are, having lots of potential stakeholders can push your team in multiple (potentially competing) directions. Having lots of stakeholders is also a great way to have major scope creep in the initial build and maintenance phase.
- What are the data privacy and regulatory requirements of your company and this piece of the service?
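The cost question in the list above can be turned into simple arithmetic. The sketch below is one rough way to frame it, not a formula we used; the function names and every parameter are assumptions, and you would plug in your own figures.

```python
def build_cost(build_weeks: float, maintenance_weeks_per_year: float,
               weekly_engineer_cost: float, years: float) -> float:
    """Engineer time to build once, plus ongoing maintenance over the period."""
    return (build_weeks + maintenance_weeks_per_year * years) * weekly_engineer_cost


def buy_cost(annual_subscription: float, integration_weeks: float,
             weekly_engineer_cost: float, years: float) -> float:
    """Subscription fees over the period, plus the one-off integration effort."""
    return annual_subscription * years + integration_weeks * weekly_engineer_cost


def cheaper_option(build: float, buy: float) -> str:
    """Name the cheaper option on cost alone; ties go to 'buy'."""
    return "build" if build < buy else "buy"
```

This only captures direct cost. It says nothing about the other criteria in the list, such as data privacy approvals or how many teams are affected, which in our case mattered at least as much.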
With my new startup, HowdyGo, I'm applying these lessons, ensuring we make smarter choices, especially when deciding whether to build in-house or use third-party tools. For example, we no longer jump straight into building our systems in-house, and we often look more closely at the different third-party solutions available.
That said though, a lot of this is down to necessity as we are bootstrapping the company with a much smaller team than previously, so we can’t just build everything in-house unless it’s core and critical to our product.