OpenFeature Special | The Craft of Open Source Podcast
As you dive into more and more of what feature flags offer, you realize how much it can actually clean up your code base.
Check out our open-source Feature Flagging system – Flagsmith on Github! I’d appreciate your feedback ❤️
Why don't you each introduce yourselves and share what your role is at Dynatrace?
I'm Mike Beemer. I work at Dynatrace as a Senior Product Manager on the Cloud Automation team focusing on release automation topics and OpenFeature as well. We'll talk about that quite a bit here.
I'm Todd Baert. I'm a Senior Software Engineer at Dynatrace. I focus pretty much entirely on OpenFeature at this point. In the past, I had a lot of experience with identity and access management, which is another spec-heavy area of the software engineering industry. I'd like to think it helps me with our current efforts a bit.
Can you both talk a little bit about the work you are doing at Dynatrace? Does it move back and forth between open source and closed source? How does that work?
I'm balancing between the two worlds at the moment doing a decent amount on the open-source side, and then working on how we can take advantage of some of the standards that we are building and use that as part of the Dynatrace platform itself. If we tie it back to OpenFeature and feature flagging, how do we start showing some analytics around that and giving some confidence that a new feature had the intended effect on the system?
I'm working entirely on OpenFeature. Integration points and extensibility were important for Dynatrace's interest in the project. We have things like the OpenTelemetry hook, which we may get into in this conversation. As Dynatrace is not a feature flag vendor, our interest in the project is based on the fact that we collect a lot of data and telemetry. We are moving towards being a data platform for our customers.
Being able to standardize something like feature flags benefits us because it allows us to collect telemetry more effectively and make sense of data that was once disparate but is now normalized and consistent. That's one of the reasons why OpenFeature has things like telemetry and other integration and extensibility considerations baked into it from day one, which, in retrospect, has worked out even better than we would have hoped. I hope that answers your question.
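To give a concrete flavor of the OpenTelemetry hook Todd mentions, here is a minimal TypeScript sketch, assuming the OpenFeature JS SDK's hook interface and a recent OpenTelemetry API. The attribute names follow OpenTelemetry's feature-flag semantic conventions; the official hook packages do something similar but more complete, so treat this as illustrative rather than the actual implementation.

```typescript
import { OpenFeature, Hook, HookContext, EvaluationDetails, FlagValue } from '@openfeature/js-sdk';
import { trace } from '@opentelemetry/api';

// Record every flag evaluation as an event on the active span, so traces
// can later be correlated and split by flag key and variant.
class TracingHook implements Hook {
  after(hookContext: HookContext, details: EvaluationDetails<FlagValue>) {
    trace.getActiveSpan()?.addEvent('feature_flag', {
      'feature_flag.key': details.flagKey,
      'feature_flag.variant': details.variant ?? String(details.value),
      'feature_flag.provider_name': hookContext.providerMetadata.name,
    });
  }
}

// Register globally: the hook now runs on every evaluation, for any provider.
OpenFeature.addHooks(new TracingHook());
```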
In terms of the genesis of OpenFeature, does Dynatrace have a positioning or strategy for this kind of thing? How does it choose whether to work on something out in the open or closed source? Are there set policies around that? Maybe that leads into where the OpenFeature genesis came from, given that OpenFeature is meant to be a de facto standard. How does the company figure out whether code should be open or closed source?
Historically, it's something that Dynatrace hadn't been very active in. Primarily due to OpenTelemetry, Dynatrace got fairly heavily involved in open source, and that opened the door for other projects, including OpenFeature. It's still evolving, and the line isn't clear for the most part. If it's something that isn't considered proprietary, we try to open-source it.
That line is not clearly defined. It's project by project. We now have an open-source program office at Dynatrace as well to help drive some of these initiatives and apply some standardization across repos and things like that. I think that will help too. You'll start seeing more and more open-source projects come out of Dynatrace.
The very nature of the project lends itself to open-source. When you are developing an open standard, you can't do that in a closed way. We have a whole bunch of SDK implementations of the standard, and they are all open-source as well. A lot of it comes down to the particular thing that Dynatrace is interested in developing.
In this case, it was bringing consistency to an industry that had a lot of different mature solutions, but they are all different. They are all divergent, and yet they all have some stuff in common. It seemed like a good target for standardization. To Mike's credit, he noticed that early and brought it to some of the higher-ups at Dynatrace and they saw a lot of potential in the project. I think it makes sense with our goals with this project.
Was there a penny-drop moment where you were like, "This is crazy?" Talk that through.
It's pretty similar to the Flagsmith origin story. I was on a dev team, we needed feature flags, and we ended up rolling our own solution. There was no obvious tool that we wanted to use right away, and our use cases seemed basic, so there was no reason to over-engineer. We ended up rolling our own, but it turns out we weren't the only team at Dynatrace that did, and it turns out that's a pretty standard pattern at a lot of companies. You are almost stuck at the code level at that point. You can certainly change to another tool, but then you have to change your whole code base.
Once I started realizing that was an issue for us and others, I thought about it from a Dynatrace perspective: from a monitoring perspective, you are blind to feature flags. If you use any observability tool right now with feature flagging, you do not see the flags on a trace. Those two things seemed an ideal fit to me, where you could standardize flagging to make internal tooling easy to use, and then have a path to upgrade to an enterprise-level vendor.
We could also add that into Dynatrace to start seeing how flags affect performance, failure rates, and things like that. It seemed like a nice fit. I was surprised, when we started looking at offerings and the CNCF landscape, that there was nothing in the feature flag space. When I noticed those three things, that's when I made the initial proposal that it might make sense to start seeing what we can do in the space.
OpenTelemetry has been super successful. Was there a similar origin story to that? Do you know who originally had the idea for that?
I don't know the full history. I do know that it's two projects that ended up merging. I know that some people from Google and then LightStep were pretty early in the process of coming up with the initial pieces of that collaboration, but it does solve the same problem that we have set out to solve with OpenFeature. They do that for observability.
It's at a grander scale. The problem that OpenTelemetry is trying to solve is quite a bit more complex, which benefits even more from community support. There's the instrumentation that has to be built and a lot of pretty complex tooling that doesn't make sense for every vendor to build on their own. Think about it from a startup's perspective too. There's almost no way a smaller company could get into the observability space without something like OpenTelemetry. It's far too complex to get started with.
Even for a service as simple as feature flags, we underestimate, and continue to underestimate, how hard writing high-quality SDKs for different languages and platforms is. Having to learn the idioms of packaging and practice for particular languages is a hugely complicated thing, even for something as simple as a boolean. We are constantly underestimating it. I can't even imagine. You are right. With something like observability, you would be looking at a sheer rock face; there's no way you could achieve that. In terms of the OpenFeature birth, you put together a proposal and ran it up the flagpole?
One of the leaders at Dynatrace I had worked with on a previous project reached out and wanted to grab a coffee. I hadn't talked to him in quite a while. I was telling him about where I was in life, and it came out that I was interested in this topic. It turned out that they were also thinking about this as a potential topic. He asked me to put together a proposal. I did that, and a few months later, I was working on the project. It moved pretty quickly, but it was the right idea at the right time.
How did you get involved, Todd?
I worked at a vendor that offered feature flags as one of its SaaS products, and I wasn't even on that team. I was working in the IAM space at the time, and we made heavy use of that SaaS feature flag product internally. I loved it. I did not appreciate feature flags when I first heard about them, and I don't think that's an uncommon experience. My initial take was, "Do I need that? I have config." As you dive into more and more of what it offers, you realize how much it can clean up your code base.
We had lots of what you would describe as rules or targeting in a feature flag solution of any kind baked into our code. That's super common, and it's the last thing you want. You don't want some section in your code, probably repeated in a few places, that says something like, "User is from such-and-such EU country, disable the feature because we are working around some GDPR issue," or, "This is a legacy customer on an old platform, so don't show this feature."
You have code like that sprinkled all over your code base. If you are doing a microservice-type architecture, you probably have it in multiple services. Being able to extract that logic, maintain it outside of your code base, and change it dynamically on the fly is extremely valuable. I recognized that as I was working for that company. Mike reached out to me; it must have been a month or so after he ran this idea up the flagpole. He showed it to me and said, "I have an idea for a project. Do you want to quickly meet?" I thought it was such a good idea. I had to come on board.
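To make the pattern Todd describes concrete, here is a hedged TypeScript sketch, assuming the OpenFeature JS SDK; the flag key, context fields, and helper names are all illustrative. The point is that targeting rules move out of the code and into the flag system.

```typescript
import { OpenFeature } from '@openfeature/js-sdk';

interface User { id: string; country: string; platform: string; }
const EU_COUNTRIES = ['DE', 'FR', 'IT']; // illustrative

// Before: targeting rules baked into the code base, likely repeated across
// services, and only changeable with a redeploy.
function shouldShowCheckoutHardcoded(user: User): boolean {
  if (EU_COUNTRIES.includes(user.country)) return false; // GDPR workaround
  if (user.platform === 'legacy') return false;          // legacy platform
  return true;
}

// After: the rules live in the flag system and can change at runtime; the
// code only supplies evaluation context and asks for an answer.
async function shouldShowCheckout(user: User): Promise<boolean> {
  const client = OpenFeature.getClient();
  return client.getBooleanValue('new-checkout', false, {
    targetingKey: user.id,
    country: user.country,
    platform: user.platform,
  });
}
```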
How does that work in terms of the organization? Is it regarded as a peer to the commercial components of the Dynatrace platform?
I would say from an internal team perspective, I imagine it would be an easier sell at Dynatrace and not a completely foreign idea because we already have engineers working on completely open-source projects. OpenTelemetry is the big example, but we have engineers putting time into other open-source projects too.
Even the UI components that Dynatrace uses are open-sourced. There are UI engineers working on those open-source components. It felt natural joining with the intent to work on this open-source project. It didn't feel like a stretch at all. I had peers and comparable projects that I could mesh with right away. It was an easy transition.
From Dynatrace's side, because of the experience on some of the other projects, they knew what they were getting into in this case. It wasn't unexpected, except maybe the speed at which things develop, as we talked about earlier. It's a little bit slower than if it were all closed source and we were working on a team that we fully manage. The pros in this case outweighed the cons, and leadership was all for going with this approach.
Mike, what was your original vision for OpenFeature? Did you want to build a common-denominator SDK to solve the divergence issue? Was there a grander idea in your head?
For me, it's about how it could fit into a lot of the data that's being collected. It's not so much about toggling the flag. That's pretty cool, but the interesting part to me is how that affects the system. If you look at a lot of feature flag vendors right now, you can see that when you toggle this flag, this is the response time.
If you start thinking about whether we are doing some split, or maybe we are only enabling this feature for a subset of users, that may not be enough to significantly move the overall response time. But if we can split that metric based on those flag values, it becomes much more interesting from an analytics perspective. What gets me excited is if we can start automating those types of metric collections to do some interesting behavior analysis on the subsets of users that see a particular feature.
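A hedged sketch of what that splitting could look like, using the OpenTelemetry metrics API in TypeScript: record the flag variant as a metric attribute so a backend can break response time down per variant instead of showing one blended number. The metric and attribute names here are illustrative assumptions.

```typescript
import { metrics } from '@opentelemetry/api';

const meter = metrics.getMeter('checkout-service');
const duration = meter.createHistogram('http.server.duration', { unit: 'ms' });

// When handling a request, record which flag variant the caller saw. The
// analytics backend can then split the histogram by 'feature_flag.variant'.
function recordRequest(durationMs: number, variant: string) {
  duration.record(durationMs, {
    'feature_flag.key': 'new-checkout',
    'feature_flag.variant': variant, // e.g. 'on' vs. 'off'
  });
}
```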
That's where I see it going. This is a way to make it work across all systems. At least from a Dynatrace or monitoring perspective, it doesn't matter what's toggling the flag or managing the flag life cycle; what matters is how that affects system behavior. That's the stuff that is interesting to me personally.
To close that loop, you've gone from the many-to-many problem of having many telemetry providers and many feature flag providers. With OpenTelemetry, you've still got a one-to-many problem there. The holy grail for you is to make that one-to-one, and then you can turn it all on and close that information loop entirely. Even a vendor like ourselves doesn't have to worry; we get that for free.
If you had dashboards in Flagsmith, for example, you could query maybe Dynatrace API to get some of those metrics if you needed to display it, or maybe toggle a feature based on a detected issue or whatever the case may be. It unlocks a lot of use cases between both platforms.
How do you measure the progress and success of the project, given that you are now going through the stages of the CNCF? You're probably not moving as quickly as you would if you were used to working with a closed, private team. How do you manage that? How does Dynatrace recognize that?
It's always evolving too. Some weeks it seems we are making a ton of progress, and other weeks it's painfully slow. For me, it's a learning experience. I'm used to being able to create issues in an internal system and assign them to specific developers, and it more or less gets done in the time period you are expecting.
In the open-source world, that is not always the case. For better or worse, sometimes the reason for the delay is interesting ideas from community members, or maybe a pivot that makes sense based on other people's experiences. That makes it a little bit challenging, but the benefits in a lot of cases outweigh the negatives.
One thing that's pretty cool about OpenFeature is there are only a handful of Dynatrace people working on it, but we are still making a lot of progress across a lot of SDKs. The Flagsmith team is helping out with the Python SDK, for example. Other companies and individuals are contributing in very useful ways. That ties back to maybe defining the spec pretty early on.
We are still evolving the spec, but having this common specification available makes it a little bit easier to self-serve. For example, the .NET SDK that's being worked on right now is driven completely asynchronously based on the specification. That's pretty cool. It shows the power of the open-source community, but also of defining a spec that's reasonably easy to follow for developers who don't regularly attend the community meetings.
Have you worked on a project similar to this?
I have not.
Nor have I. I have done a couple of one-off contributions to open-source projects, but it's always been one-off contributions because I found a bug or there was a feature I was interested in. Before working on this particular project, I had never invested this amount of time and energy into an open-source project.
What has your experience been working on it? Has anything surprised you, or is it as you expected?
I'm always surprised at the diversity of ideas that come out of working in public. On almost every PR, especially if it's a big change, somebody brings up some amazing point, a cool pattern, or an idea for an integration. Working with many eyes on what you are doing feels different. When you are on a small team with a pretty constrained internal goal and more hierarchical roles, you don't get that feedback as frequently. That has been cool to be involved in.
Seeing the project head in that direction almost naturally, there are goals and things we want to do, but everybody brings in their own goals. Everybody brings in their own passions, so then the project takes a shape and a direction of its own that is vetted by everybody involved. That's a cool dynamic too. As the project matures, it's interesting to see how our metrics for success also change.
To go back to the question you were asking before, initially, seeing a lot of vendor interest was cool. Seeing people contribute their research was awesome, and now we have people building SDKs, like Mike was talking about. We are at a point now where, in our community calls, people jump in and say, "I'm here because I have this particular challenge and I think OpenFeature could help." We are going to go from seat-of-our-pants rapid iteration to something with a clearer roadmap, but all along the way, the metrics for success keep changing. That's challenging in one way but exciting in another.
It's interesting as well working with commercial organizations, because they all naturally pull things a little bit in the direction they want to go. Where that ends up settling is the thing that surprised me. They are not competing, but they are not all pulling in exactly the same direction, and you end up with a set of goals that coalesces naturally, which to me is super interesting to see. Mike, did you find it hard to get it off the ground outside of Dynatrace and to get people like ourselves interested? There are people contributing who work for eBay. How did they cross paths with you?
I suppose that might be what gives me confidence that we are moving in the right direction, or at least that the messaging resonates with most people. We didn't struggle. Once we had the right context and presented what we were trying to do to the right audience, it was a pretty easy conversation for the most part. There's a handful of people at Dynatrace who have quite the Rolodex.
They have been involved in a lot of different CNCF projects and have been doing this for many years. They had the right context. They have a lot of followers on Twitter and that type of stuff. All those things add up to where we are now. Once you are starting to get involved in CNCF itself, then that handles part of the advertising as well.
We are on the CNCF landscape, and people see that, and then they want to join and contribute. That has been satisfying for me. We'll see people find an issue and say, "Assign this to me. I will take it." That's pretty cool. That's the exciting part about open source. People are willing to contribute, get excited about these ideas, and help push them forward for everyone's benefit.
Does OpenTelemetry, at the moment, have the momentum to be a self-sustaining thing? Or does it still have people whose job is to keep pushing it forward?
I don't know the specifics, but we did talk to a couple of the core maintainers when we were out in Valencia, and they made that comment. If a few of them were to disappear or move on to other projects, the project would probably be in a difficult spot, despite it being a few years in and the second most popular CNCF project. There are a lot of things going for it.
You do need to have a few people who at least propose the direction, and then it's up to the community to decide if that's where they want to go. Back to one of your other points, I think that's also why a lot of different vendors are involved to some extent: they each want to have their say and make sure the direction is going to benefit their long-term strategic goals.
Once a project hits a certain level of maturity, you start having those different companies do that. You have full-time engineers fully dedicated to the project's success, more or less. That's an interesting spot, and perhaps eventually OpenFeature will be in a similar spot. We already have some more or less dedicated people from different feature flag vendors and other places pushing this forward.
A big part of that too is that we were lucky in a few regards. We had OpenTelemetry and input from some OpenTelemetry contributors at Dynatrace who helped us with our project and shared some pitfalls. The other thing we were lucky to have, right at the get-go, was help from people who have maintained very mature open-source projects, namely Jenkins, contributing from the start on governance structure, contribution guidelines, and the charters that govern how people contribute and what our core values are. That was helpful because I didn't have experience with any of that, and I don't think Mike did either.
It was helpful to have that and it gives you a sense of calm around the fact that there are so many different competing interests involved. As we were talking about before, that is generally a good thing because everything is vetted, and you have this collectivized and aggregated goal that you are heading towards that maintains value for everyone.
At the same time, it's a little bit nerve-wracking to see large companies that have a lot of power and money behind them contribute to the project. We want to make sure that everybody has a say. Having proper governance in place and having those kinds of values and rules at the outset is a good thing to have.
Mike, for people who don't know too much about this, do you want to explain how you would define the CNCF, and then what the stages of the life cycle of a project can be through that organization?
The CNCF is part of the Linux Foundation. They are primarily focused on Cloud technologies, though as you can see with OpenTelemetry, projects are not necessarily explicitly required to work only on what you would consider Cloud-native technologies. It's a popular organization right now. They even say that KubeCon is the biggest open-source event in the world. It's a popular community, with a lot of contributors and interesting projects going on. That was part of the reason we thought it was a good fit, plus, with the OpenTelemetry angle, we had some experience and some contacts on those teams. That's partially why we got involved with the CNCF.
The process was a little bit slow, but it all makes sense. You need to put in the application. It comes up in a monthly meeting, and then they vote on it. If you do not make the cut, they recommend that you reapply in six months or so. That's the process. We are still in the process of filling out the transfer forms. Once you are accepted as a project, you have to transfer all the IP and everything over to the CNCF and the Linux Foundation. We are in the middle of that right now.
Back to the question on the different stages: there's the sandbox stage, which is where we are right now. They say it's got the lowest entry barrier, but you also get the least amount of promotion and things like that from the CNCF itself. Then they have the incubation stage, which is where most projects that have been in the sandbox for a while end up.
I believe OpenTelemetry is in that stage. That is where you can start having maintainer tracks and things at KubeCon. They provide a few more marketing-related services. I believe graduated is the last one; I'm not sure if you can confirm that, Todd. There are a handful of graduated projects, like Kubernetes.
Have there been any challenges that come up along the way that you weren't expecting?
The hardest one for me is that because it's not a single Git repo, the project management side becomes quite a bit more complex. Every new thing requires a repo, which then starts requiring a bunch of tooling around it. We still haven't fully addressed that situation. Some repos are in pretty decent shape; others are bare bones and missing some of the tooling they need. Even from the planning perspective, to break that out into an org-level roadmap, there is tooling in GitHub, but it's still marked as beta and there are a few limitations around it.
He brings up a good point. We must have 70 repos in GitHub, but at the level of the organization, the tooling to meta-manage those is not that great. That has been tough. If you start bringing repos together, that brings its own challenges as well. That's a hard problem to solve. I don't know how they would do it.
It's a tough one for sure. One of the things we are working on right now is a standard test harness and test library that all of the SDK repositories can share, so that we can get some standardization across our testing. A lot of these problems are probably too big and diverse for even GitHub to solve. It takes a lot of CI/CD work and auditing work. GitHub helps to some extent, and there are lots of plugins and good open-source tools that you can leverage, but at the end of the day, you are piecing a lot of these things together.
Maintaining all those repositories is a challenge. It's an ongoing challenge and something that we have to manage. It's one of the things that is the cost of doing business in an open-source type world. It never ceases to amaze me how interested people are in volunteering even for those sorts of tasks. Everybody has different passions.
When you are not drawing from a fairly constrained internal pool of individuals, but instead from the entire open-source software community, you have a lot of people who have a lot of different passions and interests. For a concrete example, I bootstrapped our new doc site. Documentation is super important. It's critical to a project like ours.
At the same time, my experience has been that a lot of people don't enjoy writing documentation especially when you are in a catch-up mode and writing documentation for something that already exists. We have had volunteers that are very interested in writing docs, and not just interested but offering insights into how we can structure our docs in inspired ways that have been proven successful for other projects. There is so much amazing expertise that comes from the open-source community. That's encouraging even though things like an explosion of repos can seem intimidating.
I have realized that in software engineering, the engineering is way more important than the software part. In my mind, it's all about the engineering, and the software is secondary. I have to keep telling myself, "If I fix build scripts or something, I feel bad that I haven't done any work because I haven't been writing code." Especially in open-source projects, you can spend weeks doing that and think, "I didn't do anything the last three weeks," when you did, because you were doing the engineering, not the software. That's one of the great things.
Also, we have seen this before: people contribute to the documentation. That surprises me. People are excited when some new documentation page gets written. You wouldn't expect it. Do you have personal goals in mind for when you'll feel like, "I'm up for my next challenge"? Where do you see that?
I don’t know if I have a particular goal in mind in that sense. I have specific goals for the project. Maybe not even goals because it's a little bit hazier than that. I think that what I have started to see is that there are some options for OpenFeature in terms of bringing some paradigm shifts to the existing feature flagging space.
I would love to see those paradigm shifts fully mature and see customers make use of these new patterns. I will be specific. One aspect of our project that I find compelling, and also different, is some of our cloud-native components. The primary thing, and the most important thing out of the gate, was the abstractions we were building on top of feature flag vendors, providers, or homegrown solutions, so that you could have one consistent API.
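Here is a minimal TypeScript sketch of that consistent API, using a hypothetical homegrown provider. The real Provider interface in the OpenFeature JS SDK has a few more members and stricter types, so treat these signatures as simplified. The takeaway is that application code only ever talks to the OpenFeature API, and swapping what powers the flags is a one-line provider change.

```typescript
import { OpenFeature, Provider, ResolutionDetails } from '@openfeature/js-sdk';

// A deliberately tiny, hypothetical provider backed by a plain object,
// standing in for whatever currently powers your flags.
const flags: Record<string, unknown> = { 'new-checkout': true };

class HomegrownProvider implements Provider {
  readonly metadata = { name: 'homegrown' };

  private resolve<T>(flagKey: string, defaultValue: T): Promise<ResolutionDetails<T>> {
    const value = flagKey in flags ? (flags[flagKey] as T) : defaultValue;
    return Promise.resolve({ value });
  }

  resolveBooleanEvaluation = (key: string, def: boolean) => this.resolve(key, def);
  resolveStringEvaluation = (key: string, def: string) => this.resolve(key, def);
  resolveNumberEvaluation = (key: string, def: number) => this.resolve(key, def);
  resolveObjectEvaluation = <T>(key: string, def: T) => this.resolve(key, def);
}

// Swapping to a vendor later means changing only this line.
OpenFeature.setProvider(new HomegrownProvider());

async function main() {
  const client = OpenFeature.getClient();
  console.log(await client.getBooleanValue('new-checkout', false)); // true
}
main();
```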
One of the interesting things in the project, especially as it matured, was seeing exactly what these cloud-native components would look like. They have two pieces. One is something we call flagd, which is a means of moving flag evaluation out of the client process, out of the application, and into another process. That has a lot of cool benefits. One of the things it allows is for developers to connect their OpenFeature API abstractions to flagd, which is running in some other process, without making any choices about where flagd is getting its flags.
flagd can get its flags from a file; in cloud-native or Kubernetes-native deployments, a ConfigMap is mounted as a volume or a file for flagd. There's also the potential to have flagd get its flags from any vendor or feature flag system, because it uses the same abstractions that we defined in our specification and implement in our SDKs. What that means is that you could be writing an application in some exotic, very unusual language and still take advantage of feature flags, hypothetically from any vendor, by having flagd do the flag evaluation out of process and communicating with flagd over a very simple protocol over a socket, HTTP, or gRPC.
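A hedged sketch of that wiring in TypeScript, assuming the community flagd provider package and flagd's default gRPC port; the package name, options, and flag file shape are best-effort illustrations rather than a definitive setup.

```typescript
import { OpenFeature } from '@openfeature/js-sdk';
import { FlagdProvider } from '@openfeature/flagd-provider';

// flagd runs out of process; it might be reading a flag file like:
// {
//   "flags": {
//     "new-checkout": {
//       "state": "ENABLED",
//       "variants": { "on": true, "off": false },
//       "defaultVariant": "on"
//     }
//   }
// }
// mounted from a ConfigMap. The application neither knows nor cares.
OpenFeature.setProvider(new FlagdProvider({ host: 'localhost', port: 8013 }));

async function main() {
  const client = OpenFeature.getClient();
  const enabled = await client.getBooleanValue('new-checkout', false);
  console.log(enabled);
}
main();
```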
That's interesting. The paradigm shift there is not just the flexibility it brings in supporting any language that can connect over TCP to a process, but also the shift it brings in moving feature flags in a more operational direction. Instead of the application developer having control over how feature flags are evaluated and which flag vendor or source you are connecting to, the Kubernetes deployer, or whoever that admin is, can sub that out by configuring flagd in Kubernetes. Even if you are running this on some Linux system, flagd can run as a systemd service, and the person who administers that VM can choose which feature flag vendor to connect to. That's an interesting paradigm shift, and I think it has some potential.
How about you, Mike? Have you got an end goal in mind personally?
My goal is that if you are going to get started with feature flagging, you would use OpenFeature. That would be a given in the space. There would be no question of whether feature flags are even important, because that has also come up: "Why bother? Do I even need this type of stuff?" If we can overcome those obstacles, then yes, it's very important, and here's why. You would use OpenFeature and then make the decision on what powers it, what we call providers in OpenFeature terminology, be that some homemade solution or a vendor.
That would be ideal for me. To Todd's point too, the other thing that's exciting is people with other backgrounds start getting involved in feature flagging and understand the benefits. There's a lot of potential beyond applications. It's primarily focused on feature flagging for applications, but it could be used for infrastructure potentially and some other pretty interesting areas that maybe haven't used feature flagging historically. We'll see where that evolves over time as people get more comfortable with some of these terminologies and some of the tooling around them.
I remember the first call where flagd was shown. I was like, "I hadn't considered that." The exciting thing for me is that I feel like there are one or two more things that will appear, where someone will have an idea. Because of the way it's designed, there will be moments of, "Now that we have done that, we can do this easily." It will probably be a different discipline or area that people come up with them from. That's super interesting to me. Do you feel like you are ready for the second generation of the paradigm and how people will use it?
It's about exposure to people who weren't familiar with the concepts and how they apply to their world. There are going to be some interesting possibilities there. As it's all done in the open, anyone can benefit; if it makes sense for a particular company or vendor to start supporting these models, by all means, it's all open.
That's pretty cool too. The community can benefit from each other's ideas in this space, and there's a lot of potential here. The basic concepts are incredibly simple, but the fact that you can send fairly complex rule sets at runtime unlocks insane levels of potential, in my opinion. We are only getting started with what we can do.
Thanks so much for your time. We have got a ton of content. Maybe we could even split things up. Was there anything else you wanted to bring up?
No, we covered everything. I appreciate the time. Thanks for inviting us.
If you are interested in OpenFeature, visit our repo at GitHub.com/OpenFeature or visit OpenFeature.dev. That's probably the mandatory callout.
Thanks again. I will catch up with you soon.
Learn more about OpenFeature and OpenFeature providers