Semaphore

Interview with Darko Fabijan: Co-Founder, Semaphore
By Ben Rometsch on September 5, 2023

Ben Rometsch - Flagsmith, Host
Darko Fabijan - Co-Founder, Semaphore

Episode audio: https://d2iwv8pn9yf3nf.cloudfront.net/zahFyZR1uZ.mp3

Do you spend too much time building and running your pipeline? That doesn't have to be the case anymore. In this episode, Darko Fabijan, the co-founder of Semaphore, shares the effort he and his team have put into improving Semaphore's product to serve their clients. Their CI/CD solutions minimize the time teams spend building and running their pipelines. Darko also explains how Semaphore On-Premise works as your team grows. Join us in this episode and learn how Semaphore makes development teams move faster.

---

This was an interesting chat between Darko and myself. He is from Semaphore CI. We couldn't figure out whether I was going to interview him or he was going to interview me. We decided we were going to do two interviews so people could read things from both sides of the story. I drew the straw to go first, so I'm the one doing the intro. Darko, welcome. Do you want to introduce yourself and the business?

Thanks for having me. I'm Darko Fabijan. My role is CTO of Semaphore CI. We have been in the CI/CD business for several years. We started in 2010 as that connecting point between GitHub and Heroku, for a lot of people who remember those days. We didn't want to manage and maintain Jenkins, which is something we did for some time. That's why we created Semaphore. We started out serving smaller companies. Now we are serving teams that are north of 1,000 engineers, both in SaaS, a hybrid option, and full on-prem for the most demanding companies that need security compliance.

This happens on this show a lot. I tried to figure out what the state of the world was like in terms of software engineering and CI/CD several years ago. Jenkins was the only solution in town for quite some time.

Back then, it was the default, it's fair to say. I was young and naïve. I thought that a couple of years later, it would be wiped off the face of the planet. However, it is still around. It is still used in a lot of places and deeply integrated in a lot of situations. Apart from Jenkins, there was nothing else. Travis CI started with an offering for open source projects. That was a big thing. GitHub came around and offered to host open-source software for developers. Travis CI was an answer to that.

A couple of years later, they also started providing commercial offerings. We launched before them with our commercial offering. At the beginning of the decade, every second student was making a hosted CI/CD thing. It's an interesting thing for developers generally to make. It's an automation used by developers.

Over the years, a lot of companies have disappeared. There was quite a lot of consolidation. Towards the end of the decade, there was a push from big cloud providers. Think of Microsoft, meaning GitHub these days, Azure, AWS, and Google; everyone has something to offer. What we luckily managed to develop over time is deep expertise in helping teams make their feedback loops much shorter.

If someone has 200 engineers working on something, it can be an analytics application, and they come to us. They're like, "I would love our developers to have a feedback loop which is under ten minutes." It can be 5 or 7. We agree on something. Our tool has features that are there to support that. We also have the expertise to help them understand their pipeline and application. We have that edge with people who want a great, polished product and amazing performance. We can answer there with a variety of tools.

Can you talk a little bit about the genesis of the company and how it started? Am I right in saying that you haven't taken any funding?

That's correct. Initially, we were a couple of people doing Ruby on Rails consultancy. It was the early days of that. There was a vibrant community. It was also leading a lot of practices within the software discipline: test-driven development, behavior-driven development, continuous integration, and continuous delivery and deployment. Those were all strong within that community.

After a couple of years of doing consulting work for US-based startups, those years, generally around 2010, were a golden age; a lot of SaaS products from that era are still great and still around. We are part of that generation. We always wanted to build a product, but we hadn't figured out what we could build. Doing custom software in a consulting setting is great and interesting, but you don't control the whole situation around it. We wanted to have a bigger impact by picking the work we are doing and how we are doing it.

We configured Jenkins once and scaled it a couple more times, and we said, "Enough. There must be other people who are feeling the same pain, who want to develop software, do a couple of clicks, and get the tools they need." That's how we started. For several years, we did those things in parallel. We were a product company but also doing consultancy on the side.

In a relatively short period of time, we managed to let go of those clients and focus exclusively on the product, for various reasons. We decided not to take funding. We have been bootstrapped and profitable all along. We are now around 40 people. The business is great. However, we have never been in that competition of being the one CI that is going to have the largest market share.

You mentioned that there were a lot of competing products that were released. After people saw how revolutionary Jenkins was, designing a CI/CD tool became, for software engineers, the same as designing a to-do list is for a product designer or UX designer. It's the holy grail. Everyone has an opinion about it. No one can quite agree on what it looks like.

I can't remember the name, but there were open-source products, more Java-specific build tools, that we were looking at within our agency. We had bash scripts. That's all you had at the time. You hand-wrote a bash script to compile stuff. If I didn't have something working in two hours, I'd get rid of it, give up, and try to remember to look at it several months later. Jenkins was the first tool where, ten minutes later, there was a WAR package with my application. It is interesting because that was click and click, as opposed to writing a declarative YAML file, which the industry seems to have consolidated around.

I'm interested in going a bit deeper. Were there concepts and product designs, not necessarily radical ideas, that you felt strongly differentiated you, and that you put down to you guys surviving? A huge number of those others fell by the wayside. You didn't raise money. It might have something to do with that.

If you raise a lot of money, you not only burn through the money, but you also burn through people and opportunities in inappropriate ways. What we wanted to achieve was that you could do a couple of clicks and get your pipeline running. That was a big thing for us.

What Semaphore wants to achieve is that you can do a couple of clicks and get your pipeline running.

Do you think that was based on the Rails paradigm? You mentioned Heroku and Rails. Heroku and Rails' design philosophy was that you remove the need for CI/CD. It is implicit in those two tools working together. The moment you need to do a front-end compilation step and you want to bring in some other package, that design breaks down.

The reality of things, even back then, is that you needed something to run your tests. That component was missing. A big draw with Jenkins, which they hit in a nice way, is that you download that single Java file, you double-click it, and you get something running, which is amazing. That was a great usability component. Everything after that was: we need to scale it, maintain it, and deal with plugins. That's still a huge problem. What we wanted is that there is something that's going to run your builds, and if you want to run many of them or many jobs in parallel, it will work. You'll not have to think about it in any way.

There was that specific moment in time when Rails was popular. Heroku was serving Rails. That was the main focus. GitHub was also built on Ruby on Rails, which had an impact on that community. There is also that component of convention over configuration, which allowed us to have a deep knowledge of that framework.

When you were setting up your project, it was a repository. After that, we analyzed your repository through the API. We looked into some files and analyzed them automatically. We recognized your database configuration and your other configuration files. We set up good-enough defaults that, for a lot of people, resulted in a green build, if your tests passed, straight after three clicks in the application. That allowed us to have that great a-ha moment within that community. That was a big thing. As you were describing, you don't want to get into a tool and feel stupid because the tool is putting so much on you. You want to feel empowered. That's also when you decide to pull your credit card out of your pocket, because this tool is helping you save time and nerves. That was a significant component.

That usability is still something important for us. We cared about a great UI and how simple and easy it is. We think we're still the best in that realm, although we had to expand, driven by the demands of the market, to support many other frameworks, programming languages, and now even much more complex pipelines. You can have on Semaphore something as simple as a single job, but there is also the possibility of having pipelines with multiple dependencies that can be chained together.

Semaphore: Semaphore expanded to meet the market's demand, supporting many other frameworks and programming languages, and now even more complex pipelines.

We had to evolve in that direction to support the customers that are now most important to us. As I mentioned, those are usually engineering teams of 50-plus people in a lot of situations, and a couple of hundred engineers. There are always different needs. To connect to something you were describing, a bit of a curse with CI/CD is that it's inherently hard to set up. It's almost like deploying your application in production. All the dependencies have to be there. You have to understand the environment. You have to get to a green stage. You have to understand the tool and adjust your application quickly enough so you don't lose interest. If it's going to be, "I need two days to set up something," I'm not going to do that.

I'm curious to know, out of interest, and without going into specifics about individual customers: I'm assuming that project pipelines have gotten longer, slower, and more complicated over the last several years. Is that a trend that you've seen?

I have to say yes, because for the most successful customers we have, meaning the biggest, most profitable ones with whom we achieve the most success, it usually takes several years to get to that stage of having a couple of hundred engineers. A lot happens during that time. A lot of features were developed. A lot of tests were written. It takes longer to run; various complexity has been added. At some point, JavaScript and the front-end stuff, and also security, became much more important over the last decade as everything moved into the cloud. An important component is that it's not twenty engineers working on that application or set of applications but dozens of people, which adds an additional layer of complexity and importance to that feedback loop.

That's one thing we are seeing: there is an increase in complexity and duration, and we are helping customers fight that. There is that alternative path, which is a smaller path, of microservices and small applications being tied together through various APIs, which exists and works well for some people, maybe not for others.

What we have seen in a lot of anecdotal evidence is that a lot of technical leaders are hoping to solve a bunch of their problems by chopping up their monolith into microservices. We have been hearing that for several years. “We are almost there. We are going in the right direction,” but it's not happening. I'm not saying that's impossible.

For a lot of companies, you have to keep moving forward. People feel the monolithic application is holding them back, but it's possible to have a feedback loop of under a couple of minutes, even for a big application. It was a combination of hype and hope: if we move, things are going to become much easier. That's not happening that easily if you have a ten-year-old application.

You should keep moving forward, but it won't be easier if you have a ten-year-old application.

One of the reasons that Flagsmith exists is because we have an agency that still exists. We work on that, but we had large customers trying to migrate from a monolith to microservices as part of a cloud transformation project. They were starting from a greenfield on AWS and wanted to go to microservices. Coordination of CI/CD tooling with microservices is a weak spot, and people don't talk about it as being a huge problem.

That was one of the reasons we built it, at the time called Bullet Train but now Flagsmith: we couldn't find a way of reliably coordinating those sorts of releases. People say, "These APIs should have well-defined contracts and specifications and be versioned." In a perfect world, that's what you do, but it costs a ton of money, in the same way that if you want to do a complex database migration, you could maybe have the two versions of the application gracefully handle different database schemas, but that costs an awful lot of money too.

For us, a feature flag is a good solution to this problem. You can deploy all your applications and microservices. It doesn't matter whether one of the pipelines takes 20 minutes and one of them takes 3, because you don't have to release the moment you deploy. I am still surprised that no one ever seems to talk about that problem. I wonder, do you get feature requests into Semaphore about how to deal with that?
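To make that pattern concrete for readers, here is a minimal sketch of decoupling deployment from release behind a flag. The `FlagClient` class, the flag name, and the payment functions are hypothetical stand-ins for any feature flag SDK, not Flagsmith's actual API:

```python
# Minimal sketch of decoupling "deploy" from "release" with a feature flag.
# FlagClient is a hypothetical stand-in for a real feature flag SDK: both code
# paths can be deployed whenever their pipelines finish, and the new path is
# only released when the flag is flipped.

class FlagClient:
    """Hypothetical in-memory flag store; a real SDK would query a remote service."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)


def process_payment(amount: float, flags: FlagClient) -> str:
    # Both code paths are deployed; the flag decides which one actually runs.
    if flags.is_enabled("new-payments-service"):
        return f"charged {amount} via the new payments microservice"
    return f"charged {amount} via the legacy code path"


if __name__ == "__main__":
    flags = FlagClient({"new-payments-service": False})  # deployed, not yet released
    print(process_payment(42.0, flags))
    flags = FlagClient({"new-payments-service": True})   # flag flipped: released
    print(process_payment(42.0, flags))
```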

If you are working with microservices, you would want to deploy them independently. That's what I'm seeing people doing. You are deploying independently; you're deploying a completely separate service. That seems to be the pattern that works for people. That's where you're also getting most of the benefits of microservices. That's one thing we are seeing.

The way that microservices touched us in a more substantial way is where that code is stored. It's about the monorepo support. If you have 25 microservices and it's growing, you need to onboard people and coordinate that development. It's much easier to do a git clone of a single repository and start from there. Our answer to that was support for those projects, where we have a DSL in which you can specify dependencies and directories, so that you build, redeploy, or run the tests only if those have changed. That's where we have seen the biggest need and the most requests. That's how we answer that.

Semaphore: If you have 25 microservices and the number is growing, you need to onboard people and coordinate that development.
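As a rough illustration of the change-detection idea Darko describes (not Semaphore's actual DSL or engine), the sketch below decides which services in a monorepo to rebuild based on the directories touched since the base branch. The service-to-directory mapping and branch name are assumptions made for the example:

```python
# Rough sketch of monorepo change detection: decide which services to build,
# test, or redeploy based on which directories changed since the base branch.
import subprocess

# Hypothetical mapping from each service to the directories it depends on.
SERVICE_DEPENDENCIES = {
    "billing":  ["services/billing/", "libs/payments/"],
    "accounts": ["services/accounts/", "libs/auth/"],
}


def changed_files(base: str = "origin/main") -> list[str]:
    # List files changed on this branch relative to the base branch.
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def services_to_build(base: str = "origin/main") -> list[str]:
    changed = changed_files(base)
    return [
        service
        for service, dirs in SERVICE_DEPENDENCIES.items()
        if any(path.startswith(d) for d in dirs for path in changed)
    ]


if __name__ == "__main__":
    print("Services with changes:", services_to_build())
```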

It also depends on the industry and the type of application. Some teams are running those big end-to-end tests across their whole application, with the microservices spun up together, and only after that are they doing the deployment. The impression I have, and the numbers that we are seeing, is that this is mostly related to industries that are more heavily regulated. You need to know what's in the release that you're shipping out. You need to guarantee that you ran all the tests together.

However, on the other side, everyone is deploying their microservice on their own, maybe doing some canary deployments and getting fast feedback loops through observability and metrics. If you mess something up, you roll back, and you try to satisfy those interfaces that you were mentioning.

On that point, in running a microservices approach, there is a significant investment you have to make in defining and maintaining those contracts between services. In terms of dealing with the data and database, you lose that goodness of having transactions. It is something that people are aware of, but there are always trade-offs.

Several years ago, when you started to get a little bit of traction, you must have had people wanting to throw money at you. Was that the case?

Do you mean in terms of investment?

Yes.

There were conversations. At some point, we were taking that path and talking with people, traveling around, and seeing what could be done in that realm. There were a wide variety of reasons that we didn't take that path. One of the elements is that we were growing organically at what we perceived as a satisfactory rate. We don't have that view of the world that we have to be absolutely the biggest. It is still a sad situation for me: there is that element of winner-takes-all, that there has to be one, and everyone is going to gravitate towards that one solution, and all of that.

What we also learned over the years is that's not necessarily the case. Various people are searching for companies they want to partner with on a different basis. Some people want to work with smaller companies that can be much more agile, react and treat customers differently, and all those elements of providing a service. That's one of the biggest learnings for us. 2022 was our best year ever in terms of growth. It is a combination.

It's interesting, because it seems to me that the competing products, projects, and businesses that had raised money at that point were on the path of "win the market and get into the top five, or you are gone." It's such an important tool for engineering teams. If you're going to try and throw money at the problem, you're going to go up or go out.

If you try and throw money at the problem, you will go up or go out.

Another thing I wanted to talk about: Flagsmith makes the majority of its revenue, though not all of it, through on-premise deployments. That's something that you guys released. What was the driver for that? Talk about how much work you guys had to put in to get to the point where you were able to hand over an on-premise lump of software to customers.

We wanted to do it for a number of years, but there was always enough growth and more than enough work in the SaaS world that we needed to tackle, so we never allocated enough resources to do it. We were moving upmarket and had a number of customers who were looking into that because they were growing, getting into the public markets, and had various security and compliance needs coming to them because of the industry they're in, like healthcare and finance.

At some point, maybe roughly two years ago or a bit more than that, it became clear: "If we don't jump on that train and create the on-prem installation, sooner or later we are going to lose one of our potential or our biggest and most valuable customers." We jumped on that. At the same time, we had a number of requests after we started.

In the next couple of months, we had two customers who wanted to do a FedRAMP certification, where they wanted to sell their software to the federal government. Those security and compliance needs are different from what came before. The easiest way to answer that is to offer them an on-prem installation, and they can install it wherever they need to.

The process took several months to get something up and running in decent shape, and the next 3 to 6 months to stabilize it and make sure that it could support hundreds of engineers as an on-prem installation. A lot of work was put in, but in hindsight, it was easier than I anticipated. There are a couple of components that were not present a couple of years prior to that.

To stabilize and ensure support for hundreds of engineers as an on-prem installation requires much work, but in hindsight, it was easier than anticipated.

Kubernetes is a default environment where people want to install things. There are enough things defined and enough compatibility between different public cloud providers. If you spin something up within Kubernetes and want to have an externally managed database such as Postgres, Redis, or something else available within those clouds, there is Terraform, which is also mature enough, and can help people spin those services up depending on the configuration. There was a lot of work, but it wasn't as hard as anticipated.

However, with that being said, our release process now works differently. We are shipping constantly to our SaaS platform, but we have to compose the releases that go to the on-prem. That has a huge impact; there is a significant impact on the team as you build features and decide how to roll them out. That's also in the realm of feature flags: we have an old feature flag system that we built ourselves, a couple of checkboxes in the admin interface, and we are relying heavily on it. Things are not available in the on-prem installation for the first several months; once a feature stabilizes in SaaS, we flip a switch on the on-prem, and it's also available to the on-prem customers.

It's interesting hearing what you're saying about Kubernetes, because if you'd tried to do it several years ago, that story would've been much more complex. Here is one thing that we've noticed over the several years that we've been selling Flagsmith on-premises. Several years ago, Kubernetes was the orchestration name that you heard most often, but people also talked a lot about OpenShift, Rancher, HashiCorp products, and things like that in this space. Now, I can't remember the last time Kubernetes wasn't the deployment target for a large customer. It is a simplification, but is it fair to say that you took your SaaS platform and built a Kubernetes deployment story around that, and that's what the enterprise on-prem version became?

There are a couple of other moving pieces, but yes.

You talked about the additional work that you now need to worry about, which we know and feel in terms of getting those enterprise releases and versions out. What other things do you find you run up against? I heard of an interesting challenge that Dynatrace has. They have a new part of their product that leans on using large numbers of S3 buckets to do ultra-high-performance big data stuff. That's hard to replicate for a non-cloud vendor or someone who's got the keys to the data center in their basement or somewhere else. Do you guys run up against issues like that at all?

We hit that in a different way. As you might imagine, our application is composed of something we call a control plane; that's the app, the orchestrator running everything. Then there are the jobs that are running. You can have thousands of them, and they can run almost anywhere: on any Windows machine, Mac machine, Linux, Docker, or Kubernetes as an orchestrator. That component, for us, is a concept that keeps evolving and changing across different environments. People have different needs.

That's a component for us, which is carrying quite a lot of complexity. For clouds, if people want to run in VMs, we have auto scalers that scale those VMs for people within their environments based on the demand and the queue of the jobs that need to be handled. There is Kubernetes as a platform that people can use to run their jobs.

Semaphore: If people want to run in VMs, we have auto scalers that scale those VMs within their environments based on demand.
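As a sketch of the scaling decision behind that kind of job-queue-driven auto scaler, here is a simple proportional rule clamped to configured bounds; the function name and numbers are illustrative assumptions, not Semaphore's implementation:

```python
# Sketch of a job-queue-driven VM auto scaler decision: pick a VM count
# proportional to the number of queued jobs, clamped to configured bounds.
import math


def desired_vm_count(queued_jobs: int, jobs_per_vm: int, min_vms: int, max_vms: int) -> int:
    needed = math.ceil(queued_jobs / jobs_per_vm) if queued_jobs > 0 else 0
    return max(min_vms, min(needed, max_vms))


if __name__ == "__main__":
    # 37 queued jobs, 4 jobs per VM, scale between 1 and 8 VMs -> 8
    print(desired_vm_count(queued_jobs=37, jobs_per_vm=4, min_vms=1, max_vms=8))
```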

For our concrete use case, that is the main source of complexity. For better or worse, there are only three cloud providers that have a significant market share, so AWS, Google, and Azure, plus Kubernetes. Everyone wants to move to Kubernetes, not least to avoid vendor lock-in. That's the layer that Kubernetes provides. If our control plane and jobs are running within a Kubernetes cluster, it also gives that ability to our customers: if they decide they need to move, they can change clouds without major vendor lock-in. A component for us is reducing that complexity.

One of the things that we were fortunate about was that our platform is open source, and we wanted it to run on your laptop all the way up to tens of thousands of containers. We know what it does. We have one store of state, which is the Postgres database. We don't have any hard dependencies on any other bit. We've got a monolith, and we've got a database. You can run Redis, and there are a couple of other things you can run. That made building the on-premise deployment story for that product simple.

A lot of our customers already have a managed database service within their organization. You're running some stateless containers, the Flagsmith runtime, which these days is almost bulletproof and stable. There's almost nothing that can go wrong. The auto-healing stuff works. Were there a lot of cloud-specific services that you had to go about either unpicking or re-implementing? Did you run up against that problem?

We are not in quite as good a situation as you guys are in, but I cannot think of a dependency there that was a major issue. Luckily, we never wanted to pick a service that would tie us to a particular cloud provider. We never decided to use RDS. What we have is Postgres and Redis, which are agnostic and popular enough. They are provided everywhere, and they were stable and well-supported enough. We did have to do some work in that area, but it wasn't major for us. However, the part about where the agents are and how they run was, and still is, a major area of work for us.

I love GitHub Actions as a product, but you have those afternoons where you've pushed 57 commits to the same repository because you're trying to debug a workflow. You can't do it locally because it doesn't run locally. Did any of that work help with that particular problem?

Are you describing the problem of having a burst of changes you're pushing?

No, if you want to author and do a lot of work on a set of pipelines, and you're constantly having to push to the remote runtime and have it spend five minutes before it fails, you get an alert and repeat that for an afternoon. By the end of the day, you've had enough of it all.

I cannot map anything directly to that. Within the product, we have a couple of strategies that help with it. There is a cancellation strategy: when you are doing a burst of pushes, we are going to cancel or stop the previous ones. You can have a specific set of rules, because maybe that behavior is not wanted on your main or master branch and you only want it on your feature branches. That is one of the things that can help with that.
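A minimal sketch of the cancellation rule Darko describes, under the assumption of a simple in-memory model of pipelines; Semaphore expresses this kind of rule as pipeline configuration rather than application code:

```python
# Sketch of an auto-cancellation rule: when a new push arrives, cancel older
# pipelines that are still queued or running for the same branch, but never
# on protected branches such as main or master.
from dataclasses import dataclass

PROTECTED_BRANCHES = {"main", "master"}


@dataclass
class Pipeline:
    id: int
    branch: str
    state: str  # "queued", "running", or "finished"


def pipelines_to_cancel(active: list[Pipeline], new_push_branch: str) -> list[Pipeline]:
    if new_push_branch in PROTECTED_BRANCHES:
        return []  # keep every pipeline on protected branches
    return [
        p for p in active
        if p.branch == new_push_branch and p.state in ("queued", "running")
    ]


if __name__ == "__main__":
    active = [
        Pipeline(1, "feature/login", "running"),
        Pipeline(2, "feature/login", "queued"),
        Pipeline(3, "main", "running"),
    ]
    print([p.id for p in pipelines_to_cancel(active, "feature/login")])  # [1, 2]
```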

In terms of the platform itself, what's next for Semaphore? It must have been nice to get that feature or that component of the product out. What was it that you had in your sights after you deployed the on-premise solution?

That was a huge milestone for us. We are talking with various companies of various sizes. With companies who have those specific needs, we are finally not out of the question anymore. We can be part of the story. Continuing from that, the area where we are investing a lot is developer productivity. In those bigger organizations we are speaking with, our direct contacts are people who are maybe heads of developer experience, developer productivity, and those roles where we are helping them help their teams be faster. In that realm, there are always many things to measure and insights to get.

That's a nice thing about the CI and CD domains: a lot of data about the various slowdowns developers are experiencing is stored within the system. Flaky tests are one of them. How are test suites generally growing over time? Where is the majority of time being spent? What are some of the biggest bottlenecks the developers are seeing?

You mentioned those big applications getting more complex over time. There is also an interesting element that we are exploring. There are dozens or, in some situations, hundreds of tests that haven't failed in the last couple of years for people. What is that saying? Is it saying that those tests are badly written? That's a possibility for a portion of them, probably.

There is also the question: what is the value of those tests? Are they bringing more value than they are taking away? Those are the conversations we are most passionate about; at the end of the day, they help developers on the other side have faster feedback loops, be happier, and ship code faster. After wrapping up the big batch of on-premise work that we did, this is one of the next steps for us.
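For readers who want to picture the kind of analysis Darko hints at, here is a small sketch that mines hypothetical test-run history for tests that have not failed inside a time window; the data model is an assumption, not Semaphore's schema:

```python
# Sketch: find tests that ran within a time window but never failed once,
# as candidates to review for how much value they still add.
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TestRun:
    test_name: str
    passed: bool
    finished_at: datetime


def never_failing_tests(runs: list[TestRun], window: timedelta) -> set[str]:
    cutoff = datetime.now() - window
    recent = [r for r in runs if r.finished_at >= cutoff]
    failed = {r.test_name for r in recent if not r.passed}
    seen = {r.test_name for r in recent}
    return seen - failed


if __name__ == "__main__":
    now = datetime.now()
    runs = [
        TestRun("test_checkout", True, now - timedelta(days=10)),
        TestRun("test_checkout", True, now - timedelta(days=400)),
        TestRun("test_login", False, now - timedelta(days=30)),
        TestRun("test_login", True, now - timedelta(days=5)),
    ]
    # test_checkout never failed in the last two years; test_login did.
    print(never_failing_tests(runs, window=timedelta(days=730)))
```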

That sounds fascinating. I hadn't thought about it. You guys have the aggregate data of a lot of different signals. Especially for larger teams, it is potentially valuable. I know you are like, “We've got 1,078 tests. They're all green all the time. Why bother running them?”

At some point, you mentioned GitHub Actions. That's an area where we differ. GitHub Actions are doing what we were doing several years ago. There is that set of functionalities. If you have a small application and not too big of a team, you are fine, as they are quite dominant in that part of the market where a company of ten engineers or so has a relatively straightforward application. That thing can be decently well taken care of with GitHub Actions. Where we mainly excel is once people pass that stage, they need to do more for their developers and team. That's an area where we have a number of years more of experience and product development built into the tool.

Semaphore: GitHub Actions are doing what we were doing seven or eight years ago.

Darko, thanks so much for your time. It's been great chatting. I look forward to doing this again. Thanks again for coming on.

It was a pleasure talking to you.

About
Darko Fabijan

Darko, co-founder of Semaphore, enjoys breaking new ground and exploring tools and ideas that improve developer lives. He enjoys finding the best technical solutions with his engineering team at Semaphore. In his spare time, you’ll find him cooking, hiking and gardening indoors.
