Adventures in Terraform: How and why we built our Terraform Provider

By Gagan Trivedi on November 16, 2022

Hey there - my name is Gagan Trivedi and I am an engineer at Flagsmith. A few months back I had absolutely no experience building Terraform Providers. Flagsmith needed a Terraform provider, so we built one! I wanted to document my journey to help other people looking to build providers for their products.

Context

At Flagsmith, we offer an open-source feature flagging tool. Our goal is to help teams ship faster and continuously improve their products. Over the last few years, we have seen a growing number of teams that have adopted Terraform to automate parts of their CI/CD processes. This product has been an absolute game changer in moving more and more of the build process into code, ensuring consistent approaches and best practices to building software. 

Understanding this approach, we decided to build and launch our own Terraform Provider a few months back. This provider is great for CI/CD automation and lets dev teams control and manage feature flags/toggles as part of their Terraform workflows.

A big part of being an Open Source company is really leaning into transparency. This article looks at how and why we decided to build the Terraform provider and how to get started if you are looking to build your own. If you are interested in these topics, we have also written about our scariest release ever and our infrastructure costs of running SaaS at scale (billions of requests/month).

Why we decided to make this integration

Whenever we talk to our potential clients we always make notes about their requests. Since the beginning of 2022, we’ve noticed that more and more teams are adopting Infrastructure as Code (IaC) best practices. The platform that consistently came up was Terraform.

To decide whether we actually needed to invest in this and, if so, how we should build the integration, we started a round of customer discovery calls.

The feedback was extremely helpful for our initial design and allowed us to scope an MVP plus a near-term roadmap for improvements to the provider over future releases. Once we were aligned on what we wanted to build, it was time to start writing code!

Building the Terraform Provider

Step 1: Let’s first create a basic Terraform configuration to give us some idea of what we need to build, and then work backwards from there.

In our case, for version v0.1.0, it was just the ability to update a feature state, and our configuration looked something like this:

terraform {
  required_providers {
    flagsmith = {
      source = "github.com/Flagsmith/flagsmith"
    }
  }
}

provider "flagsmith" {
  master_api_key = "<master_api_key>"
  base_api_url   = "http://localhost:8000/api/v1"
}

resource "flagsmith_flag" "feature_1_dev" {
  enabled         = false
  environment     = 2
  feature         = 14
  environment_key = "<environment_key>"
  feature_name    = "some_feature"
  feature_state_value = {
    type         = "unicode"
    string_value = "some_flag_value"
  }
}

With the question of what to build answered, the next step was to figure out how to build it. For that, we decided to use https://github.com/hashicorp/terraform-provider-scaffolding-framework because we believe it has a clear and concise interface. If you want, you can take a look at https://www.terraform.io/plugin/which-sdk and decide which SDK suits your requirements better.

With the most important building block out of the way, it was time to build the Golang library that sits between our plugin and the Flagsmith server. In other words, our Terraform plugin needs a way to interact with our API, and our Golang library does just that.

Since we only need to GET and PUT (update) feature states, our Golang library only has those two public methods, apart from the one used for initializing the client. (What about create and delete, you might ask, since we need to implement them as part of the provider interface? More on that later.)
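To make the shape of such a library concrete, here is a minimal sketch of a client with one getter, one updater and an initializer. It is not the actual Flagsmith library: the type names, method names, the `/featurestates/<id>/` path and the `Api-Key` header format are all assumptions made for illustration, and the API is replaced by an in-memory fake so the example runs without a server.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// FeatureState is a hypothetical, trimmed-down feature state payload.
type FeatureState struct {
	ID      int64  `json:"id"`
	Enabled bool   `json:"enabled"`
	Value   string `json:"feature_state_value"`
}

// Client exposes one getter and one updater; all names are illustrative.
type Client struct {
	masterAPIKey, baseURL string
	httpClient            *http.Client
}

// NewClient initialises the client.
func NewClient(masterAPIKey, baseURL string, hc *http.Client) *Client {
	return &Client{masterAPIKey: masterAPIKey, baseURL: baseURL, httpClient: hc}
}

// do sends a JSON request and decodes a 200 response into out.
func (c *Client) do(method, path string, body, out interface{}) error {
	var buf bytes.Buffer
	if body != nil {
		if err := json.NewEncoder(&buf).Encode(body); err != nil {
			return err
		}
	}
	req, err := http.NewRequest(method, c.baseURL+path, &buf)
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Api-Key "+c.masterAPIKey) // assumed format
	resp, err := c.httpClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("%s %s: unexpected status %d", method, path, resp.StatusCode)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// GetFeatureState reads one feature state (the path is a placeholder).
func (c *Client) GetFeatureState(id int64) (*FeatureState, error) {
	fs := &FeatureState{}
	return fs, c.do(http.MethodGet, fmt.Sprintf("/featurestates/%d/", id), nil, fs)
}

// UpdateFeatureState PUTs a feature state and returns the server's copy.
func (c *Client) UpdateFeatureState(fs *FeatureState) (*FeatureState, error) {
	updated := &FeatureState{}
	return updated, c.do(http.MethodPut, fmt.Sprintf("/featurestates/%d/", fs.ID), fs, updated)
}

// inMemory serves requests from a handler without opening a socket.
type inMemory struct{ h http.Handler }

func (t inMemory) RoundTrip(req *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	t.h.ServeHTTP(rec, req)
	return rec.Result(), nil
}

// newFakeClient wires a Client to a fake API that holds one feature state.
func newFakeClient() *Client {
	state := FeatureState{ID: 1, Value: "old"}
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/v1/featurestates/1/" {
			http.NotFound(w, r)
			return
		}
		if r.Method == http.MethodPut {
			json.NewDecoder(r.Body).Decode(&state) // the fake trusts its caller
		}
		json.NewEncoder(w).Encode(state)
	})
	return NewClient("hypothetical-key", "http://flagsmith.invalid/api/v1", &http.Client{Transport: inMemory{h}})
}

func main() {
	c := newFakeClient()
	fs, err := c.GetFeatureState(1)
	if err == nil {
		fs.Enabled = true
		fs, err = c.UpdateFeatureState(fs)
	}
	fmt.Println(*fs, err)
}
```

Keeping the public surface this small means the provider only ever depends on two calls, which makes the plugin code easy to test against a fake like the one above.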

Now, let’s get back to our plugin. We start by replacing the `internal/provider` directory with `flagsmith`, i.e. the name of our provider.

Next, let’s create some files:

[gagan@gpad flagsmith]$ ls
models.go  models_test.go  provider.go  provider_test.go  resource_flag.go  resource_flag_test.go

We decided to create a file called models.go to hold our resource data struct instead of making it part of resource_flag.go, which makes the abstraction a bit clearer. We also implemented some helper methods on that resource data to make it interoperable with the data structures used by our Golang library.

Because we have implemented To/From methods that convert the resource data to and from our Golang library's data structures, our plugin code looks much simpler.
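As a rough illustration of the To/From idea, here is a small sketch of a resource data struct with conversion helpers. Everything in it is an assumption for the sake of the example: the struct shapes, the `ToClientFS` and `FromClientFS` names, and the use of plain Go types (a real provider would use the plugin framework's attribute types).

```go
package main

import "fmt"

// ClientFeatureState stands in for the struct exposed by the Go client
// library (hypothetical shape).
type ClientFeatureState struct {
	Feature     int64
	Environment int64
	Enabled     bool
	ValueType   string
	StringValue *string // nil when no value is set
}

// FeatureStateValue mirrors the nested feature_state_value block.
type FeatureStateValue struct {
	Type        string
	StringValue string
}

// FlagResourceData mirrors the attributes of the flagsmith_flag resource.
type FlagResourceData struct {
	Enabled           bool
	Environment       int64
	Feature           int64
	EnvironmentKey    string
	FeatureName       string
	FeatureStateValue *FeatureStateValue
}

// ToClientFS converts resource data into the client library's struct.
func (d FlagResourceData) ToClientFS() ClientFeatureState {
	fs := ClientFeatureState{Feature: d.Feature, Environment: d.Environment, Enabled: d.Enabled}
	if d.FeatureStateValue != nil {
		v := d.FeatureStateValue.StringValue
		fs.ValueType, fs.StringValue = d.FeatureStateValue.Type, &v
	}
	return fs
}

// FromClientFS returns a copy of d refreshed with what the API reported.
// Fields the (assumed) API response does not carry, here the environment
// key and feature name, are kept from d.
func (d FlagResourceData) FromClientFS(fs ClientFeatureState) FlagResourceData {
	d.Feature, d.Environment, d.Enabled = fs.Feature, fs.Environment, fs.Enabled
	d.FeatureStateValue = nil
	if fs.StringValue != nil {
		d.FeatureStateValue = &FeatureStateValue{Type: fs.ValueType, StringValue: *fs.StringValue}
	}
	return d
}

func main() {
	planned := FlagResourceData{
		Environment: 2, Feature: 14, EnvironmentKey: "<environment_key>", FeatureName: "some_feature",
		FeatureStateValue: &FeatureStateValue{Type: "unicode", StringValue: "some_flag_value"},
	}
	fs := planned.ToClientFS()
	fs.Enabled = true // pretend the API reported the flag as enabled
	refreshed := planned.FromClientFS(fs)
	fmt.Println(refreshed.Enabled, refreshed.FeatureName, refreshed.FeatureStateValue.StringValue)
}
```

With helpers like these, the resource methods can pass data straight between Terraform and the client library without field-by-field copying in every CRUD method.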

Now to answer your question: what about create and delete?

Let’s talk about delete first. For delete we do nothing on the Flagsmith side; we just remove the resource from the Terraform state. The reason is that in Flagsmith, whenever we create a feature, the feature states for all the environments are created automatically, i.e. they live and die with the environment, so it does not make sense to delete an environment feature state.

Create is a bit trickier, but the logic is similar: because the feature states already exist, we can't really ‘create’ one, so we do a read and an update in place of create. We first read the feature state, and if there are any changes between what we have read and what is specified in Terraform, we perform an update.

func (r flagResource) Create(ctx context.Context, req resource.CreateRequest, resp *resource.CreateResponse) {
	var data FlagResourceData

	diags := req.Config.Get(ctx, &data)
	resp.Diagnostics.Append(diags...)

	if resp.Diagnostics.HasError() {
		return
	}

	// First, we read the resource by calling the `Read` method manually
	readResponse := resource.ReadResponse{State: resp.State}
	r.Read(ctx, resource.ReadRequest{
		State: tfsdk.State{
			Raw:    req.Plan.Raw,
			Schema: req.Plan.Schema,
		},
		ProviderMeta: req.ProviderMeta,
	}, &readResponse)

	if readResponse.Diagnostics.HasError() {
		resp.Diagnostics.Append(readResponse.Diagnostics...)
		tflog.Error(ctx, "Create: Error reading resource state")
		return
	}

	// Next, we call the `Update` method manually
	updateResponse := resource.UpdateResponse{State: resp.State}
	r.Update(ctx, resource.UpdateRequest{
		Config:       req.Config,
		Plan:         req.Plan,
		State:        readResponse.State,
		ProviderMeta: req.ProviderMeta,
	}, &updateResponse)

	if updateResponse.Diagnostics.HasError() {
		resp.Diagnostics.Append(updateResponse.Diagnostics...)
		tflog.Error(ctx, "Create: Error updating resource state")
		return
	}

	// Finally, we set the state from the update response
	resp.State = updateResponse.State
	resp.Diagnostics.Append(updateResponse.Diagnostics...)
}
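The lifecycle rules described above (create is a read followed by an update, delete only touches Terraform state) can also be modelled without the Terraform framework. The sketch below is a toy model, not provider code: `fakeFlagsmith`, `flagResource` and the `tfState` map are hypothetical stand-ins for the Flagsmith server, the resource and the Terraform state, and `Create` follows the prose description by updating only when the value it read differs from the planned one.

```go
package main

import "fmt"

// key identifies a feature state: one feature in one environment.
type key struct{ env, feature int64 }

// FeatureState is the remote object that a flag resource manages.
type FeatureState struct {
	Enabled bool
	Value   string
}

// fakeFlagsmith models the server-side rule from the text: creating a
// feature creates its feature state in every environment.
type fakeFlagsmith struct {
	envs   []int64
	states map[key]FeatureState
	writes int // number of update calls, kept for demonstration
}

func newFakeFlagsmith(envs ...int64) *fakeFlagsmith {
	return &fakeFlagsmith{envs: envs, states: map[key]FeatureState{}}
}

func (f *fakeFlagsmith) CreateFeature(feature int64) {
	for _, env := range f.envs {
		f.states[key{env, feature}] = FeatureState{}
	}
}

func (f *fakeFlagsmith) Get(k key) (FeatureState, bool) {
	fs, ok := f.states[k]
	return fs, ok
}

func (f *fakeFlagsmith) Update(k key, fs FeatureState) {
	f.states[k] = fs
	f.writes++
}

// flagResource is the provider side; tfState stands in for Terraform state.
type flagResource struct {
	api     *fakeFlagsmith
	tfState map[key]FeatureState
}

func newFlagResource(api *fakeFlagsmith) *flagResource {
	return &flagResource{api: api, tfState: map[key]FeatureState{}}
}

// Create adopts the existing feature state: read it, then update on a diff.
func (r *flagResource) Create(k key, planned FeatureState) error {
	current, ok := r.api.Get(k)
	if !ok {
		return fmt.Errorf("no feature state for feature %d in environment %d", k.feature, k.env)
	}
	if current != planned {
		r.api.Update(k, planned)
	}
	r.tfState[k] = planned
	return nil
}

// Delete only forgets the resource; the remote feature state is untouched.
func (r *flagResource) Delete(k key) {
	delete(r.tfState, k)
}

func main() {
	api := newFakeFlagsmith(1, 2)
	api.CreateFeature(14)
	r := newFlagResource(api)
	k := key{env: 2, feature: 14}
	if err := r.Create(k, FeatureState{Enabled: true, Value: "some_flag_value"}); err != nil {
		panic(err)
	}
	r.Delete(k)
	remote, stillThere := api.Get(k)
	fmt.Println(remote, stillThere, len(r.tfState))
}
```

After `Delete`, the feature state is gone from the local state map but still present on the fake server, which is the behaviour the provider aims for.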

Check it out!

Now you can use this provider (https://registry.terraform.io/providers/Flagsmith/flagsmith/latest) and control feature flagging from your Terraform environment. Stay tuned for updates! We are doing a lot of work in this area to make our customers’ lives easier and move more of their infrastructure into code.

Thanks,

Gagan