Feature Flags Done Properly: From Toggles to Trust

Feature flags promise safe, gradual delivery, but left unmanaged they become a second codebase nobody owns; here is how to treat them as the operational artefacts they actually are.

Read time: 6 minutes
Written by: Steven Godson
Tech

Feature flags are one of those techniques that everyone agrees is sensible and almost nobody manages well. The pitch is seductive: decouple deployment from release, ship dark, roll out gradually, kill instantly. All true. What rarely gets said in the same breath is that every flag you create is a branch in your runtime behaviour, and an unmanaged collection of flags is simply a second, undocumented codebase running in parallel with the one in your repository. In my experience, the teams who get genuine value from flags are the ones who treat them as operational artefacts with owners and a lifecycle, not as clever little if statements that happen to live in a dashboard.

1. Understand What a Flag Actually Is

A configuration decision masquerading as code

The first mistake is treating all flags as the same thing. They are not, and conflating them is how organisations end up with a thousand toggles and no idea which ones matter.

  • Release flags — short-lived switches that gate an in-progress feature until it is ready, and should be removed the moment the feature ships fully.
  • Operational flags — kill switches and circuit breakers that exist to let you degrade gracefully under load or during an incident, and these legitimately live for a long time.
  • Experiment flags — flags driving A/B tests or staged rollouts, which expire the moment the experiment concludes.
  • Permission flags — entitlements that control access by plan or cohort, which are really product configuration and arguably should not be flags at all.

In my opinion the single most useful thing a team can do is label every flag by type at creation. A release flag that has quietly become permanent is a defect, whereas an operational flag that lives for three years is doing its job.
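As a minimal sketch of what labelling by type at creation might look like, the following makes the type a mandatory field and derives from it whether the flag is expected to be removed. The names `FlagType`, `Flag` and `should_expire`, and the example flag keys, are my own illustrations rather than any particular vendor's API.

```python
from dataclasses import dataclass
from enum import Enum


class FlagType(Enum):
    RELEASE = "release"          # gates an in-progress feature
    OPERATIONAL = "operational"  # kill switch or circuit breaker
    EXPERIMENT = "experiment"    # A/B test or staged rollout
    PERMISSION = "permission"    # entitlement by plan or cohort


# Types that are expected to be removed once they have served their purpose.
SHORT_LIVED = {FlagType.RELEASE, FlagType.EXPERIMENT}


@dataclass(frozen=True)
class Flag:
    key: str
    flag_type: FlagType  # mandatory: a Flag cannot be constructed without one

    @property
    def should_expire(self) -> bool:
        return self.flag_type in SHORT_LIVED


# Hypothetical flags, purely for illustration.
checkout = Flag("new-checkout", FlagType.RELEASE)
kill_switch = Flag("payments-kill-switch", FlagType.OPERATIONAL)
```

With the type recorded, "this release flag is still here" becomes a question you can ask of the data rather than of somebody's memory.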

2. Give Every Flag an Owner and an Expiry

Because orphaned flags are technical debt with a UI

The reason flag estates rot is that creation is trivial and removal is nobody's responsibility. I have seen production systems with flags whose original author left the organisation two years prior, gating behaviour no one dares touch.

  • Assign an owner at creation — a named individual or team, not "platform", recorded in the flag metadata itself.
  • Set an expiry date or review date — release and experiment flags should carry an explicit date by which they are removed or justified.
  • Surface stale flags automatically — a weekly report of flags older than their review date, posted where humans will actually see it.

My preference is to treat a flag without an owner as a build failure rather than something to tidy up later. If your flag tooling cannot enforce ownership, wrap it in something that can.
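One way to wrap that enforcement around tooling that lacks it is a small check run in the pipeline against an export of your flags. This is a sketch under assumptions: `FlagMeta`, `validate` and `stale_flags` are hypothetical names, and the rule that rejects "platform" as an owner is simply my reading of the point above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FlagMeta:
    key: str
    owner: str                 # a named individual or team
    review_by: Optional[date]  # None is only acceptable for long-lived flags
    long_lived: bool = False   # e.g. an operational kill switch


def validate(flag: FlagMeta) -> list[str]:
    """Return the reasons a flag should fail the build (empty list means OK)."""
    problems = []
    owner = flag.owner.strip().lower()
    if not owner or owner == "platform":  # assumed rule: no catch-all owners
        problems.append(f"{flag.key}: needs a named owner")
    if flag.review_by is None and not flag.long_lived:
        problems.append(f"{flag.key}: needs a review date")
    return problems


def stale_flags(flags: list[FlagMeta], today: date) -> list[FlagMeta]:
    """Flags whose review date has passed, most overdue first."""
    overdue = [f for f in flags if f.review_by is not None and f.review_by < today]
    return sorted(overdue, key=lambda f: f.review_by)
```

The output of `stale_flags` is the raw material for the weekly report; where you post it matters more than how you format it.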

3. Keep the Flag Logic Out of Your Domain Code

Conditionals are cheap to add and expensive to live with

The technical sin is scattering flag checks throughout business logic until the genuine behaviour of the system is impossible to reason about. Every nested toggle multiplies the number of code paths you must mentally hold.

  • Resolve flags at the boundary — evaluate the flag once, near the edge of the request, and pass a clean decision inward rather than threading the flag deep into the call stack.
  • Avoid flag-on-flag dependencies — two flags that interact create four states, and four flags create sixteen; combinatorial explosion is not your friend.
  • Wrap the SDK — never call the vendor SDK directly from a hundred call sites, because the day you migrate providers you will be grateful for the single seam.

I am always conscious that a flag added today is a code path someone will be debugging at three in the morning eighteen months from now. Write it so that this future engineer can understand the intended state without archaeology.
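To make the boundary and the single seam concrete, here is a minimal sketch. Everything in it is assumed for illustration: the `FlagProvider` interface, the `InMemoryProvider` stand-in for a vendor adapter, the `new-pricing` flag and the ten percent discount. The point is the shape, not the specifics.

```python
from dataclasses import dataclass
from typing import Protocol


class FlagProvider(Protocol):
    """The only flag interface the rest of the codebase knows about."""

    def is_enabled(self, key: str, user_id: str) -> bool: ...


class InMemoryProvider:
    """Stand-in for a vendor SDK adapter; replace this class to change vendor."""

    def __init__(self, enabled: set[str]) -> None:
        self._enabled = enabled

    def is_enabled(self, key: str, user_id: str) -> bool:
        return key in self._enabled


@dataclass(frozen=True)
class CheckoutDecisions:
    """Plain decisions passed inward; domain code never sees a flag key."""

    use_new_pricing: bool


def resolve_decisions(flags: FlagProvider, user_id: str) -> CheckoutDecisions:
    # Evaluate once, at the edge of the request.
    return CheckoutDecisions(use_new_pricing=flags.is_enabled("new-pricing", user_id))


def price_basket(subtotal_pence: int, decisions: CheckoutDecisions) -> int:
    # Domain logic depends on a named boolean, not on the flag system.
    discount_percent = 10 if decisions.use_new_pricing else 0
    return subtotal_pence * (100 - discount_percent) // 100


def handle_request(flags: FlagProvider, user_id: str, subtotal_pence: int) -> int:
    return price_basket(subtotal_pence, resolve_decisions(flags, user_id))
```

When the flag is eventually removed, only `resolve_decisions` and one field change; `price_basket` loses a parameter rather than a tangle of vendor calls.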

4. Test the States, Not Just the Happy Path

A flag you have never tested in its "off" state is a liability

Flags multiply the test surface, and pretending otherwise is how you ship a feature that works perfectly in the rollout cohort and falls over for everyone else.

  • Test both sides of every flag — at minimum, exercise the system with each release flag both on and off in your pipeline.
  • Default to the safe state — if the flag service is unreachable, the code should fall back to known-good behaviour rather than failing open or crashing.
  • Validate flag cleanup — when you remove a flag, confirm the retained branch is the one that has actually been running in production.

My experience has been that the genuinely dangerous moment is not the rollout but the cleanup, when someone deletes the wrong branch because the flag had been at one hundred percent for so long that nobody remembered which way round it went.
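The first two points above can be sketched in a few lines. The `evaluate` helper, the `new-banner` flag and the simulated outage are all hypothetical; a real pipeline would more likely express the on and off cases as parametrised tests in whatever framework you already use.

```python
from typing import Callable


def evaluate(lookup: Callable[[str], bool], key: str, safe_default: bool) -> bool:
    """Return the flag value, or the known-good default if the lookup fails."""
    try:
        return lookup(key)
    except Exception:
        # Flag service unreachable or misbehaving: fall back, do not crash.
        return safe_default


def render_banner(new_banner_enabled: bool) -> str:
    return "new banner" if new_banner_enabled else "classic banner"


def unreachable(key: str) -> bool:
    """Simulates a flag service outage."""
    raise ConnectionError("flag service timed out")


def check_both_states() -> None:
    # Exercise the code with the flag on AND off, not just the rollout state.
    expected = {True: "new banner", False: "classic banner"}
    for state, output in expected.items():
        assert render_banner(evaluate(lambda _key: state, "new-banner", False)) == output
    # An outage must land on the known-good (off) behaviour.
    assert render_banner(evaluate(unreachable, "new-banner", False)) == "classic banner"


check_both_states()
```

Note that the safe default is chosen per flag at the call site: for a release flag it is usually "off", but for a long-standing operational flag the known-good state may well be "on".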

5. Connect Flags to Your Change and Incident Practice

A runtime change is a change, whatever the dashboard calls it

This is where the engineering craft meets the operational reality, and where most flag governance quietly evaporates. Flipping a flag in production alters live behaviour without a deployment, which means it sits in an awkward blind spot for most change management.

  • Log every flag change as an auditable event — who, what, when and why, ideally flowing into the same place as your deployment record.
  • Treat high-blast-radius flags as changes — a kill switch on your payment path deserves the same scrutiny as a code release, even though it ships no code.
  • Make flags first-class in incident response — when an incident opens, the list of recently changed flags should be one of the first things responders see.

In my opinion the ITIL-minded among us have rather missed this one. We have spent years maturing deployment governance whilst leaving an entire category of production change to be made by anyone with dashboard access and no record kept.
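As a rough illustration of the who, what, when and why record, and of the "recently changed flags" view for responders, consider the sketch below. `FlagChange` and `FlagAuditLog` are assumed names, and the in-memory list stands in for wherever your deployment record actually lives.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FlagChange:
    flag_key: str
    changed_by: str       # who
    old_value: bool       # what (before)
    new_value: bool       # what (after)
    changed_at: datetime  # when
    reason: str           # why


class FlagAuditLog:
    def __init__(self) -> None:
        self._events: list[FlagChange] = []

    def record(self, change: FlagChange) -> None:
        if not change.reason.strip():
            raise ValueError("a flag change needs a reason")
        self._events.append(change)

    def changed_since(self, cutoff: datetime) -> list[FlagChange]:
        """Changes at or after the cutoff, newest first, for incident responders."""
        recent = [e for e in self._events if e.changed_at >= cutoff]
        return sorted(recent, key=lambda e: e.changed_at, reverse=True)


# Hypothetical incident: show responders what changed in the previous hour.
log = FlagAuditLog()
incident_start = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
log.record(FlagChange("search-v2", "bob", False, True,
                      incident_start - timedelta(days=3), "staged rollout"))
log.record(FlagChange("payments-kill-switch", "alice", False, True,
                      incident_start - timedelta(minutes=20), "provider errors"))
recent = log.changed_since(incident_start - timedelta(hours=1))
```

Refusing a change without a reason is deliberately awkward; the friction is the governance.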

6. Where to Start

Small, deliberate steps beat a grand flag platform

You do not need to procure an expensive platform on day one. You need discipline, and discipline is free.

  • Audit what you already have — list every live flag, classify it by type, and find the orphans; this alone is usually sobering.
  • Introduce a creation standard — owner, type and review date as mandatory fields, enforced however your tooling allows.
  • Schedule removal as real work — put flag cleanup on the backlog as a recurring commitment, not a someday aspiration.
  • Add change logging before you add more flags — knowing who changed what is worth more than any sophisticated targeting feature.

Feature flags are not the problem. Unowned, untested, undocumented feature flags are the problem, and that is entirely within your control to fix. Get the lifecycle right and flags become exactly what they promise: a way to deploy with confidence and release with care.

Hopefully this has been useful to you and I wish you well in your own work…
