featurebloat.com is an independent editorial site. Some links are affiliate links. All company names are trademarks of their respective owners. Opinions are the author's. Company references cite public sources. This is editorial content, not endorsement or advice.

Feature bloat for engineering leaders: the maintenance tax is the product

Engineering leaders understand feature bloat better than anyone else in the organisation. They are the ones who see it in the codebase: the feature flags from 2021, the database columns that are never read, the UI components maintained for six users, the API endpoints that serve a single enterprise customer's custom integration. The problem for engineering leaders is not recognising bloat. It is making the case for removal in terms the business understands.

This page gives you that case, in numbers. The maintenance tax is real, quantifiable, and directly affects engineering velocity. The surface area equation links more features to more incidents. The flag rot problem is specific enough to be actionable. And deleting code is one of the few engineering activities that engineers genuinely enjoy.


01. The maintenance tax math


Every feature has a monthly maintenance cost. This is not metaphorical. It is calculable. A feature in production requires: test coverage (tests that run on every CI build), documentation (someone must own it), support (someone must answer tickets about it), monitoring (someone must respond when it misfires), and periodic updates (dependency upgrades, security patches, API migrations).

A reasonable estimate for a moderately complex feature: 4-8 hours of engineer time per month for maintenance activities, not counting new development. At a fully-loaded cost of £100/hour, that is £400-800 per feature per month. A product with 200 features that are actively maintained is spending £80k-160k per month on maintenance alone, before a single line of new code is written.

The productive application of this math: sort your features by (monthly usage) / (estimated monthly maintenance burden). Features with low usage and high maintenance burden are candidates for removal. Features with high usage and low maintenance burden are your core. The ratio tells you where the leverage is.
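The ranking above can be sketched in a few lines. This is a minimal illustration, not a prescription: the feature names, usage numbers, and maintenance hours are invented, and the £100/hour rate is the fully-loaded cost assumed above.

```python
# Sketch of the maintenance-tax ranking: sort features by usage per
# maintenance hour. All feature data below is illustrative.

HOURLY_RATE = 100  # fully-loaded engineer cost, £/hour (assumed above)

features = [
    # (name, monthly active users, estimated maintenance hours/month)
    ("search", 18_000, 6),
    ("csv-export", 400, 8),
    ("legacy-sso", 60, 7),
    ("dashboard", 15_000, 4),
]

def monthly_cost(hours):
    return hours * HOURLY_RATE

# Low users-per-maintenance-hour = low leverage = removal candidate.
ranked = sorted(features, key=lambda f: f[1] / f[2])

for name, users, hours in ranked:
    print(f"{name:12s} users/hour={users / hours:8.1f} cost=£{monthly_cost(hours)}/month")
```

Run against real usage telemetry, the top of this list is your removal backlog and the bottom is your core.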

For features used by fewer than 5% of active users, the maintenance-per-user cost is typically 10 to 20 times higher than the median feature. These are the features that are subsidised by the users who do not use them.


02. Feature flags forever: the silent debt accumulator


Feature flags are a legitimate operational tool. They enable gradual rollouts, A/B testing, and safe deployments. They become a debt accumulator when they persist after their purpose has been served.

The failure mode is specific: a flag is created for a rollout. The rollout completes. Nobody owns the flag. Nobody schedules its removal. Six months later, it is still in the codebase. Two years later, it is still in the codebase, now wrapped by other code that assumes its existence, untested in its false branch because the false branch has not been exercised in production for 18 months.

LaunchDarkly's engineering blog has documented this pattern: long-lived flags become implicit architectural decisions. The code that wraps them accumulates logic. The flag is no longer a toggle; it is a load-bearing architectural choice with no documentation.

The fix is structural: every flag has a creation date and a sunset date. The sunset date is written into the flag creation process, not added later. Flags that reach their sunset date without review are removed by default, not kept by default. This inversion of the default is the key: flags should expire unless explicitly renewed, not persist unless explicitly removed.
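The expire-by-default inversion can be encoded directly in the flag registry. A minimal sketch, assuming an in-house registry rather than any particular vendor's API; the 90-day default and flag names are illustrative.

```python
# Expire-by-default feature flags: every flag gets a sunset date at
# creation, and flags past sunset are dropped unless explicitly renewed.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Flag:
    name: str
    created: date
    sunset: date  # mandatory, set at creation time, never added later

def create_flag(name: str, created: date, ttl_days: int = 90) -> Flag:
    return Flag(name, created, created + timedelta(days=ttl_days))

def active_flags(flags, today, renewals=frozenset()):
    """Flags past their sunset date are removed by default,
    kept only if explicitly renewed."""
    return [f for f in flags if today <= f.sunset or f.name in renewals]

flags = [
    create_flag("new-checkout-rollout", date(2024, 1, 10)),
    create_flag("pricing-experiment", date(2024, 5, 1)),
]
print([f.name for f in active_flags(flags, date(2024, 6, 1))])
# prints ['pricing-experiment']
```

The design choice worth copying is that `sunset` is a required field: the flag cannot be created without a removal date, so nobody has to remember to schedule one.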


03. Test suite bloat


Feature bloat in the product maps to test suite bloat in the codebase. Every feature ships with tests, and tests for unused features continue to run on every CI build. CI spends just as long running the tests for a feature used by 2% of users as it does for one used by 80%.

The CI slowdown from test suite bloat is often invisible because it accumulates slowly. A test suite that runs in 12 minutes in year one runs in 28 minutes in year three, not because of any single change but because 50 small additions each added 20 seconds. The engineers who join in year three assume 28 minutes is normal. The engineers from year one have forgotten what 12 minutes felt like.

Removing a feature should always include removing its tests. This is obvious, but it is frequently skipped: the feature is removed from the UI while the tests continue to run, now testing code paths that no longer exist in the product, occasionally catching bugs in dead code that is not affecting any user.

Track CI duration over time. If it is increasing faster than team size, you have test suite bloat. Audit the test coverage for your bottom 20% of features by usage. Remove the tests when you remove the features.
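The "CI time growing faster than team size" check is a one-liner once you have the data. A sketch with invented quarterly samples; in practice the numbers would come from your CI provider's metrics.

```python
# Flag test suite bloat: CI duration growing faster than team size.
# The quarterly samples below are illustrative.

samples = [
    # (quarter, ci_minutes, team_size)
    ("2022-Q1", 12, 8),
    ("2023-Q1", 19, 10),
    ("2024-Q1", 28, 12),
]

def growth(first, last):
    return last / first

ci_growth = growth(samples[0][1], samples[-1][1])    # 28/12 ≈ 2.33x
team_growth = growth(samples[0][2], samples[-1][2])  # 12/8 = 1.5x

if ci_growth > team_growth:
    print(f"CI time grew {ci_growth:.2f}x vs team {team_growth:.2f}x:"
          f" test suite bloat likely")
```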


04. More features equals more incidents


This is the surface area equation. Each feature is a set of code paths. Each code path is a set of potential failure modes. More features means more code paths means more potential failure modes means a higher baseline incident rate.

The relationship is superlinear: a product with 100 features has more than twice the incident rate of a product with 50 features. It is approximately power-law because each new feature adds not just its own failure modes but also interaction failure modes with every feature it touches. A new feature that integrates with five existing features adds one set of intrinsic failure modes plus five sets of integration failure modes.
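A back-of-envelope model makes the superlinearity concrete. Counting the worst case, where any pair of features can interact, surfaces grow quadratically; real products sit somewhere below this bound, but well above linear.

```python
# Worst-case failure-surface count: n intrinsic surfaces plus one
# surface per pair of features that could interact (n choose 2).

def failure_surfaces(n: int) -> int:
    return n + n * (n - 1) // 2

print(failure_surfaces(50))   # 50 + 1225 = 1275
print(failure_surfaces(100))  # 100 + 4950 = 5050
```

Doubling the feature count from 50 to 100 roughly quadruples the surface count in this model, which is the shape of the argument: halving features buys more than half the reliability.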

This is why the simplest products have the best reliability records: fewer features means fewer interactions means fewer interaction failure modes. Basecamp's legendary uptime is not incidental to its limited feature set. It is partly caused by it.

For engineering leaders making the case for feature removal to the business: the incident-rate reduction from removing a set of underused features is a real financial benefit. Model it: (incidents per quarter) * (average incident resolution time) * (engineer cost per hour) * (proportion attributable to the feature set in question). The number is usually surprising.
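The model in the paragraph above, as a function. The example inputs are illustrative, not benchmarks.

```python
# Incident-cost model: incidents/quarter x avg resolution time x
# engineer cost/hour x fraction attributable to the feature set.

def quarterly_incident_cost(incidents_per_quarter: float,
                            avg_resolution_hours: float,
                            engineer_cost_per_hour: float,
                            attributable_fraction: float) -> float:
    return (incidents_per_quarter * avg_resolution_hours
            * engineer_cost_per_hour * attributable_fraction)

# e.g. 40 incidents/quarter, 6 hours each, £100/hour, 25% attributable
# to the underused feature set under review:
print(f"£{quarterly_incident_cost(40, 6, 100, 0.25):,.0f} per quarter")
# prints £6,000 per quarter
```

Note this counts resolution time only; lost revenue and on-call disruption would push the number higher.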


05. What CTOs should measure


Four measurements that tell you the engineering cost of feature bloat in your organisation:

Cycle time by feature area. If features in one part of the codebase consistently take longer to ship than features in other parts, the slow area is carrying technical debt. That debt is often the accumulated maintenance burden of underused features that nobody has removed.

Incident rate by feature area. Which areas of the product generate the most incidents per unit of user activity? High-incident areas are often high-feature areas: the enterprise customisation layer, the configuration system, the legacy integrations.

Ownership graph. Can every piece of code in production be attributed to a person or team who is responsible for it? Code without an owner is the technical equivalent of the feature without a champion: it will accumulate bugs and never be removed.

Feature removal velocity. How many features has the engineering team removed in the last 12 months? If the answer is zero, the codebase is growing in one direction only. A healthy team removes roughly as many features as it adds, measured by code volume.


06. When to refactor vs when to delete


The refactor-vs-delete decision is often framed as a technical question. It is actually a product question. You refactor when the feature is going to stay and needs to be maintainable. You delete when the feature does not justify its maintenance burden regardless of its current code quality.

The common failure mode is refactoring features that should be deleted. An engineer who is uncomfortable with the code quality of a low-usage feature proposes a refactor. The refactor is approved because clean code is a legitimate engineering value. The refactor is completed. The feature is now clean, well-tested, well-documented, and still used by 2% of active users. It is now harder to remove because it has just had investment.

The order of operations matters: run the usage analysis before approving any refactor. If the feature is in the bottom 20% of usage, the answer to 'should we refactor this?' is 'should we have this at all?' Ask the product team. If the product answer is 'this feature is going to stay,' approve the refactor. If the product answer is uncertain, park the refactor pending a product decision. If the answer is 'actually, nobody uses this,' delete it.
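The order of operations reduces to a small decision function. A sketch: the bottom-20% threshold comes from the text, while the `product_decision` values are an assumed encoding of the product team's answer.

```python
# Refactor-vs-delete, with usage analysis run before any refactor
# is approved. Thresholds and decision labels are illustrative.

def refactor_or_delete(usage_percentile: float, product_decision: str) -> str:
    """product_decision: 'keep', 'uncertain', or 'drop'."""
    if usage_percentile > 20:            # not in the bottom 20% by usage
        return "refactor"
    # Bottom 20%: the question becomes 'should we have this at all?'
    if product_decision == "keep":
        return "refactor"
    if product_decision == "uncertain":
        return "park pending product decision"
    return "delete"

print(refactor_or_delete(5, "drop"))       # delete
print(refactor_or_delete(5, "uncertain"))  # park pending product decision
print(refactor_or_delete(60, "keep"))      # refactor
```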

Engineers who delete code are doing valuable work. The PR that removes 3,000 lines of dead feature code is a contribution. It deserves the same recognition as the PR that ships a new feature.
