The debt that doesn't show

Tech debt has, for a long time, had a familiar shape. Some part of the codebase carries the cost of a shortcut: a workaround taken to ship, an abstraction skipped because the deadline was real, an integration done the quick way because the right way was too expensive. The team knows it is there because the code tells them, through the TODO comment, the workaround naming, the awkward shape that signals temporary, the function whose comment apologizes for itself. Debt was always the reckoning of a previous trade-off, and the trade-off left visible markers.

That bookkeeping system worked because hand-writing a workaround required articulation. The engineer who took the shortcut typed out the apology along with the shortcut, because typing the workaround forced a moment of explicit acknowledgment. The TODO was not a virtue; it was a side-effect of the writing process.

AI tools removed the side-effect. The model produces stylistically clean output whether the design is sound or not. There is no awkward shape to signal trouble, no TODO, no apologetic comment. The same code that would have looked rough by hand looks polished — and because the markers that used to make debt visible were artifacts of the writing process, removing the writing removed the markers. The debt did not go away. The trail of breadcrumbs that used to lead to it did.

Three things follow. They are uncomfortable in different ways, and most teams are still operating on the assumption that the older bookkeeping holds.

1. Debt accumulates invisibly

The first failure of the older model is that debt now grows without leaving a trail. A codebase can accrue substantial structural debt — code that commits to the wrong abstraction, violates an invariant other code is silently relying on, picks the third-best of several patterns — and read, on the surface, as clean. The lints pass. The diffs are tidy. The names match the team’s conventions. There is no flag to grep for.

This is not the same as saying the debt is undetectable. A senior engineer reading carefully can still find it; the failure modes that show up in practice — code that reads as locally correct while quietly violating an invariant somewhere else — are nameable, and a careful reader can spot them. The shift is that finding the debt now requires deliberate effort, where it used to surface itself. Pre-AI, debt advertised. Post-AI, debt hides.

The cost of this invisibility is structural. Teams used to estimate their debt load by ambient signal — the density of TODOs, the share of files that felt messy, the engineers’ qualitative sense of where the bodies are buried. None of those signals work the same way now. Teams consistently underestimate how much debt they carry, because the markers they used to count are not the markers debt now leaves. The debt is not lower; the team’s read on it is.

The remedy is structural too: stop relying on ambient signal, and start scheduling deliberate audits of the regions of the codebase where AI authorship is densest. The audit looks like reviewing a piece of code that has not changed recently, asking what it commits to, and writing down the answer. This work used to be unnecessary because the code told you what it committed to. It is now necessary because the code does not, and waiting for an incident to surface the answer is more expensive than asking it on a Tuesday.
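One way to choose where such an audit starts is sketched below. It assumes the team already has, for each file, an estimate of the share of lines that were AI-authored and the number of days since the file last changed; how those numbers are gathered is left open. The names `FileStats`, `ai_share`, and `days_unchanged` are hypothetical, and the scoring rule (AI share weighted by staleness, capped at one year) is an illustrative choice rather than a validated metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    path: str
    ai_share: float      # hypothetical input: fraction of AI-authored lines, 0.0 to 1.0
    days_unchanged: int  # days since the file was last modified


def audit_priority(stats: FileStats, staleness_cap_days: int = 365) -> float:
    """Score a file for audit: dense AI authorship and long-untouched code rank higher."""
    # Cap staleness so a very old file cannot dominate the score on age alone.
    staleness = min(stats.days_unchanged, staleness_cap_days) / staleness_cap_days
    return stats.ai_share * staleness


def pick_audit_targets(files: list[FileStats], limit: int = 3) -> list[str]:
    """Return the paths of the highest-priority files, best candidate first."""
    ranked = sorted(files, key=audit_priority, reverse=True)
    return [f.path for f in ranked[:limit]]


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        FileStats("billing/invoice.py", ai_share=0.9, days_unchanged=200),
        FileStats("auth/session.py", ai_share=0.2, days_unchanged=400),
        FileStats("search/index.py", ai_share=0.7, days_unchanged=30),
    ]
    print(pick_audit_targets(sample, limit=2))
```

The score only decides where to look first. The audit itself is still the manual step described above: read the code, ask what it commits to, and write the answer down.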

2. The cost migrated

The second failure is that the cost of carrying debt — the metric by which teams used to feel debt’s pressure — moved to a surface most teams are not looking at.

Pre-AI, tech debt’s cost was paid in implementation slowdown. Returning to a debt-laden region of the codebase meant working through the mess, fighting the workarounds, taking longer than the work would have taken if the area were clean. The slowdown was felt by the engineer typing, in real time, against the deadline. This made the cost legible: every senior engineer could give you a list of the parts of the codebase they hated to touch, and the list was the team’s debt register.

AI tools made implementation cheap regardless of how clean the underlying region is. The model plows through tangled code as easily as it plows through clean code. The implementation slowdown that functioned as the team’s debt-pressure signal stopped functioning. A team can carry significantly more debt than before and feel less of it, because the place the cost used to land is not where the cost lands now.

Call the pattern cost migration: the debt is unchanged, the surface where the cost shows up moved. Debt now shows up in:

Architectural reasoning. Working out what to build next, in a codebase whose contracts are not clean, takes longer than it used to. The slow part is not the typing; it is the modeling.
Review. Reviewing a change in a debt-laden region requires reconstructing what the surrounding code commits to, against a codebase whose commitments are unclear. Senior reviewer time per PR creeps up.
Debugging. Production failures in messy regions take longer to localize and longer to fix safely, because the blast radius is opaque.
Onboarding. New joiners take longer to become productive in regions that read clean but commit to inconsistent contracts, because the clean reading is misleading them.

These costs are real, and in aggregate they can exceed what the implementation slowdown ever cost. They are also paid in senior attention, which is the only currency a team has that does not scale with adopting better tools. A team that traded implementation tax for senior-attention tax made a worse deal than they realized; they are now paying their debt out of the smallest budget on the team, in a way that does not show up on any of the dashboards that measure team health.
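A minimal sketch of what a senior-attention register could look like follows. It assumes the team logs, by hand or from its review and incident tooling, how many minutes of senior time each region consumed and on which of the four activities listed above. The `AttentionEntry` type and its field names are hypothetical, not part of any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionEntry:
    region: str          # e.g. a top-level directory or service name
    activity: str        # "design", "review", "debugging", or "onboarding"
    senior_minutes: int  # senior time spent on this activity in this region


def senior_minutes_by_region(entries: list[AttentionEntry]) -> dict[str, int]:
    """Sum senior time per region, across all activities."""
    totals: dict[str, int] = defaultdict(int)
    for entry in entries:
        totals[entry.region] += entry.senior_minutes
    return dict(totals)


def costliest_regions(entries: list[AttentionEntry], limit: int = 3) -> list[tuple[str, int]]:
    """Return (region, minutes) pairs, most expensive region first."""
    totals = senior_minutes_by_region(entries)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]
```

The resulting list plays the role the "parts of the codebase we hate to touch" list used to play: it is a debt register, measured on the surface where the cost now lands.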

3. Refactoring does not reset the clock

The third failure is the most counterintuitive: the standard remedy for tech debt — refactoring — got cheaper to perform and worse at fixing the underlying problem.

The classical refactor has a reliable shape. A senior identifies a debt-laden region, writes a plan, the team rewrites it deliberately, and the result is code the team understands. The understanding was the actual deliverable. The clean code was a side-effect of the team having thought it through carefully.

AI-assisted refactoring inverts the deliverable. The clean code now arrives quickly and the understanding does not necessarily follow. A senior can ask the model to refactor a tangled module and get a tidy replacement in an afternoon. The replacement is, on its surface, less tangled. The team’s understanding of it is, in many cases, lower than their understanding of what was there before, because nobody worked through the rewrite line by line. The team has paid down the visible form of the debt and acquired invisible debt of a new kind in its place — code that is plausible-but-unfamiliar, written in a style nobody on the team chose, committing to contracts nobody on the team articulated.

The AI-refactored region now looks less debt-laden and is more unowned, which is a different problem and often a worse one. The classical debt at least had authors who could be asked. The refactored replacement has no author the team can ask. The team’s grasp on the region got weaker, not stronger.

Call the pattern debt rotation: the team did not pay down debt, they swapped one kind for another, and the new kind hides better. The lines of code dropped, the complexity metric improved, the dashboard nodded. The team’s actual ownership of the codebase declined.

This is not an argument against refactoring with AI assistance. It is an argument that the older remedy, applied with the new tool, requires explicit re-authoring discipline to actually pay down debt. The senior who runs the refactor has to read the output line by line, articulate the contracts the new code commits to, write down the design rationale, and treat the AI-produced version as a draft to be reviewed rather than a deliverable to be accepted. Without that discipline, the refactor is not paying down debt; it is rotating it.
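The discipline can be made checkable with something as small as the record sketched below. This is one possible shape, assuming the team is willing to keep a short written record per refactor; `RefactorRecord` and its fields are hypothetical names, and the example contract in the usage note is invented.

```python
from dataclasses import dataclass, field


@dataclass
class RefactorRecord:
    region: str
    owner: str = ""                                     # who read the output line by line
    contracts: list[str] = field(default_factory=list)  # what the new code commits to
    rationale: str = ""                                 # written design rationale

    def missing_steps(self) -> list[str]:
        """List the re-authoring steps that have not been done yet."""
        missing = []
        if not self.owner.strip():
            missing.append("named owner who read every line")
        if not self.contracts:
            missing.append("articulated contracts")
        if not self.rationale.strip():
            missing.append("written design rationale")
        return missing

    def is_reauthored(self) -> bool:
        """True only when every re-authoring step is recorded; otherwise it is still a draft."""
        return not self.missing_steps()
```

A record created as `RefactorRecord("billing")` reports three missing steps and is still a draft. It counts as re-authored only once an owner, at least one contract (for example "invoice totals do not change after posting"), and a rationale are filled in.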

Most teams do not yet do this. The speed dividend of AI refactoring is too tempting, and the result is a generation of refactors that look successful and have, in fact, traded one debt for another at par.

Closing

Three assumptions that used to be roughly true about tech debt no longer hold. Debt was visible; it isn’t. Debt slowed implementation; it doesn’t, much, because implementation isn’t the bottleneck anymore. Refactoring paid down debt; it does, sometimes, but not by default — and the default is increasingly the failure mode.

The decisions that fall out of operating on the older assumptions all sound reasonable from inside the team. Our debt load is low; the codebase looks clean. Our team is fast; implementation has never been quicker. Our refactors are working; the metrics improved. None of these conclusions follows from the evidence offered for it. The codebase is increasingly opaque about its own debt, the team’s reading of it is increasingly out of date, and the discrepancy compounds quietly until something hard arrives (an architectural decision, a difficult debugging session, an onboarding that takes twice as long as expected) and the unaccounted debt finally shows up on the surface where it now lives.

The remedy is unglamorous: name what changed, stop applying the older bookkeeping, and start the new work the new conditions require. Audit the AI-dense regions deliberately. Track senior-attention cost in the regions where you used to track implementation cost. Treat AI refactors as drafts that require re-authoring, not deliverables. None of this is hard. It is, however, additional work the team’s velocity narrative is not making room for, and finding the room requires admitting that the velocity narrative is reading a number whose meaning shifted under it.