The Stack

Using Agents, Keeping Agency.

Devin Dickerson 15 Jun 2026 6 min read

I’ve been thinking about how many decisions I don’t make independently anymore. The series I queue up because it was the top suggestion. The brand I bought because it sat at the top of the list. The answer I home in on because the chatbot serves it up confidently. Most are harmless, and almost all I can walk back: return the order, skip the show, run the search again.

The trouble starts where the recommendation hardens into something you can't easily reverse. And nowhere does that happen faster, or with less ceremony, than with a coding agent.

Kick off a new project and count the technical decisions that fly by in the first twenty minutes. Language, framework, database, auth provider, hosting target, testing. Before coding agents, each got made on purpose, weighed for tradeoffs, cost, and maintainability, even if the call turned out wrong.

Are we still weighing them that way? My conversations with developers and product leaders, plus my own project work, land on "sometimes." I remember one of my early agentic projects where the agent reached for a popular backend-as-a-service (BaaS) and I waved it through. The domain, though, turned out document-shaped and stubbornly polymorphic, the kind of data that fights being bolted into a relational table as JSON blobs. By then my "swap it later" had quietly become "rebuild auth, the generated API, and storage," because the convenient default wasn’t just a database. Now I was paying interest on a decision I hadn’t put through its paces.

On a weekend project, that's a shrug. But when I share this anecdote with enterprise and big-tech engineers, and I've talked to hundreds of them, it still lands: even in those environments, they recognize it. Under pressure to prove AI makes us faster, we move faster. But is speed you got by skipping a decision actually speed, or just a loan?

The Internet's Default Settings

So if agents are making stack decisions, where do the suggestions come from? An agent's instinct for your architecture is a statistical echo of its training data. This isn't a hunch; researchers have measured it. A 2025 study of eight LLMs found they overuse mainstream libraries even when the task doesn't call for them, reaching for NumPy unnecessarily in up to 48% of cases and defaulting to Python in 58% of high-performance setups where it was the wrong tool. Rust, the better fit for several of those tasks, came up zero times. The authors are blunt about the cost: models that "prioritise familiarity and popularity over suitability" introduce security vulnerabilities and technical debt.

The mechanism is a flywheel. Ubiquity means more code, more docs, more forum threads, which become more training data, which sharpens the preference in the next model update. Need a backend? Here's the usual one, regardless of your use case.

And the loop is no longer organic. There's now a generative engine optimization industry whose entire product is making AI recommend yours; the consumer version, where a company's own "best platform" listicles get cited straight back by ChatGPT, is already documented. The dev-tool version is quieter but the same play: vendors ship llms.txt files so agents read their docs first, with Mintlify auto-generating them at zero configuration and Fern treating them like robots.txt. None of this is sinister; it's rational marketing for the agentic era. But the recommendation in your chat window isn't a neutral survey of your options. It's the winner of history's most sophisticated popularity contest.

Two-Way Doors and One-Way Doors

Not every decision carries the same weight, and the split between what you can walk back and what you can't has a name. Bezos famously called them two-way doors and one-way doors: the decisions you can undo cheaply, and the ones you have to live with. Your styling library is a two-way door. Your data model, your auth provider, your tenancy boundary, your compliance posture are one-way doors.

But the doors don't stay put. A two-way door calcifies as code piles against it, and the agent's favorite move is to bundle several one-way doors behind one convenient pick. The backend I waved through wasn't one reversible call. It was a data model and an auth provider arriving as a single line in a chat, filed as a two-way door because that's the shape it showed up in.
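One way to make the bundling visible is to expand a single pick into the decisions that ride along with it and flag the one-way doors among them. The sketch below is illustrative only: the `BUNDLES` catalog, the `ONE_WAY` set, and the `one_way_doors` helper are hypothetical names, and the entries are assumptions drawn from the examples in this piece, not a real inventory.

```python
# Hypothetical catalog: the decisions that arrive with a single pick.
BUNDLES = {
    "all-in-one BaaS": ["data model", "auth provider", "file storage", "generated API"],
    "styling library": ["styling"],
}

# Decisions this article treats as hard to reverse (an assumption for the sketch).
ONE_WAY = {"data model", "auth provider", "tenancy boundary", "compliance posture"}


def one_way_doors(pick: str) -> list[str]:
    """Return the one-way doors hidden behind a pick, in catalog order."""
    # A pick with no catalog entry is treated as a single decision of its own.
    decisions = BUNDLES.get(pick, [pick])
    return [d for d in decisions if d in ONE_WAY]


for pick in BUNDLES:
    doors = one_way_doors(pick)
    label = ", ".join(doors) if doors else "none"
    print(f"{pick}: one-way doors = {label}")
```

The point is not the lookup table itself but the habit it encodes: before accepting a pick, list what it actually decides.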

On the two-way doors, I say let the agent rip. This is where popular stacks earn their reputation: they're battle-tested, well documented, and rich in worked examples. Building a to-do app on a Saturday? The default is fine, and arguing with it wastes the Saturday.

The trouble starts when "fine for most projects" meets a one-way door in yours. The default doesn't know your compliance regime, your data model, your scale curve, or what your team already runs in production, and those are the variables that pick winners. They live in your world, not the training data. Even if you paste them into a prompt, the agent grasps the shape of your problem without knowing your problem, and that gap is just as real on day one of a greenfield build as on the next migration in a system someone else stood up a decade ago.

The deeper problem isn't that the default is always wrong, because sometimes it's right. It's that on the one-way doors nobody actually decided: a choice with years of consequences gets made in the gap between what the agent knew and what it needed to know, with no tradeoff ever surfaced. And this is where a gamed default does its damage: today’s version of agentic SEO, rather than your constraints, is picking what you ship.

The Decision Layer

The standard rebuttal is operator discipline: ask the agent for three options and the tradeoffs. It kind of works, on the days you remember. But prompting is a habit, and habits don't scale across a team or a Friday afternoon. Every chat window is also its own seam, the kind I wrote about in "In the Seams": context renegotiated each session, yesterday's discipline gone with the context window. The real issue is altitude. Agentic speed makes it trivial to collapse an architecture decision into the implementation and keep moving, which is exactly how a one-way door gets walked through with nobody deciding. These calls belong above the session, made deliberately and owned, not improvised at 4:55 before the weekend.

Above the session is only half of it; the decision then has to live somewhere the agent can't miss it. Record the choice, the rejected alternatives, and the constraints that settled it in the project's durable, shareable context: the architecture decision record, the agent skill, and the agentic memory that the platform carries between sessions. That context is not write-once. It has to be maintained, kept current, and occasionally blown up and rebuilt, or it rots into noise the agent learns to ignore. Cloudflare's engineers hit this rebuilding Next.js with agents. Building against Next.js's own test suite, the agents reproduced its past CVEs along with its features, having absorbed the familiar patterns, vulnerable ones included, and the passing tests didn't surface them. Cloudflare's own takeaway was the missing piece: a durable instruction pointing at the right corpus, "review all past CVEs and ensure we're not vulnerable."

The record is never final, because the project argues back. Agentic development exposes flawed early assumptions fast: the scale curve you guessed wrong, the tenancy model that didn't survive contact, the auth choice that boxed you in. The decisions worth capturing are the ones worth revisiting, and the durable context is where that revision lands, so the next session starts from what you learned instead of what you first assumed.
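As a rough illustration of what such a record might hold, here is a minimal sketch of a decision record that captures the choice, the rejected alternatives, the constraints that settled it, and the conditions for revisiting it, then renders plain markdown that could sit in a repo where an agent reads it. The `DecisionRecord` class, its field names, and the sample values are all assumptions made for this example, not a standard ADR schema or any platform's format.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    """One architecture decision, written for both humans and agents (hypothetical shape)."""

    title: str
    choice: str
    one_way_door: bool
    constraints: list[str] = field(default_factory=list)
    rejected: dict[str, str] = field(default_factory=dict)  # alternative -> why not
    revisit_when: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the record as a short markdown section."""
        door = "one-way" if self.one_way_door else "two-way"
        lines = [f"## {self.title}", f"Decision: {self.choice} ({door} door)"]
        lines += ["", "Constraints that settled it:"]
        lines += [f"- {c}" for c in self.constraints]
        lines += ["", "Rejected alternatives:"]
        lines += [f"- {name}: {why}" for name, why in self.rejected.items()]
        lines += ["", "Revisit when:"]
        lines += [f"- {trigger}" for trigger in self.revisit_when]
        return "\n".join(lines)


# Sample values are invented for illustration.
record = DecisionRecord(
    title="Primary datastore",
    choice="document database",
    one_way_door=True,
    constraints=["domain objects are polymorphic documents"],
    rejected={"bundled BaaS": "couples data model, auth, and storage in one pick"},
    revisit_when=["the data turns out to be mostly relational"],
)
print(record.to_markdown())
```

Keeping the "revisit when" list next to the decision is a deliberate choice in this sketch: it gives the next session a concrete trigger for reopening the call instead of treating the record as settled forever.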

And the opinionated part of this isn't always project-specific. The constraints, the tradeoff methodology, the right-tech-for-the-workload calls your org has already settled are organizational, and they don't belong exclusively in the IDE. Bake them into context architecture that lives above the implementation and can be shared across teams: plug-and-play agent skills, reference patterns, an MCP server exposing a context database every team's agents can read.

You Own the Result, So Don't Delegate the Why

The agentic era will keep tilting toward defaults. The training loop rewards popularity, and the pressure to prove AI productivity isn't going anywhere. So decide where your one-way doors are, make those calls deliberately, and write the decision down where your agents will read it tomorrow. You can make sensible tradeoffs between speed and rigor on everything else.

You can walk back most defaults, but the one-way door is the exception: the default you accept there doesn't get returned, it gets imported, depended on, and shipped. This is a place where speed and convenience aren’t free. Your agency is the one variable here you still control. The agent should be expanding your option space, not quietly collapsing it. If you delegate the how, make sure you still own the why.
