After 3,400 commits in the past year, most of them shipped through some combination of Claude Code, Codex CLI, PI, Gemini CLI, and OpenCode running alongside each other on the same working tree, I’ve formed strong opinions about what this is and what it isn’t. It is not autocomplete. It is not a “productivity boost.” It is a new collaborator model, and the teams that are treating it like autocomplete are leaving most of the value on the floor.

What autocomplete was

The autocomplete framing — Copilot’s original pitch, roughly — was: you’re still the engineer, the AI is a faster keyboard. You think, you type, it completes. The loop is tight, the context is local, the output is line-by-line.

That was useful. It is now a small fraction of what the current generation of tools does.

What the collaborator model is

A modern coding agent (Claude Code, Codex, Gemini, Cursor, OpenCode, whatever) is not completing your line. It’s taking on a task — sometimes with 50+ files in its context, sometimes running a shell, sometimes iterating for minutes, sometimes pausing to ask you a clarifying question it actually needed you to answer.

That’s not autocomplete. That’s a junior-to-mid engineer who works at 40× speed, has full read access to your codebase, and will happily do 90 minutes of research while you’re in a meeting.

The right analogy is having a pair programmer who’s cheap enough to run in parallel. And the implication of that analogy is: you need to plan the work differently than you did when you were line-completing.

Five things I changed in how I work

1. I spec before I implement, always

When I’m the one writing the code, I can sort of wing it. A lot of my best code is the kind that emerges from editing itself — I type a bad version, I look at it, I refactor, I end up somewhere decent.

That doesn’t work with an agent. An agent given a vague spec produces a vague implementation, plus three plausible-looking abstractions, and costs you the review time for all of them. You have to front-load the clarity. “Build a thing that does X, with these constraints, using this existing pattern from file Y, and do not add a general-purpose abstraction unless I ask for one.”

My specs are now an order of magnitude more explicit than they were when I was writing the code myself. That alone has made my code better, not just my AI interactions.
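To make “an order of magnitude more explicit” concrete, here’s a sketch of the kind of spec I mean, written as a script that drops a spec file into the working tree. The task, file paths, and section headings are entirely hypothetical — my own convention, not any tool’s format — but the shape is the point: named constraints, a pointer to an existing pattern, and an explicit done-when.

```shell
#!/bin/sh
# Hypothetical spec template -- the endpoint, file names, and headings
# below are illustrative placeholders, not a real project's layout.
cat > SPEC.md <<'EOF'
## Task
Add rate limiting to the /api/export endpoint.

## Constraints
- Reuse the token-bucket pattern from middleware/throttle.go.
- Do not add a general-purpose abstraction.
- 429 responses must include a Retry-After header.

## Done when
- `make test` passes.
- The diff touches only middleware/ and routes/export.go.
EOF
```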

2. I read the diff carefully. Every time.

There is a strong temptation, when the agent produces 500 lines of working-looking code, to skim it and accept. Do not skim. The 500 lines will frequently contain:

  • An abstraction you didn’t ask for.
  • A comment describing code that used to be different.
  • An invented function name that doesn’t match your codebase’s naming convention.
  • A “TODO” the agent wrote to itself and then didn’t do.

None of those are bugs in the agent. They’re the cost of delegation. The discipline is the same one you’d apply to a PR from a junior engineer: read it as if you’d have to maintain it forever, because you will.

3. I run multiple agents in parallel for unrelated tasks

This is the single biggest unlock. I run one agent in one tmux pane on a backend refactor, another in a second pane on a docs pass, another in a third pane evaluating a library. They don’t collide because I scoped them to disjoint parts of the repo. The swap between panes is where I’m doing the actual senior work — reviewing, directing, rejecting, pasting context between them.
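The scoping discipline above can be sketched in a few lines of shell. The `echo` commands are placeholders for whatever agent CLI actually runs in each pane, and the directory names are assumptions — the point is that each task starts in its own disjoint subdirectory with its own log, so nothing collides.

```shell
#!/bin/sh
# Sketch of three parallel, disjointly-scoped tasks. In practice each of
# these lives in its own tmux pane; the background-job form shows the
# same scoping without tmux. The echo commands stand in for agent CLIs.
run_scoped() {
    dir=$1; shift
    mkdir -p "$dir"
    # Run the task from inside its directory, log output next to it.
    ( cd "$dir" && "$@" ) > "$dir.log" 2>&1 &
}

run_scoped backend  echo "refactor pass"    # pane 1: backend refactor
run_scoped docs     echo "docs pass"        # pane 2: documentation
run_scoped research echo "library eval"     # pane 3: library evaluation
wait    # all three run concurrently; block until they finish
```

The swap-between-panes work — reviewing, directing, rejecting — is still yours; this only buys the concurrency.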

The skill that multiplies here is work decomposition. Identifying three hours of genuinely parallel work — not faux-parallel, actually parallel — and setting it up with three agents is a weirdly specific skill that nobody taught me, and now I do it reflexively.

4. I keep a shared-skills layer across all agents

The different CLI agents (Claude Code, Codex, Gemini, PI, OpenCode) have different quirks. What they share is the project context — the conventions, the rule files, the test commands. I maintain that shared context in one place (rice-shared-skills) and every agent pulls from it. When I update a coding standard, every agent sees it within minutes.

This is the bit that makes multi-agent feasible. Without it, each agent has its own idiosyncratic view of your project, and you spend your senior time reconciling drift.
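The mechanics of a shared-skills layer can be as simple as symlinks. A minimal sketch, assuming hypothetical per-agent config directories — the actual paths each real CLI reads differ, so treat `.claude`, `.codex`, and `.gemini` here as stand-ins:

```shell
#!/bin/sh
# One canonical conventions file, symlinked into each agent's (assumed)
# config directory. Editing the shared file updates every agent's view.
SHARED=shared-skills
mkdir -p "$SHARED"
printf 'Run tests with: make test\n' > "$SHARED/CONVENTIONS.md"

for agent in claude codex gemini; do
    mkdir -p ".$agent"
    # Relative link: resolved from inside .$agent, so ../$SHARED is right.
    ln -sfn "../$SHARED/CONVENTIONS.md" ".$agent/CONVENTIONS.md"
done

# One edit to the shared file, and every agent sees the change:
printf 'Lint with: make lint\n' >> "$SHARED/CONVENTIONS.md"
```

Because every agent reads through the same link, there is exactly one place where drift can happen: the shared file itself.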

5. I swap agents mid-task when one gets stuck

Claude Code might get stuck in a loop reasoning about a particular refactor. I switch the same context to Codex, paste the state, and often get unstuck in minutes. Not because Codex is “smarter” in absolute terms — they’re all roughly comparable — but because the local minima of any individual model are different, and a different model with the same context escapes them cheaply.

This is one of the reasons I wrote raise: it swaps between 17 AI tools via atomic symlink swaps, preserving credentials, because manually reconfiguring each one’s auth and settings was eating my day.
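The atomic-symlink-swap pattern itself is small enough to sketch. This is the general technique, not raise’s actual code — the profile layout is a placeholder and the credential-preservation part is skipped entirely. The trick is to create a temporary symlink and rename it over the old one, because rename is atomic: no process ever observes a half-swapped state.

```shell
#!/bin/sh
# Atomic symlink swap between (placeholder) tool profiles. rename(2) is
# atomic on POSIX filesystems; -T is GNU coreutils, treating the
# destination as a plain file so the old symlink is replaced in place.
mkdir -p profiles/claude profiles/codex

swap_profile() {
    ln -s "profiles/$1" active.tmp
    mv -T active.tmp active
}

swap_profile claude    # active -> profiles/claude
swap_profile codex     # active now points at codex, swapped atomically
```

The alternative — deleting the old link and then creating the new one — leaves a window where `active` doesn’t exist, which is exactly the half-configured state the rename avoids.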

What this is not

A few things the collaborator model is not:

  • It is not a replacement for senior judgment. The agent generates; you review. If you can’t review credibly, you are publishing random code to production. The bar for “I can’t review this” should be “this PR doesn’t ship,” same as with a human.
  • It is not free. I pay for it, and the cost is non-trivial. The teams dismissing LLM-assisted engineering because of per-seat cost are doing spreadsheet math on the autocomplete framing and missing that the agent does 2-3 hours of work in 15 minutes on the tasks where it fits.
  • It is not uniform across task types. The agent is great at refactoring, writing boilerplate, generating tests, explaining unfamiliar code, and writing documentation. It is middling at architecture decisions, product judgment, and anything that requires negotiating with humans. Know the asymmetry.

The CV implication

If you’re hiring senior engineers in 2026 and the interview loop doesn’t probe how candidates work with AI tools, you are not interviewing for the actual job. The actual job is “can this person direct an agent’s work, recognise when the agent is wrong, and ship credible code on that basis.” Candidates who can’t will be slower than candidates who can, in ways that compound over quarters.

I’ll say the quiet part: this is where a lot of my currently-between-seats bandwidth is going. Running real workloads through real AI-assisted pipelines, writing down what works and what doesn’t, shipping tools that make the pattern more reliable. If you’re hiring for this skill, I’m here — and I have the commit graph to back up the claim.