Engineering in the AI Era
Summary
External and internal thinking on how AI agents change the nature of software engineering work — not just velocity, but what “good code” means, where the bottleneck now lives, what skills atrophy when offloaded, and what business-model survival looks like when code itself is commodified. Common thread: AI amplifies whatever is already there. Good processes get dramatically better; poor processes accumulate debt and slop faster. The work is in re-deriving the standard, not just running the existing standard faster.
Key Points
- Code generation is no longer the bottleneck — comprehension and review are. Intercom-style high-adoption teams ship 98% more PRs but review takes 91% longer (Osmani). The bottleneck didn’t disappear; it moved upstream. The implication is that effort should shift from execution toward defining success criteria and verification (~70/30 declarative vs imperative, per Osmani).
- “Good code” needs re-derivation, not reapplication. Hand-coded quality bars often don’t survive contact with AI-generated code: strict types are worth keeping for the iteration substrate they create, but additional tidying may be wasted effort. Some duplication is already accepted; the question is what else is now okay that wasn’t before. Architecture stays critical — sustainability (AI + humans can keep iterating without things getting worse) is the new test (Tim, May 7).
- Failure modes have shifted from syntactic to conceptual. AI agents now fail at assumption propagation, abstraction bloat, dead code accumulation, sycophantic agreement (Osmani). Reviewing surface-correct code is not the same skill as writing it — “rubber stamping” is the new risk.
- Skill atrophy is measurable, not theoretical. Anthropic RCT with 52 devs learning a new Python async library: the AI-assisted group scored 17 percentage points lower on comprehension (50% vs 67%) without task-completion gains. How the AI is used is the variable: high performers asked for explanations and posed conceptual questions, while low performers exhibited cognitive offloading. This has direct implications for junior-engineer deployment patterns.
- Bimodal adoption, not a smooth curve. Teams split between early adopters shipping dozens of AI PRs/day and the majority still hand-writing (Osmani). The middle is thin. Implies that “average uplift” numbers hide what’s actually happening on the ground.
- AI software factories are the next evolution of TDD. Humans write spec + tests defining success; agents generate implementation and iterate until tests pass. Some companies’ repos now contain no handwritten code, only specs and test harnesses (Diana Hu / YC). StrongDM is an example: scenario-based validations drive agents until a probabilistic satisfaction threshold is hit. The “1000x engineer” framing is really “one engineer surrounded by a system of agents.”
- Where this works vs where it doesn’t. Greenfield, MVPs, small teams — strong fit. Legacy systems with complex invariants — weak fit, and the failure modes are quietly expensive (Osmani). Sitemate’s own agentic engineering strategy lives on the legacy side of this line, which is why guardrails + instrumentation + foundations matter so much.
- Business-model implications are existential, not incremental. McCabe (Intercom): “The only way to have a place in the future is to destroy your past.” Intercom dedicated 80% of R&D to Fin despite its initial single-digit revenue contribution and deliberately cannibalised ~$400M ARR; Fin is approaching $100M. Half-measures fail.
- Token-maxing, not headcount. Diana Hu / YC: be willing to run an uncomfortably high API bill — it’s replacing far more expensive headcount. One person with AI tools = what used to take a team. This reframes the cost conversation entirely.
- Conviction can’t be outsourced. “You need to develop it yourself by actually sitting with coding agents and using them until you start to break your own priors about what is now possible to build” (Diana Hu). Same April/May finding from Tim’s own work: leaders must stay in the tools — can’t delegate curiosity in a space moving this fast.
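The software-factory loop described in the key points above (humans define success, agents iterate until a satisfaction threshold is met) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any named company implements it: `Scenario`, `satisfaction`, `run_factory` and the three `attempt_*` functions are hypothetical names, the "agent" is simulated by a fixed list of candidate functions, and the threshold is a plain pass fraction rather than a true probabilistic measure.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Assumption: a candidate is a plain function. In a real factory it would be
# agent-generated code running in a sandbox.
Implementation = Callable[[int, int], int]


@dataclass(frozen=True)
class Scenario:
    """One human-written check: inputs plus the expected result."""
    name: str
    args: tuple
    expected: int


def satisfaction(impl: Implementation, scenarios: list) -> float:
    """Fraction of scenarios the candidate passes; exceptions count as failures."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    passed = 0
    for scenario in scenarios:
        try:
            if impl(*scenario.args) == scenario.expected:
                passed += 1
        except Exception:
            pass  # a crash is just a failed scenario
    return passed / len(scenarios)


def run_factory(
    candidates: Iterable[Implementation],
    scenarios: list,
    threshold: float,
    max_attempts: int,
) -> tuple:
    """Try candidates in order until one meets the threshold or attempts run out.

    Returns (accepted implementation or None, best score seen, attempts used).
    """
    best_score = 0.0
    attempts = 0
    for impl in candidates:
        if attempts >= max_attempts:
            break
        attempts += 1
        score = satisfaction(impl, scenarios)
        best_score = max(best_score, score)
        if score >= threshold:
            return impl, score, attempts
    return None, best_score, attempts


# Human-owned spec: integer division that returns 0 when dividing by zero.
scenarios = [
    Scenario("exact", (6, 3), 2),
    Scenario("truncates", (7, 2), 3),
    Scenario("divide by zero", (5, 0), 0),
    Scenario("zero numerator", (0, 4), 0),
]


# Stand-ins for successive agent attempts (hypothetical).
def attempt_1(a: int, b: int) -> int:
    return a / b  # wrong: float division, crashes on zero


def attempt_2(a: int, b: int) -> int:
    return a // b  # closer, but still crashes on zero


def attempt_3(a: int, b: int) -> int:
    return a // b if b else 0


candidates = [attempt_1, attempt_2, attempt_3]
impl, score, attempts = run_factory(candidates, scenarios, threshold=1.0, max_attempts=5)
print(score, attempts)  # prints "1.0 3"
```

The design point is that the human-owned artefacts are `scenarios` and `threshold`, while the implementations are disposable. Lowering the threshold to 0.75 would accept `attempt_2` with a known failing scenario, which is the kind of trade-off the legacy-system caveat above warns about.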
Implications for next chapter
- The blog series Tim is drafting (outputs/2026-05-15-bottleneck-moved-upstream-macro-thesis.md, 2026-05-18-agent-velocity-blog-draft.md, 2026-05-18-four-modes-ai-assisted-work-brainstorm.md) sits on this cluster: the macro thesis is the synthesis of these external sources plus Sitemate’s own evidence in agentic engineering strategy.
- The operational answer to “so what does the org actually do differently?” is increasingly the Forward Deployed Engineer role pattern. Vasuman’s “intelligence is commoditising, deployment is the edge” frames the FDE motion that AI labs and Applied AI shops are scaling now, and gives this macro thesis a concrete role-shaped answer rather than just a vibe shift.
- For incoming CTO / VPE / fractional conversations, the talking points are: comprehension debt is the real risk, declarative specs + automated verification are the operating model, and the org has to absorb the shift (see Org Design in the AI Era) — not just buy seats.
- For junior-engineer programs, the Anthropic skills study is a load-bearing reference. AI deployment should be designed to preserve learning, not just maximise throughput. The same RCT is now defensible evidence inside any “should we let juniors use Claude unsupervised” conversation.
- The “where it works vs where it doesn’t” line is the most important thing to communicate to non-engineering stakeholders. Legacy systems with complex invariants don’t get the same uplift as greenfield, and pretending they do produces slop.
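When the Anthropic comprehension figures from Key Points (50% vs 67%) are quoted in these conversations, it helps to be explicit about whether the gap is absolute or relative. A quick check using only those two figures; both are rounded in this note, so the relative number is approximate, and the variable names are illustrative:

```python
# Comprehension scores as quoted in Key Points (rounded percentages).
unassisted_score = 67  # comparison group
assisted_score = 50    # AI-assisted group

# Absolute gap: difference in percentage points.
absolute_gap_points = unassisted_score - assisted_score

# Relative gap: how much lower the assisted score is, as a share of the
# comparison group's score.
relative_gap_pct = absolute_gap_points / unassisted_score * 100

print(f"absolute gap: {absolute_gap_points} percentage points")
print(f"relative gap: {relative_gap_pct:.1f}% lower")
```

On these figures the gap is 17 percentage points, or roughly a quarter lower in relative terms, so stating which one is meant avoids understating or overstating the result.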
Open threads
- OpenAI Codex Symphony: fetch was blocked with HTTP 403 on May 13. Worth re-capturing in a browser; the open-source orchestration framework is potentially relevant to the software-factory pattern above.
Sources
- 2026-05-07 — Redefining “good code” in the AI era — Tim’s own thinking on the quality bar, slop vs gold-plating, sustainability test
- Intercom) — 80% R&D to Fin, deliberate cannibalisation, $400M ARR result
- 2026-05-15 — How AI assistance impacts coding skills formation (Anthropic RCT) — 17 percentage points lower comprehension, cognitive offloading, junior-engineer implications
- 2026-05-18 — The 80% Problem in Agentic Coding (Addy Osmani) — Evolution of mistakes, comprehension debt, productivity paradox, slopacolypse
- YC) — AI software factories, token-maxing, conviction
- 2026-05-15-bottleneck-moved-upstream-macro-thesis.md — Tim’s macro-thesis draft built on this cluster
- 2026-05-18-agent-velocity-blog-draft.md, draft 2 — blog drafts
Last Updated
2026-05-19