Agentic Engineering Strategy

Summary

Sitemate’s strategic vision for how product, design, and engineering must transform in the AI era, captured across four internal blog posts by Hartley (Jun 2025 – Apr 2026) and Tim’s own brainstorm on accelerating the organisational shift. The arc: AI has collapsed the cost of code → agile is now a bottleneck → the company must go Forward Deployed → the system around agents is the competitive advantage → the whole company runs on markdown (sitemate.md).

Key Points

The shift

Unit value of code has converged to zero — process latency is now the only real cost
PMF is now a treadmill (Elena Verna) — continuous, not a destination
Competition is 3D: traditional competitors + citizen developers everywhere
User expectations compressed: on-prem (years) → cloud (months) → AI era (weeks)
Must be 10X faster, 10X cheaper, 10X better than all alternatives

The system (6 pillars from “Agentic Engineering” post)

Context — repo-level, system-wide, decision records, product docs. Must be agent-discoverable. Wrong context is worse than missing context.
Spec files — source of truth for problem/solution/constraints/verification. Engineer writes, agent executes.
Design — codified in code, not implied. Figma no longer primary source of truth.
Codebase & architecture — shared types/schemas, clean boundaries, reduced tech debt, clear APIs.
Automated testing — the agent’s feedback signal. Coverage directly proportional to autonomous work capacity.
Delivery infrastructure — fast CI/CD, safe deploys, feature flags, observability. Minutes from PR to production. The CI/CD environments and release best-practices note synthesises industry practice on environment topology, release patterns, and progressive delivery: three fixed envs (dev/preview → staging → prod), promote-the-artifact, ephemeral preview envs per PR (the highest-leverage addition for agent-era work; mirrors Stripe Minions / Ramp Inspect sandboxes), risk-tiered gates, blue-green or canary, feature flags + auto-rollback.
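A minimal sketch of how risk-tiered gates might be encoded. The tier names, the 200-line threshold, the path prefixes, and the gate lists are all hypothetical placeholders, not taken from the best-practices note:

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOW = "low"        # e.g. copy or style-only changes
    MEDIUM = "medium"  # ordinary feature work
    HIGH = "high"      # migrations, auth, billing


@dataclass(frozen=True)
class Change:
    files: tuple[str, ...]
    lines_changed: int


# Hypothetical path prefixes that always force the strictest gate.
HIGH_RISK_PREFIXES = ("migrations/", "auth/", "billing/")


def classify(change: Change) -> Tier:
    """Pick a risk tier from the touched paths and the size of the change."""
    if any(f.startswith(HIGH_RISK_PREFIXES) for f in change.files):
        return Tier.HIGH
    if change.lines_changed > 200:  # assumed threshold
        return Tier.MEDIUM
    return Tier.LOW


# Gates each tier must pass before the same artifact is promoted to prod.
GATES = {
    Tier.LOW: ["ci", "preview-env smoke test"],
    Tier.MEDIUM: ["ci", "preview-env smoke test", "staging soak", "canary"],
    Tier.HIGH: ["ci", "preview-env smoke test", "staging soak", "canary", "human approval"],
}


def required_gates(change: Change) -> list[str]:
    return GATES[classify(change)]
```

The point of the shape is that the gate list is data: tightening or loosening a tier is a reviewed one-line change rather than a pipeline rewrite.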

Forward Deployed model (10 stages)

Continuous Listening — AI agents analyze every channel
Curation — most strategic role; filters signal from noise
Autonomous Wireframe Generation — AI drafts first versions
Self Selection — engineers choose what to pursue (biggest cultural shift)
POC Creation — harden, apply craft
Strategic Alignment Check — leadership keep/kill
Design Alignment Check — system coherence
Engineering Integration — senior engineers stress-test
Shipping — becomes a non-event
Closing The Loop — hypothesis + metric + time window

Forward Deployed Engineers (FDEs)

Builders embedded in revenue/adoption surface — compress insight → shipped value
CSMs become narrower (activation, education, retention)
Every high-context engineer becomes a revenue team member
Naming note: this is Hartley’s internal, product-flow meaning of “Forward Deployed”: an org pattern where Sitemate engineers sit close to the revenue surface. The external FDE role popularised by Palantir/Anthropic/OpenAI/Varick (customer-embedded Applied AI engineer billed out to a client) is captured separately on the Forward Deployed Engineer (FDE) page. Same name, different motion. Hartley’s Forward Deployed is org design; Vasuman’s FDE is a role description.

Compound effect

Every cycle leaves the system better — specs sharper, context richer, guardrails smarter
North star: non-engineers write specs, agents execute, engineers build/refine the system itself

Core design principle: Own the process, rent the tools

The system is ours and permanent — 6 pillars, spec chain, quality gates, context infrastructure
The tools underneath are rented and rotate — model, IDE, design app, sandbox runtime
Every AI integration must sit behind two surfaces: (1) a process we own (trigger, verification, spec-chain fit), (2) an interface we control (inputs/outputs in our formats)
Why: the frontier is moving faster than any single vendor’s migration cycle. Current data points: Claude Code slipping behind Codex on coding leaderboards, and Claude Design launching mid-cycle. Lock-in has a growing cost.
Applications:
- Devbox/sandbox — pluggable model behind the same spec+verify loop. Switch Claude Code ↔ Codex ↔ next as config, not rewrite. Matches Open SWE pattern (Stripe/Coinbase/Ramp converged on pluggable LLM).
- Design tool — tokens, component contracts, handoff artifacts live in the repo. Figma/Claude Design/next is the canvas, not the source of truth. Reinforces Pillar 3 (design codified in code, not implied in a tool’s file format).
Closes the multi-model-strategy gap Cursor flagged (divergence #4 in the Aman talk note) — multi-model becomes an option the architecture preserves, not a commitment we’ve made
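The two surfaces can be sketched as follows. This is an illustration under stated assumptions: `Spec`, `AgentBackend`, `EchoBackend`, and the `vendor_a`/`vendor_b` keys are hypothetical names, and the stand-in backend does not call any real tool:

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Spec:
    """Input in our format: what to build and how to verify it."""
    task: str
    verify: Callable[[str], bool]  # our verification step, not the vendor's


class AgentBackend(Protocol):
    """The interface we control. Each rented tool gets a thin adapter."""
    def run(self, task: str) -> str: ...


class EchoBackend:
    """Stand-in adapter for illustration; a real one would call a vendor tool."""
    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, task: str) -> str:
        return f"[{self.name}] patch for: {task}"


# Swapping the tool is a config change: edit this mapping or the key passed in.
BACKENDS: dict[str, AgentBackend] = {
    "vendor_a": EchoBackend("vendor_a"),
    "vendor_b": EchoBackend("vendor_b"),
}


def execute(spec: Spec, backend_name: str, max_attempts: int = 2) -> str:
    """The process we own: trigger, run, verify, retry, or fail loudly."""
    backend = BACKENDS[backend_name]
    for _ in range(max_attempts):
        output = backend.run(spec.task)
        if spec.verify(output):
            return output
    raise RuntimeError(f"{backend_name} failed verification for {spec.task!r}")
```

Nothing in `execute` knows which vendor is underneath, so the spec and the verification step survive a tool rotation unchanged.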

Top 3 blockers (from Q126 response)

Insufficient testing/verification infrastructure
Missing structured context
Inconsistent types/code structure

External validation: Stripe Minions

Stripe built Minions — proprietary coding agents merging 1,000+ PRs/week. The code is written fully unattended; humans only review.
Why custom: generic agents excel at greenfield but struggle in mature complex codebases (hundreds of millions of lines, proprietary libraries, regulatory stakes)
Key insight: “If it’s good for humans, it’s good for LLMs, too” — agents leverage the same developer productivity foundations (linting, testing, tooling) built for engineers
Architecture mirrors Hartley’s 6 pillars:
- Context → 400+ tools via central MCP server (“Toolshed”)
- Codebase → Sorbet typing, shared conventions, coding rules consumed by both humans and agents
- Testing → selective testing from 3M+ test suite, heuristic lint in <5 seconds, max 2 CI iterations
- Delivery infra → pre-warmed devboxes spin up in 10 seconds, isolated from production
Validates: the system around agents is the competitive moat, not the agents themselves
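A rough sketch of the two testing ideas above, selective test runs and a hard cap on CI iterations. The dependency map and the callback signatures are assumptions made for illustration; this is not a description of Stripe's implementation:

```python
from typing import Callable

MAX_CI_ITERATIONS = 2  # cap taken from the note above


def select_tests(changed: set[str], test_deps: dict[str, set[str]]) -> list[str]:
    """Return only the tests whose declared dependencies overlap the change."""
    return sorted(t for t, deps in test_deps.items() if deps & changed)


def run_with_cap(
    run_tests: Callable[[list[str]], list[str]],  # returns names of failing tests
    attempt_fix: Callable[[list[str]], None],     # agent tries to fix the failures
    tests: list[str],
) -> bool:
    """Run CI at most MAX_CI_ITERATIONS times; report False if still red."""
    for iteration in range(MAX_CI_ITERATIONS):
        failures = run_tests(tests)
        if not failures:
            return True
        if iteration < MAX_CI_ITERATIONS - 1:
            attempt_fix(failures)
    return False
```

A `False` result is the hand-off point to a human, which keeps an agent from looping on a red build indefinitely.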

External validation: Ramp Inspect

Ramp built Inspect, a custom background coding agent combining code generation with comprehensive verification in sandboxed cloud environments
Core thesis: “Owning the tooling lets you build something significantly more powerful than an off-the-shelf tool will ever be”
Architecture: Modal cloud sandboxes (near-instant spinup, filesystem snapshots), OpenCode as agent foundation, Cloudflare Durable Objects for session persistence
Multiplayer as mission-critical: multiple users collaborate within single sessions — team QA, teaching non-engineers, code review workflows
Multi-surface: Slack (lowers barrier for non-engineers), web (hosted VS Code for manual edits), Chrome extension (visual element targeting)
Result: ~30% of merged frontend and backend PRs originate from the agent after just a few months
Key insight: “When background agents are fast, they’re strictly better than local: same intelligence, more power, and unlimited concurrency”
Validates same pattern as Stripe: custom agents built on strong foundations outperform generic tools. Adds the multiplayer dimension — agents as collaboration surfaces, not just solo tools.

External validation: Intercom (2x velocity in 9 months)

Intercom’s Senior Principal Engineer Brian Scanlan reports 2x engineering throughput over 9 months after rolling out Claude Code across the org
Preconditions map onto Hartley’s pillars 5 and 6 (automated testing, delivery infrastructure) plus culture: mature CI/CD, comprehensive test coverage, high-trust culture, then AI
“AI magnifies your strengths and weaknesses” — weak deployment infrastructure just ships broken code faster. Directly validates the “top 3 blockers” framing
Instrumentation: treat engineering like a product — skill invocations tracked in Honeycomb, anonymized sessions stored in S3 to see where adoption stalls
Guardrails, not restrictions: custom skills with built-in hooks enforce quality at creation (e.g. “Create PR” skill blocks direct gh usage, mandates rich context) — makes the optimal path the only path
Quality rose alongside velocity (Stanford partnership data) — compressed execution time freed capacity to fix tech debt and flaky tests, not accumulate more
Permission is the bottleneck, not tech: explicit authorization + accepting accountability for failures drives experimentation
Agent-friendly product design: Intercom’s own Fin product has an autonomous CLI that can sign up, verify email, and install without a human — “design for agents or customers will build workarounds”
Applies to Sitemate: (1) before pushing velocity targets, harden the foundations; (2) instrument agent usage to see stalled adoption; (3) design skills as guardrailed paths not suggestions; (4) Sitemate products themselves must be agent-operable (ties to MCP work in Global Integrations)

External validation & framework: Dex / HumanLayer RPI → QRISPY

Dex’s thesis for 2026: “no more slop” — aim for 2-3x speedup with craft, not 10x with rework
What most teams get wrong:
- “Don’t read the code” → six months later, rip-and-replace. Read the code.
- 1000-line plan files ≈ 1000 lines of code, not leverage
- One monolithic prompt with 85+ instructions — models start skipping instructions once the total passes ~150-200 (instruction budget problem)
The “dumb zone”: context-window quality degrades around 40% full; beginners cap at 40%, experts aggressively stay under 30%. Don’t use built-in compaction — put everything that matters into static artifacts so you can resume cleanly
Context engineering = better instructions + simpler tasks + smaller context windows — don’t use prompts for control flow, use control flow for control flow
QRISPY 7-stage split of the monolithic planning prompt, each <40 instructions:
1. Questions — generate research questions (ticket hidden)
2. Research — fresh context, objective facts about the code today, no opinions
3. Design — ~200-line artifact: current state, desired state, patterns, resolved decisions, open questions (human-agent alignment surface)
4. Structure outline — ~2 pages: phases, order, how to test along the way
5. Plan — tactical spot-check doc
6. Work/Implement
7. PR
Vertical plans > horizontal plans: models default to horizontal (all DB → all services → all API) and break at 1200 lines. Vertical (mock endpoint → wire frontend → mock services → migrate → integrate) creates natural checkpoints
Key leverage insight: don’t review 1000-line plans, review the 200-line design doc; share the design doc with teammates before code exists, to head off bad decisions cheaply
Directly fills the “quality gates” gap in Sitemate’s development workflow proposal — the 200-line design doc is the gate between solution-docs and RFC/spec
Reinforces “Own the process, rent the tools” — planning/design is the process (ours), LLM is the tool (rented)
Action for Cortex: audit CLAUDE.md + skills against the instruction-budget framing; consider splitting multi-step skills (esp. /dream) into focused sub-prompts vs single-shot
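The "use control flow for control flow" idea, together with the per-stage instruction budget and fresh context per stage, can be sketched like this. The `Stage` type, the artifact names, and the wiring of which stage reads which artifact are hypothetical; only three of the seven stages are shown, and `call_model` is a placeholder for whatever model client is in use:

```python
from dataclasses import dataclass
from typing import Callable

MAX_INSTRUCTIONS_PER_STAGE = 40  # each stage stays under 40 instructions


@dataclass(frozen=True)
class Stage:
    name: str
    instructions: tuple[str, ...]
    reads: tuple[str, ...]  # names of static artifacts this stage may see


def run_pipeline(
    stages: list[Stage],
    artifacts: dict[str, str],
    call_model: Callable[[str], str],
) -> dict[str, str]:
    """Run stages in order; ordinary code, not the prompt, decides what runs next."""
    artifacts = dict(artifacts)  # do not mutate the caller's dict
    for stage in stages:
        if len(stage.instructions) >= MAX_INSTRUCTIONS_PER_STAGE:
            raise ValueError(f"stage {stage.name!r} exceeds the instruction budget")
        # Fresh context per stage: only the artifacts it declares, nothing else.
        context = "\n\n".join(f"## {name}\n{artifacts[name]}" for name in stage.reads)
        prompt = "\n".join(stage.instructions) + "\n\n" + context
        artifacts[stage.name] = call_model(prompt)
    return artifacts


# Hypothetical wiring for three of the seven stages.
STAGES = [
    Stage("research", ("State objective facts about the code today.",), ("questions",)),
    Stage("design", ("Describe current state, desired state, open questions.",), ("ticket", "research")),
    Stage("outline", ("List phases, order, and how to test each.",), ("design",)),
]
```

Because every stage output is stored as a named artifact, a run can be resumed from any stage without relying on built-in compaction.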

Agent categorization framework (Lenny’s Newsletter)

Three-tier framework for AI agents by architectural complexity:
1. Category 1: Deterministic Automation (60-70% of opportunities) — you define the workflow, AI handles content generation at specific steps. ~6 weeks to build. Tools: n8n, Zapier, Make.com.
2. Category 2: Reasoning and Acting / ReAct (25-30%) — AI autonomously decides next steps using available tools. ~3 months. Tools: LangGraph, CrewAI.
3. Category 3: Multi-Agent Networks — multiple specialized agents coordinate across domains. 6+ months. Should rarely be a starting point.
Key insight: teams treat fundamentally different architectures as equivalent, using generic prioritization that fails to account for vastly different complexity and timelines
Practical guidance: start with Category 1 for quick ROI and organizational confidence before advancing
Relevant to Sitemate: most internal automation opportunities (form processing, reporting, data sync) are likely Cat 1. Coding agents (Minions, Inspect) are Cat 2-3. Helps prioritize where to invest.

External validation: Cloudflare AI Code Review

Cloudflare built a CI-native AI code review system on OpenCode, deploying up to seven specialised reviewer agents (including security, performance, code quality, documentation, release management, and compliance), with a coordinator agent deduplicating findings and making final approval calls
Scale (30-day window): 131,246 review runs across 48,095 merge requests, median review time 3 min 39 sec, average cost $1.19/review, override rate 0.6%
Key optimisations: risk-tiered reviews (trivial/lite/full), diff filtering (skip lock files and generated code), 85.7% prompt cache hit rate, failback chains + circuit breakers for provider outages
Honest limitation: strong at bug-catching, weak on architectural context, cross-system impact, subtle concurrency — human review remains essential
Validates two Sitemate bets: (1) specialised agents > one generic model (mirrors Minions/Inspect pattern), (2) “Own the process, rent the tools” — plugin architecture across VCS and AI providers keeps them model-agnostic. Also a concrete counterpoint to the “Claude is not your architect” risk: AI review surfaces findings, humans own architecture
Applies to Sitemate: the review stage of the agentic workflow is a candidate for stage 2 automation once stage 1 coding agents are merging PRs — specialised reviewer agents are a natural fit for our compliance/safety-sensitive front-line domain
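Diff filtering and risk-tiered review could look roughly like the sketch below. The skip patterns and line thresholds are invented for illustration; Cloudflare's actual rules are not given in the note:

```python
from fnmatch import fnmatch

# Hypothetical patterns for lock files and generated code.
SKIP_PATTERNS = ("*.lock", "*package-lock.json", "*_pb2.py", "*.min.js")


def reviewable(files: dict[str, int]) -> dict[str, int]:
    """Drop files that match a skip pattern. Maps path -> changed line count."""
    return {
        path: lines
        for path, lines in files.items()
        if not any(fnmatch(path, pattern) for pattern in SKIP_PATTERNS)
    }


def review_tier(files: dict[str, int]) -> str:
    """Choose trivial / lite / full from what is left after filtering."""
    total = sum(reviewable(files).values())
    if total <= 10:   # assumed threshold
        return "trivial"
    if total <= 200:  # assumed threshold
        return "lite"
    return "full"
```

Filtering before tiering matters: a 900-line lock file bump with a 5-line code change should be tiered on the 5 lines.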

External validation: Aaron Levie (Box, 20VC Apr 2026)

Contrarian macro call: more developers and more lawyers in 5 years, not fewer — Silicon Valley is only ~10-15% of GDP, the other 85% (tractor companies, banks, pharma, law) still can’t access SV-grade engineering velocity. “Bottlenecks just move.”
New role — “agent operator”: 500K–1M jobs projected. Somewhat technical, understands MCPs, CLIs, skills, agents.md files. Embedded in marketing/legal/ops/research to get them leverage from agents. Redesigns workflows for agents (not humans), wires up data, handles prompt churn every time a new model drops. Directly maps to Sitemate’s FDE embedded-in-revenue-surface pattern — widen the lens: FDE-equivalents belong in every function, not just engineering
“The workflow needs to be redesigned for agents, not for people” — reinforces Pillar 3 (design codified in code) and Pillar 1 (context must be agent-discoverable). Button-heavy apps lose value; API-first platforms win. Headless-first is now the mandate
Token budgeting is an OPEX line item, not IT: compute moves from capped IT budget to line-of-business OPEX, roughly doubles enterprise tech spend (10-12% → ~20% of revenue). Patterns Levie sees: Shark Tank (teams pitch for budget, reviewed quarterly) vs stratified (top 5% unlimited, next 20% capped, rest on cheapest model). Implication for Sitemate internal ops: how we budget agent tokens for our own eng team is itself a strategic decision
Enterprise adoption will be slower than SV expects: Fortune 1000s are fragmented, regulated, and two decades of BYO-tooling deep. 10 years of work for “next-gen Accenture” to upgrade systems, organize data, describe workflows. Liability doesn’t disappear — “you’re not going to be able to blame Anthropic when something goes wrong.” Maps to Sitemate’s front-line thesis: the truth/auditability layer compounds in value as this plays out
Coding outcomes don’t generalize: people assume AI-coding velocity gains transfer to other knowledge work. They don’t — coding has idiosyncrasies (fast feedback, tests, deterministic output) that other domains lack
“Agents are the solution to the problem that agents have caused” — offensive AI outpaces defensive AI on patching; agentic security is a big market. Reinforces the “design for agents or customers build workarounds” guardrails posture from the Intercom note
Changed his mind on headless: 3 years ago agents couldn’t find the right doc in Box; tool-calling accuracy inflected in the past year. Confirms the window for pillar 1 (context) work is now
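The stratified token-budget pattern (top 5% unlimited, next 20% capped, rest on the cheapest model) reduces to a ranking plus two cut-offs. In this sketch the score being ranked and the tier labels are assumptions; the note does not say how people are ranked:

```python
def assign_tiers(scores: dict[str, float]) -> dict[str, str]:
    """Rank people by a score (hypothetical: measured leverage from agents)."""
    ranked = sorted(scores, key=lambda user: scores[user], reverse=True)
    n = len(ranked)
    top = max(1, round(n * 0.05)) if n else 0  # top 5%, at least one person
    capped = round(n * 0.20)                   # next 20%
    tiers: dict[str, str] = {}
    for i, user in enumerate(ranked):
        if i < top:
            tiers[user] = "unlimited"
        elif i < top + capped:
            tiers[user] = "capped"
        else:
            tiers[user] = "cheapest_model"
    return tiers
```

For a 20-person team this gives one unlimited seat, four capped seats, and fifteen on the cheapest model.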

sitemate.md — Refactoring Sitemate for the Exponential (Hartley, Apr 13 2026)

Fourth blog in the series. Defines the operational architecture of Sitemate in the AI era — what sits on top of Forward Deployed delivery and Agentic Engineering. The company isn’t just described in markdown — it runs on markdown (knowledge.md, program.md, CLAUDE.md, state files): readable by humans and agents, version-controlled, PR-reviewed, queryable via the Knowledge Engine. Karpathy: “a research organisation is a set of markdown files that describe all the roles and how the whole thing connects.”

Pre-AI vs AI information flow:

Pre-AI: hierarchy as routing protocol (Roman army → railroads → McKinsey → Spotify squads). Span of control 3–8. Customer signal → CSM → manager → meeting → product: slow and lossy.
AI era: first time tech can replace what hierarchy does. Information routes itself in real time.

Signals not roadmaps. Roadmap itself is latency. Old: leadership decides → roadmap → execute. New: customer signal → intelligence → leadership judgment → build. Money/usage is the most honest signal. Everyone gets access; cadence is real-time.

Differentiation equation: Rate of improvement = (signal volume × loop speed) − bottlenecks. Code is commodity; differentiators are mission/vision/strategy/culture, signal density, taste, speed, operational efficiency, unique data, unique process understanding.
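One way to read the equation, writing R for rate of improvement, V for signal volume, S for loop speed, and B for bottlenecks (the symbols are shorthand introduced here, not Hartley's notation):

```latex
R = V \cdot S - B,
\qquad
\frac{\partial R}{\partial S} = V,
\qquad
\frac{\partial R}{\partial V} = S
```

Because the first term is a product, a faster loop pays off in proportion to how much signal flows through it, and more signal pays off in proportion to loop speed. Bottlenecks subtract from the result regardless of either.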

Two types of agents (permanent infrastructure, not co-pilots):

Analysers — read the world, extract signal from noise, never build
Builders — act on the world, compose product surfaces from filtered signal

Skill issue, not capability issue (Karpathy). Bad agent output = bad markdown/context/decomposition. Goal: maximise token throughput, remove humans as bottleneck.

The Sitemate Blueprint — 6 layers (bottom-up):

Noise (sources) — feedback, errors, Slack, search volume, leads, transactions, ratings, form submissions, tickets, Fathom/Granola transcripts. Volume grows exponentially.
Agent analysers — run 24/7 on each source. AI template feedback already approaching one-FTE volume; form fill not even live yet. Pattern: source → agent → thread.
Agent builders — compose features/fixes/PRs. Currently tethered to dev laptops; needs remote sandboxes (on-demand isolated envs where agents clone, build, run E2E, verify, ship). Output scales with compute, not headcount. Whole company becomes forward deployed.
Human ICs (product foundation) — durable skills: judgment, taste, creativity, craft. Three Dorsey roles: builders/operators (atomic primitives — forms, photos, AI templates, inspections, docs, reporting), player coaches (assignment not reporting structure), verification & guardrails (design system, CLAUDE.md, skills, QA standards encoded for agents). Markdown as infrastructure — writing/maintaining these files is one of the highest-leverage IC activities.
Human leadership (strategy + world model) — two jobs: ranking/order (not “what to build” but sequencing which signals surface) and owning the world model (company + customer). Knowledge Engine: every state/knowledge/doc file as markdown, git-versioned, PR-reviewed; graph DB indexes relationships, markdown is source of truth. Phasing: state files → code docs → org knowledge.
Vision (mission alignment) — Dorsey’s “extra checkpoint”: is what we’re doing aligned with what we want to be? “A perfectly optimised system producing the wrong output is worse than an imperfect system producing the right one.”

Seven actions now:

Build the analyser layer first — agent on every key signal
Expand noise sources — five-star ratings after every photo upload, form submission, Dashpivot/GearBelt interaction
Build the Knowledge Engine — Phase 1 state files → code docs → org knowledge
Build sandbox infrastructure — remote isolated envs, verification suites, deployment pipelines agents can operate
Encode the standards — CLAUDE.md, skills, QA standards, design system, conventions
Tighten the loops and let them compound
Run the alignment check at every layer

How this lands against the rest of the strategy:

Builder-layer needs remote sandboxes — directly the Pillar 6 / devbox/sandbox work in “Own the process, rent the tools”
Analyser layer is the operational form of the Forward Deployed “Continuous Listening” stage — now spec’d as permanent infra, not just a process step
Knowledge Engine is the productionisation of Pillar 1 (context — agent-discoverable, repo-level, decision records)
Markdown as infrastructure reframes IC work: maintaining CLAUDE.md/skills/specs is a top-leverage activity, not overhead

Local optimization curve (sitemate.md / 2026-05-05 Managers Allhands)

Shared mental model framing the “hybrid autonomous + human-led” stance:
- First 60–70% (foundational): deeply human-led — are we building the right thing? Architecture, security, direction. Not a place for autonomous agents.
- Middle band (refinement): agents excel here. Prompt fixes, SME-knowledge tuning, simple UI fixes that align with the design system, monitoring/error triage with human-in-the-loop on escalation. This is where coding agents deploy.
- Capability jump points: humans rebase — turn agents off, refactor, re-plan, then re-deploy agents on the next local optimization run.
Resolves the false choice between “vibe-code everything” and “review every diff” — defines the middle.
Agents per product surface area (e.g. AI Platform first, Gearbelt second) locally optimize their own area.

Internal agent rollout pattern (osmosis)

AI Platform team = first guinea pig; Gearbelt second. Spreading by knowledge transfer (osmosis) from a centralised AI ops nucleus to each team.
End-state: each team has an engineer responsible for managing 1–2 agents’ output alongside their own work. Tooling can vary per team — the pattern must be uniform: feedback stream → listeners → signal extraction → autonomous system (full work or with human-in-the-loop escalation).
Differential AI impact across functions (Tim’s prediction): PDE/operational roles most affected → marketing/revenue ops next → GTM least affected short-term (relationship value of those roles increases in AI era). GTM agents focus on removing non-value-add work, not replacing the relationship.
Concrete near-term agents: Aaron — deck infrastructure agent (Sitemate Chef-style custom decks per template). Jordan — core review agents. AI Platform — first “fix agent” by week of May 8.
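The uniform pattern (feedback stream → listeners → signal extraction → autonomous system, with human-in-the-loop escalation) can be sketched as below. The keyword matcher is a toy stand-in for an analyser agent, and the `Signal` fields and the 0.8 threshold are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signal:
    source: str
    summary: str
    confidence: float  # 0..1, how sure the extractor is that this is actionable


def extract_signal(source: str, message: str) -> Optional[Signal]:
    """Toy listener: keyword match standing in for an analyser agent."""
    text = message.lower()
    if "broken" in text or "error" in text:
        return Signal(source, message, 0.9)
    if "would be nice" in text:
        return Signal(source, message, 0.4)
    return None  # noise


def route(signal: Signal, auto_threshold: float = 0.8) -> str:
    """Send confident signals to the autonomous path, the rest to a human."""
    return "autonomous" if signal.confidence >= auto_threshold else "escalate_to_human"
```

Teams can swap the extractor and the downstream tooling freely as long as the three steps and the escalation path stay in place.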

End-of-2026 prediction (full feedback→prod loop)

For a narrow class of small frontend-only features (no mobile/export implications):
- User submits feedback in Slack
- Review agent scopes the change
- Fix agent generates a few options
- Designer picks option in Slack thread
- Agent raises PR → prod
“Bet a very large amount of money this happens at least once before end of 2026.”
Founder-level speculation beyond that: SaaS rails surface this loop to the customer — customer iterates with the agent designer directly, high-degree-of-customization SaaS as the future direction.

Macro signal: easy-to-replace SaaS vs. sticky SaaS

Tim’s anecdote (May 2026): Sitemate has turned off all Cursor subs, will turn off all OpenAI subs by end-May/early-June. In the same period, zero times considered switching Salesforce.
Inversion of the prevailing narrative: “products easiest to turn off are copying all the dollars; the traditional ones we haven’t even considered replacing.” Implication: dev-workflow-adjacent SaaS is most exposed; deep-workflow vertical SaaS (Sitemate’s category) is much stickier than the SaaS-bundle bear case suggests.
Investor noise lumps all SaaS together; expected reset when a foundation lab misses a major revenue target (OpenAI just did; Anthropic on track).

Risks and counterpoints

AI must not own architecture (Holland, “Claude Is Not Your Architect”) — AI is pathologically agreeable, produces generic solutions for specific problems, short-circuits the messy disagreement where real architectural quality emerges
Risk of “Jira ticket pipeline”: AI designs → breaks into tickets → experienced engineers become ticket implementers instead of problem-solvers
Accountability gap: AI faces no consequences when systems fail
This aligns with Hartley’s model: engineers write specs and own design, agents execute. The danger is when teams invert this — letting agents architect and humans implement.

Development workflow proposal: idea to implementation

Standardises the flow from idea → agent-implemented code, addressing: RFCs mixing levels of detail, docs living outside the codebase, no clear collaboration triggers
Artifact chain: one-pager (Confluence, alignment) → solution docs (container repo, draft PR as collaboration surface) → epics (Jira) → RFC (Confluence, derived from solution docs) → decision logs → spec/state files (container repo, what agents need) → tickets (Jira) → agent implementation
Key design: container repo is source of truth for detailed thinking, Confluence is high-visibility collaboration surface
Critical open questions:
1. Agent context delivery — how do agents in product repo discover specs in container repo? Hard dependency on Nick’s Documentation Knowledge Engine
2. Architecture ownership gap — proposal has AI generating most artifacts with humans reviewing, which inverts the model Holland warns about. May need restructuring so humans write with AI assistance rather than AI generating with human review
3. Quality gates — only the one-pager has a gate. No defined “ready enough” threshold for solution docs → RFC or specs → tickets

Open questions

How to speed up organisational changes around agentic engineering (Tim’s brainstorm, 2026-04-14)
Development workflow proposal needs socialisation and pilot (see above)

Where uniqueness comes from

Mission, Vision, Culture
Insight supply and curation
Taste and craft

Sources

2026-04-14-how-sitemate-ships-now-era-of-ai.md — “On How Sitemate Ships Now” (Jun 2025)
2026-04-14-waterfall-whitewash-agile-anchor-fde.md — “Waterfall Whitewash, Agile Anchor, Forward Deployed” (Feb 2026)
2026-04-14-agentic-engineering-system-is-advantage.md — “Agentic Engineering: The System Is the Advantage” (Mar 2026)
2026-04-07-q126-response-summary.md — top 3 agentic blockers + agentic under-resourcing
2026-04-14-stripe-minions-coding-agents.md — Stripe Minions: 1,000+ agent-merged PRs/week at scale
2026-04-14-claude-is-not-your-architect.md — Holland: risks of letting AI own architecture
2026-04-16-ramp-inspect-background-agent.md — Ramp Inspect: custom background agent, 30% of PRs from agent, multiplayer
2026-04-15-ai-agents-framework-lenny.md — Lenny’s Newsletter: 3-tier agent categorization framework
2026-04-16-development-workflow-proposal.md — Development workflow: idea to agent-implemented code
2026-04-20-own-the-process-rent-the-tools.md — “Own the process, rent the tools” design principle + Slack draft
2026-04-16-cursor-aman-talk-agentic-alignment.md — Cursor divergence #4 (multi-model strategy), source for the principle
2026-04-21-intercom-2x-engineering-velocity-claude-code.md — Intercom: 2x velocity in 9 months, foundations + guardrails + instrumentation
2026-04-20-Everything-we-got-wrong-about-Research-Plan-Implement-Dexter-Horthy.md — Dex/HumanLayer QRISPY: instruction budget, dumb zone, vertical plans
2026-04-21-aaron-levie-20vc-more-developers-five-years.md — Aaron Levie/Box 20VC: agent operator role, token OPEX, headless-first, bottlenecks just move
2026-04-23-cloudflare-ai-code-review.md — Cloudflare: 7 specialised reviewer agents, 131K runs/month, $1.19/review, 0.6% override
2026-04-22-dev-agents-evolution-coding-to-qa.md — internal blog draft: three-stage agent evolution (code → run+verify → QA), two devbox decisions, three parallel investments (harness, evals, workflow)
2026-04-27-cicd-environments-and-release-best-practices.md — Pillar 6 deep-dive: 3-env topology, promote-the-artifact, blue/green vs canary, feature flags, auto-rollback, risk-tiered release gates
2026-05-05-managers-allhands-transcript.md — Managers Allhands: local-optimization curve, osmosis rollout pattern, end-of-2026 feedback→prod loop prediction, sticky-SaaS macro anecdote
2026-05-04-sitemate-md-refactoring-for-the-exponential.md — Hartley blog #4: company runs on markdown; analyser/builder agents; 6-layer Sitemate Blueprint; signals-not-roadmaps

Last Updated

2026-05-06
