Forward Deployed Engineer (FDE)

Summary

Synthesis of Vasuman Sharma’s three-part series on agents and the FDE role (2026-05-21). The thesis: as model intelligence commoditises, the only competitive edge is how and where a company deploys it - and the role that owns that determination is the Forward Deployed Engineer. Origin is Palantir’s 2010 Special Forces deployment template; current revival is being driven by Anthropic, OpenAI, Google, and Applied AI shops (Varick et al.) who can’t sell pure capability and instead sell installed outcomes inside customer environments. The role is a rare three-way combination: deep customer-problem understanding + writes code into an unfamiliar codebase + communicates business impact to non-technical decision makers. Vasuman calls it a “million-dollar hire,” which sounds like marketing but is consistent with the salary bands now appearing at frontier labs.

This page is the external/industry FDE pattern. It is not the same thing as Hartley’s internal “Forward Deployed” 10-stage product flow captured under Agentic Engineering Strategy - that’s a Sitemate-shaped delivery model that happens to share the name. They are related (both put builders close to the signal surface), but the page below is about the external-customer Applied AI motion, not the internal product workflow.

Key Points

The thesis - commoditised intelligence makes deployment the edge

“There is no competitive advantage in intelligence alone.” (Vasuman) Strong claim, and the load-bearing one for the whole role.
If models converge in capability, the differentiator is not which lab you use but how well the agent is fitted to a specific environment: workflows, data shape, edge cases, escalation paths, integration surface, change-management.
Businesses hire Applied AI firms to install this fit. The FDE is the unit of installation.
Contrarian to the dominant AI vendor pitch, which is still “better model = better outcome.” Levie’s parallel point from the Engineering in the AI Era page applies: bottlenecks just move - and they’ve moved to deployment.

Origin - Palantir, 2010, SOF in Afghanistan

Palantir’s CTO line: “Cannot build products for an environment without being in the environment itself.” This is the founding assertion of the FDE practice.
2010 model: Special Forces operators on mission during the day, FDEs shipping code overnight against feedback from that mission. Tight loop, on-site, no abstraction layer between the operator and the engineer.
Current AI labs are explicitly copying this template. The bet is the same: deeply installed software is built next to the user, not over a Zoom screenshare.
Implication for senior eng leaders: this is the opposite of the “platform-team-builds-paved-roads” model. It’s customer-embedded, project-shaped, and high-touch.

The role anatomy - Audit / Evals / Deployment

Three phases, each with its own discipline.

Audit - figure out what to automate (and what not to)

Weeks on-site, rotating through teams (Vasuman’s example: 2 weeks RevOps, 1 week procurement, 1 month finance). The “embedded” part is non-negotiable.
Goal of each rotation: workflow map, bottleneck list, candidate-agent shortlist.
Decision rules for “is this an agent or just code?”:
- Rules distillable + inputs vary across modalities (email / PDF / scan / chat) + work involves tool calls → agent.
- Rules + inputs both predictable → just write code (faster, cheaper, more auditable).
- Pattern recognition + domain expertise required → leave manual; an agent will create more problems than it solves.
Volume rule: ROI doesn’t clear agent infra and eval cost unless volume is real. A workflow that fires five times a month doesn’t justify the build. This is the rule most demos quietly fail.
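The decision rules and the volume rule above can be sketched as a small triage function. This is a minimal illustration, not Vasuman's implementation: the `Workflow` fields, the `Verdict` names, and the `MIN_RUNS_PER_MONTH` cut-off are all hypothetical (the source only says five runs a month is too few). Two branches are assumptions where the rules are silent: workflows whose rules can't be written down are left manual, and anything else that isn't a clear agent case falls through to plain code.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    AGENT = "agent"
    PLAIN_CODE = "plain code"
    MANUAL = "leave manual"
    SKIP = "volume too low for an agent"


@dataclass
class Workflow:
    name: str
    rules_distillable: bool        # can the rules be written down?
    inputs_vary: bool              # email / PDF / scan / chat, etc.
    needs_tool_calls: bool
    needs_expert_judgement: bool   # pattern recognition + domain expertise
    runs_per_month: int


# Hypothetical threshold; tune it per client against infra and eval cost.
MIN_RUNS_PER_MONTH = 100


def triage(w: Workflow, min_runs: int = MIN_RUNS_PER_MONTH) -> Verdict:
    # Expert-judgement work stays manual (assumption: so does work
    # whose rules cannot be distilled at all).
    if w.needs_expert_judgement or not w.rules_distillable:
        return Verdict.MANUAL
    # Distillable rules + varied inputs + tool calls is the agent case,
    # but only if volume is high enough to pay for the build.
    if w.inputs_vary and w.needs_tool_calls:
        return Verdict.AGENT if w.runs_per_month >= min_runs else Verdict.SKIP
    # Assumption: everything else is cheaper as ordinary code.
    return Verdict.PLAIN_CODE
```

Running each workflow from an audit rotation through `triage` gives the candidate-agent shortlist, and makes the low-volume rejections explicit rather than implicit.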
“Don’t over-use AI”: most automations are a series of tool calls plus one LLM call as orchestrator. Excess LLM calls compound token cost and degrade output quality. Treats the model as the expensive part of the stack, which it is.

Evals - prove value to the customer (not just to engineering)

The audience for FDE evals is the executive paying the bill, not the dev iterating on the agent. The framing has to be “is this delivering ROI” not “is this passing tests.”
Two specific techniques worth lifting:
- Trace the human’s multi-step process and grade the AI on each checkpoint. Multi-step problem solving means each step is a gradeable artifact - don’t collapse to a final-answer score.
- Start with golden examples of “great” output (sit with a top performer, capture what their answer looks like) and measure everything against them. ~100 examples is enough for a narrow task.
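A minimal sketch of the two techniques combined: each golden example stores the top performer's output at every checkpoint, and the agent is graded per checkpoint rather than on the final answer only. The names (`GoldenExample`, `grade_case`, `checkpoint_pass_rates`) are hypothetical, and the normalised exact-match comparison is an assumption made to keep the example short; a real grader might be a rubric or a human review.

```python
from dataclasses import dataclass


@dataclass
class GoldenExample:
    case_id: str
    checkpoints: dict[str, str]  # step name -> what the top performer produced


def _normalise(text: str) -> str:
    # Ignore case and whitespace differences when comparing outputs.
    return " ".join(text.lower().split())


def grade_case(golden: GoldenExample, agent_steps: dict[str, str]) -> dict[str, bool]:
    """Grade each checkpoint separately; a missing step counts as a failure."""
    return {
        step: step in agent_steps
        and _normalise(agent_steps[step]) == _normalise(expected)
        for step, expected in golden.checkpoints.items()
    }


def checkpoint_pass_rates(
    goldens: list[GoldenExample], runs: dict[str, dict[str, str]]
) -> dict[str, float]:
    """Pass rate per checkpoint across the golden set (runs keyed by case_id)."""
    passed: dict[str, int] = {}
    total: dict[str, int] = {}
    for golden in goldens:
        for step, ok in grade_case(golden, runs.get(golden.case_id, {})).items():
            total[step] = total.get(step, 0) + 1
            passed[step] = passed.get(step, 0) + int(ok)
    return {step: passed[step] / total[step] for step in total}
```

The per-checkpoint rates show where in the human's process the agent falls down, which a single final-answer score would hide.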
Evals are how executives move from “we believe in AI” to “we trust this specific agent in this specific workflow.” The trust gap is the actual sales surface.
Tactical underlay from Agents 102: evals must answer three questions - (a) does it act when it should, (b) does it abstain when it should not act or lacks the information to act, (c) does it perform on the cases that actually matter. Don’t collapse to a single accuracy number; 90% accuracy on a workload that’s 90% trivial is hiding the entire problem.
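The three questions map to three separate rates. The sketch below is an illustration under assumed names (`EvalCase`, `summarise`, and the `critical` flag are hypothetical, not from Agents 102); it reports act rate, abstain rate, and accuracy on the critical slice next to the headline accuracy so the gap between them is visible.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    should_act: bool  # ground truth: the agent should have acted
    acted: bool       # what the agent actually did
    correct: bool     # the outcome was right (including a correct abstention)
    critical: bool    # one of the cases that actually matter


def _rate(flags: list[bool]) -> float:
    # Empty slices are reported as NaN rather than a misleading 0 or 1.
    return sum(flags) / len(flags) if flags else float("nan")


def summarise(cases: list[EvalCase]) -> dict[str, float]:
    return {
        "overall_accuracy": _rate([c.correct for c in cases]),
        # (a) does it act when it should?
        "act_rate": _rate([c.acted for c in cases if c.should_act]),
        # (b) does it abstain when it should not act?
        "abstain_rate": _rate([not c.acted for c in cases if not c.should_act]),
        # (c) does it perform on the cases that matter?
        "critical_accuracy": _rate([c.correct for c in cases if c.critical]),
    }
```

As a worked case: 90 trivial cases handled correctly plus 10 critical cases all handled wrongly gives an overall accuracy of 0.9 and a critical accuracy of 0.0, which is the failure the single number hides.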

Deployment - install without breaking the environment

Don’t migrate data. Clients have already burned millions of dollars and multiple years on ERP migrations. Wrap APIs over existing data stores (SharePoint, DBs, ERP) and put the model on top as orchestrator. This is the largest cost-avoidance lever in the whole motion.
Build a sandbox inside the client’s own infra for safe testing before production. Same idea as Sitemate’s devbox/sandbox pattern in Agentic Engineering Strategy.
Smallest unit of autonomy first; only then graduate capability. Vasuman’s example: start with an agent that catches bugs, investigates, and writes a ticket summary. If that works, only then let it write code and push a PR. Generalisable rule: separate “agent can reason and report” from “agent can act on the world” and ship them in that order.
This mirrors Hartley’s local-optimisation curve and the “human-in-the-loop on escalation” guidance from the Agentic Engineering Strategy page.
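One way to enforce the “reason and report” versus “act on the world” split is an explicit autonomy level checked before every action. This is a sketch under assumptions: the `Autonomy` levels, the action names, and `authorise` are hypothetical illustrations of the bug-triage example, not part of any real framework.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    REPORT = 1  # investigate and write a ticket summary
    ACT = 2     # write code and push a PR


# Hypothetical action catalogue for the bug-triage example.
REQUIRED_LEVEL = {
    "read_logs": Autonomy.REPORT,
    "write_ticket_summary": Autonomy.REPORT,
    "push_pr": Autonomy.ACT,
}


class AutonomyError(PermissionError):
    """Raised when the agent attempts an action above its granted level."""


def authorise(action: str, granted: Autonomy) -> None:
    required = REQUIRED_LEVEL.get(action)
    if required is None:
        # Deny by default: unknown actions are never allowed.
        raise AutonomyError(f"unknown action: {action}")
    if granted < required:
        raise AutonomyError(
            f"{action} needs {required.name}, agent is at {granted.name}"
        )
```

Graduating the agent is then a one-line change to its granted level, made only after the reporting stage has earned trust.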

Production hardening - the six pillars (from Agents 102)

The FDE doesn’t just ship a demo; they ship a system that survives unattended operation. Vasuman’s framing (and his honest aside: “a lot of this is just software engineering”):

Input validation and normalisation before the agent’s core logic ever sees the data. Reject malformed inputs explicitly. Normalise common typos / abbreviations / case variations.
Graceful degradation - explicit behaviour when the agent can’t complete the task: partial result with what’s missing, or ask the user for clarification. “Worst possible outcome is an agent that silently produces garbage.”
Timeout and retry logic that distinguishes transient (rate limit, network blip) from permanent (invalid credentials) failures. Exponential backoff for the former; fail fast on the latter.
State checkpointing at natural workflow boundaries. Persist enough state to resume - inputs received, decisions made, work remaining. No restart-from-scratch.
Rate limiting and resource caps on every dimension the agent consumes: API calls/minute, DB writes/task, tokens/session, execution time/workflow. Runaway cost is a real and frequent failure mode.
Audit logging that captures every decision with full context: input state, retrieved external data, reasoning trace, chosen action and parameters, outcome, errors. Queryable, not just stored.
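As an illustration of the timeout-and-retry pillar, here is a minimal sketch that retries transient failures with exponential backoff and fails fast on permanent ones. `TransientError`, `PermanentError`, and `call_with_retry` are hypothetical names; in practice the caller would map real errors (rate limits, invalid credentials) onto these two classes. The random jitter is an addition not mentioned in the source.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Worth retrying: rate limit, network blip."""


class PermanentError(Exception):
    """Not worth retrying: invalid credentials and similar."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return fn()
        except PermanentError:
            raise  # fail fast, no retry
        except TransientError:
            if attempt >= max_attempts:
                raise  # retry budget exhausted
            # Exponential backoff: 0.5s, 1s, 2s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter (an assumption) spreads out simultaneous retries.
            sleep(delay + random.uniform(0, delay / 2))
            attempt += 1
```

The `sleep` parameter exists so tests can inject a stub instead of waiting in real time.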

Plus the evaluation-on-CI requirement: golden dataset runs on every PR that touches agent logic; regression below threshold blocks merge. Treats eval as critical-path infrastructure, not a nice-to-have. This is one of the sharper Vasuman points and the one most internal teams skip.
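A minimal sketch of the merge gate, assuming the golden-dataset run has already produced one score per example in the range 0 to 1. The function name `gate`, the use of a mean, and the threshold value are hypothetical; the source only specifies that a regression below threshold blocks the merge.

```python
def gate(scores: list[float], threshold: float) -> int:
    """Return a process exit code: 0 lets the merge proceed, 1 blocks it."""
    if not scores:
        # An empty eval run is treated as a failure, not a pass.
        print("no golden-set scores found; blocking merge")
        return 1
    mean = sum(scores) / len(scores)
    print(f"golden-set mean score {mean:.3f} (threshold {threshold:.3f})")
    return 0 if mean >= threshold else 1
```

A CI job would call something like `sys.exit(gate(scores, 0.85))` as its last step, so a non-zero exit code fails the check on the PR.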

Knowledge infra - the part that compounds

Agents depend on knowledge (product info, policies, historical context, domain expertise). How this is managed determines whether the agent gets sharper over time or quietly degrades.
Storage mix: vector DB for semantic/unstructured, relational or document DB for structured facts, config files for explicit policies. Production systems use all three.
Version every knowledge change. Without versioning, behavioural drift is undebuggable - you can’t ask “did the agent get worse because we changed the model or because we changed a policy doc?” This is a debuggability requirement, not a governance one.
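To make the “model or policy doc?” question answerable, each agent run can record both the model identifier and a version of the knowledge it used. The sketch below assumes a content hash as the version (a git commit or a database revision would work equally well); `knowledge_version`, `RunRecord`, and `what_changed` are hypothetical names.

```python
import hashlib
import json
from dataclasses import dataclass


def knowledge_version(docs: dict[str, str]) -> str:
    """Short content hash over all knowledge docs; any edit changes it."""
    payload = json.dumps(docs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass(frozen=True)
class RunRecord:
    run_id: str
    model_id: str
    knowledge_version: str


def what_changed(before: RunRecord, after: RunRecord) -> list[str]:
    """Name the inputs that differ between a good run and a bad one."""
    changed = []
    if before.model_id != after.model_id:
        changed.append("model")
    if before.knowledge_version != after.knowledge_version:
        changed.append("knowledge")
    return changed
```

Comparing the record of the last good run with the first bad one narrows a behavioural regression to the model, the knowledge, or both.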

Multi-agent orchestration - driven by context-window limits, not elegance

The honest reason to go multi-agent is that a single agent can’t hold enough context to span a complex workflow. Not because “specialists are better” - because the context window forces the split.
Pipeline architecture: agent A classifies, agent B processes, agent C formats. Sequence with hand-offs.
Parallel architecture: legal review agent + financial analysis agent + technical assessment agent all examining the same input, results merged. Useful when aspects are genuinely independent.
Cost: every agent added is more eval surface, more failure modes. Don’t go multi-agent until single-agent is genuinely insufficient.
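The two architectures differ only in how the agents are composed. In this sketch each “agent” is a stub coroutine standing in for a real model call; all function names and return values are hypothetical. The pipeline awaits each stage in sequence, and the parallel variant uses `asyncio.gather` to run independent reviews concurrently and merge the results.

```python
import asyncio


# Pipeline stages: each stub stands in for a separate agent.
async def classify(doc: str) -> dict:
    kind = "contract" if "agreement" in doc.lower() else "other"
    return {"doc": doc, "kind": kind}


async def process(item: dict) -> dict:
    return {**item, "summary": item["doc"][:40]}


async def format_output(item: dict) -> str:
    return f"[{item['kind']}] {item['summary']}"


async def pipeline(doc: str) -> str:
    # Sequence with hand-offs: A classifies, B processes, C formats.
    return await format_output(await process(await classify(doc)))


# Parallel reviewers: independent aspects of the same input.
async def legal_review(doc: str) -> str:
    return "legal: reviewed"


async def financial_analysis(doc: str) -> str:
    return "financial: reviewed"


async def technical_assessment(doc: str) -> str:
    return "technical: reviewed"


async def parallel_review(doc: str) -> dict[str, str]:
    legal, financial, technical = await asyncio.gather(
        legal_review(doc), financial_analysis(doc), technical_assessment(doc)
    )
    # Merge step: here just a dict, in practice possibly another agent.
    return {"legal": legal, "financial": financial, "technical": technical}
```

Each stub added here is another component that needs its own evals, which is the cost the section warns about.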

Why AI labs are hiring FDEs right now

They can’t sell pure intelligence. Capability is converging across Anthropic / OpenAI / Google / Meta and the price of tokens is dropping.
Customers are stuck. Most enterprises can’t translate “we have access to a frontier model” into “we have changed how a department operates.” That gap is where Applied AI revenue lives.
FDEs are the bridge: they sit with the customer, build the agent against real data and real context, and prove the ROI before walking away. Without them, the model is shelfware.
This is consistent with Levie’s “10 years of work for next-gen Accenture” framing on the Agentic Engineering Strategy page - same observation, different language.

Feeder backgrounds and the gaps each one has to close

Feeder backgrounds for FDEs, grouped by the gap each has to close:

Consultants / PMs - already have the “translate data to ROI” muscle. Gap: engineering depth. Closed by portfolio (production-ready agent, RAG pipeline on a domain dataset, custom eval framework, MCP server bridging an LLM to legacy software).
Software Engineers - already have the build muscle. Gap: communication. “If you can’t translate AI capability to a non-technical VP, you can’t be an FDE.” Closed by being able to explain every component of what was built - tech stack, iterations, business outcome, the pain point that justified the build in the first place.
The role compresses three skill sets that are rarely in one person. That’s why labs are paying so much for it.

Implications for Tim’s next chapter

This is where the cluster earns its keep. Treating the FDE pattern as data, not just a role description, here’s what falls out:

The AI-era PDE leadership story now has a clean external reference point. When pitching CTO/VPE narratives, “intelligence is commoditising, deployment is the edge” is a cleaner one-line thesis than the bottleneck-moved-upstream framing alone. The FDE pattern is the operational answer to “so what does the org actually do differently?” - it’s a role shape, not just a vibe shift.
The “agent fits when rules are distillable but inputs vary AND tool calls are involved” rule is the cleanest filter for triaging the agent-roadmap conversations that every leadership team is currently bad at. Worth lifting into cover-letter framing and any “how I’d think about your AI roadmap” interview answer. Most companies are about to spend money on the wrong third of that test.
The “don’t migrate data, wrap APIs” principle is a direct counter to the consulting-led “let us replatform you” pitch. For Tim’s positioning against generic transformation consultancies, this is a sharp differentiator: the FDE motion is lower risk and cheaper than the migration motion, not higher.
The “smallest unit of autonomy first, capability second” pattern matches Sitemate’s “review agent → code agent → PR agent” trajectory described in the Agentic Engineering Strategy page. The internal evidence and the external playbook align - this is the rollout pattern to advocate for.
The role gap analysis (Consultant/PM vs SWE) is useful for hiring framing. If Tim’s next chapter involves building a team that does internal-FDE-equivalent work (embedded with revenue, ops, or other functions), the gap-by-background framing predicts which background needs which development investment.
Naming collision to manage: the wiki now has both Hartley’s internal “Forward Deployed” model (10-stage product flow) and the external Palantir/Varick FDE role (customer-embedded Applied AI). When using “FDE” in conversation or writing, specify which. Easiest discipline: “Forward Deployed (internal product flow)” vs “FDE (customer-embedded engineer).”

Open threads

Internal-FDE-equivalent role design. If the FDE pattern is right for AI deployment, what’s the analogue inside a vertical SaaS like Sitemate? Levie’s “agent operator” role on the Agentic Engineering Strategy page is one answer; Sitemate’s existing “Forward Deployed” internal product flow is another. Worth a dedicated page once Tim’s next chapter clarifies whether he’s operating inside one company or across many.
The role’s career stability. FDE looks structurally similar to the early-stage solutions-architect role at enterprise SaaS shops. Worth checking whether it survives once labs build self-serve installation tooling, or whether (per the Palantir history) it persists indefinitely because environments stay messy.
Vasuman’s “Agents 103” if/when it ships - he flagged composition and hierarchies as the natural next post.

Sources

2026-05-21-vasuman-agents-fde-thread.md - source note with links to Vasuman’s three X posts (Agents 101, Agents 102, Forward Deployed Engineering)
Agentic Engineering Strategy - Sitemate-internal “Forward Deployed” 10-stage product flow (different concept, same name); Levie’s agent-operator role; local-optimisation curve; review-agent → code-agent → PR-agent rollout pattern
Engineering in the AI Era - macro thesis on where the bottleneck moved; comprehension debt; the “bottlenecks just move” framing the FDE role operationalises
Org Design in the AI Era - DRI + player-coach roles, queryable company - the org-shaped counterpart to the FDE role-shaped argument

Last Updated

2026-05-25
