AI SDK Harnesses Make Agents Swappable
Vercel is adding HarnessAgent to AI SDK 7. The practical lesson is that agent runtimes are becoming app infrastructure, not just CLI tools.
Vercel's new AI SDK harness abstraction is a small API announcement with a bigger product signal: coding agents are starting to look like runtime infrastructure. If you are building tools around Claude Code, Codex, Pi, or whatever agent comes next, the interesting part is not another wrapper. It is the separation between model choice and agent-runtime choice.
That distinction matters because a serious agent is no longer just generateText() with a bigger prompt. It has a workspace, tools, permissions, sessions, compaction, skills, approvals, and a way to stream what is happening back to a user. Those behaviours are sticky. Once your product depends on one harness, switching becomes hard unless the harness itself is behind a stable boundary.
What Vercel shipped
The Vercel changelog says AI SDK 7 introduces HarnessAgent, a single API for running established agent harnesses, initially including Claude Code, Codex, and Pi. Vercel's phrasing is the cleanest summary: AI SDK already let you switch models without rewriting your agent; now it wants to let you switch the harness the same way.
In the docs, a harness is defined as a complete agent runtime: workspace access, built-in coding tools, native session state, compaction, permission flows, runtime configuration, and adapter-specific behaviour. HarnessAgent then exposes that runtime through familiar AI SDK shapes: generate(), stream(), GenerateTextResult, StreamTextResult, and stream parts that can be consumed by existing UI surfaces.
The initial adapter list is deliberately familiar:
- Claude Code, via @ai-sdk/harness-claude-code;
- Codex, via @ai-sdk/harness-codex;
- Pi, via @ai-sdk/harness-pi.
More adapters are listed as coming soon, including Amp, DeepAgents, Goose, Mastra, and OpenCode. Whether those exact adapters matter is less important than the pattern: the framework is treating agent runtimes like provider integrations.
Why this is different from model abstraction
Model abstraction is old news. The AI SDK, LiteLLM, OpenRouter, Vercel AI Gateway, and half the internal platforms in the world already make it easier to swap gpt-*, Claude, Gemini, Grok, or open-weight models behind one interface.
Harness abstraction is a different layer. A model call answers a prompt. A harness runs a working session.
That session may contain:
- a sandboxed filesystem;
- shell and file-editing tools;
- approval prompts for risky actions;
- session memory and compaction;
- skills or reusable instructions;
- pending work that needs to be resumed later;
- provider-specific events that need to be rendered somewhere.
That is why the docs repeatedly separate providers from harnesses. Providers expose models to AI SDK Core functions. Harnesses expose agent runtimes to HarnessAgent. If you blur those layers, you end up pretending that agent behaviour is just model behaviour with a bigger context window. It is not.
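To make the two layers concrete, here is a minimal sketch of how they might be kept apart in application code. Every name in it (ModelProvider, HarnessRuntime, WorkSession, createToyHarness) is hypothetical and invented for illustration; none of it is the AI SDK's actual API. The point is only that the model layer is a stateless call, while the harness layer hands back a session that owns state.

```typescript
// Hypothetical types for illustration only; none of these names come from the AI SDK.

// Model layer: a stateless call that answers one prompt.
interface ModelProvider {
  readonly modelId: string;
  complete(prompt: string): string;
}

// Harness layer: a runtime that hands out stateful working sessions.
interface HarnessRuntime {
  readonly harnessId: string;
  startSession(workspaceDir: string): WorkSession;
}

interface WorkSession {
  readonly workspaceDir: string;
  readonly history: string[];
  run(task: string): string;
}

// A toy harness that delegates reasoning to whichever model it is given.
function createToyHarness(harnessId: string, model: ModelProvider): HarnessRuntime {
  return {
    harnessId,
    startSession(workspaceDir) {
      const history: string[] = [];
      return {
        workspaceDir,
        history,
        run(task) {
          history.push(task); // state lives in the session, not in the caller
          return `[${harnessId}/${model.modelId}] ${model.complete(task)}`;
        },
      };
    },
  };
}

// Two stand-in models; real ones would call a provider.
const echoModel: ModelProvider = { modelId: "echo-1", complete: (p) => `echo: ${p}` };
const upperModel: ModelProvider = { modelId: "upper-1", complete: (p) => p.toUpperCase() };

// Swapping the model leaves the harness code untouched.
const sessionA = createToyHarness("toy-a", echoModel).startSession("/tmp/repo");
const sessionB = createToyHarness("toy-a", upperModel).startSession("/tmp/repo");
```

In this toy version the caller never passes history back in. That is the property that distinguishes a session from a plain model call.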
The practical builder lesson
If you are adding an agent to a product, do not start by asking which model is smartest. Start by asking which layer owns the work.
For a lightweight chatbot, a normal model call with tools is probably enough. You control the loop, tool schemas, memory, structured output, retry logic, and UX. That is still the right choice when the agent is part of your product's own workflow.
For a coding assistant, repository fixer, migration bot, test-repair worker, or sandboxed technical operator, a harness often makes more sense. The harness already knows how to inspect files, edit code, run commands, manage a workspace, and handle permissions. Rebuilding all of that with custom tools is expensive and easy to get subtly wrong.
Sessions are the real API surface
The HarnessAgent API docs make one detail hard to miss: sessions matter. A harness session owns the runtime, sandbox, working directory, native conversation history, and pending approvals. That is different from a normal chat route where the app usually sends the whole message history back to the model each turn.
For builders, this changes the architecture:
- you need to create, persist, resume, detach, stop, or destroy sessions deliberately;
- your UI should not assume message replay is enough;
- your backend needs durable resume state if a user leaves and returns;
- your observability should capture workspace changes and tool events, not just text output.
The AI SDK UI harness guide shows the same point from the frontend side. Harness streams can be rendered through useChat(), but the route should resume or create a HarnessAgentSession for the chat id instead of replaying all previous UI messages into a model. That is a meaningful mental-model shift.
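The resume-or-create pattern can be sketched without touching the real API. The code below is an assumption-laden illustration: StoredSession, SessionRegistry, and handleTurn are made-up names, the in-memory Map stands in for whatever durable store a real backend would use, and nothing here reflects the actual HarnessAgentSession interface.

```typescript
// Hypothetical shapes for illustration; this is not the HarnessAgentSession API.
interface StoredSession {
  chatId: string;
  workspaceDir: string;
  turns: number;
  status: "active" | "detached" | "stopped";
}

class SessionRegistry {
  // In-memory only; a production backend would persist this durably.
  private sessions = new Map<string, StoredSession>();

  // Resume the session bound to this chat id, or create one on first contact.
  resumeOrCreate(chatId: string): StoredSession {
    const existing = this.sessions.get(chatId);
    if (existing && existing.status !== "stopped") {
      existing.status = "active";
      return existing;
    }
    const created: StoredSession = {
      chatId,
      workspaceDir: `/workspaces/${chatId}`,
      turns: 0,
      status: "active",
    };
    this.sessions.set(chatId, created);
    return created;
  }

  // The user left mid-conversation: keep state so a later request can resume.
  detach(chatId: string): void {
    const session = this.sessions.get(chatId);
    if (session && session.status === "active") session.status = "detached";
  }

  // Explicit teardown: the next request for this chat id starts fresh.
  stop(chatId: string): void {
    const session = this.sessions.get(chatId);
    if (session) session.status = "stopped";
  }
}

// Only the newest user message is forwarded; prior turns live in the session.
function handleTurn(registry: SessionRegistry, chatId: string, latestMessage: string): string {
  const session = registry.resumeOrCreate(chatId);
  session.turns += 1;
  return `turn ${session.turns} in ${session.workspaceDir}: ${latestMessage}`;
}
```

The route handler never receives the full message list, only the chat id and the latest message. If the registry loses a session, no amount of message replay restores the workspace, which is why resume state has to be durable.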
Sandboxes and approvals become product features
One line in the overview is easy to skim past: AI SDK agent harnesses operate in a sandbox. That should not be treated as implementation trivia. It is the difference between "an agent can run commands" and "an agent can run commands inside a bounded environment with a lifecycle you control."
The tools documentation also separates built-in harness tools from host-executed AI SDK tools. Built-ins are executed by the harness runtime. Host tools run in your application process, and can receive a restricted sandbox handle. Approvals are split too: permissionMode controls adapter-native built-ins, while toolApproval controls host-executed tools.
That split is exactly where product teams should slow down. A coding agent demo can look magical when every command is allowed. A production feature needs a much narrower answer to questions like:
- which tools can read files?
- which tools can edit files?
- which commands need explicit approval?
- what happens if the user closes the tab mid-turn?
- where do we store resume state?
- how do we show diffs, command output, and irreversible actions?
Those are not model-selection questions. They are product, security, and infrastructure questions.
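One way to force answers to those questions is to write the policy down as data before wiring up any agent. The sketch below is a hypothetical policy table, not the AI SDK's permissionMode or toolApproval configuration; the names Decision, ToolRequest, policies, and decide are assumptions made for illustration. It keeps separate rules for harness built-ins and host-executed tools, mirroring the split described above.

```typescript
// Hypothetical approval policy; not the AI SDK's permissionMode or toolApproval API.
type Decision = "allow" | "ask" | "deny";
type Origin = "builtin" | "host"; // who executes the tool
type Kind = "read" | "edit" | "command";

interface ToolRequest {
  origin: Origin;
  kind: Kind;
  command?: string; // only meaningful when kind === "command"
}

interface Policy {
  read: Decision;
  edit: Decision;
  command: Decision; // default for commands not on the allow-list
  allowedCommands: ReadonlySet<string>;
}

// Separate rules per execution layer; the values are examples, not recommendations.
const policies: Record<Origin, Policy> = {
  builtin: {
    read: "allow",
    edit: "ask",
    command: "ask",
    allowedCommands: new Set<string>(["git status", "git diff", "npm test"]),
  },
  host: {
    read: "allow",
    edit: "deny",
    command: "deny",
    allowedCommands: new Set<string>(),
  },
};

function decide(req: ToolRequest): Decision {
  const policy = policies[req.origin];
  if (req.kind === "command") {
    const cmd = (req.command ?? "").trim();
    // Exact match only: prefix matching would let "git diff; rm -rf /" through.
    if (policy.allowedCommands.has(cmd)) return "allow";
  }
  return policy[req.kind];
}
```

The exact-match rule is deliberately strict. Anything that is not literally on the allow-list falls back to the layer's default, so a chained or modified command reaches a human instead of running silently.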
Do not hide harness differences too aggressively
The danger with any abstraction is pretending every backend has the same semantics. The AI SDK docs seem aware of that: they say harness configuration follows AI SDK patterns where they fit and diverges where the runtime has different behaviour. That is the right stance.
Claude Code, Codex, and Pi may all expose a stream and a session, but they will not behave identically. Their native tools, permission flows, compaction strategies, terminal behaviour, and failure modes will differ. A good harness layer should make common cases portable without flattening the parts users actually care about.
For app builders, I would design the UI around stable concepts rather than specific agent brands:
- session state;
- workspace files;
- tool calls;
- approvals;
- diffs;
- logs;
- final answer;
- resumability.
Then expose the harness choice as an implementation detail until there is a real user reason to make it visible.
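As a rough sketch of what designing around stable concepts could look like, the code below maps two invented raw event formats onto one UI event type. HarnessARaw and HarnessBRaw are made up for illustration and do not describe the real output of Claude Code, Codex, Pi, or the AI SDK stream parts; UiEvent, fromHarnessA, fromHarnessB, and render are likewise hypothetical.

```typescript
// Stable, brand-neutral events the UI renders. All names here are hypothetical.
type UiEvent =
  | { type: "tool-call"; tool: string; input: string }
  | { type: "approval"; tool: string; reason: string }
  | { type: "diff"; path: string; patch: string }
  | { type: "final-answer"; text: string };

// Two invented raw formats standing in for different harness backends.
type HarnessARaw =
  | { kind: "exec"; cmd: string }
  | { kind: "patch"; file: string; unified: string }
  | { kind: "done"; message: string };

type HarnessBRaw =
  | { event: "tool_use"; name: string; args: string }
  | { event: "needs_permission"; name: string; why: string }
  | { event: "result"; output: string };

// One small adapter per backend; the rest of the app never sees raw events.
function fromHarnessA(raw: HarnessARaw): UiEvent {
  switch (raw.kind) {
    case "exec":
      return { type: "tool-call", tool: "shell", input: raw.cmd };
    case "patch":
      return { type: "diff", path: raw.file, patch: raw.unified };
    case "done":
      return { type: "final-answer", text: raw.message };
  }
}

function fromHarnessB(raw: HarnessBRaw): UiEvent {
  switch (raw.event) {
    case "tool_use":
      return { type: "tool-call", tool: raw.name, input: raw.args };
    case "needs_permission":
      return { type: "approval", tool: raw.name, reason: raw.why };
    case "result":
      return { type: "final-answer", text: raw.output };
  }
}

// The renderer depends only on UiEvent, so it does not change when the backend does.
function render(e: UiEvent): string {
  switch (e.type) {
    case "tool-call":
      return `tool ${e.tool}: ${e.input}`;
    case "approval":
      return `approve ${e.tool}? ${e.reason}`;
    case "diff":
      return `diff ${e.path}`;
    case "final-answer":
      return `answer: ${e.text}`;
  }
}
```

Note that the two adapters do not cover the same event types: only the second emits an approval and only the first emits a diff. That asymmetry is intentional, since a shared event vocabulary should let backends differ rather than force each one to fake features it lacks.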
What I would build with it first
I would not start with a general-purpose "agent in my app" feature. I would start with bounded technical workflows where a real coding harness is obviously useful:
- inspect a repo and explain the test setup;
- fix a failing test inside a sandbox;
- generate a small migration diff;
- review a pull request and produce patch suggestions;
- run a docs update against a known codebase;
- turn a support ticket into a reproducible failing case.
Those tasks have clear workspaces, clear outputs, and natural approval points. They also benefit from harness-native behaviour. A model with a few custom tools can inspect files, but a real coding agent runtime is built around that job.
The bigger signal
The web-agent story is splitting into layers. Browser-facing standards like WebMCP are about giving agents safer ways to operate inside websites. Backend model APIs are about raw reasoning and generation. Harness APIs are about packaging agent runtimes as something applications can call, stream, resume, and sandbox.
That is the part worth paying attention to. The long-term winner may not be a specific adapter, CLI, or model. It may be the product architecture that treats agents as stateful workers with permissions, sessions, tools, and UI events from day one.
If you are building with agents now, the immediate takeaway is simple: do not let one runtime's quirks leak through your entire app. Draw a boundary around the session, the workspace, the approval model, and the stream. Then you can change the harness later without rebuilding the product around it.