TL;DR: An in-admin AI agent can build WordPress integrations safely if it works plan-first: the model proposes an ordered plan of typed steps as JSON, and the plugin executes them locally under your permission model.
- A strict JSON envelope (message + clarifying questions + plan) replaces native tool calling — so the loop works over any transport, including the WordPress 7.0 AI Client.
- Safety is structural: new webhooks are created disabled, going live or deleting requires explicit confirmation, and every mutation snapshots prior state onto an undo stack.
- Steps execute one at a time; a failure halts the run without touching later steps.
- A dev trace (model, latency, full prompt, parse status) is the only way to debug an agent you cannot see.
/ Why
Why put an AI agent inside wp-admin?
Because the agent can touch what an external chatbot cannot: the site's real state. An in-admin agent building a webhook integration reads the actual trigger catalog, inspects captured payloads from hooks that already fired, sees the webhooks that already exist, and probes destination endpoints from the server — so it works from real data instead of guesses. This is the design behind Build with AI, the agent coming in the next major release of the Webhook Actions plugin: describe an integration in plain language, and the agent proposes, builds, and tests the webhooks, field mappings, conditions, and chains.
The catch is obvious: a language model with write access to a production site is a liability unless the architecture makes unsafe actions structurally hard. That is what plan-first execution is for.
/ The loop
What is a plan-first agent loop?
A loop where the model never mutates anything during the conversation. Each turn, the model receives the transcript plus a system prompt describing the available typed operations, and replies with a proposal: a short assistant message, optional clarifying questions, and an optional ordered plan of steps. The plugin normalizes the plan, shows it to the user for review and editing, and only then executes it — step by step, locally, through its own code paths.
The system prompt makes the contract explicit: the agent must never claim to have changed anything itself; it proposes, the plugin disposes. It is also told to prefer action over interrogation — when the goal is clear, propose the full plan with sensible defaults, leave genuinely unknown fields blank, and ask only for those.
/ Protocol
How does a JSON envelope replace tool calling?
The model is instructed to reply with a single JSON object and nothing else. That envelope is the whole protocol — which means the loop runs identically over the Anthropic, OpenAI, and Google APIs and over the WordPress 7.0 AI Client, which has no native tool-calling at all.
JSON — the envelope the model must return
{
"assistant_message": "I'll send new WooCommerce orders to your n8n workflow.",
"clarifying_questions": ["What is the n8n webhook URL?"],
"plan": [
{
"id": "step_1",
"ability": "create_webhook",
"summary": "Create the order webhook (disabled)",
"input": { "name": "Orders to n8n", "endpoint_url": "" }
},
{
"id": "step_2",
"ability": "test_dispatch",
"summary": "Send a test delivery",
"input": { "webhook_id": "{{step_1.id}}" }
}
]
} Three details make this production-grade rather than a demo. Steps are typed: each names one ability from a fixed catalog, and its input must match that ability's JSON Schema — the model cannot invent operations. Steps can reference earlier results: {{step_1.id}} is substituted with the real id at run time, so the model can chain a build without knowing database ids. And parsing is defensive: strip code fences, fall back to the outermost brace span, and if the reply still is not valid JSON, treat the whole text as a plain assistant message with no plan — a malformed reply degrades to conversation, never to a broken mutation.
/ Safety
How does step-by-step execution stay safe?
Safety lives in the executor, not in the prompt. The prompt asks the model to behave; the executor makes misbehaviour inert:
- New webhooks are created disabled. The agent can build a complete integration, but nothing fires until a human flips it live.
- Destructive steps carry confirmation metadata. Enabling, deleting, or editing a live webhook always pauses for explicit confirmation; an endpoint probe pauses only when its HTTP method is unsafe.
- Steps run one at a time. The frontend advances the plan step by step; missing required input, an unmet prerequisite, or a needed confirmation pauses the run at that exact step.
- Failure halts, never cascades. A failed step stops the run with the error attached; later steps are never attempted against a half-built state.
Everything also passes the same capability checks as the admin UI — the agent holds no special powers, and credential values stay in a write-only vault the agent can reference but never read.
The model never calls tools natively — it proposes typed steps, and the plugin executes them locally. The site stays in control. — the core contract
/ Undo
How do undo and revert work?
Before the executor mutates anything, it snapshots the object's prior state onto the step record. Undo then walks the applied steps backwards: the last still-applied revertible step is found, its pre-state restored — or the object it created deleted — and the step marked reverted. Repeated calls walk further back, giving you a real undo stack rather than a single "oops" button.
One subtle but important touch: each undo is recorded into the conversation transcript. The model sees what was undone on its next turn, so it does not confidently reference a webhook that no longer exists.
/ Debugging
How do you debug an agent you cannot see?
With a trace, or not at all. Every model call records the provider and model that answered, latency, temperature, the full system prompt, the exact message array sent, the raw response, whether the JSON envelope parsed, and how many plan steps it contained. When a user reports "the AI did something weird", that trace is the difference between a fix and a shrug: a parse failure, a provider quota error, and a genuinely bad plan all look identical from the chat UI but completely different in the trace.
The trace matters double when your transport can silently switch models. With cross-provider fallback in play, "which model actually produced this plan?" must be answerable per request, not per configuration.
/ Toolset
What does the agent actually control?
A fixed registry of 17 typed abilities — the same toolset the plugin publishes to the WordPress Abilities API for external clients like Claude Code over MCP. One catalog of operations, one permission story, three consumers: the in-admin agent, the REST surface, and MCP tooling (the Abilities API article covers that side).
The system prompt also injects a compact catalog of the webhooks that already exist on the site, flagged with their numeric ids — so when the user says "rename the order webhook", the agent proposes an update to webhook #12 instead of creating a duplicate. Grounding the model in current state is as much a safety feature as the confirmation gates.
| Concern | Chatbot that "just does it" | Plan-first agent (Build with AI) |
|---|---|---|
| Mutations | Model output parsed and applied immediately | Typed plan, reviewed and edited, stepped through |
| Destructive actions | One hallucination from deleting live config | Created disabled + explicit confirm to go live or delete |
| Failure mid-task | Half-applied state, no rollback | Run halts; undo stack restores pre-state snapshots |
| Auditability | A chat log | Per-step activity log + full model trace |
This architecture powers Build with AI, coming in the next major release of the Webhook Actions plugin: describe the integration you want, review the plan, and it creates queued, retried, fully logged webhooks — disabled until you say go.