You can't just prompt-engineer your way to a great customer experience.
Building with agents for internal productivity is one thing. Building agents for customers is a different challenge—one that requires cross-functional orchestration before a sketch, roadmap, or single line of code is written.
The biggest shift in the agentic era isn't a new Figma plugin. It's the shift from designing static interfaces to designing coordinated behavior—across signals, guardrails, channels, and handover points that most teams haven't mapped yet. I built the Agent UX Canvas to make that work concrete.

Each block surfaces a decision that can't be made by one person or one team. Expect to revisit them across multiple sessions with product, engineering, legal, and design before they're truly complete.
Define what the agent is trying to accomplish and for whom
Intent is the north star of your agent. It answers: if this agent runs perfectly, what state of the world has changed? A well-formed intent distinguishes the agent's immediate output from the actual outcome. Intent also sets scope—what the agent is responsible for vs. what sits outside its remit.
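One way to make the output/outcome distinction concrete is to write the intent down as structured data. The sketch below is illustrative, not a prescribed schema; the refund agent and all field names are hypothetical examples.

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    """What the agent is trying to accomplish, and for whom."""
    audience: str            # who the agent serves
    output: str              # the immediate artifact the agent produces
    outcome: str             # the changed state of the world if it runs perfectly
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)

# A hypothetical customer-support agent, for illustration only.
refund_agent = Intent(
    audience="customers with a failed order",
    output="a drafted refund decision with reasoning",
    outcome="customer is refunded, or given a clear denial, within one session",
    in_scope=["refunds under $200"],
    out_of_scope=["chargebacks", "account bans"],
)

# The test of a well-formed intent: output and outcome are not the same sentence.
assert refund_agent.output != refund_agent.outcome
```

Forcing `output` and `outcome` into separate fields is the point: if they collapse into one, the intent isn't well-formed yet.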
Identify the inputs the agent reads to form context and decisions
Signals are the raw material an agent uses to understand its situation — structured data, unstructured content, user-provided context, and environmental state. A critical distinction: required signals (without which the agent cannot proceed) vs. enrichment signals that improve quality but aren't blockers.
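The required-vs-enrichment distinction can be encoded directly in the signal inventory, so the agent knows when it must wait or ask rather than proceed. Signal names and sources below are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    name: str
    source: str        # e.g. "user message", "orders DB", "CRM" (hypothetical sources)
    required: bool     # True: the agent cannot proceed without it

SIGNALS = [
    Signal("order_id", "user message", required=True),
    Signal("order_history", "orders DB", required=True),
    Signal("customer_tier", "CRM", required=False),  # enrichment: improves quality, not a blocker
]

def missing_required(available: set[str]) -> list[str]:
    """Names of required signals not yet available; non-empty means wait or ask the user."""
    return [s.name for s in SIGNALS if s.required and s.name not in available]

assert missing_required({"order_id"}) == ["order_history"]
assert missing_required({"order_id", "order_history"}) == []  # missing enrichment never blocks
```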
Specify the conditions that initiate agent behavior
Triggers define activation logic: what observed state or event causes the agent to begin a task. They can be event-driven, schedule-driven, threshold-driven, or manually initiated. Explicitly mapping triggers prevents agents that fire too eagerly on ambiguous signals — and agents that sit idle, waiting for conditions that are never clearly met.
Catalog the tools, operations, and capabilities the agent can invoke
Actions are the agent's vocabulary of behavior — operations it executes against the world. API calls, file reads and writes, form submissions, database queries, communication sends, calls to sub-agents. Each action carries a cost profile (latency, failure rate) and a risk level (read-only vs. destructive).
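An action catalog can carry the cost profile and risk level alongside each operation, so downstream blocks (autonomy, guardrails) can key off them. The actions and numbers below are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Risk(Enum):
    READ_ONLY = "read_only"       # safe to retry freely
    REVERSIBLE = "reversible"     # writes that can be undone
    DESTRUCTIVE = "destructive"   # irreversible; highest scrutiny

@dataclass(frozen=True)
class Action:
    name: str
    p50_latency_ms: int           # cost profile: typical latency
    failure_rate: float           # cost profile: observed failure rate
    risk: Risk

# Illustrative catalog for a hypothetical support agent.
CATALOG = [
    Action("lookup_order", 120, 0.01, Risk.READ_ONLY),
    Action("draft_reply", 900, 0.02, Risk.REVERSIBLE),
    Action("issue_refund", 400, 0.03, Risk.DESTRUCTIVE),
]

def needs_extra_checks(action: Action) -> bool:
    """Destructive actions get additional review regardless of confidence."""
    return action.risk is Risk.DESTRUCTIVE

assert [a.name for a in CATALOG if needs_extra_checks(a)] == ["issue_refund"]
```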
Calibrate how much the agent decides vs. defers to humans
Autonomy is a spectrum. At one end: the agent executes entirely without review. At the other: it surfaces recommendations and waits for explicit approval. The right level depends on the stakes, the reliability of the agent, and the cost of a mistake. Map each action to an autonomy tier and identify the escalation path when confidence is low.
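Mapping actions to autonomy tiers, with a confidence-based escalation path, might look like the following sketch. The tier names, action names, and the 0.8 confidence floor are all placeholders to be set per deployment.

```python
from enum import Enum

class AutonomyTier(Enum):
    AUTO = "execute without review"
    NOTIFY = "execute, then inform a human"
    APPROVE = "recommend and wait for explicit approval"

# Illustrative mapping: each catalogued action gets a default tier.
TIERS = {
    "lookup_order": AutonomyTier.AUTO,
    "draft_reply": AutonomyTier.NOTIFY,
    "issue_refund": AutonomyTier.APPROVE,
}

def effective_tier(action: str, confidence: float, floor: float = 0.8) -> AutonomyTier:
    """Escalate to human approval when the agent's confidence drops below the floor."""
    tier = TIERS[action]
    if confidence < floor:
        return AutonomyTier.APPROVE
    return tier
```

The escalation path lives in one function, so "what happens when confidence is low" is an explicit, reviewable decision rather than scattered conditionals.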
Define who grants permission and what the agent is allowed to access
Authorization is distinct from autonomy. Autonomy governs how often humans review decisions; authorization governs what the agent is permitted to do at all. This block surfaces the consent model — who must opt in, what data is readable vs. storable, what actions require a second party. It also covers revocation: how does a user withdraw agent access, and how quickly?
Design the surfaces and interaction patterns users engage with
Widgets are the touchpoints where agent and user meet. Inventory the interface surfaces: chat threads, dashboard panels, email digests, in-app notifications, API webhooks, voice interfaces. For each surface, define the interaction modality — does the user provide input upfront, respond mid-task, or receive a completed output?
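The surface inventory can be written down as a surface-to-modality map, which makes gaps visible (e.g. an agent with no mid-task surface cannot ask clarifying questions). Surface names here are examples, not a canonical list.

```python
from enum import Enum

class Modality(Enum):
    UPFRONT = "user provides input before the task starts"
    MID_TASK = "user responds while the task is running"
    OUTPUT_ONLY = "user receives a completed output"

# Illustrative inventory for one hypothetical agent.
SURFACES = {
    "intake_form": Modality.UPFRONT,
    "chat_thread": Modality.MID_TASK,
    "email_digest": Modality.OUTPUT_ONLY,
    "dashboard_panel": Modality.OUTPUT_ONLY,
}

def interactive_surfaces() -> list[str]:
    """Surfaces where the agent can ask a clarifying question mid-task."""
    return [s for s, m in SURFACES.items() if m is Modality.MID_TASK]

assert interactive_surfaces() == ["chat_thread"]
```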
Set the constraints that prevent harmful, wrong, or off-scope behavior
Guardrails are the system's immune response. Hard constraints (actions the agent must never take), soft constraints (behaviors that trigger review), content filters, rate limiters, and output validators. Design around failure mode analysis: what are the realistic ways this agent could behave badly, and what check catches each one?
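The hard/soft distinction maps naturally to three verdicts: block, review, or pass. The action names, thresholds, and the single output validator below are placeholders; a real policy would come out of your failure-mode analysis.

```python
def check_guardrails(action: str, output: str) -> str:
    """Return 'block', 'review', or 'pass' for one proposed action and its output."""
    HARD_BLOCKS = {"delete_account", "change_billing"}  # actions the agent must never take
    SOFT_REVIEW = {"issue_refund"}                      # allowed, but a human looks first

    if action in HARD_BLOCKS:
        return "block"                    # hard constraint: no override path
    if action in SOFT_REVIEW:
        return "review"                   # soft constraint: route to a reviewer
    if len(output) > 2000:                # output validator: suspiciously long replies
        return "review"
    return "pass"
```

The key design property is that every realistic failure mode you identified has at least one line here that catches it; a failure mode with no matching check is an open risk.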
Define the metrics and instrumentation that reveal agent performance
You cannot improve what you don't measure. Define outcome metrics (did the user accomplish their goal?) and operational metrics (latency, error rate, token cost, task completion rate). Identify leading indicators that predict problems before users notice. Instrumentation design is part of this: which events must be logged, at what granularity, and how is ground truth established?
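Instrumentation starts with a structured event schema and the metrics computed from it. The event names and fields below are one possible shape, not a standard.

```python
import json
import time

def log_event(event_type: str, task_id: str, **fields) -> str:
    """Emit one structured telemetry record as a JSON line; field names are illustrative."""
    record = {
        "ts": time.time(),
        "event": event_type,   # e.g. "task_started", "tool_called", "task_completed"
        "task_id": task_id,
        **fields,
    }
    return json.dumps(record)

def completion_rate(events: list) -> float:
    """Operational metric: fraction of started tasks that reached completion."""
    started = {e["task_id"] for e in events if e["event"] == "task_started"}
    done = {e["task_id"] for e in events if e["event"] == "task_completed"}
    return len(done & started) / len(started) if started else 0.0
```

Because every metric is derived from logged events, "which events must be logged" is answered by working backwards from the metrics you committed to.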
Build the feedback infrastructure that drives continuous improvement
Measurements tell you what happened; the feedback loop determines what you do about it. Design the pipeline from raw telemetry to actionable improvement: user ratings, implicit signals (retry rate, abandonment, manual edits), and automated eval suites. How does the agent learn — through prompt updates, fine-tuning, retrieval refreshes? Define a deployment cadence and rollback protocol.
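A rollback protocol can be as simple as gating each deployment on a window of aggregated feedback. The implicit-signal fields and both thresholds below are placeholders to tune per product.

```python
def review_deploy(window: list, retry_limit: float = 0.15, rating_floor: float = 3.5) -> str:
    """Gate a deployment on aggregated feedback from one evaluation window.

    Each item is a dict like {"retried": bool, "rating": float | None, "edited": bool};
    the shape and thresholds are illustrative, not a standard.
    """
    retry_rate = sum(1 for w in window if w["retried"]) / len(window)
    rated = [w["rating"] for w in window if w["rating"] is not None]
    avg_rating = sum(rated) / len(rated) if rated else None

    if retry_rate > retry_limit:
        return "rollback"   # implicit signal: users are retrying too often
    if avg_rating is not None and avg_rating < rating_floor:
        return "rollback"   # explicit signal: ratings fell below the floor
    return "keep"
```

Running this check on a cadence turns "deployment cadence and rollback protocol" from a promise into an executable policy.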