This guide covers the most common operator issues in AgentRuntime. Start from Command Center for run-level triage, then use the sections below by symptom.

Failed runs

Find the error

  1. Open Command Center → Failed recently (last 24 hours)
  2. Click Open run to view the event log in Workflow Studio
  3. Find the first step with Failed status and read the error message
  4. Check other steps on the canvas — Cancelled and Skipped (Blocked) are expected on fail-fast runs (not bugs)

Step status on failed runs

When one step fails, the run ends in the terminal failed state. Other steps are labeled explicitly:
Studio badge | Meaning | Action
Failed | Root cause step | Fix model, credential, tool args, or graph
Cancelled | Was running in parallel when the run failed | No fix needed; audit only
Skipped (Blocked) | Never ran; blocked by failed dependency | No fix needed; will run on a new run after upstream succeeds
Success | Finished before the failure | Outputs remain in run context for inspection
If the Events panel still shows “Processing…” on a Cancelled step after a refresh, reload the run — the event log should include a step_cancelled event. See Runs and Command Center.
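The triage steps above can be sketched in a few lines. This is only an illustration: the step records and their `id`, `status`, and `error` fields are hypothetical, and the real event log schema may differ.

```python
from typing import Optional

# Hypothetical step records; the real event log schema may differ.
steps = [
    {"id": "fetch", "status": "Success"},
    {"id": "summarize", "status": "Failed", "error": "LLM_REQUEST_FAILED"},
    {"id": "notify", "status": "Cancelled"},
    {"id": "archive", "status": "Skipped (Blocked)"},
]

# Badges that need no fix on a fail-fast run, per the table above.
NO_FIX_NEEDED = {"Cancelled", "Skipped (Blocked)", "Success"}


def find_root_cause(steps: list) -> Optional[dict]:
    """Return the first step with Failed status, or None if no step failed."""
    for step in steps:
        if step["status"] == "Failed":
            return step
    return None


root = find_root_cause(steps)
ignorable = [s["id"] for s in steps if s["status"] in NO_FIX_NEEDED]
print(root["id"], ignorable)
```

Only the step returned by `find_root_cause` needs attention; the rest record what happened around it.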

Failure codes

Code | Meaning | What to do
MCP_TOOL_FAILED | MCP tool returned an error | Check tool args, connection binding, external API status
LLM_REQUEST_FAILED | LLM call failed | Verify provider key in Providers; confirm model ID
INSUFFICIENT_CREDITS | Out of credits | Top up PAYG or upgrade plan; see Billing
RUN_STALLED_OR_TIMED_OUT | No progress within timeout | Increase timeout_s; check for hung external API
DRAIN_TIMEOUT | Graceful stop took too long | Use immediate stop; investigate slow in-flight step
RUN_STOPPED | Operator stopped the run | Expected; start a new run if needed
HUMAN_TASK_FAILED | Human task step error | Check task payload; re-run with fixed graph
UPSTREAM_ERROR | Internal service unavailable | Retry later; contact support if persistent
DEPENDENCY_UNAVAILABLE | Could not reach workflow service | Check platform status; retry
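For scripted triage (for example, in an alerting hook), the table above can be kept as a simple lookup. The sketch below is an assumption about how an operator might use the codes; the `next_action` helper and its fallback message are hypothetical and not part of AgentRuntime.

```python
# Failure code -> suggested action, copied from the table above.
FAILURE_ACTIONS = {
    "MCP_TOOL_FAILED": "Check tool args, connection binding, external API status",
    "LLM_REQUEST_FAILED": "Verify provider key in Providers; confirm model ID",
    "INSUFFICIENT_CREDITS": "Top up PAYG or upgrade plan",
    "RUN_STALLED_OR_TIMED_OUT": "Increase timeout_s; check for hung external API",
    "DRAIN_TIMEOUT": "Use immediate stop; investigate slow in-flight step",
    "RUN_STOPPED": "Expected; start a new run if needed",
    "HUMAN_TASK_FAILED": "Check task payload; re-run with fixed graph",
    "UPSTREAM_ERROR": "Retry later; contact support if persistent",
    "DEPENDENCY_UNAVAILABLE": "Check platform status; retry",
}


def next_action(code: str) -> str:
    """Look up the suggested action; the fallback text is an assumption."""
    return FAILURE_ACTIONS.get(code, "Unknown code: gather run details and contact support")


print(next_action("DRAIN_TIMEOUT"))
```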

Retry after fixing

Failed runs are terminal. After fixing the root cause:
  1. Publish a new workflow version if the graph changed
  2. Start a new run — do not expect the old run to resume
Cancelled and Skipped (Blocked) steps on the old run are historical: they record what happened on that attempt. A new run executes the graph from the beginning (or from a future checkpoint API when shipped).
For transient API errors, add retry_count on flaky steps. See Workflow patterns.
Planned continue-from-failure (checkpoint retry, same-run retry) is documented in Run recovery (roadmap) and is not available yet.
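To illustrate what a per-step retry budget buys you for transient errors, here is a generic sketch. It is not AgentRuntime's implementation: it assumes retry_count means additional attempts after the first, and `run_with_retries` and `make_flaky` are hypothetical helpers.

```python
def run_with_retries(operation, retry_count: int):
    """Call operation up to 1 + retry_count times.

    Returns (result, attempts). Re-raises the last ConnectionError once the
    retry budget is spent. Assumption: retry_count counts extra attempts.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return operation(), attempts
        except ConnectionError:
            if attempts > retry_count:
                raise


def make_flaky(failures: int):
    """Build an operation that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("transient API error")
        return "ok"

    return operation


result, attempts = run_with_retries(make_flaky(2), retry_count=2)
print(result, attempts)
```

With two transient failures and a budget of two retries, the third attempt succeeds; with a budget of one, the step would fail and so would the run.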

Run appears stuck

Symptom | Likely cause | Action
Status paused | Operator paused or awaiting resume | Click Resume in Command Center
Pending approval | human_task waiting | Approve or reject in Command Center
Status draining | Graceful stop in progress | Wait, or escalate to immediate stop
Step running for a long time | Slow MCP/LLM call | Check external API; increase timeout; stop if needed
Status queued / starting | Startup delay | Wait 30s; check credits and validation errors
A run waiting on human approval is working as designed, not stuck.

MCP binding errors

Symptoms: MCP_TOOL_FAILED on the first tool step, validation errors mentioning bindings, or “instance not resolved” in dry-run.
  1. Validate the instance. Go to MCP, open Instance config for the instance, and confirm the connection is wired and the profile is active. Optionally run MCP validation at /mcp/{server_id}/validate.
  2. Check the connection. Confirm the bound connection exists and credentials are current. OAuth connections showing Reconnect need re-authorization.
  3. Check workflow overrides. If the workflow uses connection overrides, confirm the override points to a valid connection for this environment.
  4. Dry-run validate. In Workflow Studio, click Validate to surface binding errors before running.


Credit exhaustion

Symptoms: INSUFFICIENT_CREDITS failure code, runs fail to start, billing warnings in Console.
  1. Check balance. Go to Settings → Billing → Usage or call GET /v1/billing/usage.
  2. Identify spend. In Analytics, review credits consumed and usage history for heavy workflows.
  3. Add credits. Top up PAYG from Billing, or upgrade the plan if included credits are insufficient.
Credit spend order: trial → included → PAYG. See Billing and credits.
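The spend order can be made concrete with a small worked example. This is a sketch under assumptions: the bucket names, integer credit units, and the `allocate_spend` function are hypothetical and only illustrate the trial → included → PAYG ordering.

```python
# Order in which credit buckets are drawn down, per the line above.
SPEND_ORDER = ("trial", "included", "payg")


def allocate_spend(cost: int, balances: dict) -> dict:
    """Split a cost across buckets in trial -> included -> PAYG order.

    Raises ValueError if the combined balance cannot cover the cost,
    which corresponds to the INSUFFICIENT_CREDITS situation.
    """
    remaining = cost
    charged = {}
    for bucket in SPEND_ORDER:
        take = min(remaining, balances.get(bucket, 0))
        charged[bucket] = take
        remaining -= take
    if remaining > 0:
        raise ValueError("INSUFFICIENT_CREDITS")
    return charged


charged = allocate_spend(120, {"trial": 50, "included": 60, "payg": 100})
print(charged)
```

Here trial and included credits are used up first, and only the remaining 10 credits come out of PAYG.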

OAuth reconnect

Symptoms: Reconnect banner on a connection, 401/403 from Google or LinkedIn MCP tools, token refresh failures.
  1. Reconnect. On Connections, open the Google account card (or the relevant provider) and Reconnect / re-authorize.
  2. Re-validate MCP instances. Validate all instances bound to that connection.
  3. Check admin policy. For Google Workspace, confirm your admin allows third-party app access and required scopes.
Rotating credentials does not retroactively fix failed runs — start new runs after reconnect.

API 401 and 403 errors

401 Unauthorized

Cause | Fix
Missing Authorization header | Add Bearer pat_... or session cookie
Expired session | Log in to Console again; refresh the token
Revoked PAT | Create a new PAT

403 Forbidden

Cause | Fix
Insufficient project role | Writes need project_contributor; project_viewer allows reads only
Missing PAT scope | Add workflow:run or mcp:execute to the token
Wrong tenant/project context | Send X-Tenant-Id and X-Project-Id headers
Tenant admin action as contributor | Billing mutations need tenant_admin
See API authentication and Roles and permissions.
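A client-side pre-flight check can catch the header-related causes before a request is sent. The sketch below assumes PAT authentication (a session cookie is the other option named above); `build_headers` and `missing_auth_pieces` are hypothetical helpers, and the example token and IDs are placeholders.

```python
def build_headers(pat: str, tenant_id: str, project_id: str) -> dict:
    """Assemble the headers named in the 401 and 403 tables."""
    return {
        "Authorization": f"Bearer {pat}",
        "X-Tenant-Id": tenant_id,
        "X-Project-Id": project_id,
    }


def missing_auth_pieces(headers: dict) -> list:
    """List header problems that would likely produce a 401 or 403."""
    problems = []
    if not headers.get("Authorization", "").startswith("Bearer "):
        problems.append("401 risk: missing Authorization header")
    for name in ("X-Tenant-Id", "X-Project-Id"):
        if not headers.get(name):
            problems.append(f"403 risk: missing {name}")
    return problems


# Placeholder values for illustration only.
headers = build_headers("pat_example", "tenant-1", "project-1")
print(missing_auth_pieces(headers))
print(missing_auth_pieces({"X-Tenant-Id": "tenant-1"}))
```

This check cannot detect a revoked PAT, a missing scope, or an insufficient role; those only surface in the server's response.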

Validation errors (422)

Dry-run validation (POST /v1/workflows/{id}/validate) catches issues before execution:
Error | Fix
Dependency cycle | Remove circular depends_on
Missing tool_name / server_url | Complete MCP step configuration
Lua syntax error | Fix script in lua_script steps
Unknown MCP instance | Install instance, fix reference, or use Import to remap server_url bindings
Invalid template | Check {{steps.id.result}} references exist
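To see what a dependency cycle looks like, it can help to check the graph locally. The sketch below is a standard depth-first search and not the platform's validator; it assumes the workflow can be reduced to a mapping of step ID to its depends_on list, which is a simplification of the real graph format.

```python
from typing import Optional


def find_cycle(depends_on: dict) -> Optional[list]:
    """Return one dependency cycle as a list of step IDs, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {step: WHITE for step in depends_on}
    path = []

    def visit(step: str) -> Optional[list]:
        color[step] = GRAY
        path.append(step)
        for dep in depends_on.get(step, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path: close the loop.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[step] = BLACK
        return None

    for step in list(depends_on):
        if color[step] == WHITE:
            found = visit(step)
            if found:
                return found
    return None


cycle = find_cycle({"a": ["c"], "b": ["a"], "c": ["b"]})
print(cycle)
```

Removing any one depends_on edge along the reported path breaks the cycle.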

Template fields look wrong after editing

Symptom | What to do
Nested or duplicated {{ / }} in a prompt or start-input field | Clear the field; re-insert variables from the variable picker or Template intellisense
Variable left as literal text in run output | Confirm the path matches a real step ID and field; use Validate (dry-run) before running
Undo (Ctrl+Z) left the field in an odd state | Clear and re-pick from suggestions, or use the Variables tab to copy a clean path
See Feature availability for preview vs GA features.
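Both template symptoms (stray braces and references to steps that do not exist) can be spotted with a quick local check. This is a rough sketch, not the Studio validator: it assumes placeholders have the form {{steps.<id>.<field>}}, and the regular expressions and helper names are hypothetical.

```python
import re

# A well-formed placeholder: {{ ... }} with no braces inside.
PLACEHOLDER = re.compile(r"\{\{[^{}]*\}\}")
# Captures the step ID from {{steps.<id>.<field>}} (assumed syntax).
STEP_REF = re.compile(r"\{\{\s*steps\.([\w-]+)\.[\w.]+\s*\}\}")


def has_broken_braces(text: str) -> bool:
    """True if {{ or }} remain after removing well-formed placeholders."""
    leftover = PLACEHOLDER.sub("", text)
    return "{{" in leftover or "}}" in leftover


def unknown_step_refs(text: str, step_ids: set) -> list:
    """Return referenced step IDs that are not in step_ids, sorted."""
    return sorted({ref for ref in STEP_REF.findall(text) if ref not in step_ids})


print(has_broken_braces("Summary: {{{{steps.fetch.result}}}}"))
print(unknown_step_refs("{{steps.fetch.result}} {{steps.old.result}}", {"fetch"}))
```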

Memory errors (preview)

Error | Fix
memory kernel not configured | Memory is not enabled in this environment
queue unavailable | Extract infrastructure is down; contact support
Empty search results | Confirm the index job completed; see Memory

Still blocked?

  1. Gather: workflow ID, run ID, failure code, step ID, timestamp
  2. Check Command Center and the run event log
  3. Email support@agentruntime.io with the details
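If you file support requests often, a small helper can make sure none of the five details is missing before you send the email. The field names and `format_report` function below are hypothetical; only the list of details comes from the steps above.

```python
# The five details listed in step 1; key names are illustrative.
REQUIRED_FIELDS = ("workflow_id", "run_id", "failure_code", "step_id", "timestamp")


def missing_fields(details: dict) -> list:
    """Return required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not details.get(name)]


def format_report(details: dict) -> str:
    """Render the details as plain text for the email body."""
    missing = missing_fields(details)
    if missing:
        raise ValueError("missing details: " + ", ".join(missing))
    return "\n".join(f"{name}: {details[name]}" for name in REQUIRED_FIELDS)


# Placeholder values for illustration only.
report = format_report({
    "workflow_id": "wf-example",
    "run_id": "run-example",
    "failure_code": "MCP_TOOL_FAILED",
    "step_id": "fetch",
    "timestamp": "2025-01-01T00:00:00Z",
})
print(report)
```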