Risk Tiers and Confirmation

Every plan generated by the planner carries a risk assessment. The assessed tier (low, medium, or high) determines how much friction sits between the plan and execution.

Low risk: commands that read state without modifying it.

Examples: ls, git status, pwd, cat, echo, df, ps

Behavior: the Run Plan button is immediately available. No confirmation step.

Medium risk: commands that modify local state in recoverable ways.

Examples: git checkout, npm install, mv file.txt backup/, kill <pid>, chmod

Behavior: depends on the “Confirm medium-risk commands” setting:

  • Enabled (default): a checkbox appears: “I understand the risks of this medium-risk command.” Run Plan is disabled until checked.
  • Disabled: behaves like low risk.

High risk: commands that are destructive, escalate privileges, or have broad impact.

Examples: rm -rf, sudo, chmod -R 777, dd, mkfs, DROP TABLE

Behavior: confirmation checkbox always required, regardless of settings. The checkbox reads: “I understand the risks of this high-risk command.”
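The gating rules for the three tiers can be summarized in a few lines. The following is a minimal sketch, not the application's actual code: the names `Risk`, `ConfirmationState`, `confirmMediumRisk`, `acknowledged`, `needsConfirmation`, and `isRunPlanEnabled` are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical types; the real application may model this differently.
type Risk = "low" | "medium" | "high";

interface ConfirmationState {
  risk: Risk;
  // Assumed name for the "Confirm medium-risk commands" setting (default: true).
  confirmMediumRisk: boolean;
  // Whether the user has ticked the "I understand the risks..." checkbox.
  acknowledged: boolean;
}

// Returns true when a confirmation checkbox must be shown for this tier.
function needsConfirmation(risk: Risk, confirmMediumRisk: boolean): boolean {
  if (risk === "high") return true; // always required, regardless of settings
  if (risk === "medium") return confirmMediumRisk; // depends on the setting
  return false; // low risk: no confirmation step
}

// Returns true when the Run Plan button should be enabled.
function isRunPlanEnabled(state: ConfirmationState): boolean {
  return (
    !needsConfirmation(state.risk, state.confirmMediumRisk) ||
    state.acknowledged
  );
}
```

Under these assumptions, turning the setting off changes only the medium tier; a high-risk plan stays blocked until the checkbox is ticked.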

The planner (Ollama or mock) analyzes the generated command and sets several safety flags:

Flag                  Meaning
destructive           Command deletes or irreversibly modifies data
touchesFiles          Command reads or writes filesystem
touchesNetwork        Command makes network requests
escalatesPrivileges   Command uses sudo or equivalent
requiresConfirmation  Planner recommends explicit approval

The risk field is set by the planner based on these flags:

  • destructive or escalatesPrivileges → high
  • touchesFiles with write operations → medium
  • Read-only operations → low
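The three rules above can be written as a small function. This is a sketch under stated assumptions rather than the planner's real logic: the `PlannerFlags` interface mirrors the documented flag names, but `deriveRisk` and its `writesFiles` parameter are hypothetical, since the documentation does not say how the planner detects write operations. Treating every remaining case as low risk is likewise an assumption.

```typescript
// Mirrors the documented planner flags; the interface name is hypothetical.
interface PlannerFlags {
  destructive: boolean;
  touchesFiles: boolean;
  touchesNetwork: boolean;
  escalatesPrivileges: boolean;
  requiresConfirmation: boolean;
}

type Risk = "low" | "medium" | "high";

// writesFiles is an assumed extra input: touchesFiles covers both reads and
// writes, so something else must say whether the command writes.
function deriveRisk(flags: PlannerFlags, writesFiles: boolean): Risk {
  // destructive or escalatesPrivileges -> high
  if (flags.destructive || flags.escalatesPrivileges) return "high";
  // touchesFiles with write operations -> medium
  if (flags.touchesFiles && writesFiles) return "medium";
  // Assumption: anything left is treated as read-only -> low
  return "low";
}
```

The checks run from most to least severe, so a command that is both destructive and file-writing is classified as high rather than medium.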

The plan review payload includes safetyFlags — an array of warning strings:

  • DESTRUCTIVE_OPERATION — the command destroys data
  • PRIVILEGE_ESCALATION — the command uses elevated permissions
  • NETWORK_ACCESS — the command accesses the network

The safetyFlags entries are generated by the backend and included in the plan review context.
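One plausible way to produce the safetyFlags array is to map the planner's boolean flags to warning strings. The sketch below assumes that mapping (destructive to DESTRUCTIVE_OPERATION, escalatesPrivileges to PRIVILEGE_ESCALATION, touchesNetwork to NETWORK_ACCESS); the documentation lists the strings but does not state how the backend derives them, and `buildSafetyFlags` is a hypothetical name.

```typescript
// The three documented warning strings.
type SafetyWarning =
  | "DESTRUCTIVE_OPERATION"
  | "PRIVILEGE_ESCALATION"
  | "NETWORK_ACCESS";

// Subset of the planner flags that this assumed mapping uses.
interface WarningInputs {
  destructive: boolean;
  escalatesPrivileges: boolean;
  touchesNetwork: boolean;
}

// Assumed mapping from boolean flags to warning strings, in a fixed order.
function buildSafetyFlags(flags: WarningInputs): SafetyWarning[] {
  const warnings: SafetyWarning[] = [];
  if (flags.destructive) warnings.push("DESTRUCTIVE_OPERATION");
  if (flags.escalatesPrivileges) warnings.push("PRIVILEGE_ESCALATION");
  if (flags.touchesNetwork) warnings.push("NETWORK_ACCESS");
  return warnings;
}
```

With this sketch, a read-only local command yields an empty array, so the plan review would show no warnings.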

The fundamental safety contract is that no AI-generated command executes without explicit user approval. The risk tier system adds graduated friction, but the base rule is absolute: the user always sees the command and always chooses to execute it.