Risk Tiers and Confirmation

Every plan the planner generates carries a risk assessment. That assessment determines how much friction sits between the plan and its execution.

The three tiers

Low risk

Commands that read state without modifying it.

Examples: ls, git status, pwd, cat, echo, df, ps

Behavior: the Run Plan button is immediately available. No confirmation step.

Medium risk

Commands that modify local state in recoverable ways.

Examples: git checkout, npm install, mv file.txt backup/, kill <pid>, chmod

Behavior: depends on the “Confirm medium-risk commands” setting:

Enabled (default): a checkbox appears, reading “I understand the risks of this medium-risk command.” Run Plan stays disabled until it is checked.
Disabled: behaves like low risk.

High risk

Commands that are destructive, escalate privileges, or have broad impact.

Examples: rm -rf, sudo, chmod -R 777, dd, mkfs, DROP TABLE

Behavior: a confirmation checkbox is always required, regardless of settings. The checkbox reads: “I understand the risks of this high-risk command.”
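
The gating rules for the three tiers can be sketched as a small pure function. This is only an illustration: the type and function names (RiskTier, ConfirmationState, needsConfirmation, isRunPlanEnabled) are hypothetical and not taken from the actual codebase; only the tier names, the setting, and the checkbox behavior come from the descriptions above.

```typescript
// Hypothetical names throughout; the real UI code may be structured differently.
type RiskTier = "low" | "medium" | "high";

interface ConfirmationState {
  risk: RiskTier;
  // Mirrors the "Confirm medium-risk commands" setting (enabled by default).
  confirmMediumRisk: boolean;
  // True once the user has ticked the "I understand the risks..." checkbox.
  acknowledged: boolean;
}

// Decides whether the confirmation checkbox must be shown for a plan.
function needsConfirmation(risk: RiskTier, confirmMediumRisk: boolean): boolean {
  if (risk === "high") return true; // always required, regardless of settings
  if (risk === "medium") return confirmMediumRisk; // depends on the setting
  return false; // low risk: no confirmation step
}

// Run Plan is enabled when no checkbox is needed, or when it has been ticked.
function isRunPlanEnabled(state: ConfirmationState): boolean {
  return !needsConfirmation(state.risk, state.confirmMediumRisk) || state.acknowledged;
}
```

Under these assumptions, a high-risk plan with the setting disabled still keeps Run Plan disabled until the checkbox is ticked, while a medium-risk plan with the setting disabled behaves like a low-risk one.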

How risk is determined

The planner (Ollama or mock) analyzes the generated command and sets several safety flags:

Flag	Meaning
`destructive`	Command deletes or irreversibly modifies data
`touchesFiles`	Command reads or writes the filesystem
`touchesNetwork`	Command makes network requests
`escalatesPrivileges`	Command uses sudo or equivalent
`requiresConfirmation`	Planner recommends explicit approval

The risk field is set by the planner based on these flags:

destructive or escalatesPrivileges → high
touchesFiles with write operations → medium
Read-only operations → low
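
The mapping above might look roughly like the following. The flag names come from the table; everything else is an assumption. In particular, the flags listed do not say whether a file access is a read or a write, so this sketch passes that in as a hypothetical writesFiles argument, and deriveRisk is an illustrative name rather than the planner's real function.

```typescript
// Flag names follow the table above; the interface itself is an assumption.
interface PlannerFlags {
  destructive: boolean;
  touchesFiles: boolean;
  touchesNetwork: boolean;
  escalatesPrivileges: boolean;
  requiresConfirmation: boolean;
}

type RiskTier = "low" | "medium" | "high";

// writesFiles is hypothetical: it stands in for however the planner
// distinguishes "touchesFiles with write operations" from read-only access.
function deriveRisk(flags: PlannerFlags, writesFiles: boolean): RiskTier {
  // Destructive or privilege-escalating commands are always high risk.
  if (flags.destructive || flags.escalatesPrivileges) return "high";
  // Filesystem writes that are not destructive are medium risk.
  if (flags.touchesFiles && writesFiles) return "medium";
  // Everything else is treated as read-only, hence low risk.
  return "low";
}

// Example: flags that a planner might plausibly set for `git status`.
const gitStatus: PlannerFlags = {
  destructive: false,
  touchesFiles: true,
  touchesNetwork: false,
  escalatesPrivileges: false,
  requiresConfirmation: false,
};
const gitStatusRisk = deriveRisk(gitStatus, false); // "low"
```

The order of the checks matters: the high-risk test runs first, so a command that both writes files and is destructive lands in the high tier. The stated rules do not mention touchesNetwork or requiresConfirmation, so the sketch leaves them out of the tier decision.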

Safety review

The plan review payload includes safetyFlags — an array of warning strings:

DESTRUCTIVE_OPERATION — the command destroys data
PRIVILEGE_ESCALATION — the command uses elevated permissions
NETWORK_ACCESS — the command accesses the network

These flags are generated by the backend and included in the plan review context.
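
As a rough illustration, the backend could build this array from the planner's boolean flags. The source does not state that the warning strings are derived this way, so treat the mapping, the buildSafetyFlags name, and the input shape as assumptions; only the three warning strings are taken from the list above.

```typescript
// The three warning strings listed above.
type SafetyWarning =
  | "DESTRUCTIVE_OPERATION"
  | "PRIVILEGE_ESCALATION"
  | "NETWORK_ACCESS";

// Hypothetical input shape: the subset of planner flags assumed to drive warnings.
interface WarningInputs {
  destructive: boolean;
  escalatesPrivileges: boolean;
  touchesNetwork: boolean;
}

// Assumed mapping from boolean planner flags to warning strings.
function buildSafetyFlags(flags: WarningInputs): SafetyWarning[] {
  const warnings: SafetyWarning[] = [];
  if (flags.destructive) warnings.push("DESTRUCTIVE_OPERATION");
  if (flags.escalatesPrivileges) warnings.push("PRIVILEGE_ESCALATION");
  if (flags.touchesNetwork) warnings.push("NETWORK_ACCESS");
  return warnings;
}

// Example: a command such as `sudo rm -rf` would plausibly set both
// destructive and escalatesPrivileges.
const sudoRmWarnings = buildSafetyFlags({
  destructive: true,
  escalatesPrivileges: true,
  touchesNetwork: false,
});
// sudoRmWarnings is ["DESTRUCTIVE_OPERATION", "PRIVILEGE_ESCALATION"]
```

A plan with none of these flags set yields an empty array, which a review UI could treat as "no warnings to display".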

The approval boundary

This is the fundamental safety contract: no AI-generated command executes without explicit user approval. The risk tier system adds graduated friction, but the base rule is absolute — the user always sees the command and always chooses to execute it.