Back to overview
// reference

The Manual.

Every command, flag, slash command and config knob in Knockout. If you're after the guided introduction, start with the 5-minute tutorial.

// 01

Installation

Knockout ships as a single Go binary called ko. The installer autodetects your OS (Linux, macOS) and architecture (amd64, arm64), downloads the matching release, drops it in /usr/local/bin/ko and marks it executable.

install
# One-liner
curl -fsSL https://fightclub.pro/install.sh | bash

# Verify
ko version

If the installer can't reach the release archive (e.g. air-gapped machine), it falls back to go install from source, which requires Go 1.22+ on the PATH. For fully manual installs, grab a release binary from https://fightclub.pro/releases/ko-<os>-<arch>, make it executable, move it into your PATH.

All of ko's state lives in ~/.ko/: credentials, session history, memory, skills and settings. Projects can also have a local .ko/directory for project-scoped memory and snapshots.

// 02

Authentication

Ko uses scoped API tokens. ko auth login is the interactive path, it opens your browser at the Fight Club platform, you sign in, grant the CLI scope, and a token is written to ~/.ko/credentials.json.

auth
ko auth login       # browser OAuth — for personal machines
ko auth token <tok> # paste an existing token — for CI/headless
ko auth status      # show which account + wallet balance

For CI, servers, or containers where browser-based OAuth isn't possible, create an API token from the web dashboard at /ko/settings, then runko auth token <token> to cache it locally.

The API token itself doesn't expire, revoke it from the platform dashboard if needed. For unlocking your BYOK provider keys so ko can decrypt them, see Vault (BYOK unlock).

// 03

Vault (BYOK unlock)

Your provider API keys (Anthropic, OpenAI, etc.) are stored encrypted on the server under your vault password. Before ko can use those keys it needs the vault unlocked. You do this once per machine with ko vault unlock. The derived vault key is cached locally in ~/.ko/credentials.json (perms 0600) so every subsequent ko invocation silently reactivates the CLI vault session, no prompts, no timers.

vault
ko vault unlock   # prompts for vault password (once per machine)
ko vault status   # show whether the vault is locked/unlocked + local cache
ko vault lock     # revoke the CLI session + wipe the cached key

Web and CLI sessions are independent. Unlocking from ko does not open the web session, and the web UI's 4-hour idle lock does not affect the CLI, the CLI session has no expiration.

If the machine is lost or compromised: log into the web UI and change your vault password. This re-encrypts every BYOK entry under a new vault key; any cached key on the lost machine is now useless (ko's auto-restore checks the vault verifier before re-creating the session and silently fails against a stale key).

If you never use your own provider keys (only fc:* platform-pool models), you don't need to run ko vault unlock at all, those models route through platform-managed credentials that never touch the vault.

// 04

Sessions

A session is a conversation between you and ko, scoped to the current working directory. Every message, tool call and file edit is persisted to the server so you can resume from any machine with the same account.

sessions
ko                      # start a new session in $(pwd)
ko "quick question"     # start with an initial prompt
ko -p "one-shot"        # non-interactive: print answer, exit
ko -c                   # continue the most recent session in $(pwd)
ko -r                   # pick from a list of recent sessions
ko --resume-id <id>     # jump straight to a specific session
ko sessions             # list all past sessions
ko resume <id>          # same as --resume-id

Sessions are bound to their origin directory. If you start ko in ~/project-a and later run ko -c from ~/project-b, you'll get a fresh session for B, ko won't mix them up.

The first time you run ko in a new directory, it auto-registers the folder as a ko project. Subsequent runs pick up the same project record.

// 05

Pipelines

A pipeline is the program ko runs on each task: an ordered sequence of steps, each with its own model, system prompt, tool allowlist, temperature, retry policy and conditional next-step logic. Switching pipelines is how the same ko becomes a coder, a legal analyst, a financial bull-vs-bear debater, or a copywriter.

The shipped pipelines cover the main verticals:

KO_default

The default agent. Handles any task, code, research, writing, analysis. Self-retries on failure. What you get when you just run ko.

KO_consensus

Plan, then 3 frontier models from 3 vendors implement in parallel, a judge picks the best, then verify. For critical code where quality matters more than cost.

KO_legal

Legal generalist, drafts, reviews, researches, redlines. Saves output to a file with disclaimers.

KO_legal_adversarial_review

5-step adversarial contract review: read → argue party A → argue party B → synthesize risks → write report.

KO_business

Business strategy, market research, competitive analysis, GTM.

KO_finance

Investment analysis, valuation, financial modeling, stress tests.

KO_research

Literature review, methodology critique, research writing with citations.

KO_hr

HR policies, hiring, performance, compensation, conflict guidance.

KO_creative

Copywriting, long-form, editing, ideation with multiple variants.

Swap pipelines mid-session with the /pipeline slash command. Your conversation history is preserved; the next task uses the new flow. Build custom pipelines in the visual editor at /dashboard/pipelines.

~
/pipeline              # list available pipelines (TUI)
/pipeline KO_consensus # switch to a named pipeline

Switching a pipeline mid-session also resets any model override you picked with /model, so the new pipeline runs its own per-step models. The TUI prints "Model reset to pipeline default"to make the reset explicit.

During a pipeline run, each step transition banner shows which model is handling that phase, e.g. ─── Plan → Implement: Step Implement · claude-sonnet-4-6 ───. The top-bar model label also swaps to the step's model as soon as streaming starts.

Per-step caps

Each step in the editor has three caps. The pipeline editor surfaces them as Max $, Max In, Max Out (hover the icons next to each label for the live definitions). In short:

Max $ (maxCostUsd)

Per-step spend ceiling. Reactive, a call in flight can't be stopped before we know its cost. Enforced between models in fallback strategy (the next model won't start if the cap is already hit) and used to flip the step outcome to budget_exceeded at step end so transitions can branch on it. Parallel strategies (consensus, present_all) fire all N models concurrently, so this only affects transitions for them, not mid-flight spend.

Max In (maxInputTokens)

Input-token cap for the prompt. When the estimated context exceeds this, older messages are dropped (system + tail preserved) until the prompt fits ~70% of the cap, leaving headroom for the response.

Max Out (maxOutputTokens)

Output-token cap per call. Passed to the provider as max_tokens (hard-stops generation at the API) and feeds the pre-call wallet check, we refuse to start a request whose worst-case cost would exceed your wallet. Defaults to 32k when blank.

// 06

Pipeline design

Pipelines are where ko goes from "generic agent" to "opinionated workflow that beats a single-model baseline." Design matters: the same three frontier models wired differently (parallel vote vs. plan→implement→verify vs. adversarial A/B debate) produce wildly different quality, latency and cost profiles. This section unpacks every field in the editor at /dashboard/pipelines.

Anatomy of a step

Every step in a pipeline has the following fields. Most are surfaced as form controls in the visual editor at /dashboard/pipelines; all of them show up in the JSON if you export or edit a pipeline config directly.

name

Unique identifier within the pipeline (e.g. Plan, Implement, Verify). Referenced by transitions.goto, shown in the step-change banner during execution, and used for loop protection (any one name can be visited at most 3 times per session to prevent runaway loops).

role

Free-form label for the step's purpose, planner, implementer, reviewer, tester, verifier. Doesn't affect execution, purely documentation for whoever reads the pipeline next.

models

Array of model refs. One entry = the step is a normal single-model LLM call. Two or more entries = multi-model step; multiModelStrategy decides how they're combined. Refs can be slot:<name> (abstract, resolved by the platform), fc:<provider>/<model-id> (direct Fight Club / OpenRouter), or user:<config-id>:<model-id> (your own provider keys).

systemPrompt

The per-step system prompt, the primary lever you pull to shape behavior. A "planner" step and an "implementer" step can use the same model and differ only in this field. Ko prepends its own base prompt (tool descriptions + ground rules) before injecting yours, so your text should describe the step's role, inputs, expected output and failure modes. Don't re-explain what the tools do.

tools

Allowlist of tool names this step is permitted to call (from the 15 in the Agent tools section). Shorter lists = more focused behavior. A planner with [Read, Glob, Grep, Lsp] can explore but not modify; an implementer adds [Write, Edit, Patch, Bash]; a verifier with only [Bash] can run tests but not fix them. Empty array = text-only mode (no tools at all).

temperature

Sampling temperature, 0-2 range. Controls how deterministic the model is. 0-0.3 for code implementation and verification (want the same answer every time); 0.3-0.7 for planning, analysis, and review (need some exploration); 0.7-1.0 for creative writing and ideation. Above 1.0 is rarely useful and mostly produces chaos.

maxRetries

How many times a single LLM call is retried on provider-level errors (network, rate limit, transient 5xx) before the step fails. Independent of transition-level retry via goto: _self. Typical values: 0 (fail fast), 2 (default resilience), 5 (for flaky providers). Retries reuse the same model; switch to the fallback multi-model strategy if you want retries against a different model.

maxCostUsd

Per-step USD spend ceiling. Reactive, a single call in flight can't be stopped mid-stream, but enforced between iterations and between fallback models, and used to flip the step outcome to budget_exceeded so your transitions can branch. See the tooltip in the editor for per-strategy semantics.

maxInputTokens

Input-token cap for the prompt. When the estimated context exceeds this, ko drops older messages (system + tail preserved) until the prompt fits ~70% of the cap. Use this to keep long conversations from ballooning cost on cheap summarizer steps.

maxOutputTokens

Output-token cap per call. Hard-stopped by the provider via max_tokens, and fed into the pre-call wallet check (we refuse to start a request whose worst-case cost would exceed your wallet). Blank = 32k default.

transitions

Array of { condition, goto } rules evaluated top-to-bottom after the stage runs; the first match wins and its goto decides the next stage. Conditions match the stage's derived outcome string (success, failure, the verdict on a Verify stage, etc.). The outcome is computed by the stage's outcomeRule from the structured artifact the model emits. The special condition default matches everything and belongs as the last rule. Covered in detail below.

hooks

Optional { pre: [...], post: [...] } of bash commands that run before/after the step, executed through the agent's tool flow (so they also require permission). Useful for environment setup (git pull, pnpm install), in-pipeline verification (pnpm test, ruff check), or cleanup. Hook failures are logged as tool output but don't short-circuit the step.

multiModelStrategy

Only relevant when models.length > 1. Picks the multi-model flow: fallback, present_all, consensus, or chain. Single-model steps ignore this field. Full breakdown in the next subsection.

judge

Optional { model, prompt? } config used by present_all and consensus strategies to auto-pick a winner when the user doesn't choose within consensusTimeoutSec. If omitted, the timeout falls back to "first valid candidate". The judge's prompt is a standard instruction like "Pick the answer that best satisfies X; respond only with a JSON object { "pick": <index>, "reason": "..." }."

consensusTimeoutSec

How long the picker waits for user input before auto-picking via judge (or first-valid fallback). Defaults to 30s. Shorter = less interactive; longer = more human-in-the-loop.

Multi-model strategies

When models has more than one entry, the step runs all N through one of four strategies, chosen by multiModelStrategy. Each has different cost, latency and quality trade-offs:

fallback

Try models sequentially until one succeeds. Use when: you want a primary + backup in case of provider errors or rate limits. Cost: pays only for models that actually run (typically just the first). Latency: serial. Honors maxCostUsd between models; won't start the next one if the running total exceeds the cap.

present_all

Run all N in parallel; user or judge picks the winner. Use when: you want human-in-the-loop quality selection. The TUI shows all candidates side-by-side and times out after consensusTimeoutSec.Cost: pays for all N. Latency: max of the N models.

consensus

Run all N in parallel; auto-pick if similar, prompt if divergent. Use when: you want a speed/quality check, if frontier models agree you ship fast, if they disagree you stop and look. Cost: pays for all N. Latency: max of the N models + tiny similarity check.

chain

Pipe each model's output into the next as refinement input. Use when: you want iterative improvement (draft → tighten → polish). Cost: pays for all N, each with growing input context. Latency: serial sum. Diminishing returns past 3 models.

Transitions

Transitions are the state-machine layer. After a stage emits its structured artifact, an outcome string is derived from the artifact via the stage's outcomeRule (e.g. simpleSuccess returns success or no_work; fieldEnum:verdict reads the artifact's verdict field and returns it directly). That outcome is then matched top-to-bottom against the stage's transitions array; the first match's goto picks the next stage.

The outcome rules ship with the platform; the common ones are:

  • simpleSuccess, returns no_work when the artifact set no_work: true, otherwise success.
  • implementOutcome, returns success if any executed step in the artifact reported completed, else failure.
  • fieldEnum:verdict, reads the artifact's verdict field (pass / fail on Verify stages).
  • fieldEnum:outcome, reads the artifact's outcome field (Test / Review stages emit this directly).

The goto side supports three reserved targets plus any step name:

_next

The step immediately after this one in the array (normal flow).

_self

Rerun the same step. Loop protection caps any step at 3 visits to prevent runaway cost.

_done

End the pipeline successfully.

<step-name>

Jump to any named step (useful for branching: on tests_fail go back to Implement).

The special condition default matches everything and is almost always the last rule, it's the fallback when no named condition matched.

example transitions
transitions:
  - { condition: "tests_fail",       goto: "Implement" }   # loop back
  - { condition: "security_issue",   goto: "SecurityFix" } # branch
  - { condition: "budget_exceeded",  goto: "_done" }       # stop on cap
  - { condition: "default",          goto: "_next" }       # otherwise advance

Common patterns

A few recurring shapes worth copying:

Router → Level pipeline

A fast, cheap model classifies the task's complexity (1-5) and hands off to a tailored level pipeline. KO_default uses this shape, level 1 is answered by the router itself; level 5 triggers the full planner → implementer → verifier chain.

Plan → Implement → Verify

Three steps, three models. Planner uses a strong reasoner at temperature 0.5 with read-only tools to produce a plan document. Implementer uses a fast coder at temperature 0 with write/edit/bash to execute the plan. Verifier uses a cheap model with bash-only to run tests and emit tests_fail if anything broke, transition loops back to Implement.

Consensus planning

KO_consensus's Plan step uses multiModelStrategy: consensus across three frontier models. If they agree the pipeline auto-picks and moves on; if they diverge the user is shown all three plans to choose from. Catches the "one model silently went off the rails" failure mode.

Adversarial review

KO_legal_adversarial_review's pattern: one step argues for party A, one step argues for party B (same model, opposite system prompts), a synthesizer step reads both and ranks risks. Works well whenever you want a devil's advocate baked into the process.

Design principles

  • Shorter tool allowlists beat longer ones. A planner that can't write files won't try to implement. A reviewer that can only Read + Grep won't "fix" problems instead of reporting them.
  • Use slots, not fixed model addresses. slot:code_frontier lets the platform swap the concrete model without touching your pipeline. Your KO_custom stays current through frontier model upgrades automatically.
  • Cheaper verifiers, expensive planners. Planning errors cascade; verification errors are local. Spend the budget where a bad decision is costliest to undo.
  • Branch on the artifact, not on text. When a stage's outcome has more signal than success/failure ("tests failed", "security issue found", "needs human review"), pick an outcomeRule that reads the structured field carrying that signal (e.g. fieldEnum:verdict for a Verify stage, fieldEnum:outcome for Review). The model emits the field; the orchestrator does the routing.
  • Cap ambition with maxCostUsd. Even if the cap fires reactively (after the step finishes), it's still the fastest way to keep a runaway _self loop from draining a wallet.
  • Test pipelines on small tasks. Edit your pipeline in the visual editor, switch to it with /pipeline <name>, and ask something cheap like "summarize README.md" to watch the flow execute before committing to an expensive run.
// 07

Models & slots

Ko uses model slots, abstract roles like code_frontier, code_planner, code_verifier, budget_fast. The platform admin maps each slot to a concrete model address (e.g. fc:anthropic/claude-sonnet-4.6). Pipelines reference slots, not concrete models, so the same pipeline works across model upgrades without edits.

19 providers are supported, routed through the platform pool for Fight Club models and directly for user-managed API keys:

~
OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI (Grok),
Cohere, Groq, Together, Fireworks, Cerebras, SambaNova,
Perplexity, OpenRouter, Ollama, Azure OpenAI, Bedrock,
HuggingFace, Replicate

Override the model for a single session with -m, or swap mid-session with the /model command inside the TUI:

~
ko -m fc:openai/gpt-4o "summarize src/"
ko -m user:<config-id>/mistral-large "write a README"
/model   # list models; pick a number to switch for the rest of this session

The fc: prefix uses Fight Club's platform wallet; the user: prefix uses your own vault-encrypted provider config. Add your own provider keys at /settings on the dashboard.

Model picks are session-scoped, they are not persisted to disk. Closing the TUI and reopening it starts fresh with the pipeline's own per-step models. Swapping pipelines mid-session also drops any active override (see Pipelines). If you want a sticky preference, pass-m on every launch or configure it platform-side in your slot mapping.

// 08

Agent tools

Ko agents have 15 tools. Each pipeline step declares which subset it can use.

Bash

Run a shell command and capture stdout/stderr. Subject to permission prompts unless auto-allowed.

Read

Read a file at an absolute path.

Write

Create or overwrite a file. Creates pre-edit snapshots for /rollback.

Edit

Replace an exact string in a file. Safer than Write for targeted changes.

Patch

Apply a unified diff (git diff / diff -u format) atomically, handles multi-hunk, multi-file changes in one call. Preferred over chained Edits when the change spans many locations. Pre-validates paths, snapshots every touched file for /rollback.

Glob

Find files by glob pattern (e.g. src/**/*.ts).

Grep

Ripgrep-backed content search with file-type and glob filters.

Lsp

Language-server queries, definition, references, hover, workspaceSymbol. Uses gopls, typescript-language-server, pyright, or rust-analyzer depending on file extension. Semantic, finds type-scoped uses of a symbol that Grep can't. Reports clearly when the required server isn't installed.

WebFetch

Download a URL's content. Useful for fetching docs or small files.

WebSearch

Search the web, current events, unfamiliar libraries, benchmarks.

Diff

Show a git diff between refs or working tree.

Clipboard

Read/write the system clipboard.

Docker

Run containerized commands, builds, tests, isolated workloads.

SQL

Execute queries against a configured database URL.

AskUser

Ask the user a free-form question mid-task when guessing would waste significant work. The next typed line is the answer; ESC cancels with an explicit "user cancelled" response. Degrades to an error in --print mode so batch runs don't hang.

Pipelines also use auto-generated Emit{Schema} tools (one per stage, named after the stage's declared artifactSchema: EmitPlan, EmitImplement, EmitVerify, etc.). Each stage ends when the model calls its emit tool with a structured artifact; the orchestrator validates the artifact against the schema and the stage's contract rule, then transitions. You don't define these tools; the orchestrator wires them in per stage based on artifactSchema.

// 09

Skills

Skills are prompt + tool bundles that teach ko a specific domain or workflow (e.g. "write Next.js Server Components correctly", "apply these brand guidelines", "run this CI workflow"). The marketplace at /dashboard/skills lists installable community skills.

skills
ko skills                    # list installed skills
ko skills install <id>       # install from the store
ko skills uninstall <name>   # remove a skill
ko skills update             # update all installed skills
ko skills create <name>      # scaffold a new skill in the current project

Skills are sourced from ~/.ko/skills/ (global) and .ko/skills/ (project-local). Global skills apply to every session; project skills only apply when ko runs in that directory.

Skill Seekers are a specialized skill that read external documentation URLs and cache them as an in-agent reference, point at any library's docs site and ko becomes an instant expert in that library.

// 10

Memory

Memory is a lightweight, file-based long-term store. Two scopes:

~/.ko/memory/

Global memory. Facts that apply across every project, your name, preferences, coding style, tools you always reach for.

.ko/memory/

Project memory. Scoped to a single directory. Project quirks, team conventions, in-flight decisions, known bugs.

memory
ko memory list           # list all memory files (global + project)
ko memory edit           # open ~/.ko/memory/MEMORY.md in $EDITOR
ko memory edit <file>    # open a specific memory file

Inside a session, ko writes to memory autonomously when you tell it something worth remembering ("I always use pnpm, not npm"), it'll save a memory file and recall it in future sessions. The index file MEMORY.md is loaded into every session's system prompt; individual memory files are loaded on demand.

// 11

RAG features

The memory section above covers human-edited memory files. This section covers the vector-indexed surfaces ko manages for you: workspace index, cross-session memory, session scratchpad and audit log. Chunks are retrieved by meaning rather than by exact string match, which is the difference between "find the auth handler" landing the right file and returning nothing.

All four sit on the same Ringside vector_stores API that ships to external customers, so ko is its own biggest user of that infrastructure.

Workspace (ko workspace)

Indexes the repo so the agent can answer concept queries. A one-shot walk on first init, plus an fsnotify watcher that re-embeds changes with a 5-second debounce. The walker honors .gitignore and an optional .koignore for ko-specific exclusions. It skips files larger than 1 MB and sniffs the first 4 KB for null bytes to filter binaries. Files matching common secret patterns (.env, *.pem, id_rsa*, secrets.*) are never embedded.

workspace
ko workspace init           # walk the repo, embed every kept file
ko workspace init --no-watch   # skip the background fs-watcher
ko workspace sync           # one-shot reconcile after offline edits
ko workspace status         # show indexed/pending/skipped counts
ko workspace drop           # delete the workspace store entirely

The differentiator vs Cursor or Claude Code is that retrieval is semantic, not lexical. Ask "where do we rate-limit incoming webhooks" and the agent pulls the right middleware even if the file never uses the word "webhook". Internally the agent calls a VectorSearch tool, so retrieval is paid only when used and the chunks come back as cite-able tool output.

Cross-session memory (ko memory)

Opt-in. When enabled, every completed session is summarized by a cheap model (Gemini Flash or Haiku) and the summary is embedded into your personal episodic store. New sessions get the top 5 relevant past sessions prepended to the system prompt, and the agent can VectorSearch it mid-session if it wants more.

memory
ko memory enable             # turn on episodic indexing, backfill the last 100 sessions
ko memory disable            # stop indexing new sessions, keep what's already there
ko memory status             # show enabled state, episode count, retention
ko memory search "<query>"   # forensic search across your own session history
ko memory export <path>      # dump every indexed episode as JSON
ko memory wipe --yes         # drop the store and soft-delete the episodes

Episodes older than 365 days detach from the vector store but stay in your archive (so memory export still returns them). Exclude patterns on the user record skip sessions matching a regex, --no-episodic on any individual ko -p run opts that one out.

The store lives server-side, so the memory follows the account rather than the laptop. Switch machines, sign in, the agent still knows what you were doing.

Session scratchpad (always on)

Tool outputs above 8 KB get uploaded to a per-session vector store and replaced in the running context with a short stub: [scratchpad:<file_id>] preview: ... (47213 bytes offloaded). The agent gets two retrieval paths back:

ReadScratchpad(file_id)

Fetch the complete original output for a specific stub, verbatim, no chunking, free.

VectorSearch(session_store_id, query)

Semantic search across every offloaded output in the session, returns chunks plus their originating file_id.

No flag, no setting. The scratchpad is an internal optimization. Long sessions stop paying input tokens on cold context and quality stops sliding late in the session. The per-session store self-destructs 24 hours after the session ends.

Audit (ko audit)

Opt-in. When enabled, every tool call from every session indexes one chunk into your audit store: session_id, timestamp, tool name, redacted args, success/error, a 256-char preview of the output and the scope tags the session ran under. Not the full output (that lives in the session log), just the needle-in-haystack lookup row.

audit
ko audit enable                            # provision the store, backfill last 90 days
ko audit search "<query>"                  # natural-language search across all your sessions
ko audit search "<query>" --scope=release=production
ko audit search "<query>" --scope=*=*       # global, ignore the current session's scope
ko audit export --since=2026-04-01 > audit.jsonl
ko audit disable                           # stop indexing; raw events still persist in ko_session_events

Default retention is 90 days, configurable up to several years per scope. Sessions tagged release=production can carry a 2555-day (7-year) retention while everything else expires fast. The admin dashboard exposes a cross-user view at /admin/ko/audit/search for abuse review and compliance evidence.

Concrete use: you finished a 4-hour session yesterday and want to confirm the agent never read prod_secrets.env. ko audit search "prod_secrets" returns every chunk that mentions it. Zero matches is a real answer, not a hope.

Scopes (ko scope)

A scope is an ordered set of (category, value) tags that classifies what a session is doing. project=fightclub, release=production, customer=rubin, severity=p0 or any custom category you want. Every indexed chunk carries the session's scope tags, retention is set per scope, and searches can filter by any subset.

Resolution at session start, later overrides earlier:

  1. workspace default (project=<workspace.display_name>)
  2. .ko/scope.yaml at the repo root (per-branch overrides allowed)
  3. KO_SCOPE env var
  4. --scope CLI flag
scope
ko scope                                                 # show the effective scope for cwd
ko scope list                                            # every scope you've ever touched, with use counts
ko scope settings release=production --audit-retention=2555   # 7-year retention for prod
ko scope settings customer=rubin --audit-enabled=true         # always audit Rubin work
ko scope settings project=secrets --scope-widening-locked=true # hard wall, see below

Scope as a privacy boundary. Default behavior treats scope as a convenience filter, the agent can widen beyond the session's scope on its own VectorSearch calls. Setting --scope-widening-locked=true flips that: when a session runs in that scope, the web tier rejects any scope_mode: global or scope_mode: override with a 403, and silently constrains scope_filter to the current scope. Enforced server-side at /api/ko/rag/search, not in the agent prompt, so a model that decides to try anyway gets blocked.

Retrieval is a tool, not an injection

Cursor and Continue.dev inject retrieved chunks into every model call. Linear cost in turns multiplied by context size, even when the agent didn't need RAG that turn, and the agent can't reason about why text appeared in its context. ko ships a VectorSearch tool the agent decides to call. You pay for retrieval only when it's used, and the chunks come back as tool output the agent can cite by source.

The system prompt lists which stores are available this session (workspace, episodic, session, audit) and when to prefer VectorSearch over Grep. Grep is faster and more precise for exact symbols, VectorSearch handles the synonym queries that grep misses.

// 12

MCP servers

Model Context Protocol (MCP) is how IDEs and other agents connect to external tools. Ko speaks MCP in both directions: it can expose itself as an MCP server for IDEs to consume, and it can consume other MCP servers as tools.

Ko as an MCP server. Any MCP-compatible client (Cursor, Windsurf, JetBrains with MCP plugin, VS Code with MCP extension, Claude Desktop) can use Knockout as a coding backend:

mcp
ko mcp show         # print the MCP config snippet for your IDE

# The snippet to paste into your IDE's MCP config:
{
  "knockout": {
    "command": "ko",
    "args": ["mcp-serve"]
  }
}

ko mcp-serve reads JSON-RPC from stdin, writes to stdout; logs go to stderr so they don't interfere with the protocol channel. Your IDE spawns it as a subprocess.

Ko consuming other MCP servers. Configure external servers in ~/.ko/settings.json under mcpServers. They appear as extra tools the agent can call, alongside the built-in 15.

// 13

Sync & encryption

Ko sync moves work between machines without exposing your code to the server. Snapshots are encrypted locally before upload with a key only you hold; the server stores opaque blobs and cannot read your code.

sync
ko sync push                # encrypt the working tree and upload
ko sync pull [snapshot-id]  # download and decrypt a snapshot (latest if omitted)
ko sync snapshots           # list available snapshots

# Also as slash commands inside the TUI:
/sync push
/sync pull
/sync list

The encryption key is derived from your vault password. Lose the vault password and there is no recovery, the snapshots are unreadable. Set a vault password at /settings on the dashboard; it's never sent to the server.

Snapshots skip .git, node_modules, .next, __pycache__, vendor, .ko, .venv, dist, build, .cache, target, and binary artifacts (.exe, .dll, .so, .pyc, etc.).

Pipeline configs and session messages are also vault-encrypted on the server if you've set a vault password. When you open the dashboard, you unlock the vault once per browser session (30-minute TTL) to view or edit them.

// 14

Slash commands

Commands typed inside the TUI, starting with /.

/help

Show all available slash commands.

/exit, /quit

Exit the TUI.

/clear

Clear chat history. System prompt is preserved.

/compact

Summarize the current conversation into a short memo. Frees context while keeping continuity.

/model

Show the model currently in use for the active step.

/pipeline

List available pipelines. /pipeline <name> switches mid-session.

/status

Show connection status (SSE stream health).

/cost

Show cumulative cost of the current session.

/history

Show recent input history.

/thinking

Toggle tool-output visibility: /thinking on or /thinking off.

/plan

Show the current structured task plan with per-task evidence trail.

/rollback

Restore files from pre-edit snapshots, undo the last task's file edits.

/sync push, /sync pull, /sync list

Trigger sync operations without leaving the TUI.

// 15

Project templates

65 project templates that aren't rigid starters. An interactive wizard at /dashboard/templates asks about your business, audience and style, then generates a bespoke project design. The result is a template ID (e.g. tp_abc123) you hand to the CLI:

build
ko build tp_abc123  # fetch the configured template and start a session

Categories cover business sites, e-commerce, SaaS, blogs, community, education, health, productivity, creative and food. Each template has 4-10 variants.

// 16

Configuration

All ko config lives in ~/.ko/settings.json (global) and optionally .ko/settings.json (project-local, overrides global). Open in your $EDITOR:

~
ko config

Known keys:

trustedDirs

Array of absolute directory paths. Ko will auto-trust these without prompting on each run ("trust this directory?" dialog is skipped).

autoAllow

Array of tool patterns that skip the permission prompt. Defaults: Read, Glob, Grep, Bash(ls), Bash(cat), Bash(git), Bash(pwd), Bash(which), Bash(echo).

mcpServers

Object mapping server names to MCP connection configs. See the MCP section.

Note: the older defaultModel key is no longer honored. Model picks are session-scoped and stale persisted values are wiped on startup. Pass -m or use /model instead.

~/.ko/settings.json
{
  "trustedDirs": [
    "/home/me/work/project-a",
    "/home/me/work/project-b"
  ],
  "autoAllow": [
    "Read", "Glob", "Grep",
    "Bash(ls)", "Bash(cat)", "Bash(git)",
    "Bash(pwd)", "Bash(which)", "Bash(echo)"
  ]
}
// 17

Cancellation & billing safety

Ko has multiple layers of defense to make sure you never pay for work you didn't authorize, or, conversely, that OpenRouter never bills us for requests whose cost we can't recover from you. Practical rundown of how they interact:

ESC cancellation

Hit ESC in the TUI at any time. The client POSTs to /api/ko/sessions/:id/abort without dropping the SSE socket. Server-side, the per-turn AbortController fires, the in-flight provider request is aborted, and you're charged only for the tokens that actually streamed before the abort, never the full budget of what the model would have generated.

A synthetic "[Turn cancelled by the user. Abandon any partial work above…]" marker is appended to the conversation so the next turn doesn't silently continue the cancelled task. If you then ask an unrelated question, the model treats it as fresh.

Live cost ticking

The Session $ counter in the status bar updates roughly every 750ms during streaming, not just at end-of-turn, so you can watch spend accumulate on long calls and hit ESC before a runaway model runs away further.

Hard per-call wallet cap

Before every fightclub-billed call we estimate the worst-case cost (input tokens plus the step's maxOutputTokens, or 32k default) and refuse to start the request if it could exceed your wallet balance. This prevents the bad case where OpenRouter bills us for tokens that atomic chargeUsage then refuses to charge you. Applied to both single-call and multi-model steps.

Idle watchdog

The CLI sends a heartbeat ping every 30 seconds while connected. If the server goes 90 seconds without any client activity (heartbeat, message, tool result, …) while a turn is in flight, it aborts the turn. Catches network drops, laptop sleep and browser refresh, anything that would otherwise leave the agent loop billing into the void.

// 18

Troubleshooting

Vault is locked

Run ko vault unlock once, the CLI session has no expiration and ko caches the key locally, so this is a one-time prompt per machine. Web sessions still lock after 4 h idle; the CLI is independent. See Vault (BYOK unlock).

Model unavailable

The slot the step requested isn't mapped, the provider is down, or your wallet balance is zero. Check ko auth status for balance; check /ko/settings for provider configs.

Session returns HTTP 500

Usually a transient platform issue, rerun. Persistent 500s on the same session suggest the session is wedged; start fresh with ko (no -c).

Insufficient wallet balance

Top up at /settings → wallet. Or switch to the KO_economy pipeline (if configured) for ~80% cost savings via budget models.

Ko keeps calling the same failing command

Ko's insanity guard breaks the loop after 3 identical tool calls. If you see this, the model is confused, try /clear or rephrase the task.

Can't find ~/.ko/

Ko lazy-creates ~/.ko/ on first run. If it's missing after ko auth login, permissions are probably wrong on your home directory, check the error output from ko.

Reset everything

Nuclear option: rm -rf ~/.ko and re-run ko auth login. You'll lose local memory, skills and cached sync config, server-side history survives.

// 19

Keyboard shortcuts

Inside the TUI:

Enter

Send the current input.

Shift+Enter

Newline within the input (multi-line prompts).

Tab

Autocomplete the highlighted slash command or file path.

Esc

Cancel the current in-flight task. Safe, already-made edits are preserved.

↑ / ↓

Walk through input history.

Ctrl+C

Interrupt. Second press within 2s exits.

Ctrl+L

Repaint the screen.

/

Open the slash command picker at the start of an empty input.