// reference

The Manual.

You type a sentence. Something reads your code, writes the change, runs the tests and tells you what it did. That is ko.

Two things about it are unusual, and both explain most of what follows. ko asks before it acts, and it does not hand your code to us in readable form. Reads and greps go through unannounced, but the moment the agent wants to write a file or run a command it stops and waits for you to answer. So the model never changes a file until you say yes, and the diff lands on your filesystem. The session history that lands on our side is encrypted under a password we never receive.

This page is the reference. Every command, flag and config key, plus the slash commands you only meet inside the TUI. Never run ko before? The 5-minute tutorial gets you there faster.

// 01

Installation

Knockout ships as a single Go binary called ko. The installer autodetects your OS (Linux, macOS) and architecture (amd64, arm64), downloads the matching release, drops it in /usr/local/bin/ko and marks it executable.

install

# One-liner
curl -fsSL https://fightclub.pro/install.sh | sh

# Verify
ko version

The installer needs curl or wget and network access to the release archive; it doesn't build from source. For a fully manual install, grab the binary for your platform from https://fightclub.pro/api/ko/releases/<version>/ko-<os>-<arch> (read <version> from https://fightclub.pro/ko/latest-version.txt), make it executable, move it into your PATH.
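The manual path above comes down to building one URL. A minimal sketch of that step, assuming the version string from latest-version.txt may carry trailing whitespace and that you pass the <os> token yourself (this page does not spell the tokens out); normalize_arch and release_url are hypothetical helper names, not part of ko:

```python
BASE_URL = "https://fightclub.pro"


def normalize_arch(machine: str) -> str:
    """Map common machine names onto the two architectures the installer lists."""
    table = {"x86_64": "amd64", "amd64": "amd64", "aarch64": "arm64", "arm64": "arm64"}
    try:
        return table[machine.lower()]
    except KeyError:
        raise ValueError(f"unsupported architecture: {machine}")


def release_url(version: str, os_token: str, arch: str) -> str:
    """Fill in the release URL pattern quoted in the manual-install paragraph."""
    return f"{BASE_URL}/api/ko/releases/{version.strip()}/ko-{os_token}-{arch}"


if __name__ == "__main__":
    # "1.2.3" is a placeholder version, not a real release.
    print(release_url("1.2.3\n", "linux", normalize_arch("x86_64")))
```

Download the printed URL with curl or wget, mark the file executable and move it into your PATH.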

To upgrade an installed binary, run ko update. It checks https://fightclub.pro/ko/latest-version.txt, and if a newer release exists it downloads and replaces the running binary in place. Already current, it says so and exits. ko also nudges you when a newer version is out.

update

ko update   # check for a newer ko and upgrade in place

All of ko's state lives in ~/.ko/: credentials, session history, memory, skills and settings. Projects can also have a local .ko/ directory for project-scoped memory and snapshots.

// 02

Authentication

Ko uses scoped API tokens. ko auth login is the interactive path: it opens your browser at the Fight Club platform, you sign in and grant the CLI scope, and a token is written to ~/.ko/credentials.json.

auth

ko auth login       # browser OAuth, for personal machines
ko auth token <tok> # paste an existing token, for CI/headless
ko auth status      # show which account + wallet balance

For CI, servers, or containers where browser-based OAuth isn't possible, create an API token from the web dashboard at /dashboard/settings. The preferred way to hand it to ko in an automated environment is KO_TOKEN_FILE: point it at a file holding the token and ko reads it on startup, with no on-disk credentials cache and no token in the process environment or shell history.

ci-auth

# preferred: token in a file, referenced by env var
echo "$KO_TOKEN" > /run/secrets/ko-token
KO_TOKEN_FILE=/run/secrets/ko-token ko -p "run the checks"

# interactive/one-off: cache a pasted token under ~/.ko
ko auth token <tok>

ko auth token <tok> still works and caches the token in ~/.ko/credentials.json. The older KO_API_TOKEN env var is deprecated: it prints a warning at runtime and points you at KO_TOKEN_FILE. Don't use it in new setups.

The API token itself doesn't expire; revoke it from the platform dashboard if needed. For unlocking your BYOK provider keys so ko can decrypt them, see Vault (BYOK unlock).

// 03

Vault (BYOK unlock)

Your provider keys sit on our server, encrypted under a password we never receive. When you run ko vault unlock, the password stays on your machine. scrypt derives a key from it locally at N=131072, r=8, p=1; ko sends only the derived key, and the server holds that key just long enough to decrypt your provider key and make the call.
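To get a feel for what that local derivation costs, here is a minimal sketch using Python's hashlib.scrypt with the parameters quoted above. The salt handling and the 32-byte output length are assumptions made for the illustration; this page does not say how ko salts the password or how long the derived key is.

```python
import hashlib
import os

# Parameters quoted in the text above.
SCRYPT_N = 131072
SCRYPT_R = 8
SCRYPT_P = 1


def derive_vault_key(password: str, salt: bytes) -> bytes:
    """Derive a key from the vault password; only this output would leave the machine."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        # scrypt needs about 128 * N * r bytes (128 MiB here), which is above
        # OpenSSL's default limit, so the limit is raised explicitly.
        maxmem=256 * 1024 * 1024,
        dklen=32,  # assumption: 32-byte derived key
    )


if __name__ == "__main__":
    salt = os.urandom(16)  # assumption: a random per-vault salt
    key = derive_vault_key("correct horse battery staple", salt)
    print(len(key), "byte key derived locally")
```

At these settings each derivation takes a noticeable fraction of a second and about 128 MiB of memory, which is what makes guessing passwords against a stolen database expensive.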

So we cannot reset your vault password. No recovery path. There is no plaintext copy anywhere on our side to recover from, which is the same property that stops an attacker who walks off with our database from reading your keys. Forget the password and you reset the vault and paste your keys in again.

The derived key caches in ~/.ko/credentials.json at mode 0600, and the CLI vault session it opens on the server locks after 4 hours idle. Only actual cryptographic use extends that window: every real turn rolls it forward, and status checks do not. Leave a terminal open overnight and the next turn prompts you again.

vault

ko vault unlock   # prompts for vault password (once per machine)
ko vault status   # show whether the vault is locked/unlocked + local cache
ko vault lock     # revoke the CLI session + wipe the cached key

Web and CLI sessions are independent. Unlocking from ko does not open the web session. The web vault locks after 1 hour idle (with a 12 hour absolute cap); the ko CLI vault locks after 4 hours idle, with no absolute cap beyond that idle window.
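The two lock policies differ only in their numbers, which a small model makes concrete. This is a sketch of the rules as described on this page, not the server's code; VaultSession and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

HOUR = 3600  # seconds


@dataclass
class VaultSession:
    idle_limit: int               # seconds of inactivity before the session locks
    absolute_cap: Optional[int]   # hard lifetime in seconds, or None for no cap
    opened_at: float
    last_used_at: float

    def is_locked(self, now: float) -> bool:
        if now - self.last_used_at >= self.idle_limit:
            return True
        return self.absolute_cap is not None and now - self.opened_at >= self.absolute_cap

    def record_turn(self, now: float) -> None:
        """A real turn (cryptographic use) rolls the idle window forward."""
        if not self.is_locked(now):
            self.last_used_at = now


def cli_session(now: float) -> VaultSession:
    return VaultSession(idle_limit=4 * HOUR, absolute_cap=None, opened_at=now, last_used_at=now)


def web_session(now: float) -> VaultSession:
    return VaultSession(idle_limit=1 * HOUR, absolute_cap=12 * HOUR, opened_at=now, last_used_at=now)
```

A status check would call is_locked without record_turn, so it never extends the window. A web session that is used every half hour still locks at the 12 hour mark, while a CLI session used that often stays open.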

If the machine is lost or compromised: log into the web UI and change your vault password. This re-encrypts every BYOK entry under a new vault key; any cached key on the lost machine is now useless (ko's auto-restore checks the vault verifier before re-creating the session and silently fails against a stale key).

If you never use your own provider keys (only fc:* platform-pool models), you don't need to run ko vault unlock at all. Those models route through platform-managed credentials that never touch the vault.

// 04

Sessions

Close the terminal on a long piece of work and you normally lose the thread. ko doesn't. Every message, tool call and file edit is written to the server as it happens, so ko -c tomorrow morning on a different laptop picks up mid-thought.

Sessions are pinned to the directory they started in. Start one in ~/project-a, run ko -c from ~/project-b, and you get a fresh session for B rather than project A's context bleeding into unrelated code. The binding is the feature.

sessions

ko                      # start a new session in $(pwd)
ko "quick question"     # start with an initial prompt
ko -p "one-shot"        # non-interactive: print answer, exit
ko -c                   # continue the most recent session in $(pwd)
ko -r                   # pick from a list of recent sessions
ko --resume-id <id>     # jump straight to a specific session
ko sessions             # list all past sessions
ko resume <id>          # same as --resume-id

The first time you run ko in a new directory, it auto-registers the folder as a ko project. Subsequent runs pick up the same project record.

Root flags

-m, --model

Override the model for this session. Session-scoped, never persisted.

--effort low|medium|high

How much time the agent is allowed to spend before it gives up on a task.

--system-prompt

Replace the base system prompt for this run.

--no-cache

Turn off Anthropic prompt-cache markers. Costs more; use it when you are debugging cache behaviour.

--no-episodic

Keep this one session out of cross-session memory. --no-auto-episodic keeps it out of the automatic path only.

--npa

Auto-approve every tool call with no prompt. See Cancellation & billing safety before you use it.

--listen / --port

Serve a browser UI alongside the TUI on the same session.

Running the loop in-process (`ko local`)

ko local runs the whole agent loop in-process on your machine. The loop, tool execution and your BYOK provider key all stay on the box. ko fetches the pipeline and system-prompt assignment from the platform once, then everything after that is local.

local

ko local "review the diff"     # run the agent loop in-process
ko local --detach "long job"   # background process that survives terminal close
ko local --attach <id>         # reattach to a detached (running or finished) session

Web UI alongside the TUI (`--listen`)

--listen <address> starts a browser UI next to the terminal one on the same session (--listen 127.0.0.1 for local only, --listen 0.0.0.0 to expose it on the network). --port pins the bind port; left off, ko picks a free one and prints the URL.

// 05

Pipelines

Ask three good engineers to fix the same bug and you get three shapes of answer. One reads the whole module before touching anything, another writes a failing test first, and the third patches it and runs the suite to see what screams. A pipeline is you picking which of those ko is, before it starts.

Each step carries its own model, system prompt and tool allowlist. The allowlist does more work than people expect. A planner that physically cannot call Write won't drift into implementing, because the option isn't on the table, so you get a plan back instead of a half-finished patch.

Ten pipelines are seeded into every new account:

KO_default

The default agent. Handles any task: code, research, writing, analysis. Self-retries on failure. What you get when you just run ko.

KO_consensus

Plan, then 3 frontier models from 3 vendors implement in parallel, a judge picks the best, then verify. For critical code where quality matters more than cost.

KO_consultant_multipart

Brainstorm to a written spec, break it into atomic tasks, then implement against checkpoints and verify. The one to reach for on a feature too big to hold in one prompt.

KO_legal

Legal generalist: drafts, reviews, researches, redlines. Saves output to a file with disclaimers.

KO_legal_adversarial_review

5-step adversarial contract review: read → argue party A → argue party B → synthesize risks → write report.

KO_business

Business strategy, market research, competitive analysis, GTM.

KO_finance

Investment analysis, valuation, financial modeling, stress tests.

KO_research

Literature review, methodology critique, research writing with citations.

KO_hr

HR policies, hiring, performance, compensation, conflict guidance.

KO_creative

Copywriting, long-form, editing, ideation with multiple variants.

Swap pipelines mid-session with the /pipeline slash command. Your conversation history is preserved; the next task uses the new flow. Build custom pipelines in the visual editor at /dashboard/pipelines.

/pipeline              # list available pipelines (TUI)
/pipeline KO_consensus # switch to a named pipeline

Switching a pipeline mid-session also resets any model override you picked with /model, so the new pipeline runs its own per-step models. The TUI prints "Model reset to pipeline default" to make the reset explicit.

During a pipeline run, each step transition banner shows which model is handling that phase, e.g. ─── Plan → Implement: Step Implement · claude-sonnet-4-6 ───. The top-bar model label also swaps to the step's model as soon as streaming starts.

Per-step caps

Each step in the editor has three caps. The pipeline editor surfaces them as Max $, Max In, Max Out (hover the icons next to each label for the live definitions). In short:

Max $ (maxCostUsd)

Per-step spend ceiling. Reactive: a call in flight can't be stopped before we know its cost. Enforced between models in the fallback strategy (the next model won't start if the cap is already hit) and used to flip the step outcome to budget_exceeded at step end so transitions can branch on it. Parallel strategies (consensus, present_all) fire all N models concurrently, so for them the cap only affects transitions, not mid-flight spend.

Max In (maxInputTokens)

Input-token cap for the prompt. When the estimated context exceeds this, older messages are dropped (system + tail preserved) until the prompt fits ~70% of the cap, leaving headroom for the response.

Max Out (maxOutputTokens)

Output-token cap per call. Passed to the provider as max_tokens (hard-stops generation at the API) and feeds the pre-call wallet check: we refuse to start a request whose worst-case cost would exceed your wallet. Defaults to 32k when blank.
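The Max In and Max Out rules can be sketched in a few lines. Everything here is illustrative: the 4-characters-per-token estimate, the keep_tail size, the exact value behind "32k" and the per-million-token prices are assumptions, and trim_to_cap and can_start_request are hypothetical names rather than ko internals.

```python
from typing import Dict, List, Optional

DEFAULT_MAX_OUTPUT_TOKENS = 32_000  # "32k" in the text; the exact figure is an assumption


def estimate_tokens(message: Dict[str, str]) -> int:
    """Rough estimate: about 4 characters per token (assumption)."""
    return max(1, len(message["content"]) // 4)


def trim_to_cap(messages: List[Dict[str, str]], max_input_tokens: int,
                keep_tail: int = 2) -> List[Dict[str, str]]:
    """Drop the oldest non-system messages until the prompt fits ~70% of the cap."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= max_input_tokens:
        return list(messages)  # under the cap: nothing is dropped
    target = int(max_input_tokens * 0.7)  # leave headroom for the response
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while len(rest) > keep_tail and total > target:
        total -= estimate_tokens(rest.pop(0))  # oldest message goes first
    return system + rest


def can_start_request(wallet_usd: float, input_tokens: int,
                      max_output_tokens: Optional[int],
                      usd_per_m_input: float, usd_per_m_output: float) -> bool:
    """Pre-call wallet check: assume the model uses every output token it is allowed."""
    out_tokens = DEFAULT_MAX_OUTPUT_TOKENS if max_output_tokens is None else max_output_tokens
    worst_case = (input_tokens * usd_per_m_input + out_tokens * usd_per_m_output) / 1_000_000
    return worst_case <= wallet_usd
```

The practical consequence: a blank Max Out on an expensive model inflates the worst-case estimate, so a request can be refused on a low wallet even though the real answer would have been short. Setting Max Out to what the step actually needs avoids that.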

// 06

Pipeline design

Pipelines are where ko goes from "generic agent" to "opinionated workflow that beats a single-model baseline." Design matters: the same three frontier models wired differently (parallel vote vs. plan→implement→verify vs. adversarial A/B debate) produce wildly different quality, latency and cost profiles. This section unpacks every field in the editor at /dashboard/pipelines.

Anatomy of a step

Every step in a pipeline has the following fields. Most are surfaced as form controls in the visual editor at /dashboard/pipelines; all of them show up in the JSON if you export or edit a pipeline config directly.

name

Unique identifier within the pipeline (e.g. Plan, Implement, Verify). Referenced by transitions.goto, shown in the step-change banner during execution, and used for loop protection (any one name can be visited at most 3 times per session to prevent runaway loops).

role

Free-form label for the step's purpose: planner, implementer, reviewer, tester, verifier. Doesn't affect execution; it is purely documentation for whoever reads the pipeline next.

models

Array of model refs. One entry = the step is a normal single-model LLM call. Two or more entries = multi-model step; multiModelStrategy decides how they're combined. Refs can be slot:<name> (abstract, resolved by the platform), fc:<provider>/<model-id> (direct Fight Club / OpenRouter), or user:<config-id>:<model-id> (your own provider keys).

systemPrompt

The per-step system prompt: the primary lever you pull to shape behavior. A "planner" step and an "implementer" step can use the same model and differ only in this field. Ko prepends its own base prompt (tool descriptions + ground rules) before injecting yours, so your text should describe the step's role, inputs, expected output and failure modes. Don't re-explain what the tools do.

tools

Allowlist of tool names this step is permitted to call (from the 19 in the Agent tools section). Shorter lists = more focused behavior. A planner with [Read, Glob, Grep, Lsp] can explore but not modify; an implementer adds [Write, Edit, Patch, Bash]; a verifier with only [Bash] can run tests but not fix them. Empty array = text-only mode (no tools at all).

temperature

Sampling temperature, 0-2 range. Controls how deterministic the model is. 0-0.3 for code implementation and verification (want the same answer every time); 0.3-0.7 for planning, analysis, and review (need some exploration); 0.7-1.0 for creative writing and ideation. Above 1.0 is rarely useful and mostly produces chaos.

maxRetries

How many times a single LLM call is retried on provider-level errors (network, rate limit, transient 5xx) before the step fails. Independent of transition-level retry via goto: _self. Typical values: 0 (fail fast), 2 (default resilience), 5 (for flaky providers). Retries reuse the same model; switch to the fallback multi-model strategy if you want retries against a different model.

maxCostUsd

Per-step USD spend ceiling. Reactive: a single call in flight can't be stopped mid-stream, but the cap is enforced between iterations and between fallback models, and used to flip the step outcome to budget_exceeded so your transitions can branch. See the ⓘ tooltip in the editor for per-strategy semantics.

maxInputTokens

Input-token cap for the prompt. When the estimated context exceeds this, ko drops older messages (system + tail preserved) until the prompt fits ~70% of the cap. Use this to keep long conversations from ballooning cost on cheap summarizer steps.

maxOutputTokens

Output-token cap per call. Hard-stopped by the provider via max_tokens, and fed into the pre-call wallet check (we refuse to start a request whose worst-case cost would exceed your wallet). Blank = 32k default.

transitions

Array of { condition, goto } rules evaluated top-to-bottom after the stage runs; the first match wins and its goto decides the next stage. Conditions match the stage's derived outcome string (success, failure, the verdict on a Verify stage, etc.). The outcome is computed by the stage's outcomeRule from the structured artifact the model emits. The special condition default matches everything and belongs as the last rule. Covered in detail below.

hooks

Optional { pre: [...], post: [...] } of bash commands that run before/after the step, executed through the agent's tool flow (so they also require permission). Useful for environment setup (git pull, pnpm install), in-pipeline verification (pnpm test, ruff check), or cleanup. A pre-hook is a precondition gate: if any pre command exits non-zero the step is skipped and the pipeline halts (pre_hook_failed). Post-hook failures are logged as tool output but don't short-circuit the step.

multiModelStrategy

Only relevant when models.length > 1. Picks the multi-model flow: fallback, present_all, consensus, or chain. Single-model steps ignore this field. Full breakdown in the next subsection.

judge

Optional { model, prompt? } config used by present_all and consensus strategies to auto-pick a winner when the user doesn't choose within consensusTimeoutSec. If omitted, the timeout falls back to "first valid candidate". The judge's prompt is a standard instruction like "Pick the answer that best satisfies X; respond only with a JSON object { "pick": <index>, "reason": "..." }."

consensusTimeoutSec

How long the picker waits for user input before auto-picking via judge (or first-valid fallback). Defaults to 30s. Shorter = less interactive; longer = more human-in-the-loop.
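Once the timeout fires, the judge's reply still has to be turned into a choice. A minimal sketch of that step, assuming a zero-based pick index and assuming that a malformed or out-of-range reply falls back to the first valid candidate the same way a missing judge does; pick_winner is a hypothetical name:

```python
import json
from typing import List, Optional


def pick_winner(judge_reply: Optional[str], candidates: List[Optional[str]]) -> int:
    """Return the index of the winning candidate.

    candidates holds one entry per model; None or "" marks a failed candidate.
    judge_reply is the judge's raw text, or None when no judge is configured.
    """
    valid = [i for i, text in enumerate(candidates) if text]
    if not valid:
        raise ValueError("no valid candidate to pick from")
    if judge_reply is not None:
        try:
            pick = json.loads(judge_reply)["pick"]
        except (json.JSONDecodeError, KeyError, TypeError):
            pick = None  # reply was not the expected JSON object
        if isinstance(pick, int) and not isinstance(pick, bool) and pick in valid:
            return pick
    return valid[0]  # "first valid candidate" fallback
```

Asking the judge for a bare JSON object, as the standard prompt does, is what keeps this parsing step trivial.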

Multi-model strategies

When models has more than one entry, the step combines them through one of four strategies, chosen by multiModelStrategy. Each has different cost, latency and quality trade-offs:

fallback

Try models sequentially until one succeeds. Use when: you want a primary + backup in case of provider errors or rate limits. Cost: pays only for models that actually run (typically just the first). Latency: serial. Honors maxCostUsd between models; won't start the next one if the running total exceeds the cap.

present_all

Run all N in parallel; user or judge picks the winner. Use when: you want human-in-the-loop quality selection. The TUI shows all candidates side-by-side and times out after consensusTimeoutSec. Cost: pays for all N. Latency: max of the N models.

consensus

Run all N in parallel; auto-pick if similar, prompt if divergent. Use when: you want a speed/quality check. If frontier models agree you ship fast; if they disagree you stop and look. Cost: pays for all N. Latency: max of the N models + tiny similarity check.

chain

Pipe each model's output into the next as refinement input. Use when: you want iterative improvement (draft → tighten → polish). Cost: pays for all N, each with growing input context. Latency: serial sum. Diminishing returns past 3 models.
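Of the four, fallback is the one where maxCostUsd changes what runs, so it is worth seeing as code. A sketch under stated assumptions: each model call returns its text (or None on failure) together with its cost, and a failed call can still cost money. run_fallback and the call signature are hypothetical, not ko's internals.

```python
from typing import Callable, List, Optional, Tuple

# A model call takes the prompt and returns (text or None on failure, cost in USD).
ModelCall = Callable[[str], Tuple[Optional[str], float]]


def run_fallback(models: List[Tuple[str, ModelCall]], prompt: str,
                 max_cost_usd: Optional[float] = None) -> Tuple[Optional[str], float, str]:
    """Try models in order; stop at the first success or when the cap is hit."""
    spent = 0.0
    answer: Optional[str] = None
    for _name, call in models:
        # The cap is checked between models, never during a call in flight.
        if max_cost_usd is not None and spent >= max_cost_usd:
            break
        text, cost = call(prompt)
        spent += cost
        if text is not None:
            answer = text
            break
    over_budget = max_cost_usd is not None and spent >= max_cost_usd
    if over_budget:
        outcome = "budget_exceeded"  # flipped at step end so transitions can branch
    else:
        outcome = "success" if answer is not None else "failure"
    return answer, spent, outcome
```

Note that the cap is reactive here too: a single call that costs more than the cap still completes, and the overspend only shows up in the outcome.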

Transitions

Transitions are the state-machine layer. After a stage emits its structured artifact, an outcome string is derived from the artifact via the stage's outcomeRule (e.g. simpleSuccess returns success or no_work; fieldEnum:verdict reads the artifact's verdict field and returns it directly). That outcome is then matched top-to-bottom against the stage's transitions array; the first match's goto picks the next stage.

The outcome rules ship with the platform; the common ones are:

simpleSuccess: returns no_work when the artifact set no_work: true, otherwise success.
implementOutcome: returns success if any executed step in the artifact reported completed, else failure.
fieldEnum:verdict: reads the artifact's verdict field (pass / fail on Verify stages).
fieldEnum:outcome: reads the artifact's outcome field (Test / Review stages emit this directly).
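The four rules above are small enough to write down. This is a sketch of their described behavior, not the platform's code; the steps and status field names inside the Implement artifact are assumptions, since this page only says that an executed step "reported completed".

```python
from typing import Any, Callable, Dict

Artifact = Dict[str, Any]


def simple_success(artifact: Artifact) -> str:
    return "no_work" if artifact.get("no_work") is True else "success"


def implement_outcome(artifact: Artifact) -> str:
    # Assumption: executed steps live under "steps", each with a "status" field.
    steps = artifact.get("steps", [])
    return "success" if any(s.get("status") == "completed" for s in steps) else "failure"


def field_enum(field: str) -> Callable[[Artifact], str]:
    """Build a rule that returns the named artifact field as the outcome."""
    def rule(artifact: Artifact) -> str:
        return str(artifact[field])
    return rule


def resolve_rule(name: str) -> Callable[[Artifact], str]:
    """Map an outcomeRule string such as "fieldEnum:verdict" to a function."""
    if name == "simpleSuccess":
        return simple_success
    if name == "implementOutcome":
        return implement_outcome
    if name.startswith("fieldEnum:"):
        return field_enum(name.split(":", 1)[1])
    raise ValueError(f"unknown outcome rule: {name}")
```

For example, resolve_rule("fieldEnum:verdict") applied to a Verify artifact with verdict set to fail yields the outcome string fail, which is what the transitions array is matched against.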

The goto side supports three reserved targets plus any step name:

_next

The step immediately after this one in the array (normal flow).

_self

Rerun the same step. Loop protection caps any step at 3 visits to prevent runaway cost.

_done

End the pipeline successfully.

<step-name>

Jump to any named step (useful for branching: on tests_fail go back to Implement).

The special condition default matches everything and is almost always the last rule. It's the fallback when no named condition matched.

example transitions

transitions:
  - { condition: "tests_fail",       goto: "Implement" }   # loop back
  - { condition: "security_issue",   goto: "SecurityFix" } # branch
  - { condition: "budget_exceeded",  goto: "_done" }       # stop on cap
  - { condition: "default",          goto: "_next" }       # otherwise advance
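How those rules resolve to a next step can be sketched as a small function. It follows the first-match-wins order, the three reserved targets and the 3-visit loop cap described above. What ko does when the cap is reached or when no rule matches is not stated here, so raising an error and ending the run are assumptions, and next_step is a hypothetical name.

```python
from typing import Dict, List, Optional

MAX_VISITS = 3  # loop protection: any one step runs at most 3 times


def next_step(step_names: List[str], current: str, transitions: List[Dict[str, str]],
              outcome: str, visits: Dict[str, int]) -> Optional[str]:
    """Return the name of the next step, or None when the pipeline ends."""
    for rule in transitions:  # top to bottom, first match wins
        if rule["condition"] == outcome or rule["condition"] == "default":
            target = rule["goto"]
            break
    else:
        return None  # assumption: no matching rule ends the run

    if target == "_done":
        return None
    if target == "_self":
        name = current
    elif target == "_next":
        index = step_names.index(current) + 1
        if index >= len(step_names):
            return None  # already at the last step
        name = step_names[index]
    else:
        name = target  # jump to a named step

    if visits.get(name, 0) >= MAX_VISITS:
        # Assumption: hitting the cap is surfaced as an error.
        raise RuntimeError(f"loop protection: {name} already ran {MAX_VISITS} times")
    visits[name] = visits.get(name, 0) + 1
    return name
```

With the example rules, a Verify step that emits tests_fail returns Implement, and the third failed round is the last one the cap lets through.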

Common patterns

A few recurring shapes worth copying:

Router → Level pipeline

A fast, cheap model classifies the task's complexity (1-5) and hands off to a tailored level pipeline. KO_default uses this shape: level 1 is answered by the router itself; level 5 triggers the full planner → implementer → verifier chain.

Plan → Implement → Verify

Three steps, three models. Planner uses a strong reasoner at temperature 0.5 with read-only tools to produce a plan document. Implementer uses a fast coder at temperature 0 with write/edit/bash to execute the plan. Verifier uses a cheap model with bash-only to run tests and emit tests_fail if anything broke; a transition then loops back to Implement.

Consensus planning

KO_consensus's Plan step uses multiModelStrategy: consensus across three frontier models. If they agree the pipeline auto-picks and moves on; if they diverge the user is shown all three plans to choose from. Catches the "one model silently went off the rails" failure mode.

Adversarial review

KO_legal_adversarial_review's pattern: one step argues for party A, one step argues for party B (same model, opposite system prompts), a synthesizer step reads both and ranks risks. Works well whenever you want a devil's advocate baked into the process.

Design principles

Shorter tool allowlists beat longer ones. A planner that can't write files won't try to implement. A reviewer that can only Read + Grep won't "fix" problems instead of reporting them.
Use slots, not fixed model addresses. slot:code_frontier lets the platform swap the concrete model without touching your pipeline. Your KO_custom stays current through frontier model upgrades automatically.
Cheaper verifiers, expensive planners. Planning errors cascade; verification errors are local. Spend the budget where a bad decision is costliest to undo.
Branch on the artifact, not on text. When a stage's outcome has more signal than success/failure ("tests failed", "security issue found", "needs human review"), pick an outcomeRule that reads the structured field carrying that signal (e.g. fieldEnum:verdict for a Verify stage, fieldEnum:outcome for Review). The model emits the field; the orchestrator does the routing.
Cap ambition with maxCostUsd. Even if the cap fires reactively (after the step finishes), it's still the fastest way to keep a runaway _self loop from draining a wallet.
Test pipelines on small tasks. Edit your pipeline in the visual editor, switch to it with /pipeline <name>, and ask something cheap like "summarize README.md" to watch the flow execute before committing to an expensive run.

// 07

Models & slots

Ko uses model slots: abstract roles like code_frontier, code_planner, code_verifier, budget_fast. The platform admin maps each slot to a concrete model address (e.g. fc:anthropic/claude-sonnet-4.6). Pipelines reference slots, not concrete models, so the same pipeline works across model upgrades without edits.

19 providers are supported, routed through the platform pool for Fight Club models and directly for user-managed API keys:

OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI (Grok),
Cohere, Groq, Together, Fireworks, Cerebras, SambaNova,
Perplexity, OpenRouter, Ollama, Azure OpenAI, Bedrock,
HuggingFace, Replicate

Override the model for a single session with -m, or swap mid-session with the /model command inside the TUI:

ko -m fc:openai/gpt-4o "summarize src/"
ko -m user:<config-id>/mistral-large "write a README"
/model   # list models; pick a number to switch for the rest of this session

The fc: prefix uses Fight Club's platform wallet; the user: prefix uses your own vault-encrypted provider config. Add your own provider keys at /dashboard/settings.

Model picks are session-scoped; they are not persisted to disk. Closing the TUI and reopening it starts fresh with the pipeline's own per-step models. Swapping pipelines mid-session also drops any active override (see Pipelines). If you want a sticky preference, pass -m on every launch or configure it platform-side in your slot mapping.

// 08

Agent tools

Ko agents have 19 tools. Each pipeline step declares which subset it can use, and that allowlist is the strongest control you have over how a step behaves. Shorter lists produce more focused work.

Bash

Run a shell command and capture stdout/stderr. Subject to permission prompts unless auto-allowed.

Read

Read a file at an absolute path.

Write

Create or overwrite a file. Creates pre-edit snapshots for /rollback.

Edit

Replace an exact string in a file. Safer than Write for targeted changes.

Patch

Apply a unified diff (git diff / diff -u format) atomically. Handles multi-hunk, multi-file changes in one call. Preferred over chained Edits when the change spans many locations. Pre-validates paths and snapshots every touched file for /rollback.

Glob

Find files by glob pattern (e.g. src/**/*.ts).

Grep

Ripgrep-backed content search with file-type and glob filters.

Lsp

Language-server queries: definition, references, hover, workspaceSymbol. Uses gopls, typescript-language-server, pyright, or rust-analyzer depending on file extension. Semantic: finds type-scoped uses of a symbol that Grep can't. Reports clearly when the required server isn't installed.

WebFetch

Download a URL's content. Useful for fetching docs or small files.

WebSearch

Search the web: current events, unfamiliar libraries, benchmarks.

Diff

Show a git diff between refs or working tree.

Clipboard

Read/write the system clipboard.

Docker

Run containerized commands: builds, tests, isolated workloads.

SQL

Execute queries against a configured database URL.

AskUser

Ask the user a free-form question mid-task when guessing would waste significant work. The next typed line is the answer; ESC cancels with an explicit "user cancelled" response. Degrades to an error in --print mode so batch runs don't hang.

KoPlan

Write and update the structured task plan for multi-part work. /plan in the TUI renders whatever this has produced.

Classify

Score a task against a rubric and return a structured verdict. Router steps use it to decide which level pipeline handles the work.

VectorSearch

Semantic search over any store available this session: workspace, episodic, session scratchpad or audit. Retrieval is a call the agent chooses to make, so you pay for it only when it is used.

ReadScratchpad

Fetch one offloaded tool output back in full, verbatim, no chunking and no cost.

Pipelines also use auto-generated Emit{Schema} tools (one per stage, named after the stage's declared artifactSchema: EmitPlan, EmitImplement, EmitVerify, etc.). Each stage ends when the model calls its emit tool with a structured artifact; the orchestrator validates the artifact against the schema and the stage's contract rule, then transitions. You don't define these tools; the orchestrator wires them in per stage based on artifactSchema.

// 09

Skills

Skills are prompt + tool bundles that teach ko a specific domain or workflow (e.g. "write Next.js Server Components correctly", "apply these brand guidelines", "run this CI workflow"). The marketplace at /dashboard/skills lists installable community skills.

skills

ko skills                    # list installed skills
ko skills install <id>       # install from the store
ko skills uninstall <name>   # remove a skill
ko skills update             # update all installed skills
ko skills create <name>      # scaffold a new skill in the current project
ko skills audit              # inventory every loaded skill with hash + signature status

ko skills audit is the supply-chain view. It lists every loaded skill (pre-installed, global and project) with its source, a SHA-256 hash of the content and its signature status: verified for a valid Ed25519 signature, INVALID for one that doesn't check out, unsigned or bundled otherwise. The hash lets you spot silent content drift between installs; the signature column tells you whether a skill is what its author signed.
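The drift check is easy to picture with a content hash. A minimal sketch, assuming a skill is a set of named files and that the digest covers file names and bytes in sorted order; how ko actually canonicalizes skill content before hashing is not documented on this page, so digests from this sketch will not match the audit output. skill_digest and has_drifted are hypothetical names.

```python
import hashlib
from typing import Dict


def skill_digest(files: Dict[str, bytes]) -> str:
    """SHA-256 over every file name and its content, in a stable order."""
    h = hashlib.sha256()
    for name in sorted(files):
        data = files[name]
        # Length prefixes keep name and content boundaries unambiguous.
        h.update(len(name.encode("utf-8")).to_bytes(8, "big"))
        h.update(name.encode("utf-8"))
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def has_drifted(recorded_digest: str, files: Dict[str, bytes]) -> bool:
    """True when the content no longer matches the digest recorded at install time."""
    return skill_digest(files) != recorded_digest
```

Record the digest when you install a skill and compare it on later audits: any changed byte, renamed file or added file changes the hash.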

Skills are sourced from ~/.ko/skills/ (global) and .ko/skills/ (project-local). Global skills apply to every session; project skills only apply when ko runs in that directory.

Skill Seekers, coming soon, will be a specialized skill that reads external documentation URLs and caches them as an in-agent reference. Point it at any library's docs site and ko becomes an instant expert in that library.

// 10

Memory

Memory is a lightweight, file-based long-term store. Two scopes:

~/.ko/memory/

Global memory. Facts that apply across every project: your name, preferences, coding style, tools you always reach for.

.ko/memory/

Project memory. Scoped to a single directory. Project quirks, team conventions, in-flight decisions, known bugs.

memory

ko memory list             # list all memory files (global + project)
ko memory show <name>      # print a memory file
ko memory edit <name>      # open a memory in $EDITOR (creates from a template if missing)
ko memory delete <name>    # remove a memory and reindex
ko memory reindex          # re-derive MEMORY.md from frontmatter (self-heal)

edit, show and delete all take a <name> argument; the bare command errors. reindex takes none. Scope the operation with --global (~/.ko/memory/) or --project (.ko/memory/). ko memory edit MEMORY.md is refused unless you pass --force: MEMORY.md is the auto-derived index that the agent's MemoryWrite regenerates, so hand edits get overwritten on the next save. Edit the individual memory files, not the index.

Inside a session, ko writes to memory autonomously. When you tell it something worth remembering ("I always use pnpm, not npm"), it saves a memory file and recalls it in future sessions. The index file MEMORY.md is loaded into every session's system prompt; individual memory files are loaded on demand.

Cross-device memory sync

File memories are local by default. Enroll a scope in cross-device sync and ko keeps its .ko/memory/ in step across your machines through the vault, encrypted, so the server stores opaque blobs it can't read.

memory sync

ko memory sync on          # enroll this scope for cross-device sync
ko memory sync status      # enrollment state + local/remote memory counts
ko memory sync off         # opt out (wipes the vault copy, keeps local files)
ko memory sync             # manual full pull+push now

# add --global to any of the above to operate on ~/.ko/memory instead of the project

Promoting memories into the Brain (`ko brain`)

File memories are sealed: encrypted so the platform can't read them. The semantic Brain is the opposite: the agent retrieves promoted memories by relevance, which means the platform embeds their plaintext. Promotion therefore de-seals, and it's gated behind a one-time confirmation.

brain

ko brain import-memory                 # promote eligible file memories into the Brain (DE-SEALS)
ko brain import-memory --dry-run       # list what would promote, write nothing
ko brain import-memory --project       # limit to <cwd>/.ko/memory  (--global for ~/.ko/memory)
ko brain import-memory --name <name>   # promote a single named memory
ko brain import-memory --prune         # drop promoted Brain docs whose source memory is gone
ko brain auto [off|manual|auto]        # show or set the memory→Brain promotion mode

Eligibility follows each memory's promote frontmatter flag (project and reference memories default to promotable; user and feedback memories don't), and anything that looks like it carries a secret is skipped server-side. The first real import prints a de-seal warning and asks for confirmation; --yes skips that prompt once you understand the tradeoff.

// 11

RAG features

The memory section above covers human-edited memory files. This section covers the vector-indexed surfaces ko manages for you: workspace index, cross-session memory, session scratchpad and audit log. Chunks are retrieved by meaning rather than by exact string match, which is the difference between "find the auth handler" landing the right file and returning nothing.

All four sit on the same Ringside vector_stores API that ships to external customers, so ko is its own biggest user of that infrastructure.

Workspace (`ko workspace`)

Indexes the repo so the agent can answer concept queries. A one-shot walk on first init, plus an fsnotify watcher that re-embeds changes with a 5-second debounce. The walker honors .gitignore and an optional .koignore for ko-specific exclusions. It skips files larger than 1 MB and sniffs the first 4 KB for null bytes to filter binaries. Files matching common secret patterns (.env, *.pem, id_rsa*, secrets.*) are never embedded.
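
The per-file checks above can be sketched as a single predicate. This is a minimal illustration, not ko's walker: should_embed is a hypothetical name, "1 MB" and "4 KB" are read as 1 MiB and 4096 bytes, and .gitignore / .koignore handling is left out.

```python
import fnmatch
import os

MAX_BYTES = 1 * 1024 * 1024   # "larger than 1 MB", read as 1 MiB (assumption)
SNIFF_BYTES = 4 * 1024        # "first 4 KB", read as 4096 bytes (assumption)
SECRET_PATTERNS = [".env", "*.pem", "id_rsa*", "secrets.*"]  # the patterns named above


def should_embed(path: str) -> bool:
    """Return True if a file passes the secret-name, size and binary checks."""
    name = os.path.basename(path)
    if any(fnmatch.fnmatch(name, pat) for pat in SECRET_PATTERNS):
        return False                      # secret-looking names are never embedded
    if os.path.getsize(path) > MAX_BYTES:
        return False                      # too large to index
    with open(path, "rb") as fh:
        if b"\x00" in fh.read(SNIFF_BYTES):
            return False                  # a null byte in the head suggests a binary
    return True
```

The name check runs first so a secret file is rejected without ever being opened.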

workspace

ko workspace init           # walk the repo, embed every kept file
ko workspace init --no-watch   # skip the background fs-watcher
ko workspace sync           # one-shot reconcile after offline edits
ko workspace status         # show indexed/pending/skipped counts
ko workspace drop           # delete the workspace store entirely

The differentiator vs Cursor or Claude Code is that retrieval is semantic, not lexical. Ask "where do we rate-limit incoming webhooks" and the agent pulls the right middleware even if the file never uses the word "webhook". Internally the agent calls a VectorSearch tool, so retrieval is paid only when used and the chunks come back as cite-able tool output.

Cross-session memory (`ko memory`)

Opt-in. When enabled, every completed session is summarized by google/gemini-2.5-flash and the summary is embedded into your personal episodic store. New sessions get the top 5 relevant past sessions prepended to the system prompt, and the agent can VectorSearch it mid-session if it wants more.

memory

ko memory enable                    # turn on episodic indexing
ko memory enable --backfill 90      # also index the last N days: 0 | 30 | 90 | 365 | all
ko memory disable                   # stop indexing new sessions, keep what's already there
ko memory status                    # show enabled state, episode count, retention
ko memory search "<query>"          # forensic search across your own session history (--limit, --scope)
ko memory export --since=2026-01-01 > episodes.jsonl   # stream indexed episodes (--format, --scope)
ko memory wipe --yes                # drop the store and soft-delete the episodes

Episodes older than 365 days detach from the vector store but stay in your archive, so memory export still returns them. Exclude patterns on the user record skip sessions matching a regex, and --no-episodic on any individual ko -p run opts that one run out.

The store lives server-side, so the memory follows the account rather than the laptop. Switch machines, sign in, the agent still knows what you were doing.

Session scratchpad (always on)

Tool outputs above 8 KB get uploaded to a per-session vector store and replaced in the running context with a short stub: [scratchpad:<file_id>] preview: ... (47213 bytes offloaded). The agent gets two retrieval paths back:

ReadScratchpad(file_id)

Fetch the complete original output for a specific stub, verbatim, no chunking, free.

VectorSearch(session_store_id, query)

Semantic search across every offloaded output in the session, returns chunks plus their originating file_id.

On by default, and you rarely want it off. Long sessions stop paying input tokens on cold context, and quality stops sliding late in the session for the same reason. If a scope genuinely needs it disabled, ko scope settings <cat=val> --scratchpad-disabled=true turns it off there. The per-session store self-destructs 24 hours after the session ends.
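
A rough sketch of the offload step, under stated assumptions: the threshold is read as 8192 bytes, the preview length and id scheme are made up, and an in-memory dict stands in for the per-session store. Only the stub's shape comes from the text above.

```python
import uuid

OFFLOAD_THRESHOLD = 8 * 1024   # "above 8 KB", read as 8192 bytes (assumption)
PREVIEW_CHARS = 80             # preview length is not documented (assumption)

scratchpad: dict[str, str] = {}   # stands in for the per-session store


def offload_if_large(output: str) -> str:
    """Return the output unchanged, or a stub if it exceeds the threshold."""
    size = len(output.encode("utf-8"))
    if size <= OFFLOAD_THRESHOLD:
        return output
    file_id = uuid.uuid4().hex            # hypothetical id scheme
    scratchpad[file_id] = output
    preview = output[:PREVIEW_CHARS].replace("\n", " ")
    return f"[scratchpad:{file_id}] preview: {preview}... ({size} bytes offloaded)"


def read_scratchpad(file_id: str) -> str:
    """Return the complete original output for a stub, verbatim."""
    return scratchpad[file_id]
```

The stub keeps the file_id, so the full output is always one lookup away even though the running context only carries the preview.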

Audit (`ko audit`)

Opt-in. When enabled, every tool call from every session indexes one chunk into your audit store: session_id, timestamp, tool name, redacted args, success/error, a 256-char preview of the output and the scope tags the session ran under. Not the full output (that lives in the session log), just the needle-in-haystack lookup row.
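
One way to picture the lookup row, as a sketch only. The field names and the timestamp format are hypothetical; the fields themselves and the 256-character preview are the ones listed above.

```python
from dataclasses import dataclass, field

PREVIEW_CHARS = 256  # the preview length stated above


@dataclass
class AuditRow:
    session_id: str
    timestamp: str              # format is not documented; a string is assumed here
    tool: str
    redacted_args: str
    ok: bool                    # success/error
    preview: str
    scope: dict[str, str] = field(default_factory=dict)


def make_row(session_id, timestamp, tool, redacted_args, ok, output, scope):
    """Build one lookup row, keeping only the head of the output."""
    return AuditRow(session_id, timestamp, tool, redacted_args, ok,
                    output[:PREVIEW_CHARS], dict(scope))
```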

audit

ko audit enable                            # provision the store, backfill 90 days
ko audit enable --backfill-days=365        # or pick your own window, 0-365
ko audit status                            # enabled state, store id, retention
ko audit search "<query>"                  # natural-language search across all your sessions (--limit)
ko audit search "<query>" --scope=release=production
ko audit export --since=2026-04-01 > audit.jsonl   # also --session, --format
ko audit disable                           # stop indexing; raw events still persist in ko_session_events

Default retention is 90 days, configurable up to several years per scope. Sessions tagged release=production can carry a 2555-day (7-year) retention while everything else expires fast. The admin dashboard exposes a cross-user view at /admin/ko/audit for abuse review and compliance evidence.

Concrete use: you finished a 4-hour session yesterday and want to confirm the agent never read prod_secrets.env. ko audit search "prod_secrets" returns every chunk that mentions it. Zero matches is a real answer, not a hope.

Scopes (`ko scope`)

A scope is an ordered set of (category, value) tags that classifies what a session is doing. project=fightclub, release=production, customer=rubin, severity=p0 or any custom category you want. Every indexed chunk carries the session's scope tags, retention is set per scope, and searches can filter by any subset.

Resolution at session start, later overrides earlier:

workspace default (project=<workspace.display_name>)
.ko/scope.yaml at the repo root (per-branch overrides allowed)
KO_SCOPE env var
--scope CLI flag on ko (coming soon). Not yet wired on the root command, so ko --scope=... "prompt" currently errors with an unknown flag. Use KO_SCOPE to set a session's scope from the shell today.
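
The "later overrides earlier" rule can be sketched as a layered merge. Assumptions, all hypothetical: overrides apply per category, KO_SCOPE is written as comma-separated cat=val pairs, and .ko/scope.yaml has already been parsed into a dict. Per-branch overrides and the not-yet-wired --scope flag are left out.

```python
def parse_tags(text: str) -> dict[str, str]:
    """Parse 'cat=val,cat=val' into a dict (hypothetical KO_SCOPE syntax)."""
    tags = {}
    for part in text.split(","):
        part = part.strip()
        if part:
            cat, _, val = part.partition("=")
            tags[cat] = val
    return tags


def resolve_scope(workspace_name, scope_yaml_tags=None, env_value=None):
    """Merge the layers in order; a later layer wins per category."""
    scope = {"project": workspace_name}        # workspace default
    scope.update(scope_yaml_tags or {})        # .ko/scope.yaml
    scope.update(parse_tags(env_value or ""))  # KO_SCOPE
    return scope
```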

scope

ko scope show                                            # show the effective scope for cwd
ko scope list --category=customer                        # every scope you've touched, with use counts
ko scope drop customer=rubin                             # remove a tag
ko scope settings release=production --audit-retention=2555   # 7-year retention for prod
ko scope settings customer=rubin --audit-enabled=true         # always audit Rubin work
ko scope settings project=secrets --scope-widening-locked=true # hard wall, see below

Scope as a privacy boundary. By default scope is a convenience filter: the agent can widen beyond the session's scope on its own VectorSearch calls. Setting --scope-widening-locked=true flips that. When a session runs in that scope, the web tier rejects any scope_mode: global or scope_mode: override with a 403 and silently constrains scope_filter to the current scope. The check is enforced server-side at /api/ko/rag/search, not in the agent prompt, so a model that decides to try anyway gets blocked.
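
A minimal sketch of that server-side decision, assuming the request arrives as a dict. The function name, the "session" default mode and the return shape are hypothetical; the 403 on global/override and the pinned filter follow the description above.

```python
def check_search(request: dict, session_scope: dict, locked: bool):
    """Return (status, effective_filter) for one search request."""
    mode = request.get("scope_mode", "session")   # "session" is a hypothetical default
    if not locked:
        return 200, request.get("scope_filter")   # unlocked: the caller's filter stands
    if mode in ("global", "override"):
        return 403, None                          # widening is rejected outright
    return 200, dict(session_scope)               # filter pinned to the session's scope
```

Note that in the locked case the caller's scope_filter is ignored rather than rejected, which is the "silently constrains" half of the rule.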

Retrieval is a tool, not an injection

Cursor and Continue.dev inject retrieved chunks into every model call. Linear cost in turns multiplied by context size, even when the agent didn't need RAG that turn, and the agent can't reason about why text appeared in its context. ko ships a VectorSearch tool the agent decides to call. You pay for retrieval only when it's used, and the chunks come back as tool output the agent can cite by source.

The system prompt lists which stores are available this session (workspace, episodic, session, audit) and when to prefer VectorSearch over Grep. Grep is faster and more precise for exact symbols; VectorSearch handles the synonym queries that grep misses.

// 12

MCP servers

Model Context Protocol (MCP) is how IDEs and other agents connect to external tools. Ko speaks MCP in both directions: it can expose itself as an MCP server for IDEs to consume, and it can consume other MCP servers as tools.

Ko as an MCP server. Any MCP-compatible client (Cursor, Windsurf, JetBrains with MCP plugin, VS Code with MCP extension, Claude Desktop) can use Knockout as a coding backend:

mcp

ko mcp show            # pick your IDE from the list
ko mcp show vscode     # or name it: vscode | cursor | windsurf | jetbrains | claude | generic

# What 'ko mcp show generic' prints:
{
  "knockout": {
    "command": "ko",
    "args": ["mcp-serve"]
  }
}

Each IDE alias prints the config in that editor's own shape, the exact file path it belongs in and the quirks worth knowing. The IDE setup page walks all six, plus the troubleshooting for when the IDE cannot find the binary.

ko mcp-serve reads JSON-RPC from stdin, writes to stdout; logs go to stderr so they don't interfere with the protocol channel. Your IDE spawns it as a subprocess.
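
For a sense of what travels over that pipe: MCP's stdio transport sends one JSON-RPC message per line. The sketch below frames a generic MCP initialize request the way a client would write it to the subprocess's stdin. The client name and version are placeholders, and nothing here describes what ko itself returns.

```python
import json


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def initialize_request(request_id: int = 1) -> dict:
    """The first request an MCP client sends after spawning the server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",   # use the revision your client targets
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},  # placeholder
        },
    }
```

json.dumps escapes newlines inside strings, so a framed message never contains a raw line break before its terminator.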

Ko consuming other MCP servers. Configure external servers in ~/.ko/settings.json under mcpServers. They appear as extra tools the agent can call, alongside the built-in 19.

// 13

Sync & encryption

Ko sync moves work between machines without exposing your code to the server. Snapshots are encrypted locally before upload with a key only you hold; the server stores opaque blobs and cannot read your code.

sync

ko sync push                # encrypt the working tree and upload
ko sync pull [snapshot-id]  # download and decrypt a snapshot (latest if omitted)
ko sync snapshots           # list available snapshots

# Also as slash commands inside the TUI:
/sync push
/sync pull
/sync list

The encryption key is derived from your vault password. Lose the vault password and there is no recovery: the snapshots are unreadable. Set a vault password at /dashboard/settings; it's never sent to the server.

Snapshots skip .git, node_modules, .next, __pycache__, vendor, .ko, .venv, dist, build, .cache, target, and binary artifacts (.exe, .dll, .so, .pyc, etc.).
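
The skip list reads naturally as a path filter. A sketch, assuming the directory names match any path component and that only the four extensions named above count as binary artifacts (the real list continues past "etc."). in_snapshot is a hypothetical name.

```python
from pathlib import PurePosixPath

SKIP_DIRS = {".git", "node_modules", ".next", "__pycache__", "vendor", ".ko",
             ".venv", "dist", "build", ".cache", "target"}
SKIP_EXTS = {".exe", ".dll", ".so", ".pyc"}   # the source list ends in "etc."


def in_snapshot(rel_path: str) -> bool:
    """Return True if a repo-relative path would be kept under these rules."""
    path = PurePosixPath(rel_path)
    if any(part in SKIP_DIRS for part in path.parts[:-1]):
        return False                      # lives under a skipped directory
    return path.suffix not in SKIP_EXTS   # drop the named binary artifacts
```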

Pipeline configs and session messages are also vault-encrypted on the server if you've set a vault password. When you open the dashboard, you unlock the vault once per browser session (1-hour rolling idle TTL, 12-hour absolute cap) to view or edit them.
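
The two dashboard limits combine as a simple conjunction: the unlock lapses when either one runs out. A sketch with times in seconds; the function and parameter names are hypothetical.

```python
IDLE_TTL = 60 * 60           # 1-hour rolling idle window, in seconds
ABSOLUTE_CAP = 12 * 60 * 60  # 12-hour absolute cap, in seconds


def is_unlocked(unlocked_at: float, last_activity: float, now: float) -> bool:
    """True while both the idle window and the absolute cap still hold."""
    within_idle = now - last_activity < IDLE_TTL
    within_cap = now - unlocked_at < ABSOLUTE_CAP
    return within_idle and within_cap
```

Activity rolls the idle window forward but never moves unlocked_at, so a busy session still locks at the 12-hour mark.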

// 14

Slash commands

Commands typed inside the TUI, starting with /.

/help

Show all available slash commands.

/exit, /quit, /q

Exit the TUI.

/clear

Clear chat history. System prompt is preserved.

/compact

Summarize the current conversation into a short memo. Frees context while keeping continuity.

/model, /models

List models and switch for the rest of this session. The override is session-scoped and a pipeline switch drops it.

/unlock, /lock

Unlock or lock the vault without leaving the TUI.

/memory list

List memory files visible to this session.

/memory sync on|off|status

Control cross-device memory sync for this scope.

/pipeline

List available pipelines. /pipeline <name> switches mid-session.

/status

Show connection status (SSE stream health).

/cost

Show cumulative cost of the current session.

/history

Show recent input history.

/thinking

Toggle tool-output visibility: /thinking on or /thinking off.

/plan

Show the structured task plan for pipelines that emit one (e.g. the multipart consultant). Single-step pipelines have no plan to show.

/rollback

Restore files from pre-edit snapshots, undoing the last task's file edits.

/sync push, /sync pull, /sync list

Trigger sync operations without leaving the TUI.

// 15

Project templates

106 project templates that aren't rigid starters. An interactive wizard at /dashboard/templates asks about your business, audience and style, then generates a bespoke project design. The result is a template ID (e.g. tp_abc123) you hand to the CLI:

build

ko build tp_abc123  # fetch the configured template and start a session

Twelve categories: business, e-commerce, SaaS, blogs, community, education, health, productivity, creative, food, real estate and nonprofit. Each template carries several variants, so the wizard's answers change what gets built rather than just which starter you cloned.

// 16

Configuration

All ko config lives in ~/.ko/settings.json (global) and optionally .ko/settings.json (project-local, overrides global). Open in your $EDITOR:

ko config

Known keys:

trustedDirs

Array of absolute directory paths. Ko will auto-trust these without prompting on each run ("trust this directory?" dialog is skipped).

autoAllow

Array of tool patterns that skip the permission prompt. Defaults: Read, Glob, Grep, Bash(ls), Bash(cat), Bash(git), Bash(pwd), Bash(which), Bash(echo).

mcpServers

Object mapping server names to MCP connection configs. See the MCP section.

hideCost

Hide the wallet and session cost from the status bar. For screen shares and pairing, not for saving money.

Note: the older defaultModel key is no longer honored. Model picks are session-scoped and stale persisted values are wiped on startup. Pass -m or use /model instead.

~/.ko/settings.json

{
  "trustedDirs": [
    "/home/me/work/project-a",
    "/home/me/work/project-b"
  ],
  "autoAllow": [
    "Read", "Glob", "Grep",
    "Bash(ls)", "Bash(cat)", "Bash(git)",
    "Bash(pwd)", "Bash(which)", "Bash(echo)"
  ]
}

// 17

Cancellation & billing safety

Ko has several layers of defense so that you never pay for work you didn't authorize and, conversely, OpenRouter never bills us for requests whose cost we can't recover from you. A practical rundown of how they interact:

ESC cancellation

Hit ESC in the TUI at any time. The client POSTs to /api/ko/sessions/:id/abort without dropping the SSE socket. Server-side, the per-turn AbortController fires, the in-flight provider request is aborted, and you're charged only for the tokens that actually streamed before the abort, never the full budget of what the model would have generated.

A synthetic "[Turn cancelled by the user. Abandon any partial work above…]" marker is appended to the conversation so the next turn doesn't silently continue the cancelled task. If you then ask an unrelated question, the model treats it as fresh.

Live cost ticking

The Session $ counter in the status bar updates roughly every 750ms while a model is streaming rather than at end of turn, so you can watch spend accumulate on a long call and hit ESC while it still matters.

Hard per-call wallet cap

Before every fightclub-billed call we estimate the worst-case cost (input tokens plus the step's maxOutputTokens, or 32k default) and refuse to start the request if it could exceed your wallet balance. This prevents the bad case where OpenRouter bills us for tokens that atomic chargeUsage then refuses to charge you. Applied to both single-call and multi-model steps.
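
The pre-flight check amounts to a worst-case bound compared against the balance. A sketch under assumptions: prices are expressed in USD per token, "32k" is taken as 32,000, and both function names are hypothetical.

```python
DEFAULT_MAX_OUTPUT_TOKENS = 32_000   # "32k default"; the exact value is an assumption


def worst_case_cost(input_tokens, input_price, output_price, max_output_tokens=None):
    """Upper bound on one call's cost; prices are USD per token (hypothetical units)."""
    out = max_output_tokens if max_output_tokens is not None else DEFAULT_MAX_OUTPUT_TOKENS
    return input_tokens * input_price + out * output_price


def may_start(balance_usd, input_tokens, input_price, output_price, max_output_tokens=None):
    """Refuse the call if its worst case could exceed the wallet balance."""
    return worst_case_cost(input_tokens, input_price, output_price,
                           max_output_tokens) <= balance_usd
```

Setting a lower maxOutputTokens on a step shrinks the bound, which is why a capped step can start on a balance that would block an uncapped one.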

Idle watchdog

The CLI sends a heartbeat ping every 30 seconds while connected. If the server goes 90 seconds without any client activity (heartbeat, message, tool result, …) while a turn is in flight, it aborts the turn. Catches network drops, laptop sleep and browser refresh, anything that would otherwise leave the agent loop billing into the void.
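
The watchdog rule, sketched as a predicate the server could evaluate on a timer. Names are hypothetical and times are in seconds; with a 30-second heartbeat, the 90-second limit means roughly three missed pings.

```python
HEARTBEAT_INTERVAL = 30   # seconds between client pings
IDLE_LIMIT = 90           # seconds of silence before the server aborts


def should_abort(last_client_activity: float, now: float, turn_in_flight: bool) -> bool:
    """Abort only when a turn is running and the client has gone quiet."""
    return turn_in_flight and (now - last_client_activity) >= IDLE_LIMIT
```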

`--npa`: no permissions asked

--npa turns off every permission prompt for the run. Each tool call and each shell command is auto-approved with no confirmation, so the agent can write files, run Bash and touch anything in the working directory without stopping to ask. It is the same auto-approve behavior -p / --print already uses for non-interactive runs, exposed as its own flag.

Use it only where you accept the blast radius: a scratch directory, a container, a CI job you already trust. On a real working tree it removes the one gate that stops a confused model from running a destructive command. The billing caps above still apply, but the human check-in does not. Leave it off for anything you would want to see a prompt for.

// 18

Troubleshooting

Vault is locked

Run ko vault unlock once and ko caches the key locally. The CLI vault locks after 4 h idle (no absolute cap); each real turn rolls the window forward, so an active session stays unlocked. Web sessions are independent and lock after 1 h idle (12 h absolute cap). See Vault (BYOK unlock).

Model unavailable

The slot the step requested isn't mapped, the provider is down, or your wallet balance is zero. Check ko auth status for balance; check /dashboard/settings for provider configs.

Session returns HTTP 500

Usually a transient platform issue; rerun. Persistent 500s on the same session suggest the session is wedged; start fresh with ko (no -c).

Insufficient wallet balance

Top up at /dashboard/credits. Or run the task on a cheaper model with ko -m fc:<budget-model> instead of a frontier default, and cap the expensive steps with maxCostUsd in your pipeline.

Ko keeps calling the same failing command

Ko's insanity guard breaks the loop after 3 identical tool calls. If you see this, the model is confused; try /clear or rephrase the task.
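
A guard like this can be sketched as a counter over repeated calls. Two assumptions not stated above: "identical" means same tool and same arguments, and the calls must be consecutive. Under that reading the first three go through and the fourth is refused. LoopGuard is a hypothetical name.

```python
import json

MAX_IDENTICAL = 3   # the limit stated above


class LoopGuard:
    """Counts consecutive identical tool calls (consecutive is an assumption)."""

    def __init__(self):
        self.last = None
        self.count = 0

    def allow(self, tool: str, args: dict) -> bool:
        """Record a call; return False once it has already run MAX_IDENTICAL times."""
        key = (tool, json.dumps(args, sort_keys=True))   # stable key for the arguments
        if key == self.last:
            self.count += 1
        else:
            self.last, self.count = key, 1
        return self.count <= MAX_IDENTICAL
```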

Can't find ~/.ko/

Ko lazy-creates ~/.ko/ on first run. If it's missing after ko auth login, permissions are probably wrong on your home directory; check the error output from ko.

Reset everything

Nuclear option: rm -rf ~/.ko and re-run ko auth login. You'll lose local memory, skills and cached sync config; server-side history survives.

// 19

Keyboard shortcuts

Inside the TUI (internal/ui2, the only shipped front-end):

Enter

Send the current input. While the @ file overlay is open on a folder, Enter descends into that folder instead of sending.

Tab

Accept the highlighted completion, a slash command when the input starts with /, or a file entry in the open @ overlay.

@

Type @ followed by a path to open a file picker. The selected file's contents are inlined into your prompt when you send. See @file references below.

Esc

Abort the current in-flight turn. Already-made edits are preserved. If an AskUser prompt is open, Esc cancels that instead.

↑ / ↓

Walk through input history, or move the highlight when the @ or slash overlay is open.

Ctrl+C

Abort the turn and exit immediately. A single press quits; there is no second-press confirmation.

/

At the start of an empty input, opens the slash command picker.

@file references

Type @ and a path anywhere in your prompt and ko attaches that file for the turn. As you type, an overlay lists matching entries; Tab or Enter accepts the highlighted one, and folders expand so you can drill in. On send, ko reads each referenced file and inlines its contents into the message, labelled [TARGET] when it's the operand of your verb or [CONTEXT] when it's reference material. Large files are truncated at 10 KB. Example: refactor @src/auth/vault.ts to use the new key format.
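
A sketch of the inlining step. The two labels and the 10 KB limit come from the text above; the layout of the inlined block, the [truncated] marker and the function name are hypothetical, and 10 KB is read as 10240 bytes.

```python
MAX_INLINE_BYTES = 10 * 1024   # "truncated at 10 KB", read as 10240 bytes (assumption)


def inline_file(path: str, content: bytes, is_target: bool) -> str:
    """Render one @file reference as a labelled, possibly truncated block."""
    label = "[TARGET]" if is_target else "[CONTEXT]"
    # errors="ignore" drops a multi-byte character cut in half by the slice
    text = content[:MAX_INLINE_BYTES].decode("utf-8", errors="ignore")
    suffix = "\n[truncated]" if len(content) > MAX_INLINE_BYTES else ""  # hypothetical marker
    return f"{label} {path}\n{text}{suffix}"
```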

// 20

Command index

Every command the binary answers to, alphabetically, with the section that explains it. Root flags live in Sessions and Cancellation & billing safety; slash commands have their own section.

Command	Does	Section
`ko [prompt]`	Start a session in the current directory, optionally with a first prompt	sessions
`ko audit disable`	Stop indexing new tool calls; raw events still persist	rag
`ko audit enable`	Provision the audit store and backfill (--backfill-days, default 90)	rag
`ko audit export`	Stream matching audit rows as JSONL (--since, --session, --format)	rag
`ko audit search`	Natural-language search across every tool call you have made	rag
`ko audit status`	Show whether auditing is on, plus store and retention state	rag
`ko auth login`	Browser OAuth; writes a scoped token to ~/.ko/credentials.json	auth
`ko auth status`	Which account, wallet balance, vault state	auth
`ko auth token <tok>`	Cache a token pasted from the dashboard	auth
`ko brain auto`	Show or set the memory to Brain promotion mode	memory
`ko brain import-memory`	Promote file memories into the semantic Brain. De-seals them	memory
`ko build <tp_id>`	Start a session from a configured project template	templates
`ko config`	Open ~/.ko/settings.json in $EDITOR	config
`ko local [prompt]`	Run the whole agent loop in-process on your machine	sessions
`ko mcp show [ide]`	Print MCP config for vscode, cursor, windsurf, jetbrains, claude or generic	mcp
`ko mcp-serve`	Speak MCP over stdio so an IDE can drive ko	mcp
`ko memory delete <name>`	Remove a memory file and reindex	memory
`ko memory disable`	Stop indexing sessions; keep the episodes already stored	rag
`ko memory edit <name>`	Open a memory in $EDITOR (--global, --project, --force)	memory
`ko memory enable`	Turn on cross-session memory (--backfill 0|30|90|365|all days)	rag
`ko memory export`	Stream indexed episodes to stdout (--format, --since, --scope)	rag
`ko memory list`	List memory files, global and project	memory
`ko memory reindex`	Rebuild MEMORY.md from the individual memory files	memory
`ko memory search`	Search your own session history (--limit, --scope)	rag
`ko memory show <name>`	Print one memory file	memory
`ko memory sync on|off|status`	Cross-device memory sync through the vault	memory
`ko memory wipe --yes`	Drop the episodic store and soft-delete the episodes	rag
`ko resume <id>`	Jump straight to a past session	sessions
`ko scope drop <cat=val>`	Remove a scope tag	rag
`ko scope list`	Every scope you have used, with counts (--category)	rag
`ko scope settings`	Per-scope retention, audit and privacy settings	rag
`ko scope show`	The effective scope for the current directory	rag
`ko sessions`	List past sessions	sessions
`ko skills audit`	Every loaded skill with a SHA-256 hash and signature status	skills
`ko skills create <name>`	Scaffold a new skill in the current project	skills
`ko skills install <id>`	Install a skill from the store	skills
`ko skills uninstall <name>`	Remove an installed skill	skills
`ko skills update`	Update every installed skill	skills
`ko sync pull [id]`	Download and decrypt a snapshot	sync
`ko sync push`	Encrypt the working tree locally and upload it	sync
`ko sync snapshots`	List available snapshots	sync
`ko update`	Check for a newer release and replace the running binary	install
`ko vault lock`	Revoke the CLI vault session and wipe the cached key	vault
`ko vault status`	Vault state plus local cache state	vault
`ko vault unlock`	Derive the vault key and open a CLI vault session	vault
`ko version`	Print the installed version	install
`ko workspace drop`	Delete the workspace index (--yes, --repo)	rag
`ko workspace init`	Walk and embed the repo (--no-watch, --dry-run)	rag
`ko workspace list`	Every workspace registered to your account	rag
`ko workspace status`	Indexed, pending and skipped counts	rag
`ko workspace sync`	One-shot reconcile after offline edits	rag

Environment variables: KO_TOKEN_FILE (preferred token source for CI), KO_SCOPE (set the session scope from the shell), KO_QUIET (suppress the workspace nudge). KO_API_TOKEN still works and prints a deprecation warning; don't reach for it in new setups.

// end of manual
