test-it
Multi-project test aggregator for Rust and Node workspaces. Runs unit, integration, and external tests across project groups, boots Docker Compose scopes, merges LCOV coverage, diffs changed files against git, and produces terminal plus HTML reports.
review-it is a companion CLI that runs changed-file code review only — same test.yaml, no tests, Docker, Vitest, or coverage. See Two tools: test-it vs review-it.

Table of contents
- Two tools: test-it vs review-it
- Features
- Quick start
- Prerequisites
- Installation
- Usage
- review-it (review-only CLI)
- Rules catalog
- Environment variables and compatibility
- Auto-scaffold
- Architecture
- Repository layout
- Configuration (test.yaml)
- Package naming
- Integration tests
- Isolated run directories
- Recovering interrupted review runs
- Reports
- Sample projects
- Limitations
- Development
- For AI agents
Two tools: test-it vs review-it
| | test-it | review-it |
|---|---|---|
| Purpose | Full test workflow: Rust/Node tests, Docker scopes, coverage, HTML report | Changed-file code review only |
| Config | test.yaml (canonical) | review.yaml preferred; falls back to test.yaml. Ignores test-only keys when using test.yaml |
| Default run dir | target/test-it/runs/<run-id>/ | target/review-it/runs/<run-id>/ |
| Review output | reports/review.txt, review.json (after tests) | reports/review.txt, review.json (review-only run) |
| Skips | Use --no-review or TEST_IT_SKIP_REVIEW=1 to skip review | Does not run tests, Docker, Vitest, llvm-cov, or coverage |
Use test-it for CI and local runs where you need tests, coverage, and reports. Use review-it when you only want to review git branch changes against diff_base — for example a fast pre-push review or a dedicated review job without booting Docker or running Vitest.
review-it can use a dedicated review.yaml (same schema: defaults + projects) or fall back to test.yaml. When both exist in the config root, review-it loads review.yaml only (no merge). Use review-it init to scaffold review.yaml in a repo without test config.
When the review feature is enabled in the build, test-it still runs review by default after tests when defaults.code_review is true (or equivalent env). Feature-off test-it builds (--no-default-features) omit review integration entirely.
Features
- Rust + Node/Vitest: cargo test and llvm-cov for Rust crates; Vitest with LCOV export for TypeScript packages
- Auto-scaffold: first run in any repo discovers packages and writes test.yaml
- Scoped Docker: multi-project compose boot for cross-service integration tests
- LCOV merge: per-package, per-kind, and workspace-wide merged coverage
- Git-diff coverage: changed-file coverage against diff_base (staged, unstaged, untracked, commits since base)
- HTML report: foldable tests and diffs, dark/light theme, coverage accents on diff lines
- Standalone CLI: install to ~/.local/bin and run in any checkout without vendoring the harness
Quick start
In this repository:
cd mini && pnpm install
cd ..
cargo test-it
In any repository (standalone install):
curl -fsSL https://lmrr-cc.pages.dev/install/test-it.sh | bash
export PATH="$HOME/.local/bin:$PATH"
cd ~/my-monorepo
test-it
On first run without test.yaml, test-it scans the repo, writes config, runs tests, and opens the HTML report.
Prerequisites
| Tool | Required for | Notes |
|---|---|---|
| Rust (edition 2021) | All Rust packages and the CLI | rustup.rs |
| Node.js + pnpm | Vitest packages (mini-ts-*) | Run pnpm install in project dirs |
| cargo-llvm-cov | Rust coverage | Optional; tests run without it |
| Docker | Scoped integration tests | Without Docker, scoped tests are skipped (not failed) |
cargo install cargo-llvm-cov
rustup component add llvm-tools-preview
Skip coverage explicitly:
TEST_IT_SKIP_COVERAGE=1 cargo test-it
cargo test-it --no-coverage
Installation
Homebrew (macOS and Linux)
brew tap lmrr/tap https://lmrr-cc.pages.dev
brew install test-it
brew install review-it
Binaries are built from tagged releases and hosted at lmrr.cc. See lmrr.cc for the tap and release pipeline.
0. Install review skills
Code review loads rules from .cursor/rules (primary) and merges nested .cursor/skills/*/rules/ when present (declared in skills.toml). After cloning test-it:
skills install --locked
cargo build -p test-it
Without .cursor/rules (and without nested skill rules under .cursor/skills), review fails fast unless you disable it (code_review: false or TEST_IT_SKIP_REVIEW=1).
1. Cargo alias (in-repo)
.cargo/config.toml defines:
[alias]
test-it = "run --manifest-path test/test-it/Cargo.toml --bin cargo-test-it --"
Use cargo test-it from the workspace root.
2. Local path install
cargo install --path test/test-it --bin test-it
3. Standalone install (any machine)
Use this when Homebrew is not available. Default install is test-it only:
curl -fsSL https://lmrr-cc.pages.dev/install/test-it.sh | bash
export PATH="$HOME/.local/bin:$PATH"
Optional flags:
./install.sh # test-it only (default)
./install.sh --with-review-it # test-it + review-it
./install.sh --review-it-only # review-it only
./install.sh --install-dir ~/.local/bin
Install from this checkout during development:
TEST_IT_REPO="$PWD" ./install.sh
TEST_IT_REPO="$PWD" ./install.sh --with-review-it
Install review-it from a local checkout without test-it:
cargo install --path test/review-it --bin review-it
Update an existing install
test-it update # test-it only (default)
test-it update --with-review-it # test-it + review-it
test-it update --branch main
TEST_IT_REPO="$PWD" test-it update
Sample project (mini branch)
The bundled Docker/Rust/Node sample workspace is on the mini branch, not on main. Check it out with git checkout mini (or clone with -b mini), then run cargo test-it mini --no-open. Compare sample-only changes with git diff main..mini.
Usage
From the workspace root (or any repo with test.yaml):
cargo test-it # all project groups
cargo test-it mini # only packages under mini/
cargo test-it mini-rs-api # single package by name
cargo test-it -- --nocapture # forward args to cargo test
cargo test-it --no-coverage # skip llvm-cov / Vitest coverage
cargo test-it --no-html # skip HTML report
cargo test-it --no-open # do not open browser
cargo test-it --root /path/to/repo
The standalone binary accepts the same run flags:
test-it mini --no-open
| Flag | Purpose |
|---|---|
| <filter> | Project group id (e.g. mini) or package name (e.g. mini-rs-api) |
| --root | Override repo root (default: auto-detect from cwd) |
| --init-only | Scaffold test.yaml and exit without running tests |
| --force-init | Regenerate test.yaml even when it exists |
| --no-coverage | Skip coverage collection |
| --no-html | Skip HTML report generation |
| --no-open | Do not open HTML report in browser |
| -q / --quiet | Less output |
| --no-review | Skip changed-file code review |
| --debug | Include review run diagnostics in HTML #review (default troubleshooting: review.txt) |
| --recover <RUN_ID_OR_PATH> | Resume review into an existing run directory (review-only; see recovery) |
| --review-llm | Force agent review when a provider is configured |
| -- … | Extra args forwarded to cargo test |
Subcommands:
test-it update [--repo URL] [--branch main] [--with-review-it]
test-it rules list|install|status
When defaults.code_review is true and review is compiled in, test-it runs review after tests unless you pass --no-review or set TEST_IT_SKIP_REVIEW=1.
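The gating rule above can be read as a single boolean expression. The following is a minimal sketch, not the test-it source; the struct and field names are hypothetical and only mirror the four inputs the paragraph mentions.

```rust
/// Inputs that decide whether the review pass runs after tests.
/// Hypothetical names for illustration; not taken from the test-it source.
pub struct ReviewGate {
    /// The review feature is present in this build (default features).
    pub compiled_in: bool,
    /// Value of defaults.code_review in test.yaml.
    pub code_review: bool,
    /// --no-review was passed on the command line.
    pub no_review_flag: bool,
    /// TEST_IT_SKIP_REVIEW=1 is set in the environment.
    pub skip_review_env: bool,
}

/// Review runs only when it is compiled in and enabled, and no skip switch is set.
pub fn should_run_review(gate: &ReviewGate) -> bool {
    gate.compiled_in && gate.code_review && !gate.no_review_flag && !gate.skip_review_env
}
```

Either skip switch is enough to turn review off, which is why both appear as negated terms.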
Environment variables
| Variable | Purpose |
|---|---|
| TEST_IT_WORKSPACE_ROOT | Repo root override (same as --root) |
| TEST_IT_SKIP_COVERAGE=1 | Skip coverage |
| TEST_IT_SKIP_DOCKER=1 | Skip Docker boot; use env for service endpoints |
| TEST_IT_SKIP_REVIEW=1 | Skip changed-file code review |
| TEST_IT_CURSOR_DIR | Override .cursor root (parent of rules/, prompts/, skills/) |
| TEST_IT_RULES_DIR | Override rules catalog directory only |
| TEST_IT_REVIEW_PROMPT_FILE | Review prompt path (highest precedence when the file exists) |
| TEST_IT_REVIEW_AGENT=1 | Enable agent review (set 0 to disable when review_agent is true) |
| TEST_IT_REVIEW_LLM=1 | Alias for enabling agent review |
| TEST_IT_REVIEW_PROVIDER | Agent provider (openai, anthropic, claude, or cursor); auto-detects when unset |
| TEST_IT_REVIEW_MODEL | Override agent model name |
| TEST_IT_REVIEW_AGENT_CONCURRENCY | Max parallel agent calls (default 4) |
| TEST_IT_REVIEW_AGENT_MODE | Batch mode: file (default), rule, or pair (rule_file alias) |
| TEST_IT_REVIEW_AGENT_MAX_RULES | Optional cap on agent requests scheduled per run |
| TEST_IT_REVIEW_AGENT_MAX_RULES_PER_REQUEST | File mode: opt-in max rules per request when chunking prompts |
| TEST_IT_REVIEW_AGENT_MAX_FILES_PER_REQUEST | Rule mode: max files bundled per request (default 10) |
| TEST_IT_REVIEW_AGENT_CACHE | Opt-in agent evaluation cache (1 / true; default off) |
| TEST_IT_REVIEW_AGENT_CACHE_DIR | Agent cache root (default .test-it/cache/review under workspace) |
| TEST_IT_REVIEW_PROFILE | Review profile (legacy, full, or ci; default legacy) |
| TEST_IT_REVIEW_FAIL_ON_INCOMPLETE | Exit non-zero when review status is incomplete (1 / true) |
| TEST_IT_REVIEW_REQUIRE_AGENT_PROVIDER | Require a provider when enabled LLM pairs exist (1 / true; implied for ci unless set to false) |
| TEST_IT_REVIEW_PROGRESS | Write review-progress.jsonl during agent review (set 0 / false to disable) |
| TEST_IT_REVIEW_PROGRESS_FILE | Override progress JSONL path (default {run}/reports/review-progress.jsonl) |
| TEST_IT_REVIEW_DEBUG=1 | Include review run diagnostics in HTML #review (same as --debug) |
| RUST_LOG | Enable tracing diagnostics (e.g. RUST_LOG=test_it=info) |
| TEST_IT_NO_OPEN=1 | Do not open HTML report |
| TEST_IT_RUN_DIR | Set by test-it during a run |
| TEST_IT_REPORT_DIR | Set by test-it during a run |
| TEST_IT_{PROJECT}_{SERVICE}_URL | Service base URL after Docker boot |
| NO_COLOR | Disable terminal ANSI colors |
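The TEST_IT_{PROJECT}_{SERVICE}_URL pattern in the table can be illustrated with a small helper that builds the variable name. This is a sketch under an assumption the README does not state: that ids are uppercased and any non-alphanumeric character (such as a hyphen) becomes an underscore. The function name is hypothetical.

```rust
/// Builds the service URL variable name for a project and service id.
/// Assumption (not confirmed by the README): ids are uppercased and every
/// non-alphanumeric character is replaced with an underscore.
pub fn service_url_var(project: &str, service: &str) -> String {
    fn normalize(id: &str) -> String {
        id.chars()
            .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_uppercase() } else { '_' })
            .collect()
    }
    format!("TEST_IT_{}_{}_URL", normalize(project), normalize(service))
}
```

Under that assumption, project mini with service api maps to TEST_IT_MINI_API_URL.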
Repo root is detected by walking up from cwd for test.yaml (or fallbacks). If none is found at the Cargo workspace root, test-it scans immediate child directories for per-project config (e.g. mini/test.yaml). A single nested config is used automatically; multiple configs run sequentially unless --root or a directory-name filter selects one.
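The walk-up part of that detection can be sketched with std::path::Path::ancestors. This is an illustration only: the function name is hypothetical, the config check is injected as a closure so the sketch does not touch the filesystem, and the child-directory fallback scan is left out.

```rust
use std::path::{Path, PathBuf};

/// Walks from `start` toward the filesystem root and returns the first
/// directory for which `has_config` reports a config file.
/// Hypothetical helper; the child-directory fallback is not modeled here.
pub fn find_config_root(start: &Path, has_config: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    start
        .ancestors() // yields `start` itself first, then each parent in turn
        .find(|dir| has_config(*dir))
        .map(Path::to_path_buf)
}
```

In real use the closure would check something like `dir.join("test.yaml").is_file()`; a caller that gets None back would then fall through to scanning immediate child directories.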
review-it (review-only CLI)
review-it discovers review.yaml (or test.yaml as fallback), diffs changed files against diff_base, and writes review artifacts. It does not run tests, Docker, Vitest, llvm-cov, or coverage collection.
Config discovery walks up from cwd (or --root). If no config is found in ancestors, it scans immediate child directories (same idea as test-it). When multiple child configs exist, pass --root <dir>.
review-it init # create review.yaml (review-only defaults)
review-it init --root /path/to/mini --force
review-it # review all projects in config
review-it mini # filter by project id
review-it --root /path/to/repo
review-it --diff-base main
review-it --no-agent # disable agent for this run
review-it --review-llm # enable LLM review (matches review_llm semantics)
review-it -q
review-it --debug --no-open # HTML includes profile/planning/LLM diagnostics (default: review.txt only)
review-it --recover 20260528-191307-dc8f28
review-it --recover ./target/review-it/runs/20260528-191307-dc8f28
Default output directory: target/review-it/runs/<run-id>/reports/ (review.txt, review.json, review.html). When defaults.runs_base is set in config, review-it honors it. See Recovering interrupted review runs for --recover.
Fresh checkout without config (e.g. /tmp/mini copied without test.yaml): run review-it init, ensure the tree is a git repo (or set --diff-base), and install rules with skills install --locked (or point TEST_IT_CURSOR_DIR at a checkout with .cursor).
Cursor agent in new directories: review-it invokes cursor agent --print --trust by default so headless runs work in paths like /tmp/mini. Disable with TEST_IT_REVIEW_CURSOR_TRUST=0, or trust once manually: cursor agent --print --trust "Reply with OK".
Review HTML tree: lists every instrumentable changed file under project → module → file. Files with findings show severity counts; click a flagged line or its gutter label (1 critical, 1 major, 1 minor) to expand Problem · Impact · Suggestion (hover the label for the problem). Findings are deduped per file/line/rule.
Rules subcommand (same catalog logic as test-it rules):
review-it rules list
review-it rules install
review-it rules status
Install review-it with ./install.sh --with-review-it or ./install.sh --review-it-only, or cargo install --path test/review-it --bin review-it.
Rules catalog
Both test-it and review-it expose rules {list,install,status}. They delegate to the same rules catalog and installer in test-it-review — the catalog was not rewritten for the split.
test-it rules list
test-it rules install
test-it rules status
review-it rules list
review-it rules install
review-it rules status
Rules load from .cursor/rules (and nested .cursor/skills/*/rules/ when present). Install with skills install --locked after clone.
Environment variables and compatibility
test-it uses the TEST_IT_* namespace for all runtime settings, including review:
- TEST_IT_REVIEW_* and TEST_IT_SKIP_REVIEW control review when the review feature is compiled in.
- REVIEW_IT_* does not affect test-it. Keep review-it-specific env out of test-it runs.
review-it precedence: CLI flags → REVIEW_IT_* → TEST_IT_REVIEW_* / TEST_IT_SKIP_REVIEW fallback (compatibility for existing CI env).
| Namespace | Used by | Notes |
|---|---|---|
| TEST_IT_SKIP_REVIEW, TEST_IT_REVIEW_* | test-it (review); review-it (fallback) | Primary compatibility namespace |
| REVIEW_IT_* | review-it only | Preferred for review-it-only jobs |
| TEST_IT_WORKSPACE_ROOT, TEST_IT_CURSOR_DIR, TEST_IT_RULES_DIR | test-it | Not primary for review-it config discovery (--root / cwd walk-up) |
See the Environment variables table under Usage for the full TEST_IT_* list used by test-it. review-it honors the review-related subset plus REVIEW_IT_* aliases documented in AGENTS.md.
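The review-it precedence chain (CLI flags, then REVIEW_IT_*, then TEST_IT_REVIEW_*) can be sketched as a lookup over an environment map. The function name is hypothetical, and the assumption that each REVIEW_IT_ key shares its suffix with the matching TEST_IT_REVIEW_ key is mine; the actual alias names are documented in AGENTS.md.

```rust
use std::collections::HashMap;

/// Resolves one review setting for review-it: CLI value first, then the
/// REVIEW_IT_ variable, then the TEST_IT_REVIEW_ fallback.
/// Assumption: both namespaces use the same suffix (for example MODEL).
pub fn resolve_review_setting(
    cli: Option<&str>,
    env: &HashMap<String, String>,
    suffix: &str,
) -> Option<String> {
    if let Some(value) = cli {
        return Some(value.to_string());
    }
    let preferred = format!("REVIEW_IT_{}", suffix);
    let fallback = format!("TEST_IT_REVIEW_{}", suffix);
    env.get(&preferred).or_else(|| env.get(&fallback)).cloned()
}
```

Passing the environment as a map rather than reading std::env directly keeps the lookup easy to test in isolation.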
Auto-scaffold
When test.yaml is missing or empty (or with --force-init), test-it scans the repo and writes config:
test-it --init-only
test-it --force-init
Discovery rules (test/test-it/src/scaffold.rs):
- Rust: any Cargo.toml except harness crates (test-it, test-it-core, test-it-macros); [unit] or [unit, integration] if tests/ contains #[test_it
- Node: package.json with Vitest dependency or script; runtime: node, runner pnpm exec vitest run --coverage
- Projects: first path segment (e.g. mini) or package name prefix
- Compose: {project}/docker-compose.test.yml or docker-compose.yml; infers api and db services
- Scopes: from #[test_it(scope = "…")] and .project("…") in integration tests
Max scan depth: 6 directory levels. Skips target, node_modules, .git, hidden dirs.
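The skip rules above amount to a per-directory predicate. The sketch below is not the code in scaffold.rs; the names are hypothetical, and it assumes depth counts directory levels below the repo root, with the root's children at level 1.

```rust
/// Maximum directory levels below the repo root that the scan visits.
pub const MAX_SCAN_DEPTH: usize = 6;

/// Returns true when the scan should enter a directory.
/// Assumption: `depth` is the level of the directory being entered,
/// where immediate children of the repo root are level 1.
pub fn should_descend(dir_name: &str, depth: usize) -> bool {
    let skipped_by_name = matches!(dir_name, "target" | "node_modules" | ".git");
    let hidden = dir_name.starts_with('.');
    depth <= MAX_SCAN_DEPTH && !skipped_by_name && !hidden
}
```

The explicit .git entry is redundant with the hidden-directory check, but it is kept to match the list in the text.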
Architecture
flowchart LR
subgraph test_it [test-it]
cli[test-it CLI] --> init[Init and preflight]
init --> docker[Docker scope boot]
init --> run[Package runners]
run --> rust[Rust cargo test / llvm-cov]
run --> node[Node Vitest]
docker --> run
run --> cov[LCOV merge]
cov --> report[Terminal + HTML report]
report --> review[Optional review pass]
end
subgraph review_it [review-it]
rcli[review-it CLI] --> rdisc[Config + git diff]
rdisc --> rpipe[Review pipeline]
end
common[test-it-common] --> init
common --> rdisc
treview[test-it-review] --> review
treview --> rpipe
test-it run flow:
- Load or scaffold test.yaml
- Init: resolve projects, packages, integration tests; build preflight plan
- Create isolated run dir under target/test-it/runs/{run-id}/ (or defaults.runs_base)
- Boot Docker scopes required by integration tests
- Run packages: Rust unit/integration, Node Vitest, optional external functional/ui commands
- Merge LCOV, build report snapshot, write HTML and terminal summary
- Run changed-file review when enabled (skipped with --no-review or feature-off builds)
- Teardown Docker scopes; update target/test-it/latest symlink
review-it run flow: discover review.yaml (or test.yaml as fallback) → git diff against diff_base → review pipeline → write review.txt / review.json under target/review-it/runs/{run-id}/reports/ (or defaults.runs_base).
Repository layout
test-it/
├── Cargo.toml # test-it workspace (test/test-it* + review-it)
├── install.sh # standalone install script
├── docs/
│ └── screenshot-report.png # sample HTML report screenshot
├── .cargo/config.toml # cargo test-it alias
├── mini/
│ ├── test.yaml # sample project aggregator config
│ ├── Cargo.toml # standalone workspace when copied in isolation
│ ├── docker-compose.test.yml
│ ├── mini-rs-api/ # Axum BFF + integration tests
│ ├── mini-rs-lib-mcap/ # BTC/USDT price API (mcap service)
│ ├── mini-ts-lib-price/ # shared USDT display (Vitest)
│ ├── mini-ts-web-react/ # React wrapper (Vitest)
│ └── mini-ts-web-vue/ # Vue wrapper (Vitest)
└── test/
├── test-it/ # full test CLI (test-it, cargo-test-it); default-on review feature
├── test-it-common/ # shared git diff, discovery, config/env helpers
├── test-it-review/ # review pipeline and rules installer
├── test-it-core/ # integration runtime, Docker, TestEnv
├── test-it-macros/ # #[test_it] proc macro
└── review-it/ # standalone review-only CLI
Configuration (test.yaml)
Config lives beside the project it describes (e.g. mini/test.yaml). test-it loads test.yaml in the config root (fallback: test.yml, test.toml, deprecated test-it.toml). review-it prefers review.yaml (review.yml, review.toml), then falls back to the same test-it filenames. When both review.yaml and test.yaml exist, review-it uses review.yaml only (no merge). Create review-only config with review-it init; use test-it --init-only for full test scaffolding.
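The filename fallback described above can be sketched as an ordered candidate list. This is an illustration, not the discovery code in test-it-common: the function name is hypothetical, and it assumes the order in which the README lists the filenames is also their precedence order.

```rust
/// test-it config filenames, in the order the README lists them.
const TEST_FILES: [&str; 4] = ["test.yaml", "test.yml", "test.toml", "test-it.toml"];
/// review-it config filenames, tried before the test-it names.
const REVIEW_FILES: [&str; 3] = ["review.yaml", "review.yml", "review.toml"];

/// Picks the config file a tool would load from the files present in the
/// config root. review-it tries its own names first and never merges files.
/// Assumption: listed order equals precedence order.
pub fn pick_config(review_tool: bool, present: &[&str]) -> Option<&'static str> {
    let candidates: Vec<&'static str> = if review_tool {
        REVIEW_FILES.iter().chain(TEST_FILES.iter()).copied().collect()
    } else {
        TEST_FILES.to_vec()
    };
    candidates.into_iter().find(|name| present.contains(name))
}
```

With both review.yaml and test.yaml present, review-it resolves to review.yaml alone, while test-it ignores review.yaml entirely.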
On startup, test-it prints an init summary:
test-it init: 1 projects, 5 packages, 1 integration test(s) (1 scoped)
mini/mini-rs-api — unit, integration
mini/mini-rs-lib-mcap — unit
Schema
| Section | Purpose |
|---|---|
| defaults | coverage, html_report, open_report, runs_base, scope, diff_base, code_review, review_ignore, review_rules_dir, review_prompt, review_fail_on, review_agent, review_agent_mode, review_agent_cache, review_agent_cache_dir, review_llm, review_default_severity |
| projects.<id> | path, compose, services, packages, optional diff_base |
| projects.<id>.packages.<name> | kinds, optional runtime: node, manifest, unit runner |
| scopes.<name> | Docker scope for integration tests (projects: [mini]) |
Example (this repo)
defaults:
  coverage: true
  html_report: true
  runs_base: target/test-it/runs
  scope: global
  diff_base: main
  code_review: true
  review_ignore: .reviewignore
  review_rules_dir: .cursor/rules
  review_prompt: .cursor/prompts/review.md
  review_fail_on: none
  review_agent: true
  review_agent_mode: file
  review_llm: false
  review_default_severity: major
projects:
  mini:
    path: .
    diff_base: main
    compose: docker-compose.test.yml
    packages:
      mini-rs-api:
        kinds: [unit, integration]
      mini-rs-lib-mcap:
        kinds: [unit]
      mini-ts-lib-price:
        runtime: node
        manifest: mini-ts-lib-price/package.json
        kinds: [unit]
        unit:
          command: pnpm
          args: [exec, vitest, run, --coverage]
          cwd: mini-ts-lib-price
    services:
      api:
        service: mini-rs-api
        port: 3000
        health: /health
      mcap:
        service: mini-rs-lib-mcap
        port: 3001
        health: /health
      db:
        service: mini-db
        port: 5432
        kind: postgres
scopes:
  mini_wallet:
    projects: [mini]
External Node runner block (runtime: node + unit:) replaces the older standalone ui: runner pattern for Vitest packages.
Optional per-project env: {project}/.env.test.
Package naming
Sample packages follow mini-{lang}-{kind}-{name}:
| Package | Pattern |
|---|---|
| mini-rs-api | Rust API service |
| mini-rs-lib-mcap | Rust library/service (BTC/USDT price) |
| mini-ts-lib-price | TypeScript library |
| mini-ts-web-react | TypeScript web (React) |
| mini-ts-web-vue | TypeScript web (Vue) |
Project ids match top-level directories (e.g. mini). Package names in test.yaml must match Cargo crate names or npm package.json names.
Integration tests
Integration tests use the #[test_it] macro from test-it-core. Function names must follow:
{project}_project_{suite}_suite_{test}_test
Example from mini/mini-rs-api/tests/integration.rs:
use test_it_core::prelude::*;

#[test_it(scope = "mini_wallet")]
async fn mini_project_wallet_suite_get_address_value_usdt_test(env: TestEnv) {
    let resp = env
        .project("mini")
        .service("mcap")?
        .get("/btc/usdt?timestamp=1710000000")
        .send()
        .await;
    resp.assert_status(200);
}
Rules enforced by the macro:
- Function must be async
- Single parameter: env: TestEnv
- No return type (failures via panics/assertions)
- Segments {project}, {suite}, {test}: lowercase snake_case
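To make the naming rule concrete, here is a standalone sketch that splits a function name into its three segments. It is not the proc macro's implementation; the function name is hypothetical, and it assumes the first _project_ and the first following _suite_ act as the separators and that digits are allowed in segments.

```rust
/// Splits `{project}_project_{suite}_suite_{test}_test` into its segments.
/// Illustrative only. Assumes the first `_project_` and the first following
/// `_suite_` are the separators, and that segments may contain digits.
pub fn parse_test_name(name: &str) -> Option<(&str, &str, &str)> {
    let body = name.strip_suffix("_test")?;
    let (project, rest) = body.split_once("_project_")?;
    let (suite, test) = rest.split_once("_suite_")?;
    // Each segment must be non-empty lowercase snake_case.
    let is_snake = |s: &str| {
        !s.is_empty()
            && s.chars()
                .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
    };
    if is_snake(project) && is_snake(suite) && is_snake(test) {
        Some((project, suite, test))
    } else {
        None
    }
}
```

For the example test above, the segments come out as mini, wallet, and get_address_value_usdt.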
Scope mini_wallet boots the mini Docker stack (including mini-rs-lib-mcap) before the test runs.
Isolated run directories
test-it — each run creates a unique directory under target/test-it/runs/{run-id}/ by default. Symlink target/test-it/latest points at the most recent run.
review-it — each run creates target/review-it/runs/{run-id}/reports/ by default (review.txt, review.json).
Both tools honor defaults.runs_base in test.yaml when set (legacy compatibility for custom run roots).
| Path (test-it) | Purpose |
|---|---|
| targets/{project}/ | Isolated CARGO_TARGET_DIR |
| coverage/{project}/*.lcov | Per-package LCOV (unit, integration, merged) |
| coverage/merged.lcov | Workspace-wide merged coverage |
| reports/index.html | Aggregate HTML report |
| reports/review.txt | Plain-text review for change requests |
| reports/review.json | Machine-readable review findings |
| reports/integration/*.jsonl | Integration test HTTP records |
| reports/unit/*.json | Vitest JSON results for node packages |
| env.json | Docker service endpoint snapshot |
All paths under target/ are gitignored.
Recovering interrupted review runs
Both review-it and test-it (with the review feature enabled) support --recover <RUN_ID_OR_PATH> to continue an interrupted agent review into an existing run directory. Recovery reuses the same reports/ folder and appends to review-progress.jsonl when agent progress is enabled. It does not create a new run id.
There are no recover-specific environment variables (for example no REVIEW_IT_RECOVER_RUN or TEST_IT_RECOVER_RUN). Review-it config is review.yaml or test.yaml at the config root.
Commands
review-it --recover 20260528-191307-dc8f28
review-it --recover ./target/review-it/runs/20260528-191307-dc8f28
test-it --recover 20260528-191307-dc8f28
test-it --recover ./target/test-it/runs/20260528-191307-dc8f28
<RUN_ID_OR_PATH> is either the run directory name under defaults.runs_base (for example 20260528-191307-dc8f28) or a path to the run root. The path must lie under the canonical runs base and include an existing reports/ directory.
What each tool does
| | review-it --recover | test-it --recover |
|---|---|---|
| Reuses | target/review-it/runs/<id>/ (or defaults.runs_base) | target/test-it/runs/<id>/ (or defaults.runs_base) |
| Runs | Review pipeline only (same as a normal review-it run) | Review phase only: no tests, Docker, Vitest, llvm-cov, coverage merge, or full combined report rebuild |
| Writes | review.txt, review.json, review.html | review.txt, review.json only (no standalone review.html) |
| Skips | N/A (review-it never runs tests) | Test execution, scope boot, coverage/LCOV, reports/index.html regeneration, target/test-it/latest update |
Safe recovery policy
reports/review-progress.jsonl records progress and accounting only. It is not a store of review findings.
| Situation | Behavior |
|---|---|
| Progress complete + agent cache hit | Work can be satisfied from cache (task removed from the provider queue) |
| Progress complete + no cache hit | Provider task is rerun (findings are not assumed from progress alone) |
| Progress request without matching complete | Rerun |
| Malformed progress lines | Ignored (counted); recovery continues |
| No progress file | Review runs into the existing run directory as usual |
If the git diff, test.yaml, or rules catalog changed since the original run, recovery may warn and schedule different work than the interrupted run. Do not run two recover processes against the same run directory at once.
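The first three rows of the recovery policy table describe a per-task decision, which can be sketched as a match over progress state and cache state. The enum and function names are hypothetical; the sketch covers only those three rows and leaves out malformed lines and a missing progress file.

```rust
/// What review-progress.jsonl says about one task (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Progress {
    /// A complete event was recorded for the task.
    Complete,
    /// A request event exists with no matching complete event.
    RequestWithoutComplete,
}

/// What recovery does with the task.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    SatisfiedFromCache,
    Rerun,
}

/// Progress alone never stands in for findings: only a cache hit on a
/// completed task avoids a provider call.
pub fn recovery_action(progress: Progress, cache_hit: bool) -> Action {
    match (progress, cache_hit) {
        (Progress::Complete, true) => Action::SatisfiedFromCache,
        (Progress::Complete, false) => Action::Rerun,
        (Progress::RequestWithoutComplete, _) => Action::Rerun,
    }
}
```

Because the progress file holds accounting only, enabling the agent cache is what turns a completed task into reusable work on recovery.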
Invalid and feature-off cases
test-it --recover ./target/test-it/runs/<id> --no-review
# error: recover is a review recovery path; --no-review is not allowed
cargo run -p test-it --no-default-features --bin test-it -- --recover <id>
# error: test-it --recover requires the review feature (default features include review)
Cache recommendation for long runs
Enable the agent cache before large or long agent reviews so interrupted runs can reuse validated evaluations:
defaults:
  review_agent_cache: true
  review_agent_cache_dir: .test-it/cache/review
Or for a single run:
TEST_IT_REVIEW_AGENT_CACHE=1 review-it --no-open
TEST_IT_REVIEW_AGENT_CACHE=1 test-it --no-open
review-it also honors review-related settings via REVIEW_IT_* with fallback to TEST_IT_REVIEW_* (same keys as in the environment tables above). The cache directory may contain code snippets, rule text, and findings — treat .test-it/cache/review as source-sensitive (.test-it/ is gitignored).
v1 limitations (not supported)
- Full test-run checkpoint recovery (no automatic rerun of prior tests from run artifacts alone)
- Reconstructing Vitest JSONL, merged LCOV, integration HTTP records, or Docker env.json during test-it recover
- Regenerating the combined reports/index.html during test-it recover (re-run a normal test-it job for that)
- Recover environment variables
Full test-run checkpointing is a possible future task; v1 recover is review-only into an existing run directory.
Reports
After a run, test-it prints one aggregate terminal report and writes one HTML report.
Terminal — run id, status line, projects table, integration table, report path:
test-it OK | run:20260523-085406-0adbe8 | pkg 4/4 | tests run:6 skip:0 fail:0 ok:100% | all:23.6% unit:45.9% it:5.5% diff:83.6% | report: .../reports/index.html
HTML — target/test-it/latest/reports/index.html:
- Sticky status header (same summary line)
- Projects table — coverage per project/package; ALL (changed) footer sums diff coverage
- Tests — foldable per project and suite; filter, sort, show failures/skipped; failed tests show HTTP reason and source snippet
- Changes — foldable per project and file; unified diff with line-coverage accents
- Review: embedded in the report; status line with agent execution stats (provider, mode, applicable rules per file in file mode, requests), verdict, foldable findings; optional review.txt attachment
When the agent runs, the terminal report always includes a review section (even with zero findings) listing provider, applicable rules (file mode peak per file), requests, and per-file coverage. review.txt and review.json include the same execution summary under Review run. REVIEW.md also lists Rule evaluations (total file×rule work, the rule_checks field) for cost transparency. Status text uses rule and request counts instead of a bare review: ok; use agent off or agent skipped (...) when the agent did not run.
While tests finish, test-it prints phase banners (==> report snapshot, ==> code review, ==> html report) and live agent progress ([review] i/N · path · rules) so long review runs do not look stuck. Optional reports/review-progress.jsonl records the same events for tail -f or CI (disable with TEST_IT_REVIEW_PROGRESS=0). Set RUST_LOG=test_it=info for deeper agent diagnostics.
Disable review with --no-review, TEST_IT_SKIP_REVIEW=1, or defaults.code_review: false. Review is rules-only: the bundled index selects .md rules under .cursor/rules/{skill}/ (and merges nested .cursor/skills/{skill}/rules/ when present) and the agent evaluates applicable rules on changed files when a provider is available.
Default batch mode is file (review_agent_mode / TEST_IT_REVIEW_AGENT_MODE): one agent request per changed file with all applicable rules bundled (lowest cost when few files change). Use pair for one request per file×rule (most precise). Use rule when many files change against the same rule set.
With no TEST_IT_REVIEW_PROVIDER, test-it auto-detects in order: Cursor CLI on PATH (including ~/.local/bin/cursor and the macOS app bundle), ANTHROPIC_API_KEY, then OPENAI_API_KEY. When TEST_IT_REVIEW_PROVIDER is set but that provider is unavailable, test-it falls back through the same chain and logs a warning. With agent review disabled or no provider found, review produces zero findings.
Force enable with --review-llm, defaults.review_llm: true, TEST_IT_REVIEW_AGENT=1, or TEST_IT_REVIEW_LLM=1. Disable agent with TEST_IT_REVIEW_AGENT=0 or review_agent: false.
Default finding severity when the model omits it: review_default_severity (default major). CI gate: review_fail_on: critical or major. For pipelines, use review_profile: ci with review_fail_on_incomplete: true and a configured LLM provider; see AGENTS.md for the recommended CI profile.
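The provider auto-detection order described above (Cursor CLI, then ANTHROPIC_API_KEY, then OPENAI_API_KEY) can be sketched as a simple chain. The function name is hypothetical and the three checks are passed in as booleans, so the sketch does not probe PATH or the environment itself.

```rust
/// Picks a provider in the documented order: Cursor CLI first, then an
/// Anthropic key, then an OpenAI key. Returns None when nothing is
/// available, in which case review produces zero findings.
/// Hypothetical helper; availability checks are supplied by the caller.
pub fn detect_provider(
    cursor_cli_found: bool,
    has_anthropic_key: bool,
    has_openai_key: bool,
) -> Option<&'static str> {
    if cursor_cli_found {
        Some("cursor")
    } else if has_anthropic_key {
        Some("anthropic")
    } else if has_openai_key {
        Some("openai")
    } else {
        None
    }
}
```

The same chain serves as the fallback when TEST_IT_REVIEW_PROVIDER names a provider that turns out to be unavailable.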
The embedded index test/test-it/assets/rules.yaml (generated at build time from .cursor/rules/*/*.md plus nested .cursor/skills/*/rules/*.md) organizes the catalog under a rules: map keyed by skill name (~49 skills, ~1,800 rules). Every section uses the same layout: include (explicit rule file stems, sorted), exclude (ids skipped during test-it review only), and override (per-rule severity, enforce, scope). Precedence is include → exclude → override; exclude wins. Legacy review-index.yaml / review-index.toml layouts under the rules directory remain supported for hand-edited project indexes.
Standalone test-it embeds the generated index (include_str! in the binary). Rule bodies (.md files) are never copied or embedded; they load from the filesystem .cursor/rules tree (with nested skill rules merged from .cursor/skills) at runtime.
The review prompt ships bundled as test/test-it/assets/review.md (include_str! in the binary). Override precedence: TEST_IT_REVIEW_PROMPT_FILE (when set and the file exists), then defaults.review_prompt / walk-up .cursor/prompts/review.md when that file exists, otherwise the bundled default (no error when the override path is absent).
At runtime, test-it prefers a project review-index.yaml under the resolved rules directory, then falls back to the embedded rules index. Paths resolve from review_rules_dir / review_prompt in test.yaml (relative to the config file), walk-up discovery toward the repo root, or TEST_IT_CURSOR_DIR / TEST_IT_RULES_DIR / TEST_IT_REVIEW_PROMPT_FILE. When the catalog is missing, review fails with an actionable error (skills install --locked). build.rs requires the catalog at compile time and regenerates the embedded index.
Tune cost with TEST_IT_REVIEW_AGENT_CONCURRENCY (default 4), optional TEST_IT_REVIEW_AGENT_MAX_RULES (max requests per run), and opt-in chunk limits TEST_IT_REVIEW_AGENT_MAX_RULES_PER_REQUEST / TEST_IT_REVIEW_AGENT_MAX_FILES_PER_REQUEST when prompts hit context limits.
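When tuning cost, it can help to estimate how many agent requests each batch mode schedules. The sketch below is a rough model, not test-it's scheduler. It assumes every rule applies to every changed file, that rule mode sends one request per rule per chunk of files, and that no rules-per-request chunking or request cap is in effect. All names are hypothetical.

```rust
/// Batch modes named in the README (file, rule, pair).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BatchMode {
    File,
    Rule,
    Pair,
}

/// Rough request count for one review run.
/// Assumptions: every rule applies to every file; rule mode issues one
/// request per rule per chunk of `max_files_per_request` files; no
/// rules-per-request chunking and no per-run request cap.
pub fn estimate_requests(
    mode: BatchMode,
    files: usize,
    rules: usize,
    max_files_per_request: usize,
) -> usize {
    match mode {
        BatchMode::File => files,
        BatchMode::Pair => files * rules,
        BatchMode::Rule => {
            let chunk = max_files_per_request.max(1);
            let chunks_per_rule = (files + chunk - 1) / chunk; // ceiling division
            rules * chunks_per_rule
        }
    }
}
```

Under these assumptions, 3 changed files against 20 rules cost 3 requests in file mode, 20 in rule mode, and 60 in pair mode, which matches the guidance that file mode is cheapest when few files change.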
Optional pair-level cache: review_agent_cache: true or TEST_IT_REVIEW_AGENT_CACHE=1 stores validated evaluations under review_agent_cache_dir (default .test-it/cache/review). The cache may contain code snippets, rule text, and findings — treat it as source-sensitive; .test-it/ is gitignored. Cache read/write failures are non-fatal. Mechanical policies (TODO, unwrap, etc.) are enforced by the agent reading the corresponding rule text.
| Accent | Meaning |
|---|---|
| Green | OK / pass, high coverage (≥ 50%) |
| Yellow | Skipped tests, uncovered changed lines, low coverage (< 50%) |
| Light blue | LCOV-covered diff lines |
| Red | FAIL, failed projects/tests, uncovered lines in failed projects |
Terminal colors apply when stdout is a TTY. Set NO_COLOR to disable. --quiet also disables colors.
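The three conditions in that paragraph combine into one predicate. This is a sketch with a hypothetical function name; treating only a non-empty NO_COLOR value as set follows the common NO_COLOR convention and is an assumption about test-it's behavior.

```rust
/// Decides whether terminal output uses ANSI colors.
/// Assumption: NO_COLOR counts as set only when present and non-empty.
pub fn colors_enabled(stdout_is_tty: bool, no_color: Option<&str>, quiet: bool) -> bool {
    let no_color_set = no_color.map_or(false, |value| !value.is_empty());
    stdout_is_tty && !no_color_set && !quiet
}
```

A caller on Rust 1.70 or later could supply the first argument from `std::io::IsTerminal::is_terminal(&std::io::stdout())` and the second from `std::env::var("NO_COLOR").ok().as_deref()`.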
Changed files include unstaged, staged, untracked, and commits since diff_base. The changes section lists only files with a renderable unified diff.
Set diff_base to empty, empty-tree, or root to diff against Git’s empty tree (all tracked files in HEAD vs the empty commit). Default remains main.
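The diff_base mapping can be sketched as a small resolver. The hash below is Git's well-known SHA-1 empty-tree object id; whether test-it uses that literal internally is an assumption, as is the distinction between an unset value (default main) and an explicitly empty one. The function name is hypothetical.

```rust
/// Git's well-known SHA-1 id for the empty tree object.
pub const EMPTY_TREE: &str = "4b825dc642cb6eb9a060e54bf8d69288fbee4904";

/// Maps a configured diff_base to the ref passed to git diff.
/// Assumptions: unset means the default `main`; an explicit empty string,
/// `empty-tree`, or `root` selects the empty tree; anything else is used as-is.
pub fn resolve_diff_base(configured: Option<&str>) -> String {
    match configured.map(str::trim) {
        None => "main".to_string(),
        Some("") | Some("empty-tree") | Some("root") => EMPTY_TREE.to_string(),
        Some(other) => other.to_string(),
    }
}
```

Diffing against the empty tree makes every tracked file in HEAD count as changed, which is useful for a first full-repository review.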
Code review uses the same changed-file set as the report changes section (git diff plus LCOV-expanded sources), not git alone.
On local interactive runs, the HTML report opens automatically when generation succeeds. CI environments skip auto-open. Disable with --no-open, TEST_IT_NO_OPEN=1, or defaults.open_report: false.
Sample projects
| Project | Package | Purpose |
|---|---|---|
| mini | mini-rs-api | Axum BFF: Bitcoin address lookup, Postgres, calls mcap for BTC/USDT rate |
| mini | mini-rs-lib-mcap | Axum service: BTC/USDT price at timestamp |
| mini | mini-ts-lib-price | Shared USDT formatting logic (Vitest) |
| mini | mini-ts-web-react | React component wrapping price display (Vitest) |
| mini | mini-ts-web-vue | Vue component wrapping price display (Vitest) |
Limitations
- --keep-going: flag is declared but not yet wired; runner does not continue after project failure
- Docker-less environments: scoped integration tests are skipped with reason "no docker", not failed
- Node packages: require preinstalled node_modules; test-it does not run pnpm install
- Rust coverage: requires cargo-llvm-cov; otherwise tests run with a warning
- Scope coverage for sibling services: packages like mini-rs-lib-mcap booted in the same compose scope appear in changed-file coverage from unit LCOV; Docker service execution is not llvm-instrumented
- Scaffold depth: max 6 directory levels; exotic compose layouts may need manual test.yaml edits
- latest symlink: Unix only
Development
cargo test -p test-it -p test-it-core -p review-it # launcher, core, review CLI unit tests
cd mini && cargo test -p mini-rs-api # sample crate tests (mini workspace)
cargo test-it mini --no-coverage # full aggregator without llvm-cov
review-it --root . # review-only (no tests)
TEST_IT_REPO="$PWD" ./install.sh # install local test-it to ~/.local/bin
TEST_IT_REPO="$PWD" ./install.sh --with-review-it
Standalone mini/ copy (isolated smoke test):
rm -rf /tmp/mini && cp -a mini /tmp/mini
test-it update
cd /tmp/mini && pnpm install && test-it --no-open
The root workspace contains only the test/test-it* crates and test/review-it. Sample Rust crates live under mini/Cargo.toml so cargo install --git and copying mini/ alone do not hit nested-workspace conflicts.
For AI agents
Before editing this repository, read:
- AGENTS.md — repo map, agent workflows, rules, security checklist
- CLAUDE.md — concise quick reference for Claude/Cursor
Key constraints: no code comments in source files, no unwrap() in service code, never commit target/, test framework changes are high risk.