Testing
The Rust induscode agent is tested by inline #[cfg(test)] modules that live next to the code, a publish = false induscode-testkit harness that carries the agent-layer fakes, and 15 insta snapshot baselines that pin the ratatui TUI cell-for-cell against a TestBackend. Every test is network-free: a scripted ModelInvoker, a localhost wiremock transcript server, and tempfile sandboxes stand in for the model, the wire, and the home directory. Run them with cargo test -p induscode (or cargo nextest run).
Table of Contents
- Test philosophy
- Running the suite
- The induscode-testkit harness
- Vendored framework seams
- Agent-layer fakes
- Snapshot tests for the TUI
- Unit and integration layout
- Hermetic discipline
- The behavior-trace parity unit
- Green test count
- Relationship to the framework suite
Test philosophy
Tests live inside the crate in #[cfg(test)] mod tests blocks at the bottom
of each source file — the idiomatic Rust placement, not a parallel tests/ tree.
There is no conftest.py analogue and no shared-fixtures module compiled into the
product: each test module declares the doubles it needs, and the cross-cutting
doubles come from the dev-only induscode-testkit
crate, which is wired in as a path [dev-dependencies] entry.
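As a minimal sketch of that placement (the function `truncate_label` and its test are hypothetical and not taken from the crate), an inline module sits at the bottom of the file that owns the code and is compiled only for test builds:

```rust
/// Hypothetical product function: keep at most `max` characters of a label.
pub fn truncate_label(label: &str, max: usize) -> String {
    // Iterating over chars (not bytes) keeps multi-byte characters intact.
    label.chars().take(max).collect()
}

// Compiled only under `cargo test`, so nothing in here ships with the product.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn truncates_on_char_boundaries() {
        assert_eq!(truncate_label("héllo", 2), "hé");
    }

    #[test]
    fn short_labels_pass_through() {
        assert_eq!(truncate_label("ab", 5), "ab");
    }
}
```

Because the module is a child of the file it tests, `use super::*` also reaches private items, which is one reason this layout needs no separate fixtures module.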
The suite is built on three rules that make it deterministic and CI-portable across all three operating systems:
| Rule | How it is enforced |
|---|---|
| No network | A scripted ModelInvoker (ScriptedModel) replays recorded emission turns; the 11 wire connectors replay recorded SSE/NDJSON bodies through a localhost TranscriptServer (wiremock), never a real provider. |
| No real home | Real-fs tests run under a TempWorkspace (tempfile::TempDir, removed on drop) — never ~/.indusagi/. |
| No PTY | TUI tests draw into a ratatui TestBackend in-memory Buffer and assert against the cell grid — there is no terminal device, so the snapshot suite runs headless in CI. |
The product crate is induscode (the merged crate at crates/induscode), and its
two [[bin]] targets are indusr and indusagir — see
induscode (Rust) Overview for the binary surface.
Running the suite
cd indus-code-rust
# the whole agent suite — inline #[cfg(test)] modules + snapshot baselines
cargo test -p induscode
# faster, parallel, with a nicer summary
cargo nextest run -p induscode
# the testkit's own self-tests (the fakes and the fixture loader)
cargo test -p induscode-testkit
default-members = ["crates/induscode"] in the virtual workspace manifest means a
bare cargo test targets the product crate. The workspace has exactly three
members — crates/induscode, crates/induscode-testkit, and xtask.
Run one module or one test by path filter:
# every test in the conductor subsystem module
cargo test -p induscode conductor::
# one snapshot test
cargo test -p induscode draw_models_snapshot
Reviewing snapshot diffs
The TUI snapshots use insta. When a render changes, insta
fails the test and writes a .snap.new next to the baseline. Review and accept:
cargo install cargo-insta # one-time
cargo insta test -p induscode # run + collect pending snapshots
cargo insta review # interactive accept/reject of diffs
cargo insta accept # accept all pending (non-interactive)
The induscode-testkit harness
induscode-testkit (crates/induscode-testkit) is the dev-only test scaffold:
# crates/induscode-testkit/Cargo.toml
[package]
name = "induscode-testkit"
publish = false # dev-only support crate; never published
[dependencies]
induscode = { path = "../induscode" }
indusagi = { workspace = true }
ratatui = { workspace = true }
wiremock = "0.6"
tempfile = "3"
It does two things, declared in src/lib.rs:
- Vendors the framework's test seams (the scripted ModelInvoker, the ratatui TestBackend render harness, and the wiremock transcript server), built directly on the published indusagi umbrella crate.
- Declares the agent-layer fakes the framework testkit does not carry: FakeAgent, CaptureView, BehaviorTrace, the agent-local golden loader, and TempWorkspace.
The public surface is the flat re-export from lib.rs:
// crates/induscode-testkit/src/lib.rs
pub use render::{buffer_to_string, render_frame, render_lines};
pub use script::ScriptedModel;
pub use wire::TranscriptServer;
pub use capture_view::CaptureView;
pub use fake_agent::FakeAgent;
pub use trace::BehaviorTrace;
pub use workspace::TempWorkspace;
| Module | Type / fn exported | Role |
|---|---|---|
| render | render_frame, render_lines, buffer_to_string, TestBackend | The ratatui in-memory render harness for snapshot tests. |
| script | ScriptedModel | A ModelInvoker that replays recorded turns and panics past the script. |
| wire | TranscriptServer | A localhost wiremock server that serves recorded provider bodies. |
| fake_agent | FakeAgent, RecordedTurn | A network-free Agent-trait fake that records prompts and appends ack:{prompt}. |
| capture_view | CaptureView, EndOfInput | A headless InteractiveView capture seam for boot/runner tests with no TTY. |
| trace | BehaviorTrace, ToolCall | The normalized parity-trace unit compared across the TS and Rust agents. |
| fixture | fixtures_dir, fixture_path, load_text, load_json | The agent-local golden-corpus loader. |
| workspace | TempWorkspace | A tempfile-backed throwaway workspace, cleaned up on drop. |
Why the seams are vendored, not re-imported
The framework ships its own indusagi-testkit path crate, but that crate depends
on indusagi as a path dependency, which would put a second, type-incompatible
copy of indusagi in the graph alongside the crates.io indusagi 0.1.0 the agent
builds against. So induscode-testkit vendors ScriptedModel, the render_*
helpers, and TranscriptServer verbatim into src/{script,render,wire}.rs, built
on the published indusagi public contract types. That keeps the whole agent
workspace resolving a single indusagi crate (enforced by cargo deny check bans).
See the framework's testing strategy for the upstream originals.
Vendored framework seams
ScriptedModel — the scripted ModelInvoker
ScriptedModel (src/script.rs) is the Rust port of the TS scriptModel +
channelOf(emissions). It implements the framework's
indusagi::runtime::contract::ModelInvoker trait, replaying one recorded turn per
invoke call and panicking past the script so a runaway drive loop surfaces
as a failed assertion rather than a wedged suite:
// crates/induscode-testkit/src/script.rs
pub struct ScriptedModel {
    turns: Vec<Arc<Vec<Emission>>>,
    cursor: AtomicUsize,
}

impl ScriptedModel {
    pub fn new(turns: Vec<Vec<Emission>>) -> Self { /* … */ }
    pub fn single(turn: Vec<Emission>) -> Self { Self::new(vec![turn]) }
    pub fn calls(&self) -> usize { /* turn count issued so far */ }
}

impl ModelInvoker for ScriptedModel {
    fn invoke(&self, _conversation: Conversation, _options: StreamOptions) -> Channel {
        // returns a re-iterable Channel::of(..) for turns[cursor]; panics past the end
    }
}
The returned Channel is re-iterable (iterating it twice replays the same turn
without advancing the cursor), mirroring the production gateway stream contract.
model.calls() is asserted to pin the exact number of drive-loop turns. The
self-tests cover ordered replay, re-iterability, and the
#[should_panic(expected = "only 1 turn(s) recorded")] runaway guard.
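The replay-and-guard behavior can be sketched without the framework types. In the sketch below, `Emission` is a hypothetical stand-in for the framework's type, `Scripted` is a simplified model of the idea rather than the vendored implementation, and a shared `Arc<Vec<_>>` plays the role of the re-iterable `Channel`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the framework's `Emission` type.
#[derive(Clone, Debug, PartialEq)]
pub struct Emission(pub String);

/// Simplified scripted invoker: hands out one recorded turn per `invoke` call.
pub struct Scripted {
    turns: Vec<Arc<Vec<Emission>>>,
    cursor: AtomicUsize,
}

impl Scripted {
    pub fn new(turns: Vec<Vec<Emission>>) -> Self {
        Self {
            turns: turns.into_iter().map(Arc::new).collect(),
            cursor: AtomicUsize::new(0),
        }
    }

    /// Number of `invoke` calls made so far.
    pub fn calls(&self) -> usize {
        self.cursor.load(Ordering::SeqCst)
    }

    /// Returns the next recorded turn; panics once the script is exhausted,
    /// so a runaway loop fails loudly instead of hanging.
    pub fn invoke(&self) -> Arc<Vec<Emission>> {
        let index = self.cursor.fetch_add(1, Ordering::SeqCst);
        match self.turns.get(index) {
            Some(turn) => Arc::clone(turn),
            None => panic!("only {} turn(s) recorded", self.turns.len()),
        }
    }
}
```

The returned turn can be iterated any number of times without touching the cursor; only `invoke` advances it, which is what lets a test pin the turn count with `calls()`.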
TranscriptServer — the wiremock transcript replay
TranscriptServer (src/wire.rs) wraps a wiremock::MockServer so connector
tests replay a recorded provider body through the real framer + decoder + fold
rather than mocking the connector:
// crates/induscode-testkit/src/wire.rs
pub const SSE_CONTENT_TYPE: &str = "text/event-stream";
pub const NDJSON_CONTENT_TYPE: &str = "application/x-ndjson";
impl TranscriptServer {
    pub async fn start() -> Self { /* ephemeral localhost port */ }
    pub fn uri(&self) -> String { /* http://127.0.0.1:<port> */ }
    pub async fn serve_sse(&self, method: &str, path: &str, body: impl Into<String>);
    pub async fn serve_ndjson(&self, method: &str, path: &str, body: impl Into<String>);
    pub async fn serve_status(&self, method: &str, path: &str, status: u16);
}
serve_status drives the HTTP-status → GatewayError-kind table tests. The
self-tests use a dependency-free TcpStream GET (no HTTP client of their own) to
prove the SSE content-type and status codes round-trip.
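A dependency-free round-trip of that kind might look like the sketch below. It swaps wiremock for a plain `std::net::TcpListener` so it needs no crates; `serve_once` and `raw_get` are hypothetical names, and the real self-tests may be structured differently:

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Serves one canned SSE response on an ephemeral localhost port, then exits.
pub fn serve_once(body: &'static str) -> (SocketAddr, thread::JoinHandle<()>) {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind");
    let addr = listener.local_addr().expect("local addr");
    let handle = thread::spawn(move || {
        let (mut stream, _) = listener.accept().expect("accept");
        // Drain the request head so the client never sees a reset on close.
        let mut request = Vec::new();
        let mut chunk = [0u8; 512];
        while !request.windows(4).any(|w| w == b"\r\n\r\n") {
            let n = stream.read(&mut chunk).expect("read request");
            if n == 0 {
                break;
            }
            request.extend_from_slice(&chunk[..n]);
        }
        let response = format!(
            "HTTP/1.1 200 OK\r\ncontent-type: text/event-stream\r\ncontent-length: {}\r\nconnection: close\r\n\r\n{}",
            body.len(),
            body
        );
        stream.write_all(response.as_bytes()).expect("write response");
    });
    (addr, handle)
}

/// Raw GET over a TcpStream: returns the full response text (head and body).
pub fn raw_get(addr: SocketAddr, path: &str) -> String {
    let mut stream = TcpStream::connect(addr).expect("connect");
    let request = format!("GET {} HTTP/1.1\r\nhost: localhost\r\nconnection: close\r\n\r\n", path);
    stream.write_all(request.as_bytes()).expect("send request");
    let mut out = String::new();
    stream.read_to_string(&mut out).expect("read response");
    out
}
```

Binding to port 0 lets the OS pick a free port, which is what keeps parallel tests from colliding.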
The render harness
render.rs is the ratatui snapshot engine. Three functions cover the two render
shapes view code takes:
// crates/induscode-testkit/src/render.rs
pub fn render_frame<F>(w: u16, h: u16, draw: F) -> String
where F: FnOnce(&mut ratatui::Frame);
pub fn render_lines(w: u16, lines: &[ratatui::text::Line<'_>]) -> String;
pub fn buffer_to_string(buffer: &Buffer) -> String;
render_frame draws into a fixed w×h TestBackend and stringifies the buffer;
render_lines lays a slice of Line values (view functions such as the diff
renderer return a Vec<Line>) one per row; buffer_to_string flattens a Buffer
into snapshot text, keeping glyph layout only and dropping style. Color and
modifiers are asserted separately by the unit tests that read buf[(x, y)].bg
and similar fields. As the doc comment puts it, the snapshot pins the
flicker / wrapping / clamping contract, not the palette.
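The glyphs-only flattening can be illustrated with a toy grid. `Cell`, `Grid`, and `grid_to_string` below are hypothetical stand-ins for ratatui's buffer types and the vendored `buffer_to_string`, kept std-only so the sketch compiles on its own:

```rust
/// Hypothetical stand-in for a terminal cell: a glyph plus one style flag.
#[derive(Clone)]
pub struct Cell {
    pub symbol: String,
    pub bold: bool,
}

/// Hypothetical fixed-size grid, stored row-major like a terminal buffer.
pub struct Grid {
    pub width: usize,
    pub height: usize,
    pub cells: Vec<Cell>,
}

impl Grid {
    pub fn blank(width: usize, height: usize) -> Self {
        let cell = Cell { symbol: " ".to_string(), bold: false };
        Self { width, height, cells: vec![cell; width * height] }
    }

    /// Writes `text` from column `x` of row `y`, clipping at the right edge.
    pub fn put(&mut self, x: usize, y: usize, text: &str, bold: bool) {
        for (offset, ch) in text.chars().enumerate() {
            if x + offset >= self.width || y >= self.height {
                break;
            }
            self.cells[y * self.width + x + offset] = Cell { symbol: ch.to_string(), bold };
        }
    }
}

/// Flattens the grid to snapshot text: one line per row, glyphs only.
pub fn grid_to_string(grid: &Grid) -> String {
    let mut out = String::new();
    // `max(1)` avoids a zero chunk size for a degenerate zero-width grid.
    for row in grid.cells.chunks(grid.width.max(1)) {
        for cell in row {
            out.push_str(&cell.symbol);
        }
        out.push('\n');
    }
    out
}
```

Style never reaches the string, so a palette change leaves the snapshot untouched while a wrapping or clamping change does not; style is checked by reading the cell directly.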
Agent-layer fakes
FakeAgent — the Agent-trait fake
FakeAgent (src/fake_agent.rs) is the Rust port of the TS fakeAgent() from
conductor.test.ts / submit.test.ts / session-persist.test.ts. It records
each prompt and appends a RecordedTurn { prompt, ack: "ack:{prompt}" }, and
counts aborts, so boot/console tests assert on a count instead of wiring up a callback spy:
// crates/induscode-testkit/src/fake_agent.rs
pub struct RecordedTurn { pub prompt: String, pub ack: String }
impl FakeAgent {
    pub fn record_prompt(&self, input: &str); // appends ack:{input}
    pub fn record_abort(&self);
    pub fn prompts(&self) -> Vec<String>;
    pub fn turns(&self) -> Vec<RecordedTurn>;
    pub fn abort_count(&self) -> usize;
}
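One way such a recording fake could be built is sketched below. The method names follow the listing above, but the struct name `FakeAgentSketch` and its internals (a `Mutex`-guarded log and an atomic counter) are assumptions, not the crate's actual implementation:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

#[derive(Clone, Debug, PartialEq)]
pub struct RecordedTurn {
    pub prompt: String,
    pub ack: String,
}

/// Hypothetical recording fake; interior mutability lets tests share `&self`.
#[derive(Default)]
pub struct FakeAgentSketch {
    turns: Mutex<Vec<RecordedTurn>>,
    aborts: AtomicUsize,
}

impl FakeAgentSketch {
    /// Records the prompt together with its deterministic acknowledgement.
    pub fn record_prompt(&self, input: &str) {
        let turn = RecordedTurn {
            prompt: input.to_string(),
            ack: format!("ack:{}", input),
        };
        self.turns.lock().expect("turn log poisoned").push(turn);
    }

    pub fn record_abort(&self) {
        self.aborts.fetch_add(1, Ordering::SeqCst);
    }

    pub fn prompts(&self) -> Vec<String> {
        self.turns().into_iter().map(|turn| turn.prompt).collect()
    }

    pub fn turns(&self) -> Vec<RecordedTurn> {
        self.turns.lock().expect("turn log poisoned").clone()
    }

    pub fn abort_count(&self) -> usize {
        self.aborts.load(Ordering::SeqCst)
    }
}
```

Returning owned snapshots from `prompts()` and `turns()` keeps assertions simple: the test compares plain vectors and never holds a lock.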
CaptureView — the headless InteractiveView
CaptureView (src/capture_view.rs) is the headless InteractiveView capture
seam: render pushes events into a Vec, prompt pops the next scripted input
line and returns EndOfInput once the queue drains (the EOF-not-sentinel
discipline). Boot/runner tests then assert "this invocation drove the view with
these events and asked for input N times":
// crates/induscode-testkit/src/capture_view.rs
pub struct EndOfInput;
impl CaptureView {
    pub fn with_inputs<I, S>(inputs: I) -> Self where I: IntoIterator<Item = S>, S: Into<String>;
    pub fn next_input(&self) -> Result<String, EndOfInput>;
    pub fn rendered(&self) -> Vec<String>;
    pub fn prompt_count(&self) -> usize;
    pub fn is_closed(&self) -> bool;
}
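The scripted-input queue and the EOF discipline can be sketched as follows. `CaptureViewSketch` and its `render` method are hypothetical; the sketch assumes every call to `next_input` counts as a prompt (including the one that hits end of input) and leaves out `is_closed`, whose exact semantics the listing above does not state:

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Returned once the scripted input queue is drained.
#[derive(Debug, PartialEq)]
pub struct EndOfInput;

/// Hypothetical headless view: captures render events, replays scripted input.
pub struct CaptureViewSketch {
    inputs: Mutex<VecDeque<String>>,
    rendered: Mutex<Vec<String>>,
    prompts: Mutex<usize>,
}

impl CaptureViewSketch {
    pub fn with_inputs<I, S>(inputs: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            inputs: Mutex::new(inputs.into_iter().map(Into::into).collect()),
            rendered: Mutex::new(Vec::new()),
            prompts: Mutex::new(0),
        }
    }

    /// Captures one render event instead of painting it.
    pub fn render(&self, event: &str) {
        self.rendered.lock().expect("render log poisoned").push(event.to_string());
    }

    /// Pops the next scripted line; an empty queue is a real EOF, not a sentinel string.
    pub fn next_input(&self) -> Result<String, EndOfInput> {
        *self.prompts.lock().expect("prompt counter poisoned") += 1;
        self.inputs.lock().expect("input queue poisoned").pop_front().ok_or(EndOfInput)
    }

    pub fn rendered(&self) -> Vec<String> {
        self.rendered.lock().expect("render log poisoned").clone()
    }

    pub fn prompt_count(&self) -> usize {
        *self.prompts.lock().expect("prompt counter poisoned")
    }
}
```

Modelling end of input as an `Err` value rather than a magic string means a scripted line can never be mistaken for EOF.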
TempWorkspace — the temp-dir sandbox
TempWorkspace (src/workspace.rs) is the tempfile::TempDir-backed throwaway
directory the deck (checkpoint, read-edit-gate), session-persist, settings, and
@file-attachment suites seed files into. Each test gets its own, removed on drop,
so runs are isolated and parallel-safe:
// crates/induscode-testkit/src/workspace.rs
impl TempWorkspace {
    pub fn new() -> Self;
    pub fn root(&self) -> &Path;
    pub fn path(&self, rel: impl AsRef<Path>) -> PathBuf;
    pub fn write_file(&self, rel: impl AsRef<Path>, contents: impl AsRef<[u8]>) -> PathBuf;
    pub fn read_file(&self, rel: impl AsRef<Path>) -> String;
}
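The remove-on-drop pattern behind such a sandbox can be sketched with the standard library alone. `ScratchDir` below is a hypothetical stand-in: the real type wraps `tempfile::TempDir`, which handles unique naming more robustly than the process-id-plus-counter scheme assumed here:

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};

// Per-process counter so concurrent tests never share a directory name.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical std-only stand-in for a throwaway workspace.
pub struct ScratchDir {
    root: PathBuf,
}

impl ScratchDir {
    pub fn new() -> Self {
        let id = NEXT_ID.fetch_add(1, Ordering::SeqCst);
        let name = format!("scratch-{}-{}", std::process::id(), id);
        let root = std::env::temp_dir().join(name);
        fs::create_dir_all(&root).expect("create scratch dir");
        Self { root }
    }

    pub fn root(&self) -> &Path {
        &self.root
    }

    pub fn path(&self, rel: impl AsRef<Path>) -> PathBuf {
        self.root.join(rel)
    }

    /// Writes a file under the root, creating parent directories as needed.
    pub fn write_file(&self, rel: impl AsRef<Path>, contents: impl AsRef<[u8]>) -> PathBuf {
        let target = self.path(rel);
        if let Some(parent) = target.parent() {
            fs::create_dir_all(parent).expect("create parent dirs");
        }
        fs::write(&target, contents).expect("write file");
        target
    }

    pub fn read_file(&self, rel: impl AsRef<Path>) -> String {
        fs::read_to_string(self.path(rel)).expect("read file")
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; a failure here must not panic during unwinding.
        let _ = fs::remove_dir_all(&self.root);
    }
}
```

Tying cleanup to `Drop` means the directory disappears even when an assertion fails partway through a test.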
The agent-local golden loader
fixture.rs resolves tests/fixtures/ under this workspace root (the agent
goldens — export_html.json, sgr_corpus.json, system_prompt.json,
wire_ndjson.json, slash_catalog.json, … — dumped from the TS agent, never
hand-edited). It is anchored to CARGO_MANIFEST_DIR walked up two levels, so it
resolves identically no matter which test invoked it. The framework's own loader
resolves under the framework root and cannot reach the agent corpus, so this thin
mirror exists:
// crates/induscode-testkit/src/fixture.rs
pub fn fixtures_dir() -> PathBuf; // <agent-workspace>/tests/fixtures
pub fn fixture_path(name: &str) -> PathBuf;
pub fn load_text(name: &str) -> String; // panics loudly on a missing fixture
pub fn load_json<T: DeserializeOwned>(name: &str) -> T;
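The "walked up two levels" anchoring can be sketched as a pure path computation. The `_from` variants below are hypothetical: they take the manifest directory as a parameter so the sketch compiles outside Cargo, whereas inside the crate the value would presumably come from `env!("CARGO_MANIFEST_DIR")`:

```rust
use std::path::{Path, PathBuf};

/// Resolves `<workspace>/tests/fixtures` from a crate manifest directory that
/// sits two levels below the workspace root, e.g. `<workspace>/crates/<crate>`.
pub fn fixtures_dir_from(manifest_dir: &Path) -> PathBuf {
    // `ancestors()` yields the path itself first, so index 2 is the grandparent.
    let workspace_root = manifest_dir
        .ancestors()
        .nth(2)
        .expect("manifest dir should sit two levels below the workspace root");
    workspace_root.join("tests").join("fixtures")
}

/// Resolves one named fixture under the fixtures directory.
pub fn fixture_path_from(manifest_dir: &Path, name: &str) -> PathBuf {
    fixtures_dir_from(manifest_dir).join(name)
}
```

Anchoring on the manifest directory rather than the current working directory is what makes the result independent of where the test was launched from.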
Snapshot tests for the TUI
The console renderer is pinned by 15 insta snapshot baselines under
src/console/**/snapshots/. Each snapshot test draws a component into a fixed-size
TestBackend, flattens the buffer to a string, and calls insta::assert_snapshot!.
insta is a [dev-dependencies] of induscode alongside induscode-testkit,
proptest, tempfile, assert_cmd, predicates, and ratatui.
A representative test — the model picker overlay:
// crates/induscode/src/console/overlays/models.rs
#[test]
fn draw_models_snapshot() {
    let theme = theme();
    let mut terminal = Terminal::new(TestBackend::new(64, 20)).expect("terminal");
    terminal
        .draw(|f| draw_models(f, f.area(), &theme, sample(), 0))
        .expect("draw");
    let buf = terminal.backend().buffer().clone();
    let mut out = String::new();
    for y in 0..20u16 {
        for x in 0..64u16 {
            out.push_str(buf[(x, y)].symbol());
        }
        out.push('\n');
    }
    insta::assert_snapshot!(out);
}
A .snap baseline is a YAML header (source: + expression:) followed by the
literal rendered cells. For example the approval-dialog baseline
(induscode__console__overlays__approval__tests__render_snapshot.snap) records the
full bordered box:
---
source: crates/induscode-console/src/overlays/approval.rs
expression: "render_to_string(&dialog(), 70, 16)"
---
╭────────────────────────────────────────────────────────────────────╮
│ Permission │
│ Allow Bash to run? │
│ npm run test │
│ Allow-always remembers: Bash(npm run test:*) │
│ > Allow once — run this one call │
│ Allow always — run it and remember the tool for this session │
│ Deny — block this call │
│ Up/Down to move, Enter selects, Esc denies │
╰────────────────────────────────────────────────────────────────────╯
The 15 baselines, by area:
| Snapshot file (stem) | Area | What it pins |
|---|---|---|
| overlays__approval__tests__render_snapshot | Dialogs | The permission approval dialog box. |
| overlays__models__tests__draw_models_snapshot | Dialogs | The model picker list. |
| overlays__scoped_models__tests__draw_scoped_models_snapshot | Dialogs | The per-scope model routing editor. |
| overlays__oauth__tests__draw_oauth_snapshot | Dialogs | The OAuth device-flow prompt. |
| overlays__signin__tests__draw_signin_snapshot | Dialogs | The provider sign-in overlay. |
| overlays__signout__tests__draw_signout_snapshot | Dialogs | The sign-out overlay. |
| overlays__sessions__tests__draw_sessions_snapshot | Dialogs | The session resume picker. |
| overlays__settings__tests__draw_settings_snapshot | Dialogs | The settings overlay. |
| overlays__theme__tests__draw_theme_snapshot | Dialogs | The theme picker. |
| overlays__tree__tests__draw_tree_snapshot | Dialogs | The transcript-tree navigator. |
| overlays__user_turns__tests__draw_user_turns_snapshot | Dialogs | The user-turns history view. |
| overlays__plugin__tests__render_snapshot | Dialogs | The plugin/MCP overlay. |
| console__slash__tests__catalog_listing_snapshot | Slash commands | The full slash catalog listing (static + dynamic splice + collision guard). |
| view__chrome__tests__footer_stats_threshold_snapshot | Console | The footer stats line at the visibility threshold. |
| view__chrome__tests__live_theme_switch_snapshot | Theming | The same chrome rendered under midnight then daylight. |
Snapshot tests assert glyph layout; cell-level style (the live theme switch, gutter
highlight) is checked by sibling assertion tests that read buf[(x, y)] directly —
e.g. approval.rs::render_shows_header_args_suggestion_and_choices asserts the
> Allow once selection gutter on the highlighted row.
Unit and integration layout
There is no parallel tests/ integration tree in the product crate — every test
is an inline #[cfg(test)] mod tests block, so unit and integration tests sit in
the file that owns the code. The module tree under crates/induscode/src/ mirrors
the subsystem layout (see Architecture). Inline test
attributes (#[test] / #[tokio::test] / #[should_panic]) break down roughly:
| Module path | Tests (≈) | Covers |
|---|---|---|
| console/ | 355 | Reducer, input/intents/chords, theme, banner + chrome, the overlay router, the slash handler, plus the 15 snapshot tests. See Console overview. |
| conductor/ | 237 | Catalog gate, model matcher, transcript-tree append/branch/fork, NDJSON serialize round-trip, submit/resume, queue + fork, contract + signal hub. See Conductor. |
| launch/ | 111 | The flag table, invocation parser, usage renderer, package command. See Launch. |
| deck/ | 98 | Card catalog, builtin cards, provisioning, the contract + bridge ledger. See Capability Deck. |
| briefing/ | 83 | System-prompt composition. See Briefing. |
| boot/ | 82 | Tokenize → mode mapping, workspace resolution, run_stages, runner selection. See Boot. |
| core/ | 71 | Brand identity, guardrails, workspace layout, sessions, settings, the kit helpers. See Settings / Sessions. |
| addons/ | 71 | The addon host pipeline. See Addons. |
| insight/ | 51 | The tracing wrapper over the framework. See Insight. |
| window_budget/ | 48 | Context-window budgeting + condense scope. See Window Budget. |
| transcript_export/ | 36 | Markdown/HTML transcript export. See Transcript Export. |
| channels/ | 30 | The non-interactive print/JSON channels. See Channels. |
| runtime_bridge/ | 27 | External-runtime provider routing. See Runtime Bridge. |
The three layers of fidelity:
- Pure-data unit tests over dependency-free reducers, folds, and matchers (e.g. the console reducer, the conductor model matcher, the diff renderer).
- Seam / protocol tests over injected doubles: ScriptedModel for the model, FakeAgent for the Agent trait, TranscriptServer for the wire, CaptureView for the interactive view, TempWorkspace for real-fs tools.
- Snapshot tests drawing real components into a TestBackend.
Hermetic discipline
The whole suite is sealed against the outside world:
- No model call. ScriptedModel replays recorded Vec<Emission> turns; its cursor, exposed through calls(), doubles as a runaway-loop guard, because invoke panics past the script.
- No real provider. The 11 wire connectors replay recorded SSE/NDJSON bodies through the real framer/decoder over a localhost TranscriptServer. The HTTP-status → error-kind table is driven by serve_status.
- No real home. Real-fs tests run under TempWorkspace; the goldens load from the agent-local tests/fixtures/ via the fixture loader.
- No PTY. TUI tests draw into a ratatui TestBackend and assert against the in-memory Buffer, so the snapshot suite runs in CI on all three OSes with no terminal device.
The behavior-trace parity unit
BehaviorTrace (src/trace.rs) is the comparison unit for the behavioral-parity
harness, which runs the real TS agent and the Rust agent on the same scripted
scenario and asserts the observable behavior is identical. It is a normalized,
order-preserving record of what the agent decided — not how it painted (that is
covered by snapshots and goldens):
// crates/induscode-testkit/src/trace.rs
pub struct ToolCall { pub name: String, pub args: String, pub is_error: bool }

pub struct BehaviorTrace {
    pub tool_calls: Vec<ToolCall>,
    pub final_text: String,
    pub settle_phase: String,
    pub session_node_count: usize,
}

impl BehaviorTrace {
    pub fn normalize(self) -> Self; // strips timestamps/ULIDs/abs temp paths → placeholders
}
It derives PartialEq + Eq + Serialize + Deserialize, so a parity scenario
reduces to assert_eq!(rust_trace.normalize(), ts_trace.normalize()) and round-trips
through JSON. For the milestone-by-milestone parity ledger this encodes, see the
crate's PARITY_REPORT.md and the TS / Python editions.
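To make the normalize-then-compare idea concrete, here is a reduced sketch. `Trace` is a hypothetical std-only stand-in without the serde derives; its `normalize` takes the temp root as an explicit argument and scrubs only absolute temp paths, while the real method takes no argument and also handles timestamps and ULIDs. The `<TMP>` placeholder and the field values used with it are likewise assumptions:

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ToolCall {
    pub name: String,
    pub args: String,
    pub is_error: bool,
}

/// Hypothetical reduced trace with the same four fields as the listing above.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Trace {
    pub tool_calls: Vec<ToolCall>,
    pub final_text: String,
    pub settle_phase: String,
    pub session_node_count: usize,
}

impl Trace {
    /// Replaces the run-specific temp root with a stable placeholder.
    /// `temp_root` is expected to be non-empty.
    pub fn normalize(mut self, temp_root: &str) -> Self {
        let scrub = |text: &str| text.replace(temp_root, "<TMP>");
        for call in &mut self.tool_calls {
            call.args = scrub(&call.args);
        }
        self.final_text = scrub(&self.final_text);
        self
    }
}

/// Builds a one-call trace for the usage example; the values are illustrative.
pub fn sample_trace(temp_root: &str) -> Trace {
    Trace {
        tool_calls: vec![ToolCall {
            name: "read".to_string(),
            args: format!("{}/notes.txt", temp_root),
            is_error: false,
        }],
        final_text: "done".to_string(),
        settle_phase: "idle".to_string(),
        session_node_count: 3,
    }
}
```

Two runs that differ only in their sandbox path compare unequal as recorded and equal once normalized, which is the property a parity scenario relies on.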
Green test count
The inline test attributes discoverable in source total roughly 1,344:
~1,324 #[test] / #[tokio::test] attributes in the induscode product crate
(including the 15 TUI snapshot tests) plus 20 self-tests in induscode-testkit.
A #[should_panic] test also carries #[test], so it is counted once. The
framework underneath (indusagi-rust) carries its own separately-counted suite;
see the framework testing page. All milestones (M0 → M-final) are declared
GREEN with zero failures, flake-free across repeated runs, and
clippy -D warnings clean.
# reproduce the inline-test attribute count yourself
grep -rE '^\s*#\[(tokio::)?test' crates/induscode/src | wc -l # ≈ 1324
grep -rE '^\s*#\[(tokio::)?test' crates/induscode-testkit/src | wc -l # 20
Relationship to the framework suite
The agent suite imports both from the crate under test and from the published
indusagi framework it builds on. ScriptedModel implements the framework's
indusagi::runtime::contract::ModelInvoker; the chrome snapshot tests use
indusagi::tui_render::bridge::AgentMessage,
indusagi::tui_render::theme_adapter::create_theme_adapter, and
indusagi::tui_render::types::{SessionStats, StatsTokens, StatusKind}. That makes
the suite an integration boundary: a framework change can move these tests even
with no induscode change. The framework has its own
test suite; this one sits on top of it, resolving a
single published indusagi crate (the reason the framework testkit seams are
vendored rather than re-imported).
See also: Architecture for the module map,
Console overview for the rendered surface the
snapshots pin, and Conductor for the drive loop
ScriptedModel and FakeAgent exercise.
