Testing
The induscode rebuild ships a 703-test, network-free pytest suite under tests/. Run it with .venv/bin/python -m pytest. Every test runs against a mock connector and the Textual Pilot driver: no LLM calls, no real home directory, no sleeps.
Table of Contents
- Running the suite
- Hermetic discipline
- Layout: one directory per subsystem
- Coverage map
- The two root guards
- Three layers of fidelity
- The console end-to-end harness
- Key fixtures and helpers
- The 703 vs 683 gap
- Framework as a test dependency
Running the suite
There is no conftest.py and no shared-fixtures module — every test file
defines its own fakes inline. All configuration lives in pyproject.toml
under [tool.pytest.ini_options]:
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
asyncio_mode = "auto" means the ~240 async def test_* functions (mostly the
Textual Pilot drives) run under pytest-asyncio with no per-test marker.
testpaths = ["tests"] pins collection to the tests/ tree.
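As a minimal sketch of what that setting buys, assuming pytest-asyncio is installed and asyncio_mode = "auto" is active, a plain async def test is collected and awaited with no decorator. The helper fetch_greeting and the test name are hypothetical and do not come from the suite.

```python
import asyncio


async def fetch_greeting() -> str:
    # Hypothetical coroutine under test. sleep(0) only yields to the event
    # loop once; it is not a wall-clock wait.
    await asyncio.sleep(0)
    return "hi"


# No @pytest.mark.asyncio marker: in auto mode pytest-asyncio treats every
# async def test_* function as an asyncio test and awaits it.
async def test_fetch_greeting() -> None:
    assert await fetch_greeting() == "hi"
```

Under pytest-asyncio's strict mode the same function would need an explicit @pytest.mark.asyncio marker.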
cd induscode-python-rebuild
# the whole suite — 703 tests, mock connector + Textual Pilot
.venv/bin/python -m pytest
Run one subsystem, or one file:
# 110 catalog/matcher/serialize/submit/queue/fork cases
.venv/bin/python -m pytest tests/conductor/
# the scripted Pilot end-to-end scenarios over the real ConsoleApp
.venv/bin/python -m pytest tests/console/test_e2e_pilot.py
Reproduce the per-directory collected count yourself:
.venv/bin/python -m pytest --collect-only -q \
| grep '::' \
| sed -E 's#tests/([^/]+)/.*#\1#' \
| sort | uniq -c | sort -rn
# 324 console, 110 conductor, 36 console_slash, 34 launch, 34 capability_deck, 33 boot, ...
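The same tally can be sketched in Python for platforms without grep and sed. This is an illustrative equivalent, not part of the repository; count_by_directory is a hypothetical helper, and unlike the sed expression it groups root-level files under their file path instead of leaving each test id as its own line.

```python
import re
from collections import Counter

# Captures the directory directly under tests/, like the sed expression above.
_SUBDIR = re.compile(r"^tests/([^/]+)/")


def count_by_directory(lines):
    """Count collected test ids per top-level tests/ directory."""
    counts = Counter()
    for line in lines:
        if "::" not in line:  # skip the summary line and blank lines
            continue
        match = _SUBDIR.match(line)
        # Root-level files (tests/test_x.py::...) have no subdirectory,
        # so they are grouped under their file path.
        key = match.group(1) if match else line.split("::", 1)[0]
        counts[key] += 1
    return counts.most_common()  # highest count first
```

Feed it the lines of the pytest --collect-only -q output, for example the stdout of a subprocess.run call split with splitlines().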
Hermetic discipline
The whole suite is sealed off from the outside world along three axes:
| Axis | Rule | How it is enforced |
|---|---|---|
| No network | Nothing talks to a real model | A mock connector / ScriptedConductor stands in for the LLM and streams deterministic deltas |
| No real home | No test touches ~/.pindusagi/ | Every disk test runs under tmp_path, with BRAND.env_profile_dir (INDUSAGI_CODING_AGENT_DIR) pinned beneath it and INDUSAGI_HOME deleted |
| No sleeps | No wall-clock timing | Textual Pilot.pause(), the bounded wait_until() poller, and app.workers.wait_for_complete() synchronize against the message pump instead |
This makes the suite deterministic and fast (full collection in ~1s) — and it makes streaming and abort moments observable without races, because the Conductor fake can park mid-stream rather than depending on real latency.
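The "no real home" row can be sketched with the standard library alone. The environment variable names come from the table above; sealed_home is a hypothetical helper, and the suite itself presumably does the equivalent through pytest's tmp_path plus an environment patch (an assumption, since the fixture body is not shown here).

```python
import os
from contextlib import contextmanager
from pathlib import Path
from unittest import mock


@contextmanager
def sealed_home(tmp_path: Path):
    """Point the profile dir under tmp_path and hide any real INDUSAGI_HOME."""
    profile = tmp_path / ".pindusagi"
    profile.mkdir(parents=True, exist_ok=True)
    # patch.dict snapshots os.environ and restores it on exit, including
    # keys deleted inside the block.
    with mock.patch.dict(os.environ, {"INDUSAGI_CODING_AGENT_DIR": str(profile)}):
        os.environ.pop("INDUSAGI_HOME", None)
        yield profile
```

Code that resolves its workspace from the environment inside the with block sees only the temporary directory, and the caller's environment comes back unchanged afterwards.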
Layout: one directory per subsystem
tests/ mirrors src/induscode/ one-to-one: tests/<subsystem>/test_*.py
maps to src/induscode/<subsystem>/. Two flat files at the tests/ root guard
the package as a whole (see the two root guards).
tests/
├── test_public_api.py # frozen-contract / lazy-barrel guard
├── test_scaffold.py # M0 version, BRAND identity, Workspace paths
├── console/ # 13 files — the interactive surface
├── conductor/ # catalog, submit, queue/fork, contract+hub+skill
├── console_slash/ # the framework-backed slash registry
├── launch/ # the flag table, usage, package command
├── capability_deck/ # the tooling layer + MCP bridge ledger
├── boot/ # the launch orchestrator
├── runtime_bridge/ # external-runtime provider routing
├── addons/ # the addon host pipeline
├── settings/ # the PreferenceStore
├── window_budget/ # context-window budgeting
├── kit/ # leaf helpers
├── briefing/ # system-prompt composition
├── sessions/ # the SessionLibrary
├── insight/ # the tracing wrapper
├── transcript_export/ # markdown/highlight export
└── channels/ # the non-interactive channels
Coverage map
The 703 collected tests break down by top-level directory as follows. Each file's docstring names the source area it exercises.
| Directory | Tests | What it covers |
|---|---|---|
| console/ | 324 | Reducer, input/intents/chords/completion, theme, banner + chrome, overlays (ModalKind router), startup survey, the slash handler + command groups (dynamic / integrations / transcript / workbench), and the two Pilot files (test_console_app.py, test_e2e_pilot.py). See Console overview. |
| conductor/ | 110 | catalog_store (tolerant catalog gate, ModelMatcher scoring constants, transcript-tree append/branch/fork, NDJSON serialize round-trip), submit/resume behavior, queue + fork, and contract + signal-hub + skill-parse. See Conductor. |
| console_slash/ | 36 | The framework-backed slash registry (build_registry, Handled) exercised end-to-end with no TUI and no real conductor. See Slash commands. |
| launch/ | 34 | The flag table and read_invocation parser, usage renderer, plus the package command. See Launch. |
| capability_deck/ | 34 | The contract + bridge-ledger (pure, python-ulid keys) and card provisioning / novel cards. See Capability Deck. |
| boot/ | 33 | tokenize_invocation flag→mode mapping, --help/--version short-circuit, workspace resolution, idempotent apply_upgrades, run_stages BootContext, select_runner; plus the invocation projection, the --resume picker seam, and session-persist integration. See Boot. |
| runtime_bridge/ | 18 | External-runtime provider routing. See Runtime Bridge. |
| addons/ | 14 | The addons host pipeline (no real import, no disk for host-pipeline cases). See Addons. |
| settings/ | 11 | The PreferenceStore. See Settings. |
| window_budget/ | 10 | Context-window budgeting incl. the condense-scope branch digest. See Window Budget. |
| kit/ | 10 | The leaf helper kit. |
| briefing/ | 10 | System-prompt composition. See Briefing. |
| sessions/ | 9 | The SessionLibrary. See Sessions. |
| insight/ | 9 | The tracing wrapper over the framework. See Insight. |
| transcript_export/ | 8 | The markdown/highlight transcript export (markdown-it-py + pygments). See Transcript Export. |
| channels/ | 8 | The non-interactive channels. See Channels. |
| test_public_api.py | 7 | The frozen-contract / lazy-barrel guard. |
| test_scaffold.py | 17 | The M0 version / brand / workspace gate. |
The two root guards
Two flat files sit at the root of tests/ and verify the package as a whole
rather than any one subsystem.
`tests/test_public_api.py` — the frozen-contract guard
This file proves the public surface stays frozen and the lazy barrel stays lazy. See Package exports for the contract it enforces.
| Name | Kind | Purpose |
|---|---|---|
| EXPECTED_SUBSYSTEMS | const | The 17-name tuple (addons, boot, briefing, capability_deck, channels, conductor, console, console_slash, insight, kit, launch, runtime_bridge, sessions, settings, transcript_export, window_budget, workspace) the top-level barrel must lazily re-export. |
| test_all_modules_import | function | Walks every module under induscode via pkgutil.walk_packages and imports each (the import-can't-fail gate). |
| test_declared_exports_resolve | function | For every module's __all__, asserts the list is sorted and every named symbol resolves (the frozen-contract discipline). |
| test_barrel_reexports_every_subsystem | function | Asserts each EXPECTED_SUBSYSTEMS name resolves through the barrel and re-exports the matching induscode.<name> package. |
| test_barrel_subsystems_match_packages | function | Cross-checks EXPECTED_SUBSYSTEMS against both the on-disk packages and induscode._SUBSYSTEMS. |
| test_bare_import_is_side_effect_free | function | Runs import induscode in a clean subprocess and asserts no textual.* and no induscode subsystem (beyond workspace/brand/locator) got eagerly loaded; this proves the PEP 562 lazy barrel. |
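The __all__ discipline behind test_declared_exports_resolve can be sketched as a small checker. The function names and the message wording are hypothetical; only the two rules (sorted list, every name resolves) and the use of pkgutil.walk_packages come from the table above.

```python
import importlib
import pkgutil


def export_problems(module):
    """Return human-readable problems with a module's __all__, if any."""
    exported = getattr(module, "__all__", None)
    if exported is None:
        return []  # nothing declared, nothing to check
    problems = []
    if list(exported) != sorted(exported):
        problems.append(f"{module.__name__}.__all__ is not sorted")
    for name in exported:
        if not hasattr(module, name):
            problems.append(f"{module.__name__}.{name} does not resolve")
    return problems


def all_export_problems(package_name):
    """Import a package and every module beneath it, checking each __all__."""
    package = importlib.import_module(package_name)
    problems = export_problems(package)
    prefix = package.__name__ + "."
    for info in pkgutil.walk_packages(package.__path__, prefix):
        problems += export_problems(importlib.import_module(info.name))
    return problems
```

A guard test would then assert that all_export_problems("induscode") returns an empty list; importing every module along the way doubles as the import-can't-fail gate.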
EXPECTED_SUBSYSTEMS is hardcoded here on purpose — kept in the test, not
imported from the barrel, so a regression that silently drops a subpackage is
caught. Adding or removing a subpackage requires editing this tuple, or several
of these tests fail.
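The clean-subprocess technique used by test_bare_import_is_side_effect_free can be sketched as follows. The helper eagerly_loaded is hypothetical and only illustrates the idea: import the package in a fresh interpreter, then inspect sys.modules for names that should not be there.

```python
import subprocess
import sys


def eagerly_loaded(package, forbidden_prefixes):
    """Import `package` in a fresh interpreter; list forbidden modules it pulled in."""
    probe = (
        "import sys\n"
        f"import {package}\n"
        "print('\\n'.join(sorted(sys.modules)))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,  # a failing import surfaces as CalledProcessError
    )
    loaded = result.stdout.splitlines()
    return [
        name
        for name in loaded
        if any(name == p or name.startswith(p + ".") for p in forbidden_prefixes)
    ]
```

For this package the call would look something like eagerly_loaded("induscode", ["textual", "induscode.console"]), with an empty list as the passing result. A subprocess is needed because the test runner's own interpreter has usually imported those modules already.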
`tests/test_scaffold.py` — the M0 gate
This file pins identity and path resolution:
- Version single-sourcing. It reads the version from importlib.metadata.version("induscode") with no hardcoded literal, so a version bump in pyproject.toml never requires a test edit. It asserts induscode.__version__ == induscode.VERSION == metadata.version("induscode").
- Frozen BRAND identity. BRAND.name == "induscode", BRAND.profile_dir_name == ".pindusagi" (the flat root; state lives at ~/.pindusagi/), BRAND.bin_names == ("pindus", "induscode"), and BRAND.env_profile_dir == "INDUSAGI_CODING_AGENT_DIR".
- Sandboxed Workspace path resolution and the 15-key LAYOUT, asserted against induscode.workspace.LAYOUT, the Workspace dataclass fields, and the verbatim basenames.
TS_LAYOUT_KEYS = (...) # 15 layout keys, snake-cased
# asserted against induscode.workspace.LAYOUT and Workspace dataclass fields
Three layers of fidelity
Tests come in three layers of increasing integration:
1. Pure-data unit tests over dependency-free reducers, folds, and matchers. tests/console/test_reducer.py folds events through console_reducer asserting purity and no-op identity; tests/conductor/test_catalog_store.py pins ModelMatcher scoring constants and NDJSON serialize round-trips.

    from induscode.console import init_console_state, console_reducer, RowsAppend, ViewRow

    state = init_console_state()
    nxt = console_reducer(state, RowsAppend(row=ViewRow(id="r1", kind="answer", text="hi")))
    assert nxt is not state and len(state.rows) == 0  # input state untouched

2. Protocol / seam tests over injected stubs: CatalogSource stubs, the memory_backend / fs_backend stores, and the SessionConductor protocol faked by ScriptedConductor.
3. Full Textual end-to-end drives in tests/console/test_e2e_pilot.py and tests/console/test_console_app.py that mount the real ConsoleApp via app.run_test(size=...) and drive it through Pilot.
The console end-to-end harness
The e2e files mount the actual ConsoleApp and drive it the way a user would —
typing keys, clicking, escaping — then synchronize against the message pump.
app, scripted = build_app(tmp_path)
async with app.run_test(size=(100, 40)) as pilot:
    await pilot.pause()                        # let the app finish mounting
    await pilot.press("h", "i", "enter")       # type "hi" and submit
    await app.workers.wait_for_complete()      # wait for the submit worker
    assert scripted.submitted == ["hi"]
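Where a single pause is not enough, the suite polls with wait_until, described elsewhere on this page as a bounded pilot.pause loop with a default of 100 tries. A minimal sketch of such a poller follows; the signature and the error message are assumptions, and the real helper may differ.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until(
    predicate: Callable[[], bool],
    pause: Callable[[], Awaitable[None]],
    tries: int = 100,
) -> None:
    """Yield to the message pump until predicate() holds, at most `tries` times."""
    for _ in range(tries):
        if predicate():
            return
        await pause()  # in a Textual test this would be pilot.pause
    if not predicate():
        raise AssertionError(f"condition not met after {tries} pumps")
```

Bounding the loop by pump iterations rather than elapsed time keeps a failing condition from hanging the run, and it never depends on the wall clock.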
These tests go further than buffer checks: they read
app.screen._compositor.render_strips() to assert against the actually-painted
terminal cells (and caret visibility). That is a deliberate fix — a regression
where the prompt box drew its border but the content row stayed blank slipped
past buffer-only checks (editor.get_text()) and only a rendered-cell assertion
caught it. Several e2e classes are explicit regression pins with long
explanatory docstrings:
| Test class | Bug it pins |
|---|---|
| TestEditorRendersTypedText | The prompt box drew its border but the content row was blank; asserts against rendered cells. |
| TestEditorKeyboardFocus | Text could not be typed into the composer; documents the non-focusable-scroll-containers fix. |
Key fixtures and helpers
Because there is no conftest.py, the same fake patterns recur per-file. The
central console e2e helpers in tests/console/test_e2e_pilot.py:
| Name | Kind | Purpose |
|---|---|---|
| ScriptedConductor | class | A deterministic fake satisfying the SessionConductor protocol (~25 methods implemented inline). submit() streams scripted TextSignal deltas and can park mid-stream (hold_turn_end / hang) so a Pilot test observes the live tail or aborts at a precise point. |
| build_app | function | Assembles a real ConsoleApp over a fresh ScriptedConductor with the live transcript + workbench slash groups; returns (app, scripted) for app.run_test() drives. |
| wait_until | async function | A bounded message-pump poll helper (a pilot.pause loop, default 100 tries); the no-sleep synchronization primitive used throughout the console e2e tests. |
| make_services | function | Builds an OverlayServices wiring a PreferenceStore.at_paths, a SessionLibrary, a FakeVault, and stubbed login callbacks over tmp_path for overlay scenarios. |
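The park-mid-stream behavior of ScriptedConductor can be illustrated with a much smaller toy fake. ParkingConductor and its attributes are hypothetical; only the hold_turn_end idea is taken from the table above, and the real class streams TextSignal deltas over a far larger protocol.

```python
import asyncio


class ParkingConductor:
    """Toy fake: streams scripted deltas, then optionally parks before turn end."""

    def __init__(self, deltas, hold_turn_end=False):
        self.deltas = list(deltas)
        self.hold_turn_end = hold_turn_end
        self.submitted = []
        self.emitted = []
        self.finished = False
        self._release = asyncio.Event()

    def release(self):
        self._release.set()  # let a parked turn run to completion

    async def submit(self, text):
        self.submitted.append(text)
        for delta in self.deltas:
            self.emitted.append(delta)
            await asyncio.sleep(0)  # zero-delay yield between deltas
        if self.hold_turn_end:
            await self._release.wait()  # park until the test releases us
        self.finished = True


async def demo():
    fake = ParkingConductor(["he", "llo"], hold_turn_end=True)
    turn = asyncio.create_task(fake.submit("hi"))
    while len(fake.emitted) < 2:
        await asyncio.sleep(0)
    mid_stream = (list(fake.emitted), fake.finished)  # observed while parked
    fake.release()
    await turn
    return mid_stream, fake.finished
```

Because the fake parks on an event instead of waiting on real latency, the test decides exactly when the turn ends, which is what makes a live-tail or abort assertion free of races.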
The boot sandbox fixture in tests/boot/test_boot.py:
| Name | Kind | Purpose |
|---|---|---|
| sandbox_home | fixture | Gives a fresh tmp_path home with BRAND.env_profile_dir pinned beneath it and INDUSAGI_HOME deleted, so boot() resolves a hermetic workspace. |
And the reducer exhaustiveness pin in tests/console/test_reducer.py:
| Name | Kind | Purpose |
|---|---|---|
| _REPRESENTATIVES | const | One ConsoleEvent per discriminant fed through console_reducer, the analogue of union-exhaustiveness, asserted to cover all CONSOLE_EVENT_TYPES. |
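The exhaustiveness pin can be sketched on a toy event union. Every name below (RowAdded, Cleared, EVENT_TYPES, reducer) is a hypothetical stand-in for the real ConsoleEvent types, CONSOLE_EVENT_TYPES, and console_reducer.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class RowAdded:
    type: ClassVar[str] = "row_added"  # the discriminant
    text: str


@dataclass(frozen=True)
class Cleared:
    type: ClassVar[str] = "cleared"


# Hypothetical stand-in for CONSOLE_EVENT_TYPES.
EVENT_TYPES = frozenset({"row_added", "cleared"})

# One representative event per discriminant.
REPRESENTATIVES = (RowAdded(text="hi"), Cleared())


def reducer(rows: tuple, event) -> tuple:
    if isinstance(event, RowAdded):
        return rows + (event.text,)
    if isinstance(event, Cleared):
        return ()
    raise TypeError(f"unhandled event: {event!r}")


def test_representatives_cover_every_event_type() -> None:
    # Adding a new event type without a representative fails here.
    assert {event.type for event in REPRESENTATIVES} == EVENT_TYPES
    for event in REPRESENTATIVES:
        reducer((), event)  # must not raise for any known discriminant
```

Python has no compile-time exhaustiveness check for unions, so the set comparison plays that role at test time.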
The 703 vs 683 gap
The headline count is 703 collected tests (confirmed by pytest --collect-only), while there are 683 raw def test_ declarations. The 20-test
gap is @pytest.mark.parametrize expansion — there are exactly three
parametrize sites in the whole suite, two in
tests/conductor/test_catalog_store.py and one in
tests/console/test_console_app.py.
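The arithmetic behind the gap can be written out. A parametrized function is one def test_ declaration but is collected once per case, so each site adds (cases - 1) to the collected count. With three sites and a gap of 20, the sites must carry 23 cases in total. The sketch below assumes each site is a single, unstacked parametrize decorator; the 10/8/5 split in the usage note is purely hypothetical, since the per-site counts are not stated here.

```python
def collected_count(raw_defs: int, cases_per_site: list[int]) -> int:
    """Collected tests when each site parametrizes exactly one test function."""
    # Each site turns 1 declaration into `cases` collected items.
    return raw_defs + sum(cases - 1 for cases in cases_per_site)


def total_cases_needed(raw_defs: int, collected: int, sites: int) -> int:
    """Total parametrize cases implied by the gap between the two counts."""
    return (collected - raw_defs) + sites
```

For example, total_cases_needed(683, 703, 3) gives 23, and any split that sums to 23, such as collected_count(683, [10, 8, 5]), lands back on 703.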
Framework as a test dependency
The suite imports from both the package under test and the upstream indusagi
framework it builds on — 21 test files import from indusagi.... That makes
the suite an integration boundary, not a closed unit. Framework imports cluster
on:
| Framework area | Files | Used for |
|---|---|---|
| indusagi.ai | 11 | AssistantMessage, UserMessage, TextContent, ToolCall, create_zero_usage, … |
| indusagi.react_ink | 9 | ModelDialog, StatusMessage, UiDisplayBlock, and the components.editor.PromptEditor / components.messages.list.MessageList / components.display_block.DisplayBlockView used in the Pilot e2e |
| indusagi.agent | 3 | The custom message kinds BashExecutionMessage / BranchSummaryMessage / CompactionSummaryMessage / CustomMessage |
| indusagi.tui.* | - | Editor / keybindings / keys / autocomplete primitives |
| indusagi.llmgateway.credentials.oauth | - | OAuth credential seam |
Two consequences follow. First, the input tests in
tests/console/test_input.py explicitly verify that editor verbs were
absorbed by the framework's editor defaults rather than reimplemented, and
the reducer's dropped buffer/caret/history event families are asserted absent
because they moved into the framework editor core. See the framework's
react-ink and TUI pages for those
primitives.
Second, the suite tests against the live framework source (an editable-installed
indusagi[mcp,tui]>=0.1.2), so a framework change can break these tests even
with no induscode change. The framework has its own
test suite; this one sits on top of it.
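One way to keep an eye on that boundary is to scan the test files for framework imports. The sketch below uses the standard ast module; imports_package is a hypothetical helper, and whether the 21-file figure above was counted this way is an assumption.

```python
import ast


def imports_package(source: str, package: str) -> bool:
    """True if the source imports `package` or any of its submodules."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # level == 0 means an absolute import; relative ones are skipped.
            names = [node.module or ""]
        else:
            continue
        if any(n == package or n.startswith(package + ".") for n in names):
            return True
    return False
```

Summing imports_package(path.read_text(), "indusagi") over the test_*.py files under tests/ gives the number of files that cross into the framework.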
For the milestone-by-milestone parity ledger this suite encodes — most test-file docstrings carry explicit case counts like "all 16 cases" — see Parity.
