Smoke Runbook

**Status:** active · last reviewed 2026-05-03

source: docs/getting-started/SMOKE_RUNBOOK.md

Phase 2.4 integration smoke — the last safety net before flipping the public Ensemble grid live. If any test in this suite fails after a merge to main, do not dashboard-fire the affected project until the regression is investigated.

What it covers

tests/integration/ drives a real bin/control-panel-server.py subprocess and exercises the magic-moment flow end-to-end:

| File | Scope | Slot directive |
| --- | --- | --- |
| test_e2e_signup_to_publish.py | signup → procedural fill → edit → publish → unpublish → re-publish | Test 1 |
| test_e2e_visitor_flow.py | unauth visitor reads + private-canvas blocking + slot-limit 402 + payload-too-large 413 + XSS escape + cookie banner + ToS gate | Tests 2-5 |
| test_e2e_browser.py | optional Playwright DOM smoke (skipped when Playwright is not installed) | bonus |

The session fixture in conftest.py boots the server against a tempdir for $CLAUDE_PLUGIN_DATA + $ORCH_DATA and wipes /tmp/ensemble-dashboard/canvases + /tmp/ensemble-dashboard/subscriptions before the run so each invocation starts clean. The dashboard output dir under the same prefix is left alone so a developer running the real control panel locally does not lose their generated HTML.
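The clean-start behaviour described above can be sketched as follows. This is a minimal illustration, not the actual conftest.py code: the helper name `prepare_smoke_environment`, the `shared_root` parameter, and the choice to back both variables with a single tempdir are assumptions made for the example.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Subdirectories the runbook says are wiped; anything else under the root is kept.
WIPED_SUBDIRS = ("canvases", "subscriptions")


def prepare_smoke_environment(shared_root="/tmp/ensemble-dashboard"):
    """Return (env, data_dir) for launching the server under test.

    Hypothetical helper: the real session fixture may be organised differently.
    """
    shared_root = Path(shared_root)
    for name in WIPED_SUBDIRS:
        # ignore_errors covers the first run, when the directory does not exist yet.
        shutil.rmtree(shared_root / name, ignore_errors=True)

    # Assumption: one throwaway directory backs both data variables.
    data_dir = Path(tempfile.mkdtemp(prefix="ensemble-e2e-"))
    env = dict(os.environ)
    env["CLAUDE_PLUGIN_DATA"] = str(data_dir)
    env["ORCH_DATA"] = str(data_dir)
    return env, data_dir
```

The returned `env` mapping would then be passed to whatever launches the server subprocess, for example through the `env=` argument of `subprocess.Popen`.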

When to run

  • Before flipping the Ensemble grid public. Required.
  • Before merging any PR that touches bin/control-panel-server.py, bin/dashgen/canvas_*.py, bin/dashgen/pairing.py, bin/dashgen/route_table.py, or any of the legal / cookie / first-run templates.
  • Weekly as a scheduled run while the marketing page is live, to catch drift caused by silent dependency upgrades.

How to run

One-shot

bash bin/run_e2e_smoke.sh

The wrapper resolves a Python interpreter that has pytest available. On a fresh Mac, it creates a venv at /tmp/ensemble-e2e-venv and installs pytest + markdown into it. Override with:

E2E_PYTHON=/path/to/python bash bin/run_e2e_smoke.sh

Filter / verbose

run_e2e_smoke.sh forwards any extra args to pytest:

bash bin/run_e2e_smoke.sh -v                # verbose
bash bin/run_e2e_smoke.sh -k visitor        # only visitor tests
bash bin/run_e2e_smoke.sh tests/integration/test_e2e_signup_to_publish.py

With Playwright

/tmp/ensemble-e2e-venv/bin/python -m pip install playwright
/tmp/ensemble-e2e-venv/bin/python -m playwright install chromium
bash bin/run_e2e_smoke.sh

The 3 browser tests in test_e2e_browser.py will pick up the install automatically; without it they emit a UserWarning and skip.
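One way to get the warn-and-skip behaviour is to probe for the package without importing it. The sketch below uses only the standard library; the function name `browser_tests_enabled` is hypothetical and the real test module may detect Playwright differently.

```python
import importlib.util
import warnings


def browser_tests_enabled(module_name="playwright"):
    """Return True when the optional browser dependency can be imported."""
    if importlib.util.find_spec(module_name) is not None:
        return True
    # Emit a UserWarning so the skip is visible in the test output.
    warnings.warn(
        f"{module_name} is not installed; skipping browser smoke tests",
        UserWarning,
        stacklevel=2,
    )
    return False
```

In a pytest module, the result could feed a `pytest.mark.skipif(...)` condition so the browser tests are reported as skipped rather than failed.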

Interpreting failures

| Failing test | Likely root cause |
| --- | --- |
| test_signup_seeds_procedural_canvas | Procedural-fill fallback templates broken or signup_context routing changed in canvas_handlers.handle_create_canvas. |
| test_publish_to_slot_zero_appears_in_grid | Canvas store _index.json missing the published list, OR /ensemble/canvases?filter=featured query rewriting regressed. |
| test_unpublish_removes_canvas_from_grid | _remove_from_index not clearing published. |
| test_canvas_detail_endpoint_serves_public_canvas | _dispatch_canvas_get route regex no longer matches /ensemble/canvas/<id>. |
| test_visitor_cannot_publish / test_visitor_cannot_modify | Pairing-token middleware (_resolve_paired_user) no longer rejects empty headers. |
| test_visitor_cannot_see_private_canvas | handle_get_canvas private-visibility check removed or inverted. |
| test_free_tier_fourth_slot_returns_402_with_contract_shape | Slot-limit enforcement broken or 402 body shape drifted from docs/architecture/api-contracts/CHUNK_2_API_CONTRACT.md §2. |
| test_swap_slot_zero_demotes_old_canvas_to_draft | Atomic swap in handle_publish_canvas regressed; the old occupant should land in draft state, not be deleted. |
| test_text_component_with_script_tag_is_escaped_in_static_render | bin/dashgen/pages/canvas_view.py no longer applies replace("</", "<\\/") on the inline boot JSON. High-severity XSS regression. |
| test_javascript_url_in_click_to_link_is_filtered_in_static_render | Static page is leaking javascript: URL into an href attribute. High-severity XSS regression. |
| test_payload_too_large_returns_413 | Body-size cap in do_POST removed or raised. |
| test_state_endpoint_without_auth_returns_visitor_safe | Visitor-safe /state shape now leaks privileged keys. Production-state-leak regression. |
| test_index_html_contains_cookie_banner | Cookie banner template removed from bin/dashgen/__main__.py or pages/cookie_banner.py. |
| test_index_html_contains_first_run_tos_checkbox | First-run modal lost the ToS checkbox or /legal/terms.html link. |
| test_legal_terms_page_renders | legal_pages.render_legal_pages skipped or markdown library missing. The test skips (rather than fails) if the page is absent; investigate the dashboard regen log. |
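To show why the `replace("</", "<\\/")` step named in the script-tag row matters, here is a minimal sketch of the escape applied to inline boot JSON. The function name `render_boot_json` is hypothetical; only the `replace` call itself comes from the table above.

```python
import json


def render_boot_json(payload):
    """Serialise payload for embedding inside an inline <script> element."""
    # "<\/" is a valid JSON escape for "</", so the parsed value is unchanged,
    # but the HTML parser no longer sees a closing </script> tag in the text.
    return json.dumps(payload).replace("</", "<\\/")


hostile = {"text": "</script><script>alert(1)</script>"}
boot = render_boot_json(hostile)
```

Here `boot` contains no literal `</script` sequence, while `json.loads(boot)` still returns the original dictionary, which is roughly the property the static-render test guards.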

Manual rollback

If the smoke catches a regression on main:

  1. Identify the offending commit:
     git log --oneline -- bin/dashgen/canvas_handlers.py bin/dashgen/pages/canvas_view.py bin/control-panel-server.py
  2. Revert (do not amend; do not force-push):
     git revert <sha>
  3. Re-run the smoke before merging the revert:
     bash bin/run_e2e_smoke.sh
  4. If the failing test was an XSS or state-leak (the High-severity rows above), keep the public grid behind the feature gate until the revert lands.

CI integration sketch

A skeleton GitHub Actions workflow lives at .github/workflows/integration-test.yml (slot S1A enables the auto-trigger after the Sprint 5 ramp-up; until then it is dispatched manually). Required runner permissions: nothing beyond python3 on PATH. The job sets up its own venv:

  - name: E2E smoke
    run: bash bin/run_e2e_smoke.sh -v

The smoke completes in under 5 seconds without Playwright; with the 3 Playwright tests it is closer to 25-30 seconds (one Chromium boot per test).

Implementation notes

  • The canvas + subscription stores hard-code /tmp/ensemble-dashboard/ as their root directory. The fixture wipes the canvases/ and subscriptions/ subdirs before each session to isolate runs. Tests that need to seed canvas data use the in-process JsonFileCanvasStore (which also points at the shared /tmp path).
  • The pairing store does honour $ORCH_DATA, so per-test pairing tokens are written into the test’s tempdir. The subprocess server reads them off disk (no IPC needed).
  • HTTP requests use stdlib urllib.request — no requests dependency.
  • The fixture writes DASHBOARD_TOKEN=e2e-test-dashboard-token into $ORCH_DATA/.env.local so the server’s check_auth() is exercised in production-realistic mode. Without a token, check_auth() would return True for every call and the visitor-safe /state branch would never fire.
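Because `urllib.request.urlopen` raises `urllib.error.HTTPError` for error statuses such as 402 and 413, tests that assert on those codes need a thin wrapper. The following is a sketch of such a wrapper using only the standard library; the name `http_call` and its parameters are assumptions for illustration, not the suite's actual helper.

```python
import json
import urllib.error
import urllib.request


def http_call(url, method="GET", payload=None, headers=None, timeout=5.0):
    """Return (status, body_bytes) without raising on 4xx/5xx responses."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    # Caller-supplied headers (for example an auth token) are passed through as-is.
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # The error object still carries the status code and response body.
        return err.code, err.read()
```

A test can then compare the returned status against 402 or 413 directly and decode the body with `json.loads` to check its shape.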