Skip to content

HTTP/CLI tests and structured error codes #41

@clean6378-max-it

Description

@clean6378-max-it

Problem

  1. HTTP boundary untested beyond integration smoke — ~2,500 lines of parser/export utility tests exist, but blueprint wiring (query parsing, status codes, headers, export zip responses) is only partially covered. tests/test_export_api_bulk.py registers blueprints in isolation, not through create_app. Routes such as GET /api/sessions/.../stats and GET /api/export/session/... have no full-app tests.
  2. CLI behavior untested end-to-endtests/test_cli_args.py validates argparse flags only; no test runs scripts/export.py list|stats|export and asserts exit codes, stderr on bad input, or artifact output.
  3. Errors are not machine-readable — Responses use {"error": "..."} without a "code" field. Programmatic clients cannot distinguish SESSION_NOT_FOUND from PARSE_ERROR without fragile string matching. This blocks Thursday API reference documentation and future API versioning.
  4. Regression risk on issue Security: stop leaking exception details in HTTP error responses #25 — New error-path tests must continue to assert that 500 bodies never leak exception class names, tracebacks, or internal paths.

Eval clusters: verification-gap (test 31 — Test Depth) and error-versioning-trajectory (test 10 — Error Design).


Scope (one PR)

Block A — HTTP endpoint and CLI tests (5 pt)

A1 — Route matrix (tests/test_api_routes.py)

Reuse client, client_empty, client_thinking from tests/conftest.py. Do not re-test scenarios already in tests/test_api_integration.py (projects list, project sessions, session detail, search happy/400/default).

Method Path Wednesday tests to add
GET / 200 HTML smoke
GET /api/projects/<name>/sessions Invalid path → 400 + [] (document current behavior)
GET /api/sessions/<proj>/<id>/stats 200 (conversation_turns, cost_estimate_usd); 404; invalid path 400
GET /api/search limit=0 → 400; limit=1.5 → 400; limit=9999 capped at 500
GET /api/sessions/<proj>/<id> Invalid path 400; corrupt JSONL → 500 generic body, no ValueError/Traceback leak
GET /api/export/state Defaults via full create_app
POST /api/export 422 empty projects; 400 invalid since; 400 non-object JSON body
GET /api/export/session/<proj>/<id> Markdown attachment; ?format=json; 404; invalid path 400

Port assertions from tests/test_export_api_bulk.py into full-app tests where applicable.

A2 — CLI e2e (tests/test_cli_e2e.py)

Use subprocess.run with sys.executable and scripts/export.py (pattern: tests/test_export_exclusion_filtering.py).

Scenario Assert
list exit 0
list --project <unknown> non-zero or empty output (match actual behavior; document in test)
stats exit 0
--since yesterday (invalid) exit ≠ 0, stderr mentions invalid since
export --since all with seeded data exit 0, zip artifact exists

Do not duplicate tests/test_cli_args.py parser parity tests.

A3 — Shared helpers

  • Add assert_error_response(resp, expected_code=None) to tests/conftest.py.
  • Update tests/test_api_integration.py::_assert_error_shape to require "code" after Block B lands.

Slip if overloaded: CLI e2e (one subprocess test) → export session routes → defer remaining export POST happy-path to follow-up commit on same PR.


Block B — Structured error codes (3 pt)

B1 — Central module (api/error_codes.py)

class ErrorCode(StrEnum):
    SEARCH_INVALID_LIMIT = "SEARCH_INVALID_LIMIT"
    INVALID_PATH = "INVALID_PATH"
    SESSION_NOT_FOUND = "SESSION_NOT_FOUND"
    INVALID_REQUEST_BODY = "INVALID_REQUEST_BODY"
    INVALID_SINCE_MODE = "INVALID_SINCE_MODE"
    PARSE_ERROR = "PARSE_ERROR"
    EXPORT_NOTHING_TO_EXPORT = "EXPORT_NOTHING_TO_EXPORT"
    INTERNAL_ERROR = "INTERNAL_ERROR"
    # ... see detail guide for full mapping

def error_response(code: ErrorCode, message: str, status: int, **extra) -> tuple[Response, int]:
    body = {"error": message, "code": str(code)}
    body.update(extra)
    return jsonify(body), status

Optional: models/errors.py with ErrorResponseDict if Monday models/ package is on branch.

B2 — Migrate all api/*.py error sites

File Example codes
api/search.py SEARCH_INVALID_LIMIT on bad limit
api/sessions.py INVALID_PATH, SESSION_NOT_FOUND, PARSE_ERROR, INTERNAL_ERROR
api/export_api.py INVALID_REQUEST_BODY, INVALID_SINCE_MODE, EXPORT_NOTHING_TO_EXPORT, SESSION_NOT_FOUND, INVALID_PATH

Preserve extra fields where they exist today (e.g. "since" on invalid since mode). Keep human-readable error text unchanged for SPA compatibility.

B3 — Tests

  • tests/test_error_codes.py — parametrized table: path/body → status + code.
  • Update tests/test_search.py, tests/test_api_integration.py, tests/test_export_api_bulk.py error assertions.
  • Grep guard: no remaining bare jsonify({"error": ...}) in api/ without code.

B4 — Catalog stub for Thursday

Short table in README.md or docs/error-codes.md listing each ErrorCode, HTTP status, and meaning (Thursday API reference expands this).


Acceptance criteria

HTTP + CLI tests (issue fab633e4-...)

  • Every blueprint endpoint in api/projects.py, api/sessions.py, api/search.py, api/export_api.py has ≥1 happy-path and ≥1 error-path test using Flask test client with create_app
  • GET /api/sessions/<project>/<id>/stats covered (happy + 404 + invalid path)
  • GET /api/export/session/<project>/<id> covered (md + json + 404)
  • POST /api/export and GET /api/export/state exercised through full app (not isolated blueprint only)
  • CLI scripts/export.py has ≥1 e2e test each for list, stats, and export path; invalid --since asserts non-zero exit
  • Error responses verified: correct status, JSON error key, no stack trace / exception class leakage (issue Security: stop leaking exception details in HTTP error responses #25)
  • Tests pass in CI
  • PR approved by at least 1 reviewer

Structured error codes (issue 8c01c675-...)

  • All API error JSON bodies include "code" (stable UPPER_SNAKE_CASE string) and "error" (human-readable message)
  • ErrorCode enum (or constants module) is single source of truth; documented in README or docs/error-codes.md
  • api/search.py, api/sessions.py, api/export_api.py migrated to error_response() helper
  • Tests assert both "error" and "code" on every error response under test
  • GET /api/search?q=test&limit=abc returns code: SEARCH_INVALID_LIMIT (or equivalent) with 400
  • Tests pass in CI
  • PR approved by at least 1 reviewer

Overall Wednesday checkpoint

  • pytest -q and mypy green
  • No duplicate coverage of Tuesday integration scenarios (extend, don't copy)
  • Thursday docs can reference error code catalog without re-auditing source

Out of scope

  • API reference documentation (docs/api-reference.md) — Thursday, 5 pt
  • CONTRIBUTING.md / architecture docs — Thursday, 3 pt
  • Playwright / browser E2E
  • New HTTP routes or CLI flags
  • Changing 2xx response JSON field names
  • TypedDict work (Monday) unless adding ErrorResponseDict

Metadata

Metadata

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions