agent-runtime

Archived

Author	SHA1	Message	Date
Nico	4e679a3ad9	Add model matrix test suite: 3 tests × 3 variants = 9 combos. New 'matrix' suite runs same API tests with different LLM model configs. Variants: gemini-flash (baseline), haiku, gpt-4o-mini. Tests: eras_query (SQL correctness), eras_artifact (data output), social_reflex (fast path). Posts results as test_name[variant] to /tests dashboard. All 9 combos passing (6/9 verified locally, ~35s for ERAS tests). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 18:12:24 +02:00
Nico	58734c34d2	Wire model overrides end-to-end: API → runtime → frame engine. /api/chat accepts {"models": {"role": "provider/model"}} for per-request overrides. runtime.handle_message passes model_overrides through to frame engine. All 4 graph definitions (v1-v4) now declare MODELS dicts. test_graph_has_models expanded to verify all graphs. 11/11 engine tests green. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 18:07:42 +02:00
Nico	ecfbc86676	Add 3 red tests for Phase 1: config-driven models. Tests that will pass once implemented: graph_has_models (graph definition includes MODELS dict); instantiate_applies_graph_models (node.model set from graph config); model_override_per_request (process_message accepts model_overrides). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 18:03:56 +02:00
Nico	097c7f31f3	Add engine test suite: 8 tests for graph loading, conditions, frame traces. New 'engine' suite in run_tests.py with tests that verify frame engine mechanics without LLM calls. Covers graph loading, node instantiation, edge type completeness, reflex/tool_output conditions, and frame trace structure for reflex/expert/expert+interpreter pipelines. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 18:01:06 +02:00

4 Commits
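
The commits above describe per-request model overrides layered on top of each graph's MODELS dict. The repository's actual code is not shown here, so the following is only a minimal sketch of how such a merge might work. The function name resolve_models, the role names, the model identifiers, and the choice to reject unknown roles are all assumptions made for illustration; only the {"role": "provider/model"} shape comes from the commit message.

```python
from typing import Mapping, Optional

# Hypothetical graph-level defaults. Role names and model IDs are
# illustrative only; they are not taken from the repository.
GRAPH_MODELS = {
    "expert": "google/gemini-flash",
    "interpreter": "google/gemini-flash",
}


def resolve_models(
    graph_models: Mapping[str, str],
    overrides: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return the graph defaults with per-request overrides applied.

    Assumption: an override for a role the graph does not declare is an
    error, and every model is written as "provider/model".
    """
    resolved = dict(graph_models)  # copy so the defaults are never mutated
    for role, model in (overrides or {}).items():
        if role not in graph_models:
            raise KeyError(f"unknown role: {role!r}")
        if "/" not in model:
            raise ValueError(f"expected 'provider/model', got {model!r}")
        resolved[role] = model
    return resolved


# Usage: the "models" field of a request body overrides a single role.
request_body = {"models": {"expert": "openai/gpt-4o-mini"}}
per_request = resolve_models(GRAPH_MODELS, request_body.get("models"))
```

Copying the defaults before applying overrides keeps one request's model choice from leaking into the next, which matters if the same graph definition is shared across requests.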