Add CI, test suite, team templates, and fix config bugs #1

Merged
tjb-tech merged 3 commits into HKUDS:main from he-yufeng:feat/ci-tests-templates
Mar 18, 2026

@he-yufeng

Summary

This PR adds foundational quality infrastructure and new features to ClawTeam:

Test Suite (112 tests, 0.3s)

  • test_config.py — config load/save roundtrip, get_effective env/file/default priority
  • test_models.py — Pydantic model defaults, alias serialization, JSON roundtrips
  • test_tasks.py — TaskStore CRUD, dependency resolution, task locking and dead-agent release
  • test_mailbox.py — send/receive/broadcast, consume semantics, peek, event logging
  • test_manager.py — team create/cleanup, member management, leader lookup
  • test_templates.py — variable substitution, builtin template validation, user override

All tests use tmp_path with CLAWTEAM_DATA_DIR isolation — no filesystem side effects.
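
The isolation described above can be sketched without pytest as a small context manager. In the suite itself this is presumably a fixture built on pytest's `tmp_path` and `monkeypatch.setenv`; the `data_dir()` resolver and its `~/.clawteam` fallback below are hypothetical stand-ins, since the PR text does not show how ClawTeam resolves its data directory.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


def data_dir() -> Path:
    """Hypothetical resolver: honor CLAWTEAM_DATA_DIR, else fall back to ~/.clawteam."""
    override = os.environ.get("CLAWTEAM_DATA_DIR")
    return Path(override) if override else Path.home() / ".clawteam"


@contextmanager
def isolated_data_dir():
    """Point CLAWTEAM_DATA_DIR at a throwaway directory, then restore the old value."""
    previous = os.environ.get("CLAWTEAM_DATA_DIR")
    with tempfile.TemporaryDirectory() as tmp:
        os.environ["CLAWTEAM_DATA_DIR"] = tmp
        try:
            yield Path(tmp)
        finally:
            if previous is None:
                os.environ.pop("CLAWTEAM_DATA_DIR", None)
            else:
                os.environ["CLAWTEAM_DATA_DIR"] = previous


# Anything written through data_dir() lands in the temporary directory,
# which is deleted when the block exits.
with isolated_data_dir() as tmp:
    (data_dir() / "config.json").write_text("{}")
    wrote_inside_tmp = (tmp / "config.json").exists()
```

Restoring the previous value in `finally` keeps one test's override from leaking into the next, which is the same guarantee `monkeypatch` gives in pytest.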

CI (GitHub Actions)

  • Ruff lint + pytest across Python 3.10/3.11/3.12 on ubuntu + macos

New Team Templates

  • code-review — multi-perspective code review (security, performance, architecture reviewers)
  • research-paper — academic research team (PI, lit survey, methodology, data analysis)

Bug Fixes

  • config show only displayed 3 of 7 config keys — now shows all
  • get_effective used truthiness check (if file_val:), so skip_permissions=False would fall through to default True — now compares against default value instead
  • config show display: boolean values rendered properly with str() instead of or "(empty)"
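
The `get_effective` fix can be illustrated with a minimal sketch. Only the `skip_permissions` key, its default of `True`, and the env/file/default priority come from this PR; the `(value, source)` return shape, the `CLAWTEAM_` environment-variable prefix, and the second default key are assumptions made for illustration.

```python
import os

# Hypothetical defaults table; only skip_permissions=True is taken from the PR text.
DEFAULTS = {"skip_permissions": True, "default_team": ""}


def get_effective_buggy(key, file_cfg, env=os.environ):
    """Old behavior: a truthiness check drops explicitly-set falsy values."""
    env_val = env.get(f"CLAWTEAM_{key.upper()}")  # env var naming is an assumption
    if env_val:
        return env_val, "env"
    file_val = file_cfg.get(key)
    if file_val:  # False, 0 and "" all fall through to the default
        return file_val, "file"
    return DEFAULTS[key], "default"


def get_effective_fixed(key, file_cfg, env=os.environ):
    """New behavior: a file value wins whenever it differs from the default."""
    env_val = env.get(f"CLAWTEAM_{key.upper()}")
    if env_val:
        return env_val, "env"
    file_val = file_cfg.get(key, DEFAULTS[key])
    if file_val != DEFAULTS[key]:
        return file_val, "file"
    return DEFAULTS[key], "default"


file_cfg = {"skip_permissions": False}
buggy = get_effective_buggy("skip_permissions", file_cfg, env={})
fixed = get_effective_fixed("skip_permissions", file_cfg, env={})
```

With the file setting `skip_permissions` to `False`, the old check returns the default `True`, while the comparison against the default returns the file's `False`.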

Test plan

  • pytest tests/ -v — 112 passed in 0.32s
  • ruff check tests/ — all checks passed
  • CI workflow validates on push

CI:
- GitHub Actions workflow: ruff lint + pytest on Python 3.10/3.11/3.12
  across ubuntu and macos

Templates:
- code-review: multi-perspective review team (security, performance,
  architecture reviewers coordinated by a lead)
- research-paper: academic research team (PI, literature surveyor,
  methodology designer, data analyst)

Bug fixes:
- config show: display all 7 config keys instead of only 3
- get_effective: compare against default instead of checking truthiness,
  so explicitly-set falsy values (like skip_permissions=False) aren't
  silently replaced by defaults
- config show: handle boolean values in display (str() instead of or)
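
The display bug in the last item comes down to how `or` treats `False`. A minimal sketch follows, with hypothetical helper names; whether the real code still prints "(empty)" for empty strings is an assumption.

```python
def render_buggy(value):
    """Old display logic: any falsy value, including False, becomes "(empty)"."""
    return value or "(empty)"


def render_fixed(value):
    """New display logic: convert with str() so booleans print as True/False."""
    if value is None or value == "":
        return "(empty)"  # assumption: empty strings keep the placeholder
    return str(value)


shown_before = render_buggy(False)
shown_after = render_fixed(False)
```

Here `shown_before` is the misleading "(empty)" placeholder, while `shown_after` is the string "False".
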
@tjb-tech tjb-tech merged commit a46ab24 into HKUDS:main Mar 18, 2026
a24ibrah pushed a commit to a24ibrah/ClawTeam that referenced this pull request Mar 30, 2026
Add CI, test suite, team templates, and fix config bugs
juntaochi added a commit to novix-science/ClawTeam-gstack that referenced this pull request Apr 20, 2026
…n Transport.deliver

GREEN phase: implements the two Phase 2 hook sites that break A<->B deadlock
loops and enforce the envelope protocol, scoped to preserve backward
compatibility (BC).

DefaultRoutingPolicy (clawteam/team/routing_policy.py):
  - decide() prefixed with cycle detection BEFORE existing throttle (D-18);
    reuses the existing recentEvents 50-entry bounded window (specifics
    lesson #1, no parallel tracker).
  - _topic_hash(envelope): priority chain dedupe_key -> payload.request_id ->
    sha1(content[:128])[:16] (D-19).
  - _detect_cycle(state, envelope, topic_hash): scans last 20 recentEvents for
    >=3 A->B AND >=3 B->A with matching topic_hash (D-20); returns None when any
    entry carries progressSignal: True (Pitfall #3 mitigation).
  - Cycle hit persists topic_hash onto routes[route_key].suppressedTopics, saves
    state BEFORE sync emit (Pitfall #7), then emits CycleDetected with pair,
    topic_hash, route_keys, window_size=20.
  - Subsequent hits for same route+topic short-circuit with
    reason='cycle_suppressed_existing' (no re-emit, D-21).
  - _append_event gains topic_hash kwarg; pending/aggregated/cycle_suppressed
    entries now carry topicHash so future decide() calls can correlate.
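
The topic-hash priority chain and the cycle scan described above might look roughly like the sketch below. It is not the code in `routing_policy.py`: events are plain dicts, the `from`/`to` keys are hypothetical, and the commit message does not say whether the progress check covers the whole window or only matching entries (this sketch checks matching entries only).

```python
import hashlib

WINDOW = 20             # entries scanned, per the commit message
MIN_PER_DIRECTION = 3   # hits required in each direction


def topic_hash(envelope: dict) -> str:
    """Priority chain: dedupe_key, then payload.request_id, then a content digest."""
    if envelope.get("dedupe_key"):
        return str(envelope["dedupe_key"])
    request_id = (envelope.get("payload") or {}).get("request_id")
    if request_id:
        return str(request_id)
    content = envelope.get("content") or ""
    return hashlib.sha1(content[:128].encode("utf-8")).hexdigest()[:16]


def detect_cycle(recent_events: list, a: str, b: str, topic: str):
    """Return (a, b) if the window shows an A<->B loop on this topic, else None."""
    window = [
        e for e in recent_events[-WINDOW:]
        if e.get("topicHash") == topic and {e.get("from"), e.get("to")} == {a, b}
    ]
    if any(e.get("progressSignal") is True for e in window):
        return None  # progress was reported, so this is not treated as a deadlock
    a_to_b = sum(1 for e in window if e["from"] == a)
    b_to_a = sum(1 for e in window if e["from"] == b)
    if a_to_b >= MIN_PER_DIRECTION and b_to_a >= MIN_PER_DIRECTION:
        return (a, b)
    return None


topic = topic_hash({"content": "please review the patch"})
ping = {"from": "A", "to": "B", "topicHash": topic, "progressSignal": False}
pong = {"from": "B", "to": "A", "topicHash": topic, "progressSignal": False}
looping = [ping, pong] * 3
with_progress = looping + [dict(ping, progressSignal=True)]
```

Calling `detect_cycle(looping, "A", "B", topic)` reports the pair, while the same call on `with_progress` returns `None` because one entry carries a progress signal.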

Transport (clawteam/transport/base.py):
  - _pre_deliver_hooks(data) -> TeamMessage: parses JSON bytes, short-circuits
    when msg has none-of-three envelope fields (Pitfall #8 BC pass-through),
    otherwise requires all three via TurnEnvelope.model_validate.
  - Per-agent module-level consecutive-malformed counter (D-09); threshold = 8
    consecutive strikes emits DriftRegression (sync). Any valid envelope resets
    the counter for that agent. MalformedEnvelope emitted on every strike.
  - _reset_drift_counters(): test + SprintConductor-teardown helper (Plan 02-11
    contract).
  - Transport ABC unchanged (public interface preserved).

FileTransport.deliver (clawteam/transport/file.py):
  - Invokes _pre_deliver_hooks at the top; MalformedEnvelopeError re-raised to
    caller so MailboxManager can route to dead-letter / surface to the agent.
  - p2p.py falls back to FileTransport.deliver so its delivery path inherits
    the hook automatically; no direct change needed there for v1.

Tests: 698 passed, 1 skipped (Plan 02-09 gate); ruff clean.
Regression matrix (12 templates) and test_runtime_routing.py both still green.
juntaochi added a commit to novix-science/ClawTeam-gstack that referenced this pull request Apr 20, 2026
…ressSignal routing-policy field

- clawteam/harness/forced_progress_gate.py: PhaseGate subclass with 2-turn per-agent
  threshold, strict per-agent attribution (no sprint-wide summation), multi-choice
  question writer (A=keep waiting, B=restart agent, C=abort sprint), ForcedProgressTriggered
  emit, module-level increment_turn_counter + reset_turn_counter_state helpers (Pitfall #2
  turn_id dedupe).
- clawteam/harness/theater_detector.py: re-export shim — flips the skipif gate on
  tests/test_cycle_detector.py::test_progress_signal_breaks_cycle_streak_per_pitfall3
  so the previously-skipped Pitfall #3 progress-signal test now runs green.
- clawteam/harness/artifacts.py: ArtifactStore.__init__ accepts turn_counter_callback
  kwarg (keyword-only, defaults None — BC preserved); write() invokes callback at end
  of hook chain when metadata carries an agent field (D-14 site #2).
- clawteam/transport/base.py: module-level _TURN_COUNTER_CALLBACK slot + set_turn_counter_callback
  setter + get_turn_counter_callback test helper; _pre_deliver_hooks invokes callback
  on successful envelope validation (D-14 site #1); _reset_drift_counters also clears
  the callback for clean sprint teardown.
- clawteam/team/routing_policy.py: _append_event gains progress_signal kwarg and writes
  progressSignal key onto every recentEvents entry (Plan 02-08 reserved the field,
  Plan 02-09 populates it; Pitfall #3 seam).
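
The callback slot and the turn_id dedupe can be pictured with the sketch below. The function names follow the commit message, but their signatures and bodies are assumptions, as is the `gate_should_trigger` helper; the message does not specify what exactly counts as a turn toward the threshold or what resets it.

```python
from collections import defaultdict
from typing import Callable, Optional

TURN_THRESHOLD = 2  # per-agent threshold named in the commit message

_turns_seen = defaultdict(set)  # agent -> turn_ids already counted
_turn_counter_callback: Optional[Callable[[str, str], int]] = None


def increment_turn_counter(agent: str, turn_id: str) -> int:
    """Count a turn once per (agent, turn_id); repeats from a second hook site are ignored."""
    _turns_seen[agent].add(turn_id)
    return len(_turns_seen[agent])


def reset_turn_counter_state() -> None:
    _turns_seen.clear()


def gate_should_trigger(agent: str) -> bool:
    """Strict per-agent attribution: only this agent's own turns count."""
    return len(_turns_seen.get(agent, ())) >= TURN_THRESHOLD


def set_turn_counter_callback(cb: Optional[Callable[[str, str], int]]) -> None:
    global _turn_counter_callback
    _turn_counter_callback = cb


def get_turn_counter_callback() -> Optional[Callable[[str, str], int]]:
    return _turn_counter_callback


set_turn_counter_callback(increment_turn_counter)
cb = get_turn_counter_callback()
cb("worker", "t1")    # transport hook site
cb("worker", "t1")    # artifact-store hook site, same turn: deduplicated
cb("reviewer", "t1")  # another agent's turn does not count toward "worker"
after_one_turn = gate_should_trigger("worker")
cb("worker", "t2")
after_two_turns = gate_should_trigger("worker")
```

Keying the dedupe on `(agent, turn_id)` lets both hook sites report the same turn without double counting, so the gate only fires after two distinct turns from the same agent.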

Verification:
- pytest tests/test_forced_progress_gate.py -q → 9 passed
- pytest tests/test_cycle_detector.py -q → 7 passed (progress-signal test now active)
- pytest tests/test_transport_envelope.py tests/test_artifact_caps.py -q → all pass
- pytest tests/test_template_regression_matrix.py -q → 12 passed
- pytest tests/test_sprint_state.py tests/test_interaction_gate.py tests/test_event_types_phase2.py -q → peer green
- Full suite: 734 passed in 107s
- Ruff clean on all modified files