chore: shared Hypothesis failure DB + deterministic CI profile#1041
No actionable comments were generated in the recent review. 🎉
Walkthrough
Adds a Hypothesis
Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Snapshot Warnings: Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.
Scanned Files: None
Code Review
This pull request enhances the property-based testing workflow by introducing a dedicated "fuzz" profile and a shared, write-only database for Hypothesis failures. This setup allows for deep fuzzing sessions while ensuring that failing examples are persisted across worktrees for analysis without replaying them automatically. Review feedback suggests making the _WriteOnlyDatabase strictly append-only by disabling delete and move operations to prevent data loss, alongside a minor formatting adjustment in the documentation for consistency.
tests/conftest.py
Outdated
    def delete(self, key: bytes, value: bytes) -> None:
        self._db.delete(key, value)

    def move(
        self,
        src: bytes,
        dest: bytes,
        value: bytes,
    ) -> None:
        self._db.move(src, dest, value)
The _WriteOnlyDatabase is intended to be a persistent log of all failing examples for later analysis. By delegating delete and move to the underlying database, you risk losing this historical data when, for example, a test is deleted or renamed and Hypothesis cleans up its database.
To make the shared database a true append-only log and prevent accidental data loss, consider making delete and move no-ops.
    -    def delete(self, key: bytes, value: bytes) -> None:
    -        self._db.delete(key, value)
    -
    -    def move(
    -        self,
    -        src: bytes,
    -        dest: bytes,
    -        value: bytes,
    -    ) -> None:
    -        self._db.move(src, dest, value)
    +    def delete(self, key: bytes, value: bytes) -> None:
    +        pass
    +
    +    def move(
    +        self,
    +        src: bytes,
    +        dest: bytes,
    +        value: bytes,
    +    ) -> None:
    +        pass
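The append-only behaviour suggested above can be sketched without any dependency on Hypothesis. In this minimal sketch, `InMemoryStore` and `AppendOnlyStore` are hypothetical stand-ins (they are not names from the PR) that only mirror the `save`/`fetch`/`delete`/`move` method shape used by `_WriteOnlyDatabase`; the real wrapper would delegate to a Hypothesis example database instead.

```python
from collections.abc import Iterable


class InMemoryStore:
    """Hypothetical stand-in for the wrapped example database."""

    def __init__(self) -> None:
        self.data: dict[bytes, set[bytes]] = {}

    def save(self, key: bytes, value: bytes) -> None:
        self.data.setdefault(key, set()).add(value)

    def fetch(self, key: bytes) -> Iterable[bytes]:
        return iter(self.data.get(key, set()))

    def delete(self, key: bytes, value: bytes) -> None:
        self.data.get(key, set()).discard(value)

    def move(self, src: bytes, dest: bytes, value: bytes) -> None:
        self.delete(src, value)
        self.save(dest, value)


class AppendOnlyStore:
    """Records every saved value; never replays or removes anything."""

    def __init__(self, inner: InMemoryStore) -> None:
        self._inner = inner

    def save(self, key: bytes, value: bytes) -> None:
        self._inner.save(key, value)

    def fetch(self, key: bytes) -> Iterable[bytes]:
        # Write-only: nothing is handed back for replay.
        return iter(())

    def delete(self, key: bytes, value: bytes) -> None:
        # Deliberate no-op so cleanup cannot erase recorded failures.
        return None

    def move(self, src: bytes, dest: bytes, value: bytes) -> None:
        # Deliberate no-op: the value stays under its original key.
        return None
```

One consequence of making `move` a no-op, as the suggestion does, is that a moved value is never recorded under the destination key; whether that matters depends on how the log is analysed later.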
     uv run python -m pytest tests/ -n 8 --cov=synthorg --cov-fail-under=80 # full suite + coverage
    -HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties # property tests (dev profile, 1000 examples)
    +HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties # property tests (dev, 1000 examples)
    +HYPOTHESIS_PROFILE=fuzz uv run python -m pytest tests/ -m unit -n 8 -k properties # deep fuzzing (10,000 examples, no deadline)
For better readability and consistency, you could align the comments for the HYPOTHESIS_PROFILE commands by adding a space before the #.
    -HYPOTHESIS_PROFILE=fuzz uv run python -m pytest tests/ -m unit -n 8 -k properties # deep fuzzing (10,000 examples, no deadline)
    +HYPOTHESIS_PROFILE=fuzz uv run python -m pytest tests/ -m unit -n 8 -k properties  # deep fuzzing (10,000 examples, no deadline)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CLAUDE.md`:
- Around line 174-175: Add a short migration note to the Hypothesis section
advising developers to update existing property tests that lack explicit
`@example()` decorators so they are covered in CI: scan tests under tests/ for
Hypothesis-using tests (uses `@given` from Hypothesis as configured in
tests/conftest.py and controlled by HYPOTHESIS_PROFILE), add representative
`@example(...)` cases that capture known edge/failure inputs, and include a brief
recommended checklist (identify property tests with no `@example`, write one or
more `@example(...)` decorators for edge cases, run with HYPOTHESIS_PROFILE=ci to
verify reproducibility, commit the explicit examples); mention that without this
migration those tests get zero CI coverage.
In `@tests/conftest.py`:
- Around line 54-56: The module-level mkdir call for _SHARED_HYPOTHESIS_DIR can
raise and break test imports; wrap the
Path.home()/".synthorg"/"hypothesis-examples" creation in a try/except that
falls back to a safe alternative (e.g., create a temp dir via tempfile.mkdtemp()
or set _SHARED_HYPOTHESIS_DIR = None) and log or ignore the error, and then make
any dependent initialization (such as _shared_db) conditional on
_SHARED_HYPOTHESIS_DIR being non-None/valid so tests degrade gracefully when the
home directory is unwritable.
- Around line 66-73: The CI profile currently uses settings.register_profile
with phases=[Phase.explicit], which prevents Hypothesis from generating any
inputs for `@given` tests; update the profile (the settings.register_profile
call) to remove the explicit-only phase and set max_examples=50 (i.e., pass
max_examples=50 and omit `phases` so the default phases apply) so property tests
execute in CI, or alternatively add `@example(...)` decorators to the existing
`@given` tests listed; ensure you adjust the settings.register_profile
invocation (and keep suppress_health_check=[HealthCheck.too_slow] if desired) so
CI runs 50 examples per property test instead of only explicit examples.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6351c9de-d1d9-4b1f-b290-56dac6a23a83
📒 Files selected for processing (2)
CLAUDE.md, tests/conftest.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Agent
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (go)
- GitHub Check: Analyze (python)
- GitHub Check: Dependency Review
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations`; Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: use `except A, B:` (no parentheses); ruff enforces this on Python 3.14.
All public functions must have type hints; mypy strict mode is required.
Docstrings must use Google style and are required on all public classes and functions, enforced by ruff D rules.
Create new objects instead of mutating existing ones; use immutability as the default pattern.
For non-Pydantic internal collections (registries, `BaseTool`), use `copy.deepcopy()` at construction and `MappingProxyType` wrapping for read-only enforcement.
Use `model_copy(update=...)` for runtime state that evolves; separate mutable-via-copy models from frozen Pydantic models for config/identity. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time.
Use `@computed_field` for derived values instead of storing and validating redundant fields (e.g. `TokenUsage.total_tokens`).
Use `NotBlankStr` from `core.types` for all identifier/name fields, including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants, instead of manual whitespace validators.
Prefer `asyncio.TaskGroup` for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare `create_task`. Existing code is being migrated incrementally.
Line length limit: 88 characters; enforced by ruff.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly; never silently swallow exceptions.
Validate at system boundaries (user input, external APIs, config files).
Logger variable name must always be `logger` (not `_logger`, not `log`).
Always use structured logging with kwargs: `l...
Files:
tests/conftest.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`.
Prefer `@pytest.mark.parametrize` for testing similar cases in Python tests.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names in tests: `test-provider`, `test-small-001`, etc.
Use Hypothesis for property-based testing in Python with `@given` and `@settings` decorators. Hypothesis profiles: `ci` (50 examples, default) and `dev` (1000 examples). Run dev profile: `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties`.
Never skip, dismiss, or ignore flaky tests; always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic. For tasks that must block indefinitely until cancelled, use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)`.
Files:
tests/conftest.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/conftest.py
🧠 Learnings (23)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:58:29.857Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing: `given` + `settings`; dev profile has 1000 examples (via `HYPOTHESIS_PROFILE=dev env var`), ci profile (default) has 50
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing with `given` + `settings` decorators; control profiles via `HYPOTHESIS_PROFILE` env var (`ci` for 200 examples, `dev` for 1000 examples)
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing in Python with `given` and `settings` decorators. Hypothesis profiles: `ci` (50 examples, default) and `dev` (1000 examples). Run dev profile: `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties`.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Property-based testing: Python uses Hypothesis (given + settings). Hypothesis profiles: ci (200 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties.
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing in Python with `given` + `settings`. Control via `HYPOTHESIS_PROFILE` env var (dev: 1000 examples, ci: 200 examples).
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:58:29.857Z
Learning: Run property tests with `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties`
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Property-based testing: Python uses Hypothesis (given + settings). Hypothesis profiles: ci (200 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties.
Applied to files:
CLAUDE.md, tests/conftest.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing with `given` + `settings` decorators; control profiles via `HYPOTHESIS_PROFILE` env var (`ci` for 200 examples, `dev` for 1000 examples)
Applied to files:
CLAUDE.md, tests/conftest.py
📚 Learning: 2026-04-03T13:34:37.835Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing in Python with `given` and `settings` decorators. Hypothesis profiles: `ci` (50 examples, default) and `dev` (1000 examples). Run dev profile: `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties`.
Applied to files:
CLAUDE.md, tests/conftest.py
📚 Learning: 2026-03-31T16:58:29.857Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:58:29.857Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing: `given` + `settings`; dev profile has 1000 examples (via `HYPOTHESIS_PROFILE=dev env var`), ci profile (default) has 50
Applied to files:
CLAUDE.md, tests/conftest.py
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Run unit tests with `uv run python -m pytest tests/ -m unit -n auto`; integration tests with `-m integration -n auto`; e2e tests with `-m e2e -n auto`
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Run full pytest suite with coverage: `uv run python -m pytest tests/ -n auto --cov=synthorg --cov-fail-under=80`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-31T16:58:29.857Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:58:29.857Z
Learning: Run property tests with `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties`
Applied to files:
CLAUDE.md, tests/conftest.py
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Lint Python code with `uv run ruff check src/ tests/`; auto-fix with `--fix`; format with `uv run ruff format src/ tests/`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-31T14:31:11.894Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:31:11.894Z
Learning: Export OpenAPI schema with `uv run python scripts/export_openapi.py` (required before docs build)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Type-check Python code with `uv run mypy src/ tests/` (strict mode)
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-03T13:34:37.835Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests—always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic. For tasks that must block indefinitely until cancelled, use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src,tests}/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-03T13:34:37.835Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T13:34:37.835Z
Learning: Applies to tests/**/*.py : Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names in tests: `test-provider`, `test-small-001`, etc.
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx}} : NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small`
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T07:18:02.381Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T07:18:02.381Z
Learning: Applies to tests/**/*.py : Never skip, dismiss, or ignore flaky tests; fix them fully and fundamentally by mocking `time.monotonic()` and `asyncio.sleep()` for determinism or using `asyncio.Event().wait()` for indefinite blocking
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing in Python with `given` + `settings`. Control via `HYPOTHESIS_PROFILE` env var (dev: 1000 examples, ci: 200 examples).
Applied to files:
tests/conftest.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/conftest.py
🔇 Additional comments (3)
tests/conftest.py (2)
6-16: LGTM! The imports are well-organized and correctly import the necessary Hypothesis database primitives for the write-only database implementation.
19-45: LGTM! The `_WriteOnlyDatabase` implementation correctly wraps an `ExampleDatabase` to provide write-only semantics. The `fetch()` method returning an empty iterator ensures failures are captured but never replayed automatically, which aligns with the stated design goal.
CLAUDE.md (1)
40-41: LGTM! The quick commands accurately reflect the profile configurations defined in `tests/conftest.py`.
Pull request overview
Restructures Hypothesis configuration to make CI runs deterministic while enabling richer local fuzzing and persisting failing examples outside individual git worktrees.
Changes:
- Updates Hypothesis profiles: `ci` runs only `Phase.explicit`, adds `fuzz` profile, and wires `dev`/`fuzz` to a combined local+shared example database.
- Introduces a `_WriteOnlyDatabase` wrapper to record failing examples to `~/.synthorg/hypothesis-examples/` without replay.
- Updates developer documentation/commands for running property tests under the new profiles.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/conftest.py | Adds shared “write-only” Hypothesis example DB + new ci/dev/fuzz profiles. |
| CLAUDE.md | Documents the new Hypothesis profiles and recommended workflows/commands. |
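For orientation, the profile layout described in this overview might look roughly like the sketch below. The example counts, `deadline=None`, `Phase.explicit`, and the `too_slow` suppression come from the PR text; the database wiring is omitted, and the exact arguments in the real `tests/conftest.py` may differ, so treat this as an assumption-laden illustration rather than the PR's code.

```python
import os

from hypothesis import HealthCheck, Phase, settings

# ci: only explicit @example() cases run, so results are deterministic.
settings.register_profile(
    "ci",
    phases=[Phase.explicit],
    suppress_health_check=[HealthCheck.too_slow],
)

# dev: 1000 generated examples per property test.
settings.register_profile("dev", max_examples=1000)

# fuzz: 10,000 generated examples and no per-example deadline.
settings.register_profile("fuzz", max_examples=10_000, deadline=None)

# HYPOTHESIS_PROFILE selects the profile; ci is assumed to be the default.
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "ci"))
```

Because `@given` picks up the loaded profile automatically, individual tests need no `@settings` decorator to honour these values.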
tests/conftest.py
Outdated
    _SHARED_HYPOTHESIS_DIR = Path.home() / ".synthorg" / "hypothesis-examples"
    _SHARED_HYPOTHESIS_DIR.mkdir(parents=True, exist_ok=True)

    _local_db = DirectoryBasedExampleDatabase(".hypothesis/examples/")
    _shared_db = _WriteOnlyDatabase(
        DirectoryBasedExampleDatabase(str(_SHARED_HYPOTHESIS_DIR)),
    )
Creating ~/.synthorg/hypothesis-examples/ unconditionally at import time can break test collection in environments where $HOME is unset or not writable (e.g., sandboxed CI runners/containers) and also causes side-effectful writes even when running the default ci profile. Consider deferring directory creation until the selected profile is dev/fuzz, and/or catching OSError and falling back to a local-only database (or database=None) when the shared directory cannot be created.
tests/conftest.py
Outdated
    # CI only runs explicit @example() cases -- fully deterministic.
    # Random fuzzing happens locally (dev profile) where failures are
    # captured to the shared DB for analysis.
    phases=[Phase.explicit],
    suppress_health_check=[HealthCheck.too_slow],
With ci configured as phases=[Phase.explicit], Hypothesis will not run the generation/shrinking phases for @given tests. This repo currently appears to have many @given property tests but no Hypothesis @example(...) decorators, so CI will effectively stop exercising those invariants. If the goal is deterministic CI while keeping property coverage, consider using Hypothesis’ deterministic modes (e.g., derandomize=True and/or a fixed seed) instead of disabling generation entirely, or start adding explicit @example(...) cases to each property test you want covered in CI.
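The difference between the two approaches can be seen in a small sketch. The function names and the call-recording lists are illustrative only; `database=None` is set here simply to keep the example free of filesystem side effects.

```python
from hypothesis import Phase, example, given, settings
from hypothesis import strategies as st

explicit_only_calls: list[int] = []
derandomized_calls: list[int] = []


@settings(phases=[Phase.explicit], database=None)
@given(st.integers())
@example(0)
def explicit_only(n: int) -> None:
    # With only the explicit phase enabled, just the @example input runs.
    explicit_only_calls.append(n)


@settings(max_examples=50, derandomize=True, database=None)
@given(st.integers())
def derandomized(n: int) -> None:
    # Inputs are still generated, but from a seed derived from the test,
    # so repeated runs see the same sequence.
    derandomized_calls.append(n)


explicit_only()
derandomized()
```

Under the explicit-only settings a property test with no `@example(...)` decorator exercises nothing, which is the coverage gap the comment describes; `derandomize=True` keeps generation while making runs repeatable.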
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
    @@           Coverage Diff           @@
    ##             main    #1041   +/-   ##
    =======================================
      Coverage   91.48%   91.48%
    =======================================
      Files         687      687
      Lines       38392    38392
      Branches     3821     3821
    =======================================
      Hits        35124    35124
      Misses       2608     2608
      Partials      660      660
♻️ Duplicate comments (1)
tests/conftest.py (1)
55-61: ⚠️ Potential issue | 🟠 Major
Guard shared DB initialization to avoid import-time test-suite failures.
Line 55 and Line 56 execute filesystem resolution/creation at import time. If `Path.home()` resolution or `mkdir` fails (sandboxed CI/container/home perms), `conftest.py` import fails and the entire test session aborts.
Proposed resilient fallback
    -_SHARED_HYPOTHESIS_DIR = Path.home() / ".synthorg" / "hypothesis-examples"
    -_SHARED_HYPOTHESIS_DIR.mkdir(parents=True, exist_ok=True)
    -
    -_local_db = DirectoryBasedExampleDatabase(".hypothesis/examples/")
    -_shared_db = _WriteOnlyDatabase(
    -    DirectoryBasedExampleDatabase(str(_SHARED_HYPOTHESIS_DIR)),
    -)
    -
    -# Dev/local: local DB for Hypothesis replay + shared DB captures
    -# failures for analysis without replaying them.
    -_local_combined_db = MultiplexedDatabase(_local_db, _shared_db)
    +_local_db = DirectoryBasedExampleDatabase(".hypothesis/examples/")
    +try:
    +    _shared_hypothesis_dir = Path.home() / ".synthorg" / "hypothesis-examples"
    +    _shared_hypothesis_dir.mkdir(parents=True, exist_ok=True)
    +except OSError, RuntimeError:
    +    # Fall back to local-only DB when shared path is unavailable.
    +    _local_combined_db = _local_db
    +else:
    +    _shared_db = _WriteOnlyDatabase(
    +        DirectoryBasedExampleDatabase(str(_shared_hypothesis_dir)),
    +    )
    +    # Dev/local: local DB for Hypothesis replay + shared DB captures
    +    # failures for analysis without replaying them.
    +    _local_combined_db = MultiplexedDatabase(_local_db, _shared_db)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/conftest.py` around lines 55 - 61, Wrap the shared-dir resolution and creation in a try/except and avoid running Path.home()/.mkdir at import time: move the logic that sets _SHARED_HYPOTHESIS_DIR and constructs _shared_db into a small lazy initializer (e.g., get_shared_db()) or perform the operations inside a try block that catches OSError/PermissionError; on failure fall back to a safe alternative such as using tempfile.mkdtemp() or reusing the local DirectoryBasedExampleDatabase(".hypothesis/examples/") for _shared_db, and on success assign _shared_db = _WriteOnlyDatabase(DirectoryBasedExampleDatabase(str(_SHARED_HYPOTHESIS_DIR))). Ensure any exception is swallowed/handled so import of conftest.py never raises.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1d702712-86b4-4dfd-8cf9-200e9c8c568f
📒 Files selected for processing (2)
CLAUDE.md, tests/conftest.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: No `from __future__ import annotations`; Python 3.14 has PEP 649 native lazy annotations
Use PEP 758 except syntax: `except A, B:` (no parentheses); ruff enforces this on Python 3.14
Type hints: all public functions, mypy strict mode
Docstrings: Google style, required on public classes/functions (enforced by ruff D rules)
Line length: 88 characters (ruff)
Files:
tests/conftest.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Markers: `@pytest.mark.unit`, `@pytest.mark.integration`, `@pytest.mark.e2e`, `@pytest.mark.slow`
Async: `asyncio_mode = "auto"`; no manual `@pytest.mark.asyncio` needed
Parametrize: Prefer `@pytest.mark.parametrize` for testing similar cases
Property-based testing: Python uses Hypothesis (`@given` + `@settings`). Hypothesis profiles: `ci` (50 examples, default) and `dev` (1000 examples), controlled via `HYPOTHESIS_PROFILE` env var.
Flaky tests: NEVER skip, dismiss, or ignore flaky tests; always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic. For tasks that must block indefinitely until cancelled, use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)`.
Files:
tests/conftest.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare `@settings()` decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which `@given()` honors automatically.
Files:
tests/conftest.py
{src/synthorg,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names:
`example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases.
Files:
tests/conftest.py
🧠 Learnings (26)
📚 Learning: 2026-04-03T15:55:11.343Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T15:55:11.343Z
Learning: Applies to tests/**/*.py : Property-based testing: Python uses Hypothesis (`given` + `settings`). Hypothesis profiles: `ci` (50 examples, default) and `dev` (1000 examples), controlled via `HYPOTHESIS_PROFILE` env var.
Applied to files:
CLAUDE.md
tests/conftest.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Property-based testing: Python uses Hypothesis (given + settings). Hypothesis profiles: ci (200 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties.
Applied to files:
CLAUDE.md
tests/conftest.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing with `given` + `settings` decorators; control profiles via `HYPOTHESIS_PROFILE` env var (`ci` for 200 examples, `dev` for 1000 examples)
Applied to files:
CLAUDE.md
tests/conftest.py
📚 Learning: 2026-03-31T16:58:29.857Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:58:29.857Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing: `given` + `settings`; dev profile has 1000 examples (via `HYPOTHESIS_PROFILE=dev env var`), ci profile (default) has 50
Applied to files:
CLAUDE.md
tests/conftest.py
📚 Learning: 2026-03-31T16:58:29.857Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T16:58:29.857Z
Learning: Run property tests with `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties`
Applied to files:
CLAUDE.md
tests/conftest.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to tests/**/*.py : Use Hypothesis for property-based testing in Python with `given` + `settings`. Control via `HYPOTHESIS_PROFILE` env var (dev: 1000 examples, ci: 200 examples).
Applied to files:
CLAUDE.md
tests/conftest.py
📚 Learning: 2026-04-03T15:55:11.343Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T15:55:11.343Z
Learning: Applies to **/*.py : Line length: 88 characters (ruff)
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Run unit tests with `uv run python -m pytest tests/ -m unit -n auto`; integration tests with `-m integration -n auto`; e2e tests with `-m e2e -n auto`
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Run full pytest suite with coverage: `uv run python -m pytest tests/ -n auto --cov=synthorg --cov-fail-under=80`
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Lint Python code with `uv run ruff check src/ tests/`; auto-fix with `--fix`; format with `uv run ruff format src/ tests/`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-31T14:31:11.894Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:31:11.894Z
Learning: Export OpenAPI schema with `uv run python scripts/export_openapi.py` (required before docs build)
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to tests/**/*.py : Test markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`. Coverage: 80% minimum. Async: `asyncio_mode = 'auto'` — no manual `pytest.mark.asyncio` needed. Timeout: 30 seconds per test. Parallelism: `pytest-xdist` via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-02T08:47:46.313Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T08:47:46.313Z
Learning: Type-check Python code with `uv run mypy src/ tests/` (strict mode)
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-03T15:55:11.343Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T15:55:11.343Z
Learning: Applies to {src/synthorg,tests}/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-31T21:07:37.470Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T21:07:37.470Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-03T15:55:11.343Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T15:55:11.343Z
Learning: Parallelism: `pytest-xdist` via `-n 8` — ALWAYS include `-n 8` when running pytest locally, never run tests sequentially.
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-03T15:55:11.343Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-03T15:55:11.343Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic. For tasks that must block indefinitely until cancelled, use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)`.
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Parallelism: pytest-xdist via `-n auto` — ALWAYS include `-n auto` when running pytest, never run tests sequentially.
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src,tests}/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases
Applied to files:
CLAUDE.md
📚 Learning: 2026-04-01T15:36:39.993Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:36:39.993Z
Learning: Applies to {src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx}} : NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small`
Applied to files:
CLAUDE.md
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins
Applied to files:
tests/conftest.py
📚 Learning: 2026-03-31T14:28:28.895Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:28:28.895Z
Learning: Applies to src/synthorg/**/*.py : Maintain 80% minimum test coverage (enforced in CI)
Applied to files:
tests/conftest.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).
Applied to files:
tests/conftest.py
🔇 Additional comments (3)
tests/conftest.py (1)
19-47: Write-only Hypothesis DB wrapper behavior looks correct.
`save()`/`move()` preserve discovered failures, while `fetch()`/`delete()` prevent automatic replay from the shared store. This matches the stated failure-log-only design.
CLAUDE.md (2)
40-41: Hypothesis run-command additions are clear and actionable.
The new `dev` and `fuzz` command variants are documented with the right intent and expected scope.
174-175: Profile/workflow documentation matches the new deterministic CI + shared failure-log model.
The explanation of deterministic CI, local fuzzing modes, and explicit `@example(...)` promotion is coherent.
- Add write-only shared example DB at ~/.synthorg/hypothesis-examples/ that captures failing examples across all worktrees (survives worktree deletion) without replaying them automatically
- CI profile uses phases=[Phase.explicit] -- only runs @example() cases, fully deterministic and reproducible
- Dev profile (1000 examples) for local random fuzzing
- New fuzz profile (10,000 examples, no deadline) for dedicated deep fuzzing sessions
- All failures from dev/fuzz runs are captured to the shared DB for later analysis and conversion to @example() decorators
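As a rough illustration of the profile layout this commit describes, a `conftest.py` could register the three profiles along the following lines. The profile names, the example counts, and `phases=[Phase.explicit]` come from the commit message, and the `ci` default comes from the project guidelines quoted in the review; the exact keyword arguments are assumptions, not the PR's actual code.

```python
import os

from hypothesis import Phase, settings

# "ci": run only explicit @example() cases, so nothing is randomly generated.
settings.register_profile("ci", phases=[Phase.explicit])

# "dev": local random fuzzing with 1000 examples per test.
settings.register_profile("dev", max_examples=1000)

# "fuzz": deep fuzzing sessions, 10,000 examples and no per-example deadline.
settings.register_profile("fuzz", max_examples=10_000, deadline=None)

# Select the active profile from the environment, defaulting to "ci".
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "ci"))
```

With this layout, `HYPOTHESIS_PROFILE=dev` or `HYPOTHESIS_PROFILE=fuzz` switches profiles without touching any test code.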
- Add super().__init__() to _WriteOnlyDatabase (fixes _listeners crash)
- CI profile: max_examples=10 + derandomize=True instead of phases=[Phase.explicit] (deterministic AND still exercises properties)
- _WriteOnlyDatabase.delete is now a no-op (shared DB is a failure log)
- _WriteOnlyDatabase.move saves to dest instead of delegating (preserves entries in the failure log)
- Update CLAUDE.md to match new CI profile behavior
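The wrapper behavior listed in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's `_WriteOnlyDatabase`: the class name `WriteOnlyDatabase`, the `wrapped` parameter, and the use of `InMemoryExampleDatabase` as a stand-in backing store are all assumptions made for the example.

```python
from collections.abc import Iterable

from hypothesis.database import ExampleDatabase, InMemoryExampleDatabase


class WriteOnlyDatabase(ExampleDatabase):
    """Hypothetical wrapper: records failures but never replays them."""

    def __init__(self, wrapped: ExampleDatabase) -> None:
        # Call the base initializer so any state it sets up exists.
        super().__init__()
        self._wrapped = wrapped

    def save(self, key: bytes, value: bytes) -> None:
        # Writes go straight through to the backing store.
        self._wrapped.save(key, value)

    def fetch(self, key: bytes) -> Iterable[bytes]:
        # Always empty, so stored failures are never replayed automatically.
        return ()

    def delete(self, key: bytes, value: bytes) -> None:
        # No-op: the backing store is treated as an append-only failure log.
        return None

    def move(self, src: bytes, dest: bytes, value: bytes) -> None:
        # Save under the new key without removing the entry under the old one.
        self._wrapped.save(dest, value)


# Usage: writes reach the backing store, reads through the wrapper see nothing.
backing = InMemoryExampleDatabase()
log = WriteOnlyDatabase(backing)
log.save(b"test-key", b"failing-example")
log.delete(b"test-key", b"failing-example")  # entry is kept
```

Passing an instance like this as the `database=` setting of a profile would let Hypothesis record failures to the backing store while starting every run without replayed examples.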
…HOME
Fall back to local-only DB if Path.home() or mkdir fails (e.g., containerized CI with read-only filesystem). Ensures conftest.py import never crashes.
c2cb43a to 62d285b
🤖 I have created a release *beep* *boop*
---
## [0.6.0](v0.5.9...v0.6.0) (2026-04-03)
### Features
* dashboard UI for ceremony policy settings ([#1038](#1038)) ([865554c](865554c)), closes [#979](#979)
* implement tool-based memory retrieval injection strategy ([#1039](#1039)) ([329270e](329270e)), closes [#207](#207)
* local model management for Ollama and LM Studio ([#1037](#1037)) ([e1b14d3](e1b14d3)), closes [#1030](#1030)
* workflow execution -- instantiate tasks from WorkflowDefinition ([#1040](#1040)) ([e9235e3](e9235e3)), closes [#1004](#1004)
### Maintenance
* shared Hypothesis failure DB + deterministic CI profile ([#1041](#1041)) ([901ae92](901ae92))
---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Summary
Restructure Hypothesis property-based testing to be deterministic in CI and capture failures persistently across worktrees.
Changes
- CI profile (`phases=[Phase.explicit]`): only runs explicit `@example()` cases -- fully deterministic and reproducible. No random generation in CI.
- `~/.synthorg/hypothesis-examples/`: captures every failing example from dev/fuzz runs to a central location outside any worktree. Failures are logged for analysis but never replayed automatically (avoids blocking all test runs until fixed).
- `_WriteOnlyDatabase` wrapper: custom `ExampleDatabase` subclass that writes to the shared DB but returns empty on `fetch()`.
Motivation
Previously, Hypothesis ran random generation in CI (non-deterministic, caused spurious failures) and the local `.hypothesis/` example database was inside each worktree (lost on worktree deletion). Now:
- `@example()` decorators
Test plan
- `uv run python -m pytest tests/ -m unit -n 8 -k properties` -- 66 passed (explicit examples), 89 skipped (no random gen in CI profile)
- `HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties` -- 155 passed (full random fuzzing)
- `uv run mypy src/ tests/` -- clean