Skip to content

feat: log aggregation and shipping (syslog, HTTP, compression)#964

Merged
Aureliolo merged 11 commits intomainfrom
feat/log-aggregation
Apr 1, 2026
Merged

feat: log aggregation and shipping (syslog, HTTP, compression)#964
Aureliolo merged 11 commits intomainfrom
feat/log-aggregation

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Add SinkType.SYSLOG for shipping structured JSON to configurable syslog endpoints (UDP/TCP) with facility and protocol selection
  • Add SinkType.HTTP for batched JSON POST to HTTP endpoints with background daemon thread flushing, retries, and configurable batch size/interval
  • Add compressed archival of rotated log files via _CompressingRotatingFileHandler with gzip-aware rotation chain and atomic temp file writes
  • Add SyslogFacility (13 members) and SyslogProtocol (TCP/UDP) enums
  • Extend SinkConfig with syslog and HTTP fields, 4-branch match validator with cross-type field rejection
  • Extend RotationConfig with compress_rotated field (default false, backward compatible)
  • Extend sink_config_builder.py for runtime SYSLOG/HTTP custom sink configuration via sink_type dispatch
  • Add LogConfig endpoint uniqueness validators (syslog host:port, HTTP URL)
  • Add events/shipping.py with 11 shipping/compression event constants
  • Add centralized logging deployment guide (docs/guides/centralized-logging.md) covering syslog, HTTP, Docker logging drivers, and compression patterns
  • Update docs/design/operations.md sink layout table, rotation/compression section, and runtime settings descriptions

Test plan

  • 461 observability unit tests pass (39 new tests for syslog handler, HTTP handler, compression, config, enums, events, sink config builder)
  • 11769 total tests pass (uv run python -m pytest tests/ -n auto)
  • uv run ruff check src/ tests/ -- clean
  • uv run mypy src/ tests/ -- clean
  • All pre-commit and pre-push hooks pass
  • Verify syslog shipping against a local rsyslog/syslog-ng receiver
  • Verify HTTP shipping against a local Loki or generic JSON endpoint
  • Verify compressed rotation produces valid gzip files in Docker deployment

Review coverage

Pre-reviewed by 4 agents (code-reviewer, docs-consistency, silent-failure-hunter, issue-resolution-verifier). 14 findings addressed:

  • Added exception handling to HTTP flusher thread (prevents silent death on unexpected errors)
  • Added dropped-record counter for HTTP batch handler observability
  • Fixed pending count race condition in HTTP handler (decrement by drained count, not reset to zero)
  • Added stderr warnings on compression failure (was silently suppressed)
  • Added fallback to uncompressed rotation on doRollover file operation failure
  • Used atomic temp file for gzip compression (prevents corrupt backups on partial write)
  • Added fail-loud validation in build_http_handler/build_syslog_handler for empty URLs/hosts
  • Added compress_rotated to _ROTATION_FIELDS allowlist and parser (was silently rejected by runtime settings)
  • Added default case to build_handler match statement (prevents UnboundLocalError)
  • Updated docs: event count 54->55, custom_sinks/sink_overrides descriptions, CLAUDE.md package structure and logging exception list

Closes #566

Copilot AI review requested due to automatic review settings April 1, 2026 12:14
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 1, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e24ae572-3bd2-4d26-9a8a-a606a90341b4

📥 Commits

Reviewing files that changed from the base of the PR and between d3ab350 and 22d6166.

📒 Files selected for processing (2)
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py

Walkthrough

Adds centralized log shipping and compressed archival. Introduces SinkType.SYSLOG and SinkType.HTTP plus SyslogFacility and SyslogProtocol enums. Extends configuration with rotation.compress_rotated and per-sink syslog/http fields and validators (including duplicate syslog endpoint and HTTP URL checks). Adds shipping event constants, syslog and HTTP handler modules (HTTP batching, retries, flusher thread), a compressing rotating-file handler, custom-sink parsing for syslog/http, a centralized-logging guide and mkdocs navigation entry, and numerous unit tests and fixtures.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.41% which is insufficient. The required threshold is 40.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: addition of log aggregation and shipping features (syslog, HTTP, compression).
Description check ✅ Passed The description is well-structured and directly related to the changeset, covering all major features including syslog, HTTP shipping, compression, and documentation updates.
Linked Issues check ✅ Passed All objectives from issue #566 are met: SinkType.SYSLOG added with UDP/TCP and facility selection [#566], SinkType.HTTP added with batching and retries [#566], compressed archival via compress_rotated implemented [#566], and centralized logging guide documented [#566].
Out of Scope Changes check ✅ Passed All changes are directly aligned with the linked issue objectives: syslog/HTTP sink implementation, compression support, enums, config extensions, tests, and documentation are all in scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 1, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 22d6166.
Ensure that dependencies are being submitted on PR branches. Re-running this action after a short time may resolve the issue. See the documentation for more information and troubleshooting advice.

Scanned Files

None

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces centralized log shipping via Syslog and HTTP sinks, alongside automatic gzip compression for rotated log files. The changes include new handlers, configuration updates, and comprehensive documentation. Feedback identifies a double-encoding bug in the HTTP handler that results in an array of strings instead of objects, and a documentation error in the centralized logging guide regarding duplicate YAML anchors.

if not entries:
return

body = json.dumps(entries).encode("utf-8")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of _post_batch double-encodes the log records. self.format(record) already produces a JSON string for each log record. Calling json.dumps(entries) on a list of these strings will create a JSON array of JSON-encoded strings (e.g., ["{\"event\":...}", "{\"event\":...}"]), which is likely not what log aggregation endpoints expect. They typically expect a JSON array of JSON objects (e.g., [{"event":...}, {"event":...}]).

To fix this, you should join the JSON strings with commas and wrap them in brackets to form a valid JSON array string.

Suggested change
body = json.dumps(entries).encode("utf-8")
body = f"[{','.join(entries)}]".encode("utf-8")

Comment on lines +209 to +222
x-logging: &logging
driver: fluentd
options:
fluentd-address: "fluentd:24224"
tag: "synthorg.{{.Name}}"

# Or for syslog:
x-logging: &logging
driver: syslog
options:
syslog-address: "udp://syslog.internal:514"
syslog-facility: "local0"
tag: "synthorg"
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This YAML example is invalid because the x-logging key and &logging anchor are defined twice. This will cause parsing errors or unexpected behavior if copied directly. To show these as alternatives for the same block, please comment out the second one.

Suggested change
x-logging: &logging
driver: fluentd
options:
fluentd-address: "fluentd:24224"
tag: "synthorg.{{.Name}}"
# Or for syslog:
x-logging: &logging
driver: syslog
options:
syslog-address: "udp://syslog.internal:514"
syslog-facility: "local0"
tag: "synthorg"
```
x-logging: &logging
driver: fluentd
options:
fluentd-address: "fluentd:24224"
tag: "synthorg.{{.Name}}"
# Or for syslog:
# x-logging: &logging
# driver: syslog
# options:
# syslog-address: "udp://syslog.internal:514"
# syslog-facility: "local0"
# tag: "synthorg"

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds centralized log shipping capabilities (syslog + HTTP) and optional gzip compression for rotated log backups, extending the observability configuration surface and documentation to support production log aggregation.

Changes:

  • Introduces new sink types (SYSLOG, HTTP) with new enums (SyslogFacility, SyslogProtocol) and config validation/uniqueness checks.
  • Implements new handlers for syslog + batched HTTP shipping, plus a gzip-aware rotating file handler for compressed archival.
  • Updates runtime sink config builder, tests, and docs (deployment guide + design spec updates).

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/synthorg/observability/http_handler.py New batched HTTP shipping handler with background flush thread and retries.
src/synthorg/observability/syslog_handler.py New syslog handler builder for structured JSON shipping via SysLogHandler.
src/synthorg/observability/sinks.py Adds _CompressingRotatingFileHandler and extends handler dispatch for SYSLOG/HTTP.
src/synthorg/observability/config.py Extends SinkConfig/RotationConfig, adds endpoint uniqueness validators.
src/synthorg/observability/sink_config_builder.py Extends runtime custom_sinks parsing with sink_type dispatch + new fields.
src/synthorg/observability/enums.py Adds SYSLOG/HTTP sink types and syslog facility/protocol enums.
src/synthorg/observability/events/shipping.py Adds shipping/compression event constants module.
src/synthorg/observability/__init__.py Exposes new enums at package top-level.
src/synthorg/settings/definitions/observability.py Updates runtime setting description for expanded custom_sinks schema.
tests/unit/observability/test_http_handler.py New unit tests for HTTP handler + builder behavior.
tests/unit/observability/test_syslog_handler.py New unit tests for syslog handler + mapping coverage.
tests/unit/observability/test_compression.py New unit tests for gzip-compressing rotation + build integration.
tests/unit/observability/test_sink_config_builder.py Adds tests for custom syslog/http sinks + sink_type dispatch.
tests/unit/observability/test_config.py Adds tests for new config fields, validation, and endpoint uniqueness.
tests/unit/observability/test_enums.py Adds tests for new enums and expanded sink types.
tests/unit/observability/test_events.py Adds discovery/assertions for new events.shipping constants.
tests/unit/observability/conftest.py Updates factories for new SinkConfig fields used in tests.
docs/guides/centralized-logging.md New guide describing syslog/HTTP shipping, Docker drivers, compression patterns.
docs/design/operations.md Updates sink layout + rotation/compression + event taxonomy count.
mkdocs.yml Adds centralized logging guide to the docs nav.
CLAUDE.md Updates repo component inventory + logging exceptions list for new observability modules.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +101 to +114
for i in range(self.backupCount - 1, 0, -1):
src = StdPath(f"{self.baseFilename}.{i}.gz")
dst = StdPath(f"{self.baseFilename}.{i + 1}.gz")
if src.exists():
if dst.exists():
dst.unlink()
src.rename(dst)

# Remove the oldest if it exceeds backupCount
oldest = StdPath(
f"{self.baseFilename}.{self.backupCount}.gz",
)
if oldest.exists():
oldest.unlink()
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_shift_gz_backups() unlinks ${baseFilename}.{backupCount}.gz after shifting backups. Since the loop already renames .N.gz to .N+1.gz (and deletes the previous destination), this extra unlink will delete the newest {backupCount}.gz you just created and effectively keep only backupCount-1 backups. Delete the oldest before shifting, or remove this post-shift unlink step (or target {backupCount+1}.gz).

Copilot uses AI. Check for mistakes.
Comment on lines +133 to +135
data = src.read_bytes()
with gzip.open(tmp_gz, "wb") as f:
f.write(data)
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_compress_file() reads the entire rotated log into memory via read_bytes() before writing to gzip. This is unnecessary and can spike memory usage for large rotations; stream from the source file into gzip in chunks instead.

Suggested change
data = src.read_bytes()
with gzip.open(tmp_gz, "wb") as f:
f.write(data)
with src.open("rb") as src_f, gzip.open(tmp_gz, "wb") as gz_f:
for chunk in iter(lambda: src_f.read(1024 * 1024), b""):
gz_f.write(chunk)

Copilot uses AI. Check for mistakes.
Comment on lines +109 to +111
if records:
self._post_batch(records)

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_drain_and_flush() drains the entire queue and sends it as a single POST, so batch_size only affects when a flush happens, not the maximum request size. This can produce very large HTTP payloads and defeats the purpose of batching; drain/post in chunks of at most self._batch_size until the queue is empty.

Suggested change
if records:
self._post_batch(records)
if not records:
return
# Send drained records in chunks of at most self._batch_size to
# ensure we never create excessively large HTTP payloads.
for start in range(0, len(records), self._batch_size):
batch = records[start : start + self._batch_size]
if batch:
self._post_batch(batch)

Copilot uses AI. Check for mistakes.
Comment on lines +112 to +125
def _post_batch(self, records: list[logging.LogRecord]) -> None:
"""POST a batch of records as a JSON array with retries."""
entries: list[str] = []
for record in records:
try:
entries.append(self.format(record))
except Exception:
self.handleError(record)

if not entries:
return

body = json.dumps(entries).encode("utf-8")
request = urllib.request.Request( # noqa: S310
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_post_batch() builds entries from self.format(record) (a JSON string when using ProcessorFormatter+JSONRenderer) and then json.dumps() the list, resulting in a JSON array of JSON-encoded strings (double encoding) rather than an array of objects. Convert the formatted JSON strings back to dicts (json.loads) before dumping, or bypass JSONRenderer and build dict payloads directly.

Copilot uses AI. Check for mistakes.
Comment on lines +137 to +140
urllib.request.urlopen( # noqa: S310
request,
timeout=self._timeout,
)
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urllib.request.urlopen() returns a response object that should be closed; currently the response is ignored, which can leak sockets/file descriptors under load. Use a context manager (or contextlib.closing) around urlopen().

Suggested change
urllib.request.urlopen( # noqa: S310
request,
timeout=self._timeout,
)
with urllib.request.urlopen( # noqa: S310
request,
timeout=self._timeout,
) as _response:
# Response body is ignored; context manager ensures the connection is closed.
pass

Copilot uses AI. Check for mistakes.
Comment on lines +154 to +232
def _validate_sink_type_fields(self) -> Self:
"""Enforce required/rejected fields per sink type."""
match self.sink_type:
case SinkType.FILE:
self._validate_file_fields()
case SinkType.CONSOLE:
self._reject_file_fields("CONSOLE")
self._reject_syslog_fields("CONSOLE")
self._reject_http_fields("CONSOLE")
case SinkType.SYSLOG:
self._reject_file_fields("SYSLOG")
self._validate_syslog_fields()
self._reject_http_fields("SYSLOG")
case SinkType.HTTP:
self._reject_file_fields("HTTP")
self._reject_syslog_fields("HTTP")
self._validate_http_fields()
return self

def _validate_file_fields(self) -> None:
if self.file_path is None:
msg = "file_path is required for FILE sinks"
raise ValueError(msg)
if not self.file_path.strip():
msg = "file_path must not be empty or whitespace-only"
raise ValueError(msg)
path = PurePath(self.file_path)
if (
path.is_absolute()
or PurePosixPath(self.file_path).is_absolute()
or PureWindowsPath(self.file_path).is_absolute()
):
msg = f"file_path must be relative: {self.file_path}"
raise ValueError(msg)
if ".." in path.parts:
msg = f"file_path must not contain '..' components: {self.file_path}"
raise ValueError(msg)
self._reject_syslog_fields("FILE")
self._reject_http_fields("FILE")

def _reject_file_fields(self, sink_label: str) -> None:
if self.file_path is not None:
msg = f"file_path must be None for {sink_label} sinks"
raise ValueError(msg)
if self.rotation is not None:
msg = f"rotation must be None for {sink_label} sinks"
raise ValueError(msg)

def _validate_syslog_fields(self) -> None:
if self.syslog_host is None:
msg = "syslog_host is required for SYSLOG sinks"
raise ValueError(msg)
if not self.syslog_host.strip():
msg = "syslog_host must not be blank"
raise ValueError(msg)

def _reject_syslog_fields(self, sink_label: str) -> None:
if self.syslog_host is not None:
msg = f"syslog_host must be None for {sink_label} sinks"
raise ValueError(msg)

def _validate_http_fields(self) -> None:
if self.http_url is None:
msg = "http_url is required for HTTP sinks"
raise ValueError(msg)
if not self.http_url.strip():
msg = "http_url must not be blank"
raise ValueError(msg)
if not (
self.http_url.startswith("http://") or self.http_url.startswith("https://")
):
msg = "http_url must start with http:// or https://"
raise ValueError(msg)

def _reject_http_fields(self, sink_label: str) -> None:
if self.http_url is not None:
msg = f"http_url must be None for {sink_label} sinks"
raise ValueError(msg)

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SinkConfig's cross-type rejection only checks syslog_host / http_url. This allows non-SYSLOG sinks to accept SYSLOG-only fields (syslog_port/facility/protocol) and non-HTTP sinks to accept HTTP-only fields (http_headers/batch_size/etc) without error, which is misleading because those values are ignored. Consider rejecting any non-default values for those fields when sink_type doesn’t match.

Copilot uses AI. Check for mistakes.
Comment on lines +61 to +132
def test_emit_queues_record(
self,
handler_cleanup: list[logging.Handler],
) -> None:
handler = _make_handler()
handler_cleanup.append(handler)

with patch("urllib.request.urlopen"):
handler.emit(_make_record())
# Record is queued (not yet flushed since batch_size=5)
assert handler._queue.qsize() >= 0 # Just verify no crash

def test_batch_flushed_on_batch_size(
self,
handler_cleanup: list[logging.Handler],
) -> None:
handler = _make_handler(batch_size=3)
handler_cleanup.append(handler)

with patch("urllib.request.urlopen") as mock_urlopen:
for _ in range(3):
handler.emit(_make_record())
# Give the flusher thread time to process
time.sleep(0.5)

# Should have been called at least once with a batch
assert mock_urlopen.call_count >= 1

def test_flush_on_close(
self,
handler_cleanup: list[logging.Handler],
) -> None:
handler = _make_handler(batch_size=100)
# Don't add to cleanup since we're closing manually

with patch("urllib.request.urlopen") as mock_urlopen:
handler.emit(_make_record("close-test"))
handler.close()

# close() should flush remaining records
assert mock_urlopen.call_count >= 1

def test_flush_sends_json_array(
self,
handler_cleanup: list[logging.Handler],
) -> None:
handler = _make_handler(batch_size=100)

with patch("urllib.request.urlopen") as mock_urlopen:
handler.emit(_make_record("json-test"))
handler.close()

if mock_urlopen.call_count > 0:
request = mock_urlopen.call_args[0][0]
body = json.loads(request.data.decode("utf-8"))
assert isinstance(body, list)
assert len(body) >= 1

def test_timeout_applied(
self,
handler_cleanup: list[logging.Handler],
) -> None:
handler = _make_handler(batch_size=100, timeout=7.5)

with patch("urllib.request.urlopen") as mock_urlopen:
handler.emit(_make_record())
handler.close()

if mock_urlopen.call_count > 0:
call_kwargs = mock_urlopen.call_args
assert call_kwargs[1].get("timeout") == 7.5

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several assertions are effectively no-ops and can mask real failures: e.g., assert handler._queue.qsize() >= 0 is always true, and multiple tests only assert request details inside if mock_urlopen.call_count > 0, so they pass even if no HTTP call was made. Make these assertions unconditional (and ideally assert exact call counts) so the tests fail when flushing/posting regresses.

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +88
def test_batch_flushed_on_batch_size(
self,
handler_cleanup: list[logging.Handler],
) -> None:
handler = _make_handler(batch_size=3)
handler_cleanup.append(handler)

with patch("urllib.request.urlopen") as mock_urlopen:
for _ in range(3):
handler.emit(_make_record())
# Give the flusher thread time to process
time.sleep(0.5)

# Should have been called at least once with a batch
assert mock_urlopen.call_count >= 1

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_batch_flushed_on_batch_size relies on time.sleep(0.5) to wait for the background flusher, which is prone to flakiness under CI load. Prefer synchronizing on a condition (e.g., poll/wait until mock_urlopen.call_count increases with a timeout, or expose a flush hook for tests).

Copilot uses AI. Check for mistakes.
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 82.06186% with 87 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.86%. Comparing base (3dc515f) to head (22d6166).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/synthorg/observability/config.py 76.38% 22 Missing and 12 partials ⚠️
src/synthorg/observability/sink_config_builder.py 77.50% 12 Missing and 6 partials ⚠️
src/synthorg/observability/http_handler.py 82.65% 12 Missing and 5 partials ⚠️
src/synthorg/observability/sinks.py 80.95% 6 Missing and 6 partials ⚠️
...c/synthorg/observability/_shipping_sink_parsers.py 85.71% 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #964      +/-   ##
==========================================
- Coverage   92.01%   91.86%   -0.15%     
==========================================
  Files         622      626       +4     
  Lines       33544    33996     +452     
  Branches     3239     3315      +76     
==========================================
+ Hits        30865    31230     +365     
- Misses       2123     2179      +56     
- Partials      556      587      +31     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/guides/centralized-logging.md`:
- Around line 69-91: The fenced code blocks for the rsyslog and syslog-ng
examples are unlabeled and causing markdownlint-cli2 warnings; update the first
block (the one starting with "# /etc/rsyslog.d/50-synthorg.conf" containing
module(load="imudp") and template "synthorg-json") to use a language tag like
```conf and update the second block (the "Receiver Configuration (syslog-ng)"
block with source s_synthorg / destination d_synthorg / filter f_synthorg) to
use the same or an appropriate tag (e.g., ```conf or ```text) so the linter
recognizes them. Ensure both triple-backtick openings include the language
identifier and leave the content unchanged.
- Around line 154-171: The examples for Grafana Loki and Elasticsearch are
incorrect because they show sending a raw JSON array to endpoints that require
transformed payloads; update the "Grafana Loki" and "Elasticsearch" examples
(the blocks referencing "/loki/api/v1/push" and "/_bulk") to either (A) replace
the shown JSON payload with a brief example or description of the required
adapter/transformation (Loki: streams array with values containing
[timestamp_ns, log_line]; Elasticsearch: NDJSON with action metadata + document
pairs) or (B) remove these endpoint examples entirely and instead link to
collectors/endpoints that accept the documented JSON-array contract; ensure the
doc text explicitly states that the HTTP sink must be adapted for those vendor
endpoints and show the transformation approach or reference an adapter snippet.

In `@src/synthorg/observability/config.py`:
- Around line 153-170: The validator _validate_sink_type_fields currently only
rejects a subset of cross-type fields; update the logic so any sink type rejects
ALL explicit fields belonging to other transport types by expanding the reject
helpers (or their usage) to cover every syslog_* and every http_* attribute.
Specifically, ensure _reject_syslog_fields checks syslog_host, syslog_port,
syslog_facility, syslog_protocol (and any other syslog_* attributes) and
_reject_http_fields checks http_url, http_headers, http_batch_size,
http_timeout_seconds (and any other http_* attributes), and call these full
rejectors for FILE and CONSOLE cases (and correspondingly for SYSLOG/HTTP to
reject file/http or file/syslog fields) inside _validate_sink_type_fields so no
foreign transport fields are silently accepted.

In `@src/synthorg/observability/http_handler.py`:
- Around line 62-67: Replace all direct print(..., file=sys.stderr) usages in
this module with the observability logger obtained via get_logger(__name__) and
emit the appropriate domain events (SHIPPING_HTTP_* constants) instead of
printing; specifically update code paths in the flusher lifecycle and retry/drop
handlers (look for self._flusher, _flush_loop, start/stop logic and the
error/retry/drop blocks around the former print locations) to call
logger.info()/logger.warn()/logger.error() with a clear message and include the
relevant SHIPPING_HTTP_* constant and any exception/context details, ensuring
all state transitions and errors are sent through the observability pipeline
rather than stdout/stderr.
- Around line 112-125: In _post_batch, you're currently building entries as
list[str] by appending self.format(record), which produces a JSON-encoded string
and results in a JSON array of strings; change entries to collect real JSON
objects (list[dict]) instead: for each record, call self.format(record) and if
it returns a string, json.loads(...) it into a dict (handle exceptions with
self.handleError(record)), or if it already returns a dict, append it directly;
then json.dumps(entries).encode("utf-8") will produce a proper array of JSON
objects for the POST. Ensure you update the entries variable type and error
handling in the _post_batch method accordingly.
- Around line 135-146: The loop that calls urllib.request.urlopen(request,
timeout=self._timeout) leaks file descriptors because the response object isn’t
closed on success and HTTPError instances (which wrap a file pointer) aren’t
closed before retrying; wrap the successful urlopen call in a context manager
(with urllib.request.urlopen(...) as resp:) so the response is closed on
success, and in the except block if exc is an instance of urllib.error.HTTPError
explicitly call exc.close() (or exc.fp.close()) before assigning last_error and
continuing; keep the retry logic using self._max_retries and self._timeout and
ensure last_error still captures the exception for the final failure path.

In `@src/synthorg/observability/sink_config_builder.py`:
- Around line 62-84: Remove "json_format" from the allowed fields for shipping
sinks and add an explicit validation in _parse_common_sink_fields to reject any
provided json_format for those sink types; specifically, remove "json_format"
from _CUSTOM_SYSLOG_SINK_FIELDS and _CUSTOM_HTTP_SINK_FIELDS and update
_parse_common_sink_fields to check if the incoming sink dict has "json_format"
when sink_type is "syslog" or "http" and raise a clear ValueError (or
configuration error) stating that json_format is not allowed for shipping sinks.

In `@src/synthorg/observability/sinks.py`:
- Around line 116-145: The compression routine _compress_file currently swallows
OSError after _rotate_current_log has already created the .1 file, which can
cause the rotated log to be overwritten and lost; fix this by ensuring
compression failures propagate so rotation/retention logic can handle them: in
_compress_file (which uses StdPath src, tmp_gz, gz and calls
gzip.open/tmp_gz.rename/src.unlink) catch OSError only to perform best-effort
cleanup (remove tmp_gz if present) and then re-raise the exception (or raise a
new exception) instead of returning silently; alternatively, perform compression
before calling _rotate_current_log or adjust _rotate_current_log/rotate usage so
that a failed compression does not remove the original rotated file. Ensure you
reference and update _compress_file, _rotate_current_log, rotate, and
rotation_filename accordingly.
- Around line 99-114: The code in _shift_gz_backups shifts .{i}.gz -> .{i+1}.gz
up to i = backupCount - 1, so after the loop the file to remove (if any) is
.{backupCount + 1}.gz; currently it removes .{backupCount}.gz which deletes one
backup too early. Update the removal to target
StdPath(f"{self.baseFilename}.{self.backupCount + 1}.gz") (or simply drop the
final unlink and only remove files > backupCount if you prefer) so the method
preserves exactly backupCount compressed generations; references:
_shift_gz_backups, self.baseFilename, self.backupCount, StdPath.

In `@tests/unit/observability/test_http_handler.py`:
- Around line 113-117: Replace the permissive guards that skip assertions when
no HTTP call happened by asserting the call occurred before inspecting
arguments: change the conditional `if mock_urlopen.call_count > 0:` to `assert
mock_urlopen.call_count > 0` and then continue to extract `request =
mock_urlopen.call_args[0][0]` and assert on `body` as before (do the same
replacement for the analogous checks around call sites currently at lines
129-131 and 150-153 in this test file). This ensures the test fails if no
request was sent while preserving the existing inspection of `request.data` and
`body`.
- Around line 80-84: Replace the time.sleep-based synchronization with an
explicit threading.Event: create an Event before patching
urllib.request.urlopen, have your mock_urlopen side_effect/setter set that Event
when it's invoked by the flusher, call handler.emit(_make_record()) as before,
then wait on the Event with a reasonable timeout (e.g., event.wait(1)) instead
of time.sleep(0.5); this ensures the test waits until the mocked urlopen (inside
the flusher thread) runs and avoids the flakiness described. Ensure you
reference the patched "urllib.request.urlopen", the mock_urlopen's side_effect,
the handler.emit calls, and the Event wait to locate the change.

In `@tests/unit/observability/test_sink_config_builder.py`:
- Around line 672-695: The test test_http_custom_settings hardcodes the hostname
"loki" in the custom JSON (the http_url value) which makes the unit test
vendor-specific; update the custom JSON used in test_http_custom_settings so
http_url uses a generic example hostname (e.g., "http://example:3100/push" or
"http://localhost:3100/push") instead of "http://loki:3100/push" so the
assertion logic around http_* fields and level still runs unchanged; locate the
test by the function name test_http_custom_settings and the local variable
custom to apply the change.

In `@tests/unit/observability/test_syslog_handler.py`:
- Around line 165-191: The test test_emit_formats_as_json currently only wraps
handler.emit which doesn't verify the actual serialized payload; change it to
assert the JSON output by either (a) patching the underlying socket send method
(e.g., sendto or sendall on the sink returned by _syslog_sink) and capturing the
bytes sent when calling handler.emit(record), or (b) calling
handler.format(record) and asserting the returned string/bytes is valid JSON and
contains expected fields (logger name, level, timestamp, message) produced by
build_syslog_handler; reference handler.emit, handler.format, _syslog_sink, and
build_syslog_handler to locate where to patch/inspect and validate the JSON
payload.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 34e41175-9c07-4e29-a6c9-2bedf0920cb3

📥 Commits

Reviewing files that changed from the base of the PR and between f511e1d and 7381715.

📒 Files selected for processing (21)
  • CLAUDE.md
  • docs/design/operations.md
  • docs/guides/centralized-logging.md
  • mkdocs.yml
  • src/synthorg/observability/__init__.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/events/shipping.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/test_enums.py
  • tests/unit/observability/test_events.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_sink_config_builder.py
  • tests/unit/observability/test_syslog_handler.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Agent
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (python)
  • GitHub Check: Dependency Review
🧰 Additional context used
📓 Path-based instructions (6)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use @pytest.mark.unit for unit tests, @pytest.mark.integration for integration tests, @pytest.mark.e2e for e2e tests, @pytest.mark.slow for slow tests
Coverage: 80% minimum (enforced in CI)
Async: asyncio_mode = "auto" -- no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. instead of real vendor names
Property-based testing (Python): use Hypothesis (@given + @settings). Profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties
Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe

Files:

  • tests/unit/observability/test_events.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_enums.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_syslog_handler.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_sink_config_builder.py
  • tests/unit/observability/test_config.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases

Files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/syslog_handler.py
  • tests/unit/observability/test_http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/enums.py
  • tests/unit/observability/test_syslog_handler.py
  • tests/unit/observability/test_compression.py
  • src/synthorg/observability/sinks.py
  • tests/unit/observability/test_sink_config_builder.py
  • src/synthorg/observability/events/shipping.py
  • tests/unit/observability/test_config.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Type hints: all public functions must have type hints, mypy strict mode
Google-style docstrings required on public classes and functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Config vs runtime state: use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Line length: 88 characters (ruff enforced)
Functions must be < 50 lines, files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module wit...

Files:

  • src/synthorg/observability/__init__.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/events/shipping.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations from the design spec occur, update the relevant docs/design/ page to reflect the new reality

Files:

  • docs/design/operations.md
{docs/design/operations.md,src/synthorg/providers/presets.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names (e.g. litellm.types.llms.openai), (4) provider presets (src/synthorg/providers/presets.py)

Files:

  • docs/design/operations.md
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Docs: docs/ (Markdown, built with Zensical, config: mkdocs.yml). Design spec: docs/design/ (11 pages), Architecture: docs/architecture/, Roadmap: docs/roadmap/

Files:

  • docs/design/operations.md
  • docs/guides/centralized-logging.md
🧠 Learnings (52)
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • mkdocs.yml
  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations

Applied to files:

  • mkdocs.yml
  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue — DESIGN_SPEC.md is a pointer file linking to 7 design pages (Agents, Organization, Communication, Engine, Memory, Operations)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue -- read DESIGN_SPEC.md as a pointer file linking to the 11 design pages

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to docs/{security.md,licensing.md,reference/**/*.md} : Security: `docs/security.md`, Licensing: `docs/licensing.md`, Reference: `docs/reference/`

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to docs/**/*.md : Docs: `docs/` (Markdown, built with Zensical, config: `mkdocs.yml`). Design spec: `docs/design/` (11 pages), Architecture: `docs/architecture/`, Roadmap: `docs/roadmap/`

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/enums.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
  • tests/unit/observability/test_config.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • src/synthorg/observability/sink_config_builder.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Event names: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/__init__.py
  • docs/design/operations.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/observability/__init__.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/enums.py
  • docs/design/operations.md
  • src/synthorg/observability/sinks.py
  • tests/unit/observability/test_config.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks

Applied to files:

  • src/synthorg/observability/__init__.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/enums.py
  • docs/design/operations.md
  • src/synthorg/observability/sinks.py
  • tests/unit/observability/test_config.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code; exceptions: `observability/setup.py` and `observability/sinks.py` may use stdlib logging and `print(..., file=sys.stderr)` for bootstrap code

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • tests/unit/observability/test_enums.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings

Applied to files:

  • src/synthorg/observability/__init__.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the synthorg logger instead

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising; all state transitions must log at INFO; DEBUG for object creation, internal flow, entry/exit of key functions

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code; use the injected logger instead

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Pure data models, enums, and re-exports do NOT need logging

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Settings: Runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge, ConfigResolver (typed composed reads for controllers), validation, registry, change notifications via message bus. Per-namespace setting definitions in definitions/ submodule (api, company, providers, memory, budget, security, coordination, observability, backup).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/settings/**/*.py : Settings package (settings/): runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge (JSON serialization for Pydantic/collections), ConfigResolver (typed accessors), validation, registry, change notifications via message bus, SettingsSubscriber protocol, SettingsChangeDispatcher (polls `#settings` channel, routes to subscribers, restart_required filtering)

Applied to files:

  • src/synthorg/settings/definitions/observability.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/backup/**/*.py : Backup package (backup/): scheduled/manual/lifecycle backups of persistence DB, agent memory, company config. BackupService orchestrator, BackupScheduler (periodic asyncio task), RetentionManager (count + age pruning), tar.gz compression, SHA-256 checksums, manifest tracking, validated restore with atomic rollback and safety backup. handlers/ subpackage: ComponentHandler protocol + concrete handlers (PersistenceComponentHandler, MemoryComponentHandler, ConfigComponentHandler)

Applied to files:

  • src/synthorg/observability/sinks.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-03-15T18:42:17.990Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:42:17.990Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/observability/config.py
🪛 markdownlint-cli2 (0.22.0)
docs/guides/centralized-logging.md

[warning] 69-69: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 86-86: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (21)
mkdocs.yml (1)

104-104: LGTM!

The navigation entry for the new centralized logging guide is correctly placed under Guides, following the deployment section which is a logical grouping for operational documentation.

tests/unit/observability/test_events.py (2)

236-236: LGTM!

Correctly adds the new shipping domain module to the expected set for discovery validation.


702-728: LGTM!

The test comprehensively covers all 11 shipping event constants (syslog: 3, HTTP: 6, compression: 2) and follows the established pattern of verifying exact string values. Event names correctly follow the domain.subject.qualifier convention.

src/synthorg/observability/__init__.py (2)

32-38: LGTM!

The import block correctly groups all enum imports together. The new SyslogFacility and SyslogProtocol enums are properly imported for public re-export.


50-51: LGTM!

The __all__ list maintains alphabetical ordering with the new exports correctly positioned.

tests/unit/observability/conftest.py (1)

33-35: LGTM!

The factory correctly sets default values for the new sink configuration fields. For a CONSOLE sink type, these fields should be None/empty, which aligns with the SinkConfig validation rules that reject cross-type fields.

CLAUDE.md (2)

100-100: LGTM!

The package structure description accurately reflects the new log shipping and compressed archival capabilities added by this PR.


141-141: LGTM!

Correctly documents that syslog_handler.py and http_handler.py are allowed to use stdlib logging and print(..., file=sys.stderr) for handler construction and bootstrap code. This exception is necessary because these handlers run during or before logging system configuration.

tests/unit/observability/test_enums.py (4)

5-11: LGTM!

Import block correctly updated to include the new enum types being tested.


66-76: LGTM!

SinkType tests correctly updated to reflect the expanded enum (4 members) with proper value assertions for the new SYSLOG and HTTP types.


82-111: LGTM!

Comprehensive test coverage for SyslogFacility. The test_values_are_lowercase_strings assertion (lines 106-108) is particularly important as it enforces the invariant that member.value == member.name.lower(), which is relied upon by _SYSLOG_FACILITY_MAP in sink_config_builder.py for configuration parsing.


114-129: LGTM!

SyslogProtocol tests correctly verify both members and their exact string values ("tcp", "udp"), which are used as lookup keys in _SYSLOG_PROTOCOL_MAP for configuration parsing.

tests/unit/observability/test_config.py (6)

12-18: LGTM!

Import block correctly updated to include SyslogFacility and SyslogProtocol for the new sink configuration tests.


70-90: LGTM!

Tests for the new compress_rotated field cover default value (False), explicit True setting, and JSON round-trip serialization behavior.


191-290: LGTM!

Comprehensive test coverage for SYSLOG sink configuration including:

  • Valid configuration with defaults (port 514, UDP, USER facility)
  • Custom settings validation
  • Required field enforcement (syslog_host)
  • Blank host rejection
  • Cross-type field rejection (file_path, rotation, http_url)
  • Port bounds validation (1-65535)
  • Immutability and JSON round-trip

295-416: LGTM!

Comprehensive test coverage for HTTP sink configuration including:

  • Valid configuration with defaults
  • Custom settings (headers, batch size, timeouts, retries)
  • Required field enforcement (http_url)
  • Scheme validation (http/https only)
  • Cross-type field rejection
  • Numeric field bounds (positive values required)
  • Zero retries acceptance (valid edge case)
  • Immutability and JSON round-trip

421-454: LGTM!

Cross-type rejection tests ensure that CONSOLE and FILE sinks properly reject syslog and HTTP-specific fields, enforcing clean separation between sink types.


459-530: LGTM!

Endpoint uniqueness tests verify that LogConfig correctly:

  • Rejects duplicate syslog endpoints (same host:port)
  • Accepts different syslog ports on the same host
  • Rejects duplicate HTTP URLs
  • Accepts different HTTP URLs

This prevents accidental configuration of redundant sinks shipping to the same endpoint.

src/synthorg/observability/enums.py (3)

38-45: LGTM!

The SinkType enum is correctly extended with SYSLOG and HTTP members. The docstring accurately describes the new sink types' purposes.


48-66: LGTM!

The SyslogFacility enum correctly defines 13 standard syslog facilities with lowercase string values matching the member names. This convention is critical as the values are used as lookup keys in _SYSLOG_FACILITY_MAP for configuration parsing. The docstring correctly notes the mapping to logging.handlers.SysLogHandler.LOG_* constants.


69-78: LGTM!

The SyslogProtocol enum correctly defines TCP and UDP transport options with appropriate docstring documentation linking to socket types (SOCK_STREAM/SOCK_DGRAM).

Comment on lines +62 to +67
self._flusher = threading.Thread(
target=self._flush_loop,
daemon=True,
name="log-http-flusher",
)
self._flusher.start()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Emit shipping events through the observability logger instead of print().

This module starts/stops the flusher, retries/drops batches, and raises configuration errors without any structured event emission. The only runtime diagnostics are the print(..., file=sys.stderr) calls on Lines 91-95 and Lines 151-159, so the new SHIPPING_HTTP_* constants never reach the logging pipeline.

As per coding guidelines, never use print() in application code outside observability/setup.py and observability/sinks.py; every module with business logic must use get_logger(__name__), and all error paths/state transitions must log with domain event constants.

Also applies to: 80-96, 134-167, 183-185

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 62 - 67, Replace all
direct print(..., file=sys.stderr) usages in this module with the observability
logger obtained via get_logger(__name__) and emit the appropriate domain events
(SHIPPING_HTTP_* constants) instead of printing; specifically update code paths
in the flusher lifecycle and retry/drop handlers (look for self._flusher,
_flush_loop, start/stop logic and the error/retry/drop blocks around the former
print locations) to call logger.info()/logger.warn()/logger.error() with a clear
message and include the relevant SHIPPING_HTTP_* constant and any
exception/context details, ensuring all state transitions and errors are sent
through the observability pipeline rather than stdout/stderr.

Comment on lines +112 to +125
def _post_batch(self, records: list[logging.LogRecord]) -> None:
"""POST a batch of records as a JSON array with retries."""
entries: list[str] = []
for record in records:
try:
entries.append(self.format(record))
except Exception:
self.handleError(record)

if not entries:
return

body = json.dumps(entries).encode("utf-8")
request = urllib.request.Request( # noqa: S310
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Send JSON objects, not JSON-encoded strings.

Line 124 serializes a list[str], so the request body becomes an array of escaped strings instead of a batch of structured log objects. That breaks the “batched JSON POST” contract for receivers expecting real JSON entries.

🐛 Suggested fix
-        entries: list[str] = []
+        entries: list[Any] = []
         for record in records:
             try:
-                entries.append(self.format(record))
+                entries.append(json.loads(self.format(record)))
             except Exception:
                 self.handleError(record)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _post_batch(self, records: list[logging.LogRecord]) -> None:
"""POST a batch of records as a JSON array with retries."""
entries: list[str] = []
for record in records:
try:
entries.append(self.format(record))
except Exception:
self.handleError(record)
if not entries:
return
body = json.dumps(entries).encode("utf-8")
request = urllib.request.Request( # noqa: S310
def _post_batch(self, records: list[logging.LogRecord]) -> None:
"""POST a batch of records as a JSON array with retries."""
entries: list[Any] = []
for record in records:
try:
entries.append(json.loads(self.format(record)))
except Exception:
self.handleError(record)
if not entries:
return
body = json.dumps(entries).encode("utf-8")
request = urllib.request.Request( # noqa: S310
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 112 - 125, In
_post_batch, you're currently building entries as list[str] by appending
self.format(record), which produces a JSON-encoded string and results in a JSON
array of strings; change entries to collect real JSON objects (list[dict])
instead: for each record, call self.format(record) and if it returns a string,
json.loads(...) it into a dict (handle exceptions with
self.handleError(record)), or if it already returns a dict, append it directly;
then json.dumps(entries).encode("utf-8") will produce a proper array of JSON
objects for the POST. Ensure you update the entries variable type and error
handling in the _post_batch method accordingly.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/observability/sinks.py (1)

270-283: ⚠️ Potential issue | 🟠 Major

Reject compress_rotated outside built-in rotation.

RotationConfig currently allows compress_rotated=True for non-BUILTIN strategies, and this branch silently returns _FlushingWatchedFileHandler. The config looks valid, but compression is disabled at runtime.

Suggested fix
        if rotation.strategy == RotationStrategy.BUILTIN:
            if rotation.compress_rotated:
                return _CompressingRotatingFileHandler(
                    filename=str(file_path),
                    maxBytes=rotation.max_bytes,
                    backupCount=rotation.backup_count,
                    compress=True,
                )
            return _FlushingRotatingFileHandler(
                filename=str(file_path),
                maxBytes=rotation.max_bytes,
                backupCount=rotation.backup_count,
            )
+        if rotation.compress_rotated:
+            msg = (
+                "compress_rotated is only supported with "
+                "RotationStrategy.BUILTIN"
+            )
+            raise ValueError(msg)
         return _FlushingWatchedFileHandler(filename=str(file_path))

As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/sinks.py` around lines 270 - 283, The code
silently ignores rotation.compress_rotated when rotation.strategy !=
RotationStrategy.BUILTIN; add validation in the handler-selection logic to
detect this mismatch (check rotation.compress_rotated and rotation.strategy) and
raise a clear ValueError (or similar) instead of returning
_FlushingWatchedFileHandler, so callers know the config is invalid; specifically
update the branch that currently returns _FlushingWatchedFileHandler to first
assert that rotation.compress_rotated is False (or raise) and reference
RotationStrategy.BUILTIN, rotation.compress_rotated,
_CompressingRotatingFileHandler, and _FlushingWatchedFileHandler when
implementing the check.
♻️ Duplicate comments (2)
src/synthorg/observability/sinks.py (2)

99-114: ⚠️ Potential issue | 🟠 Major

Preserve all configured compressed generations.

For backupCount > 1, Line 109 deletes .{backupCount}.gz after Line 103 has already renamed .{backupCount - 1}.gz into that slot, so the handler retains one fewer compressed backup than configured.

Suggested fix
    def _shift_gz_backups(self) -> None:
        """Shift existing .gz backups (highest index first)."""
+        oldest = StdPath(f"{self.baseFilename}.{self.backupCount}.gz")
+        if oldest.exists():
+            oldest.unlink()
         for i in range(self.backupCount - 1, 0, -1):
             src = StdPath(f"{self.baseFilename}.{i}.gz")
             dst = StdPath(f"{self.baseFilename}.{i + 1}.gz")
             if src.exists():
                 if dst.exists():
                     dst.unlink()
                 src.rename(dst)
-
-        # Remove the oldest if it exceeds backupCount
-        oldest = StdPath(
-            f"{self.baseFilename}.{self.backupCount}.gz",
-        )
-        if oldest.exists():
-            oldest.unlink()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/sinks.py` around lines 99 - 114, In
_shift_gz_backups, the code renames .{backupCount-1}.gz into .{backupCount}.gz
then unconditionally removes .{backupCount}.gz, which loses one configured
compressed generation; fix by removing the oldest archive
(StdPath(f"{self.baseFilename}.{self.backupCount}.gz")) before performing the
rename loop instead of after it so the subsequent renames preserve all
configured generations — update _shift_gz_backups to unlink oldest (using oldest
= StdPath(f"{self.baseFilename}.{self.backupCount}.gz")) if it exists, then run
the for loop that builds src = StdPath(f"{self.baseFilename}.{i}.gz") and dst =
StdPath(f"{self.baseFilename}.{i + 1}.gz") to rename each file.

70-98: ⚠️ Potential issue | 🟠 Major

Don't treat a failed gzip as a successful rollover.

After Line 124 has already moved the active log to .1, _compress_file() still returns normally on OSError. If the failure happens before tmp_gz.rename(gz) succeeds, that rotated file stays as .1; _shift_gz_backups() ignores it, and Line 123 removes it on the following rollover. Please surface this failure back to doRollover() in a way that preserves the rotated file, or explicitly fold an existing uncompressed .1 into the next retention pass before _rotate_current_log() unlinks it.

As per coding guidelines, "Handle errors explicitly, never silently swallow them".

Also applies to: 116-146

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/sinks.py` around lines 70 - 98, The rollover
currently swallows compression errors causing an uncompressed .1 to be lost on
the next rotation; fix this by making _compress_file(surface name) propagate
OSError and guarantee it does not remove or replace the original dfn on failure,
and update doRollover to catch that propagated error but not call
super().doRollover() (which would remove the existing .1); instead log the error
and return (or re-raise) so the rotated uncompressed file produced by
_rotate_current_log() is preserved for the next retention pass. Reference:
doRollover, _shift_gz_backups, _rotate_current_log, _compress_file, backupCount.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/synthorg/observability/sinks.py`:
- Around line 270-283: The code silently ignores rotation.compress_rotated when
rotation.strategy != RotationStrategy.BUILTIN; add validation in the
handler-selection logic to detect this mismatch (check rotation.compress_rotated
and rotation.strategy) and raise a clear ValueError (or similar) instead of
returning _FlushingWatchedFileHandler, so callers know the config is invalid;
specifically update the branch that currently returns
_FlushingWatchedFileHandler to first assert that rotation.compress_rotated is
False (or raise) and reference RotationStrategy.BUILTIN,
rotation.compress_rotated, _CompressingRotatingFileHandler, and
_FlushingWatchedFileHandler when implementing the check.

---

Duplicate comments:
In `@src/synthorg/observability/sinks.py`:
- Around line 99-114: In _shift_gz_backups, the code renames .{backupCount-1}.gz
into .{backupCount}.gz then unconditionally removes .{backupCount}.gz, which
loses one configured compressed generation; fix by removing the oldest archive
(StdPath(f"{self.baseFilename}.{self.backupCount}.gz")) before performing the
rename loop instead of after it so the subsequent renames preserve all
configured generations — update _shift_gz_backups to unlink oldest (using oldest
= StdPath(f"{self.baseFilename}.{self.backupCount}.gz")) if it exists, then run
the for loop that builds src = StdPath(f"{self.baseFilename}.{i}.gz") and dst =
StdPath(f"{self.baseFilename}.{i + 1}.gz") to rename each file.
- Around line 70-98: The rollover currently swallows compression errors causing
an uncompressed .1 to be lost on the next rotation; fix this by making
_compress_file(surface name) propagate OSError and guarantee it does not remove
or replace the original dfn on failure, and update doRollover to catch that
propagated error but not call super().doRollover() (which would remove the
existing .1); instead log the error and return (or re-raise) so the rotated
uncompressed file produced by _rotate_current_log() is preserved for the next
retention pass. Reference: doRollover, _shift_gz_backups, _rotate_current_log,
_compress_file, backupCount.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b3657926-e5d7-46ae-baf0-cf25af437dd3

📥 Commits

Reviewing files that changed from the base of the PR and between 7381715 and a608836.

📒 Files selected for processing (1)
  • src/synthorg/observability/sinks.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Type hints: all public functions must have type hints, mypy strict mode
Google-style docstrings required on public classes and functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Config vs runtime state: use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Line length: 88 characters (ruff enforced)
Functions must be < 50 lines, files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module wit...

Files:

  • src/synthorg/observability/sinks.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases

Files:

  • src/synthorg/observability/sinks.py
🧠 Learnings (1)
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/observability/sinks.py

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (2)
docs/guides/centralized-logging.md (1)

14-18: ⚠️ Potential issue | 🟡 Minor

Don't advertise vendor APIs as direct HTTP-sink targets here.

The later HTTP section already says this sink emits raw JSON arrays and vendor APIs usually need a collector/proxy. This summary row still reads like Loki/Elasticsearch/Datadog/Splunk HEC are direct drop-in targets, which is misleading.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/centralized-logging.md` around lines 14 - 18, The table's "HTTP
sink" summary is misleading by listing vendor APIs as direct targets; update the
"HTTP sink" row (the cell containing "App-level batched HTTP POST of JSON log
arrays" and the vendor list) to clarify that this sink emits raw JSON arrays and
that vendor services generally require a collector/proxy, e.g., replace the
vendor column entries with a short phrase like "Requires collector/proxy (e.g.,
Promtail/Logstash/Datadog Agent/Splunk Collector)" or similar wording so readers
know these are not drop-in HTTP targets.
src/synthorg/observability/http_handler.py (1)

79-94: ⚠️ Potential issue | 🟠 Major

Replace print() with structured logging using get_logger(__name__).

This module uses print(..., file=sys.stderr) for error reporting (lines 90-94 and 157-165), but http_handler.py is not in the allowed exception list (observability/setup.py and observability/sinks.py). Per coding guidelines, every module with business logic must use get_logger(__name__) and emit domain event constants (SHIPPING_HTTP_*).

🛠️ Suggested approach
+from synthorg.observability import get_logger
+from synthorg.observability.events.shipping import (
+    SHIPPING_HTTP_FLUSH_ERROR,
+    SHIPPING_HTTP_DROP,
+)
+
+logger = get_logger(__name__)
+
 # In _flush_loop:
-            except Exception as exc:
-                print(  # noqa: T201
-                    f"ERROR: log-http-flusher encountered unexpected error: {exc}",
-                    file=sys.stderr,
-                    flush=True,
-                )
+            except Exception as exc:
+                logger.error(
+                    SHIPPING_HTTP_FLUSH_ERROR,
+                    error=str(exc),
+                    exc_info=True,
+                )

Note: Be careful of potential bootstrapping issues if logging setup calls this handler before loggers are ready. Consider a fallback to stderr if the logger is unavailable.

As per coding guidelines, never use print() in application code outside observability/setup.py and observability/sinks.py.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 79 - 94, The module
uses print(...) to report errors in the background flush loop and elsewhere;
replace those prints with structured logging obtained from get_logger(__name__)
and emit the appropriate domain event constants (SHIPPING_HTTP_*), e.g., create
a module-level logger = get_logger(__name__) and in _flush_loop (and the other
error sites around lines ~157-165) call logger.error with a clear message,
include the exception, and attach/emit the SHIPPING_HTTP_* event constant; also
add a minimal fallback to write to stderr if get_logger raises or returns None
to avoid bootstrap issues.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/observability/http_handler.py`:
- Around line 167-173: The close() method can call _drain_and_flush() while the
_flusher thread is still running if _flusher.join(timeout=self._timeout) timed
out; update close() (method name: close) to check _flusher.is_alive() after join
and handle it defensively: if the flusher is still alive either wait longer
(looping join with small intervals until not alive or a hard deadline) or
skip/drain in a thread-safe way (e.g., set shutdown and batch_ready already set)
to avoid concurrent access, ensuring coordination between _flusher and
_drain_and_flush; reference _flusher, _shutdown, _batch_ready, _timeout,
_drain_and_flush, join and is_alive to implement the guard.

In `@src/synthorg/observability/sink_config_builder.py`:
- Around line 286-292: The current use of model_copy(update=updates) in
_parse_rotation_override and _apply_override bypasses Pydantic v2 validation and
allows invalid RotationConfig and SinkConfig values; instead merge the base
model's dict with the updates and call model_validate(...) on the resulting dict
(e.g., create a new_data = {**base.model_dump(), **updates} and call
RotationConfig.model_validate(new_data) or SinkConfig.model_validate(new_data)
as appropriate) so that RotationConfig constraints (max_bytes > 0, backup_count
>= 0) and SinkConfig constraints plus the _validate_sink_type_fields validator
are enforced; replace the model_copy(update=...) return paths with
model_validate on the merged data for both _parse_rotation_override and
_apply_override.

In `@src/synthorg/observability/syslog_handler.py`:
- Around line 63-69: The SysLogHandler instance created in
logging.handlers.SysLogHandler (the local variable handler) currently uses the
default append_nul behavior which appends a NUL byte to messages; set
handler.append_nul = False immediately after creating the handler (before it is
used/added to loggers) to disable NUL-termination for JSON payloads so
downstream consumers don't receive a trailing \x00.

In `@tests/unit/observability/test_compression.py`:
- Around line 181-211: The test test_compression_error_handled_gracefully should
assert that the uncompressed rotated backup is retained when compression fails:
after the loop that emits records (using _CompressingRotatingFileHandler and the
log_file Path), add an assertion that the rotated backup (e.g.,
Path(str(log_file) + ".1") or log_file.with_name(log_file.name + ".1")) exists
to confirm the uncompressed `.1` file was preserved when gzip.open raised
OSError.

In `@tests/unit/observability/test_config.py`:
- Around line 311-325: The test test_custom_http_settings currently uses a
vendor-branded URL ("http://loki:3100/loki/api/v1/push"); update the fixture to
use a neutral host/path (e.g., "http://example.local/api/v1/push" or similar) so
tests remain vendor-agnostic. Modify the SinkConfig instantiation in
test_custom_http_settings to replace the http_url value while keeping the other
parameters (http_headers, http_batch_size, http_flush_interval_seconds,
http_timeout_seconds, http_max_retries) unchanged; apply the same neutralization
to the other test block referenced around lines 515-526 that also sets http_url.
Ensure assertions on cfg.http_* and len(cfg.http_headers) remain the same.

In `@tests/unit/observability/test_http_handler.py`:
- Around line 59-69: The test leaves a handler with queued records in
handler_cleanup so teardown calls HttpBatchHandler.close() outside the patched
urllib.request.urlopen; ensure the handler is closed while urlopen is still
patched to avoid real network calls: within the with
patch("urllib.request.urlopen") block, after handler.emit(_make_record()) call,
call handler.close() (or otherwise drain the queue) before appending the handler
to handler_cleanup, referencing the test function test_emit_queues_record, the
handler variable, handler._queue, and HttpBatchHandler.close to locate the code
to change.

---

Duplicate comments:
In `@docs/guides/centralized-logging.md`:
- Around line 14-18: The table's "HTTP sink" summary is misleading by listing
vendor APIs as direct targets; update the "HTTP sink" row (the cell containing
"App-level batched HTTP POST of JSON log arrays" and the vendor list) to clarify
that this sink emits raw JSON arrays and that vendor services generally require
a collector/proxy, e.g., replace the vendor column entries with a short phrase
like "Requires collector/proxy (e.g., Promtail/Logstash/Datadog Agent/Splunk
Collector)" or similar wording so readers know these are not drop-in HTTP
targets.

In `@src/synthorg/observability/http_handler.py`:
- Around line 79-94: The module uses print(...) to report errors in the
background flush loop and elsewhere; replace those prints with structured
logging obtained from get_logger(__name__) and emit the appropriate domain event
constants (SHIPPING_HTTP_*), e.g., create a module-level logger =
get_logger(__name__) and in _flush_loop (and the other error sites around lines
~157-165) call logger.error with a clear message, include the exception, and
attach/emit the SHIPPING_HTTP_* event constant; also add a minimal fallback to
write to stderr if get_logger raises or returns None to avoid bootstrap issues.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c322f6ce-a54f-43a2-885b-3d151d1a41fb

📥 Commits

Reviewing files that changed from the base of the PR and between a608836 and f09195d.

📒 Files selected for processing (22)
  • CLAUDE.md
  • docs/design/operations.md
  • docs/guides/centralized-logging.md
  • mkdocs.yml
  • src/synthorg/observability/__init__.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/events/shipping.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/test_enums.py
  • tests/unit/observability/test_events.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_sink_config_builder.py
  • tests/unit/observability/test_sinks.py
  • tests/unit/observability/test_syslog_handler.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Type hints: all public functions must have type hints, mypy strict mode
Google-style docstrings required on public classes and functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Config vs runtime state: use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Line length: 88 characters (ruff enforced)
Functions must be < 50 lines, files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module wit...

Files:

  • src/synthorg/observability/__init__.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/events/shipping.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/config.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases

Files:

  • src/synthorg/observability/__init__.py
  • tests/unit/observability/test_events.py
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/enums.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_sink_config_builder.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/events/shipping.py
  • tests/unit/observability/test_syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/conftest.py
  • src/synthorg/observability/config.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use @pytest.mark.unit for unit tests, @pytest.mark.integration for integration tests, @pytest.mark.e2e for e2e tests, @pytest.mark.slow for slow tests
Coverage: 80% minimum (enforced in CI)
Async: asyncio_mode = "auto" -- no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. instead of real vendor names
Property-based testing (Python): use Hypothesis (@given + @settings). Profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties
Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe

Files:

  • tests/unit/observability/test_events.py
  • tests/unit/observability/test_enums.py
  • tests/unit/observability/test_sinks.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_sink_config_builder.py
  • tests/unit/observability/test_syslog_handler.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/conftest.py
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Docs: docs/ (Markdown, built with Zensical, config: mkdocs.yml). Design spec: docs/design/ (11 pages), Architecture: docs/architecture/, Roadmap: docs/roadmap/

Files:

  • docs/guides/centralized-logging.md
  • docs/design/operations.md
docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations from the design spec occur, update the relevant docs/design/ page to reflect the new reality

Files:

  • docs/design/operations.md
{docs/design/operations.md,src/synthorg/providers/presets.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor names may only appear in: (1) Operations design page (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names (e.g. litellm.types.llms.openai), (4) provider presets (src/synthorg/providers/presets.py)

Files:

  • docs/design/operations.md
🧠 Learnings (65)
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to .github/CHANGELOG.md : Changelog: `.github/CHANGELOG.md` (auto-generated, do not edit)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to docs/**/*.md : Docs: `docs/` (Markdown, built with Zensical, config: `mkdocs.yml`). Design spec: `docs/design/` (11 pages), Architecture: `docs/architecture/`, Roadmap: `docs/roadmap/`

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Documentation source in `docs/` (Markdown, built with Zensical). Design spec in `docs/design/` (7 pages: index, agents, organization, communication, engine, memory, operations). Architecture in `docs/architecture/` (overview, tech-stack, decision log). Roadmap in `docs/roadmap/`. Security in `docs/security.md`. Licensing in `docs/licensing.md`. Reference in `docs/reference/`. REST API reference in `docs/rest-api.md`. Library reference in `docs/api/` (auto-generated from docstrings). Custom templates in `docs/overrides/`. Config in `mkdocs.yml`.

Applied to files:

  • mkdocs.yml
  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to docs/design/*.md : Design spec pages: 7 pages in `docs/design/` — index, agents, organization, communication, engine, memory, operations

Applied to files:

  • mkdocs.yml
  • CLAUDE.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue. DESIGN_SPEC.md is a pointer file linking to the 7 design pages (index, agents, organization, communication, engine, memory, operations).

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue — DESIGN_SPEC.md is a pointer file linking to 7 design pages (Agents, Organization, Communication, Engine, Memory, Operations)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to docs/design/**/*.md : Design specification pages in `docs/design/` must be consulted before implementing features (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to docs/** : Docs source in docs/ (Markdown, built with Zensical); design spec in docs/design/ (7 pages: index, agents, organization, communication, engine, memory, operations)

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue -- read DESIGN_SPEC.md as a pointer file linking to the 11 design pages

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to docs/{security.md,licensing.md,reference/**/*.md} : Security: `docs/security.md`, Licensing: `docs/licensing.md`, Reference: `docs/reference/`

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Always read the relevant `docs/design/` page before implementing any feature or planning any issue

Applied to files:

  • mkdocs.yml
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • src/synthorg/observability/__init__.py
  • tests/unit/observability/test_events.py
  • CLAUDE.md
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/sink_config_builder.py
  • docs/design/operations.md
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/events/shipping.py
  • tests/unit/observability/test_syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/conftest.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/sink_config_builder.py
  • docs/design/operations.md
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/syslog_handler.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/observability/__init__.py
  • tests/unit/observability/test_events.py
  • CLAUDE.md
  • docs/design/operations.md
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • src/synthorg/observability/__init__.py
  • tests/unit/observability/test_events.py
  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • src/synthorg/observability/__init__.py
  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • tests/unit/observability/test_enums.py
  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/sink_config_builder.py
  • docs/design/operations.md
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/enums.py
  • tests/unit/observability/test_sink_config_builder.py
  • src/synthorg/observability/syslog_handler.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code; exceptions: `observability/setup.py` and `observability/sinks.py` may use stdlib logging and `print(..., file=sys.stderr)` for bootstrap code

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
  • tests/unit/observability/test_enums.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Use event constants from `synthorg.observability.events.<domain>` (e.g., `API_REQUEST_STARTED` from `events.api`); import directly and log with structured kwargs: `logger.info(EVENT, key=value)`, never interpolated strings

Applied to files:

  • src/synthorg/observability/__init__.py
  • CLAUDE.md
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly rather than using string literals

Applied to files:

  • tests/unit/observability/test_events.py
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Event names: always use constants from the domain-specific module under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • tests/unit/observability/test_events.py
  • docs/design/operations.md
  • src/synthorg/observability/events/shipping.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

  • CLAUDE.md
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Package structure: src/synthorg/ organized as: api/ (REST+WebSocket, Litestar), auth/ (auth subpackage), backup/ (scheduled/manual backups), budget/ (cost tracking, CFO), cli/ (superseded by Go CLI), communication/ (message bus, meetings), config/ (YAML loading), core/ (domain models, resilience config), engine/ (orchestration, task state, coordination, approval gates, stagnation detection, context budget, compaction), hr/ (hiring, performance, promotion), memory/ (pluggable backend, Mem0, retrieval, consolidation), persistence/ (operational data, SQLite, settings), observability/ (logging, correlation, sinks), providers/ (LLM abstraction, LiteLLM, auth types, presets, runtime CRUD), settings/ (runtime-editable, typed definitions, encryption, config bridge), security/ (SecOps, rule engine, output scanning, progressive trust, autonomy levels), templates/ (company templates, personalities), tools/ (registry, built-in tools, git, sandbox, code_runner, MCP...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : All state transitions must log at INFO.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Security: SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies: disabled/weighted/per-category/milestone), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Engine: Agent orchestration, execution loops, parallel execution, task decomposition, routing, task assignment, centralized single-writer task state engine (TaskEngine), task lifecycle, recovery, shutdown, workspace isolation, coordination (multi-agent pipeline: TopologyDispatcher protocol, 4 dispatchers — SAS/centralized/decentralized/context-dependent, wave execution, workspace lifecycle integration, CoordinationSectionConfig company config bridge, build_coordinator factory), coordination error classification, prompt policy validation, checkpoint recovery (checkpoint/, per-turn persistence, heartbeat detection, CheckpointRecoveryStrategy), approval gate (escalation detection, context parking/resume, EscalationInfo/ResumePayload models), stagnation detection (stagnation/, StagnationDetector protocol, ToolRepetitionDetector, dual-signal analysis, corrective prompt injection), agent runtime state (AgentRuntimeState, lightweight per-agent execution status for dashboard queries and recove...

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/security/**/*.py : Security package (security/): SecOps agent, rule engine (soft-allow/hard-deny, fail-closed), audit log, output scanner, output scan response policies (redact/withhold/log-only/autonomy-tiered), risk classifier, risk tier classifier, action type registry, ToolInvoker security integration, progressive trust (4 strategies), autonomy levels (presets, resolver, change strategy), timeout policies (park/resume)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to src/synthorg/hr/**/*.py : HR package (hr/): hiring, firing, onboarding, offboarding, agent registry, performance tracking (task metrics, collaboration scoring, LLM calibration, collaboration overrides, trend detection), promotion/demotion (criteria evaluation, approval strategies, model mapping)

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/security/**/*.py : Security module includes SecOps agent, rule engine (soft-allow/hard-deny), audit log, output scanner, risk classifier, autonomy levels (4 strategies), timeout policies.

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Settings: Runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge, ConfigResolver (typed composed reads for controllers), validation, registry, change notifications via message bus. Per-namespace setting definitions in definitions/ submodule (api, company, providers, memory, budget, security, coordination, observability, backup).

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • CLAUDE.md
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • CLAUDE.md
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • CLAUDE.md
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/settings/**/*.py : Settings package (settings/): runtime-editable settings persistence (DB > env > YAML > code defaults), typed definitions (9 namespaces), Fernet encryption for sensitive values, config bridge (JSON serialization for Pydantic/collections), ConfigResolver (typed accessors), validation, registry, change notifications via message bus, SettingsSubscriber protocol, SettingsChangeDispatcher (polls `#settings` channel, routes to subscribers, restart_required filtering)

Applied to files:

  • src/synthorg/settings/definitions/observability.py
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to tests/**/*.py : For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)` -- it is cancellation-safe

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to {src,tests}/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases

Applied to files:

  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to {src,tests,web,cli,site}/**/*.{py,ts,tsx,go,astro} : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Vendor names may only appear in: (1) Operations design page provider list (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names.

Applied to files:

  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py` and `observability/sinks.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for bootstrap)

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code; use the injected logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the synthorg logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Pure data models, enums, and re-exports do NOT need logging

Applied to files:

  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • tests/unit/observability/conftest.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-03-16T20:14:00.937Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T20:14:00.937Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/observability/config.py
🔇 Additional comments (34)
src/synthorg/observability/events/shipping.py (1)

1-28: LGTM!

Well-structured event constants module following the established domain.subject.qualifier naming pattern. The module docstring appropriately documents the handler exception for print(..., file=sys.stderr) usage, which aligns with the coding guidelines exception for bootstrap code. All 11 constants are properly typed with Final[str].

tests/unit/observability/test_events.py (2)

236-236: LGTM!

Correctly adds "shipping" to the expected domain modules set for discovery validation.


702-728: LGTM!

Comprehensive test coverage for all 11 shipping event constants. The test follows the established pattern in this file and validates both import availability and exact string value matching.

src/synthorg/observability/__init__.py (1)

32-38: LGTM!

Clean extension of the public API to expose SyslogFacility and SyslogProtocol enums. The __all__ list maintains alphabetical ordering, and these re-exports enable convenient imports directly from synthorg.observability.

Also applies to: 50-51

tests/unit/observability/test_enums.py (4)

5-11: LGTM!

Import statement correctly extended to include the new SyslogFacility and SyslogProtocol enums.


64-79: LGTM!

TestSinkType correctly updated to expect 4 members and validate the new SYSLOG and HTTP enum values.


82-111: LGTM!

Comprehensive test coverage for SyslogFacility enum. The test correctly validates:

  • All 13 members exist (matching the enum definition)
  • Values follow the name.lower() pattern
  • StrEnum inheritance via isinstance(member, str)

114-129: LGTM!

Clean test coverage for SyslogProtocol enum validating the TCP/UDP members, their string values, and StrEnum inheritance.

CLAUDE.md (3)

100-100: LGTM!

Package description accurately updated to reflect the new log shipping (syslog, HTTP) and compressed archival capabilities.


141-141: LGTM!

Correctly extends the stdlib logging and print(..., file=sys.stderr) exception to include observability/syslog_handler.py and observability/http_handler.py. This is consistent with the module docstring in shipping.py explaining the handler bootstrap exception.


143-143: LGTM!

Good addition of SHIPPING_HTTP_BATCH_SENT from events.shipping to the event constants examples. This helps developers discover the new shipping domain events.

src/synthorg/observability/enums.py (3)

32-45: LGTM!

SinkType enum cleanly extended with SYSLOG and HTTP members. The docstring attributes accurately describe the new sink destinations.


48-66: LGTM!

SyslogFacility enum provides a comprehensive set of 13 facility codes matching standard syslog facilities. The docstring correctly references the logging.handlers.SysLogHandler.LOG_* constants that these map to.

Based on the context snippet from syslog_handler.py, all 13 members are present in FACILITY_MAP, ensuring no KeyError at handler construction time.


69-78: LGTM!

SyslogProtocol enum with TCP/UDP members and accurate docstring referencing the corresponding socket types. Both members are mapped in PROTOCOL_MAP per the context snippet.

src/synthorg/settings/definitions/observability.py (2)

35-52: LGTM!

sink_overrides description correctly updated to document the compress_rotated field within the rotation object, alongside the existing strategy, max_bytes, and backup_count fields.


54-75: LGTM!

custom_sinks description comprehensively documents the multi-sink configuration schema:

  • File sink with file_path required
  • Syslog sink with syslog_host required and optional port/facility/protocol
  • HTTP sink with http_url required and optional headers/batching/timeout/retries

This aligns with the PR objectives for log shipping and the new sink types in SinkType enum.

mkdocs.yml (1)

104-104: LGTM!

Navigation entry for the centralized logging guide is correctly placed in the Guides section. The guide file exists at docs/guides/centralized-logging.md and the positioning near "Deployment (Docker)" is logical since logging configuration is a deployment concern.

tests/unit/observability/test_sinks.py (1)

404-481: Good coverage on the new sink branches.

These tests pin the SYSLOG/HTTP dispatch paths and the compress_rotated + EXTERNAL guard, which are easy to regress.

tests/unit/observability/test_sink_config_builder.py (4)

547-607: LGTM! Comprehensive SYSLOG sink tests.

The test class covers essential scenarios: basic creation with defaults, custom settings (port, facility, protocol, level), required field validation, routing behavior, unknown field rejection, and invalid facility/protocol handling. The test assertions properly verify SinkType.SYSLOG and default values like syslog_port=514.


649-737: LGTM! HTTP sink tests are thorough and vendor-neutral.

Good coverage of HTTP sink creation, custom settings, required URL validation, scheme enforcement, routing behavior, and unknown field rejection. Line 677 correctly uses example-collector instead of a real vendor name.


739-793: LGTM! Sink type dispatch tests validate routing logic.

The tests properly verify that omitted sink_type defaults to FILE, explicit file type works, unknown types are rejected, and mixed sink types are appended in order. Good edge case coverage.


795-834: LGTM! Compression and boolean rejection tests.

The compress_rotated override test and boolean rejection for integer fields are well-structured and validate the expected behavior.

src/synthorg/observability/sinks.py (3)

96-108: LGTM! Backup shifting logic corrected.

The oldest backup (.{backupCount}.gz) is now deleted before the shift loop, preventing the off-by-one issue. The loop correctly shifts ii+1 for indices 1 to backupCount-1.


121-147: LGTM! Compression failure handling improved.

The _compress_file method now properly re-raises OSError after cleanup (line 147), allowing the caller (doRollover) to catch it and retain the uncompressed backup with a warning. This addresses the previous concern about silent data loss.


368-395: LGTM! Clean match dispatch for sink types.

The match statement clearly routes each sink type to its handler builder. SYSLOG and HTTP handlers are returned directly (they handle their own formatting), while CONSOLE and FILE use the shared _attach_formatter_and_routing helper. The default case with pragma: no cover is appropriate for exhaustive enum matching.

tests/unit/observability/test_compression.py (1)

1-17: LGTM! Well-structured test module.

Clean imports, proper test organization with two focused test classes, and appropriate use of pytest fixtures (tmp_path, handler_cleanup).

tests/unit/observability/test_syslog_handler.py (3)

156-181: LGTM! JSON formatting test now validates the output.

The test properly validates the formatter by calling handler.format(record), parsing it as JSON, and verifying the "event" key exists. This addresses the previous concern about not validating the actual formatting.


183-201: LGTM! Edge case tests for error handling.

Good coverage of the empty host validation (using object.__setattr__ to bypass Pydantic) and connection failure wrapping. The TCP connection failure test correctly patches SysLogHandler.__init__ since the socket creation happens during construction.


102-132: LGTM! Comprehensive facility mapping coverage.

The parametrized test covers a good sample of facilities (USER, DAEMON, LOCAL0, LOCAL7, AUTH, KERN, SYSLOG). Combined with TestFacilityAndProtocolMaps.test_facility_map_covers_all_members, all enum members are verified.

src/synthorg/observability/config.py (3)

27-32: LGTM! Default constants defined for cross-type validation.

The module-level constants for default values (_DEFAULT_SYSLOG_PORT, _DEFAULT_HTTP_*) are correctly used in the rejection helpers to detect non-default values, enabling proper cross-type field rejection.


162-180: LGTM! Unified validator with exhaustive type coverage.

The match statement dispatches to type-specific validation, and each branch properly rejects irrelevant fields from other sink types. This addresses the previous review concern about incomplete field rejection.


337-365: LGTM! Duplicate endpoint/URL validators prevent configuration errors.

The validators correctly prevent multiple SYSLOG sinks with the same (host, port) and multiple HTTP sinks with the same URL, catching configuration mistakes early.

src/synthorg/observability/http_handler.py (2)

138-165: LGTM! Resource leak fixed with context manager.

The urlopen call now uses a with statement (lines 141-145), which properly closes the response object on both success and error paths. This addresses the previous concern about socket/file descriptor leaks.


116-128: The implementation is correct and produces valid JSON as confirmed by tests. The list[str] type annotation is technically accurate—these are strings in Python, and their JSON content is a runtime property, not a type distinction. No changes needed.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/synthorg/observability/http_handler.py (1)

12-12: ⚠️ Potential issue | 🟠 Major

Close HTTPError responses before retrying.

The success path is scoped with with, but 4xx/5xx responses raise HTTPError, which still owns a response stream. Retrying without exc.close() leaks sockets/file descriptors under repeated server-side failures.

🔧 Minimal fix
-import urllib.request
+import urllib.error
+import urllib.request
@@
             except Exception as exc:
+                if isinstance(exc, urllib.error.HTTPError):
+                    exc.close()
                 last_error = exc
                 if attempt < self._max_retries:
                     continue
#!/bin/bash
python - <<'PY'
import urllib.error
import urllib.response

print("HTTPError_is_addinfourl:", issubclass(urllib.error.HTTPError, urllib.response.addinfourl))
print("HTTPError_has_close:", hasattr(urllib.error.HTTPError, "close"))
PY

sed -n '138,149p' src/synthorg/observability/http_handler.py

Expected result: HTTPError is closeable, and the current retry loop retries it without an explicit close().

Also applies to: 138-149

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` at line 12, The HTTPError paths
in the retry loop are leaking sockets because exceptions from
urllib.request.urlopen still own a response stream; in the except block that
catches urllib.error.HTTPError (the retry loop around urllib.request.urlopen in
the HTTP handling code), call exc.close() (or exc.close() then re-raise/continue
as appropriate) before retrying or returning to ensure the underlying response
fd is closed; update the except urllib.error.HTTPError as exc branch in the HTTP
handler retry loop to explicitly close the exception (exc.close()) prior to
looping or raising.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/observability/sink_config_builder.py`:
- Around line 600-623: _pars e_http_headers currently preserves header names
verbatim; update it to trim and validate header names at parse time by stripping
whitespace from pair[0] (e.g., name = pair[0].strip()), rejecting and raising
ValueError referencing custom_sinks[{index}].http_headers[{j}] if the stripped
name is empty or invalid, and then append the normalized name (not the original)
with the value to headers; use the existing symbols _parse_http_headers,
raw_headers, pair, _HEADER_PAIR_LEN, headers to locate and change the logic
where headers.append((pair[0], pair[1])) occurs.

In `@src/synthorg/observability/syslog_handler.py`:
- Around line 63-77: The TCP branch in SysLogHandler risks concatenated JSON
messages because handler.append_nul is set False; update the code that
constructs the handler (where SysLogHandler is instantiated and append_nul is
set) to detect when PROTOCOL_MAP[sink.syslog_protocol] indicates TCP and either
(a) enforce/verify that the remote syslog receiver supports an explicit framing
scheme (RFC5424 octet-counting or newline-delimited JSON) before allowing
syslog_protocol=tcp, or (b) implement TCP-specific framing locally by
replacing/wrapping the handler with a custom handler/class that emits a
deterministic separator (e.g., octet-count prefixing or newline delimiter) for
each record; also ensure any gating/advertising of tcp support references this
framing implementation (symbols: SysLogHandler, append_nul, PROTOCOL_MAP,
sink.syslog_protocol).

In `@tests/unit/observability/test_http_handler.py`:
- Around line 117-122: The test currently only checks that the request body
parses to a list with length, but doesn't assert element types; update the
assertions after parsing request.data in
tests/unit/observability/test_http_handler.py (around the mock_urlopen usage and
variable names request and body) to verify each item in body is a JSON
object/dict (not a string) and optionally contains expected keys (e.g.,
"event"), e.g., iterate over body and assert isinstance(item, dict) and "event"
in item to prevent stringified JSON entries from passing.

---

Duplicate comments:
In `@src/synthorg/observability/http_handler.py`:
- Line 12: The HTTPError paths in the retry loop are leaking sockets because
exceptions from urllib.request.urlopen still own a response stream; in the
except block that catches urllib.error.HTTPError (the retry loop around
urllib.request.urlopen in the HTTP handling code), call exc.close() (or
exc.close() then re-raise/continue as appropriate) before retrying or returning
to ensure the underlying response fd is closed; update the except
urllib.error.HTTPError as exc branch in the HTTP handler retry loop to
explicitly close the exception (exc.close()) prior to looping or raising.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d7d079ec-28b9-4f83-81d0-035961b5970c

📥 Commits

Reviewing files that changed from the base of the PR and between f09195d and cf09f58.

📒 Files selected for processing (7)
  • docs/guides/centralized-logging.md
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/syslog_handler.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/test_http_handler.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Web
  • GitHub Check: Build Backend
  • GitHub Check: Build Sandbox
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use @pytest.mark.unit for unit tests, @pytest.mark.integration for integration tests, @pytest.mark.e2e for e2e tests, @pytest.mark.slow for slow tests
Coverage: 80% minimum (enforced in CI)
Async: asyncio_mode = "auto" -- no manual @pytest.mark.asyncio needed
Timeout: 30 seconds per test (global in pyproject.toml -- do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed)
Parametrize: Prefer @pytest.mark.parametrize for testing similar cases
Tests must use test-provider, test-small-001, etc. instead of real vendor names
Property-based testing (Python): use Hypothesis (@given + @settings). Profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var. Run dev profile: HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n auto -k properties
Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() to make them deterministic instead of widening timing margins
For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use asyncio.Event().wait() instead of asyncio.sleep(large_number) -- it is cancellation-safe

Files:

  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_config.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001, large/medium/small as aliases

Files:

  • tests/unit/observability/test_compression.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_config.py
  • src/synthorg/observability/http_handler.py
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Docs: docs/ (Markdown, built with Zensical, config: mkdocs.yml). Design spec: docs/design/ (11 pages), Architecture: docs/architecture/, Roadmap: docs/roadmap/

Files:

  • docs/guides/centralized-logging.md
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: No from __future__ import annotations -- Python 3.14 has PEP 649
Use PEP 758 except syntax: except A, B: (no parentheses) -- ruff enforces this on Python 3.14
Type hints: all public functions must have type hints, mypy strict mode
Google-style docstrings required on public classes and functions (enforced by ruff D rules)
Create new objects, never mutate existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction + MappingProxyType wrapping for read-only enforcement
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence)
Config vs runtime state: use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time
Use @computed_field for derived values instead of storing + validating redundant fields (e.g. TokenUsage.total_tokens)
Use NotBlankStr (from core.types) for all identifier/name fields -- including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants -- instead of manual whitespace validators
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task
Line length: 88 characters (ruff enforced)
Functions must be < 50 lines, files must be < 800 lines
Handle errors explicitly, never silently swallow them
Validate at system boundaries (user input, external APIs, config files)
Every module wit...

Files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/http_handler.py
🧠 Learnings (38)
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_config.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_config.py
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_config.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code; exceptions: `observability/setup.py` and `observability/sinks.py` may use stdlib logging and `print(..., file=sys.stderr)` for bootstrap code

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to **/*.py : Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to **/*.py : Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-26T15:18:16.848Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-26T15:18:16.848Z
Learning: Applies to src/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `computed_field` for derived values instead of storing redundant fields. Use `NotBlankStr` for all identifier/name fields.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to tests/**/*.py : Fix flaky tests completely and fundamentally; for timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally; mock `time.monotonic()` and `asyncio.sleep()` for determinism; use `asyncio.Event().wait()` for indefinite blocking instead of `asyncio.sleep(large_number)`

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to tests/**/*.py : For tasks that must block indefinitely until cancelled (e.g. simulating a slow provider or stubborn coroutine), use `asyncio.Event().wait()` instead of `asyncio.sleep(large_number)` -- it is cancellation-safe

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to tests/**/*.py : NEVER skip, dismiss, or ignore flaky tests — always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to tests/**/*.py : Flaky tests: NEVER skip, dismiss, or ignore flaky tests -- always fix them fully and fundamentally. For timing-sensitive tests, mock `time.monotonic()` and `asyncio.sleep()` to make them deterministic instead of widening timing margins

Applied to files:

  • tests/unit/observability/test_http_handler.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to {src,tests}/**/*.py : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small` as aliases

Applied to files:

  • tests/unit/observability/test_config.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to {src,tests,web,cli,site}/**/*.{py,ts,tsx,go,astro} : Vendor-agnostic everywhere: NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Vendor names may only appear in: (1) Operations design page provider list (docs/design/operations.md), (2) .claude/ skill/agent files, (3) third-party import paths/module names.

Applied to files:

  • tests/unit/observability/test_config.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {src/synthorg/**/*.py,tests/**/*.py,web/src/**/*.{ts,tsx}} : NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples — use generic names: `example-provider`, `example-large-001`, `example-medium-001`, `example-small-001`, `large`/`medium`/`small`

Applied to files:

  • tests/unit/observability/test_config.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to tests/**/*.py : Tests must use `test-provider`, `test-small-001`, etc. instead of real vendor names

Applied to files:

  • tests/unit/observability/test_config.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Tests must use test-provider, test-small-001, etc. for vendor-agnostic test data.

Applied to files:

  • tests/unit/observability/test_config.py
📚 Learning: 2026-04-01T10:42:14.650Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T10:42:14.650Z
Learning: Applies to src/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py` and `observability/sinks.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for bootstrap)

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code; use the injected logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the synthorg logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • src/synthorg/observability/http_handler.py

Comment on lines +63 to +77
handler = logging.handlers.SysLogHandler(
address=(sink.syslog_host, sink.syslog_port),
facility=FACILITY_MAP[sink.syslog_facility],
socktype=socket.SocketKind(
PROTOCOL_MAP[sink.syslog_protocol],
),
)
except OSError as exc:
msg = (
f"Failed to connect to syslog endpoint "
f"{sink.syslog_host}:{sink.syslog_port} "
f"({sink.syslog_protocol.value.upper()}): {exc}"
)
raise RuntimeError(msg) from exc
handler.append_nul = False
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import inspect
import logging.handlers

print("append_nul_default:", logging.handlers.SysLogHandler.append_nul)
source = inspect.getsource(logging.handlers.SysLogHandler.emit)
for line in source.splitlines():
    if "append_nul" in line or "sendall" in line or "sendto" in line:
        print(line)
PY

sed -n '63,78p' src/synthorg/observability/syslog_handler.py

Repository: Aureliolo/synthorg

Length of output: 827


🏁 Script executed:

# Check the full context of the syslog_handler.py file
cd src/synthorg/observability && wc -l syslog_handler.py

Repository: Aureliolo/synthorg

Length of output: 82


🏁 Script executed:

# Look at the broader context of the handler setup and message format
sed -n '40,95p' src/synthorg/observability/syslog_handler.py

Repository: Aureliolo/synthorg

Length of output: 1683


🏁 Script executed:

# Check if there's any RFC 3164/5424 framing logic elsewhere
rg -B2 -A2 "(RFC|3164|5424|framing)" src/synthorg/observability/

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

# Look for any tests related to syslog TCP/UDP behavior
fd -type f -name "*.py" -path "*/tests/*" -exec rg -l "syslog|SysLog" {} \;

Repository: Aureliolo/synthorg

Length of output: 232


Validate TCP message framing contract after disabling NUL termination.

The TCP path creates a message boundary risk. UDP relies on datagram boundaries, but TCP is a byte stream. With append_nul = False on line 77 and JSON log formatting, consecutive events are concatenated without any explicit separator in the stream. Verify that the syslog receiver is configured for a compatible framing scheme (e.g., RFC 5424 octet-counting, newline-delimited JSON, or equivalent). If not, add TCP-specific framing via a custom handler before advertising syslog_protocol=tcp support.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/syslog_handler.py` around lines 63 - 77, The TCP
branch in SysLogHandler risks concatenated JSON messages because
handler.append_nul is set False; update the code that constructs the handler
(where SysLogHandler is instantiated and append_nul is set) to detect when
PROTOCOL_MAP[sink.syslog_protocol] indicates TCP and either (a) enforce/verify
that the remote syslog receiver supports an explicit framing scheme (RFC5424
octet-counting or newline-delimited JSON) before allowing syslog_protocol=tcp,
or (b) implement TCP-specific framing locally by replacing/wrapping the handler
with a custom handler/class that emits a deterministic separator (e.g.,
octet-count prefixing or newline delimiter) for each record; also ensure any
gating/advertising of tcp support references this framing implementation
(symbols: SysLogHandler, append_nul, PROTOCOL_MAP, sink.syslog_protocol).

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/settings/definitions/observability.py (1)

41-46: ⚠️ Potential issue | 🟡 Minor

Document that compress_rotated is builtin-only.

The runtime rejects compress_rotated when rotation.strategy is external, but this schema text presents it as a generic rotation field. A short “builtin only” note here would keep the settings UI from advertising an unsupported combination.

📝 Suggested wording
-            "json_format (bool), rotation (object with strategy, "
-            "max_bytes, backup_count, compress_rotated)"
+            "json_format (bool), rotation (object with strategy, "
+            "max_bytes, backup_count, compress_rotated [builtin only])"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/settings/definitions/observability.py` around lines 41 - 46,
Update the schema description for per-sink overrides in observability.py to note
that the rotation field compress_rotated is builtin-only: change the description
string (the one listing enabled, level, json_format, rotation with strategy,
max_bytes, backup_count, compress_rotated) to append a short parenthetical like
“compress_rotated (builtin-only; rejected when rotation.strategy == 'external')”
so the settings UI won't advertise an unsupported combination.
♻️ Duplicate comments (3)
src/synthorg/observability/sinks.py (1)

79-91: ⚠️ Potential issue | 🟠 Major

Don't treat a post-rotation gzip failure as a completed rollover.

Once _rotate_current_log() has created .1, catching the OSError here leaves that backup outside the managed chain. The next rollover then unlinks .1 in _rotate_current_log(), so a transient compression error permanently drops one retained generation. Preserve an existing uncompressed .1 before the next rollover, or keep the failure on the retention path instead of swallowing it here. Please add a regression test that performs a second rollover after a forced compression failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/sinks.py` around lines 79 - 91, The current
rollover swallows OSError from _compress_file after _rotate_current_log has
already created the .1 file, which can lose a generation on the next rollover;
change the logic in the rollover path that calls _shift_gz_backups(),
_rotate_current_log(), and _compress_file() so that compression failures do not
leave an untracked .1: either move compression into the retention/shift path (so
_compress_file is invoked before old backups are removed), or on OSError restore
or re-link the uncompressed dfn into the managed backup chain (e.g., ensure .1
remains present and counted) and then re-raise or return a non-swallowed error
so retention logic can handle it; update behavior of methods _shift_gz_backups,
_rotate_current_log, and _compress_file accordingly; add a regression test that
simulates a compression failure on the first rollover and then performs a second
rollover to assert no generation was lost and retention count still matches
backupCount.
src/synthorg/observability/http_handler.py (1)

91-95: ⚠️ Potential issue | 🟠 Major

Stop printing shipping failures to stderr.

_flush_loop() and _post_batch() bypass the observability pipeline by calling print(..., file=sys.stderr), so these failures never hit the configured sinks or the new shipping event taxonomy. Route them through logger with events.shipping constants instead.

As per coding guidelines, "Every module with business logic MUST have: from synthorg.observability import get_logger then logger = get_logger(__name__)" and "Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py and observability/sinks.py may use stdlib logging and print(..., file=sys.stderr) for bootstrap and handler-cleanup code."

Also applies to: 143-151

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 91 - 95, Replace
stdout/stderr prints in this module with the observability logger and shipping
event constants: ensure the module imports get_logger from
synthorg.observability and instantiates logger = get_logger(__name__), then
change the print(...) calls inside _flush_loop() and _post_batch() (and the
similar block at lines ~143-151) to logger.error(...) (or the appropriate level)
and emit the events.shipping constant(s) so failures go through the
observability pipeline; remove direct print/sys.stderr usage entirely and use
the events.shipping taxonomy when logging shipping failures.
src/synthorg/observability/config.py (1)

219-267: ⚠️ Potential issue | 🟠 Major

Explicit default-valued foreign fields still slip through.

These helpers only reject non-default values, so inputs like SinkConfig(sink_type=SinkType.CONSOLE, syslog_port=514) or http_batch_size=100 still validate even though transport-specific fields were explicitly supplied. That weakens the new cross-type rejection contract and hides misrouted config.

Based on learnings, "Validate at system boundaries (user input, external APIs, config files)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/config.py` around lines 219 - 267, The helpers
_reject_syslog_fields and _reject_http_fields currently only reject non-default
values, so explicitly supplied defaults slip through; update the config handling
to also detect and reject fields that were explicitly provided even if they
equal the default. Add per-field "provided" flags (e.g. syslog_host_provided,
syslog_port_provided, syslog_facility_provided, syslog_protocol_provided,
http_url_provided, http_headers_provided, http_batch_size_provided,
http_flush_interval_seconds_provided, http_timeout_seconds_provided,
http_max_retries_provided) set in the constructor or factory when parsing user
input, and modify _reject_syslog_fields and _reject_http_fields to raise
ValueError if either the value is non-default or the corresponding *_provided
flag is True; ensure checks reference the existing symbols (e.g.
_reject_syslog_fields, _reject_http_fields, syslog_port, http_batch_size) so
detection covers explicit default-valued inputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Line 141: The policy exception line in CLAUDE.md currently expands allowed
uses of stdlib logging/print to include observability/syslog_handler.py and
observability/http_handler.py; revert this to match the enforced guideline by
removing those two files so the exception only names observability/setup.py and
observability/sinks.py, and update the sentence wording to: "Exception:
observability/setup.py and observability/sinks.py may use stdlib logging and
print(..., file=sys.stderr) for bootstrap and handler-cleanup code." Ensure the
edited line replaces the existing one exactly where the rule text appears.

In `@src/synthorg/observability/config.py`:
- Around line 233-244: Update _validate_http_fields to parse self.http_url
(e.g., with urllib.parse.urlparse) and verify the parsed result contains a
non-empty network location/host (parsed.netloc or parsed.hostname). If
netloc/hostname is empty or missing, raise ValueError with a clear message such
as "http_url must include a host" in addition to the existing scheme and blank
checks so scheme-only values (like "https://") fail validation at startup.

In `@src/synthorg/observability/http_handler.py`:
- Around line 97-115: _drained_and_flush currently pulls the entire _queue into
a local list before batching which duplicates memory and delays sending; change
_drain_and_flush to build and send batches incrementally by repeatedly
get_nowait from self._queue into a current batch, calling
self._post_batch(current_batch) each time it reaches self._batch_size and after
the loop for any remaining items, and update self._pending_count under
self._pending_lock by subtracting only the number of records actually sent (not
the entire queue size); reference symbols: _drain_and_flush, self._queue,
self._batch_size, self._post_batch, self._pending_lock, self._pending_count.
- Around line 153-175: _close() currently only joins the flusher for a single
self._timeout, but _send_with_retries can take up to (self._max_retries + 1) *
self._timeout so close() can return while the daemon flusher still owns the last
batch and logs are lost; update close() (and the related drain/shutdown logic
that calls _drain_and_flush()) to wait until the flusher has finished sending
all retries by either repeatedly joining/waiting until the queue is empty and
the flusher thread is no longer alive or by joining the flusher with a timeout
computed as (self._max_retries + 1) * self._timeout (or looping joins until
completion), ensuring _drain_and_flush() is fully executed before returning so
_send_with_retries() completes and no batches remain owned by the daemon.

In `@src/synthorg/observability/sink_config_builder.py`:
- Around line 448-667: The file exceeds the 800-line limit and several new
sink-specific parsers should be extracted: move _parse_syslog_optional_fields,
_build_custom_syslog_sink, _HEADER_PAIR_LEN, _parse_http_headers, and
_build_custom_http_sink into a new helper module (e.g.,
synthorg.observability.sink_parsers) and keep only orchestration logic in
sink_config_builder; update sink_config_builder to import those symbols (and any
small shared helpers they need, like _parse_int_field/_parse_number_field if
used) and adjust references to call the imported functions, preserving existing
function names and signatures so callers (e.g., places that construct
SinkConfig) keep working. Ensure the new module has the same top-level imports
and type annotations and run tests/lint to confirm no unresolved imports.
- Around line 286-295: The parser currently accepts a RotationConfig with
compress_rotated and external together (parsed in
sink_config_builder._build_file_handler via RotationConfig.model_validate),
which later causes runtime rejection in _build_file_handler; after merging and
validating the RotationConfig (the merged passed into
RotationConfig.model_validate in build_log_config_from_settings), explicitly
check the resulting RotationConfig instance for the invalid combination
(rotation.compress_rotated == True and rotation.external is not None/True) and
raise a clear ValueError/ConfigError from build_log_config_from_settings so
invalid settings are rejected at parse time; update the tests by adding a
builder-level regression test that constructs settings with compress_rotated and
external and asserts build_log_config_from_settings raises the expected error.

In `@src/synthorg/observability/syslog_handler.py`:
- Around line 59-61: The current check "if not sink.syslog_host" misses
whitespace-only hosts; update the validation in the syslog sink builder (the
block that checks sink.syslog_host in
src/synthorg/observability/syslog_handler.py) to reject strings containing only
whitespace by testing trimmed value (e.g., if not sink.syslog_host or not
sink.syslog_host.strip()) and raise the same ValueError with the existing
message so blank-but-nonempty hosts fail fast during construction.

In `@tests/unit/observability/test_sink_config_builder.py`:
- Around line 549-834: The test file has grown past the 800-line limit; split
the large test module by moving the grouped test classes (e.g.,
TestCustomSyslogSinks, TestCustomHttpSinks, TestCompressRotatedOverrides,
TestBooleanRejection and related test functions like test_syslog_sink_created,
test_http_sink_created, test_compress_rotated_override,
test_http_batch_size_boolean_raises) into one or more new test modules (for
example tests/unit/observability/test_sink_config_syslog.py and
test_sink_config_http.py) keeping each file under 800 lines, update any shared
helpers (like _build and _DEFAULTS_COUNT) to be imported from the original test
helper module, and run the test suite to ensure imports and pytest markers
remain correct.

---

Outside diff comments:
In `@src/synthorg/settings/definitions/observability.py`:
- Around line 41-46: Update the schema description for per-sink overrides in
observability.py to note that the rotation field compress_rotated is
builtin-only: change the description string (the one listing enabled, level,
json_format, rotation with strategy, max_bytes, backup_count, compress_rotated)
to append a short parenthetical like “compress_rotated (builtin-only; rejected
when rotation.strategy == 'external')” so the settings UI won't advertise an
unsupported combination.

---

Duplicate comments:
In `@src/synthorg/observability/config.py`:
- Around line 219-267: The helpers _reject_syslog_fields and _reject_http_fields
currently only reject non-default values, so explicitly supplied defaults slip
through; update the config handling to also detect and reject fields that were
explicitly provided even if they equal the default. Add per-field "provided"
flags (e.g. syslog_host_provided, syslog_port_provided,
syslog_facility_provided, syslog_protocol_provided, http_url_provided,
http_headers_provided, http_batch_size_provided,
http_flush_interval_seconds_provided, http_timeout_seconds_provided,
http_max_retries_provided) set in the constructor or factory when parsing user
input, and modify _reject_syslog_fields and _reject_http_fields to raise
ValueError if either the value is non-default or the corresponding *_provided
flag is True; ensure checks reference the existing symbols (e.g.
_reject_syslog_fields, _reject_http_fields, syslog_port, http_batch_size) so
detection covers explicit default-valued inputs.

In `@src/synthorg/observability/http_handler.py`:
- Around line 91-95: Replace stdout/stderr prints in this module with the
observability logger and shipping event constants: ensure the module imports
get_logger from synthorg.observability and instantiates logger =
get_logger(__name__), then change the print(...) calls inside _flush_loop() and
_post_batch() (and the similar block at lines ~143-151) to logger.error(...) (or
the appropriate level) and emit the events.shipping constant(s) so failures go
through the observability pipeline; remove direct print/sys.stderr usage
entirely and use the events.shipping taxonomy when logging shipping failures.

In `@src/synthorg/observability/sinks.py`:
- Around line 79-91: The current rollover swallows OSError from _compress_file
after _rotate_current_log has already created the .1 file, which can lose a
generation on the next rollover; change the logic in the rollover path that
calls _shift_gz_backups(), _rotate_current_log(), and _compress_file() so that
compression failures do not leave an untracked .1: either move compression into
the retention/shift path (so _compress_file is invoked before old backups are
removed), or on OSError restore or re-link the uncompressed dfn into the managed
backup chain (e.g., ensure .1 remains present and counted) and then re-raise or
return a non-swallowed error so retention logic can handle it; update behavior
of methods _shift_gz_backups, _rotate_current_log, and _compress_file
accordingly; add a regression test that simulates a compression failure on the
first rollover and then performs a second rollover to assert no generation was
lost and retention count still matches backupCount.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 745d655f-1c05-4aab-970f-96746cef334a

📥 Commits

Reviewing files that changed from the base of the PR and between cf09f58 and 715346b.

📒 Files selected for processing (22)
  • CLAUDE.md
  • docs/design/operations.md
  • docs/guides/centralized-logging.md
  • mkdocs.yml
  • src/synthorg/observability/__init__.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/events/shipping.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/test_enums.py
  • tests/unit/observability/test_events.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_sink_config_builder.py
  • tests/unit/observability/test_sinks.py
  • tests/unit/observability/test_syslog_handler.py


- **Every module** with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py` and `observability/sinks.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for bootstrap and handler-cleanup code that runs before or during logging system configuration)
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py`, `observability/sinks.py`, `observability/syslog_handler.py`, and `observability/http_handler.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for handler construction, bootstrap, and error reporting code that runs before or during logging system configuration)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep logging/print exceptions narrowly scoped.

Line 141 expands exceptions to observability/syslog_handler.py and observability/http_handler.py, which conflicts with the enforced rule limiting exceptions to observability/setup.py and observability/sinks.py.

Suggested doc fix
-- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py`, `observability/sinks.py`, `observability/syslog_handler.py`, and `observability/http_handler.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for handler construction, bootstrap, and error reporting code that runs before or during logging system configuration)
+- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py` and `observability/sinks.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for bootstrap and handler-cleanup code)

As per coding guidelines, “Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py and observability/sinks.py may use stdlib logging and print(..., file=sys.stderr) for bootstrap and handler-cleanup code.”

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py`, `observability/sinks.py`, `observability/syslog_handler.py`, and `observability/http_handler.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for handler construction, bootstrap, and error reporting code that runs before or during logging system configuration)
- **Never** use `import logging` / `logging.getLogger()` / `print()` in application code (exception: `observability/setup.py` and `observability/sinks.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for bootstrap and handler-cleanup code)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CLAUDE.md` at line 141, The policy exception line in CLAUDE.md currently
expands allowed uses of stdlib logging/print to include
observability/syslog_handler.py and observability/http_handler.py; revert this
to match the enforced guideline by removing those two files so the exception
only names observability/setup.py and observability/sinks.py, and update the
sentence wording to: "Exception: observability/setup.py and
observability/sinks.py may use stdlib logging and print(..., file=sys.stderr)
for bootstrap and handler-cleanup code." Ensure the edited line replaces the
existing one exactly where the rule text appears.

Comment on lines +97 to +115
def _drain_and_flush(self) -> None:
"""Drain all queued records and POST as JSON batches."""
records: list[logging.LogRecord] = []
while True:
try:
records.append(self._queue.get_nowait())
except queue.Empty:
break

with self._pending_lock:
self._pending_count = max(
0,
self._pending_count - len(records),
)

for start in range(0, len(records), self._batch_size):
batch = records[start : start + self._batch_size]
if batch:
self._post_batch(batch)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't copy the entire backlog before sending.

_drain_and_flush() empties the whole SimpleQueue into a list and only then re-chunks it into batch_size slices. Under backpressure that duplicates the backlog in memory and delays the first POST until the full queue has been copied; drain and ship incrementally instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 97 - 115,
_drained_and_flush currently pulls the entire _queue into a local list before
batching which duplicates memory and delays sending; change _drain_and_flush to
build and send batches incrementally by repeatedly get_nowait from self._queue
into a current batch, calling self._post_batch(current_batch) each time it
reaches self._batch_size and after the loop for any remaining items, and update
self._pending_count under self._pending_lock by subtracting only the number of
records actually sent (not the entire queue size); reference symbols:
_drain_and_flush, self._queue, self._batch_size, self._post_batch,
self._pending_lock, self._pending_count.

Comment on lines +549 to +834
# -- Custom SYSLOG sinks ------------------------------------------


@pytest.mark.unit
class TestCustomSyslogSinks:
"""Custom sinks with sink_type='syslog'."""

def test_syslog_sink_created(self) -> None:
custom = json.dumps(
[
{
"sink_type": "syslog",
"syslog_host": "loghost.example.com",
}
]
)
result = _build(custom=custom)
assert len(result.config.sinks) == _DEFAULTS_COUNT + 1
syslog_sink = result.config.sinks[-1]
assert syslog_sink.sink_type == SinkType.SYSLOG
assert syslog_sink.syslog_host == "loghost.example.com"
assert syslog_sink.syslog_port == 514

def test_syslog_custom_settings(self) -> None:
custom = json.dumps(
[
{
"sink_type": "syslog",
"syslog_host": "10.0.0.1",
"syslog_port": 1514,
"syslog_facility": "local3",
"syslog_protocol": "tcp",
"level": "error",
}
]
)
result = _build(custom=custom)
syslog_sink = result.config.sinks[-1]
assert syslog_sink.syslog_port == 1514
assert syslog_sink.syslog_facility.value == "local3"
assert syslog_sink.syslog_protocol.value == "tcp"
assert syslog_sink.level == LogLevel.ERROR

def test_syslog_requires_host(self) -> None:
custom = json.dumps([{"sink_type": "syslog"}])
with pytest.raises(ValueError, match="syslog_host"):
_build(custom=custom)

def test_syslog_no_routing(self) -> None:
custom = json.dumps(
[
{
"sink_type": "syslog",
"syslog_host": "localhost",
}
]
)
result = _build(custom=custom)
assert len(result.routing_overrides) == 0

def test_syslog_unknown_field_rejected(self) -> None:
custom = json.dumps(
[
{
"sink_type": "syslog",
"syslog_host": "localhost",
"bad_field": True,
}
]
)
with pytest.raises(ValueError, match="Unknown fields"):
_build(custom=custom)

def test_syslog_invalid_facility_rejected(self) -> None:
custom = json.dumps(
[
{
"sink_type": "syslog",
"syslog_host": "localhost",
"syslog_facility": "invalid",
}
]
)
with pytest.raises(ValueError, match="facility"):
_build(custom=custom)

def test_syslog_invalid_protocol_rejected(self) -> None:
custom = json.dumps(
[
{
"sink_type": "syslog",
"syslog_host": "localhost",
"syslog_protocol": "quic",
}
]
)
with pytest.raises(ValueError, match="protocol"):
_build(custom=custom)


# -- Custom HTTP sinks ---------------------------------------------


@pytest.mark.unit
class TestCustomHttpSinks:
"""Custom sinks with sink_type='http'."""

def test_http_sink_created(self) -> None:
custom = json.dumps(
[
{
"sink_type": "http",
"http_url": "https://logs.example.com/ingest",
}
]
)
result = _build(custom=custom)
assert len(result.config.sinks) == _DEFAULTS_COUNT + 1
http_sink = result.config.sinks[-1]
assert http_sink.sink_type == SinkType.HTTP
assert http_sink.http_url == "https://logs.example.com/ingest"
assert http_sink.http_batch_size == 100

def test_http_custom_settings(self) -> None:
custom = json.dumps(
[
{
"sink_type": "http",
"http_url": "http://example-collector:3100/push",
"http_batch_size": 50,
"http_flush_interval_seconds": 2.0,
"http_timeout_seconds": 30.0,
"http_max_retries": 5,
"http_headers": [["Authorization", "Bearer tok"]],
"level": "warning",
}
]
)
result = _build(custom=custom)
http_sink = result.config.sinks[-1]
assert http_sink.http_batch_size == 50
assert http_sink.http_flush_interval_seconds == 2.0
assert http_sink.http_timeout_seconds == 30.0
assert http_sink.http_max_retries == 5
assert http_sink.http_headers == (("Authorization", "Bearer tok"),)
assert http_sink.level == LogLevel.WARNING

def test_http_requires_url(self) -> None:
custom = json.dumps([{"sink_type": "http"}])
with pytest.raises(ValueError, match="http_url"):
_build(custom=custom)

def test_http_rejects_non_http_scheme(self) -> None:
custom = json.dumps(
[
{
"sink_type": "http",
"http_url": "ftp://example.com",
}
]
)
with pytest.raises(ValueError, match="http_url must start with"):
_build(custom=custom)

def test_http_no_routing(self) -> None:
custom = json.dumps(
[
{
"sink_type": "http",
"http_url": "https://example.com/logs",
}
]
)
result = _build(custom=custom)
assert len(result.routing_overrides) == 0

def test_http_unknown_field_rejected(self) -> None:
custom = json.dumps(
[
{
"sink_type": "http",
"http_url": "https://example.com",
"bad_field": True,
}
]
)
with pytest.raises(ValueError, match="Unknown fields"):
_build(custom=custom)


# -- Custom sink type dispatch -------------------------------------


@pytest.mark.unit
class TestCustomSinkTypeDispatch:
"""The sink_type field controls which builder path is used."""

def test_default_omitted_remains_file(self) -> None:
custom = json.dumps([{"file_path": "extra.log"}])
result = _build(custom=custom)
assert result.config.sinks[-1].sink_type == SinkType.FILE

def test_explicit_file_type(self) -> None:
custom = json.dumps(
[
{
"sink_type": "file",
"file_path": "extra.log",
}
]
)
result = _build(custom=custom)
assert result.config.sinks[-1].sink_type == SinkType.FILE

def test_unknown_sink_type_rejected(self) -> None:
custom = json.dumps(
[
{
"sink_type": "kafka",
"endpoint": "broker:9092",
}
]
)
with pytest.raises(ValueError, match="sink_type"):
_build(custom=custom)

def test_mixed_types(self) -> None:
custom = json.dumps(
[
{"file_path": "extra.log"},
{
"sink_type": "syslog",
"syslog_host": "loghost",
},
{
"sink_type": "http",
"http_url": "https://example.com/logs",
},
]
)
result = _build(custom=custom)
assert len(result.config.sinks) == _DEFAULTS_COUNT + 3
types = [s.sink_type for s in result.config.sinks[-3:]]
assert types == [SinkType.FILE, SinkType.SYSLOG, SinkType.HTTP]


# -- Compress rotated in rotation overrides --------------------------------


@pytest.mark.unit
class TestCompressRotatedOverrides:
"""compress_rotated can be overridden in rotation settings."""

def test_compress_rotated_override(self) -> None:
overrides = json.dumps(
{
"audit.log": {
"rotation": {"compress_rotated": True},
},
}
)
result = _build(overrides=overrides)
audit = next(s for s in result.config.sinks if s.file_path == "audit.log")
assert audit.rotation is not None
assert audit.rotation.compress_rotated is True


# -- Boolean rejection for int/number fields -------------------------------


@pytest.mark.unit
class TestBooleanRejection:
"""Boolean values are rejected for integer and number fields."""

def test_http_batch_size_boolean_raises(self) -> None:
custom = json.dumps(
[
{
"sink_type": "http",
"http_url": "http://example.com/logs",
"http_batch_size": True,
}
]
)
with pytest.raises(ValueError, match="integer"):
_build(custom=custom)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Move the shipping-sink coverage into separate test modules.

These new syslog/http/compression cases push this file past the repo’s 800-line limit — it now runs through Line 834. Splitting them out would satisfy the size rule and make failures much easier to navigate.

As per coding guidelines, "Functions must be less than 50 lines, files must be less than 800 lines."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/observability/test_sink_config_builder.py` around lines 549 - 834,
The test file has grown past the 800-line limit; split the large test module by
moving the grouped test classes (e.g., TestCustomSyslogSinks,
TestCustomHttpSinks, TestCompressRotatedOverrides, TestBooleanRejection and
related test functions like test_syslog_sink_created, test_http_sink_created,
test_compress_rotated_override, test_http_batch_size_boolean_raises) into one or
more new test modules (for example
tests/unit/observability/test_sink_config_syslog.py and
test_sink_config_http.py) keeping each file under 800 lines, update any shared
helpers (like _build and _DEFAULTS_COUNT) to be imported from the original test
helper module, and run the test suite to ensure imports and pytest markers
remain correct.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
src/synthorg/observability/config.py (1)

162-180: ⚠️ Potential issue | 🟠 Major

Reject json_format=False for shipping sinks at the model boundary.

SinkConfig still accepts json_format=False for SYSLOG and HTTP, but build_syslog_handler() and build_http_handler() always install JSON renderers, so that value is silently ignored. Please enforce json_format is True in those branches as well, so direct SinkConfig(...) construction fails fast instead of accepting a setting the runtime never honors.

As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/config.py` around lines 162 - 180, The SYSLOG and
HTTP branches in _validate_sink_type_fields currently allow json_format=False
even though build_syslog_handler() and build_http_handler() always use JSON; add
an explicit check in the SinkConfig validation so those branches reject
json_format=False: inside the case SinkType.SYSLOG and case SinkType.HTTP call a
validator (or add an inline check) that raises/uses the existing validation
pattern if self.json_format is False (referencing _validate_syslog_fields,
_validate_http_fields, and _reject_* helpers to mirror style), so constructing a
SinkConfig with json_format=False for SYSLOG/HTTP fails immediately.
src/synthorg/observability/http_handler.py (2)

97-115: ⚠️ Potential issue | 🟠 Major

Flush in batch_size chunks instead of copying the whole backlog first.

Materializing the entire SimpleQueue into records doubles peak memory under backpressure and lets one _drain_and_flush() call own arbitrarily many network sends. That also makes the close() join budget inaccurate: with a multi-batch backlog, the daemon can still be shipping long after join_timeout expires. Drain directly into a current batch and post each chunk as soon as it fills.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 97 - 115, The
current _drain_and_flush() copies the entire _queue into a local records list,
blowing up memory and allowing many network sends; change it to drain the
SimpleQueue into a single growing batch list that appends items via
_queue.get_nowait(), and whenever the batch reaches _batch_size call
_post_batch(batch) and reset the batch; after the loop, if batch is non-empty
post the final batch; maintain correct pending accounting by decrementing
_pending_count by the total number of items drained inside the same critical
section guarded by _pending_lock (use a local drained_count accumulator), and
keep using _post_batch to perform sends so close()/join_timeout behavior
reflects actual in-flight batches.

62-67: ⚠️ Potential issue | 🟠 Major

Emit the HTTP shipping lifecycle through the observability logger instead of stderr.

The new handler still reports flusher failures and dropped batches with print(..., file=sys.stderr), and it never emits the SHIPPING_HTTP_* lifecycle/error events added in this PR. That means the most important shipping diagnostics bypass correlation, routing, and structured searchability.

As per coding guidelines, "Every module with business logic must have from synthorg.observability import get_logger then logger = get_logger(__name__).", "Always use event name constants from domain-specific modules under synthorg.observability.events", and "All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO level."

Also applies to: 80-95, 139-151, 177-190

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 62 - 67, Replace
stderr prints in the HTTP handler with structured observability logging: import
get_logger and create logger = get_logger(__name__), and replace all print(...,
file=sys.stderr) calls inside methods _flush_loop, _handle_failed_batch and any
batch-shipping/error paths (e.g., where dropped batches or flusher failures are
reported) with logger.{info,warn,error} calls using the appropriate event
constants from synthorg.observability.events (use the SHIPPING_HTTP_* constants
added in the PR). Ensure state transitions use INFO-level logs, recoverable
errors use WARNING, terminal/errors use ERROR, and include contextual fields
(batch id, exception text) in the log call before raising or continuing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/synthorg/observability/config.py`:
- Around line 162-180: The SYSLOG and HTTP branches in
_validate_sink_type_fields currently allow json_format=False even though
build_syslog_handler() and build_http_handler() always use JSON; add an explicit
check in the SinkConfig validation so those branches reject json_format=False:
inside the case SinkType.SYSLOG and case SinkType.HTTP call a validator (or add
an inline check) that raises/uses the existing validation pattern if
self.json_format is False (referencing _validate_syslog_fields,
_validate_http_fields, and _reject_* helpers to mirror style), so constructing a
SinkConfig with json_format=False for SYSLOG/HTTP fails immediately.

In `@src/synthorg/observability/http_handler.py`:
- Around line 97-115: The current _drain_and_flush() copies the entire _queue
into a local records list, blowing up memory and allowing many network sends;
change it to drain the SimpleQueue into a single growing batch list that appends
items via _queue.get_nowait(), and whenever the batch reaches _batch_size call
_post_batch(batch) and reset the batch; after the loop, if batch is non-empty
post the final batch; maintain correct pending accounting by decrementing
_pending_count by the total number of items drained inside the same critical
section guarded by _pending_lock (use a local drained_count accumulator), and
keep using _post_batch to perform sends so close()/join_timeout behavior
reflects actual in-flight batches.
- Around line 62-67: Replace stderr prints in the HTTP handler with structured
observability logging: import get_logger and create logger =
get_logger(__name__), and replace all print(..., file=sys.stderr) calls inside
methods _flush_loop, _handle_failed_batch and any batch-shipping/error paths
(e.g., where dropped batches or flusher failures are reported) with
logger.{info,warn,error} calls using the appropriate event constants from
synthorg.observability.events (use the SHIPPING_HTTP_* constants added in the
PR). Ensure state transitions use INFO-level logs, recoverable errors use
WARNING, terminal/errors use ERROR, and include contextual fields (batch id,
exception text) in the log call before raising or continuing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d3518e3a-7fca-479f-bf28-49b61f7cb4ba

📥 Commits

Reviewing files that changed from the base of the PR and between 715346b and 4c0fb6f.

📒 Files selected for processing (8)
  • src/synthorg/observability/_shipping_sink_parsers.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/test_sink_config_builder.py
  • tests/unit/observability/test_sink_config_overrides.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations. Python 3.14 has native PEP 649 lazy annotations.
Use PEP 758 except syntax: use except A, B: (no parentheses). Ruff enforces this on Python 3.14.
All public functions must have type hints. Mypy strict mode is enforced.
Docstrings must use Google style and are required on public classes and functions. Enforced by ruff D rules.
Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction plus MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing and validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr (from core.types) for all identifier/name fields, including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants, instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Maximum line length is 88 characters. Enforced by ruff.
Functions must be less than 50 lines, files must be less than 800 lines.
Handle errors explicitl...

Files:

  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/test_sink_config_overrides.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/_shipping_sink_parsers.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sink_config_builder.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic must have from synthorg.observability import get_logger then logger = get_logger(__name__). Use logger as the variable name (not _logger, not log).
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py and observability/sinks.py may use stdlib logging and print(..., file=sys.stderr) for bootstrap and handler-cleanup code.
Always use event name constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool, GIT_COMMAND_START from events.git). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Always use structured logging with logger.info(EVENT, key=value). Never use logger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
DEBUG logging for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.
All provider calls go through BaseCompletionProvider which applies retry and rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig. Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry.
RetryExhaustedError signals that all retries failed. The engine layer catches this to trigger fallback chains. The rate limiter respects RateLimitError.retry_after from providers and automatically pauses future requests.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, `example-large-...

Files:

  • src/synthorg/settings/definitions/observability.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/_shipping_sink_parsers.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/sink_config_builder.py
**/*

📄 CodeRabbit inference engine (CLAUDE.md)

**/*: No trailing whitespace. Enforced by trailing-whitespace pre-commit hook.
Files must end with a newline. Enforced by end-of-file-fixer pre-commit hook.

Files:

  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/test_sink_config_overrides.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/_shipping_sink_parsers.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sink_config_builder.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow for test categorization.
Maintain 80% code coverage minimum. Enforced in CI.
Use asyncio_mode = "auto" for async tests. No manual @pytest.mark.asyncio needed.
Global timeout of 30 seconds per test. Do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed.
Prefer @pytest.mark.parametrize for testing similar cases.
Test vendor-agnostic naming: use test-provider, test-small-001, etc. instead of real vendor names.
Use Hypothesis for property-based testing with @given and @settings. Hypothesis profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var.
Never skip, dismiss, or ignore flaky tests. Always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() instead of widening timing margins. For tasks blocking indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/observability/test_sink_config_overrides.py
  • tests/unit/observability/test_sink_config_builder.py
🧠 Learnings (42)
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/test_sink_config_overrides.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/_shipping_sink_parsers.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks

Applied to files:

  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/test_sink_config_overrides.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/_shipping_sink_parsers.py
  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • tests/unit/observability/test_sink_config_overrides.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/_shipping_sink_parsers.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from the domain-specific module under `synthorg.observability.events` in logging calls

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Use `logger` as the variable name (not `_logger`, not `log`).

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code. Exception: `observability/setup.py` and `observability/sinks.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for bootstrap and handler-cleanup code.

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-16T20:14:00.937Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T20:14:00.937Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code; exceptions: `observability/setup.py` and `observability/sinks.py` may use stdlib logging and `print(..., file=sys.stderr)` for bootstrap code

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code; use the injected logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the synthorg logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/config.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to **/*.py : Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to **/*.py : Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Keep functions under 50 lines and files under 800 lines

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Functions must be less than 50 lines, files must be less than 800 lines.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Functions must be less than 50 lines; files must be less than 800 lines

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-03-17T13:48:18.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T13:48:18.114Z
Learning: Applies to **/*.py : Functions: < 50 lines, files < 800 lines.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-03-21T11:08:01.542Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-21T11:08:01.542Z
Learning: Applies to **/*.{py,md} : Line length: 88 characters (ruff).

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Maintain 80% minimum test coverage (enforced in CI)

Applied to files:

  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to tests/**/*.py : Maintain 80% code coverage minimum. Enforced in CI.

Applied to files:

  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Line length: 88 characters (enforced by ruff)

Applied to files:

  • tests/unit/observability/test_sink_config_builder.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to tests/**/*.py : Test markers: pytest.mark.unit, pytest.mark.integration, pytest.mark.e2e, pytest.mark.slow. Coverage: 80% minimum (enforced in CI).

Applied to files:

  • tests/unit/observability/test_sink_config_builder.py

Aureliolo and others added 3 commits April 1, 2026 19:00
Add SinkType.SYSLOG for shipping structured JSON to configurable syslog
endpoints (UDP/TCP), SinkType.HTTP for batched JSON POST to HTTP
endpoints with background flushing and retries, and compressed archival
of rotated log files via gzip.

- SyslogFacility and SyslogProtocol enums for syslog configuration
- SinkConfig extended with syslog and HTTP fields, 4-branch validator
- _CompressingRotatingFileHandler with gzip-aware rotation chain
- HttpBatchHandler with SimpleQueue + daemon thread + urllib.request
- Event constants in events/shipping.py
- Runtime custom_sinks builder supports sink_type dispatch
- LogConfig validates endpoint uniqueness (host:port, URL)
- Centralized logging deployment guide with syslog, HTTP, Docker,
  and compression patterns

Closes #566

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add exception handling to HTTP flusher thread (prevents silent death)
- Add dropped-record counter to HttpBatchHandler for observability
- Fix pending count race condition (decrement by drained count)
- Add stderr warning on compression failure (was silently suppressed)
- Add fallback to uncompressed rotation on doRollover failure
- Use atomic temp file for gzip compression (prevents corrupt backups)
- Fail loudly in build_http_handler/build_syslog_handler for empty URLs
- Add compress_rotated to _ROTATION_FIELDS and parser (was silently rejected)
- Update operations.md: event count 54->55, custom_sinks/sink_overrides descriptions
- Update CLAUDE.md: package structure, logging exception list for new handlers

Pre-reviewed by 4 agents, 13 findings addressed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevents UnboundLocalError if an unknown sink type is encountered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aureliolo and others added 6 commits April 1, 2026 19:00
The default case in build_handler is a defensive guard for future enum
extensions. Mypy correctly identifies it as unreachable with the current
SinkType enum coverage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, and CodeRabbit

Source fixes (16 files, 480+/-181 lines):
- Fix _shift_gz_backups off-by-one: move oldest removal before shift loop
- Fix HTTP _post_batch double-encoding: use string joining instead of json.dumps
- Fix _compress_file error handling: re-raise after cleanup, no stale fallback
- Expand cross-type field rejection to all secondary syslog/http fields
- Close urlopen() response via context manager to prevent FD leaks
- Chunk _drain_and_flush by batch_size to prevent unbounded HTTP payloads
- Protect _dropped_count with _pending_lock for thread safety
- Stream compression in 1MB chunks instead of reading entire file into memory
- Reject compress_rotated=True with non-BUILTIN rotation strategy
- Wrap TCP syslog SysLogHandler construction in try/except
- Remove json_format from shipping sink allowed fields
- Extract _attach_formatter_and_routing to keep build_handler under 50 lines
- Add Final annotations, PEP 604 isinstance, logger declaration in builder
- Add events/shipping.py consumer guidance docstring

Test fixes (39 new tests, flaky test elimination):
- Replace time.sleep with threading.Event synchronization
- Replace conditional if-guard assertions with hard asserts
- Fix no-op qsize>=0 assertion to qsize>=1
- Verify JSON format output in syslog handler test
- Add build_handler SYSLOG/HTTP dispatch, guard clause, and dropped count tests
- Add _LoggerNameFilter blank prefix rejection tests
- Add compress_rotated override and boolean rejection tests
- Add backupCount retention verification test
- Move handler_cleanup fixture to conftest.py (deduplicate)
- Replace vendor name "loki" with "example-collector"

Doc fixes:
- Fix event taxonomy count 55->56 in operations.md
- Add events.shipping domain example to CLAUDE.md logging section
- Fix duplicate YAML anchor in centralized-logging.md
- Replace incorrect Loki/Elasticsearch endpoint examples with generic guidance
- Add conf language tags to fenced code blocks
- Update module docstring, RotationConfig docstring, sink_overrides description

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Guard close() against concurrent drain when flusher join times out
- Use model_validate instead of model_copy to enforce Pydantic field
  constraints (gt=0, ge=0) in rotation/sink override application
- Set append_nul=False on SysLogHandler to prevent NUL byte in JSON
- Assert uncompressed .1 backup retained on compression failure in test
- Replace vendor name "loki" with generic hostnames in test_config.py
- Close handler inside urlopen patch to prevent real network calls
- Clarify HTTP sink table: vendor APIs need collector/proxy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Strip and reject empty HTTP header names in _parse_http_headers
- Make append_nul protocol-dependent: False for UDP (self-framing),
  True for TCP (NUL serves as message delimiter for receivers)
- Close HTTPError response FP in retry loop to prevent socket/FD leaks;
  extract _send_with_retries helper to keep _post_batch under C901 limit
- Assert HTTP batch body elements are dicts with expected keys in test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Validate http_url has a host via urlparse (rejects "https://")
- Compute close() join timeout to account for in-flight retries
- Make syslog_host check reject whitespace-only strings
- Extract shipping sink parsers to _shipping_sink_parsers.py (704 lines)
- Reject compress_rotated + external strategy at parse time
- Split test_sink_config_overrides.py from test_sink_config_builder.py
  (792 lines); add compress_rotated+external rejection test
- Note compress_rotated is builtin-only in settings description

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both handlers unconditionally use JSONRenderer, so json_format=False
was silently ignored.  Now rejected at SinkConfig validation time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/synthorg/observability/sink_config_builder.py (1)

244-298: 🛠️ Refactor suggestion | 🟠 Major

Split _parse_rotation_override before it grows further.

Adding compress_rotated pushed this helper past the repo’s <50-line limit. Extracting the field-specific parsing or the merge/validate tail would keep it within the rule and make future rotation options easier to add.

As per coding guidelines, "Functions must be less than 50 lines, files must be less than 800 lines."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/sink_config_builder.py` around lines 244 - 298,
The _parse_rotation_override function exceeds the 50-line limit after adding
compress_rotated; refactor by extracting field parsing and the merge/validate
tail into one or two small helpers: e.g., create _parse_rotation_fields(raw)
that validates strategy/max_bytes/backup_count/compress_rotated (using
_STRATEGY_MAP and _parse_bool and raising the same errors) and returns the
updates dict, and create _merge_and_validate_rotation(base, updates) that
performs {**base.model_dump(), **updates}, calls RotationConfig.model_validate
and _check_compress_rotated_strategy, then returns the result; replace the body
of _parse_rotation_override to call these helpers and keep its public behavior
unchanged.
src/synthorg/observability/config.py (1)

44-67: ⚠️ Potential issue | 🟠 Major

Reject unsupported compress_rotated combinations in RotationConfig.

RotationConfig(strategy=EXTERNAL, compress_rotated=True) is currently a valid model and only fails later in _build_file_handler(). That turns a config error into a runtime handler-construction failure instead of rejecting it at parse time.

🔧 Minimal fix
 class RotationConfig(BaseModel):
@@
     compress_rotated: bool = Field(
         default=False,
         description="Gzip-compress rotated backup files",
     )
+
+    `@model_validator`(mode="after")
+    def _validate_compress_rotated_strategy(self) -> Self:
+        if (
+            self.compress_rotated
+            and self.strategy != RotationStrategy.BUILTIN
+        ):
+            msg = (
+                "compress_rotated is only supported with "
+                "RotationStrategy.BUILTIN"
+            )
+            raise ValueError(msg)
+        return self

As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/config.py` around lines 44 - 67, Add validation in
the RotationConfig model to reject invalid combinations at parse time: implement
a Pydantic root or model validator on RotationConfig that checks the fields
strategy and compress_rotated and raises a ValueError when strategy ==
RotationStrategy.EXTERNAL and compress_rotated is True (so
RotationConfig(strategy=EXTERNAL, compress_rotated=True) is invalid). This
prevents deferring the error to _build_file_handler and ensures config parsing
fails fast with a clear message referencing strategy and compress_rotated.
♻️ Duplicate comments (4)
docs/design/operations.md (2)

1316-1323: ⚠️ Potential issue | 🟡 Minor

Clarify that Loki/Elasticsearch-style backends still need a collector or proxy here.

This table reads as if those backends are direct HTTP targets, but the HTTP sink emits raw JSON arrays. Please mirror the new centralized-logging guide’s adapter caveat here so the two docs do not disagree.

What payload does Grafana Loki `/loki/api/v1/push` expect? What payload does Elasticsearch `/_bulk` expect? Do either accept a raw JSON array of log objects?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/operations.md` around lines 1316 - 1323, The docs table currently
implies HTTP targets like Loki/Elasticsearch can be hit directly, but the HTTP
sink (referenced by the `HTTP` row and the `custom_sinks` configuration) emits
raw JSON arrays and therefore requires a collector/proxy/adapter for backends
that expect different payloads (e.g., Grafana Loki’s /loki/api/v1/push and
Elasticsearch’s /_bulk). Update the paragraph after the table to mirror the
Centralized Logging guide’s adapter caveat: state explicitly that the HTTP sink
sends raw JSON arrays and that backends such as Loki and Elasticsearch require
an intermediate collector/proxy or adapter to translate the payload into the
backend-specific format and endpoints.

1344-1347: ⚠️ Potential issue | 🟡 Minor

Call out that compress_rotated is builtin-only in this page too.

The builder/settings definitions now reject compress_rotated when rotation.strategy is external, but these sections still present it as a generic rotation flag. Add the same caveat here so operators do not copy an invalid combination.

Also applies to: 1457-1464

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/operations.md` around lines 1344 - 1347, Update the paragraph
that documents compress_rotated to explicitly state that compress_rotated is
supported only for built-in rotation and is rejected when rotation.strategy is
set to "external"; reference the config keys `compress_rotated` and
`rotation.strategy` (and the value `external`) so readers know the invalid
combination and avoid copying it. Make the same clarification in the duplicate
section around lines 1457–1464 so both occurrences warn operators that
compression is builtin-only.
src/synthorg/observability/http_handler.py (1)

97-115: ⚠️ Potential issue | 🟠 Major

close() still underestimates backlog flush time.

_drain_and_flush() removes the whole queue and can then spend many batch send cycles posting it, but close() only waits for one batch worth of retry time. If that join times out, the daemon thread still owns unsent records and they can disappear on process exit.

Also applies to: 177-190

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 97 - 115, close()
currently only waits one retry interval and can exit while _drain_and_flush()
(which empties _queue into several batches and increments/decrements
_pending_count via _post_batch) is still sending multiple batches; update
close() to compute and wait for the actual remaining work by checking _queue and
_pending_count (use _queue.empty() and self._pending_count == 0) or calculate
needed wait as ceil((len(queue) + self._pending_count) / self._batch_size) *
self._retry_interval, then loop/wait until both _queue is empty and
_pending_count == 0 (with an overall timeout) before letting the daemon thread
stop; ensure the daemon/_post_batch logic still decrements _pending_count and
that close() joins the daemon_thread only after this condition is satisfied.
src/synthorg/observability/sinks.py (1)

79-84: ⚠️ Potential issue | 🟠 Major

A gzip failure still drops one retained backup on the next rollover.

When _compress_file() raises, doRollover() leaves the plain .1 file behind, but _rotate_current_log() deletes that same .1 on the next rollover before rotating again. A second rollover after a compression failure therefore shortens retention by one generation instead of preserving an uncompressed fallback chain.

Also applies to: 110-119, 121-147

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/sinks.py` around lines 79 - 84, When a compression
failure occurs, the current flow (doRollover -> _shift_gz_backups() -> dfn =
_rotate_current_log() -> _compress_file(dfn)) can remove the plain .1 on the
next rollover and shorten retention; fix by deferring destructive
moves/deletions until compression succeeds: change the rollover logic so
_rotate_current_log() creates or renames the current log to a temporary/working
name (e.g., produce dfn like "log.1.tmp" or mark it un-deleted) and only call
_shift_gz_backups() / remove the previous plain ".1" after _compress_file(dfn)
returns successfully; alternatively, make _compress_file() atomically produce
the final compressed name and only then remove the plain file. Update
doRollover(), _rotate_current_log(), and _compress_file() accordingly so
deletions of the plain ".1" happen only on successful compression.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/guides/centralized-logging.md`:
- Around line 179-183: The Docker Logging Drivers section currently implies
Docker will capture logs but doesn't instruct operators to enable structured
JSON output for the console sink; update the docs to show the console JSON
override by adding guidance to set the console sink (__console__) to
json_format: true (and include an example YAML snippet showing the __console__
override) so Docker drivers receive structured JSON instead of colored stderr;
apply the same clarification and example to the other affected section
referenced as the block around 211-218.

In `@src/synthorg/observability/config.py`:
- Around line 137-160: SinkConfig currently exposes http_headers publicly but
lacks model-level validation, so callers can construct invalid headers (e.g.,
empty names) bypassing the custom-sink parser check; add a Pydantic validator on
the http_headers field in the SinkConfig model (e.g., a
`@validator`("http_headers") method) that verifies http_headers is an iterable of
(str, str) pairs and that each header name is a non-empty string (and optionally
reject illegal chars like ":"), and raise a ValueError with a clear message on
invalid input; apply the same validator logic to the other similar http_headers
field instance elsewhere in the file (the second occurrence) so both SinkConfig
usages enforce header validation at the config boundary.

In `@src/synthorg/observability/http_handler.py`:
- Around line 153-175: The _send_with_retries function indiscriminately retries
on all exceptions (including urllib.error.HTTPError for 4xx client errors) and
sends identical POST bodies, risking unnecessary retries and duplicate batches;
update _send_with_retries to only retry on transient failures (e.g., network
errors, timeouts, and HTTP 5xx) by inspecting exceptions and HTTPError.code,
and/or add an idempotency mechanism by including a unique
batch_id/idempotency-key in the POST headers or body (generate per-batch before
calling _send_with_retries) so retries are safe if the server supports
deduplication; ensure you reference and modify the _send_with_retries logic and
the code that builds the POST payload/headers to include the idempotency key.

In `@src/synthorg/observability/syslog_handler.py`:
- Around line 59-65: The validation currently checks sink.syslog_host.strip()
but then passes the original untrimmed sink.syslog_host to
logging.handlers.SysLogHandler; change the code to use the trimmed value (e.g.,
assign host = sink.syslog_host.strip() after the guard and pass host into
SysLogHandler address) and apply the same trimmed-value usage for the other
SysLogHandler instantiation referenced around lines 72-74 so the handler
receives a clean hostname instead of the raw, whitespace-padded string.

---

Outside diff comments:
In `@src/synthorg/observability/config.py`:
- Around line 44-67: Add validation in the RotationConfig model to reject
invalid combinations at parse time: implement a Pydantic root or model validator
on RotationConfig that checks the fields strategy and compress_rotated and
raises a ValueError when strategy == RotationStrategy.EXTERNAL and
compress_rotated is True (so RotationConfig(strategy=EXTERNAL,
compress_rotated=True) is invalid). This prevents deferring the error to
_build_file_handler and ensures config parsing fails fast with a clear message
referencing strategy and compress_rotated.

In `@src/synthorg/observability/sink_config_builder.py`:
- Around line 244-298: The _parse_rotation_override function exceeds the 50-line
limit after adding compress_rotated; refactor by extracting field parsing and
the merge/validate tail into one or two small helpers: e.g., create
_parse_rotation_fields(raw) that validates
strategy/max_bytes/backup_count/compress_rotated (using _STRATEGY_MAP and
_parse_bool and raising the same errors) and returns the updates dict, and
create _merge_and_validate_rotation(base, updates) that performs
{**base.model_dump(), **updates}, calls RotationConfig.model_validate and
_check_compress_rotated_strategy, then returns the result; replace the body of
_parse_rotation_override to call these helpers and keep its public behavior
unchanged.

---

Duplicate comments:
In `@docs/design/operations.md`:
- Around line 1316-1323: The docs table currently implies HTTP targets like
Loki/Elasticsearch can be hit directly, but the HTTP sink (referenced by the
`HTTP` row and the `custom_sinks` configuration) emits raw JSON arrays and
therefore requires a collector/proxy/adapter for backends that expect different
payloads (e.g., Grafana Loki’s /loki/api/v1/push and Elasticsearch’s /_bulk).
Update the paragraph after the table to mirror the Centralized Logging guide’s
adapter caveat: state explicitly that the HTTP sink sends raw JSON arrays and
that backends such as Loki and Elasticsearch require an intermediate
collector/proxy or adapter to translate the payload into the backend-specific
format and endpoints.
- Around line 1344-1347: Update the paragraph that documents compress_rotated to
explicitly state that compress_rotated is supported only for built-in rotation
and is rejected when rotation.strategy is set to "external"; reference the
config keys `compress_rotated` and `rotation.strategy` (and the value
`external`) so readers know the invalid combination and avoid copying it. Make
the same clarification in the duplicate section around lines 1457–1464 so both
occurrences warn operators that compression is builtin-only.

In `@src/synthorg/observability/http_handler.py`:
- Around line 97-115: close() currently only waits one retry interval and can
exit while _drain_and_flush() (which empties _queue into several batches and
increments/decrements _pending_count via _post_batch) is still sending multiple
batches; update close() to compute and wait for the actual remaining work by
checking _queue and _pending_count (use _queue.empty() and self._pending_count
== 0) or calculate needed wait as ceil((len(queue) + self._pending_count) /
self._batch_size) * self._retry_interval, then loop/wait until both _queue is
empty and _pending_count == 0 (with an overall timeout) before letting the
daemon thread stop; ensure the daemon/_post_batch logic still decrements
_pending_count and that close() joins the daemon_thread only after this
condition is satisfied.

In `@src/synthorg/observability/sinks.py`:
- Around line 79-84: When a compression failure occurs, the current flow
(doRollover -> _shift_gz_backups() -> dfn = _rotate_current_log() ->
_compress_file(dfn)) can remove the plain .1 on the next rollover and shorten
retention; fix by deferring destructive moves/deletions until compression
succeeds: change the rollover logic so _rotate_current_log() creates or renames
the current log to a temporary/working name (e.g., produce dfn like "log.1.tmp"
or mark it un-deleted) and only call _shift_gz_backups() / remove the previous
plain ".1" after _compress_file(dfn) returns successfully; alternatively, make
_compress_file() atomically produce the final compressed name and only then
remove the plain file. Update doRollover(), _rotate_current_log(), and
_compress_file() accordingly so deletions of the plain ".1" happen only on
successful compression.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 96251f7e-1d84-4eb7-84b1-046f77ebc5d5

📥 Commits

Reviewing files that changed from the base of the PR and between 4c0fb6f and b5c26bc.

📒 Files selected for processing (24)
  • CLAUDE.md
  • docs/design/operations.md
  • docs/guides/centralized-logging.md
  • mkdocs.yml
  • src/synthorg/observability/__init__.py
  • src/synthorg/observability/_shipping_sink_parsers.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/enums.py
  • src/synthorg/observability/events/shipping.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/sinks.py
  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/settings/definitions/observability.py
  • tests/unit/observability/conftest.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_config.py
  • tests/unit/observability/test_enums.py
  • tests/unit/observability/test_events.py
  • tests/unit/observability/test_http_handler.py
  • tests/unit/observability/test_sink_config_builder.py
  • tests/unit/observability/test_sink_config_overrides.py
  • tests/unit/observability/test_sinks.py
  • tests/unit/observability/test_syslog_handler.py

Comment on lines +137 to +160
http_headers: tuple[tuple[str, str], ...] = Field(
default=(),
description="Extra HTTP headers as (name, value) pairs",
)
http_batch_size: int = Field(
default=100,
gt=0,
description="Records per HTTP POST batch",
)
http_flush_interval_seconds: float = Field(
default=5.0,
gt=0,
description="Seconds between automatic flushes",
)
http_timeout_seconds: float = Field(
default=10.0,
gt=0,
description="HTTP request timeout in seconds",
)
http_max_retries: int = Field(
default=3,
ge=0,
description="Retry count on HTTP failure",
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate http_headers in SinkConfig too.

Only the custom-sink parser checks header names today. A caller can still construct SinkConfig(sink_type=HTTP, http_headers=(("", "x"),)) directly, and the sender thread will discover the bad header only after records have already been dequeued.

🔧 Minimal fix
     def _validate_http_fields(self) -> None:
         if self.http_url is None:
             msg = "http_url is required for HTTP sinks"
             raise ValueError(msg)
@@
         parsed = urlparse(self.http_url)
         if not parsed.hostname:
             msg = "http_url must include a host"
             raise ValueError(msg)
+        for index, (name, _value) in enumerate(self.http_headers):
+            if not name.strip():
+                msg = (
+                    f"http_headers[{index}] must have a non-empty "
+                    "header name"
+                )
+                raise ValueError(msg)

As per coding guidelines, "Validate at system boundaries (user input, external APIs, config files)."

Also applies to: 235-252

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/config.py` around lines 137 - 160, SinkConfig
currently exposes http_headers publicly but lacks model-level validation, so
callers can construct invalid headers (e.g., empty names) bypassing the
custom-sink parser check; add a Pydantic validator on the http_headers field in
the SinkConfig model (e.g., a `@validator`("http_headers") method) that verifies
http_headers is an iterable of (str, str) pairs and that each header name is a
non-empty string (and optionally reject illegal chars like ":"), and raise a
ValueError with a clear message on invalid input; apply the same validator logic
to the other similar http_headers field instance elsewhere in the file (the
second occurrence) so both SinkConfig usages enforce header validation at the
config boundary.

Comment on lines +153 to +175
def _send_with_retries(
self,
request: urllib.request.Request,
) -> Exception | None:
"""Attempt to send *request*, returning the last error or None."""
last_error: Exception | None = None
for attempt in range(1 + self._max_retries):
try:
with urllib.request.urlopen( # noqa: S310
request,
timeout=self._timeout,
):
pass # Response body not needed
except Exception as exc:
# HTTPError wraps a response FP -- close to avoid FD leak.
if isinstance(exc, urllib.error.HTTPError):
exc.close()
last_error = exc
if attempt < self._max_retries:
continue
else:
return None
return last_error
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n src/synthorg/observability/http_handler.py

Repository: Aureliolo/synthorg

Length of output: 9669


🏁 Script executed:

rg -A 5 -B 5 "SinkConfig" src/synthorg/observability/config.py | head -100

Repository: Aureliolo/synthorg

Length of output: 2717


🏁 Script executed:

rg "http_" src/synthorg/observability/config.py | head -20

Repository: Aureliolo/synthorg

Length of output: 1019


🏁 Script executed:

rg -A 10 "http_max_retries:" src/synthorg/observability/config.py

Repository: Aureliolo/synthorg

Length of output: 733


🏁 Script executed:

rg -i "receiver\|idempotent\|dedup\|duplicate" src/synthorg/observability/ --type py

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

find . -type f \( -name "*.md" -o -name "*.rst" \) | xargs grep -l "http\|receiver\|log" 2>/dev/null | head -10

Repository: Aureliolo/synthorg

Length of output: 438


🏁 Script executed:

rg -i "http.*sink\|log.*receiver\|ingest" docs/ README.md 2>/dev/null | head -20

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

find . -name "*.md" -type f | xargs rg "HTTP\|http_url" 2>/dev/null | head -20

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

rg -i "http\|receiver" src/synthorg/observability/ --type py -B 2 -A 2 | grep -A 10 -B 10 "receiver\|idempotent\|duplicate" || echo "No matches found"

Repository: Aureliolo/synthorg

Length of output: 78


🏁 Script executed:

rg -B 5 -A 5 "_send_with_retries" src/synthorg/observability/http_handler.py

Repository: Aureliolo/synthorg

Length of output: 862


🏁 Script executed:

python3 << 'EOF'
import urllib.error

# Check what error types urllib provides
print("HTTPError exception hierarchy:")
print(f"HTTPError: {urllib.error.HTTPError.__bases__}")

# Simulate what happens with 4xx vs 5xx
try:
    # HTTPError is raised for any HTTP error status (4xx or 5xx)
    # but the code doesn't distinguish between them
    print("\nThe code catches 'Exception' and retries ALL errors:")
    print("- HTTPError (both 4xx like 400/404 and 5xx like 500/502)")
    print("- Timeout (URLError with timeout)")
    print("- Connection errors (URLError)")
    print("\nNo status code checking means 4xx errors are retried unnecessarily")
except Exception as e:
    print(f"Error: {e}")
EOF

Repository: Aureliolo/synthorg

Length of output: 419


🏁 Script executed:

rg "urllib.error.HTTPError\|exc.code\|status" src/synthorg/observability/ --type py

Repository: Aureliolo/synthorg

Length of output: 44


🏁 Script executed:

rg "BaseModel\|ConfigDict" src/synthorg/observability/config.py | head -5

Repository: Aureliolo/synthorg

Length of output: 44


Remove blanket retries or add idempotency mechanisms for HTTP log batches.

The _send_with_retries method retries on all exceptions without distinguishing between permanent client errors and transient failures. Specifically:

  1. 4xx responses retried unnecessarily: urllib.error.HTTPError is caught for any status code (400, 404, etc.), but these represent permanent client-side failures and should not be retried.

  2. Duplicate batch risk: POST bodies contain no batch ID or idempotency key. If the server accepts the POST but the connection drops before the response reaches the client, the retry sends an identical batch. Without receiver-side deduplication, this causes duplicate logs downstream.

Either filter exceptions to retry only transient failures (5xx, timeouts, connection errors), add idempotency metadata (batch ID) to the request body or headers, or disable automatic retries for arbitrary POST endpoints and document the receiver requirements.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 153 - 175, The
_send_with_retries function indiscriminately retries on all exceptions
(including urllib.error.HTTPError for 4xx client errors) and sends identical
POST bodies, risking unnecessary retries and duplicate batches; update
_send_with_retries to only retry on transient failures (e.g., network errors,
timeouts, and HTTP 5xx) by inspecting exceptions and HTTPError.code, and/or add
an idempotency mechanism by including a unique batch_id/idempotency-key in the
POST headers or body (generate per-batch before calling _send_with_retries) so
retries are safe if the server supports deduplication; ensure you reference and
modify the _send_with_retries logic and the code that builds the POST
payload/headers to include the idempotency key.

- Add RotationConfig model validator rejecting compress_rotated+external
  (fail-fast at parse time; remove redundant builder check)
- Validate http_headers names are non-empty at SinkConfig model level
- Skip retries on 4xx HTTP client errors (non-retryable)
- Use trimmed syslog_host value in SysLogHandler construction
- Add Docker console JSON override guidance to centralized-logging guide
- Add HTTP sink adapter caveat and compress_rotated builtin-only note
  to operations.md design spec
- Update existing tests to verify model-level rejection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/synthorg/observability/config.py`:
- Around line 362-374: The validator _validate_no_duplicate_syslog_endpoints is
comparing raw syslog_host strings so entries like " localhost " and "LOCALHOST"
slip through; canonicalize hosts before deduping by stripping whitespace (and
normalize case if you want consistent matching) when building the endpoints
list: replace (s.syslog_host, s.syslog_port) with the trimmed/normalized host
e.g. (s.syslog_host.strip(), s.syslog_port) (or .strip().lower()) so duplicates
are detected consistently with build_syslog_handler's behavior; keep the rest of
the logic (Counter, dupes, error) unchanged.

In `@src/synthorg/observability/http_handler.py`:
- Around line 117-125: The formatter exceptions in _post_batch are silently
dropping records because when self.format(record) raises we only call
handleError(record) and don't increment the dropped counter; update _post_batch
so that inside the except block you increment self._dropped_count (and/or use
whatever atomic counter the class exposes), call handleError(record) as before,
and continue (do not append the record). Keep the rest of _post_batch unchanged
so successfully formatted entries are still appended via
entries.append(self.format(record)).
- Around line 184-192: The close() timeout only accounts for one batch;
calculate how many outstanding batches remain and scale the join timeout
accordingly: compute pending_batches = ceil(self._queue.qsize() /
self._batch_size) (ensure you use the same queue and batch size used by
_drain_and_flush), then set join_timeout = pending_batches * (1 +
self._max_retries) * self._timeout (or use a capped maximum if desired) before
calling self._flusher.join(...); keep the existing is_alive() check and call to
_drain_and_flush() unchanged so the flusher is guaranteed time to finish all
batches.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8ea74479-7e81-4f70-a2ac-8fa8c5853a43

📥 Commits

Reviewing files that changed from the base of the PR and between b5c26bc and d3ab350.

📒 Files selected for processing (8)
  • docs/design/operations.md
  • docs/guides/centralized-logging.md
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
  • src/synthorg/observability/syslog_handler.py
  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_sinks.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Build Sandbox
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Dependency Review
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*

📄 CodeRabbit inference engine (CLAUDE.md)

**/*: No trailing whitespace. Enforced by trailing-whitespace pre-commit hook.
Files must end with a newline. Enforced by end-of-file-fixer pre-commit hook.

Files:

  • docs/guides/centralized-logging.md
  • tests/unit/observability/test_compression.py
  • src/synthorg/observability/syslog_handler.py
  • docs/design/operations.md
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Do not use from __future__ import annotations. Python 3.14 has native PEP 649 lazy annotations.
Use PEP 758 except syntax: use except A, B: (no parentheses). Ruff enforces this on Python 3.14.
All public functions must have type hints. Mypy strict mode is enforced.
Docstrings must use Google style and are required on public classes and functions. Enforced by ruff D rules.
Create new objects rather than mutating existing ones. For non-Pydantic internal collections (registries, BaseTool), use copy.deepcopy() at construction plus MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, serializing for persistence).
Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 (BaseModel, model_validator, computed_field, ConfigDict). Use allow_inf_nan=False in all ConfigDict declarations to reject NaN/Inf in numeric fields at validation time.
Use @computed_field for derived values instead of storing and validating redundant fields (e.g. TokenUsage.total_tokens).
Use NotBlankStr (from core.types) for all identifier/name fields, including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants, instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (e.g. multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task. Existing code is being migrated incrementally.
Maximum line length is 88 characters. Enforced by ruff.
Functions must be less than 50 lines, files must be less than 800 lines.
Handle errors explicitl...

Files:

  • tests/unit/observability/test_compression.py
  • src/synthorg/observability/syslog_handler.py
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest markers @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow for test categorization.
Maintain 80% code coverage minimum. Enforced in CI.
Use asyncio_mode = "auto" for async tests. No manual @pytest.mark.asyncio needed.
Global timeout of 30 seconds per test. Do not add per-file pytest.mark.timeout(30) markers; non-default overrides like timeout(60) are allowed.
Prefer @pytest.mark.parametrize for testing similar cases.
Test vendor-agnostic naming: use test-provider, test-small-001, etc. instead of real vendor names.
Use Hypothesis for property-based testing with @given and @settings. Hypothesis profiles: ci (50 examples, default) and dev (1000 examples), controlled via HYPOTHESIS_PROFILE env var.
Never skip, dismiss, or ignore flaky tests. Always fix them fully and fundamentally. For timing-sensitive tests, mock time.monotonic() and asyncio.sleep() instead of widening timing margins. For tasks blocking indefinitely until cancelled, use asyncio.Event().wait() instead of asyncio.sleep(large_number).

Files:

  • tests/unit/observability/test_compression.py
  • tests/unit/observability/test_sinks.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/synthorg/**/*.py: Every module with business logic must have from synthorg.observability import get_logger then logger = get_logger(__name__). Use logger as the variable name (not _logger, not log).
Never use import logging / logging.getLogger() / print() in application code. Exception: observability/setup.py and observability/sinks.py may use stdlib logging and print(..., file=sys.stderr) for bootstrap and handler-cleanup code.
Always use event name constants from domain-specific modules under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool, GIT_COMMAND_START from events.git). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.
Always use structured logging with logger.info(EVENT, key=value). Never use logger.info("msg %s", val).
All error paths must log at WARNING or ERROR with context before raising.
All state transitions must log at INFO level.
DEBUG logging for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.
All provider calls go through BaseCompletionProvider which applies retry and rate limiting automatically. Never implement retry logic in driver subclasses or calling code.
RetryConfig and RateLimiterConfig are set per-provider in ProviderConfig. Retryable errors (is_retryable=True): RateLimitError, ProviderTimeoutError, ProviderConnectionError, ProviderInternalError. Non-retryable errors raise immediately without retry.
RetryExhaustedError signals that all retries failed. The engine layer catches this to trigger fallback chains. The rate limiter respects RateLimitError.retry_after from providers and automatically pauses future requests.
Never use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, `example-large-...

Files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
  • src/synthorg/observability/sink_config_builder.py
🧠 Learnings (45)
📚 Learning: 2026-03-17T06:30:14.180Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:30:14.180Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability includes structured logging via `get_logger(__name__)`, correlation tracking, and log sinks.

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • docs/design/operations.md
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability package (observability/): structured logging, correlation tracking, log sinks; event constants organized by domain under observability/events/ (e.g., events.api, events.tool, events.git, events.context_budget, events.backup)

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • docs/design/operations.md
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/config.py
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/observability/**/*.py : Observability must use structured logging with correlation tracking and log sinks

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • docs/design/operations.md
  • tests/unit/observability/test_sinks.py
  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import `from synthorg.observability import get_logger` and define `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have `from synthorg.observability import get_logger` followed by `logger = get_logger(__name__)`.

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-16T06:24:56.341Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T06:24:56.341Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code. Exception: `observability/setup.py` and `observability/sinks.py` may use stdlib `logging` and `print(..., file=sys.stderr)` for bootstrap and handler-cleanup code.

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic must import logger via `from synthorg.observability import get_logger` and initialize with `logger = get_logger(__name__)`

Applied to files:

  • src/synthorg/observability/syslog_handler.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Handle errors explicitly, never silently swallow. Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-16T20:14:00.937Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T20:14:00.937Z
Learning: Applies to **/*.py : Validate: at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-19T07:13:44.964Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:13:44.964Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Validate at system boundaries (user input, external APIs, config files).

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Validate at system boundaries (user input, external APIs, config files)

Applied to files:

  • src/synthorg/observability/syslog_handler.py
  • src/synthorg/observability/config.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from `synthorg.observability.events.<domain>` modules (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`); import directly and use in structured logging

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-14T16:18:57.267Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T16:18:57.267Z
Learning: Applies to src/ai_company/!(observability)/**/*.py : Use event name constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-14T15:43:05.601Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-14T15:43:05.601Z
Learning: Applies to src/**/*.py : Use event name constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.) — import directly

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from domain-specific modules under synthorg.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget, etc.). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `API_REQUEST_STARTED` from `events.api`, `TOOL_INVOKE_START` from `events.tool`, `GIT_COMMAND_START` from `events.git`). Import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`.

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-18T21:23:23.586Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-18T21:23:23.586Z
Learning: Applies to src/synthorg/**/*.py : Event names: always use constants from the domain-specific module under synthorg.observability.events (e.g., API_REQUEST_STARTED from events.api, TOOL_INVOKE_START from events.tool). Import directly from synthorg.observability.events.<domain>.

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use event name constants from synthorg.observability.events domain-specific modules (e.g., PROVIDER_CALL_START from events.provider). Import directly: from synthorg.observability.events.<domain> import EVENT_CONSTANT.

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Always use event name constants from domain-specific modules under `synthorg.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`); import directly: `from synthorg.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

  • docs/design/operations.md
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code; exceptions: `observability/setup.py` and `observability/sinks.py` may use stdlib logging and `print(..., file=sys.stderr)` for bootstrap code

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T11:33:01.580Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T11:33:01.580Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code; use the injected logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-19T07:12:14.508Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-19T07:12:14.508Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging` / `logging.getLogger()` / `print()` in application code

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:28:13.207Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:28:13.207Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger` (not `_logger`, not `log`).

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use `import logging` / `logging.getLogger()` / `print()` in application code. Variable name: always `logger`.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Never use `import logging`, `logging.getLogger()`, or `print()` in application code — use the synthorg logger instead

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-17T06:43:14.114Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T06:43:14.114Z
Learning: Applies to src/synthorg/**/*.py : All error paths must log at WARNING or ERROR with context before raising. All state transitions must log at INFO. DEBUG for object creation, internal flow, entry/exit of key functions. Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Every module with business logic MUST have: `from synthorg.observability import get_logger` then `logger = get_logger(__name__)`. Never use import logging / logging.getLogger() / print() in application code.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-20T11:18:48.128Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T11:18:48.128Z
Learning: Applies to src/synthorg/**/*.py : Retryable errors (`is_retryable=True`): `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`. Non-retryable errors raise immediately without retry.

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to src/synthorg/{providers,engine}/**/*.py : Retryable errors are `RateLimitError`, `ProviderTimeoutError`, `ProviderConnectionError`, `ProviderInternalError`; non-retryable errors raise immediately; `RetryExhaustedError` signals all retries failed

Applied to files:

  • src/synthorg/observability/http_handler.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-20T21:44:04.528Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T21:44:04.528Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models (via `model_copy(update=...)`) for runtime state that evolves

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity. Use separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to **/*.py : Use frozen Pydantic models for config/identity; use separate mutable-via-copy models with `model_copy(update=...)` for runtime state that evolves

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to src/synthorg/**/*.py : Use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-20T08:28:32.845Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-20T08:28:32.845Z
Learning: Applies to **/*.py : Config vs runtime state: use frozen Pydantic models for config/identity; separate mutable-via-copy models (using `model_copy(update=...)`) for runtime state. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-17T22:08:13.456Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-17T22:08:13.456Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 conventions: `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. For derived values use `computed_field` instead of storing + validating redundant fields. Use `NotBlankStr` (from `core.types`) for all identifier/name fields — including optional (`NotBlankStr | None`) and tuple (`tuple[NotBlankStr, ...]`) variants — instead of manual whitespace validators.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T19:14:27.144Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T19:14:27.144Z
Learning: Applies to src/synthorg/**/*.py : Use Pydantic v2 BaseModel, model_validator, computed_field, ConfigDict.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-15T18:38:44.202Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T18:38:44.202Z
Learning: Applies to **/*.py : Config vs runtime state: frozen Pydantic models for config/identity; separate mutable-via-copy models (using model_copy(update=...)) for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Use Pydantic v2 (`BaseModel`, `model_validator`, `computed_field`, `ConfigDict`). Use `allow_inf_nan=False` in all `ConfigDict` declarations to reject `NaN`/`Inf` in numeric fields at validation time.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-31T14:17:24.182Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-31T14:17:24.182Z
Learning: Applies to {**/*.py,web/src/**/*.{ts,tsx}} : Keep functions under 50 lines and files under 800 lines

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-04-01T15:26:40.428Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T15:26:40.428Z
Learning: Applies to **/*.py : Functions must be less than 50 lines, files must be less than 800 lines.

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
📚 Learning: 2026-03-16T07:22:28.134Z
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-16T07:22:28.134Z
Learning: Applies to **/*.py : Functions must be less than 50 lines; files must be less than 800 lines

Applied to files:

  • src/synthorg/observability/sink_config_builder.py
🪛 LanguageTool
docs/design/operations.md

[grammar] ~1351-~1351: Ensure spelling is correct
Context: ...ess_rotated` is only supported with the builtin rotation strategy; it is rejected when ...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

Comment on lines +184 to +192
# Allow enough time for in-flight retries to finish:
# worst case is (1 + max_retries) * timeout per batch.
join_timeout = (1 + self._max_retries) * self._timeout
self._flusher.join(timeout=join_timeout)
# Only drain from the calling thread if the flusher has exited.
# If join() timed out the flusher may still be in _drain_and_flush,
# and draining concurrently would race on the queue.
if not self._flusher.is_alive():
self._drain_and_flush()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

close() only waits long enough for one batch.

_drain_and_flush() can already be retrying multiple batches, but join_timeout is fixed to a single-batch worst case. With a backlog larger than batch_size, close() can return while the daemon flusher is still alive, so the advertised “flush remaining records, stop thread” behavior is not guaranteed.

🔧 Safer shutdown pattern
-        join_timeout = (1 + self._max_retries) * self._timeout
-        self._flusher.join(timeout=join_timeout)
+        while self._flusher.is_alive():
+            self._flusher.join(
+                timeout=(1 + self._max_retries) * self._timeout,
+            )

If you need a hard upper bound, derive the deadline from the number of outstanding batches instead of a single-batch timeout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/synthorg/observability/http_handler.py` around lines 184 - 192, The
close() timeout only accounts for one batch; calculate how many outstanding
batches remain and scale the join timeout accordingly: compute pending_batches =
ceil(self._queue.qsize() / self._batch_size) (ensure you use the same queue and
batch size used by _drain_and_flush), then set join_timeout = pending_batches *
(1 + self._max_retries) * self._timeout (or use a capped maximum if desired)
before calling self._flusher.join(...); keep the existing is_alive() check and
call to _drain_and_flush() unchanged so the flusher is guaranteed time to finish
all batches.

- Strip syslog_host whitespace in duplicate endpoint detection to match
  build_syslog_handler trimming behavior
- Increment _dropped_count when self.format(record) raises in _post_batch
  so drop counter accurately reflects all lost records

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Aureliolo Aureliolo merged commit 84be9f8 into main Apr 1, 2026
26 of 28 checks passed
@Aureliolo Aureliolo deleted the feat/log-aggregation branch April 1, 2026 17:53
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview April 1, 2026 17:53 — with GitHub Actions Inactive
Aureliolo added a commit that referenced this pull request Apr 2, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.5.6](v0.5.5...v0.5.6)
(2026-04-02)


### Features

* calendar + hybrid ceremony scheduling strategies
([#985](#985))
([59a9b84](59a9b84)),
closes [#969](#969)
[#970](#970)
* landing page interactive components
([#984](#984))
([49868cb](49868cb))
* log aggregation and shipping (syslog, HTTP, compression)
([#964](#964))
([84be9f8](84be9f8))
* restructure builtin templates into inheritance tree
([#982](#982))
([3794c12](3794c12))
* sprint ceremony runtime scheduler with pluggable strategies
([#983](#983))
([43564a9](43564a9))


### Maintenance

* add no-bash-file-writes rule to CLAUDE.md
([#968](#968))
([a854dcc](a854dcc))
* bump web dependencies (lodash, eslint-react v4, storybook, playwright,
esbuild, codemirror)
([#987](#987))
([c344dfb](c344dfb))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: log aggregation and shipping (syslog, cloud logging)

2 participants