[Sprint 2.5] Non-blocking Cleanup#3
Conversation
S2.5-1 Ingest Hardening: - Add --data-dir / AIMX_DATA_DIR CLI option to override /var/lib/aimx/ - Add mailbox name validation (reject .., /, \, empty, null bytes) - Replace hand-rolled yaml_escape with serde_yaml struct serialization - Add full-pipeline integration tests for all fixture .eml files S2.5-2 Send Hardening: - Escape attachment filenames in MIME headers (quotes, newlines, CR) - Refactor duplicated header construction into write_common_headers() - Add integration tests for dkim-keygen CLI end-to-end - Add test verifying dkim_selector config value is used at runtime
uzyn
left a comment
There was a problem hiding this comment.
Sprint 2.5 Review: Non-blocking Cleanup
Sprint Goal Assessment
Goal: Address accumulated non-blocking improvements from Sprint 1 and Sprint 2 reviews.
Verdict: Goal fully met. All 9 improvement items from the review backlog have been addressed with clean implementations and thorough test coverage.
Acceptance Criteria Checklist
S2.5-1: Ingest Hardening + Testing
| Criterion | Status |
|---|---|
--data-dir / AIMX_DATA_DIR CLI option to override default data path |
Met -- cli.rs adds a global --data-dir arg with env = "AIMX_DATA_DIR". Threaded through main.rs to all subcommands via cli.data_dir.as_deref(). Integration test ingest_via_env_var confirms env var path. |
Mailbox name validation to reject .., /, \, empty, null bytes |
Met -- validate_mailbox_name() in mailbox.rs checks all four cases. Unit tests cover empty, ../etc, foo/bar, foo\bar. |
Replace yaml_escape with serde_yaml struct serialization |
Met -- EmailMetadata and AttachmentMeta now derive Serialize. format_markdown() uses serde_yaml::to_string(). The old yaml_escape function and its 59 lines of manual frontmatter building are removed. Tests updated to parse YAML structurally rather than matching string fragments. |
\r in yaml_escape |
Met (superseded) -- yaml_escape removed entirely; serde_yaml handles all escaping. |
Full-pipeline integration tests with fixture .eml files |
Met -- 4 fixture-based integration tests (plain, html, multipart, attachment) now exercise the actual CLI binary via --data-dir, then verify frontmatter fields structurally and body content. Additional routing tests (ingest_routes_to_named_mailbox, ingest_unknown_routes_to_catchall) also added. |
S2.5-2: Send Hardening + Testing
| Criterion | Status |
|---|---|
| Escape attachment filenames in MIME headers | Met -- escape_filename() handles \, ", \r, \n. Applied to both Content-Type and Content-Disposition name/filename parameters. Three unit tests cover quotes, newlines, and CR. |
Integration test for aimx dkim-keygen CLI |
Met -- Three integration tests: dkim_keygen_end_to_end (generates keys, checks DNS output), dkim_keygen_no_overwrite (existing keys block), dkim_keygen_force_overwrite (--force regenerates). |
Refactor duplicated header construction in compose_message() |
Met -- write_common_headers() extracted. From/To/Subject/Date/Message-ID/In-Reply-To/References/MIME-Version are now in one place. Both attachment and non-attachment paths call it. Net reduction of ~50 lines. |
Test verifying dkim_selector config value is used at runtime |
Met -- dkim_selector_config_used test sends with "myselector" and asserts s=myselector appears in the DKIM-Signature header. |
Test Coverage
- 64 unit tests passing (8 new in this sprint)
- 12 integration tests passing (7 new in this sprint)
cargo clippy -- -D warnings: cleancargo fmt -- --check: clean
Test coverage is comprehensive. The integration tests are a significant improvement -- they now exercise the real binary end-to-end rather than just testing mail-parser parseability.
Potential Bugs
None found. The serde_yaml migration produces clean, structurally valid YAML. The read: false field is correctly serialized as a YAML boolean. The frontmatter format (---\n<yaml>\n---\n\n<body>) is correct -- serde_yaml::to_string() does not emit a leading ---, so there is no double-delimiter issue.
Security Issues
None found. The mailbox name validation covers the important path traversal vectors (.., /, \, null bytes). The attachment filename escaping handles quote injection and header-breaking characters. Both were the primary security-relevant items in this sprint.
Alignment with PRD
Fully aligned. This sprint was a cleanup sprint with no new features -- it hardens the ingest and send pipelines that were built in Sprints 1 and 2. The --data-dir option enables testability and non-standard deployments, which supports the PRD's goal of a flexible single-binary tool.
Non-blocker Observations
-
serde_yaml0.9 is unmaintained -- Theserde_yamlcrate (v0.9) has been officially deprecated in favor of alternatives. This is not a problem now, but worth noting for a future dependency audit. No action needed in this sprint. -
unwrap_or_default()onserde_yaml::to_string-- Informat_markdown(),serde_yaml::to_string(meta).unwrap_or_default()silently produces empty frontmatter on serialization failure. In practice,EmailMetadataserialization should never fail (no maps with non-string keys, no custom serializers), but a future field change could make this fail silently. Considerexpect()or error propagation in a future sprint.
Summary
Clean implementation of all 9 cleanup items. The serde_yaml migration is the most impactful change -- it eliminates an entire class of YAML escaping bugs and reduces code by ~60 lines. The integration tests are now genuinely end-to-end (binary subprocess with real fixture files), which is a major improvement over the previous tests that only checked mail-parser parseability. No bugs, no security issues, no regressions.
Verdict: APPROVED -- Ready to merge.
Recommended merge commit message:
Harden ingest and send pipelines with data-dir override, input validation, and end-to-end tests
- Add --data-dir / AIMX_DATA_DIR CLI option to override default /var/lib/aimx path
- Add mailbox name validation (reject .., /, \, empty, null bytes)
- Replace hand-rolled yaml_escape with serde_yaml struct serialization for frontmatter
- Escape attachment filenames in MIME headers (quotes, newlines, CR)
- Extract write_common_headers() to deduplicate header construction in compose_message()
- Add dkim_selector config usage test
- Add 7 integration tests exercising full CLI pipeline with fixture .eml files
- Add 3 dkim-keygen CLI end-to-end tests (generate, no-overwrite, force)
…robustness - Reorder validate-before-sanitize in write_common_headers so compose_message() returns an error when From/To/Subject contain CRLF, with sanitization as a safety net (Bug #1) - Sanitize In-Reply-To and References headers to prevent CRLF injection from malicious inbound Message-ID values (Bug #2) - Add upper-bound guard (999,999) on counter in create_file_atomic to prevent theoretical u32 overflow (Suggestion #4) - Replace duplicate setup_verify_step_available test with a compile-time signature check that is distinct from verify_runner_trait_mock_pass (Suggestion #3) - Update CRLF injection tests to verify error return instead of silent sanitization, add new tests for reply_to/references injection
…lans Code-review-backed fix plans for each of the 10 findings, with file:line refs, effort estimates, and a priority order. No code changes yet — this consolidates the investigation so fixes can be sequenced. - #10 DKIM mismatch: not a code bug; DNS republish + optional startup check - #9 shell injection: pass trigger vars via env, not string substitution - #8 MCP writes: route state mutations through daemon UDS - #7 claude-code hint: print `claude mcp add` command post-install - #4 send config read: move mailbox resolution to daemon side - #2 SPF: plumb envelope MAIL FROM from smtp session through ingest - #5 wildcard send: remove wildcard branch from resolve_from_mailbox - #1 mailbox create: add restart hint (or route via daemon) - #3 plan wording: clarify "compose new" in docs/manual-test.md
* docs: add manual test results with findings Full execution of docs/manual-test.md against agent.zeroshot.lol. 10 findings recorded with severity and fix direction, notably: - P0 DKIM key on disk does not match DNS TXT (root cause of outbound dkim=fail at Gmail) - P0 Shell injection in on_receive cmd template expansion - P1 MCP write ops (email_mark_read, etc.) fail when MCP runs as non-root due to root:root 0644 mailbox files * docs: add Recommended fixes section with per-finding implementation plans Code-review-backed fix plans for each of the 10 findings, with file:line refs, effort estimates, and a priority order. No code changes yet — this consolidates the investigation so fixes can be sequenced. - #10 DKIM mismatch: not a code bug; DNS republish + optional startup check - #9 shell injection: pass trigger vars via env, not string substitution - #8 MCP writes: route state mutations through daemon UDS - #7 claude-code hint: print `claude mcp add` command post-install - #4 send config read: move mailbox resolution to daemon side - #2 SPF: plumb envelope MAIL FROM from smtp session through ingest - #5 wildcard send: remove wildcard branch from resolve_from_mailbox - #1 mailbox create: add restart hint (or route via daemon) - #3 plan wording: clarify "compose new" in docs/manual-test.md
Sprint 44 (post-launch security + quick fixes) addresses findings #9, #10, #7, #1-tier-1, #3. Sprint 45 (strict outbound + MCP writes via daemon addresses #4, #5, #8 and introduces UDS MARK-READ/MARK-UNREAD verbs. Sprint 46 (mailbox CRUD via UDS) addresses #1-tier-2 with MAILBOX-CREATE/MAILBOX-DELETE verbs so daemon picks up changes live. PRD FR-18d tightened: outbound send now requires a concrete non-wildcard mailbox under config.domain; catchall is inbound-only. PRD FR-18e added to cover the new state-mutation verbs on the UDS socket. Finding #2 (SPF envelope MAIL FROM) excluded — already shipped in cd22428. EOF )
Harden ingest and send pipelines with data-dir override, input validation, and end-to-end tests - Add --data-dir / AIMX_DATA_DIR CLI option to override default /var/lib/aimx path - Add mailbox name validation (reject .., /, \, empty, null bytes) - Replace hand-rolled yaml_escape with serde_yaml struct serialization for frontmatter - Escape attachment filenames in MIME headers (quotes, newlines, CR) - Extract write_common_headers() to deduplicate header construction in compose_message() - Add dkim_selector config usage test - Add 7 integration tests exercising full CLI pipeline with fixture .eml files - Add 3 dkim-keygen CLI end-to-end tests (generate, no-overwrite, force)
…lans Code-review-backed fix plans for each of the 10 findings, with file:line refs, effort estimates, and a priority order. No code changes yet — this consolidates the investigation so fixes can be sequenced. - #10 DKIM mismatch: not a code bug; DNS republish + optional startup check - #9 shell injection: pass trigger vars via env, not string substitution - #8 MCP writes: route state mutations through daemon UDS - #7 claude-code hint: print `claude mcp add` command post-install - #4 send config read: move mailbox resolution to daemon side - #2 SPF: plumb envelope MAIL FROM from smtp session through ingest - #5 wildcard send: remove wildcard branch from resolve_from_mailbox - #1 mailbox create: add restart hint (or route via daemon) - #3 plan wording: clarify "compose new" in docs/manual-test.md
* docs: add manual test results with findings Full execution of docs/manual-test.md against agent.zeroshot.lol. 10 findings recorded with severity and fix direction, notably: - P0 DKIM key on disk does not match DNS TXT (root cause of outbound dkim=fail at Gmail) - P0 Shell injection in on_receive cmd template expansion - P1 MCP write ops (email_mark_read, etc.) fail when MCP runs as non-root due to root:root 0644 mailbox files * docs: add Recommended fixes section with per-finding implementation plans Code-review-backed fix plans for each of the 10 findings, with file:line refs, effort estimates, and a priority order. No code changes yet — this consolidates the investigation so fixes can be sequenced. - #10 DKIM mismatch: not a code bug; DNS republish + optional startup check - #9 shell injection: pass trigger vars via env, not string substitution - #8 MCP writes: route state mutations through daemon UDS - #7 claude-code hint: print `claude mcp add` command post-install - #4 send config read: move mailbox resolution to daemon side - #2 SPF: plumb envelope MAIL FROM from smtp session through ingest - #5 wildcard send: remove wildcard branch from resolve_from_mailbox - #1 mailbox create: add restart hint (or route via daemon) - #3 plan wording: clarify "compose new" in docs/manual-test.md
Sprint 44 (post-launch security + quick fixes) addresses findings #9, #10, #7, #1-tier-1, #3. Sprint 45 (strict outbound + MCP writes via daemon addresses #4, #5, #8 and introduces UDS MARK-READ/MARK-UNREAD verbs. Sprint 46 (mailbox CRUD via UDS) addresses #1-tier-2 with MAILBOX-CREATE/MAILBOX-DELETE verbs so daemon picks up changes live. PRD FR-18d tightened: outbound send now requires a concrete non-wildcard mailbox under config.domain; catchall is inbound-only. PRD FR-18e added to cover the new state-mutation verbs on the UDS socket. Finding #2 (SPF envelope MAIL FROM) excluded — already shipped in f5cebd2. EOF )
Summary
--data-dir/AIMX_DATA_DIRCLI option to override default data path. Added mailbox name validation to reject..,/,\, empty, and null byte names. Replaced hand-rolledyaml_escapewithserde_yamlstruct serialization for frontmatter generation, eliminating edge cases. Added full-pipeline integration tests exercisingingest_email()with all fixture.emlfiles via CLI subprocess.escape_filename()to sanitize attachment filenames in MIME headers (quotes, newlines, carriage returns). Refactored duplicated header construction intowrite_common_headers(). Added end-to-end integration tests foraimx dkim-keygenCLI. Added test verifying customdkim_selectorconfig value is used in DKIM signing.Changes
Cargo.tomlenvfeature forAIMX_DATA_DIRenv var supportsrc/cli.rs--data-dirglobal option withAIMX_DATA_DIRenvsrc/main.rsdata_diroption through to all subcommand handlerssrc/ingest.rsyaml_escape+ manual frontmatter withserde_yamlstruct serialization; acceptdata_diroverridesrc/mailbox.rsvalidate_mailbox_name()with path traversal checks; acceptdata_diroverridesrc/send.rsescape_filename(), extractwrite_common_headers(), acceptdata_diroverridetests/integration.rsTest results
cargo clippy -- -D warningscleancargo fmt -- --checkcleanTest plan
--data-dirflag overrides default data directoryAIMX_DATA_DIRenv var overrides default data directory..,/,\, empty, null bytesdkim-keygenCLI generates keys and outputs DNS recorddkim_selectorvalue appears in DKIM signature (s=parameter)