Make IPv6 outbound opt-in via enable_ipv6 config flag#50
Conversation
aimx send now defaults to IPv4-only outbound delivery. Set enable_ipv6 = true in config.toml to opt into dual-stack (OS chooses). This matches the SPF record aimx setup generates by default (ip4: only) and avoids SPF failures at Gmail when the server has a global IPv6 but DNS isn't set up for it. - Config: new enable_ipv6: bool field, defaults to false - send.rs: LettreTransport carries the flag; re-introduces resolve_ipv4() helper and gates the connect target in try_deliver() - mcp.rs: pass config.enable_ipv6 when constructing transport - book/configuration.md: new "IPv6 delivery (advanced)" section documenting the required AAAA + ip6: SPF additions - book/setup.md: note under DNS table that AAAA / ip6: are only needed when enable_ipv6 is on - aimx setup is unchanged — this is a hidden config flag Verified live: default config connects over IPv4 (AF_INET); adding enable_ipv6 = true makes it connect over IPv6 (AF_INET6).
aimx setup now reads `enable_ipv6` from the existing config (if any) and only calls `get_server_ipv6()` when the flag is true. With the flag unset or false, AAAA is not advertised, `ip6:` is not added to SPF, and neither is verified — matching the IPv4-only default of `aimx send`. Any existing AAAA record in DNS is left alone. Re-entrant workflow: - Fresh install -> no config yet -> IPv4-only setup. - User edits config.toml and adds `enable_ipv6 = true`. - User re-runs `sudo aimx setup <domain>` -> setup detects the flag, shows AAAA + ip6: SPF in the DNS table, and verifies them. Verified live on a dual-stack VPS: default config produces an IPv4-only DNS table; adding `enable_ipv6 = true` produces AAAA + ip6: SPF in the same output. - src/setup.rs: read enable_ipv6 when loading existing config, gate get_server_ipv6() call in run_setup - book/configuration.md: updated IPv6-delivery section to describe the re-run-setup workflow and the ignore-when-disabled behavior - book/setup.md: note that AAAA / ip6: are only shown when the flag is on - docs/prd.md: FR-7 updated to reflect the new gate
uzyn
left a comment
There was a problem hiding this comment.
Review mode
This PR is a follow-up behavioral/policy change, not a sprint implementation. No [Sprint N] prefix, and the only sprint that would be in scope (Sprint 26) is already merged and marked DONE. The PRD is modified in this PR (FR-7, FR-19, decision #8) to document the new policy, so I'm reviewing in Standalone Mode against the PR's own stated intent plus consistency with the (now-updated) PRD.
Changes overview
- Adds a new
enable_ipv6: boolfield toConfig(defaultfalse, serde#[serde(default)], not exposed byaimx setup). LettreTransportbecomes stateful (LettreTransport::new(enable_ipv6)); when the flag is false,try_deliverresolves the MX host's A record viaToSocketAddrsand hands the raw IPv4 string tolettre::SmtpTransport::builder_dangerousas the connect target (falling back to the hostname if no A record is found). When the flag is true, the hostname is passed through and the OS chooses the family.aimx setupreadsenable_ipv6from an existingconfig.toml(if present) and skipsget_server_ipv6()entirely when the flag is false — AAAA andip6:SPF are not shown, advertised, or verified.- PRD FR-7, FR-19, and resolved-decision #8 rewritten to describe the opt-in default.
book/configuration.mdgets a new "IPv6 delivery (advanced)" section;book/setup.mdgets a small note.- Two unit tests added in
config.rscovering serde defaulting and parsing.
Scope alignment
Implementation matches the PR description: IPv4-only default, hidden flag, aimx setup gated behind the flag, PRD/book updated. No scope creep observed.
One small scope gap worth calling out: the PR body (and the book) describe aimx verify as "ignoring IPv6 entirely" when the flag is off. The code change gates IPv6 detection inside run_setup, which drives verification through verify_all_dns, so in practice that claim holds for the setup wizard. But aimx verify is its own subcommand — if it has its own IPv6-aware path (not touched in this diff), the claim in book/configuration.md ("aimx setup and aimx verify ignore IPv6 entirely") could be inaccurate. I did not see a verify.rs change in the diff; the author should confirm aimx verify doesn't independently probe IPv6 when the flag is off, or adjust the book wording.
Potential bugs
1. resolve_ipv4() uses a blocking, default-resolver lookup with no timeout — Non-blocker (but worth noting)
src/send.rs:27-36
format!("{host}:25").to_socket_addrs().ok()?.find_map(...)ToSocketAddrs is a blocking libc getaddrinfo call with no timeout. try_deliver is called from inside block_in_place on a tokio runtime, so a misbehaving/slow DNS resolver will block a worker thread until libc gives up (can be ~30 seconds). The existing resolve_mx uses hickory-resolver with proper async; this new codepath silently drops to libc. Not a correctness issue — just an inconsistency that's easy to overlook. If it matters, reuse hickory-resolver here too.
2. resolve_ipv4() picks the first A record non-deterministically — Non-blocker
ToSocketAddrs iterator order is OS-defined. If an MX host publishes multiple A records (load balancing), you'll connect to whichever one the OS returns first and never retry the others on failure. The outer try_deliver loop only retries across MX hosts, not across A records of a single host. Pre-Sprint-26, lettre itself was doing this resolution and arguably had the same behavior, so this is not a regression — but it's worth a comment in the code so the next reader understands the choice.
3. Error message when A-record lookup fails is silently masked — Non-blocker
If resolve_ipv4(host) returns None (host has AAAA only, or DNS failed), the code falls back to host.to_string() as the connect target. With enable_ipv6 = false, the user's intent is "IPv4 only", but this fallback hands the hostname to lettre, which will then pick whatever family the OS returns — possibly IPv6 — directly contradicting the flag. The user will see a normal "delivered" success and not realize IPv6 was used. Two reasonable alternatives:
- Treat
Noneas an error: "no A record for {host}, skipping (enable_ipv6 = false)" and move on to the next MX. - Log a visible warning when falling back.
Silently ignoring the flag under a specific edge condition is surprising. At minimum, this deserves a code comment so the behavior is intentional and documented.
4. TLS SNI vs. connect target mismatch is fine here, but worth noting — Non-blocker
TlsParameters::builder(host.to_string()) uses the hostname for SNI / cert verification, while the transport is built with the bare IPv4 string as the connect target. Because dangerous_accept_invalid_certs(true) is set, cert-name mismatch is accepted. Correct, but fragile: if anyone ever flips that flag to do real cert verification, SNI will present host while the TLS layer sees an IP and the behavior gets subtle. Add a comment.
Test coverage
What's covered
config::tests::enable_ipv6_defaults_to_falseandparse_enable_ipv6_true— serde round-trip for the new flag. Fine.- Every test fixture updated to set
enable_ipv6: false. Good hygiene.
Gaps
- No test for
LettreTransport::new(true)vsnew(false)picking different connect targets. The core behavioral change of the PR has zero direct test coverage.resolve_ipv4is a private helper, andtry_deliverneeds a live SMTP receiver — but you could at least unit-testresolve_ipv4("127.0.0.1")returnsSome(127.0.0.1)andresolve_ipv4("::1")returnsNone, or refactor the connect-target selection into a pure function that can be tested without network. As-is, the only verification is the manual strace the author ran on his VPS. - No test for the
aimx setupgating.run_setupnow has two branches forenable_ipv6(true/false). There's no test that, given aconfig.tomlwithenable_ipv6 = true,get_server_ipv6()is called, and that withenable_ipv6 = falseit is not. The trait-basedMockNetworkOpspattern exists specifically for this; the PR did not add an assertion here. - No test that DNS guidance / SPF generation omits AAAA and
ip6:when the flag is false.generate_dns_recordsis called withserver_ipv6_str.as_deref(), so when the flag is false the argument isNone. A small test confirminggenerate_dns_records(..., None, ...)produces no AAAA and noip6:would lock in the contract.
These aren't theoretical — the PR's whole raison d'être is changing which family is used, and none of that change is under test.
Security issues
None identified. The change is a narrowing of behavior (default to IPv4 instead of OS-choice) and an opt-in flag. No new attack surface.
Code quality
Non-blocker observations
resolve_ipv4is defined inimpl LettreTransport(line 22) but is an associated function with no self. It could live anywhere; fine as-is, but noting it isn't really a method.- The PR description mentions "re-introduces
resolve_ipv4()" — Sprint 25 removed a similar helper and Sprint 26 cleaned it up. The codebase has now had this helper added, removed, and re-added in three consecutive sprints. Worth a code comment explaining why it's back (to preserve the flag semantic, not because lettre can't connect) so the next engineer doesn't delete it again. enable_ipv6: boolfield onConfigis#[serde(default)]but does not haveskip_serializing_if. Whensave()writes the config, it will always emitenable_ipv6 = falsefor existing configs that didn't have it. That's probably fine (self-documenting), but inconsistent withverify_hostwhich usesskip_serializing_if = "Option::is_none". Decide which pattern you want and apply uniformly.
Alignment with PRD
PRD FR-7, FR-19, and decision #8 are updated in this same PR to match the new behavior. The updates read cleanly and are self-consistent. No PRD-vs-sprint divergence, because no sprint in docs/sprint.md describes this work — it's a post-sprint policy change.
One minor: docs/sprint.md still has Sprint 26 marked [DONE] with acceptance criteria that say the OS should choose IPv4 or IPv6 naturally. That's now technically inaccurate for the shipped behavior (IPv4 by default). Not a blocker for this PR, but the sprint file would benefit from a note — either via the scrum-plan-updater or a one-line addendum to Sprint 26 — that Sprint 26's default was flipped by this follow-up. Otherwise someone reading sprint.md will think IPv6-by-default is still the behavior.
Human review comments
No human comments on PR #50 or referenced issues. Skipping.
Summary and recommended actions
Overall verdict: Needs minor fixes
The behavioral change itself is sound, small, and well-motivated. The implementation has no correctness blockers I can identify, but the test coverage gaps and the silent resolve_ipv4 → None fallback are worth addressing before merge, because the whole point of the PR is that the default family selection is reliable and predictable.
Blockers (must fix before merge): none.
Non-blockers (should fix):
- Add direct test coverage for the IPv4-vs-hostname connect-target choice in
LettreTransport(at minimum refactor the selection into a pure testable function; ideally a small unit test). - Add a test that
run_setupcallsget_server_ipv6()iffenable_ipv6is true (use the existingMockNetworkOps). - Decide what happens when
enable_ipv6 = falseand the MX host has no A record. Current behavior silently falls through to the hostname and may use IPv6 anyway, contradicting the flag. Either skip that MX, warn, or at minimum document the fallback in a code comment. - Verify and, if needed, correct the book claim that "
aimx verifyignores IPv6 entirely" — the diff only gatesrun_setup; confirmaimx verifyas a standalone command behaves the same. - Update
docs/sprint.mdso Sprint 26's acceptance criteria / description note that IPv6 became opt-in via this follow-up. Prevents future confusion.
Nice-to-haves:
- Replace
ToSocketAddrswithhickory-resolverinresolve_ipv4for consistency and timeout control. - Add a one-line code comment in
resolve_ipv4explaining why it exists (flag semantics, not a lettre bug) to prevent a fourth round of add/remove. - Decide a single convention for optional vs. defaulted config fields (
skip_serializing_ifvs. bare#[serde(default)]) and apply consistently.
- Extract connect-target selection into pure `select_connect_target()` with `ConnectTarget` enum; add unit tests covering all four IPv4/IPv6 combinations. - Skip MX hosts with no A record when `enable_ipv6 = false` instead of silently falling through to the hostname (which could still resolve to IPv6 and violate the flag). Emits a clear "no A record; skipping" error so `try_deliver` moves on to the next MX. - Replace blocking `ToSocketAddrs` in `resolve_ipv4` with a new `mx::resolve_a()` helper built on `hickory-resolver`, for consistency with MX resolution and to avoid blocking getaddrinfo in tokio workers. - Add a doc comment on `resolve_ipv4` explaining why it exists (flag semantics, third add/remove in three sprints) to prevent future deletion. - Add a doc comment on TLS SNI vs. connect-target mismatch while `dangerous_accept_invalid_certs` is set. - Extract `detect_server_ipv6(enable_ipv6, net)` gate in setup.rs and add tests using `get_server_ipv6_calls` counter on `MockNetworkOps` to verify the network call is made iff the flag is true. - Correct book/configuration.md: `aimx verify` only probes port 25, not IPv6; the flag only affects `aimx setup`. - Add post-merge addendum under Sprint 26 in docs/sprint.md noting that this follow-up flipped the default to opt-in IPv6.
Review follow-upPushed 85ec2ed addressing the 5 non-blockers and all 3 nice-to-haves. Addressed
Intentionally left as-is
Verification
|
uzyn
left a comment
There was a problem hiding this comment.
Re-review (standalone mode) — follow-up on commit 85ec2ed
Verified the five previously flagged non-blockers and all nice-to-haves are addressed. cargo test (365 unit + 44 integration), cargo clippy -- -D warnings, and cargo fmt -- --check all pass on the PR head.
Resolved issues
-
Connect-target selection test coverage (non-blocker #1) — Resolved.
select_connect_target(host, enable_ipv6, &ipv4_addrs) -> ConnectTargetextracted as a pure function with aConnectTarget::{Target, SkipNoIpv4}enum. Four unit tests insrc/send.rscover: IPv6-on + no addrs, IPv6-on + ignored addrs, IPv4-mode + first A record, IPv4-mode + no A record. All paths of the decision are now under test. -
run_setupgating test (non-blocker #2) — Resolved.detect_server_ipv6(enable_ipv6, net)extracted fromrun_setupandMockNetworkOpsgrew aget_server_ipv6_calls: Cell<u32>counter. Three tests assert the exact contract: flag=false → 0 calls + None; flag=true → exactly 1 call; flag=true + underlying None → 1 call + None. This is exactly the test shape that was missing. -
Silent fallback in
resolve_ipv4(non-blocker #3) — Resolved.ConnectTarget::SkipNoIpv4is returned whenenable_ipv6 = falseand the MX host has no A record;try_deliverconverts that into an error ("<host>: no A record (enable_ipv6 = false); skipping") so the outer MX loop moves to the next MX instead of silently handing the hostname to lettre. Flag contract is no longer silently violated.Small residual note (not a blocker):
Self::resolve_ipv4(host).unwrap_or_default()atsrc/send.rs:177conflates transient DNS errors (SERVFAIL, timeout, resolver init failure) with "no A record" — both produce an empty vec and the sameSkipNoIpv4message. Because the outer MX loop handles failover, the flag semantic is preserved, but a user troubleshooting DNS issues will see "no A record" even when the real problem is a resolver/network error. Worth a follow-up, not blocking. -
Doc/behavior mismatch for
aimx verify(non-blocker #4) — Resolved. Inspectedsrc/verify.rsdirectly: it only probes port 25 connectivity (binds a temp listener if free, thenrun_with_net). No IPv6-aware path. The book copy is now accurate: "aimx verifyonly probes port 25 connectivity and is unaffected by this flag." -
Sprint 26 historical record (non-blocker #5) — Resolved. A post-merge addendum block under Sprint 26 in
docs/sprint.mdclearly states the default was flipped by this PR and points to FR-7, FR-19, decision #8, and the book. Original ACs preserved so the sprint record stays historically accurate.
Nice-to-haves addressed
hickory-resolverin place ofToSocketAddrs:mx::resolve_ausesipv4_lookup, treatsNoRecordsFound/NxDomainas empty vec, and surfaces other errors. Called fromresolve_ipv4viablock_in_place— the same pattern asresolve_mx— so no blocking getaddrinfo in tokio workers.- Doc comment on
resolve_ipv4: added, explicitly documenting the three-sprint add/remove history and the flag-semantics reason it must stay. - TLS SNI vs connect-target code comment: added in
try_deliver, documenting that this is safe only whiledangerous_accept_invalid_certs(true)is set.
Intentionally left as-is — acknowledged
- Serde convention unification: author's reasoning is sound —
boolandOption<String>have genuinely different "missing" semantics, and addingskip_serializing_if+ afn is_falsehelper for a bool default would add noise for no operator benefit. Accepted. - Multi-A-record retry on connect failure: separable concern; the outer MX loop provides sufficient failover for the flag's intent. Accepted.
New issues found
None material. Just the residual unwrap_or_default() note above, which is non-blocking.
Summary
All five previously flagged non-blockers and the three nice-to-haves are resolved with appropriate test coverage. The behavioral change (IPv4-only default, flag honoured even in AAAA-only edge cases) is now properly under test via the pure-function extractions in both send.rs and setup.rs. Documentation is consistent with the implementation. No regressions observed.
Overall verdict: Ready to merge
Recommended merge commit message
Make IPv6 outbound opt-in via enable_ipv6 config flag
Defaults `aimx send` (and `aimx setup`'s DNS guidance/verification)
back to IPv4-only, matching the SPF record `aimx setup` generates by
default. IPv6 outbound and dual-stack DNS guidance become opt-in via
a new hidden `enable_ipv6: bool` field in `config.toml`.
- Config: add `enable_ipv6` (default false) with serde round-trip tests
- send.rs: `LettreTransport` carries the flag; extract pure
`select_connect_target()` + `ConnectTarget` enum; skip MX hosts with
no A record when flag is off (via `SkipNoIpv4`) so the flag is not
silently violated; resolve A records via `hickory-resolver`
- setup.rs: extract `detect_server_ipv6(enable_ipv6, net)` so
`get_server_ipv6()` is only called when the flag is true
- mx.rs: new `resolve_a()` helper (ipv4_lookup, NoRecords → empty vec)
- Book: add "IPv6 delivery (advanced)" section; correct `aimx verify`
wording (port 25 only, unaffected by flag)
- PRD: update FR-7, FR-19, resolved-decision #8 to describe opt-in
- Sprint plan: post-merge addendum under Sprint 26 noting the flip
Verified live: default config connects over IPv4 (AF_INET); adding
`enable_ipv6 = true` makes it connect over IPv6 (AF_INET6).
- Extract connect-target selection into pure `select_connect_target()` with `ConnectTarget` enum; add unit tests covering all four IPv4/IPv6 combinations. - Skip MX hosts with no A record when `enable_ipv6 = false` instead of silently falling through to the hostname (which could still resolve to IPv6 and violate the flag). Emits a clear "no A record; skipping" error so `try_deliver` moves on to the next MX. - Replace blocking `ToSocketAddrs` in `resolve_ipv4` with a new `mx::resolve_a()` helper built on `hickory-resolver`, for consistency with MX resolution and to avoid blocking getaddrinfo in tokio workers. - Add a doc comment on `resolve_ipv4` explaining why it exists (flag semantics, third add/remove in three sprints) to prevent future deletion. - Add a doc comment on TLS SNI vs. connect-target mismatch while `dangerous_accept_invalid_certs` is set. - Extract `detect_server_ipv6(enable_ipv6, net)` gate in setup.rs and add tests using `get_server_ipv6_calls` counter on `MockNetworkOps` to verify the network call is made iff the flag is true. - Correct book/configuration.md: `aimx verify` only probes port 25, not IPv6; the flag only affects `aimx setup`. - Add post-merge addendum under Sprint 26 in docs/sprint.md noting that this follow-up flipped the default to opt-in IPv6.
* Make IPv6 outbound opt-in via enable_ipv6 config flag aimx send now defaults to IPv4-only outbound delivery. Set enable_ipv6 = true in config.toml to opt into dual-stack (OS chooses). This matches the SPF record aimx setup generates by default (ip4: only) and avoids SPF failures at Gmail when the server has a global IPv6 but DNS isn't set up for it. - Config: new enable_ipv6: bool field, defaults to false - send.rs: LettreTransport carries the flag; re-introduces resolve_ipv4() helper and gates the connect target in try_deliver() - mcp.rs: pass config.enable_ipv6 when constructing transport - book/configuration.md: new "IPv6 delivery (advanced)" section documenting the required AAAA + ip6: SPF additions - book/setup.md: note under DNS table that AAAA / ip6: are only needed when enable_ipv6 is on - aimx setup is unchanged — this is a hidden config flag Verified live: default config connects over IPv4 (AF_INET); adding enable_ipv6 = true makes it connect over IPv6 (AF_INET6). * Update PRD for enable_ipv6 opt-in (FR-7, FR-19, resolved decision #8) * Gate aimx setup IPv6 detection behind enable_ipv6 flag aimx setup now reads `enable_ipv6` from the existing config (if any) and only calls `get_server_ipv6()` when the flag is true. With the flag unset or false, AAAA is not advertised, `ip6:` is not added to SPF, and neither is verified — matching the IPv4-only default of `aimx send`. Any existing AAAA record in DNS is left alone. Re-entrant workflow: - Fresh install -> no config yet -> IPv4-only setup. - User edits config.toml and adds `enable_ipv6 = true`. - User re-runs `sudo aimx setup <domain>` -> setup detects the flag, shows AAAA + ip6: SPF in the DNS table, and verifies them. Verified live on a dual-stack VPS: default config produces an IPv4-only DNS table; adding `enable_ipv6 = true` produces AAAA + ip6: SPF in the same output. - src/setup.rs: read enable_ipv6 when loading existing config, gate get_server_ipv6() call in run_setup - book/configuration.md: updated IPv6-delivery section to describe the re-run-setup workflow and the ignore-when-disabled behavior - book/setup.md: note that AAAA / ip6: are only shown when the flag is on - docs/prd.md: FR-7 updated to reflect the new gate * Address PR #50 review feedback on enable_ipv6 flag - Extract connect-target selection into pure `select_connect_target()` with `ConnectTarget` enum; add unit tests covering all four IPv4/IPv6 combinations. - Skip MX hosts with no A record when `enable_ipv6 = false` instead of silently falling through to the hostname (which could still resolve to IPv6 and violate the flag). Emits a clear "no A record; skipping" error so `try_deliver` moves on to the next MX. - Replace blocking `ToSocketAddrs` in `resolve_ipv4` with a new `mx::resolve_a()` helper built on `hickory-resolver`, for consistency with MX resolution and to avoid blocking getaddrinfo in tokio workers. - Add a doc comment on `resolve_ipv4` explaining why it exists (flag semantics, third add/remove in three sprints) to prevent future deletion. - Add a doc comment on TLS SNI vs. connect-target mismatch while `dangerous_accept_invalid_certs` is set. - Extract `detect_server_ipv6(enable_ipv6, net)` gate in setup.rs and add tests using `get_server_ipv6_calls` counter on `MockNetworkOps` to verify the network call is made iff the flag is true. - Correct book/configuration.md: `aimx verify` only probes port 25, not IPv6; the flag only affects `aimx setup`. - Add post-merge addendum under Sprint 26 in docs/sprint.md noting that this follow-up flipped the default to opt-in IPv6.
Summary
aimx sendnow defaults to IPv4-only outbound deliveryenable_ipv6: bool(defaultfalse) inconfig.toml— set totrueto opt into dual-stack (OS chooses address family)aimx setupis untouched; the flag is only discoverable via config docsbook/configuration.mdexplaining the required AAAA +ip6:SPF additions; a short note under the DNS records table inbook/setup.mdWhy
Sprint 26 removed the IPv4-only workaround so lettre could connect over either family. Live testing confirmed IPv6 connects work, but the server's IPv6 isn't in the default SPF record that
aimx setupgenerates — so messages sent over IPv6 fail SPF at Gmail (observed on `agent.zeroshot.lol`). Most single-VPS deployments only need IPv4; a hidden opt-in flag is the right shape for the advanced case.Behavior verified on this VPS
Test plan