fix(transfer): batch AXFR records by message size instead of count #8002

Merged: yongtang merged 1 commit into coredns:master from umut-polat:fix/transfer-batch-size on Apr 4, 2026

@umut-polat (Contributor)

The transfer plugin batches zone transfer records using a fixed count of 500 records per message. When a zone contains many large records (long TXT strings, DNSSEC RRSIG records, etc.), the packed message can exceed the 65,535-byte limit on a DNS message sent over TCP, causing truncation or silent record loss during AXFR.

Changes

  • plugin/transfer/transfer.go: replace the len(rrs) > 500 threshold with a byte-size check that sums dns.Len() over the records in the batch and flushes before the batch exceeds 63000 bytes (leaving headroom for the message header and question section)
  • fix a potential deadlock in the final batch send: if the client disconnects mid-transfer, the trailing ch <- rrs could block forever because tr.Out has already exited. The send is now wrapped in a select that also watches errCh
  • plugin/transfer/transfer_test.go: add TestTransferLargeRecordBatching with 300 TXT records (~250 bytes each, ~75KB total) asserting that every packed message stays within dns.MaxMsgSize and no records are lost
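
The size-based flush described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: sizedRecord, batchBySize and maxBatchBytes are hypothetical names, and the size field stands in for what dns.Len() would report for a real record.

```go
package main

import "fmt"

// sizedRecord is a hypothetical stand-in for dns.RR; size plays the
// role of the value dns.Len(rr) would return for a real record.
type sizedRecord struct {
	name string
	size int
}

// maxBatchBytes mirrors the 63000-byte threshold named in the PR text.
const maxBatchBytes = 63000

// batchBySize groups records so that the summed size of each batch
// never exceeds limit, flushing BEFORE a record would push it over.
// A single record larger than limit still gets a batch of its own.
func batchBySize(records []sizedRecord, limit int) [][]sizedRecord {
	var batches [][]sizedRecord
	var cur []sizedRecord
	curSize := 0
	for _, r := range records {
		if len(cur) > 0 && curSize+r.size > limit {
			batches = append(batches, cur)
			cur = nil
			curSize = 0
		}
		cur = append(cur, r)
		curSize += r.size
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	// 300 records of 250 bytes each, loosely modelled on the new test.
	records := make([]sizedRecord, 300)
	for i := range records {
		records[i] = sizedRecord{name: fmt.Sprintf("txt%d", i), size: 250}
	}
	total := 0
	for i, b := range batchBySize(records, maxBatchBytes) {
		bytes := 0
		for _, r := range b {
			bytes += r.size
		}
		total += len(b)
		fmt.Printf("batch %d: %d records, %d bytes\n", i, len(b), bytes)
	}
	fmt.Println("records batched:", total)
}
```

With these made-up sizes the 300 records split into two batches, and the total batched equals the input count, which is the same pair of properties the new test asserts against real packed messages.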

Repro (before fix)

Create 500 TXT records with 250-byte payloads, serve them with the file and transfer plugins, then run:

dig @localhost testzone. AXFR
;; XFR size: 248 records  <-- expected 502

Fixes #7859

The transfer plugin batched zone transfer records using a fixed count
of 500 records per message. When a zone has many large records (e.g.
long TXT or DNSSEC RRSIG), the resulting message can exceed the 64KB
TCP limit, causing truncation or silent record loss.

Replace the record count threshold with a byte size check using
dns.Len per record, flushing before the batch exceeds 63000 bytes.

Also fix a potential deadlock in the final batch send when the client
disconnects mid-transfer by using a select on errCh.

Fixes coredns#7859

Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
@yongtang yongtang merged commit 61f4145 into coredns:master Apr 4, 2026
13 checks passed
Filippo125 pushed a commit to Filippo125/coredns that referenced this pull request Apr 10, 2026

Development

Successfully merging this pull request may close these issues.

plugin/transfer: batching causes lost records when zone contains large records