Engine/format sweep 2026 05 17 #6

Draft
traceyyoshima wants to merge 2 commits into trunk from engine/format-sweep-2026-05-17

@traceyyoshima traceyyoshima commented May 17, 2026

Owner

Draft — diff review only

This PR is not for merge. It's a one-shot application of the language-engine Java formatter across the full kafka tree so the resulting diff can be inspected.

Sweep stats

| Metric | Value |
| --- | --- |
| Files scanned (under src/<sourceSet>/java/) | 5,834 |
| Files changed | 1,038 (17.8%) |
| Files changed, main scope | 521 / 3,685 (14.1%) |
| Files changed, test scope | 517 / 2,149 (24.1%) |
| Byte delta | −6,661 |
| Parse failures | 3 (archetype templates with ${package} placeholders, expected) |
| Format/write failures | 0 |
| Reformat wall time | 7.2s total across 5,834 files (1.237 ms/file) |
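
The percentages above follow directly from the raw counts. A quick sketch that recomputes them (the `SweepStats` class and its method are illustrative names, not part of language-engine):

```java
import java.util.Locale;

// Illustrative check of the sweep percentages; not part of language-engine.
final class SweepStats {
    /** Share of changed files, as a percentage rounded to one decimal place. */
    static String percentChanged(int changed, int scanned) {
        return String.format(Locale.ROOT, "%.1f%%", 100.0 * changed / scanned);
    }

    public static void main(String[] args) {
        System.out.println("all   " + percentChanged(1_038, 5_834));
        System.out.println("main  " + percentChanged(521, 3_685));
        System.out.println("test  " + percentChanged(517, 2_149));
    }
}
```

The main and test counts also sum to the totals: 521 + 517 = 1,038 changed and 3,685 + 2,149 = 5,834 scanned.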

How it works

Style is auto-detected, not configured. The formatter does not impose IntelliJ or Google style. Instead, it observes the source, counting per-slot whitespace/indent patterns across the whole project, and derives a Style object that matches the project's existing conventions. Where the project's existing usage is bimodal (e.g. half the files use one convention, half another), the framework records the slot as ambiguous and preserves the source as-is rather than picking a winner. This sweep derived a separate Style for each of kafka's main and test scopes and found 12 options where the test scope's conventions diverge from main; each scope was formatted with its own derived Style.
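
A minimal sketch of the decide-or-preserve policy described above. Everything in it is an assumption made for illustration: the `SlotDecision` class, the `dominant` signature, and the 0.8 threshold are hypothetical and are not taken from language-engine's actual `Bridge.dominant`.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; names and the threshold are assumptions, not language-engine's API.
final class SlotDecision {
    /** Assumed minimum share of observations a value needs to count as the convention. */
    static final double THRESHOLD = 0.8;

    /**
     * Returns the dominant observed value for one slot, or empty when usage is
     * ambiguous (e.g. bimodal), in which case the formatter would leave the source as-is.
     */
    static <T> Optional<T> dominant(Map<T, Integer> counts) {
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        if (total == 0) {
            return Optional.empty(); // nothing observed: no basis for a decision
        }
        // With a threshold above 0.5, at most one value can qualify.
        return counts.entrySet().stream()
                .filter(e -> e.getValue() >= THRESHOLD * total)
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        // 95 of 100 sites use a single space at this slot: the slot is resolved.
        System.out.println(dominant(Map.of(" ", 95, "", 5)));
        // 50/50 split: ambiguous, so the slot would be preserved.
        System.out.println(dominant(Map.of(" ", 50, "", 50)));
    }
}
```

Returning an empty `Optional` rather than a best guess is what keeps bimodal slots out of the diff; the real decision policy may weigh evidence differently.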

The formatter rules are largely auto-generated. The framework structure (StyleOption, slot keys, Bridge.dominant decision policy, per-rule catalogs) is hand-written, but the ~250 individual format rules — one per whitespace/indent slot in the Java grammar — are generated from the grammar's slot inventory and a small set of rule-shape templates (TokenAdjacent, ListElement, ModifierPostfix, etc.). Adding a rule is a data change, not a code change.
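
To illustrate why adding a rule can be a data change, here is a sketch of generating rules from a slot inventory. The shape names `TokenAdjacent`, `ListElement`, and `ModifierPostfix` come from the description above; the `Slot` and `FormatRule` records, the slot keys, and the `generate` method are hypothetical and do not reflect language-engine's real types.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of data-driven rule generation; not language-engine's real types.
final class RuleGenerator {
    /** Rule-shape templates named in the PR description (original casing kept). */
    enum RuleShape { TokenAdjacent, ListElement, ModifierPostfix }

    /** One whitespace/indent slot from the grammar's inventory (keys here are made up). */
    record Slot(String key, RuleShape shape) {}

    /** A generated rule: the slot it governs plus the template that implements it. */
    record FormatRule(String slotKey, RuleShape shape) {}

    /** Produces exactly one rule per slot; no per-rule code is written by hand. */
    static List<FormatRule> generate(List<Slot> inventory) {
        return inventory.stream()
                .map(slot -> new FormatRule(slot.key(), slot.shape()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Slot> inventory = List.of(
                new Slot("method.beforeOpenParen", RuleShape.TokenAdjacent),
                new Slot("arguments.afterComma", RuleShape.ListElement),
                new Slot("modifier.afterKeyword", RuleShape.ModifierPostfix));
        // Adding a fourth rule means adding one more Slot entry above.
        generate(inventory).forEach(System.out::println);
    }
}
```

In this sketch the hand-written part is limited to the shape templates, while the inventory is plain data.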

What you'll see in the diff

  • Whitespace normalization at slots where kafka's source is internally consistent and the framework's derived Style agrees.
  • Preservation (no change) at slots where kafka's source is bimodal or where the framework couldn't reach a confident decision.
  • A handful of indent corrections from the catalog framework's continuation-shape classes (StructuralContinuation, FirstOperandAligned, ParenAnchor, ChainAnchor, ArgContinuation2x, ArgContinuation4x).
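
The first two bullets amount to a simple per-slot rule: rewrite whitespace only when the derived Style has a resolved value, otherwise keep what the source had. A sketch under that assumption (the `SlotFormatter` class and its `format` method are hypothetical names, not language-engine code):

```java
import java.util.Optional;

// Hypothetical sketch of normalize-or-preserve at a single slot.
final class SlotFormatter {
    /**
     * @param sourceWhitespace whitespace found at the slot in the original file
     * @param resolved         the project's derived convention, or empty if ambiguous
     * @return the whitespace to emit at this slot
     */
    static String format(String sourceWhitespace, Optional<String> resolved) {
        // Ambiguous slots fall through to the original text, producing no diff.
        return resolved.orElse(sourceWhitespace);
    }

    public static void main(String[] args) {
        // Resolved slot: two spaces in the source are normalized to one.
        System.out.println("[" + format("  ", Optional.of(" ")) + "]");
        // Ambiguous slot: the source's two spaces are preserved.
        System.out.println("[" + format("  ", Optional.empty()) + "]");
    }
}
```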

Reproduce

language-engine-cli \
  --format-apply \
  --source-dir /path/to/kafka \
  --write

Built from language-engine main at commit ace1ebc7 (post-PR apache#252).

Draft for review of the formatter's output across kafka's source
modules. Generated by `language-engine-cli --format-apply --write`
on the language-engine main at 80469e1b (post-PR apache#251).

Stats:
- Files scanned:      5,834 (under src/<sourceSet>/java/)
- Files changed:      1,038 (17.8%)
  - main scope:         521 / 3,685 (14.1%)
  - test scope:         517 / 2,149 (24.1%)
- Byte delta:        -6,661 bytes
- Parse failures:         3 (archetype templates with ${package} placeholders — expected)
- Format/write failures:  0
- Skipped (not under src/<set>/java/): 1,007

Per-scope Style divergence detected: 12 options resolved
differently between test and main scopes, e.g. test allows up to
3 blank lines before close-brace vs. main's 2, and test resolves
arrayInitializerTrailingComma=false while main is unresolved.
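
A sketch of how such a divergence list could be computed: compare the two scopes' resolved options key by key, treating a missing key as unresolved. The option name `arrayInitializerTrailingComma` and the blank-line values come from the message above; `maxBlankLinesBeforeCloseBrace`, `indentSize`, the `StyleDiff` class, and the map representation of a Style are assumptions for illustration.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; the Style representation is an assumption.
final class StyleDiff {
    /** Option names whose resolved value differs between scopes; a missing key means unresolved. */
    static Set<String> divergent(Map<String, Object> main, Map<String, Object> test) {
        Set<String> names = new TreeSet<>(main.keySet());
        names.addAll(test.keySet());
        // Drop every option on which both scopes agree (including both unresolved).
        names.removeIf(name -> Objects.equals(main.get(name), test.get(name)));
        return names;
    }

    public static void main(String[] args) {
        Map<String, Object> mainScope = Map.of(
                "maxBlankLinesBeforeCloseBrace", 2,
                "indentSize", 4); // made-up shared option, identical in both scopes
        Map<String, Object> testScope = Map.of(
                "maxBlankLinesBeforeCloseBrace", 3,
                "indentSize", 4,
                "arrayInitializerTrailingComma", false); // unresolved in main
        System.out.println(divergent(mainScope, testScope));
    }
}
```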

Not for merge — diff review only.

🤖 Generated with language-engine + Claude Code
traceyyoshima added a commit that referenced this pull request May 17, 2026
…pache#257)

Refreshed snapshot from the kafka format sweep review. Five
language-engine PRs landed in this session:

- PR apache#253 — Annotation paren anchor for continuation slots
- PR apache#254 — StructuralBody2x catalog class for inline lambda bodies
- PR apache#255 — Lambda body as continuation enclosing scope
- PR apache#256 — Inline-lambda body close brace catalog (+0 / +projectDelta)
- PR apache#257 — emittedLineStarts restricted to OWNER's prefix slot

Sweep stats (post-apache#257):
  files scanned : 5,834
  files changed : 1,027 (17.6%)   [initial PR #6: 1,038]
  byte delta    : -10,877          [initial PR #6: -6,661]
  reformat time : 7.2s total, 1.23 ms/file

The larger byte delta vs. the initial sweep reflects PR apache#257
fixing the AsyncKafkaConsumerTest-style "body at col 48" pattern
across multiple files — body statements no longer indent to the
inflated close-paren-line column, so each affected line shrinks
by 30+ characters.

Remaining patterns deferred to subsequent PRs:
  - Chain-dot over-application on Class.STATIC_FIELD / new X()
    (AdminClientConfig L273/L282, WriteTxnMarkersRequest L135,
    TxnOffsetCommitRequest L156/L159)
  - Partial-shift inconsistencies (StreamsResetter, ClientUtilsTest,
    RequestResponseTest, LeaveGroupRequest*, AlterReplicaLogDirs)
  - Trailing-whitespace cleanups on blank lines (incidental noise)
traceyyoshima added a commit that referenced this pull request May 17, 2026
…pache#258)

Refreshed snapshot. Six language-engine PRs landed this session:

- PR apache#253 — Annotation paren anchor for continuation slots
- PR apache#254 — StructuralBody2x catalog class for inline lambda bodies
- PR apache#255 — Lambda body as continuation enclosing scope
- PR apache#256 — Inline-lambda body close brace catalog (+0 / +projectDelta)
- PR apache#257 — emittedLineStarts restricted to OWNER's prefix slot
- PR apache#258 — alignmentTargets restricted to OWNER's prefix slot

Sweep stats (post-apache#258):
  files scanned : 5,834
  files changed : 1,000 (17.1%)   [initial PR #6: 1,038]
  byte delta    : -10,630          [initial PR #6: -6,661]
  reformat time : 7.1s total, 1.22 ms/file

Remaining patterns (deferred):
  - Outer chain-dot receiver-anchor variations (TxnOffsetCommitRequest
    L156/L159, RequestResponseTest L3373+, etc.) — separate path from
    PR apache#258's inner-arg fix
  - Small continuation normalizations of non-standard source
    (StreamsResetter, ClientUtilsTest, LeaveGroupRequestTest)
  - Trailing-whitespace cleanups on blank lines (incidental noise)
traceyyoshima added a commit that referenced this pull request May 17, 2026
…pache#259)

Refreshed snapshot. Seven language-engine PRs landed this session:

- PR apache#253 — Annotation paren anchor for continuation slots
- PR apache#254 — StructuralBody2x catalog class for inline lambda bodies
- PR apache#255 — Lambda body as continuation enclosing scope
- PR apache#256 — Inline-lambda body close brace catalog
- PR apache#257 — emittedLineStarts restricted to OWNER's prefix slot
- PR apache#258 — alignmentTargets restricted to OWNER's prefix slot
- PR apache#259 — Chain wrap preservation when siblings agree on column

Sweep stats (post-apache#259):
  files scanned : 5,834
  files changed : 990 (17.0%)     [initial PR #6: 1,038]
  byte delta    : -10,685          [initial PR #6: -6,661]
  reformat time : ~7s total

Most files called out in the user's original review are now fully
closed:
  - ShareConsumerTest (body + close brace)
  - ProducerIdExpirationTest
  - DeleteAclsResponse
  - CustomQuotaCallbackTest
  - AsyncKafkaConsumerTest (body indent regression)
  - AdminClientConfig L273/L282
  - WriteTxnMarkersRequest L135
  - TxnOffsetCommitRequest L156-159
  - RequestResponseTest chain shifts
  - AlterReplicaLogDirsRequestTest (cascade-closed)

Remaining residuals (~5 files, small):
  - StreamsResetter L580 (single `||` continuation normalisation)
  - ClientUtilsTest L79/L143 (continuation normalisation)
  - LeaveGroupRequestTest L55 (continuation shift)
  - LeaveGroupResponseTest (trailing-ws on blank lines)
  - ReassignPartitionsUnitTest (single line)

These are mostly the formatter normalising slightly non-standard
source — not really bugs.
traceyyoshima added a commit that referenced this pull request May 17, 2026
…pache#260)

Refreshed snapshot. Eight language-engine PRs landed this session:

- PR apache#253 — Annotation paren anchor for continuation slots
- PR apache#254 — StructuralBody2x catalog class for inline lambda bodies
- PR apache#255 — Lambda body as continuation enclosing scope
- PR apache#256 — Inline-lambda body close brace catalog
- PR apache#257 — emittedLineStarts restricted to OWNER's prefix slot
- PR apache#258 — alignmentTargets restricted to OWNER's prefix slot
- PR apache#259 — Chain wrap preservation when siblings agree on column
- PR apache#260 — Binary OPERATOR_PREFIX consults outermost binary's anchor

Sweep stats (post-apache#260):
  files scanned : 5,834
  files changed : 989 (17.0%)     [initial PR #6: 1,038]
  byte delta    : -10,663
  reformat time : ~7s total

Almost all files called out in the user's original review are now
fully closed. Remaining residuals (<5 files) are correct
normalisations of slightly non-standard source or trailing
whitespace inside comment content (intentionally out of scope for
this engine version per user design choice).
traceyyoshima added a commit that referenced this pull request May 17, 2026
…pache#261)

Refreshed snapshot. Nine language-engine PRs landed this session:

- PR apache#253 — Annotation paren anchor for continuation slots
- PR apache#254 — StructuralBody2x catalog class for inline lambda bodies
- PR apache#255 — Lambda body as continuation enclosing scope
- PR apache#256 — Inline-lambda body close brace catalog
- PR apache#257 — emittedLineStarts restricted to OWNER's prefix slot
- PR apache#258 — alignmentTargets restricted to OWNER's prefix slot
- PR apache#259 — Chain wrap preservation when siblings agree on column
- PR apache#260 — Binary OPERATOR_PREFIX consults outermost binary's anchor
- PR apache#261 — Nested continuation slots use innermost enclosing scope

Sweep stats (post-apache#261):
  files scanned : 5,834
  files changed : 986 (16.9%)     [initial PR #6: 1,038]
  byte delta    : -11,137          [initial PR #6: -6,661]
  reformat time : ~7s total

PR #7 review comments status:
  - Comments 1+2 closed (PR apache#261)
  - Comments 3 + 4/5 deferred (workbench-first investigation; diagnosis
    captured in language-engine memory project_pr7_deferred_comments_2026_05_17.md)