Add JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS for JSON5-style hex literals (#707) by seonwooj0810 · Pull Request #1613 · FasterXML/jackson-core

seonwooj0810 · 2026-05-26T11:56:43Z

Summary

Implements JSON5 hexadecimal integer literals (0x / 0X prefix, optional sign, [0-9a-fA-F]+) across all four JSON parsers, gated behind a new opt-in JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS. Closes #707 (hex part; whitespace remains).

Per @pjfanning's review:

JSON5 grammar followed verbatim — 0x|0X + [0-9a-fA-F]+, optional leading sign, no fraction / no exponent / no _ separators.
Independent of ALLOW_LEADING_ZEROS_FOR_NUMBERS — leading zeros within hex digits (e.g. 0x007F) are always permitted when this feature is enabled, regardless of the leading-zero feature. The hex check fires before _verifyNoLeadingZeroes() in every parser, and the leading-zero feature is never consulted on the hex path. Covered by unsignedHexWithLeadingZeros.

The + variant additionally requires ALLOW_LEADING_PLUS_SIGN_FOR_NUMBERS (matching how + is treated for decimals, per #774). Covered by plusSignHexRequiresLeadingPlusFeature and plusSignHexWithPlusFeature.

Implementation per parser:

Parser	Strategy
`ReaderBasedJsonParser`	Hex branch in `_parseNumber2`, before `_verifyNoLeadingZeroes`.
`UTF8StreamJsonParser`	Same pattern, byte-oriented; peek before `_verifyNoLeadingZeroes` in both `_parseUnsignedNumber` and `_parseSignedNumber`.
`UTF8DataInputJsonParser`	Intercept the existing `'x'`/`'X'` arm of `_handleLeadingZeroes` (no peek needed — DataInput already consumed it).
`NonBlockingUtf8JsonParserBase`	New minor states `MINOR_NUMBER_HEX_PREFIX` and `MINOR_NUMBER_HEX_DIGITS` so suspension works at every byte boundary; helpers `_startHexNumber` / `_startHexNumberWithSign` / `_finishHexDigits` drive the state machine.

Shared decoding lives in JsonParserBase._parseHexInt:

int for ≤ 7 hex digits (always fits in signed 32-bit),
long for ≤ 16 hex digits with a top-nibble check at 16,
BigInteger otherwise — eagerly decoded because the lazy _numberString path expects base-10.

getString() returns the original literal (sign + 0x/0X + digits), matching the precedent set for + by #784. getIntValue() / getLongValue() / getBigIntegerValue() return the decoded value.

The bookkeeping additions on JsonParserBase are limited:

_numberIsHex field (cleared by resetInt / resetFloat / resetAsNaN),
resetIntHex(neg, hexDigitLen) helper alongside resetInt.

I put these on JsonParserBase only (since only JSON consumes it today).

Test plan

mvn clean verify — 1697 tests, 0 failures, 0 regressions.
NonStandardHexNumbers707Test (12 cases) — unsigned, uppercase X, leading zeros, negative, + with/without plus-feature, long range (0x80000000), BigInteger range (0x1ffffffffffffffff), in-array, feature-off rejection, empty-prefix rejection. Runs across ALL_MODES (Reader, UTF8 stream, throttled, DataInput).
AsyncHexNumbers707Test (10 cases) — drives the non-blocking parser with 1-byte and 3-byte feeds to exercise suspension/resumption at every position; covers the same shapes plus a plainZeroStillWorks smoke test to verify the bare-0 path didn't regress.

…erXML#707) Address review feedback on FasterXML#1613: parse the buffered hex digits as a BigInteger via a new NumberInput.parseBigIntegerWithRadix(char[], ...) overload that hands the slice straight to FastDoubleParser, eliminating the intermediate String allocation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…erXML#707) Address review feedback on FasterXML#1613: parse the buffered hex digits as a BigInteger via a new NumberInput.parseBigIntegerWithRadix(char[], ...) overload that hands the slice straight to FastDoubleParser, eliminating the intermediate String allocation.

seonwooj0810 · 2026-05-26T13:57:57Z

@pjfanning Addressed your review feedback in 45e02b3 (moved BigInteger parsing into NumberInput.parseBigIntegerWithRadix(char[], ...) and pass the char slice directly to FastDoubleParser, avoiding the intermediate String). Could you take another look when you have a moment?

cowtowncoder · 2026-05-27T00:10:29Z

Hmmmh. This is HUGE. Not a big fan based on size alone... :-(

Will give some more thought.

seonwooj0810 · 2026-05-27T00:54:55Z

Quick note on the "fail when feature is not enabled" request — that scenario was actually already covered from the initial commit by hexRejectedWhenFeatureDisabled in both NonStandardHexNumbers707Test and AsyncHexNumbers707Test (each runs through all the hex prefix shapes with ALLOW_HEXADECIMAL_NUMBERS disabled and asserts they fail). I should have called that out explicitly in my reply — apologies for the noise.

Also pushed 4e8e060 to add invalid-hex digit coverage (e.g. 0xZZ) to the new bigIntegerWithRadixFromCharArray* tests, in case that was your intent rather than feature-disabled. Happy to add more if I'm still missing something.

seonwooj0810 · 2026-05-27T00:55:11Z

@cowtowncoder Totally hear you on the size — it's something I've been chewing on too. Let me break down what's in the +940 to make it easier to evaluate, and offer a couple of options.

Where the lines actually go:

Bucket	Lines	What it is
Tests	~383 (40%)	`NonStandardHexNumbers707Test`, `AsyncHexNumbers707Test`, plus the `NumberInput` / `BigIntegerParser` char[]-overload coverage @pjfanning asked for
Feature definition + utils	~70 (7%)	`JsonReadFeature` constant, `NumberInput.parseBigIntegerWithRadix(char[],...)`, `BigIntegerParser.parseWithFastParser(char[],...)`
Parser plumbing	~484 (51%)	Hex parsing wired into 7 parser variants

Why parser plumbing is 51%: the feature has to land in each concrete parser independently because the input sources (Reader, byte stream, DataInput, async buffer) don't share a unified read primitive. I tried to keep as much as possible in JsonParserBase (+89 for shared state and entry points), but the per-variant additions are doing input-source-specific work — Reader uses _loadMore(), DataInput uses readUnsignedByte() with IOException, async needs MINOR state restart points, etc.

The one outlier is NonBlockingUtf8JsonParserBase at +123. That's not duplication — async needs MINOR_NUMBER_HEX_PREFIX/MINOR_NUMBER_HEX_DIGITS states and the _finishHexDigits(requireFirst) / _completeHexNumber() restart-able split that sync parsers don't need. Without it, hex parsing would silently break on partial buffers.

What I can trim (good faith, ~15–20 lines): _isHexDigit is defined four times across parsers, and the "prefix must be followed by at least one hex digit" message is duplicated. I can hoist the digit check to ParserBase (with a char→int widening for ReaderBasedJsonParser) and extract the message as a constant. Happy to do this if you'd like — it's a small win at the cost of adding one method to the base class.

Suggested review path if it helps to chunk it:

JsonReadFeature.java — feature intent (1 constant)
NumberInput.java + BigIntegerParser.java — the new char[] overloads (used by all parsers)
JsonParserBase.java — shared state / entry points
One of the concrete parsers (ReaderBasedJsonParser.java is the simplest) — to understand the per-variant pattern
The remaining parsers are mechanical applications of the same pattern, just adapted to their input source

Alternative if size is still the blocker: I could split into a prelim PR that lands just the utils (NumberInput/BigIntegerParser char[] overloads + JsonReadFeature constant) ahead of this one. It wouldn't change net diff but would give you two ~half-size reviews instead of one large one. Let me know if that's worth doing or if you'd rather keep it as one atomic change.

Whichever direction you'd like to take, let me know and I'll make it happen.

…asterXML#707) Addresses two review comments on PR FasterXML#1613: - ParserBase.java:183 (cowtowncoder): hex literal handling is JSON-only, not shared by binary/text format backends that also extend ParserBase. - NonBlockingUtf8JsonParserBase.java:1810 (cowtowncoder): the per-parser _isHexDigit helper should live in a shared base for reuse. Promote JSON-only hex state to JsonParserBase --------------------------------------------- - _numberIsHex field and resetIntHex(...) move from ParserBase to JsonParserBase. ParserBase no longer references hex at all. - ParserBase.resetInt loses `final` so JsonParserBase can override it to clear _numberIsHex. resetFloat/resetAsNaN no longer touch the flag: _numberIsHex is only read in the VALUE_NUMBER_INT branch of _parseNumericValue/_parseIntValue, both reached only via resetInt or resetIntHex (both of which now correctly initialize the flag), so leaving it stale after a float/NaN reset is benign. Hoist duplicated per-parser helpers to JsonParserBase ----------------------------------------------------- - _isHexDigit(int) was duplicated four times across ReaderBasedJsonParser, UTF8StreamJsonParser, UTF8DataInputJsonParser, and NonBlockingUtf8JsonParserBase. Replaced with a single protected static on JsonParserBase. Char-stream parser's existing `_isHexDigit(char)` call site keeps working via implicit char->int widening. - "Hexadecimal number prefix '0X' must be followed by at least one hex digit (...)" error message was duplicated at the same four call sites. Extracted as protected static _hexPrefixNotFollowedMessage(char) on JsonParserBase. Net effect: -76 / +79 lines, identical externally observable behavior. Verified by full test suite (1702 pass, 2 pre-existing skips), with particular attention to NonStandardHexNumbers707Test and AsyncHexNumbers707Test.

) Addresses cowtowncoder's follow-up suggestion on PR FasterXML#1613: UTF8DataInputJsonParser.java:1218 (cowtowncoder): "Or better yet, check out what CharTypes.java already has; maybe charToHex() would be useful (gives -1 for non-hex digit)" CharTypes.charToHex(int) is a single masked array lookup that returns the digit value (0..15) for valid hex characters and -1 otherwise. That single primitive subsumes both helpers introduced for this feature: - _isHexDigit(int) -> CharTypes.charToHex(c) >= 0 - _hexDigit(char) -> CharTypes.charToHex(c) Both are removed; the four concrete parser scan loops and JsonParserBase._parseHexInt now go through CharTypes.charToHex directly, matching the pattern already used elsewhere in the JSON parsers for \\uXXXX escape decoding. Verified by full test suite (1702 pass, 2 pre-existing skips).

seonwooj0810 · 2026-05-27T01:36:00Z

@cowtowncoder Pushed a4bab461 and 52e7a897 to address your three inline comments:

a4bab461 — Move hex parser state to JsonParserBase, dedupe per-parser helpers

ParserBase.java:183: _numberIsHex field and resetIntHex(...) promoted to JsonParserBase. ParserBase no longer references hex at all. resetInt loses final so JsonParserBase can override it to clear _numberIsHex; resetFloat/resetAsNaN keep final (the flag is only read in the VALUE_NUMBER_INT branch, both reached only via resetInt/resetIntHex, so a stale value after a float reset is benign).
NonBlockingUtf8JsonParserBase.java:1810: the four duplicated _isHexDigit definitions consolidated as one protected static on JsonParserBase (then superseded entirely by the next commit — see below).
The four duplicated hex-prefix error messages extracted to JsonParserBase._hexPrefixNotFollowedMessage(char).

52e7a897 — Use CharTypes.charToHex for JSON5 hex literal scan/decode

UTF8DataInputJsonParser.java:1218: great suggestion — CharTypes.charToHex(int) is a single masked array lookup that returns the digit value (0..15) or -1, so it subsumes both helpers introduced for this feature:
- _isHexDigit(int) → CharTypes.charToHex(c) >= 0
- _hexDigit(char) → CharTypes.charToHex(c) (no separate decode needed)
Both helpers removed; the four concrete parser scan loops and JsonParserBase._parseHexInt now go through CharTypes.charToHex directly, matching the pattern already used elsewhere in these parsers for \uXXXX escape decoding.

Net diff is now +923 / -4 across 16 files (was +940 / -3). Modest line reduction but materially cleaner structure: zero hex references in cross-format ParserBase, zero per-parser duplication, and the digit-validation/digit-decode pair fused into one well-established primitive.

Full test suite: 1702 pass / 2 pre-existing skips, including NonStandardHexNumbers707Test (12 tests) and AsyncHexNumbers707Test (10 tests). Happy to keep iterating.

cowtowncoder · 2026-05-27T03:28:59Z

+            // valid in hex regardless of ALLOW_LEADING_ZEROS_FOR_NUMBERS.
+            if (_inputPtr < _inputEnd || _loadMore()) {
+                char peek = _inputBuffer[_inputPtr];
+                if ((peek == 'x' || peek == 'X')


Idea: add a helper method ("_checkHexNumbersAllowed()" etc) in JsonParserBase called after "0x" or "0X" seen, passing next character -- if feature NOT enabled, can give specific error message including feature to enable to allow hexadecimal numbers: either returns (indicating feature enabled) or throws exception (is not).

Reduces duplication slightly but more importantly gives better exception in case feature disabled.

Done in 4e5945f. Added JsonParserBase._checkHexNumbersAllowed(int) that throws StreamReadException naming the feature when JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS is disabled. All eight call sites across the four JSON parsers (Reader, UTF8 stream, UTF8 DataInput, non-blocking UTF8) now go through it, eliminating duplicated isEnabled(...) checks. Existing feature-disabled tests tightened to assert the error message includes the feature name.

cowtowncoder · 2026-05-27T03:31:55Z

@seonwooj0810 wrt:

Drive-by fix (please call out if you'd prefer this split)

yes, please, that'd be worth fixing separately, can merge quickly.

cowtowncoder · 2026-05-27T03:34:16Z

@seonwooj0810 Ok, aside from review (with which are making good progress), one process thing: unless we have asked for it before (and got it), we'd need CLA. It's from:

https://github.com/FasterXML/jackson/blob/main/CLA-jackson-2026.pdf

and is usually done by printing, filling & signing, scan/photo, email to cla at fasterxml dot com.

Only needs to be sent once and is good for all future contributions.

Two small review-feedback nits from FasterXML#1613: - ParserBase.resetInt was made non-final earlier in this branch so that JsonParserBase can override it to clear the hex flag. Add a comment noting the prior `final` modifier and why it was relaxed. - BigIntegerParser.parseWithFastParser(char[],int,int,int) is new in 3.2; add the `@since 3.2` Javadoc tag to match the existing convention.

@cowtowncoder

…error Per @cowtowncoder review on FasterXML#1613: when a number literal starts with `0x` / `0X` but `JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS` is not enabled, the user previously got a generic "unexpected character" error that did not mention the feature to enable. Centralize the feature check in a single `JsonParserBase._checkHexNumbersAllowed(int)` helper that throws a `StreamReadException` naming the feature, so the diagnostic is actionable: Unexpected character ('x') (code 120): hexadecimal number literals require enabling `JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS` All eight call sites across the four JSON parsers (Reader, UTF8 stream, UTF8 DataInput, non-blocking UTF8) now go through the helper, eliminating the duplicated `isEnabled(ALLOW_HEXADECIMAL_NUMBERS)` check at every parser-specific entry point. Existing feature-disabled tests are tightened to assert that the new error message includes the feature name in addition to the original "Unexpected character ('x'" prefix.

@cowtowncoder

Per @cowtowncoder review on FasterXML#1613: previously the maximum-integer-length constraint was applied only when `resetIntHex` was called -- i.e. after the full hex literal had been buffered. A pathological input with millions of digits would therefore allocate millions of chars before being rejected. Move the validation into the hex digit accumulation loop so it fires at every `_textBuffer` segment boundary (default first segment is 500 chars). With the default `maxNumberLength=1000`, a 10-million-digit hex literal is now caught at ~4 KiB rather than after fully buffering. Add `hexNumberLengthConstraint` to NonStandardHexNumbers707Test, exercising an 8000-digit literal with `maxNumberLength=100` against all sync modes; each backend must throw `StreamConstraintsException` with the standard "exceeds the maximum" message. The async path is exercised by the same validation primitive in `_finishHexDigits`.

seonwooj0810 · 2026-05-27T04:15:42Z

@cowtowncoder Pushed commits 8e5a4b07, c759815d, 0f5a076e, f306baa1 addressing the rest of your review (and split out the drive-by as #1614 per your request). Summary by item:

8e5a4b07 — Align MINOR_NUMBER_PLUSZERO numbering with the split

The +0... async-resumption primitive (MINOR_NUMBER_PLUSZERO) now lives in Preserve explicit "+" sign across buffer boundaries for "+0..." on async parser #1614. Renumbered to 33 in this PR too (shifted HEX_PREFIX/HEX_DIGITS to 34/35) so when Preserve explicit "+" sign across buffer boundaries for "+0..." on async parser #1614 lands and 3.x merges back into this branch, the constant lines up cleanly with no numeric collision.
Note: this branch still defines PLUSZERO itself because the hex +0xff async path needs it (positiveHexWithPlusSignAndSplit fails without it). When Preserve explicit "+" sign across buffer boundaries for "+0..." on async parser #1614 merges, the duplicate definition resolves to a trivial dedupe.

c759815d — Document resetInt relaxation and char[] BigInteger parser

ParserBase.resetInt: added the "was final before 3.2" comment you asked for.
BigIntegerParser.parseWithFastParser(char[],...): added @since 3.2.

0f5a076e — _checkHexNumbersAllowed helper for actionable feature-disabled error

New helper in JsonParserBase; called after 'x'/'X' is seen following a leading 0. If ALLOW_HEXADECIMAL_NUMBERS is disabled, throws via _reportUnexpectedChar with the feature name embedded:
```
Unexpected character ('x') (code 120): hexadecimal number literals
require enabling `JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS`
```
All eight call sites across the four parsers now route through the helper — no more duplicated isEnabled(ALLOW_HEXADECIMAL_NUMBERS) checks at parser-specific entry points.
hexRejectedWhenFeatureDisabled (sync + async) tightened to assert the feature-name message in addition to the original "Unexpected character ('x'" prefix.

f306baa1 — Validate hex digit length at every segment boundary

Per your comment about the 10M-digit case: validation now fires at every _textBuffer segment boundary in the hex-digit accumulation loop of all four parsers, not just at final resetIntHex. Default first segment is 500 chars, so with maxNumberLength=1000 a giant literal is caught long before it allocates.
Added hexNumberLengthConstraint test (8000-digit literal with maxNumberLength=100) exercising all sync modes; async path covered by the same primitive inside _finishHexDigits.

One item I want to flag rather than implement here: your suggestion to move hex decoding into TextBuffer itself (this comment) feels like it would meaningfully grow this PR — and since you've already noted the size concern, I'd prefer to spin it out into a follow-up PR after this lands. Happy to do it either way, just call out whether you'd like it here or separately.

Net effect after this round: +1003 / -22 across 18 files (was +923 / -4 across 16 files). The bump is roughly evenly split between the new helper + boundary test (+~80) and the same churn spread across the four parsers — but the cross-parser duplication is now down (one feature check, one boundary check primitive each).

Full test suite: 1703 pass / 2 pre-existing skips (was 1702 — added hexNumberLengthConstraint).

Process note: CLA will follow by email shortly; same signature covers both this PR and #1614.

…erals (FasterXML#707) Implements JSON5 hexadecimal integer literals (0x/0X prefix, optional sign, [0-9a-fA-F]+) across all four JSON parsers: * ReaderBasedJsonParser - hex branch in _parseNumber2 (before _verifyNoLeadingZeroes, so ALLOW_LEADING_ZEROS_FOR_NUMBERS is not consulted on the hex path). * UTF8StreamJsonParser - same pattern, byte-oriented. * UTF8DataInputJsonParser - intercept in the existing 'x'/'X' branch inside _parseUnsignedNumber and _parseSignedNumber. * NonBlockingUtf8JsonParserBase - new minor states MINOR_NUMBER_HEX_PREFIX and MINOR_NUMBER_HEX_DIGITS so suspension works at every byte boundary; helpers _startHexNumber(WithSign) and _finishHexDigits drive the state machine. Shared decoding lives in JsonParserBase._parseHexInt: int (<=7 hex digits), long (<=16 with top-nibble check), or BigInteger (eagerly decoded; the lazy _numberString path expects base-10 and would mis-read hex). A new boolean _numberIsHex field plus resetIntHex helper on ParserBase records that the buffered literal is hex; the textual representation preserved in _textBuffer is the original literal (sign + 0x/0X + digits), matching the precedent set by FasterXML#784 for the '+' sign. Per pjfanning's review on FasterXML#707: * JSON5 grammar is followed verbatim; leading zeros within hex digits are always allowed and do not consult ALLOW_LEADING_ZEROS_FOR_NUMBERS (see ALLOW_HEXADECIMAL_NUMBERS javadoc and the unsignedHexWithLeadingZeros test). * The '+' variant additionally requires ALLOW_LEADING_PLUS_SIGN_FOR_NUMBERS, matching how '+' is handled for decimal numbers; this is exercised by plusSignHexRequiresLeadingPlusFeature. Drive-by fix: while testing the async path I uncovered a pre-existing inconsistency where the explicit '+' sign was lost on the +0... resumption path because MINOR_NUMBER_ZERO was reused for both "plain 0" and "+0" suspension. Added MINOR_NUMBER_PLUSZERO so the sign survives byte-by-byte feeds; the sibling test AsyncNonStandardNumberParsingTest.leadingPlusSignInDecimalEnabled2 was asserting the buggy behavior (its peer leadingPlusSignInDecimalEnabled already asserted the correct, sign-preserving result) and has been updated to match the blocking parsers. Happy to split this into a separate PR if preferred. Tests: * NonStandardHexNumbers707Test - 12 cases covering unsigned/uppercase/ negative/+sign/long range/BigInteger range/in-array/feature-off rejection/empty-prefix rejection across ALL_MODES (Reader, UTF8 stream, UTF8 throttled, DataInput). * AsyncHexNumbers707Test - 10 cases driving the non-blocking parser with 1-byte and 3-byte feeds to exercise suspension/resumption at every position. Full mvn verify: 1697 tests, 0 failures, 0 regressions.

…erXML#707) Address review feedback on FasterXML#1613: parse the buffered hex digits as a BigInteger via a new NumberInput.parseBigIntegerWithRadix(char[], ...) overload that hands the slice straight to FastDoubleParser, eliminating the intermediate String allocation.

Cover offset=0 with both full-array and truncated-length cases, plus negative-sign inputs for the new parseBigIntegerWithRadix(char[], ...) overload.

Verify both useFastParser branches throw NumberFormatException on non-hex characters in the new parseBigIntegerWithRadix(char[], ...) overload.

…asterXML#707) Addresses two review comments on PR FasterXML#1613: - ParserBase.java:183 (cowtowncoder): hex literal handling is JSON-only, not shared by binary/text format backends that also extend ParserBase. - NonBlockingUtf8JsonParserBase.java:1810 (cowtowncoder): the per-parser _isHexDigit helper should live in a shared base for reuse. Promote JSON-only hex state to JsonParserBase --------------------------------------------- - _numberIsHex field and resetIntHex(...) move from ParserBase to JsonParserBase. ParserBase no longer references hex at all. - ParserBase.resetInt loses `final` so JsonParserBase can override it to clear _numberIsHex. resetFloat/resetAsNaN no longer touch the flag: _numberIsHex is only read in the VALUE_NUMBER_INT branch of _parseNumericValue/_parseIntValue, both reached only via resetInt or resetIntHex (both of which now correctly initialize the flag), so leaving it stale after a float/NaN reset is benign. Hoist duplicated per-parser helpers to JsonParserBase ----------------------------------------------------- - _isHexDigit(int) was duplicated four times across ReaderBasedJsonParser, UTF8StreamJsonParser, UTF8DataInputJsonParser, and NonBlockingUtf8JsonParserBase. Replaced with a single protected static on JsonParserBase. Char-stream parser's existing `_isHexDigit(char)` call site keeps working via implicit char->int widening. - "Hexadecimal number prefix '0X' must be followed by at least one hex digit (...)" error message was duplicated at the same four call sites. Extracted as protected static _hexPrefixNotFollowedMessage(char) on JsonParserBase. Net effect: -76 / +79 lines, identical externally observable behavior. Verified by full test suite (1702 pass, 2 pre-existing skips), with particular attention to NonStandardHexNumbers707Test and AsyncHexNumbers707Test.

) Addresses cowtowncoder's follow-up suggestion on PR FasterXML#1613: UTF8DataInputJsonParser.java:1218 (cowtowncoder): "Or better yet, check out what CharTypes.java already has; maybe charToHex() would be useful (gives -1 for non-hex digit)" CharTypes.charToHex(int) is a single masked array lookup that returns the digit value (0..15) for valid hex characters and -1 otherwise. That single primitive subsumes both helpers introduced for this feature: - _isHexDigit(int) -> CharTypes.charToHex(c) >= 0 - _hexDigit(char) -> CharTypes.charToHex(c) Both are removed; the four concrete parser scan loops and JsonParserBase._parseHexInt now go through CharTypes.charToHex directly, matching the pattern already used elsewhere in the JSON parsers for \\uXXXX escape decoding. Verified by full test suite (1702 pass, 2 pre-existing skips).

Two small review-feedback nits from FasterXML#1613: - ParserBase.resetInt was made non-final earlier in this branch so that JsonParserBase can override it to clear the hex flag. Add a comment noting the prior `final` modifier and why it was relaxed. - BigIntegerParser.parseWithFastParser(char[],int,int,int) is new in 3.2; add the `@since 3.2` Javadoc tag to match the existing convention.

@cowtowncoder

…error Per @cowtowncoder review on FasterXML#1613: when a number literal starts with `0x` / `0X` but `JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS` is not enabled, the user previously got a generic "unexpected character" error that did not mention the feature to enable. Centralize the feature check in a single `JsonParserBase._checkHexNumbersAllowed(int)` helper that throws a `StreamReadException` naming the feature, so the diagnostic is actionable: Unexpected character ('x') (code 120): hexadecimal number literals require enabling `JsonReadFeature.ALLOW_HEXADECIMAL_NUMBERS` All eight call sites across the four JSON parsers (Reader, UTF8 stream, UTF8 DataInput, non-blocking UTF8) now go through the helper, eliminating the duplicated `isEnabled(ALLOW_HEXADECIMAL_NUMBERS)` check at every parser-specific entry point. Existing feature-disabled tests are tightened to assert that the new error message includes the feature name in addition to the original "Unexpected character ('x'" prefix.

@cowtowncoder

Per @cowtowncoder review on FasterXML#1613: previously the maximum-integer-length constraint was applied only when `resetIntHex` was called -- i.e. after the full hex literal had been buffered. A pathological input with millions of digits would therefore allocate millions of chars before being rejected. Move the validation into the hex digit accumulation loop so it fires at every `_textBuffer` segment boundary (default first segment is 500 chars). With the default `maxNumberLength=1000`, a 10-million-digit hex literal is now caught at ~4 KiB rather than after fully buffering. Add `hexNumberLengthConstraint` to NonStandardHexNumbers707Test, exercising an 8000-digit literal with `maxNumberLength=100` against all sync modes; each backend must throw `StreamConstraintsException` with the standard "exceeds the maximum" message. The async path is exercised by the same validation primitive in `_finishHexDigits`.

seonwooj0810 · 2026-05-27T04:49:12Z

Quick update batch on this PR:

CLA: signed and sent — should be on file shortly (if not already).
Rebased onto 3.x post-Preserve explicit "+" sign across buffer boundaries for "+0..." on async parser #1614 merge. The MINOR_NUMBER_PLUSZERO=33 ordinal coordination you flagged is now consistent with upstream (HEX_PREFIX=34, HEX_DIGITS=35); the renumber commit (8e5a4b07) auto-dropped during rebase as its patch was already upstream. Local mvn test is clean (1706 tests, 0 failures).
Review items from your 03:15–03:28 round:
- ParserBase:401 (final comment), BigIntegerParser:40 (@since 3.2) — addressed in 7acfb5c
- JsonParserBase:229 (buffer-boundary length validation) — addressed in d1bf5c2
- ReaderBasedJsonParser:1589 (_checkHexNumbersAllowed helper) — addressed in 4e5945f
- JsonParserBase:368 (decode in TextBuffer) — left a per-thread reply leaning toward keeping this scoped out and revisiting as a follow-up given the PR-size concern; happy to file a tracking issue if you'd prefer.

Ready for another look when you have time.

cowtowncoder · 2026-05-27T18:31:26Z

CLA received.

cowtowncoder

LGTM: planning to merge later tonight.

github-actions · 2026-05-27T22:57:33Z

📈 Overall Code Coverage

Metric	Coverage	Change
Instructions	83.01%	📈 +0.000%
Branches	75.74%	📈 +0.030%

Overall project coverage from JaCoCo test results. Change values compare against the latest base branch build.

seonwooj0810 · 2026-05-28T00:27:16Z

Thanks @cowtowncoder and @pjfanning for the thorough review. Moving the hex bookkeeping into JsonParserBase, reusing CharTypes.charToHex(), and the buffer-boundary length check all made the PR meaningfully better. Really appreciate the time you both put in.

cowtowncoder · 2026-05-28T01:13:57Z

@seonwooj0810 Great collaboration, will get this merged tonight.

seonwooj0810 force-pushed the feat/707-allow-hex-numbers branch from 92f36fa to 0e27cba Compare May 26, 2026 12:00

pjfanning reviewed May 26, 2026

View reviewed changes

Comment thread src/main/java/tools/jackson/core/json/JsonParserBase.java Outdated

seonwooj0810 force-pushed the feat/707-allow-hex-numbers branch from ba2c3ea to 45e02b3 Compare May 26, 2026 13:48

pjfanning reviewed May 26, 2026

View reviewed changes

Comment thread src/test/java/tools/jackson/core/unittest/io/NumberInputTest.java Outdated

pjfanning reviewed May 26, 2026

View reviewed changes

Comment thread src/test/java/tools/jackson/core/unittest/read/NonStandardHexNumbers707Test.java