Validate that strings being parsed as integers consist of ASCII characters #2995
This is technically an incompatible change, although it is unlikely that anyone will be affected by it. It also fixes a minor security issue. Fixes google#2994.
    private void validateAscii(String s) throws MalformedJsonException {
      for (int i = 0; i < s.length(); i++) {
        if (s.charAt(i) > 127) {
          throw syntaxError("String contains non-ASCII characters: " + s);
Should this be more specific?
    - private void validateAscii(String s) throws MalformedJsonException {
    -   for (int i = 0; i < s.length(); i++) {
    -     if (s.charAt(i) > 127) {
    -       throw syntaxError("String contains non-ASCII characters: " + s);
    + private void validateAsciiNumber(String s) throws MalformedJsonException {
    +   for (int i = 0; i < s.length(); i++) {
    +     if (s.charAt(i) > 127) {
    +       throw syntaxError("Number contains non-ASCII characters: " + s);
I'm sort of inclined to leave it the way it is. It should be clear enough from the stack trace that we are talking about a number, and there's nothing specific to numbers in the method itself.
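For readers without the Gson sources at hand, the check under discussion can be sketched as a standalone class. This is only an illustration: the class name `AsciiCheckSketch` and the helper `isAscii` are hypothetical, and `IllegalArgumentException` stands in for the `MalformedJsonException` that the PR raises through `syntaxError`.

```java
// Hypothetical standalone sketch of the ASCII check from this PR.
// Assumption: IllegalArgumentException replaces Gson's MalformedJsonException,
// so that the example compiles without Gson on the classpath.
final class AsciiCheckSketch {
  private AsciiCheckSketch() {}

  /** Throws if {@code s} contains any char outside the 7-bit ASCII range. */
  static void validateAscii(String s) {
    for (int i = 0; i < s.length(); i++) {
      if (s.charAt(i) > 127) {
        throw new IllegalArgumentException("String contains non-ASCII characters: " + s);
      }
    }
  }

  /** Convenience wrapper (not in the PR): reports the result as a boolean. */
  static boolean isAscii(String s) {
    try {
      validateAscii(s);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isAscii("123")); // ASCII digits
    System.out.println(isAscii("\u0661\u0662\u0663")); // ARABIC-INDIC DIGIT ONE, TWO, THREE
  }
}
```

As the reply notes, nothing in the loop is specific to numbers; it only inspects `char` values, which is why the generic method name still fits.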
    }
    try {
      validateAscii(peekedString);
      long result = Long.parseLong(peekedString);
Side note: This still slightly deviates from JSON number syntax, e.g. it allows 00099 (leading 0) or +123 (leading +).
But I guess this is not a big issue?
(For Double#parseDouble used here and by JsonReader#nextDouble the difference is even more extreme in what that method permits compared to what the JSON specification normally permits for a number.)
Right. I already think that the issue we're addressing is unlikely to cause problems in practice. I think accepting these little variants here is even less likely to, whereas fixing them would have a nonzero chance of breaking people.
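The leniency mentioned in this thread can be observed with the JDK alone. The sketch below is independent of Gson; the class name `ParseLongLeniencySketch` and the helper `tryParse` are hypothetical and exist only for this illustration.

```java
// Hypothetical sketch: shows which inputs Long.parseLong accepts on its own,
// i.e. without the ASCII check that this PR adds in front of it.
final class ParseLongLeniencySketch {
  private ParseLongLeniencySketch() {}

  /** Returns the parsed value, or null if Long.parseLong rejects the input. */
  static Long tryParse(String s) {
    try {
      return Long.parseLong(s);
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(tryParse("00099")); // leading zeros: prints 99
    System.out.println(tryParse("+123")); // leading plus sign: prints 123
    // ARABIC-INDIC DIGIT ONE, TWO, THREE: prints 123, because parseLong
    // resolves digits through Character.digit, which knows Unicode digits.
    System.out.println(tryParse("\u0661\u0662\u0663"));
    System.out.println(tryParse("12a")); // not a number: prints null
  }
}
```

The first two inputs remain accepted after this PR, as discussed above; only the third kind is now rejected before `Long.parseLong` is reached.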
     * @throws MalformedJsonException if the next literal value is NaN or Infinity and this reader is
     *     not {@link #setStrictness(Strictness) lenient}.
     */
    public double nextDouble() throws IOException {
nextDouble is not affected, right? It seems it does not allow non-ASCII chars.
(It might be good to add a test for this though?)
On the other hand, the syntax Double#parseDouble permits is quite different from what the JSON syntax permits. Also, it allows leading and trailing whitespace and control characters. Not sure if that could be a problem as well.
I did add a small test for this in testMalformedNumbers.
Similar remark about the non-strict parsing here.
> I did add a small test for this in testMalformedNumbers.
Ah right, that one is reading it as unquoted string.
Oh I see, yes. I really just wanted to check that Double.parseDouble rejects non-ASCII digits. I actually did have a second new assertion for the string case, but that failed the round-trip verification and I didn't feel inclined to do the extra work.
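The points raised in this thread about `Double.parseDouble` can be checked with the JDK alone. The following sketch does not use Gson; the class name `ParseDoubleSyntaxSketch` and the helper `tryParse` are hypothetical names chosen for this illustration.

```java
// Hypothetical sketch: contrasts what Double.parseDouble accepts with what it
// rejects, to illustrate the remarks above. No Gson classes are involved.
final class ParseDoubleSyntaxSketch {
  private ParseDoubleSyntaxSketch() {}

  /** Returns the parsed value, or null if Double.parseDouble rejects the input. */
  static Double tryParse(String s) {
    try {
      return Double.parseDouble(s);
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    // Non-ASCII digits are rejected, unlike with Long.parseLong: prints null
    System.out.println(tryParse("\u0661\u0662\u0663"));
    // Leading and trailing whitespace and control characters are trimmed: prints 1.5
    System.out.println(tryParse("\t 1.5 \n"));
    // Forms outside the JSON number grammar are still accepted:
    System.out.println(tryParse("1d")); // type suffix: prints 1.0
    System.out.println(tryParse("0x1p3")); // hexadecimal floating point: prints 8.0
    System.out.println(tryParse("NaN")); // prints NaN
  }
}
```

This matches the conclusion of the thread: the non-ASCII digit problem does not arise for `nextDouble`, while the broader syntax differences are a separate matter.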
Suggested by @Marcono1234.
…ip ci]

Bumps [com.google.code.gson:gson](https://github.com/google/gson) from 2.13.2 to 2.14.0.

Release notes

*Sourced from [com.google.code.gson:gson's releases](https://github.com/google/gson/releases).*

> Gson 2.14.0
> -----------
>
> What's Changed
> --------------
>
> * Add type adapters for `java.time` classes by [`@eamonnmcmanus`](https://github.com/eamonnmcmanus) in [google/gson#2948](https://redirect.github.com/google/gson/pull/2948)
>
>   When the `java.time` API is available, Gson automatically can read and write instances of classes like `Instant` and `Duration`. The format it uses essentially freezes the JSON representation that `ReflectiveTypeAdapterFactory` established by default, based on the private fields of `java.time` classes. That's not a great representation, but it is understandable. Changing it to anything else would break compatibility with systems that are expecting the current format.
>
>   With this change, Gson no longer tries to access private fields of these classes using reflection. So it is no longer necessary to run with `--add-opens` for these classes on recent JDKs.
> * Remove `com.google.gson.graph` by [`@eamonnmcmanus`](https://github.com/eamonnmcmanus) in [google/gson#2990](https://redirect.github.com/google/gson/pull/2990).
>
>   This package was not part of any released artifact and depended on Gson internals in potentially problematic ways.
> * Validate that strings being parsed as integers consist of ASCII characters by [`@eamonnmcmanus`](https://github.com/eamonnmcmanus) in [google/gson#2995](https://redirect.github.com/google/gson/pull/2995)
>
>   Previously, strings could contain non-ASCII Unicode digits and still be parsed as integers. That's inconsistent with how JSON numbers are treated.
> * Fix duplicate key detection when first value is null by [`@andrewstellman`](https://github.com/andrewstellman) in [google/gson#3006](https://redirect.github.com/google/gson/pull/3006)
>
>   This could potentially break code that was relying on the incorrect behaviour. For example, this JSON string was previously accepted but will no longer be: `{"foo": null, "foo": bar}`.
> * Remove `Serializable` from internal `Type` implementation classes. by [`@eamonnmcmanus`](https://github.com/eamonnmcmanus) in [google/gson#3011](https://redirect.github.com/google/gson/pull/3011)
>
>   The nested classes `ParameterizedTypeImpl`, `GenericArrayTypeImpl`, and `WildcardTypeImpl` in `GsonTypes` are implementations of the corresponding types (without `Impl`) in `java.lang.reflect`. For some reason, they were serializable, even though the `java.lang.reflect` implementations are not. Having unnecessarily serializable classes could *conceivably* have been a security problem if they were part of a larger exploit using serialization. (We do not consider this a likely scenario and do not suggest that you need to update Gson just to get this change.)
> * Add `LegacyProtoTypeAdapterFactory`. by [`@eamonnmcmanus`](https://github.com/eamonnmcmanus) in [google/gson#3014](https://redirect.github.com/google/gson/pull/3014)
>
>   This is not part of any released artifact, but may be of use when trying to fix code that is currently accessing the internals of protobuf classes via reflection.
> * Make AppendableWriter do flush and close if delegation object supports by [`@MukjepScarlet`](https://github.com/MukjepScarlet) in [google/gson#2925](https://redirect.github.com/google/gson/pull/2925)
>
> Other less visible changes
> --------------------------
>
> * Add default capacity to EnumTypeAdapter maps by [`@MukjepScarlet`](https://github.com/MukjepScarlet) in [google/gson#2959](https://redirect.github.com/google/gson/pull/2959)
> * refactor: move derived adapters from Gson to TypeAdapters by [`@MukjepScarlet`](https://github.com/MukjepScarlet) in [google/gson#2951](https://redirect.github.com/google/gson/pull/2951)
> * Optimize `new Gson()` by [`@MukjepScarlet`](https://github.com/MukjepScarlet) in [google/gson#2864](https://redirect.github.com/google/gson/pull/2864)
>
> New Contributors
> ----------------
>
> * [`@ThirdGoddess`](https://github.com/ThirdGoddess) made their first contribution in [google/gson#2944](https://redirect.github.com/google/gson/pull/2944)
> * [`@lmj798`](https://github.com/lmj798) made their first contribution in [google/gson#2988](https://redirect.github.com/google/gson/pull/2988)
> * [`@Eng-YasminKotb`](https://github.com/Eng-YasminKotb) made their first contribution in [google/gson#3005](https://redirect.github.com/google/gson/pull/3005)
> * [`@andrewstellman`](https://github.com/andrewstellman) made their first contribution in [google/gson#3006](https://redirect.github.com/google/gson/pull/3006)
>
> **Full Changelog**: <google/gson@gson-parent-2.13.2...gson-parent-2.14.0>

Commits

* [`3ff35d6`](google/gson@3ff35d6) [maven-release-plugin] prepare release gson-parent-2.14.0
* [`a3024fd`](google/gson@a3024fd) Bump the maven group with 13 updates ([#3002](https://redirect.github.com/google/gson/issues/3002))
* [`5689ffe`](google/gson@5689ffe) Bump the github-actions group across 1 directory with 3 updates ([#3018](https://redirect.github.com/google/gson/issues/3018))
* [`48db33c`](google/gson@48db33c) Add `LegacyProtoTypeAdapterFactory`. ([#3014](https://redirect.github.com/google/gson/issues/3014))
* [`53d703e`](google/gson@53d703e) Update outdated comment regarding serializable types ([#3012](https://redirect.github.com/google/gson/issues/3012))
* [`0189b72`](google/gson@0189b72) Remove `Serializable` from internal `Type` implementation classes. ([#3011](https://redirect.github.com/google/gson/issues/3011))
* [`f4d371d`](google/gson@f4d371d) Fix duplicate key detection when first value is null ([#3006](https://redirect.github.com/google/gson/issues/3006))
* [`27d9ba1`](google/gson@27d9ba1) Fix typo in README (JPMS dependencies section) ([#3005](https://redirect.github.com/google/gson/issues/3005))
* [`1fa9b7a`](google/gson@1fa9b7a) Validate that strings being parsed as integers consist of ASCII characters (#...
* [`b7d5954`](google/gson@b7d5954) Add iterator fail-fast tests for LinkedTreeMap.clear() ([#2992](https://redirect.github.com/google/gson/issues/2992))
* Additional commits viewable in [compare view](google/gson@gson-parent-2.13.2...gson-parent-2.14.0)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
## Release notes

Sourced from com.google.code.gson:gson's releases.

Gson 2.14.0

What's Changed

* Add type adapters for java.time classes by @eamonnmcmanus in google/gson#2948

  When the java.time API is available, Gson automatically can read and write instances of classes like Instant and Duration. The format it uses essentially freezes the JSON representation that ReflectiveTypeAdapterFactory established by default, based on the private fields of java.time classes. That's not a great representation, but it is understandable. Changing it to anything else would break compatibility with systems that are expecting the current format.

  With this change, Gson no longer tries to access private fields of these classes using reflection. So it is no longer necessary to run with --add-opens for these classes on recent JDKs.
* Remove com.google.gson.graph by @eamonnmcmanus in google/gson#2990.

  This package was not part of any released artifact and depended on Gson internals in potentially problematic ways.
* Validate that strings being parsed as integers consist of ASCII characters by @eamonnmcmanus in google/gson#2995

  Previously, strings could contain non-ASCII Unicode digits and still be parsed as integers. That's inconsistent with how JSON numbers are treated.
* Fix duplicate key detection when first value is null by @andrewstellman in google/gson#3006

  This could potentially break code that was relying on the incorrect behaviour. For example, this JSON string was previously accepted but will no longer be: {"foo": null, "foo": bar}.
* Remove Serializable from internal Type implementation classes. by @eamonnmcmanus in google/gson#3011

  The nested classes ParameterizedTypeImpl, GenericArrayTypeImpl, and WildcardTypeImpl in GsonTypes are implementations of the corresponding types (without Impl) in java.lang.reflect. For some reason, they were serializable, even though the java.lang.reflect implementations are not. Having unnecessarily serializable classes could conceivably have been a security problem if they were part of a larger exploit using serialization. (We do not consider this a likely scenario and do not suggest that you need to update Gson just to get this change.)
* Add LegacyProtoTypeAdapterFactory. by @eamonnmcmanus in google/gson#3014

  This is not part of any released artifact, but may be of use when trying to fix code that is currently accessing the internals of protobuf classes via reflection.
* Make AppendableWriter do flush and close if delegation object supports by @MukjepScarlet in google/gson#2925

Other less visible changes

* Add default capacity to EnumTypeAdapter maps by @MukjepScarlet in google/gson#2959
* refactor: move derived adapters from Gson to TypeAdapters by @MukjepScarlet in google/gson#2951
* Optimize new Gson() by @MukjepScarlet in google/gson#2864

New Contributors

* @ThirdGoddess made their first contribution in google/gson#2944
* @lmj798 made their first contribution in google/gson#2988
* @Eng-YasminKotb made their first contribution in google/gson#3005
* @andrewstellman made their first contribution in google/gson#3006

Full Changelog: google/gson@gson-parent-2.13.2...gson-parent-2.14.0

Gson 2.13.2

The main changes in this release are just newer dependencies.

... (truncated)

## Changelog

Sourced from com.google.code.gson:gson's changelog.

Change Log

The change log for versions newer than 2.10 is available only on the GitHub Releases page.

Version 2.10

* Support for serializing and deserializing Java records, on Java ≥ 16. (google/gson#2201)
* Add JsonArray.asList and JsonObject.asMap view methods (google/gson#2225)
* Fix TypeAdapterRuntimeTypeWrapper not detecting reflective TreeTypeAdapter and FutureTypeAdapter (google/gson#1787)
* Improve JsonReader.skipValue() (google/gson#2062)
* Perform numeric conversion for primitive numeric type adapters (google/gson#2158)
* Add Gson.fromJson(..., TypeToken) overloads (google/gson#1700)
* Fix changes to GsonBuilder affecting existing Gson instances (google/gson#1815)
* Make JsonElement conversion methods more consistent and fix javadoc (google/gson#2178)
* Throw UnsupportedOperationException when JsonWriter.jsonValue is not supported (google/gson#1651)
* Disallow JsonObject Entry.setValue(null) (google/gson#2167)
* Fix TypeAdapter.toJson throwing AssertionError for custom IOException (google/gson#2172)
* Convert null to JsonNull for JsonArray.set (google/gson#2170)
* Fixed nullSafe usage. (google/gson#1555)
* Validate TypeToken.getParameterized arguments (google/gson#2166)
* Fix #1702: Gson.toJson creates CharSequence which does not implement toString (google/gson#1703)
* Prefer existing adapter for concurrent Gson.getAdapter calls (google/gson#2153)
* Improve ArrayTypeAdapter for Object[] (google/gson#1716)
* Improve AppendableWriter performance (google/gson#1706)

## Commits

* 3ff35d6 [maven-release-plugin] prepare release gson-parent-2.14.0
* a3024fd Bump the maven group with 13 updates (#3002)
* 5689ffe Bump the github-actions group across 1 directory with 3 updates (#3018)
* 48db33c Add LegacyProtoTypeAdapterFactory. (#3014)
* 53d703e Update outdated comment regarding serializable types (#3012)
* 0189b72 Remove Serializable from internal Type implementation classes. (#3011)
* f4d371d Fix duplicate key detection when first value is null (#3006)
* 27d9ba1 Fix typo in README (JPMS dependencies section) (#3005)
* 1fa9b7a Validate that strings being parsed as integers consist of ASCII characters (#
* b7d5954 Add iterator fail-fast tests for LinkedTreeMap.clear() (#2992)
* Additional commits viewable in compare view

Issue-ID: CIMAN-33
Signed-off-by: dependabot[bot] <support@github.com>
Change-Id: I3707533b94e77d221389af97baae133d90bdb985
GitHub-PR: #37
GitHub-Hash: dc2f307754a500e1
Signed-off-by: onap.gh2gerrit <releng+onap-gh2gerrit@linuxfoundation.org>
### What changes were proposed in this pull request?

This PR aims to upgrade `gson` to `2.14.0` for Apache Spark 5.

### Why are the changes needed?

To bring the latest bug fixes.

- https://github.com/google/gson/releases/tag/gson-parent-2.14.0 (2026-04-23)
- google/gson#3006
- google/gson#2995
- google/gson#2925
- google/gson#2864

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

Closes #56175 from dongjoon-hyun/SPARK-57122.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>