⚡️ Use TypeAdapter.validate_json instead of json.loads #15617
This requires bumping pydantic to 2.10.0 or above, because this issue with validate_json appears on versions below pydantic-core 2.24.1. But github-actions-bot does not allow modifying pyproject.toml (#13951), which is why the "lowest-direct" tests are failing in CI.
Merging this PR will improve performance by 24.76%
| | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| ⚡ | test_async_receiving_large_payload | 11.9 ms | 9.5 ms | +24.84% |
| ⚡ | test_sync_receiving_large_payload | 12.1 ms | 9.7 ms | +24.67% |
Comparing dolfinus:feature/pydanticv2-validate-json-new (dd1189a) with master (59d4a80) [1]
Footnotes
1. No successful run was found on master (dbfd55c) during the generation of this report, so 59d4a80 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
t0ugh-sys
left a comment
Code Review: Use TypeAdapter.validate_json instead of json.loads
This is a solid performance-oriented PR that follows the same pattern as #14962 (which replaced json.dumps with TypeAdapter.serialize_json). Here are my observations:
👍 What's Good
- Performance improvement is well-motivated. The benchmark data shows ~15% RPS improvement (2228 → 2570) by avoiding the double-parsing overhead of `json.loads` followed by `validate_python`. Using `validate_json` lets Pydantic parse and validate in a single pass, which is a meaningful win for JSON-heavy workloads.
- Clean separation of `is_body_json` flag. Threading the `is_body_json` boolean through `solve_dependencies` → `request_body_to_args` is a clean way to defer JSON parsing until validation time, rather than eagerly calling `request.json()` in routing.
- Graceful multi-field fallback. The PR correctly handles the case where multiple `Body()` fields exist, falling back to `json.loads` since there's no single `TypeAdapter` to validate against. The inline comment makes this limitation clear.
- Error handling preservation. The `parse_json` helper constructs error dicts that match Pydantic's error format (`json_invalid` type, `loc`, `ctx` with `pos`/`lineno`/`colno`), maintaining backward compatibility for API consumers.
- The `_validate_json_body_as_model_field` error post-processing is clever: it detects when Pydantic returns `bytes` in the `input` field and re-parses it to a dict, which is more user-friendly in error responses.
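To make the error-format point concrete, here is a minimal stdlib-only sketch of how a decode failure could be mapped to a Pydantic-style error dict. The names `json_error_dict` and `parse_body` are hypothetical and only illustrate the shape described above (`json_invalid` type, `loc` of `("body",)`, position details in `ctx`); the PR's actual `parse_json` helper may differ.

```python
import json
from typing import Any, Dict, List, Tuple


def json_error_dict(exc: json.JSONDecodeError) -> Dict[str, Any]:
    """Hypothetical helper: describe a stdlib decode error in a Pydantic-like shape."""
    return {
        "type": "json_invalid",
        "loc": ("body",),  # the position is NOT part of the location path
        "msg": f"Invalid JSON: {exc.msg}",
        "ctx": {"pos": exc.pos, "lineno": exc.lineno, "colno": exc.colno},
    }


def parse_body(raw: bytes) -> Tuple[Any, List[Dict[str, Any]]]:
    """Return (parsed value, errors); exactly one of them is meaningful."""
    try:
        return json.loads(raw), []
    except json.JSONDecodeError as exc:
        return None, [json_error_dict(exc)]


value, errors = parse_body(b"{title: 1}")  # unquoted key, so decoding fails
print(errors[0]["loc"], errors[0]["ctx"])
```

Keeping the position in `ctx` means clients that match on `loc` see the same path for every malformed body.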
⚠️ Potential Issues & Questions
- Breaking change in error response format for invalid JSON. The `body` field in validation error responses changes from a parsed dict to a JSON string (e.g., `'{"title": "towel", "size": "XL"}'`). This is a user-visible behavioral change that could break clients parsing error responses. It's documented in the updated tutorial docs, but worth noting: consumers relying on `body` being a dict will need to update.
- Error message inconsistency for single vs. multi-field routes. For a single non-embedded field, invalid JSON errors go through `_validate_json_body_as_model_field` → Pydantic's validator (e.g., `"msg": "Invalid JSON: key must be a string..."`). For multi-field routes, they go through `parse_json` → stdlib's message format (e.g., `"msg": "Invalid JSON: Expecting property name enclosed in double quotes"`). The test at `test_tutorial002.py` confirms this: the messages differ. This inconsistency could confuse users.
- `loc` tuple change for JSON decode errors. Previously, the error `loc` was `("body", e.pos)` (with the byte position). Now it's just `("body",)` and the position info is moved to `ctx`. This is arguably better (position is context, not a location path), but it's another user-visible change.
- The `_validate_json_body_as_model_field` always calls `json.loads` on error input. If the body is large and validation fails, this re-parses the bytes just to populate the `input` field. For large payloads with validation errors, this could negate some of the performance gain on the error path. Consider whether displaying raw bytes (or truncating) might be acceptable instead.
- Removed `values` parameter from `_validate_value_with_model_field`. The `values` arg was removed from `_validate_value_with_model_field` and from `field.validate()` calls. This is a good cleanup if unused, but I'd want to confirm that the `validate` method signature in `_compat/v2.py` truly doesn't use the `values` parameter for anything (e.g., for validators that depend on sibling fields).
- No content-type + no strict → always JSON. The logic `elif not actual_strict_content_type: is_body_json = True` means that when there's no `Content-Type` header and strict mode is off, the body is always treated as JSON. This preserves the existing behavior, but worth noting that this is a policy decision.
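The content-type policy in the last point can be sketched as a small decision function. This is an illustration under assumptions, not the PR's code: the function name `is_json_body` is hypothetical, and the branch for a present header (treating `application/json` and `application/*+json` as JSON) is my assumption; only the "no header and not strict means JSON" branch comes from the logic quoted above.

```python
from email.message import Message
from typing import Optional


def is_json_body(content_type: Optional[str], strict_content_type: bool) -> bool:
    """Hypothetical sketch: decide whether a request body is treated as JSON."""
    if content_type:
        # Assumption: parse the header and accept JSON-like application subtypes.
        message = Message()
        message["content-type"] = content_type
        if message.get_content_maintype() != "application":
            return False
        subtype = message.get_content_subtype()
        return subtype == "json" or subtype.endswith("+json")
    # No Content-Type header: JSON unless strict mode is on (the policy noted above).
    return not strict_content_type


print(is_json_body(None, strict_content_type=False))
print(is_json_body(None, strict_content_type=True))
```

Writing the policy as one function makes the header-less case an explicit, testable branch.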
💡 Suggestions
- Consider unifying error message formats. Could `parse_json()` reuse Pydantic's JSON parser internally (e.g., `pydantic_core.from_json`) to get consistent error messages for both single and multi-field paths?
- Add a comment or docstring to `_validate_json_body_as_model_field` explaining why the `input` bytes → dict conversion is necessary, for future maintainers.
- The `IsOneOf` workaround in `test_handling_errors` for httpx compact JSON is pragmatic, but it might be worth tracking as a follow-up to pin the expected format once httpx stabilizes.
Overall, this is a well-executed PR with clear performance benefits and thoughtful handling of edge cases. The main concern is the backward-incompatible change in error response format (body becoming a string), which should be highlighted in release notes if merged. 🚀
golikovichev
left a comment
The direct validate_json path avoids the double pass of json.loads followed by model validation, so the intent makes sense. Two things I would want to confirm before this lands, plus one question.
- Error body shape change. The docs diff shows the validation error body going from a parsed object to the raw JSON string, with an added input field. That is a user-visible change for anyone whose exception handler or tests read exc.body as a dict. It may well be more correct, but it is a breaking change for error-handling code, so it is worth calling out in the PR description and the release notes.
- solve_dependencies signature. The body parameter is narrowed from dict[str, Any] | FormData | bytes | None to bytes | FormData | None, and a new is_body_json flag is added. solve_dependencies gets imported and sometimes wrapped downstream. If it is treated as semi-public, the narrowed type plus the new keyword could surprise callers. Would a compatible default cover that?
- Benchmark. The PR is tagged as a performance change. Do you have before and after numbers on a representative payload? Even a small benchmark on a nested model would make the perf claim concrete.
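To illustrate what I mean by a compatible default, here is a rough sketch. `resolve_body` is a hypothetical stand-in, not the real solve_dependencies, and `json.loads` stands in for the deferred validation step; the point is only that a keyword-only flag defaulting to False, plus tolerance for an already-parsed dict, would keep existing callers working.

```python
import json
from typing import Any, Dict, Optional, Union


def resolve_body(
    body: Optional[Union[bytes, Dict[str, Any]]],
    *,
    is_body_json: bool = False,  # default keeps the pre-PR call sites valid
) -> Any:
    """Hypothetical stand-in for the body handling inside solve_dependencies."""
    if isinstance(body, dict):
        # Legacy callers that still pass an already-parsed body are passed through.
        return body
    if is_body_json and body is not None:
        # New-style callers pass raw bytes and opt in to JSON handling.
        return json.loads(body)
    return body


print(resolve_body({"a": 1}))                       # old call style
print(resolve_body(b'{"a": 1}', is_body_json=True))  # new call style
```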
Happy to test a branch build against a few real request shapes if that helps.
#14962 replaced `json.dumps` with pydantic v2 `TypeAdapter.serialize_json()`. This is the same, but for body parsing: replace `json.loads` with `TypeAdapter.validate_json()`.
See also #13949
Local tests give a +5-10% performance boost. The larger the request body, the larger the speedup.
Caveats: if a route handler contains multiple fields with a `Body()` annotation, the JSON will be parsed for each field. Changing this requires a substantial rewrite of body parsing, e.g. merging all body fields into one pydantic model and then calling its `model_validate_json()` method to parse the request body once.
I'm not ready for this type of change, so this can be considered a proof of concept.
Small benchmark
requirements.txt
app.py
locustfile.py
Before - 2228 RPS,
![Screenshot_20260529_141841]()
`json.loads` took 9.8% and `type_adapter.validate_python` took 1% = 11% combined
fastapi_with_json_loads.tar.gz
After - 2570 RPS,
![Screenshot_20260529_141850]()
`type_adapter.validate_json` took 7.17%
fastapi_with_typeadapter_json.tar.gz
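For reference, the relative throughput change implied by the two runs above can be computed directly. This is plain arithmetic on the reported RPS figures; `relative_gain` is a name introduced here for illustration only.

```python
def relative_gain(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


rps_gain = relative_gain(2228, 2570)
print(f"RPS gain: {rps_gain:.2f}%")
```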