Shrink additional parser AST collections #25465
Memory usage report
Summary
Significant changes (detailed breakdown): flake8, trio, sphinx, prefect
Merging this PR will improve performance by 6.52%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Memory | parser[pydantic/types.py] | 368.5 KB | 322.7 KB | +14.2% |
| ⚡ | Memory | parser[unicode/pypinyin.py] | 52.6 KB | 46.5 KB | +13.15% |
| ⚡ | Memory | parser[numpy/globals.py] | 16 KB | 14.3 KB | +12.44% |
| ⚡ | Memory | parser[numpy/ctypeslib.py] | 166.5 KB | 150.8 KB | +10.38% |
| ⚡ | Memory | ty_micro[pandas_tdd] | 39.4 MB | 36.4 MB | +8.06% |
| ⚡ | Memory | ty_micro[very_large_tuple] | 13.3 MB | 12.5 MB | +6.65% |
| ⚡ | Memory | ty_micro[gradual_vararg_call] | 13.1 MB | 12.3 MB | +6.63% |
| ⚡ | Memory | ty_micro[vararg_parameter_type_accumulation] | 12 MB | 11.3 MB | +6.34% |
| ⚡ | Memory | ty_micro[complex_constrained_attributes_2] | 13.5 MB | 12.7 MB | +6.09% |
| ⚡ | Memory | ty_micro[complex_constrained_attributes_1] | 13.5 MB | 12.7 MB | +6.08% |
| ⚡ | Memory | ty_micro[many_tuple_assignments] | 14 MB | 13.2 MB | +5.87% |
| ⚡ | Memory | ty_micro[literal_match_fallthrough] | 12.4 MB | 11.7 MB | +5.81% |
| ⚡ | Memory | parser[large/dataset.py] | 816.8 KB | 772.2 KB | +5.78% |
| ⚡ | Memory | ty_micro[many_string_assignments] | 14.6 MB | 13.8 MB | +5.73% |
| ⚡ | Memory | DateType | 22.9 MB | 21.6 MB | +5.7% |
| ⚡ | Memory | ty_micro[many_enum_members_2] | 14.6 MB | 13.8 MB | +5.67% |
| ⚡ | Memory | ty_micro[large_isinstance_narrowing] | 14.5 MB | 13.7 MB | +5.65% |
| ⚡ | Memory | ty_micro[complex_constrained_attributes_3] | 15 MB | 14.2 MB | +5.57% |
| ⚡ | Memory | ty_micro[many_tuple_assignments] | 12.9 MB | 12.2 MB | +5.56% |
| ⚡ | Memory | ty_micro[typevar_mapping_small_accumulations] | 14.8 MB | 14 MB | +5.54% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
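The Efficiency column appears to express the gain relative to HEAD rather than to BASE. The sketch below is an inference from the displayed rows, not a documented CodSpeed formula; `efficiency_pct` is a hypothetical helper name. Because the table shows rounded BASE and HEAD values, recomputed percentages can differ from the reported ones in the second decimal.

```rust
/// Hypothetical helper: relative gain computed as (BASE / HEAD - 1) * 100.
/// Assumption: this matches how the Efficiency column is derived.
fn efficiency_pct(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}

fn main() {
    // parser[pydantic/types.py]: 368.5 KB on main, 322.7 KB on this branch.
    let gain = efficiency_pct(368.5, 322.7);
    println!("{gain:.1}%"); // prints 14.2%
}
```

Measured against BASE instead, the same row would be a reduction of roughly 12.4%, so the two conventions should not be mixed when comparing rows.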
Comparing micha/parser-shrink-more-collections (d8914fe) with main (06dd02b)
Are the numbers the total reduction across all these repositories, or an average reduction?
dhruvmanila left a comment
Seems reasonable, thanks!
It's the total reduction across all repositories.
Summary
Shrink the containers for more AST node fields during parsing to reduce AST memory consumption.
This PR adds more `shrink_to_fit` calls to the parser. It still doesn't cover all nodes; I focused on the nodes for which we see the biggest memory improvements:

- `Parameters.args`
- `StmtImportFrom.names`
- `ExprTuple.elts`
- `StmtFunctionDef.decorator_list`
- `ExprDict.items`
- `ExprList.elts`
- `StmtWith.items`
- `StmtImport.names`
- `StmtIf.elif_else_clauses`
- `ExprListComp.generators`
- `ExprBoolOp.values`
- `ExprGenerator.generators`
- `StmtTry.handlers`
- `Parameters.kwonlyargs`
- `Parameters.posonlyargs`
- `StmtClassDef.decorator_list`
- `ExprDictComp.generators`
- `Comprehension.ifs`
- `ExprSetComp.generators`

I focused on nodes where the saving was at least one MiB, with the exception of `ExprSetComp`, which shares its parsing with `ExprListComp`.

The savings were measured by parsing all Python files in pytest, typeshed, home-assistant-core, anyio, mypy, fastapi, pandas, flake8, trio, sphinx, and prefect.
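For readers unfamiliar with the technique, here is a minimal sketch of why `shrink_to_fit` saves memory. It is not the parser's actual code: `FunctionDef`, `parse_decorators`, and `unused_bytes` are hypothetical stand-ins. A `Vec` filled by repeated `push` calls usually ends up with more capacity than elements, and `Vec::shrink_to_fit` asks the allocator to release that spare capacity (the standard library allows some excess to remain).

```rust
use std::mem::size_of;

/// Hypothetical stand-in for an AST node with a growable field.
struct FunctionDef {
    decorator_list: Vec<String>,
}

/// Bytes reserved by the vector's buffer that do not hold an element.
fn unused_bytes<T>(v: &Vec<T>) -> usize {
    (v.capacity() - v.len()) * size_of::<T>()
}

/// Collects items the way a parser might (push as they are seen),
/// then trims the buffer before storing it in the node.
fn parse_decorators(names: &[&str]) -> FunctionDef {
    let mut decorator_list = Vec::new();
    for name in names {
        decorator_list.push(name.to_string());
    }
    // Release spare capacity left over from amortized growth.
    decorator_list.shrink_to_fit();
    FunctionDef { decorator_list }
}

fn main() {
    let node = parse_decorators(&["staticmethod", "cache", "override"]);
    println!(
        "len = {}, capacity = {}, unused = {} bytes",
        node.decorator_list.len(),
        node.decorator_list.capacity(),
        unused_bytes(&node.decorator_list),
    );
}
```

The per-node saving is small, but an AST contains a very large number of such vectors, which is presumably why the totals across whole repositories add up.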
Performance
This PR introduces a small performance regression because the `shrink_to_fit` calls often require reallocating the collections. The ty project walltime benchmarks and the formatter and linter benchmarks range from neutral to -1%. The parser benchmarks range from -1% to -4%, which is roughly what I'd expect