Reduce memory usage by dropping token-excess capacity and improve performance by approximating the initial tokens vec size#25354
Conversation
|
|
e02394f to
5879173
Compare
Memory usage reportSummary
Significant changesClick to expand detailed breakdownflake8
trio
sphinx
prefect
|
|
I think another issue here is that the Edit: tried it in #25382, it seems to at least fix the linter benchmark |
|
(Oh nice, that's a good catch.) |
|
I updated the heuristic to reduce the instances where we allocate too much storage. I really like codex explanation here, which is why I copy paste some of it:
|
…formance by approximating the initial tokens vec size (astral-sh#25354) Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
Summary
This PR plugs the existing
allocate_tokens_vec(contents)heuristic into our parser. The idea ofallocate_tokens_vecis to approximate the size of the allocations Vec based on the source text size instead of starting from an empty Vec, to reduce the number of regrows during lexing. Reducing the allocations helps with performance. However, this isn't a zero-cost optimization because it increases memory usage when we over-approximate. This PR adds ashrink_to_fitcall once parsing's finished to mitigate the memory usage increase to peak-memory usage only. However,shrink_to_fitalso isn't free; it can require a reallocation, depending on the amount of excess capacity. However, callingshrink_to_fitshows that it does reduce ty's memory usage.The parser memory usage benchmarks look very concerning. I think they're a bit overdramatic. What they show is that peak memory usage increases, which is accurate. However, for most benchmarks, the number of allocations reduces. Which includes many ty full-project runs, which I consider more meaningful.
For most benchmarks, performance increases by 1-2%. There are very few for which performance decreases by 1-2%. There are two parser benchmarks that stand out: One for which performance increases by 10% and one for which performance decreases by 5%.
Overall, I'm leaning towards landing this change, because it allows us to reduce memory usage in ty, while still suggesting a small perf improvement over all.