@norberttech
Member

Change Log


Added

Fixed

  • Performance bottlenecks in Thrift, addressed by overwriting implementations with more modern code
  • Clean up Chunk builders after flushing content to the stream

Changed

  • Moved thrift generated models to new namespace in parquet library

Removed

Deprecated

Security

This is a result of my initial investigation of #1857.
I'm not yet fully sure what's happening there, but even if this is not a full fix, it is definitely a step in the right direction.

@codecov

codecov bot commented Sep 24, 2025

Codecov Report

❌ Patch coverage is 95.09804% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.63%. Comparing base (0e140a5) to head (0d75f65).
⚠️ Report is 2 commits behind head on 1.x.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##              1.x    #1867      +/-   ##
==========================================
- Coverage   82.53%   77.63%   -4.90%     
==========================================
  Files         772      824      +52     
  Lines       21808    25227    +3419     
==========================================
+ Hits        17999    19586    +1587     
- Misses       3809     5641    +1832     
Components Coverage Δ
etl 89.25% <ø> (ø)
cli 85.91% <ø> (ø)
lib-array-dot 94.56% <ø> (ø)
lib-azure-sdk 61.35% <ø> (ø)
lib-doctrine-dbal-bulk 95.59% <ø> (ø)
lib-filesystem 80.25% <ø> (ø)
lib-types 53.55% <ø> (ø)
lib-parquet 68.75% <97.00%> (-16.75%) ⬇️
lib-parquet-viewer 83.11% <ø> (ø)
lib-snappy 90.23% <ø> (ø)
bridge-filesystem-async-aws 90.38% <ø> (ø)
bridge-filesystem-azure 89.92% <ø> (ø)
bridge-monolog-http 97.04% <ø> (ø)
bridge-openapi-specification 94.52% <ø> (ø)
symfony-http-foundation 74.41% <ø> (ø)
adapter-chartjs 86.70% <ø> (ø)
adapter-csv 88.85% <ø> (ø)
adapter-doctrine 91.21% <ø> (ø)
adapter-elasticsearch 97.23% <ø> (ø)
adapter-google-sheet 91.66% <ø> (ø)
adapter-http 58.10% <ø> (ø)
adapter-json 87.98% <ø> (ø)
adapter-logger 53.84% <ø> (ø)
adapter-meilisearch 97.95% <ø> (ø)
adapter-parquet 78.40% <0.00%> (-0.53%) ⬇️
adapter-text 84.44% <ø> (ø)
adapter-xml 82.86% <ø> (ø)

@github-actions
Contributor

github-actions bot commented Sep 24, 2025

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
| benchmark             | subject                | revs | its | mem_peak        | mode             | rstdev          |
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
| CSVExtractorBench     | bench_extract_10k      | 1    | 3   | 4.956mb +0.08%  | 433.127ms -0.69% | ±1.03% +94.16%  |
| ExcelExtractorBench   | bench_extract_10k_ods  | 1    | 3   | 66.215mb +0.01% | 1.088s -0.35%    | ±0.16% -77.70%  |
| ExcelExtractorBench   | bench_extract_10k_xlsx | 1    | 3   | 68.326mb +0.01% | 1.720s +0.56%    | ±0.27% -52.91%  |
| JsonExtractorBench    | bench_extract_10k      | 1    | 3   | 5.492mb -1.37%  | 1.180s +0.03%    | ±0.63% -67.04%  |
| ParquetExtractorBench | bench_extract_10k      | 1    | 3   | 10.788mb -0.67% | 9.514s -19.43%   | ±0.68% +339.21% |
| TextExtractorBench    | bench_extract_10k      | 1    | 3   | 4.682mb +0.09%  | 61.893ms +0.79%  | ±0.40% -16.85%  |
| XmlExtractorBench     | bench_extract_10k      | 1    | 3   | 4.669mb +0.09%  | 625.232ms -0.96% | ±1.50% +124.73% |
+-----------------------+------------------------+------+-----+-----------------+------------------+-----------------+
Transformers
+---------------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
| benchmark                       | subject                  | revs | its | mem_peak         | mode            | rstdev          |
+---------------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
| RenameEachEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 18.687mb +0.02%  | 72.636ms -0.03% | ±0.45% -81.84%  |
| RenameEntryTransformerBench     | bench_transform_10k_rows | 1    | 3   | 123.490mb +0.00% | 65.845ms +1.42% | ±1.01% +357.35% |
+---------------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev          |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 62.781mb +0.01%  | 89.672ms -0.39%  | ±0.69% -52.21%  |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 80.702mb +0.01%  | 101.965ms +0.19% | ±0.56% +211.29% |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 819.384mb -3.14% | 20.031s -26.28%  | ±0.63% -8.76%   |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.982mb +0.02%  | 34.393ms +0.43%  | ±0.42% -81.04%  |
+--------------------+----------------+------+-----+------------------+------------------+-----------------+
Building Blocks
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark         | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 42.608mb +0.01%  | 411.641ms +2.38% | ±0.46% +86.89%  |
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 11.665mb +0.04%  | 81.530ms -2.11%  | ±0.37% -59.55%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 106.085mb +0.00% | 648.738ms -0.88% | ±0.80% +41.10%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 55.363mb +0.01%  | 334.112ms +2.49% | ±0.80% +1.83%   |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.949mb +0.03%  | 68.928ms -2.70%  | ±0.77% -57.73%  |
| RowsBench         | bench_chunk_10_on_10k      | 2    | 3   | 93.553mb +0.00%  | 3.503ms +5.55%   | ±1.88% +115.35% |
| RowsBench         | bench_diff_left_1k_on_10k  | 2    | 3   | 110.943mb +0.00% | 237.366ms -1.34% | ±0.19% -78.17%  |
| RowsBench         | bench_diff_right_1k_on_10k | 2    | 3   | 93.663mb +0.00%  | 23.857ms -0.25%  | ±0.30% -31.36%  |
| RowsBench         | bench_drop_1k_on_10k       | 2    | 3   | 94.428mb +0.00%  | 1.405ms +4.53%   | ±3.78% +717.19% |
| RowsBench         | bench_drop_right_1k_on_10k | 2    | 3   | 94.428mb +0.00%  | 1.349ms -2.95%   | ±1.21% -55.57%  |
| RowsBench         | bench_entries_on_10k       | 2    | 3   | 92.589mb +0.00%  | 3.357ms -1.55%   | ±1.35% -17.33%  |
| RowsBench         | bench_filter_on_10k        | 2    | 3   | 93.117mb +0.00%  | 16.034ms +5.15%  | ±1.40% +30.20%  |
| RowsBench         | bench_find_on_10k          | 2    | 3   | 93.117mb +0.00%  | 15.302ms +1.74%  | ±1.58% +271.47% |
| RowsBench         | bench_find_one_on_10k      | 10   | 3   | 91.806mb +0.00%  | 1.706μs -9.92%   | ±2.72% +7.69%   |
| RowsBench         | bench_first_on_10k         | 10   | 3   | 91.806mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench         | bench_flat_map_on_1k       | 2    | 3   | 100.867mb +0.00% | 14.240ms -5.00%  | ±0.39% +42.73%  |
| RowsBench         | bench_map_on_10k           | 2    | 3   | 130.294mb +0.00% | 64.966ms -6.20%  | ±1.91% +87.47%  |
| RowsBench         | bench_merge_1k_on_10k      | 2    | 3   | 93.637mb +0.00%  | 1.227ms -11.60%  | ±3.66% +518.02% |
| RowsBench         | bench_partition_by_on_10k  | 2    | 3   | 97.025mb +0.00%  | 60.618ms -3.49%  | ±1.09% +33.18%  |
| RowsBench         | bench_remove_on_10k        | 2    | 3   | 94.690mb +0.00%  | 3.518ms +2.58%   | ±1.55% -21.55%  |
| RowsBench         | bench_sort_asc_on_1k       | 2    | 3   | 92.187mb +0.00%  | 39.688ms +0.60%  | ±0.75% -22.31%  |
| RowsBench         | bench_sort_by_on_1k        | 2    | 3   | 92.187mb +0.00%  | 39.727ms -0.37%  | ±0.20% -63.17%  |
| RowsBench         | bench_sort_desc_on_1k      | 2    | 3   | 92.187mb +0.00%  | 40.006ms +0.20%  | ±0.65% -53.28%  |
| RowsBench         | bench_sort_entries_on_1k   | 2    | 3   | 94.249mb +0.00%  | 8.012ms -0.06%   | ±0.85% -60.71%  |
| RowsBench         | bench_sort_on_1k           | 2    | 3   | 91.999mb +0.00%  | 29.538ms -0.47%  | ±1.15% -17.90%  |
| RowsBench         | bench_take_1k_on_10k       | 10   | 3   | 91.806mb +0.00%  | 14.480μs +3.02%  | ±1.17% -55.21%  |
| RowsBench         | bench_take_right_1k_on_10k | 10   | 3   | 91.806mb +0.00%  | 16.157μs +1.44%  | ±2.06% -32.60%  |
| RowsBench         | bench_unique_on_1k         | 2    | 3   | 110.943mb +0.00% | 241.597ms +1.13% | ±0.80% +42.11%  |
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+
Parquet Library
+--------------------+---------------------------------+------+-----+------------------+-------------------+-----------------+
| benchmark          | subject                         | revs | its | mem_peak         | mode              | rstdev          |
+--------------------+---------------------------------+------+-----+------------------+-------------------+-----------------+
| ParquetReaderBench | bench_page_headers              | 1    | 3   | 6.989mb -3.92%   | 1.863s -44.92%    | ±0.40% -10.21%  |
| ParquetReaderBench | bench_read_metadata             | 1    | 3   | 5.444mb -0.63%   | 8.029ms -56.23%   | ±0.96% +363.81% |
| ParquetReaderBench | bench_read_schema               | 1    | 3   | 5.444mb -0.63%   | 8.058ms -55.96%   | ±0.58% -29.20%  |
| ParquetReaderBench | bench_read_values_all_columns   | 1    | 3   | 9.260mb -1.39%   | 5.579s -29.43%    | ±1.01% +246.73% |
| ParquetReaderBench | bench_read_values_single_column | 1    | 3   | 6.491mb -0.98%   | 214.315ms -53.00% | ±0.21% -58.88%  |
| ParquetReaderBench | bench_read_values_with_limit    | 1    | 3   | 7.075mb -1.87%   | 18.457ms -45.32%  | ±0.26% +115.67% |
| ParquetWriterBench | bench_write_batch               | 1    | 3   | 11.878mb -14.94% | 194.633ms -13.39% | ±0.66% +66.92%  |
| ParquetWriterBench | bench_write_gzip                | 1    | 3   | 10.503mb -0.88%  | 218.954ms -1.00%  | ±0.76% +27.31%  |
| ParquetWriterBench | bench_write_row_by_row          | 1    | 3   | 11.878mb -14.94% | 192.223ms -13.65% | ±1.04% +66.49%  |
| ParquetWriterBench | bench_write_snappy              | 1    | 3   | 11.878mb -14.94% | 192.819ms -13.01% | ±0.37% -41.15%  |
| ParquetWriterBench | bench_write_uncompressed        | 1    | 3   | 10.124mb -0.20%  | 192.310ms +0.19%  | ±0.69% -15.02%  |
+--------------------+---------------------------------+------+-----+------------------+-------------------+-----------------+

@norberttech norberttech force-pushed the chore/thrift-performance branch from 4959309 to 8f7d720 Compare September 24, 2025 20:49
@norberttech norberttech merged commit 446df3b into 1.x Sep 24, 2025
24 of 25 checks passed
@norberttech norberttech deleted the chore/thrift-performance branch September 24, 2025 21:31