Conversation

@norberttech
Member

Change Log

Added

  • codeowners definition

Fixed

Changed

  • When displaying column statistics, Analyze takes the column type only from the schema, when one is available

Removed

  • Entry object from column statistics; use a reference instead

Deprecated

Security


Description

@norberttech norberttech linked an issue Mar 29, 2025 that may be closed by this pull request
@norberttech norberttech requested a review from Copilot March 29, 2025 20:25
Contributor

Copilot AI left a comment

Pull Request Overview

This PR removes the entry object from column statistics and directs users to use a reference instead. It also updates the pull request template to enforce a changelog format and includes a codeowners definition.

  • Updated pull request template comments with explicit instructions on changelog maintenance.
  • Added codeowners definition.
Files not reviewed (4)
  • .github/CODEOWNERS: Language not supported
  • src/cli/src/Flow/CLI/Formatter/PipelineReportFormatter.php: Language not supported
  • src/cli/tests/Flow/CLI/Tests/Integration/FileAnalyzeCommandTest.php: Language not supported
  • src/core/etl/src/Flow/ETL/Dataset/Statistics/Column.php: Language not supported

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@github-actions
Contributor

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak        | mode             | rstdev          |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 4.883mb +0.01%  | 614.470ms +0.47% | ±1.70% +156.27% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 4.957mb +0.01%  | 1.116s -1.88%    | ±0.62% +128.98% |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 86.470mb +0.00% | 905.532ms +1.71% | ±1.14% +453.81% |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.606mb +0.01%  | 38.754ms +2.24%  | ±0.77% +63.06%  |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.581mb +0.01%  | 601.834ms -1.05% | ±0.87% -21.40%  |
+-----------------------+-------------------+------+-----+-----------------+------------------+-----------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev          |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 127.405mb +0.00% | 70.026ms -0.18% | ±0.72% +111.36% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+-----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev         |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 64.050mb +0.00%  | 101.713ms -1.03% | ±0.24% -35.88% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 84.432mb +0.00%  | 97.858ms -1.12%  | ±0.55% +75.10% |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 161.284mb +0.00% | 20.818s +0.28%   | ±0.50% +28.19% |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 18.138mb +0.00%  | 31.057ms -0.03%  | ±0.08% -87.82% |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
Building Blocks
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark         | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 43.920mb +0.00%  | 361.726ms -0.25% | ±0.47% -43.56%  |
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 11.730mb +0.00%  | 73.968ms +0.97%  | ±0.31% -22.63%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 106.034mb +0.00% | 524.071ms +2.82% | ±1.29% +57.35%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 55.224mb +0.00%  | 263.525ms +0.52% | ±0.27% -52.21%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.746mb +0.00%  | 57.811ms +1.90%  | ±0.54% -14.45%  |
| RowsBench         | bench_chunk_10_on_10k      | 2    | 3   | 97.067mb +0.00%  | 3.658ms +6.22%   | ±2.69% -16.31%  |
| RowsBench         | bench_diff_left_1k_on_10k  | 2    | 3   | 114.423mb +0.00% | 181.733ms -0.55% | ±0.59% +30.06%  |
| RowsBench         | bench_diff_right_1k_on_10k | 2    | 3   | 97.143mb +0.00%  | 18.167ms -0.94%  | ±0.89% +4.71%   |
| RowsBench         | bench_drop_1k_on_10k       | 2    | 3   | 97.942mb +0.00%  | 1.765ms +6.92%   | ±2.08% -47.17%  |
| RowsBench         | bench_drop_right_1k_on_10k | 2    | 3   | 97.942mb +0.00%  | 1.694ms -11.42%  | ±0.59% -61.87%  |
| RowsBench         | bench_entries_on_10k       | 2    | 3   | 96.103mb +0.00%  | 4.932ms +4.71%   | ±2.67% +16.24%  |
| RowsBench         | bench_filter_on_10k        | 2    | 3   | 96.632mb +0.00%  | 15.964ms -1.12%  | ±1.43% -5.28%   |
| RowsBench         | bench_find_on_10k          | 2    | 3   | 96.632mb +0.00%  | 15.970ms +3.04%  | ±0.97% +158.71% |
| RowsBench         | bench_find_one_on_10k      | 10   | 3   | 95.323mb +0.00%  | 1.900μs +0.32%   | ±0.00% -100.00% |
| RowsBench         | bench_first_on_10k         | 10   | 3   | 95.323mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench         | bench_flat_map_on_1k       | 2    | 3   | 104.541mb +0.00% | 15.126ms -3.83%  | ±0.17% -92.66%  |
| RowsBench         | bench_map_on_10k           | 2    | 3   | 134.608mb +0.00% | 73.268ms +1.14%  | ±0.72% +68.26%  |
| RowsBench         | bench_merge_1k_on_10k      | 2    | 3   | 97.151mb +0.00%  | 1.616ms +13.39%  | ±0.88% -58.54%  |
| RowsBench         | bench_partition_by_on_10k  | 2    | 3   | 100.522mb +0.00% | 62.725ms +2.29%  | ±0.64% +777.87% |
| RowsBench         | bench_remove_on_10k        | 2    | 3   | 98.204mb +0.00%  | 3.982ms +0.54%   | ±1.59% -10.08%  |
| RowsBench         | bench_sort_asc_on_1k       | 2    | 3   | 95.685mb +0.00%  | 41.481ms +0.66%  | ±1.78% +92.23%  |
| RowsBench         | bench_sort_by_on_1k        | 2    | 3   | 95.685mb +0.00%  | 41.866ms -4.62%  | ±0.45% -74.92%  |
| RowsBench         | bench_sort_desc_on_1k      | 2    | 3   | 95.685mb +0.00%  | 41.913ms -0.42%  | ±0.97% +251.52% |
| RowsBench         | bench_sort_entries_on_1k   | 2    | 3   | 97.763mb +0.00%  | 8.426ms +1.61%   | ±2.18% +111.10% |
| RowsBench         | bench_sort_on_1k           | 2    | 3   | 95.513mb +0.00%  | 29.495ms +1.16%  | ±0.33% -40.45%  |
| RowsBench         | bench_take_1k_on_10k       | 10   | 3   | 95.323mb +0.00%  | 14.440μs +8.33%  | ±2.38% +36.13%  |
| RowsBench         | bench_take_right_1k_on_10k | 10   | 3   | 95.323mb +0.00%  | 15.676μs +1.59%  | ±3.29% +117.13% |
| RowsBench         | bench_unique_on_1k         | 2    | 3   | 114.424mb +0.00% | 183.259ms -3.38% | ±0.68% -16.18%  |
+-------------------+----------------------------+------+-----+------------------+------------------+-----------------+

@codecov
codecov bot commented Mar 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.11%. Comparing base (edb042d) to head (a0c242d).
Report is 4 commits behind head on 1.x.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##              1.x    #1561      +/-   ##
==========================================
- Coverage   83.16%   83.11%   -0.05%     
==========================================
  Files         695      695              
  Lines       18892    18893       +1     
==========================================
- Hits        15711    15703       -8     
- Misses       3181     3190       +9     
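The figures in the diff above can be cross-checked with a little arithmetic. The sketch below is not part of the Codecov report; it assumes coverage is hits divided by lines and that percentages are truncated to two decimals rather than rounded, which is an assumption that happens to reproduce the numbers shown. The helper name `coverage_bp` is hypothetical.

```python
def coverage_bp(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated (assumed rounding mode)."""
    return hits * 10_000 // lines


# Values copied from the coverage diff: base is 1.x, head is #1561.
base_hits, base_lines = 15711, 18892
head_hits, head_lines = 15703, 18893

base = coverage_bp(base_hits, base_lines)
head = coverage_bp(head_hits, head_lines)

# Misses are the lines that were not hit.
print(f"base  {base / 100:.2f}%  misses {base_lines - base_hits}")
print(f"head  {head / 100:.2f}%  misses {head_lines - head_hits}")
print(f"delta {(head - base) / 100:+.2f}%")
```

Working in integer hundredths of a percent avoids floating-point noise when subtracting two percentages that differ by only a few hundredths.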
+-----------------------------+------------------------------+
| Components                  | Coverage Δ                   |
+-----------------------------+------------------------------+
| etl                         | 86.17% <100.00%> (-0.10%) ⬇️ |
| cli                         | 84.59% <100.00%> (ø)         |
| lib-array-dot               | 94.53% <ø> (ø)               |
| lib-azure-sdk               | 62.56% <ø> (ø)               |
| lib-doctrine-dbal-bulk      | 90.11% <ø> (ø)               |
| lib-filesystem              | 78.02% <ø> (ø)               |
| lib-parquet                 | 84.33% <ø> (ø)               |
| lib-parquet-viewer          | 82.02% <ø> (ø)               |
| lib-snappy                  | 90.69% <ø> (-0.47%) ⬇️       |
| bridge-filesystem-async-aws | 90.38% <ø> (ø)               |
| bridge-filesystem-azure     | 89.92% <ø> (ø)               |
| bridge-monolog-http         | 96.38% <ø> (ø)               |
| symfony-http-foundation     | 74.41% <ø> (ø)               |
| adapter-chartjs             | 86.45% <ø> (ø)               |
| adapter-csv                 | 89.57% <ø> (ø)               |
| adapter-doctrine            | 89.14% <ø> (ø)               |
| adapter-elasticsearch       | 97.19% <ø> (ø)               |
| adapter-google-sheet        | 78.04% <ø> (ø)               |
| adapter-http                | 59.15% <ø> (ø)               |
| adapter-json                | 90.62% <ø> (ø)               |
| adapter-logger              | 53.84% <ø> (ø)               |
| adapter-meilisearch         | 97.75% <ø> (ø)               |
| adapter-parquet             | 80.85% <ø> (ø)               |
| adapter-text                | 84.44% <ø> (ø)               |
| adapter-xml                 | 83.15% <ø> (ø)               |
+-----------------------------+------------------------------+
@norberttech norberttech merged commit 8a24f10 into 1.x Mar 29, 2025
21 of 22 checks passed
@norberttech norberttech deleted the 1560-remove-entry-object-from-column-statistics branch March 29, 2025 20:51
Development

Successfully merging this pull request may close these issues.

Remove entry object from Column Statistics

2 participants