@norberttech norberttech commented Apr 11, 2025

Change Log

Added

Fixed

  • schema comparison for nullability

Changed

Removed

Deprecated

Security


Description

Long story short, I accidentally introduced a regression in #1587 by returning true.

What happened is that the first entry that was a string entry created from null (with FROM_NULL metadata) stopped the schema validation early with a result of true.

This is now fixed: whenever either of the two schema validators encounters a string_schema with FROM_NULL metadata and the expected definition is nullable, it lets that definition pass.

This behavior exists so that schema validation does not fail when the schema allows a value to be null and there is only one row (or it is the first row) in which the entry was created from null.
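The difference between the regressed and the fixed behavior can be sketched roughly as below. This is a simplified illustration, not the actual Flow PHP validator code: `SketchDefinition`, `validateRegressed()` and `validateFixed()` are hypothetical names, and the real validators compare far more than a type string.

```php
<?php

// Hypothetical, simplified stand-in for a schema definition.
// This is NOT one of Flow PHP's real classes.
final class SketchDefinition
{
    public function __construct(
        public readonly string $name,
        public readonly string $type,
        public readonly bool $nullable = false,
        public readonly bool $fromNull = false, // stands in for FROM_NULL metadata
    ) {
    }
}

// Regressed behavior (assumed shape): the first string definition created
// from null ends the whole validation with true.
function validateRegressed(array $expected, array $given): bool
{
    foreach ($given as $name => $definition) {
        if ($definition->type === 'string' && $definition->fromNull) {
            return true; // bug: the remaining definitions are never checked
        }

        if (!isset($expected[$name]) || $expected[$name]->type !== $definition->type) {
            return false;
        }
    }

    return true;
}

// Fixed behavior (assumed shape): a from-null string definition passes only
// when the expected definition is nullable, and validation carries on.
function validateFixed(array $expected, array $given): bool
{
    foreach ($given as $name => $definition) {
        if (!isset($expected[$name])) {
            return false;
        }

        $expectedDefinition = $expected[$name];

        if ($definition->type === 'string' && $definition->fromNull && $expectedDefinition->nullable) {
            continue; // let this one pass, keep checking the rest
        }

        if ($expectedDefinition->type !== $definition->type) {
            return false;
        }
    }

    return true;
}

$expected = [
    'id' => new SketchDefinition('id', 'integer'),
    'deleted_at' => new SketchDefinition('deleted_at', 'datetime', nullable: true),
];

$given = [
    // created from null, so it arrives as a string with the from-null flag
    'deleted_at' => new SketchDefinition('deleted_at', 'string', nullable: true, fromNull: true),
    // genuinely wrong type, must be rejected
    'id' => new SketchDefinition('id', 'string'),
];

var_dump(validateRegressed($expected, $given)); // bool(true), a false positive
var_dump(validateFixed($expected, $given));     // bool(false)
```

In the sketch the from-null entry comes first on purpose, since that ordering is what exposed the false positive.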

When the Entry Factory creates an entry from null without a provided definition, it creates it as string_entry('name', null, Metadata::with(Metadata::FROM_NULL)).
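A minimal sketch of that fallback, assuming a much simpler factory than the real one. `SketchEntry`, `SKETCH_FROM_NULL` and `sketchCreateEntry()` are hypothetical names used only for illustration; they do not exist in Flow PHP.

```php
<?php

// Hypothetical stand-in for an entry; not Flow PHP's actual Entry classes.
final class SketchEntry
{
    public function __construct(
        public readonly string $name,
        public readonly mixed $value,
        public readonly string $type,
        public readonly array $metadata = [],
    ) {
    }
}

// Hypothetical metadata key playing the role of Metadata::FROM_NULL.
const SKETCH_FROM_NULL = 'from_null';

function sketchCreateEntry(string $name, mixed $value, ?string $definedType = null): SketchEntry
{
    if ($value === null && $definedType === null) {
        // Nothing to infer a type from: fall back to a string entry and
        // record that it was created from null.
        return new SketchEntry($name, null, 'string', [SKETCH_FROM_NULL => true]);
    }

    // Otherwise use the provided definition, or infer the type from the value.
    $type = $definedType ?? get_debug_type($value);

    return new SketchEntry($name, $value, $type);
}

$fromNull = sketchCreateEntry('name', null);
$typed = sketchCreateEntry('name', null, 'datetime');

var_dump($fromNull->type);     // string(6) "string"
var_dump($fromNull->metadata); // contains the from-null marker
var_dump($typed->type);        // string(8) "datetime"
```

The marker is what lets a validator tell a "string because the value was null" entry apart from a real string entry.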

@norberttech norberttech linked an issue Apr 11, 2025 that may be closed by this pull request

github-actions bot commented Apr 11, 2025

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+-----------------+------------------+----------------+
| benchmark             | subject           | revs | its | mem_peak        | mode             | rstdev         |
+-----------------------+-------------------+------+-----+-----------------+------------------+----------------+
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 4.878mb +0.01%  | 616.662ms +0.20% | ±0.53% -11.84% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 5.466mb +0.00%  | 1.353s -0.21%    | ±0.44% -58.29% |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 86.476mb +0.00% | 939.443ms -0.49% | ±0.45% -79.68% |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.603mb +0.01%  | 38.973ms -0.91%  | ±0.87% +28.19% |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.577mb +0.01%  | 605.485ms -0.63% | ±0.62% -12.77% |
+-----------------------+-------------------+------+-----+-----------------+------------------+----------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 127.400mb +0.00% | 70.709ms +1.23% | ±1.53% -37.06% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev         |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 64.045mb +0.00%  | 104.271ms -0.70% | ±0.13% -62.82% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 84.077mb +0.00%  | 97.105ms -1.24%  | ±0.47% -8.50%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 166.560mb +0.00% | 20.780s -1.80%   | ±0.21% -6.80%  |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 18.135mb +0.00%  | 31.235ms -0.76%  | ±0.67% +28.52% |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
Building Blocks
+-------------------+----------------------------+------+-----+------------------+------------------+------------------+
| benchmark         | subject                    | revs | its | mem_peak         | mode             | rstdev           |
+-------------------+----------------------------+------+-----+------------------+------------------+------------------+
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 44.043mb +0.00%  | 458.757ms -0.26% | ±0.45% +182.05%  |
| TypeDetectorBench | bench_type_detector        | 1    | 3   | 11.853mb +0.00%  | 94.505ms +0.16%  | ±0.52% +130.57%  |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 105.508mb +0.00% | 722.547ms +0.09% | ±0.44% -63.03%   |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 55.019mb +0.00%  | 363.603ms -1.19% | ±0.74% -73.49%   |
| EntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.797mb +0.00%  | 77.084ms -3.55%  | ±0.20% -80.13%   |
| RowsBench         | bench_chunk_10_on_10k      | 2    | 3   | 97.062mb +0.00%  | 3.250ms -17.24%  | ±2.16% +603.07%  |
| RowsBench         | bench_diff_left_1k_on_10k  | 2    | 3   | 114.418mb +0.00% | 182.071ms +1.57% | ±1.02% +69.64%   |
| RowsBench         | bench_diff_right_1k_on_10k | 2    | 3   | 97.138mb +0.00%  | 17.962ms -1.16%  | ±1.08% +50.89%   |
| RowsBench         | bench_drop_1k_on_10k       | 2    | 3   | 97.937mb +0.00%  | 1.588ms -20.11%  | ±1.85% -26.33%   |
| RowsBench         | bench_drop_right_1k_on_10k | 2    | 3   | 97.937mb +0.00%  | 1.486ms -31.85%  | ±1.97% -11.64%   |
| RowsBench         | bench_entries_on_10k       | 2    | 3   | 96.097mb +0.00%  | 4.272ms -18.82%  | ±2.50% +535.13%  |
| RowsBench         | bench_filter_on_10k        | 2    | 3   | 96.626mb +0.00%  | 16.297ms -1.77%  | ±0.48% -74.98%   |
| RowsBench         | bench_find_on_10k          | 2    | 3   | 96.626mb +0.00%  | 16.251ms -4.91%  | ±0.76% -59.49%   |
| RowsBench         | bench_find_one_on_10k      | 10   | 3   | 95.318mb +0.00%  | 1.894μs -0.64%   | ±2.53% +3.57%    |
| RowsBench         | bench_first_on_10k         | 10   | 3   | 95.318mb +0.00%  | 0.400μs 0.00%    | ±0.00% 0.00%     |
| RowsBench         | bench_flat_map_on_1k       | 2    | 3   | 104.536mb +0.00% | 14.779ms -10.47% | ±0.97% -16.04%   |
| RowsBench         | bench_map_on_10k           | 2    | 3   | 134.603mb +0.00% | 71.971ms -2.84%  | ±0.61% -32.79%   |
| RowsBench         | bench_merge_1k_on_10k      | 2    | 3   | 97.146mb +0.00%  | 1.319ms -33.82%  | ±1.85% +13.50%   |
| RowsBench         | bench_partition_by_on_10k  | 2    | 3   | 100.518mb +0.00% | 63.047ms -0.37%  | ±1.20% +105.46%  |
| RowsBench         | bench_remove_on_10k        | 2    | 3   | 98.199mb +0.00%  | 3.809ms -17.51%  | ±1.11% -43.64%   |
| RowsBench         | bench_sort_asc_on_1k       | 2    | 3   | 95.680mb +0.00%  | 39.604ms -4.29%  | ±1.30% +134.62%  |
| RowsBench         | bench_sort_by_on_1k        | 2    | 3   | 95.681mb +0.00%  | 42.064ms +1.94%  | ±2.30% +88.81%   |
| RowsBench         | bench_sort_desc_on_1k      | 2    | 3   | 95.680mb +0.00%  | 39.579ms -5.83%  | ±0.57% -70.25%   |
| RowsBench         | bench_sort_entries_on_1k   | 2    | 3   | 97.758mb +0.00%  | 8.335ms -3.18%   | ±3.56% +3853.89% |
| RowsBench         | bench_sort_on_1k           | 2    | 3   | 95.508mb +0.00%  | 29.515ms -4.37%  | ±0.85% -65.11%   |
| RowsBench         | bench_take_1k_on_10k       | 10   | 3   | 95.318mb +0.00%  | 13.849μs -5.19%  | ±2.68% +732.23%  |
| RowsBench         | bench_take_right_1k_on_10k | 10   | 3   | 95.318mb +0.00%  | 15.366μs -10.36% | ±2.77% -13.13%   |
| RowsBench         | bench_unique_on_1k         | 2    | 3   | 114.419mb +0.00% | 181.593ms -1.32% | ±0.71% -47.23%   |
+-------------------+----------------------------+------+-----+------------------+------------------+------------------+

codecov bot commented Apr 11, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.19%. Comparing base (941c4b9) to head (da8fec6).
Report is 2 commits behind head on 1.x.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##              1.x    #1594      +/-   ##
==========================================
+ Coverage   83.16%   83.19%   +0.02%     
==========================================
  Files         702      702              
  Lines       18983    18983              
==========================================
+ Hits        15787    15792       +5     
+ Misses       3196     3191       -5     
+-----------------------------+------------------------------+
| Components                  | Coverage Δ                   |
+-----------------------------+------------------------------+
| etl                         | 86.29% <100.00%> (+0.06%) ⬆️ |
| cli                         | 84.59% <ø> (ø)               |
| lib-array-dot               | 94.53% <ø> (ø)               |
| lib-azure-sdk               | 62.56% <ø> (ø)               |
| lib-doctrine-dbal-bulk      | 90.11% <ø> (ø)               |
| lib-filesystem              | 78.02% <ø> (ø)               |
| lib-parquet                 | 84.36% <ø> (ø)               |
| lib-parquet-viewer          | 82.02% <ø> (ø)               |
| lib-snappy                  | 91.16% <ø> (ø)               |
| bridge-filesystem-async-aws | 90.38% <ø> (ø)               |
| bridge-filesystem-azure     | 89.92% <ø> (ø)               |
| bridge-monolog-http         | 96.38% <ø> (ø)               |
| symfony-http-foundation     | 74.41% <ø> (ø)               |
| adapter-chartjs             | 86.45% <ø> (ø)               |
| adapter-csv                 | 89.57% <ø> (ø)               |
| adapter-doctrine            | 89.14% <ø> (ø)               |
| adapter-elasticsearch       | 97.19% <ø> (ø)               |
| adapter-google-sheet        | 78.04% <ø> (ø)               |
| adapter-http                | 59.15% <ø> (ø)               |
| adapter-json                | 90.62% <ø> (ø)               |
| adapter-logger              | 53.84% <ø> (ø)               |
| adapter-meilisearch         | 97.75% <ø> (ø)               |
| adapter-parquet             | 80.85% <ø> (ø)               |
| adapter-text                | 84.44% <ø> (ø)               |
| adapter-xml                 | 83.15% <ø> (ø)               |
+-----------------------------+------------------------------+

@norberttech norberttech merged commit 3740760 into 1.x Apr 11, 2025
25 checks passed
@norberttech norberttech deleted the 1593-false-positive-schema-validation branch April 11, 2025 10:21
Development

Successfully merging this pull request may close these issues.

False positive schema validation
