Skip to content

feat: Test more arrow types#818

Closed
hermanschaaf wants to merge 17 commits intomainfrom
test-arrow-types
Closed

feat: Test more arrow types#818
hermanschaaf wants to merge 17 commits intomainfrom
test-arrow-types

Conversation

@hermanschaaf
Copy link
Copy Markdown
Contributor

@hermanschaaf hermanschaaf commented Apr 21, 2023

This adds support for testing (almost) the complete range of Arrow types in destinations.

By default, only the types previously supported by CQTypes are enabled (with the important exception of more uint and int types than before, plus float32). The TestSourceOptions struct allows destinations to declare that they want to test with more column types, including maps, structs, times, etc.

Closes #805

@github-actions github-actions bot added the feat label Apr 21, 2023
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 21, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (ea90e4e) 47.46% compared to head (cdca89f) 47.46%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #818   +/-   ##
=======================================
  Coverage   47.46%   47.46%           
=======================================
  Files          59       59           
  Lines        6611     6611           
=======================================
  Hits         3138     3138           
  Misses       3065     3065           
  Partials      408      408           
Impacted Files Coverage Δ
internal/memdb/memdb.go 83.66% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Copy Markdown
Member

@disq disq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits and a slice copy issue

@disq disq self-requested a review April 24, 2023 12:36
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 24, 2023

⏱️ Benchmark results

Comparing with ea90e4e

  • DefaultConcurrencyDFS-2 resources/s: 10,361 ⬇️ 4.48% decrease vs. ea90e4e
  • DefaultConcurrencyRoundRobin-2 resources/s: 11,148 ⬆️ 0.39% increase vs. ea90e4e
  • Glob-2 ns/op: 266.4 ⬆️ 20.80% increase vs. ea90e4e
  • TablesWithChildrenDFS-2 resources/s: 25,228 ⬇️ 6.10% decrease vs. ea90e4e
  • TablesWithChildrenRoundRobin-2 resources/s: 24,679 ⬇️ 9.00% decrease vs. ea90e4e
  • TablesWithRateLimitingDFS-2 resources/s: 28.48 ⬆️ 0.11% increase vs. ea90e4e
  • TablesWithRateLimitingRoundRobin-2 resources/s: 863.1 ⬆️ 0.93% increase vs. ea90e4e

@hermanschaaf hermanschaaf marked this pull request as ready for review April 24, 2023 15:59
@disq disq self-requested a review April 24, 2023 16:57
@github-actions github-actions bot added feat and removed feat labels Apr 28, 2023
basicFields = append(basicFields, FixedWidthFields()...)

// add extensions
basicFields = append(basicFields, arrow.Field{Name: "uuid", Type: types.NewUUIDType(), Nullable: true})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add both nullable and not nullable fields of the same type to the schema

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe; but what is your reasoning?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CH: Array may contain nullable elements, but the array itself shouldn't be marked nullable.
We need to pass these types to verify the correct destination logic:

  • nullable array of nullable
  • nullable array of non-nullable
  • non-nullable array of nullable
  • non-nullable array of non-nullable

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks; maybe we can add 4 columns similar to what you suggested just to test nullable behavior specifically, rather than adding a nullable column for every type (doubling the number of columns)? There are already around 200 columns if all the options are enabled, and doubling that will make it harder to figure out what really went wrong in a test.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use testing.T.Run + utilize the fieldPrettify logic to construct the name to be able to deconstruct easily: https://github.com/cloudquery/plugin-sdk/blob/main/schema/arrow.go#L60

@github-actions github-actions bot added feat and removed feat labels Apr 29, 2023
@github-actions github-actions bot added the feat label Apr 29, 2023
Copy link
Copy Markdown
Contributor

@yevgenypats yevgenypats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I worked on #832, I think if we can review/merge 832 first and then rebase on top it should also simplify part of this PR as well due to the new abstraction

@hermanschaaf
Copy link
Copy Markdown
Contributor Author

Replaced by #863

@candiduslynx
Copy link
Copy Markdown
Contributor

@hermanschaaf is it OK to remove the branch as well?

kodiakhq bot pushed a commit that referenced this pull request May 15, 2023
This adds support for testing (almost) the complete range of Arrow types in destinations.

By default, only the types previously supported by CQTypes are enabled (with the important exception of more uint and int types than before, plus float32). The TestSourceOptions struct allows destinations to declare that they want to test with more column types, including maps, structs, times, etc.

I have not included all the nullable column variations. These are a bit more difficult to do now that we are not using `arrow.Field` directly anymore, plus we can rather do that as a follow-up if and when it becomes necessary to test this.

This PR is a rebased version for v3 that replaces #818.
@candiduslynx candiduslynx deleted the test-arrow-types branch June 5, 2024 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Add all support arrow types to our tests (e.g. int8, int16, float64)

4 participants