Skip to content

feat: Allow testing of more Arrow types#863

Merged
kodiakhq[bot] merged 8 commits intomainfrom
test-more-arrow-types
May 15, 2023
Merged

feat: Allow testing of more Arrow types#863
kodiakhq[bot] merged 8 commits intomainfrom
test-more-arrow-types

Conversation

@hermanschaaf
Copy link
Copy Markdown
Contributor

This adds support for testing (almost) the complete range of Arrow types in destinations.

By default, only the types previously supported by CQTypes are enabled (with the important exception of more uint and int types than before, plus float32). The TestSourceOptions struct allows destinations to declare that they want to test with more column types, including maps, structs, times, etc.

I have not included all the nullable column variations. These are a bit more difficult to do now that we are not using arrow.Field directly anymore, plus we can rather do that as a follow-up if and when it becomes necessary to test this.

This PR is a rebased version for v3 that replaces #818.

@codecov
Copy link
Copy Markdown

codecov bot commented May 12, 2023

Codecov Report

Patch coverage has no change and project coverage change: -3.02 ⚠️

Comparison is base (766fe62) 48.29% compared to head (28e3b13) 45.27%.

❗ Current head 28e3b13 differs from pull request most recent head 13aa690. Consider uploading reports for the commit 13aa690 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #863      +/-   ##
==========================================
- Coverage   48.29%   45.27%   -3.02%     
==========================================
  Files          29       29              
  Lines        2429     2591     +162     
==========================================
  Hits         1173     1173              
- Misses       1154     1316     +162     
  Partials      102      102              
Impacted Files Coverage Δ
schema/testdata.go 0.00% <0.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@github-actions
Copy link
Copy Markdown

github-actions bot commented May 12, 2023

⏱️ Benchmark results

  • Glob-2 ns/op: 201.3

},
},
// TestSourceOptions controls which types are included by TestSourceColumns.
type TestSourceOptions struct {
Copy link
Copy Markdown
Member

@disq disq May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding a shorthand to set all options to true?
Something like func AllTestSourceOptions() TestSourceOptions and then every field set to true in the return struct.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd even swap for SkipTypeX semantics, so that the default would be everything and if not, the destination has to be mindful about it (just like of all the skip modes for writing)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems that this testing suite is growing I would do two things, one is more important then the other imo:

  1. I would use Skip - this way everything is included by default as usually destinations should support everything, unlike modes of operation the SDK doesn't error out (at least for now) if a mode is not supported so if a destinations updates to a newer SDK with new arrow types this would error out in "prod" instead of tests.
  2. More of a preference - I think mostly we used WithOption() pattern so I would stick to that but that's not a must.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agreed. Let's use Skip so that it needs to be explicitly declared in destinations when something isn't supported. Can use the WithOption pattern too 👍

Copy link
Copy Markdown
Contributor

@yevgenypats yevgenypats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Had mostly one comment around Skip vs Include

bldr.Field(i).(*array.ListBuilder).ValueBuilder().(*types.MacBuilder).Append(mac)
} else {
panic("unknown type: " + c.Type.String() + " column: " + c.Name)
example := getExampleJSON(c.Name, c.Type, opts)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mentioned it but I can't recall, why you went with json instead of regular append to builder?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both can work, and I don't recall the exact reason anymore either 😅 I think it was just easier and cleaner to split it into a separate recursive function that can also handle composite types, and doesn't require any type switches. Using JSON is also a common pattern in the Arrow library testing files, which is where I got the idea.

},
},
// TestSourceOptions controls which types are included by TestSourceColumns.
type TestSourceOptions struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems that this testing suite is growing I would do two things, one is more important then the other imo:

  1. I would use Skip - this way everything is included by default as usually destinations should support everything, unlike modes of operation the SDK doesn't error out (at least for now) if a mode is not supported so if a destinations updates to a newer SDK with new arrow types this would error out in "prod" instead of tests.
  2. More of a preference - I think mostly we used WithOption() pattern so I would stick to that but that's not a must.

@hermanschaaf hermanschaaf requested a review from yevgenypats May 15, 2023 11:40
@hermanschaaf
Copy link
Copy Markdown
Contributor Author

Updated to use Skip and options columns. Most destinations will probably start failing their tests when we merge this now, but we'll just have to update them to skip the types they don't support yet (until they do support them)

@kodiakhq kodiakhq bot merged commit 28642ec into main May 15, 2023
@kodiakhq kodiakhq bot deleted the test-more-arrow-types branch May 15, 2023 13:09
kodiakhq bot pushed a commit that referenced this pull request May 15, 2023
🤖 I have created a release *beep* *boop*
---


## [3.2.0](v3.1.0...v3.2.0) (2023-05-15)


### Features

* Allow testing of more Arrow types ([#863](#863)) ([28642ec](28642ec))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants