GH-37597: [MATLAB] Add toMATLAB method to arrow.array.ChunkedArray class
#37613
… StringType, and TimestampType
… all Numeric Types
kevingurney
left a comment
Looks good! Thank you!
+1

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 65e2f22. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.
…dArray` class (apache#37613)

### Rationale for this change

Currently, there is no way to easily convert an `arrow.array.ChunkedArray` into a corresponding MATLAB array, other than (1) manually iterating chunk by chunk, (2) calling `toMATLAB` on each chunk, and then (3) concatenating all of the converted chunks together into one contiguous MATLAB array.

It would be helpful to add a `toMATLAB` method to `arrow.array.ChunkedArray` that abstracts away all of these steps.

### What changes are included in this PR?

1. Added `toMATLAB` method to `arrow.array.ChunkedArray` class
2. Added `preallocateMATLABArray` abstract method to `arrow.type.Type` class. This method is used by the `ChunkedArray` `toMATLAB` to pre-allocate a MATLAB array of the expected class type and shape. This is necessary to ensure `toMATLAB` returns the correct MATLAB array when the `ChunkedArray` has zero chunks. If `toMATLAB` stored the result of calling `toMATLAB` on each chunk in a `cell` array before concatenating the values, `toMATLAB` would return a 0x0 `double` array for zero-chunked arrays. The pre-allocation approach avoids this issue.
3. Implement `preallocateMATLABArray` on all `arrow.type.Type` classes.
4. Added an abstract class `arrow.type.NumericType` that all classes representing numeric data types inherit from. `NumericType` implements `preallocateMATLABArray` for its subclasses.

### Are these changes tested?

Yes. Added unit tests to `tChunkedArray.m`.

### Are there any user-facing changes?

Yes. Users can now call `toMATLAB` on `ChunkedArray`s.

**Example**

```matlab
>> a = arrow.array([1 2 NaN 4 5]);
>> b = arrow.array([6 7 8 9 NaN 11]);
>> c = arrow.array.ChunkedArray.fromArrays(a, b);
>> data = toMATLAB(c)

data =

     1
     2
   NaN
     4
     5
     6
     7
     8
     9
   NaN
    11
```

* Closes: apache#37597

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>
[MATLAB] Add `toMATLAB` method to `arrow.array.ChunkedArray` class #37597
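
The zero-chunk problem described in the PR can be reproduced in plain MATLAB, without the Arrow interface. The sketch below is only an illustration, not the PR's implementation: `convertedChunks` is a hypothetical stand-in for the per-chunk `toMATLAB` results, and the `int32` type and column shape are assumptions chosen for the example.

```matlab
% Hypothetical stand-in for the per-chunk conversion results of a
% chunked array that has zero chunks.
convertedChunks = cell(1, 0);

% Approach 1: gather converted chunks in a cell array, then concatenate.
% With no chunks, vertcat receives no inputs and yields a 0x0 double,
% whatever element type the chunked array was declared to hold.
viaCell = vertcat(convertedChunks{:});

% Approach 2: pre-allocate from the known type and total length, then
% copy each chunk into place. The type (int32) and the column shape are
% assumptions made for this illustration.
totalLength = 0;
preallocated = zeros(totalLength, 1, 'int32');
startIdx = 1;
for ii = 1:numel(convertedChunks)
    chunk = convertedChunks{ii};
    endIdx = startIdx + numel(chunk) - 1;
    preallocated(startIdx:endIdx) = chunk;
    startIdx = endIdx + 1;
end

% Compare the class and size produced by each approach.
fprintf('cell approach:     %s, %s\n', class(viaCell), mat2str(size(viaCell)));
fprintf('pre-allocated:     %s, %s\n', class(preallocated), mat2str(size(preallocated)));
```

Because the pre-allocated array is created from the type information rather than from the data, its class and shape stay correct even when the loop body never runs.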