GH-39769: [C++][Device] Fix Importing nested and string types for DeviceArray#39770
Merged
zeroshade merged 3 commits intoapache:mainfrom Feb 5, 2024
Merged
GH-39769: [C++][Device] Fix Importing nested and string types for DeviceArray#39770zeroshade merged 3 commits intoapache:mainfrom
zeroshade merged 3 commits intoapache:mainfrom
Conversation
|
|
pitrou
requested changes
Jan 24, 2024
felipecrv
reviewed
Jan 29, 2024
felipecrv
approved these changes
Jan 29, 2024
Member
|
Do you have plans to add CUDA-based tests? |
Member
Author
|
@pitrou I'll add explicit CUDA based tests for the device array interface as a separate PR for #39786 (comment) |
bkietz
approved these changes
Feb 2, 2024
pitrou
approved these changes
Feb 5, 2024
|
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 26801f1. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 6 possible false positives for unstable benchmarks that are known to sometimes produce them. |
dgreiss
pushed a commit
to dgreiss/arrow
that referenced
this pull request
Feb 19, 2024
…or DeviceArray (apache#39770) ### Rationale for this change In my testing with libcudf and other GPU data, I discovered a deficiency in ImportDeviceArray and thus ImportDeviceRecordBatch where the device type and memory manager aren't propagated to child importers and it fails to import offset-based types such as strings. ### What changes are included in this PR? These are relatively easily handled by first ensuring that `ImportChild` propagates the device_type and memory manager from the parent. Then for importing offset based values we merely need to use the memory manager to copy the final offset value to the CPU to use for the buffer size computation. This will work for any device which has implemented CopyBufferTo/From ### Are these changes tested? A new test is added to test these situations. * Closes: apache#39769 Authored-by: Matt Topol <zotthewizard@gmail.com> Signed-off-by: Matt Topol <zotthewizard@gmail.com>
pitrou
pushed a commit
that referenced
this pull request
May 23, 2024
#41477) ### Rationale for this change Currently ```MemoryManager``` objects define functionality to Copy or View entire buffers. Occasionally there is the need to only copy a single value or slice from a buffer to a piece of CPU memory (see #39770 (comment)). It's overkill to do a bunch of whole Buffer operations and manually slicing just to copy 4 or 8 bytes. ### What changes are included in this PR? Add the ```MemoryManager::CopyBufferSliceToCPU``` function, which initially attempts to use memcpy for the specified slice. If this is not possible, it defaults to copying the entire buffer and then viewing/copying the slice. Update ```ArrayImporter::ImportStringValuesBuffer``` to use this function. ### Are these changes tested? ```ArrayImporter::ImportStringValuesBuffer``` is tested as a part of ```arrow-c-bridge-test``` * GitHub Issue: #39858 Lead-authored-by: Alan Stoate <alan.stoate@gmail.com> Co-authored-by: Mac Lilly <maclilly45@gmail.com> Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
vibhatha
pushed a commit
to vibhatha/arrow
that referenced
this pull request
May 25, 2024
…pointer (apache#41477) ### Rationale for this change Currently ```MemoryManager``` objects define functionality to Copy or View entire buffers. Occasionally there is the need to only copy a single value or slice from a buffer to a piece of CPU memory (see apache#39770 (comment)). It's overkill to do a bunch of whole Buffer operations and manually slicing just to copy 4 or 8 bytes. ### What changes are included in this PR? Add the ```MemoryManager::CopyBufferSliceToCPU``` function, which initially attempts to use memcpy for the specified slice. If this is not possible, it defaults to copying the entire buffer and then viewing/copying the slice. Update ```ArrayImporter::ImportStringValuesBuffer``` to use this function. ### Are these changes tested? ```ArrayImporter::ImportStringValuesBuffer``` is tested as a part of ```arrow-c-bridge-test``` * GitHub Issue: apache#39858 Lead-authored-by: Alan Stoate <alan.stoate@gmail.com> Co-authored-by: Mac Lilly <maclilly45@gmail.com> Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Rationale for this change
In my testing with libcudf and other GPU data, I discovered a deficiency in ImportDeviceArray and thus ImportDeviceRecordBatch where the device type and memory manager aren't propagated to child importers and it fails to import offset-based types such as strings.
What changes are included in this PR?
These are relatively easily handled by first ensuring that
ImportChildpropagates the device_type and memory manager from the parent. Then for importing offset based values we merely need to use the memory manager to copy the final offset value to the CPU to use for the buffer size computation.This will work for any device which has implemented CopyBufferTo/From
Are these changes tested?
A new test is added to test these situations.