refactor: improve ak.to_cudf errors and documentation #3850

Merged
ikrommyd merged 4 commits into scikit-hep:main from ikrommyd:to-cudf-supported-versions-and-docstring
Feb 6, 2026
Conversation

@ikrommyd
Collaborator

@ikrommyd ikrommyd commented Feb 5, 2026

This PR improves the ak.to_cudf implementation: it adds a proper import-error check for cudf, raises a clear error for unsupported cudf versions, and improves the docstring. It also fixes the cudf documentation by adding ak.to_cudf to the API reference tree.
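
The two checks described above (a clear error when the optional package is missing, and a clear error when the installed version is unsupported) follow a common optional-dependency pattern. The following is a minimal sketch of that pattern, not the code merged in this PR: the helper names import_optional and parse_version, the exception types, and the version numbers are assumptions made for illustration, and a stand-in module is registered so the example runs without cudf installed.

```python
import importlib
import sys
import types


def parse_version(text):
    """Turn '25.02.1' into (25, 2, 1); stop at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        if not piece.isdecimal():
            break
        parts.append(int(piece))
    return tuple(parts)


def import_optional(name, minimum=None):
    """Import an optional dependency, failing with an actionable message.

    Hypothetical helper: the exception types below are an illustrative choice.
    """
    try:
        module = importlib.import_module(name)
    except ImportError as err:
        raise ImportError(
            f"to use this function, you must install the '{name}' package"
        ) from err

    if minimum is not None:
        found = getattr(module, "__version__", "0")
        if parse_version(found) < parse_version(minimum):
            raise NotImplementedError(
                f"{name} {found} is not supported; need {name} >= {minimum}"
            )
    return module


# Register a stand-in module so the demo does not need a GPU library.
fake = types.ModuleType("fake_gpu_lib")
fake.__version__ = "24.12.1"
sys.modules["fake_gpu_lib"] = fake

print(import_optional("fake_gpu_lib", minimum="24.10").__version__)  # prints 24.12.1
```

Keeping the "not installed" and "installed but too old" cases as separate exceptions lets each message tell the user exactly what to do next.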

@ikrommyd ikrommyd changed the title from "refactor: improve ak.to_cudf implementation and documentation" to "refactor: improve ak.to_cudf errors and documentation" Feb 5, 2026
@codecov

codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 14.28571% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.63%. Comparing base (303bcdd) to head (e9663c6).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/awkward/operations/ak_to_cudf.py | 14.28% | 12 Missing ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/awkward/operations/ak_to_raggedtensor.py | 21.81% <ø> (ø) |
| src/awkward/operations/ak_to_safetensors.py | 84.61% <ø> (ø) |
| src/awkward/operations/ak_to_tensorflow.py | 28.12% <ø> (ø) |
| src/awkward/operations/ak_to_torch.py | 33.33% <ø> (ø) |
| src/awkward/operations/ak_to_cudf.py | 35.00% <14.28%> (-25.00%) ⬇️ |

... and 2 files with indirect coverage changes

@github-actions

github-actions bot commented Feb 5, 2026

The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3850

Member

@ianna ianna left a comment

@ikrommyd - thanks it looks great! I think the tensor converters are limited in what they can handle.

@ikrommyd
Collaborator Author

ikrommyd commented Feb 6, 2026

@ianna all the converters call ak.to_layout on the input, which means the type of the input is anything that is, or can be converted to, an awkward array (a Python list, for example). It's just a docstring consistency that I'm fixing. We write "Array-like data (anything #ak.to_layout recognizes)." on ALL other docstrings except these 🤣 That's all there is to the update.

Edit:
Yeah, I see what you mean, but if I input, for example, a Python list, it will work. That's my point, since it will call ak.to_layout on the input. It doesn't have to be an awkward array. So the "type" is technically "anything that ak.to_layout recognizes", but that doesn't mean it will work without error on all of them. I think what I'm changing is meant to reflect the type. In ak.to_numpy, we write "Array-like data (anything #ak.to_layout recognizes)." too, and to_numpy will fail for jagged data. I think the docstring changes are correct given how we write it everywhere else, and they make these docstrings consistent with the rest.

@ianna
Member

ianna commented Feb 6, 2026

> @ianna all the converters call ak.to_layout on the input, which means the type of the input is anything that is, or can be converted to, an awkward array (a Python list, for example). It's just a docstring consistency that I'm fixing. We write "Array-like data (anything #ak.to_layout recognizes)." on ALL other docstrings except these 🤣 That's all there is to the update.
>
> Edit: Yeah, I see what you mean, but if I input, for example, a Python list, it will work. That's my point, since it will call ak.to_layout on the input. It doesn't have to be an awkward array. So the "type" is technically "anything that ak.to_layout recognizes", but that doesn't mean it will work without error on all of them. I think what I'm changing is meant to reflect the type. In ak.to_numpy, we write "Array-like data (anything #ak.to_layout recognizes)." too, and to_numpy will fail for jagged data. I think the docstring changes are correct given how we write it everywhere else, and they make these docstrings consistent with the rest.

Are we testing for all of these? ak.to_layout takes an array argument: array-like data, which may be a high-level ak.Array, ak.Record (if allow_record), ak.ArrayBuilder, or low-level ak.contents.Content, ak.record.Record (if allow_record), or a supported backend array (NumPy ndarray, CuPy ndarray, JAX DeviceArray), data-less TypeTracer, Arrow object, or an arbitrary Python iterable (for ak.from_iter to convert).

@ikrommyd
Collaborator Author

ikrommyd commented Feb 6, 2026

In general, yes. Of course, we don't test every possible input in every high-level function, since they all call to_layout, but to_layout itself is tested explicitly on its own and indirectly in all high-level function tests, since they all use it (and have since forever).
All high-level functions can take in anything: they first try to convert it into an awkward array and then call the actual function. For example, you can go from a list to JAX like this:

In [3]: ak.to_jax([1,2,3])
Out[3]: Array([1, 2, 3], dtype=int64)

or to a torch tensor

In [2]: ak.to_torch([1,2,3])
Out[2]: tensor([1, 2, 3])

This is why I claim that my docstring changes here are correct: the old wording was just an inconsistency with the rest of the docstrings.

If the input to a high-level function is already an awkward array or layout, the conversion is a no-op.
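
The normalize-first behaviour described in this comment can be modelled with a toy sketch. This is not Awkward Array's implementation: ToyLayout, normalize, and to_rectangular are hypothetical stand-ins for the layout classes, ak.to_layout, and a converter such as ak.to_numpy. It shows why the documented input type ("anything the normalizer recognizes") is broader than the set of inputs on which a given converter succeeds.

```python
from collections.abc import Iterable


class ToyLayout:
    """Hypothetical stand-in for a normalized array (not an Awkward class)."""

    def __init__(self, rows):
        self.rows = rows


def normalize(data):
    """Accept anything convertible; return an existing layout unchanged."""
    if isinstance(data, ToyLayout):
        return data  # already normalized, so this is a no-op
    if isinstance(data, Iterable) and not isinstance(data, (str, bytes)):
        rows = [list(row) if isinstance(row, Iterable) else row for row in data]
        return ToyLayout(rows)
    raise TypeError(f"cannot interpret {type(data).__name__} as array-like data")


def to_rectangular(data):
    """Converter: accepts any array-like input, but needs regular lengths."""
    layout = normalize(data)
    lengths = {len(row) for row in layout.rows if isinstance(row, list)}
    if len(lengths) > 1:
        raise ValueError("subarray lengths are not regular")
    return layout.rows


print(to_rectangular([1, 2, 3]))         # a plain list is accepted
print(to_rectangular(((1, 2), (3, 4))))  # so are nested tuples

try:
    to_rectangular([[1, 2], [3]])        # accepted as a type, rejected on conversion
except ValueError as err:
    print("ValueError:", err)
```

In this model the docstring for to_rectangular would correctly describe its input as "anything normalize recognizes", even though jagged input raises at conversion time.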

Member

@ianna ianna left a comment

@ikrommyd - thanks for double checking! Indeed, the functions accept all of these, including the ArrayBuilder:

>>> import awkward as ak
>>> builder = ak.ArrayBuilder()
>>> ak.to_torch(builder)
tensor([], dtype=torch.float64)

Please merge it if you are done with it. Thanks!

@ikrommyd ikrommyd merged commit c8c9cff into scikit-hep:main Feb 6, 2026
39 checks passed
@ikrommyd ikrommyd deleted the to-cudf-supported-versions-and-docstring branch February 6, 2026 18:42

2 participants