feat: add 3D mesh support and MeshFolder builder #8055
A test conducted:

    from datasets import Features, Value, Sequence, Image, Audio, Mesh, load_dataset

    # Define features.
    features = Features({
        'id': Value('string'),
        'objaverse_uid': Value('string'),
        'text': Value('string'),
        'image': Image(),
        'audio': Audio(),
        'mesh': Mesh(),  # NEW: automatically handles struct<bytes, path>
        'metadata': {
            'image_score': Value('double'),
            'audio_score': Value('double'),
            'tags': Sequence(Value('string'))
        }
    })

    # Load a Parquet file.
    dataset = load_dataset(
        "parquet",
        data_files={"train": "train-00001.parquet"},
        features=features,
        streaming=True
    )["train"]

    # Push.
    dataset.push_to_hub("VINAY-UMRETHE/Vividha-Test")

This can be viewed at VINAY-UMRETHE/Vividha-test. The dataset viewer does not show anything yet, since the site is not configured to display a Mesh column, either as a rendered image (heavy) or as a simple placeholder icon. That is up to the devs.
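To make the `struct<bytes, path>` comment in the test a bit more concrete, here is a minimal sketch of what such a record could look like in plain Python. The helper names `to_mesh_record` and `read_mesh_bytes` are hypothetical and are not taken from this PR; the sketch only assumes the same "embedded bytes or file path" convention that the Image and Audio features use.

```python
import os
from typing import Union


def to_mesh_record(source: Union[str, os.PathLike, bytes]) -> dict:
    """Hypothetical helper: build a {bytes, path} record for one mesh."""
    if isinstance(source, (bytes, bytearray)):
        # In-memory data: embed the payload, there is no path to keep.
        return {"bytes": bytes(source), "path": None}
    # File on disk: keep only the reference and read it lazily later.
    return {"bytes": None, "path": os.fspath(source)}


def read_mesh_bytes(record: dict) -> bytes:
    """Return the raw payload, preferring embedded bytes over the path."""
    if record["bytes"] is not None:
        return record["bytes"]
    if record["path"] is None:
        raise ValueError("record has neither bytes nor path")
    with open(record["path"], "rb") as f:
        return f.read()
```

Keeping both fields lets a dataset reference local files while it is being built and embed the bytes when it is written out, for example to Parquet.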
@lhoestq review
Looks really cool! Is there a Python lib that can be used to load the data instead of returning bytes/path? And sorry for the delay!
Yes, I updated:
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This PR introduces 3D mesh support to the `datasets` library, mirroring the existing paradigms for the Image, Audio, and Video modalities.

It adds a new `Mesh` feature class, which manages 3D data via a PyArrow `struct` containing both raw bytes and file paths. Support is intentionally focused on self-contained binary formats such as GLB, PLY, and STL. They seem like the sweet spot to me, because other formats such as `.obj` and `.gltf` require external sub-files.

It also adds a new `MeshFolder` builder module. This packaged module enables users to load datasets directly from structured or unstructured directories of mesh files.

The implementation has been integrated into the library's core, including registration in the main features module and support for 3D data within `WebDataset`. Tests were conducted using some new files too.
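Since the description limits support to self-contained formats, a small sketch may help show why those are easy to handle from raw bytes alone. The function below is illustrative and not part of this PR; `sniff_mesh_format` is a hypothetical name. It relies only on well-known file signatures: GLB files begin with the ASCII magic `glTF`, PLY files begin with `ply` on the first line, and a binary STL is an 80-byte header, a little-endian `uint32` triangle count, and 50 bytes per triangle.

```python
import struct
from typing import Optional


def sniff_mesh_format(data: bytes) -> Optional[str]:
    """Hypothetical helper: guess 'glb', 'ply' or 'stl' from raw bytes."""
    # GLB: 12-byte header that starts with the ASCII magic "glTF".
    if len(data) >= 12 and data[:4] == b"glTF":
        return "glb"
    # PLY: the first line is the literal "ply" (LF or CRLF line ending).
    if data[:4] in (b"ply\n", b"ply\r"):
        return "ply"
    # Binary STL: total size must equal 84 + 50 * triangle_count.
    if len(data) >= 84:
        (n_triangles,) = struct.unpack_from("<I", data, 80)
        if len(data) == 84 + 50 * n_triangles:
            return "stl"
    # ASCII STL: the text starts with the keyword "solid".
    if data.lstrip()[:5] == b"solid":
        return "stl"
    return None
```

The binary STL check runs before the ASCII check because some binary STL files also place the word "solid" in their 80-byte header.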
TESTS CONDUCTED:
Output:
Some test files were also added to the `tests/features/data` folder, as I saw for other modalities.
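For readers unfamiliar with folder-based builders, the directory loading that `MeshFolder` is described as providing can be pictured roughly as follows. This is only a sketch under assumptions, not the builder's actual code: `iter_mesh_files` and `MESH_EXTENSIONS` are hypothetical names, and taking the label from the parent directory name is borrowed from how folder-based builders for other modalities are commonly used.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple

# Extensions for the self-contained formats named in the PR description.
MESH_EXTENSIONS = {".glb", ".ply", ".stl"}


def iter_mesh_files(root) -> Iterator[Tuple[str, Optional[str]]]:
    """Hypothetical helper: yield (file path, label) for each mesh under root."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in MESH_EXTENSIONS:
            continue
        rel = path.relative_to(root)
        # Assumption: a file inside a subdirectory is labelled with its
        # immediate parent directory name; files directly in root get None.
        label = rel.parts[-2] if len(rel.parts) > 1 else None
        yield str(path), label
```

A structured layout such as `root/chair/a.glb` would then yield the label `chair`, while an unstructured directory of loose files yields `None` for every entry.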