Redesign re_types API ("as_components" model) #3162

Merged
teh-cmc merged 18 commits into main from cmc/rust_as_components
Sep 1, 2023

@teh-cmc teh-cmc commented Aug 30, 2023

Commit by commit

This PR redesigns re_types's traits in an effort to allow users to easily implement their own handwritten Archetypes/Components/Datatypes, or even extend builtin ones.
It also introduces the notion of "component lists" as first-class citizens, which is the basis of our archetype extension story.

The new traits are designed in a way to make most of their methods optional, with sane default implementations that either return errors or automatically "do the right thing" by relying on other methods.
In most cases, this makes it possible to implement "just what you need", and everything should work fine.
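To illustrate the "optional methods with sane defaults" idea, here is a minimal, self-contained sketch. It does not use the real `re_types` traits: `SketchLoggable`, `SketchError`, `encode`, `encode_opt` and `decode_opt` are hypothetical names chosen for this example, and plain `f32` vectors stand in for Arrow arrays.

```rust
/// Hypothetical error type standing in for a real (de)serialization error.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    NotImplemented(&'static str),
}

/// Hypothetical trait (not the real `Loggable`): only `encode_opt` is required.
pub trait SketchLoggable: Sized {
    /// Required: encode a sequence of optional values.
    fn encode_opt(data: Vec<Option<Self>>) -> Result<Vec<Option<f32>>, SketchError>;

    /// Optional: derived from `encode_opt`, so implementers get it for free.
    fn encode(data: Vec<Self>) -> Result<Vec<Option<f32>>, SketchError> {
        Self::encode_opt(data.into_iter().map(Some).collect())
    }

    /// Optional: returns an error until an implementer opts in.
    fn decode_opt(_data: &[Option<f32>]) -> Result<Vec<Option<Self>>, SketchError> {
        Err(SketchError::NotImplemented("decode_opt"))
    }
}

/// A user-defined type that implements "just what it needs".
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Confidence(pub f32);

impl SketchLoggable for Confidence {
    fn encode_opt(data: Vec<Option<Self>>) -> Result<Vec<Option<f32>>, SketchError> {
        Ok(data.into_iter().map(|opt| opt.map(|c| c.0)).collect())
    }
}
```

With this shape, `Confidence::encode(...)` works without any extra code, while `Confidence::decode_opt(...)` reports a clear error instead of failing to compile or panicking.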

The new LoggableList, ComponentList & DatatypeList traits allow for erased collections of components, and do not require any effort on the part of the user as we provide blanket implementations for all common cases.
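The following sketch shows how blanket implementations can yield type-erased collections without any work from the user. It is only an approximation of the idea: `SketchComponent`, `SketchComponentList`, `Position`, `Color` and `describe` are hypothetical names, and the real traits carry more methods than shown here.

```rust
/// Hypothetical stand-in for a component type: anything with a static name.
pub trait SketchComponent {
    fn name() -> &'static str;
}

/// Hypothetical object-safe view over a batch of components of a single type.
pub trait SketchComponentList {
    fn component_name(&self) -> &'static str;
    fn num_instances(&self) -> usize;
}

/// Blanket implementation: any `Vec` of components is a component list.
impl<C: SketchComponent> SketchComponentList for Vec<C> {
    fn component_name(&self) -> &'static str {
        C::name()
    }
    fn num_instances(&self) -> usize {
        self.len()
    }
}

/// Blanket implementation: so is any fixed-size array of components.
impl<C: SketchComponent, const N: usize> SketchComponentList for [C; N] {
    fn component_name(&self) -> &'static str {
        C::name()
    }
    fn num_instances(&self) -> usize {
        N
    }
}

pub struct Position(pub f32);
pub struct Color(pub u32);

impl SketchComponent for Position {
    fn name() -> &'static str {
        "user.Position"
    }
}

impl SketchComponent for Color {
    fn name() -> &'static str {
        "user.Color"
    }
}

/// Works on erased lists: the concrete component types are no longer visible here.
pub fn describe(lists: &[&dyn SketchComponentList]) -> Vec<(&'static str, usize)> {
    lists
        .iter()
        .map(|l| (l.component_name(), l.num_instances()))
        .collect()
}
```

A `Vec<Position>` and a `[Color; 1]` can then be passed side by side as `&dyn SketchComponentList`, which is the property the archetype extension story relies on.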

The legacy iterator-based compatibility layer in the deserialization path is gone.
Similarly, all the awful datatype extension hacks are gone. Extensions live in the Field and the Field alone, until we get rid of arrow2-convert and DataCell::component_name in the future.

Quoting the crate-level doc:

The [Archetype] trait is the core of this crate and is a good starting point to get familiar with the code.
An archetype is a logical collection of [Component]s that play well with each other.

Rerun (and the underlying Arrow data framework) is designed to work with large arrays of [Component]s, as opposed to single instances.
When multiple instances of a [Component] are put together in an array, they yield a [ComponentList]: the atomic unit of (de)serialization.

Internally, [Component]s are implemented using many different [Datatype]s.

All builtin archetypes, components and datatypes can be found in their respective top-level modules.

Fixes #3178
Part of #3103


Implementing a custom component is now fairly straightforward:

/// A custom [`rerun::Component`] that is backed by a builtin [`Float32`] scalar [`rerun::Datatype`].
#[derive(Debug, Clone, Copy)]
struct Confidence(Float32);

impl Loggable for Confidence {
    type Name = ComponentName;

    fn name() -> Self::Name {
        "user.Confidence".into()
    }

    fn arrow_datatype() -> arrow2::datatypes::DataType {
        Float32::arrow_datatype()
    }

    fn try_to_arrow_opt<'a>(
        data: impl IntoIterator<Item = Option<impl Into<std::borrow::Cow<'a, Self>>>>,
    ) -> re_types::SerializationResult<Box<dyn arrow2::array::Array>>
    where
        Self: 'a,
    {
        Float32::try_to_arrow_opt(data.into_iter().map(|opt| opt.map(Into::into).map(|c| c.0)))
    }
}

And so is implementing a custom archetype:

/// A custom [`Archetype`] that extends Rerun's builtin [`Points3D`] archetype with extra
/// [`rerun::Component`]s.
struct CustomPoints3D {
    points3d: Points3D,
    confidences: Option<Vec<Confidence>>,
}

impl Archetype for CustomPoints3D {
    fn name() -> ArchetypeName {
        "user.CustomPoints3D".into()
    }

    fn required_components() -> std::borrow::Cow<'static, [rerun::ComponentName]> {
        Points3D::required_components()
    }

    fn recommended_components() -> std::borrow::Cow<'static, [rerun::ComponentName]> {
        Points3D::recommended_components()
            .iter()
            .copied()
            .chain([Confidence::name()])
            .collect::<Vec<_>>()
            .into()
    }

    fn optional_components() -> std::borrow::Cow<'static, [rerun::ComponentName]> {
        Points3D::optional_components()
    }

    fn num_instances(&self) -> usize {
        self.points3d.num_instances()
    }

    fn as_component_lists(&self) -> Vec<&dyn ComponentList> {
        // TODO(#3159): need an easy way to get a CustomPoints3DIndicator component in here!
        self.points3d
            .as_component_lists()
            .into_iter()
            .chain(
                std::iter::once(self.confidences.as_ref().map(|v| v as &dyn ComponentList))
                    .flatten(),
            )
            .collect()
    }
}
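The extension pattern above (take the base archetype's lists, then append the optional extras) can be tried out in isolation with the sketch below. `SketchList`, `Base` and `Extended` are hypothetical stand-ins, not rerun types. The sketch also relies on the fact that `Option` implements `IntoIterator`, so it can be passed to `chain` directly; this behaves the same as the `std::iter::once(..).flatten()` form used above.

```rust
/// Hypothetical, simplified stand-in for an erased component batch.
pub trait SketchList {
    fn num_instances(&self) -> usize;
}

impl<T> SketchList for Vec<T> {
    fn num_instances(&self) -> usize {
        self.len()
    }
}

/// Stand-in for a builtin archetype with a single required component.
pub struct Base {
    pub positions: Vec<[f32; 3]>,
}

impl Base {
    pub fn as_lists(&self) -> Vec<&dyn SketchList> {
        vec![&self.positions as &dyn SketchList]
    }
}

/// Stand-in for a user archetype that extends `Base` with an optional component.
pub struct Extended {
    pub base: Base,
    pub confidences: Option<Vec<f32>>,
}

impl Extended {
    pub fn as_lists(&self) -> Vec<&dyn SketchList> {
        self.base
            .as_lists()
            .into_iter()
            // An `Option` yields zero or one item, so `None` adds nothing.
            .chain(self.confidences.as_ref().map(|v| v as &dyn SketchList))
            .collect()
    }
}
```

When `confidences` is `None` the result contains only the base lists; when it is `Some`, the extra list is appended after them.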

As a nice side-effect of these simplified traits, the custom_space_view example is now much simpler.

@teh-cmc teh-cmc force-pushed the cmc/rust_as_components branch from db96379 to 2f79e8e Compare August 30, 2023 16:38
@teh-cmc teh-cmc added sdk-rust Rust logging API codegen/idl labels Aug 30, 2023
@teh-cmc teh-cmc force-pushed the cmc/rust_as_components branch from 2f79e8e to 462e4a9 Compare August 30, 2023 16:52
Member

@emilk emilk left a comment

Very nice!


/// Returns the names of all components that _must_ be provided by the user when constructing
/// this archetype.
fn required_components() -> ::std::borrow::Cow<'static, [ComponentName]>;
Member

What's with the extra `::`? Isn't std::borrow::Cow verbose enough?

Contributor Author

Old habits from codegen code; not needed here

/// Returns the names of all components that must, should and may be provided by the user when
/// constructing this archetype.
#[inline]
fn all_components() -> ::std::borrow::Cow<'static, [ComponentName]> {
Member

This is one of several methods that the user really shouldn't override. We should probably note that in the docstring, and maybe put them in their own section

Contributor Author

@teh-cmc teh-cmc Sep 1, 2023

It does make sense to override this if you have everything available statically and want to avoid allocations.

I'll detail that in the docstring.
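As a rough illustration of that trade-off, the sketch below contrasts a default `all_components` that concatenates (and therefore allocates) with an override that borrows a static slice. `SketchArchetype`, `DynamicPoints`, `StaticPoints` and the component names are hypothetical, and `&'static str` stands in for `ComponentName`; the real default implementation may differ.

```rust
use std::borrow::Cow;

type Names = Cow<'static, [&'static str]>;

/// Hypothetical, simplified stand-in for the archetype trait.
pub trait SketchArchetype {
    fn required_components() -> Names;
    fn recommended_components() -> Names;
    fn optional_components() -> Names;

    /// Default: concatenates the three lists, allocating on every call.
    fn all_components() -> Names {
        [
            Self::required_components().into_owned(),
            Self::recommended_components().into_owned(),
            Self::optional_components().into_owned(),
        ]
        .concat()
        .into()
    }
}

const REQUIRED: &[&str] = &["user.Position"];
const RECOMMENDED: &[&str] = &["user.Confidence"];
const OPTIONAL: &[&str] = &["user.Label"];
const ALL: &[&str] = &["user.Position", "user.Confidence", "user.Label"];

/// Relies on the default `all_components`.
pub struct DynamicPoints;

impl SketchArchetype for DynamicPoints {
    fn required_components() -> Names { Cow::Borrowed(REQUIRED) }
    fn recommended_components() -> Names { Cow::Borrowed(RECOMMENDED) }
    fn optional_components() -> Names { Cow::Borrowed(OPTIONAL) }
}

/// Overrides `all_components` because everything is known statically.
pub struct StaticPoints;

impl SketchArchetype for StaticPoints {
    fn required_components() -> Names { Cow::Borrowed(REQUIRED) }
    fn recommended_components() -> Names { Cow::Borrowed(RECOMMENDED) }
    fn optional_components() -> Names { Cow::Borrowed(OPTIONAL) }

    /// No allocation: the full list is a static slice.
    fn all_components() -> Names { Cow::Borrowed(ALL) }
}
```

Both versions return the same names. The override only changes whether the result is borrowed or owned, which is why it is an optimization rather than a behavioral change, and why the hand-maintained `ALL` list must be kept in sync with the other three.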

/// Given an iterator of Arrow arrays and their respective field metadata, deserializes them
/// into this archetype.
///
/// Panics on failure.
Member

I'm not sure we should have the panicking from_arrow - it seems like a footgun waiting to happen. One should never assume some arrow data follows the expected schema.

Contributor Author

rip
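For context on the footgun discussed in this thread, here is a minimal sketch of a fallible-only deserialization entry point that reports schema mismatches as errors instead of panicking. It does not use Arrow or the real `re_types` API: `SketchPoints`, `SketchDeserializationError` and `try_from_columns` are hypothetical, and named `Vec<f32>` columns stand in for Arrow arrays with field metadata.

```rust
/// Hypothetical error type describing schema mismatches.
#[derive(Debug, PartialEq)]
pub enum SketchDeserializationError {
    MissingColumn(&'static str),
    WrongLength { expected: usize, got: usize },
}

/// Hypothetical archetype made of two columns of equal length.
#[derive(Debug, PartialEq)]
pub struct SketchPoints {
    pub xs: Vec<f32>,
    pub ys: Vec<f32>,
}

impl SketchPoints {
    /// Never assumes the input follows the expected schema: every mismatch
    /// becomes an `Err` that the caller has to handle.
    pub fn try_from_columns(
        columns: &[(&str, Vec<f32>)],
    ) -> Result<Self, SketchDeserializationError> {
        // Look up a column by name, or report which one is missing.
        let find = |name: &'static str| {
            columns
                .iter()
                .find(|(n, _)| *n == name)
                .map(|(_, values)| values.clone())
                .ok_or(SketchDeserializationError::MissingColumn(name))
        };

        let xs = find("x")?;
        let ys = find("y")?;
        if xs.len() != ys.len() {
            return Err(SketchDeserializationError::WrongLength {
                expected: xs.len(),
                got: ys.len(),
            });
        }
        Ok(Self { xs, ys })
    }
}
```

A caller that really wants to panic can still write `.unwrap()` at the call site, which keeps that decision visible where it is made.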

@teh-cmc teh-cmc force-pushed the cmc/rust_as_components branch from 568f7a2 to 4d03cf4 Compare September 1, 2023 09:33
@teh-cmc teh-cmc merged commit 0a2258a into main Sep 1, 2023
@teh-cmc teh-cmc deleted the cmc/rust_as_components branch September 1, 2023 10:14
Development

Successfully merging this pull request may close these issues.

Archetypes should implement as_components - Rust

2 participants