Suppose you have an RGB image, and for each pixel you want to attach some custom data, such as:
- label: string
- confidence: f32
- …
How would you do it? I see three different ways of doing it:
A) Log a box per pixel
Log an extra Boxes2D entity, instanced so that you end up with one box per pixel. You can then attach custom components to that entity.
This is the ugliest solution, but also the only one that works reliably at the moment.
Pros:
- Works today
- Supports any and all component types (you can even attach an image to each pixel of an image, e.g. for light-field visualization)
Cons:
- Boxes3D is very slow (#10276)
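As a rough sketch of what "one box per pixel" means in practice, the snippet below builds the per-instance box geometry for a small image in plain Python. The helper name `pixel_boxes` is hypothetical, and placing box centers at half-integer coordinates is an assumption about the pixel convention. Handing the resulting lists to a Boxes2D entity together with custom components is deliberately left out, since that call depends on the SDK version in use.

```python
def pixel_boxes(width, height):
    """Return (centers, half_sizes) for one unit box per pixel, in row-major order.

    Assumption: pixel (x, y) covers [x, x+1) x [y, y+1), so its center is (x+0.5, y+0.5).
    """
    centers = [(x + 0.5, y + 0.5) for y in range(height) for x in range(width)]
    half_sizes = [(0.5, 0.5)] * len(centers)
    return centers, half_sizes


# Tiny 4x3 example; a real 640x480 image would produce 307200 boxes.
centers, half_sizes = pixel_boxes(4, 3)

# Custom per-pixel data lines up with the boxes by instance index.
confidence = [0.0] * len(centers)
labels = ["unknown"] * len(centers)

print(len(centers), centers[0], centers[-1])
```

Every custom component then needs exactly one value per box, which is what makes this approach general but heavy.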
B) Log an extra image per custom data
Log multiple images to sibling entities, one for each type of custom data.
For instance, you could log confidence as a Luminance image with datatype f32.
Pros:
- Works today
- You can visualize the confidence image with color mapping
Cons:
- Image is limited in the types it supports (each pixel can only be L, RGB, or RGBA, of u8, u16, f32 etc)
  - No support for strings (but can often be done with SegmentationImage)
  - No support for e.g. a jacobian per pixel
  - No support for custom components
- Has wrong semantic meaning (encoding "confidence" as "luminance" doesn't really make sense)
- Our support for stacking dozens of images on top of each other is poor (but can be fixed)
  - We don't show all of them when hovering
  - What image ends up on top is not well-defined
  - It's difficult to configure which image ends up on top
  - It's difficult to configure the transparency of each image
- Images are not really made to be vessels of custom data
  - We could use a Tensor instead, but it's not a lot better
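The string workaround mentioned above (encoding labels through a segmentation-style image) boils down to mapping each distinct label to an integer class id and keeping a lookup table alongside. The following is a minimal sketch in plain Python; `encode_labels` is a hypothetical helper, and how the id grid and lookup table are handed to the viewer is not shown here.

```python
def encode_labels(label_rows):
    """Map a 2D grid of string labels to integer class ids plus an id -> label table."""
    id_of = {}
    id_rows = []
    for row in label_rows:
        # A label seen for the first time gets the next free id.
        id_rows.append([id_of.setdefault(label, len(id_of)) for label in row])
    names = {class_id: label for label, class_id in id_of.items()}
    return id_rows, names


label_rows = [
    ["cat", "cat", "dog"],
    ["sky", "dog", "cat"],
]
id_rows, names = encode_labels(label_rows)
print(id_rows)
print(names)
```

This only works when the set of distinct labels is small enough to enumerate; free-form per-pixel strings do not fit this scheme well.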
C) Add support for "projecting" components onto pixels
The user would log a single 640x480 RGB image, together with 307200 values for a custom "confidence" component, 307200 strings with labels, etc. They would log these components to the same entity.
If the user hovers the pixel at (X=52, Y=23), we translate that to pixel number 52+23*640=14772, look up instance number 14772 of every component at this entity path, and show those values in the tooltip for the image.
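The lookup described here is a row-major index computation followed by one read per component. A minimal sketch, assuming components are held as flat per-instance lists keyed by name (the function names and the dictionary layout are illustrative, not the viewer's actual data model):

```python
def pixel_to_instance(x, y, width):
    """Row-major instance index of pixel (x, y) in an image that is `width` pixels wide."""
    return x + y * width


def components_at_pixel(x, y, width, height, components):
    """Collect the value of every per-pixel component at pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise IndexError(f"pixel ({x}, {y}) is outside a {width}x{height} image")
    index = pixel_to_instance(x, y, width)
    return {name: values[index] for name, values in components.items()}


WIDTH, HEIGHT = 640, 480
components = {
    "confidence": [0.0] * (WIDTH * HEIGHT),
    "label": ["unknown"] * (WIDTH * HEIGHT),
}
components["confidence"][14772] = 0.9
components["label"][14772] = "cat"

print(pixel_to_instance(52, 23, WIDTH))
print(components_at_pixel(52, 23, WIDTH, HEIGHT, components))
```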
Pros:
- Supports all custom components
- Easy to implement
Cons:
- Slightly hacky
- The selection panel would show a single image and 307200 floats, for instance
- What happens if the number of instances of a component is different from WxH?
- This would be slightly nicer if we supported nullable components, which we want to remove
  - NaN, etc is often good enough
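The open questions in this list can be made concrete with a small sketch. It assumes one possible policy that the proposal itself does not settle: a component is only projected onto pixels when its instance count is exactly WxH, and a NaN float is treated as a stand-in for "no value". Both helper names are hypothetical.

```python
import math


def projectable(components, width, height):
    """Keep only the components that have exactly one instance per pixel."""
    expected = width * height
    return {name: values for name, values in components.items() if len(values) == expected}


def tooltip_value(value):
    """Treat a NaN float as 'missing' (returned as None); pass everything else through."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


components = {
    "confidence": [0.5, float("nan"), 0.25, 1.0],  # 2x2 image, one value per pixel
    "camera_name": ["front"],                      # a single instance, not per-pixel
}
per_pixel = projectable(components, 2, 2)
print(sorted(per_pixel))
print([tooltip_value(v) for v in per_pixel["confidence"]])
```

Other policies are equally plausible, such as showing a warning on mismatch or splatting a single instance across all pixels.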
Related: