Performance of bin for many->many bin mappings #1846

@SimonHeybrock

We previously optimized cases like this using the SubbinSizes helper class. Here is an example where this is not enough:

image

Binning into 50x50 output bins takes about 2 seconds, even for virtually no events:

sc.bin(a, erase=['detector_id'], edges=[theta_bins, wavelength_bins])
  • Is there a way to ensure the optimization works better in this case?
  • Note there are about 200 million mappings between bins, so a performance hit is not unexpected.
  • This issue might become much less relevant once we add multi-threading for reduction operations in transform. I think this should be the first thing to investigate, since it would also fix several other performance bottlenecks.

Note: A faster solution (for few events) is:

sc.bin(d.bins.concatenate('detector_id'), edges=[theta_bins, wavelength_bins])
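
To illustrate why concatenating first can be faster when there are few events, here is a toy pure-Python model. It is only a sketch and not how scipp implements `bin`: the function names `bin_per_input` and `bin_concatenated`, and the "work" counters they return, are hypothetical and exist only for this illustration. The first function keeps a size entry for every (input bin, output bin) pair before reducing over the input bins, so its bookkeeping grows with the number of mappings regardless of how many events exist. The second flattens the input bins first, so its cost grows only with the number of events.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Return the index of the bin containing value, or None if out of range."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def bin_per_input(inputs, edges):
    """Toy many->many binning with one size entry per (input bin, output bin)."""
    n_out = len(edges) - 1
    # This table is the "mappings between bins": its size is independent
    # of the number of events.
    sizes = [[0] * n_out for _ in inputs]
    for i, events in enumerate(inputs):
        for x in events:
            j = bin_index(x, edges)
            if j is not None:
                sizes[i][j] += 1
    # Reduce over the input-bin dimension (analogous to erasing a dimension).
    counts = [sum(row[j] for row in sizes) for j in range(n_out)]
    work = len(inputs) * n_out  # hypothetical cost measure: number of mappings
    return counts, work


def bin_concatenated(inputs, edges):
    """Toy alternative: flatten all input bins first, then bin the events."""
    n_out = len(edges) - 1
    counts = [0] * n_out
    events = [x for evs in inputs for x in evs]
    for x in events:
        j = bin_index(x, edges)
        if j is not None:
            counts[j] += 1
    work = len(events)  # hypothetical cost measure: number of events
    return counts, work
```

Both functions return the same counts per output bin. In this model, the first does work proportional to (input bins) x (output bins), while the second does work proportional to the number of events, which is the regime where the workaround above pays off.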
