DICOM collector API #657

Merged
Enet4 merged 22 commits into master from new/object/collector on Jul 27, 2025
Enet4 (Owner) commented Jun 29, 2025

The DICOM collector API in DicomCollector reads DICOM objects in contiguous chunks. This lets consumers collect metadata earlier and reduces the memory footprint, since the full data set no longer has to be kept in memory.
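To make the idea of contiguous chunks concrete, here is a minimal sketch using only the standard library. It is not the dicom-rs API: `Element`, `ToyCollector`, `read_up_to`, and `demo` are hypothetical names invented for illustration, and the source is a plain iterator rather than a DICOM parser. It only shows the pattern of collecting everything before a stop tag (here pixel data, tag (7FE0,0010)) while leaving the rest unread.

```rust
/// A toy data element: (group, element) tag plus raw value bytes.
/// Hypothetical type, not part of dicom-rs.
#[derive(Debug, Clone, PartialEq)]
struct Element {
    tag: (u16, u16),
    value: Vec<u8>,
}

/// Hypothetical stand-in for a collector: it pulls elements from any
/// iterator and hands them out in contiguous chunks.
struct ToyCollector<I: Iterator<Item = Element>> {
    source: std::iter::Peekable<I>,
}

impl<I: Iterator<Item = Element>> ToyCollector<I> {
    fn new(source: I) -> Self {
        ToyCollector { source: source.peekable() }
    }

    /// Collect every element whose tag sorts before `stop_tag`,
    /// leaving the remainder of the source unread.
    fn read_up_to(&mut self, stop_tag: (u16, u16)) -> Vec<Element> {
        let mut chunk = Vec::new();
        while let Some(elem) = self.source.next_if(|e| e.tag < stop_tag) {
            chunk.push(elem);
        }
        chunk
    }
}

/// Reads the header chunk first, then whatever is left.
/// Returns (header element count, header value bytes, remaining element count).
fn demo() -> (usize, usize, usize) {
    let elements = vec![
        Element { tag: (0x0008, 0x0060), value: b"CT".to_vec() },
        Element { tag: (0x0010, 0x0010), value: b"Doe^John".to_vec() },
        Element { tag: (0x0028, 0x0010), value: vec![0x00, 0x02] },
        Element { tag: (0x7FE0, 0x0010), value: vec![0; 16] },
    ];
    let mut collector = ToyCollector::new(elements.into_iter());

    // First chunk: everything before pixel data.
    let header = collector.read_up_to((0x7FE0, 0x0010));
    let header_bytes: usize = header.iter().map(|e| e.value.len()).sum();

    // Second chunk: the rest, read only when the caller asks for it.
    let rest = collector.read_up_to((0xFFFF, 0xFFFF));
    (header.len(), header_bytes, rest.len())
}
```

In this toy model the caller can act on the three header elements before any pixel data is pulled from the source, which is the memory benefit described above.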

Summary

  • [parser] add LazyDataSetReader::peek and LazyDataSetReader::new_with_ts
  • [parser] change Debug impl for LazyDataToken so that it works for any data source
  • [object] add collector module and DicomCollector abstraction
    • allows users to fetch whole DICOM data elements in chunks, and also enables reading pixel data fragments one by one
    • pub use DicomCollector and DicomCollectorOptions at crate root
  • Adjust documentation a bit

Known caveats

  • While each pixel data fragment can be retrieved independently, the API does not provide a way to read portions of said fragments, which could be a problem in memory-tight platforms working with large fragments.
    • Each fragment has a theoretical size limit of 2^32-2 bytes (~4 GiB), though some other DICOM tools cannot even handle more than 2^31-1 bytes (~2 GiB).
  • It cannot read multiple chunks from the same nested data set (sequence). I suspect this should only become a problem in very, very large structured reports.
  • read_next_fragment lets consumers retrieve the basic offset table as plain bytes, even though the first fragment is reserved for it. Users who are interested in the offset table should call read_basic_offset_table before calling read_next_fragment, because the collector does not save it.
    • If this behavior becomes confusing to users, it might need to be reconsidered in some way.
  • The plan is to use this API on DICOM object implementations, but this work hasn't been done yet.
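The ordering caveat around the basic offset table can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the collector's implementation: `ToyFragments` and its helper functions are hypothetical, the two method names are borrowed from the description above only to mirror it, and returning `None` after the table has been consumed is a choice made for this toy (the real collector's behavior in that case is not stated here). The offsets are decoded as little-endian 32-bit unsigned integers.

```rust
use std::collections::VecDeque;

/// Hypothetical model of an encapsulated pixel data stream:
/// the first item is the basic offset table, the rest are fragments.
struct ToyFragments {
    items: VecDeque<Vec<u8>>,
    at_start: bool,
}

impl ToyFragments {
    fn new(items: Vec<Vec<u8>>) -> Self {
        ToyFragments { items: items.into(), at_start: true }
    }

    /// Decode the first item as little-endian u32 offsets.
    /// Only possible while the first item has not been consumed yet.
    fn read_basic_offset_table(&mut self) -> Option<Vec<u32>> {
        if !self.at_start {
            return None;
        }
        self.at_start = false;
        let raw = self.items.pop_front()?;
        Some(
            raw.chunks_exact(4)
                .map(|b| u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
                .collect(),
        )
    }

    /// Return the next item as raw bytes, whatever it is.
    /// Nothing is saved, so a consumed offset table is gone.
    fn read_next_fragment(&mut self) -> Option<Vec<u8>> {
        self.at_start = false;
        self.items.pop_front()
    }
}

/// Offset table with two entries, followed by two 4-byte fragments.
fn sample() -> Vec<Vec<u8>> {
    vec![
        vec![0, 0, 0, 0, 12, 0, 0, 0],
        vec![0xAA; 4],
        vec![0xBB; 4],
    ]
}

/// Recommended order: offset table first, then the fragments.
fn table_first() -> (Option<Vec<u32>>, usize) {
    let mut f = ToyFragments::new(sample());
    let table = f.read_basic_offset_table();
    let mut count = 0;
    while f.read_next_fragment().is_some() {
        count += 1;
    }
    (table, count)
}

/// Reversed order: the table comes back as an ordinary fragment
/// and can no longer be decoded afterwards.
fn fragments_first() -> (Option<Vec<u32>>, usize) {
    let mut f = ToyFragments::new(sample());
    let mut count = 0;
    while f.read_next_fragment().is_some() {
        count += 1;
    }
    (f.read_basic_offset_table(), count)
}
```

With the table read first, the toy yields the decoded offsets and two pixel fragments. With the order reversed, the consumer sees three opaque byte buffers and has to remember that the first one was the offset table.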

@Enet4 Enet4 added the labels A-lib (Area: library), C-object (Crate: dicom-object), C-parser (Crate: dicom-parser), and new (This provides a new, mostly independent feature) on Jun 29, 2025
@Enet4 Enet4 force-pushed the new/object/collector branch 3 times, most recently from 1a17b4b to b38f206 Compare June 29, 2025 21:57
@Enet4 Enet4 marked this pull request as ready for review July 16, 2025 14:54
@Enet4 Enet4 force-pushed the new/object/collector branch from 728dcb5 to 9026c9e Compare July 16, 2025 15:41
Enet4 added 10 commits July 18, 2025 21:23
- [parser] add LazyDataSetReader::peek and LazyDataSetReader::new_with_ts
- [parser] change Debug impl for LazyDataToken so that it works for any
  data source
- [object] add collector module and DicomCollector abstraction
   - allows users to fetch DICOM data in chunks
   - pub use DicomCollector and DicomCollectorOptions at crate root
- Adjust documentation
- add read_up_to_pixeldata,
  shortcut method
- add read_basic_offset_table
- add missing backtrace fields
- add documentation to `Error`
- format code
- make fields public
- document it
- with odd length strategy support
- make it a consuming builder type
- add support for odd_length strategy option
- support overriding the data dictionary
@Enet4 Enet4 force-pushed the new/object/collector branch from 9026c9e to 7dc935d Compare July 18, 2025 20:23
Enet4 added 8 commits July 18, 2025 21:45
- use `read_dataset_up_to_pixeldata` where possible
and remove unused variants
- so that their size is not too large
- impl Debug for CollectionSource
- add "source" field to DicomCollector Debug impl
- move transfer syntax UID resolution to inside CollectionSource
- remove type parameter 't
- add ts_hint through option method `expected_ts` and `unset_expected_ts`
- remove `open_file_with_ts` and `from_reader_with_ts`
- type parameters less likely to change should appear last
- provide type parameter defaults for DicomCollector, simplify impls accordingly
- move all examples and extended explanation to module-level documentation
@Enet4 Enet4 force-pushed the new/object/collector branch from 8fae262 to 3efc39f Compare July 26, 2025 15:01
Enet4 added 3 commits July 26, 2025 16:11
- only available in 1.83.0,
  whereas MSRV is still 1.72.0
- add method `DicomCollector::take_file_meta`
- add samples per pixel in sample data
@Enet4 Enet4 merged commit f274e61 into master Jul 27, 2025
5 checks passed