Skip to content

Add option to override per-VR character set usage#676

Merged
Enet4 merged 3 commits intomasterfrom
imp/parser/extend-charset-cs
Sep 25, 2025
Merged

Add option to override per-VR character set usage#676
Enet4 merged 3 commits intomasterfrom
imp/parser/extend-charset-cs

Conversation

@Enet4
Copy link
Copy Markdown
Owner

@Enet4 Enet4 commented Aug 21, 2025

Should resolve #675 by offering a way to ignore the fact that some VRs should always use the default character repertoire.

Summary

  • [parser] Add support for character set override logic
  • [object] Extend charset_override option to collector and in-mem DICOM object

@Enet4 Enet4 added enhancement A-lib Area: library C-object Crate: dicom-object C-parser Crate: dicom-parser labels Aug 21, 2025
@Enet4 Enet4 mentioned this pull request Aug 21, 2025
Copy link
Copy Markdown
Contributor

@naterichman naterichman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I'm missing something, but figured I would hop on a review in case it helped!

@momostarsky
Copy link
Copy Markdown

@Enet4 ​​Hi, why hasn't this PR been merged yet? Is there anything blocking it? Does it need more testing?​

@Enet4
Copy link
Copy Markdown
Owner Author

Enet4 commented Sep 22, 2025

I just didn't get to it yet. Let me assess and revise on Nathan's comment first, hopefully within the next few days.

- CharacterSetOverride allows one to indicate that
  the more extended character repertoire available
  should be used for decoding other VRs such as CS.
… object

- re-export dicom_parser options accordingly
@Enet4 Enet4 force-pushed the imp/parser/extend-charset-cs branch from 3941f03 to 6231ccf Compare September 24, 2025 07:59
- Include UR in VRs which should use the default character set
@Enet4
Copy link
Copy Markdown
Owner Author

Enet4 commented Sep 24, 2025

I rebased the branch and added UR as one of the VRs to use the default character set by default. Things seem to be in order. @momostarsky I noticed a deleted post here, are there still any issues?

@momostarsky
Copy link
Copy Markdown

@Enet4 It might be an issue with testing or configuration. I'll verify later using storescu and storescp to check if it affects read/write operations.

@momostarsky
Copy link
Copy Markdown

@Enet4 Building on the earlier tests, Test some DICOM files with character sets such as "ISO 192", "ISO 100", "GB18030", and "ISO 2022 IR 58". Modify the storescp implementation in store_sync.rs around line 192, add code similar to the following:
'''
match dicom_object::OpenFileOptions::new()
.charset_override(CharacterSetOverride::AnyVr)
.read_until(tags::PIXEL_DATA)
.open_file(&file_path)
{
Ok(dcm_obj) => {
match dcm_obj.element(tags::SPECIFIC_CHARACTER_SET) {
Ok(elm) => {
println!(
"SPECIFIC_CHARACTER_SET: {:?}",
elm.value().to_str()
);
}
Err(_) => {
warn!("Specific Character Set tag not found in the DICOM file");
}
};

                                    match dcm_obj.element(tags::BODY_PART_EXAMINED) {
                                        Ok(elm) => {
                                            println!(
                                                "BODY_PART_EXAMINED: {:?}",
                                                elm.value().to_str()
                                            );
                                        }
                                        Err(_) => {
                                            warn!("Body Part Examined tag not found in the DICOM file");
                                        }
                                    };
                                }
                                Err(e) => {
                                    warn!("Failed to read back stored file: {}", e);
                                }
                            }

'''
no abnormalities were found. Should we test with more datasets?

@Enet4
Copy link
Copy Markdown
Owner Author

Enet4 commented Sep 25, 2025

That sounds to offer good coverage already, thanks! If in the meantime you found what could be considered a bug, please file a new issue.

@Enet4 Enet4 merged commit 76e4e5a into master Sep 25, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-lib Area: library C-object Crate: dicom-object C-parser Crate: dicom-parser enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Charset Convert

3 participants