Skip to content

[core] Make PersonName hold both borrowed and owned strings#635

Merged
Enet4 merged 1 commit intoEnet4:masterfrom
AlexTMjugador:feat/owned-pn-components
Jul 27, 2025
Merged

[core] Make PersonName hold both borrowed and owned strings#635
Enet4 merged 1 commit intoEnet4:masterfrom
AlexTMjugador:feat/owned-pn-components

Conversation

@AlexTMjugador
Copy link
Copy Markdown
Contributor

At the moment, PersonName cannot be ergonomically used by application code as an structure to group and pass around owned person name components that are meant to conform to DICOM's PN value semantics. It is, for example, difficult to come up with code for deserializing a PersonName if strings cannot be borrowed from the data source, which is a common ocurrence when the source is e.g. a stream that reads data chunks to temporary buffers.

To improve on that, let's make PersonName have Cow<'a, str> fields instead, and update PersonNameBuilder to accept them. Cows are smart pointers that can hold both borrowed and owned data, so while &'a str can be wrapped in a Cow<'a, str> at virtually no cost, an owned String may also be wrapped with the same type but an indefinitely long lifetime, Cow<'static, str>. Therefore, code that borrows name components can still use PersonNames as before, and new code that relies on PersonNames owning their components can now easily work by just using a 'static lifetime bound.

These changes are technically semver breaking due to the removal of the Copy trait on PersonName, and minor changes in the semantics of the PersonNameBuilder::build method.

@Enet4 Enet4 added A-lib Area: library C-core Crate: dicom-core labels Feb 20, 2025
@Enet4
Copy link
Copy Markdown
Owner

Enet4 commented Feb 20, 2025

Hello @AlexTMjugador! That is an interesting idea, but I wonder whether we can introduce this without a semver breaking change. Extending the owned version of a person name into a dedicated type (OwnedPersonName) would be feasible and have the benefit of being as efficient as it can be in the case where no additional ownership is necessary. Admittedly, it implies replicating more code, such as having two person builders.

  • New structs OwnedPersonName and OwnedPersonBuilder, would use String for every field;
  • One could trivially obtain a PersonName<'a> from an OwnedPersonName with a method call;
  • Parsing a person name would still result in a borrowed PersonName, but would have a ToOwned implementation into a OwnedPersonName.

Let me know your thoughts on this.

@AlexTMjugador
Copy link
Copy Markdown
Contributor Author

AlexTMjugador commented Feb 21, 2025

Hello!

I believe we could introduce the variation you proposed if our primary goal here is to avoid any semver-breaking changes. However, the current approach in this PR has several characteristics that, in my view, may justify keeping it as is:

  • As you stated, having a single PersonName struct that supports both use cases aligns better with the DRY principle, improving maintainability and making it easier to achieve higher test coverage.
  • While PersonName is a useful struct for specific situations, it does not appear to be widely used in open-source code I could find. Moreover, the breaking changes introduced by this PR are relatively minor and easy to migrate to if needed, so developer churn should remain negligible.
  • Though admittedly a niche use case, a single struct with Cows is the only way to support person names where some component strings are owned and others are borrowed. For example, a separate OwnedPersonName struct would not allow the following clone-free code, which this PR enables:
let mut name_builder = PersonName::builder();
name_builder.with_given("John"); // Borrowed from constant binary data
name_builder.with_family(String::from("Doe")); // Owned, may come from a buffer
let dicom_pn = name_builder.build().to_dicom_string(); // Doe^John

Ultimately, I'd understand if you decide this approach is not worth the technicality of being backwards incompatible, so please let me know how you'd like to proceed. For what it’s worth, in my experience as a library user, the PersonName implementation in this PR feels quite ergonomic, as it's easy and efficient to use whatever string variant comes from other parts of an application, and the additional generics usually don't need to be written or even deeply understood to be effective.

@Enet4 Enet4 added the breaking change Hint that this may require a major version bump on release label Feb 21, 2025
@Enet4
Copy link
Copy Markdown
Owner

Enet4 commented Feb 21, 2025

Fair enough, this can added to the 0.9 milestone.

@Enet4 Enet4 added this to the DICOM-rs 0.9 milestone Feb 21, 2025
At the moment, `PersonName` cannot be ergonomically used by application
code as an structure to group and pass around owned person name
components that are meant to conform to DICOM's PN value semantics.
It is, for example, difficult to come up with code for deserializing a
`PersonName` if strings cannot be borrowed from the data source,
which is a common ocurrence when the source is e.g. a stream that reads
data chunks to temporary buffers.

To improve on that, let's make `PersonName` have `Cow<'a, str>` fields
instead, and update `PersonNameBuilder` to accept them. `Cow`s are smart
pointers that can hold both borrowed and owned data, so while `&'a str`
can be wrapped in a `Cow<'a, str>` at virtually no cost, an owned
`String` may also be wrapped with the same type but an indefinitely long
lifetime, `Cow<'static, str>`. Therefore, code that borrows name
components can still use `PersonName`s as before, and new code that
relies on `PersonName`s owning their components can now easily work by
just using a `'static` lifetime bound.
Copy link
Copy Markdown
Owner

@Enet4 Enet4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just clarifying that by me this is good and ready (thanks!), I am just holding it here until it's time to start merging all breaking changes for DICOM-rs 0.9.0.

@AlexTMjugador
Copy link
Copy Markdown
Contributor Author

Thanks for the reminder, @Enet4! I just wanted to quietly rebase this PR onto the latest master commits to keep it mergeable and not too distant from the mainline :)

@Enet4 Enet4 merged commit 124b464 into Enet4:master Jul 27, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-lib Area: library breaking change Hint that this may require a major version bump on release C-core Crate: dicom-core

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants