Add std::any::Provider support to Serializer #2420
|
This would be extremely useful for things like serde_json::RawValue and similar! |
|
Although it would need to apply to Deserialize as well. |
|
I can certainly add it for Deserialize, too! |
|
FWIW I'm just a drive-by reader, not a maintainer of Serde, although I could very much make use of such a feature if it ever lands, to avoid hacks for communication between the deserializer and a custom type in serde-wasm-bindgen. |
|
I don't know about the timeline on std::any::Provider stabilization, but if/when it lands this would indeed be a great solution to the serializer-specific specialization problem.
While it is valuable for everyone who needs it, it also adds a just-for-convenience method to an already extremely big trait. Afaict even the |
🥳 Yeah, I'm aware that
I suppose there's a balance here between smaller API surface and more documentation burden.
I'll be glad to take that stuff out and update the documentation. |
Turn a `Serializer` and a `Deserializer` into a [`std::any::Provider`]
in a backward compatible way.
This allows specialization between a `Serializer` and a `Serialize`, and
between a `Deserializer` and a `Deserialize`.
The one particular use case I have in mind is for [`serde_dynamo`].
DynamoDB supports lists of generic items – which is what `serde_dynamo`
currently uses to serialize e.g. a `Vec<String>`. However, DynamoDB also
supports specialized types like [String Set] that it would be nice to be
able to opt in to.
Using [`std::any::Provider`], this could look something like this:
/// A specialized serializer
///
/// This would most likely be used with `#[serde(with = "string_set")]`
struct StringSet<T>(T);
impl<T> Serialize for StringSet<T> where T: Serialize {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        if let Some(string_set_serializer) = std::any::request_value::<serde_dynamo::StringSetSerializer>(&serializer.provider()) {
            self.0.serialize(string_set_serializer)
        } else {
            self.0.serialize(serializer)
        }
    }
}
/// In user code, a struct that wants `strings` to be serialized
/// using a DynamoDb string set *if* the serializer is
/// `serde_dynamo`
#[derive(Debug, Serialize, Deserialize)]
struct Subject {
    strings: StringSet<Vec<String>>,
}
With this set up, `Subject.strings` would be serialized as normal using
`serde_json`
{
    "strings": ["one", "two"]
}
but when using `serde_dynamo`, `Subject.strings` would use the specific
`StringSetSerializer` to get custom behavior.
This is a very specific example where `serde_dynamo` would likely
provide both the `StringSet` wrapper and the `StringSetSerializer`.
There *may* also be situations where the ecosystem would develop common
practices. One example is the `Serializer::is_human_readable` function.
If this had existed before that was introduced, it's *possible* that
serializers could have provided `struct IsHumanReadable(bool)` and
`Serialize` implementations could have used that to determine how to
display timestamps.
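To make the `IsHumanReadable` idea concrete, here is a minimal standalone sketch. Because `std::any::Provider` is unstable, it does not use the real API: the `Provide` trait, the `request_value` helper, and the three toy formats are all hypothetical stand-ins built on `std::any::Any`, and they only model the "ask for a value by type, fall back if it is absent" pattern.

```rust
use std::any::Any;

/// Hypothetical capability a format might advertise (name taken from the text above).
#[derive(Debug, Clone, Copy, PartialEq)]
struct IsHumanReadable(bool);

/// Stand-in for a provider (NOT the std trait): exposes optional, type-erased values.
trait Provide {
    fn provided(&self) -> Vec<Box<dyn Any>>;
}

/// Stand-in for `request_value`: returns the first provided value of type `T`, if any.
fn request_value<T: 'static>(provider: &dyn Provide) -> Option<T> {
    provider
        .provided()
        .into_iter()
        .find_map(|value| value.downcast::<T>().ok().map(|boxed| *boxed))
}

/// Toy formats standing in for serializers (all hypothetical).
struct TextFormat;
struct BinaryFormat;
struct LegacyFormat;

impl Provide for TextFormat {
    fn provided(&self) -> Vec<Box<dyn Any>> {
        vec![Box::new(IsHumanReadable(true)) as Box<dyn Any>]
    }
}

impl Provide for BinaryFormat {
    fn provided(&self) -> Vec<Box<dyn Any>> {
        vec![Box::new(IsHumanReadable(false)) as Box<dyn Any>]
    }
}

impl Provide for LegacyFormat {
    // A format written before the capability existed provides nothing.
    fn provided(&self) -> Vec<Box<dyn Any>> {
        Vec::new()
    }
}

/// Picks a timestamp representation; falls back to the readable form
/// when the format provides no `IsHumanReadable` value.
fn format_timestamp(secs: u64, format: &dyn Provide) -> String {
    match request_value::<IsHumanReadable>(format) {
        Some(IsHumanReadable(false)) => secs.to_string(),
        _ => format!("{} seconds since the epoch", secs),
    }
}

fn main() {
    println!("{}", format_timestamp(60, &TextFormat));
    println!("{}", format_timestamp(60, &BinaryFormat));
    println!("{}", format_timestamp(60, &LegacyFormat));
}
```

The fallback arm is the part that keeps the scheme backward compatible: a format that knows nothing about the requested type still works, it just gets the default behavior.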
Links regarding [`std::any::Provider`]:
* `std::any::Provider` RFC: https://rust-lang.github.io/rfcs/3192-dyno.html
* `std::any::Provider` tracking issue: rust-lang/rust#96024
[`std::any::Provider`]: https://doc.rust-lang.org/nightly/core/any/trait.Provider.html
[`serde_dynamo`]: https://docs.rs/serde_dynamo
[String Set]: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.NamingRulesDataTypes.html
|
This would be very useful for deserialization to add an ability to provide spans of deserialized objects to the I was able to implement such a concept by extending
trait PositionProvider {
    type Position;
    fn current_position(&self) -> Self::Position;
}
trait Deserializer<'de> {
    type PositionProvider: PositionProvider;
    fn position_provider(&self) -> Self::PositionProvider;
    ...
}
// impl
impl<'de, 'a> Deserializer<'de> for &'a mut MyDeserializer {
    type PositionProvider = MyPositionProvider;
    fn position_provider(&self) -> Self::PositionProvider {
        // `self` is `&&mut MyDeserializer` here, so reborrow before taking the raw pointer
        MyPositionProvider(&**self as *const MyDeserializer)
    }
}
// Cannot use a reference here because of the borrow checker
// and the fact that the deserializer is consumed during deserialization
struct MyPositionProvider(*const MyDeserializer);
impl PositionProvider for MyPositionProvider {
    type Position = usize;
    fn current_position(&self) -> usize {
        // Relies on the deserializer outliving this provider
        let de: &MyDeserializer = unsafe { &*self.0 };
        de.current_position()
    }
}
Then any type can be wrapped into
struct Spanned<T> {
    value: T,
    span: Range<usize>,
}
impl<'de, T> Deserialize<'de> for Spanned<T>
where
    T: Deserialize<'de>,
{
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        let provider = deserializer.position_provider();
        let start = provider.current_position();
        let value = T::deserialize(deserializer)?;
        let end = provider.current_position();
        Ok(Self { value, span: (start..end).into() })
    }
}
This trick only works for deserializers that implement |
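For comparison, the same "position handle that outlives the borrow" idea can be sketched without a raw pointer by sharing the offset through `Rc<Cell<usize>>`. This is only an illustration under assumptions: `TokenReader`, `PositionHandle`, and `read_spanned` are hypothetical names, and the toy reader splits on whitespace rather than implementing Serde's `Deserializer`.

```rust
use std::cell::Cell;
use std::ops::Range;
use std::rc::Rc;

/// Toy reader over whitespace-separated tokens (hypothetical, not a Serde type).
struct TokenReader<'a> {
    input: &'a str,
    position: Rc<Cell<usize>>,
}

/// Owned handle to the shared offset; it does not borrow the reader.
struct PositionHandle(Rc<Cell<usize>>);

impl PositionHandle {
    fn current_position(&self) -> usize {
        self.0.get()
    }
}

impl<'a> TokenReader<'a> {
    fn new(input: &'a str) -> Self {
        TokenReader { input, position: Rc::new(Cell::new(0)) }
    }

    fn position_provider(&self) -> PositionHandle {
        PositionHandle(Rc::clone(&self.position))
    }

    /// Skips leading whitespace, returns the next token, and advances the shared offset.
    fn next_token(&mut self) -> Option<&'a str> {
        let input: &'a str = self.input;
        let trimmed = input[self.position.get()..].trim_start();
        if trimmed.is_empty() {
            return None;
        }
        let start = input.len() - trimmed.len();
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.position.set(start + len);
        Some(&input[start..start + len])
    }
}

#[derive(Debug)]
struct Spanned<T> {
    value: T,
    span: Range<usize>,
}

/// Records the offset before and after reading one token. The span starts where
/// the reader stood, so it includes any whitespace skipped before the token.
fn read_spanned(reader: &mut TokenReader<'_>) -> Option<Spanned<String>> {
    let provider = reader.position_provider();
    let start = provider.current_position();
    let value = reader.next_token()?.to_owned();
    let end = provider.current_position();
    Some(Spanned { value, span: start..end })
}

fn main() {
    let mut reader = TokenReader::new("ab cd");
    while let Some(item) = read_spanned(&mut reader) {
        println!("{:?}", item);
    }
}
```

The trade-off is a reference count and a `Cell` inside the reader in exchange for dropping the `unsafe` block; whether that is acceptable for a real deserializer is a separate question.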
|
@Mingun You could just make |
|
Looks like the |
|
Actually, the |
Turn a `Serializer` into a `std::any::Provider` in a backward compatible way.
This allows specialization between a `Serializer` and a `Serialize` in a generic way.
The one particular use case I have in mind is for `serde_dynamo`. DynamoDB supports lists of generic items – which is what `serde_dynamo` currently uses to serialize e.g. a `Vec<String>`. However, DynamoDB also supports specialized types like String Set that it would be nice to be able to opt in to.
Using `std::any::Provider`, this could look something like this:
With this set up, `Subject.strings` would be serialized as normal using `serde_json`
{ "strings": ["one", "two"] }
but when using `serde_dynamo`, `Subject.strings` would use the specific `StringSetSerializer` to get custom behavior.
This is a very specific example where `serde_dynamo` would likely provide both the `StringSet` wrapper and the `StringSetSerializer`.
There may also be situations where the ecosystem would develop common practices. One example is the `Serializer::is_human_readable` function. If this had existed before that was introduced, it's possible that serializers could have provided `struct IsHumanReadable(bool)` and `Serialize` implementations could have used that to determine how to display timestamps. It's possible that this alleviates some of the need for #455, but I haven't looked at every use case there.
Links regarding `std::any::Provider`:
* `std::any::Provider` RFC: https://rust-lang.github.io/rfcs/3192-dyno.html
* `std::any::Provider` tracking issue: Tracking Issue for Provider API rust-lang/rust#96024
Implementation notes:
* ~~I opted for getting this out for discussion and review quickly, which means I didn't spend too much time on documentation and zero time on tests. If this is an approach the team is interested in, I'll be happy to revisit those.~~ I've added doc tests to `provide` and `provider`.
* ~~There doesn't necessarily need to be a `Serializer::provider` convenience method. We could use documentation to push implementors to use `SerializerProvider::wrap(&serializer)` when they need a provider.~~ Based on doc tests and some experimenting, I think having the default `provider` method is valuable.
* ~~This is most useful to my use case on `Serializer`, but I wonder if it also would be useful on the `Serialize*` sub-traits of `Serializer`, and/or on `Deserialize` traits.~~ I've added this to `Deserialize` as well.