Properly use xsi:nil to deserialize null values via serde#842
Properly use xsi:nil to deserialize null values via serde#842weiznich wants to merge 1 commit intotafia:masterfrom
xsi:nil to deserialize null values via serde#842Conversation
|
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #842 +/- ##
==========================================
+ Coverage 60.21% 60.73% +0.51%
==========================================
Files 41 41
Lines 16021 16013 -8
==========================================
+ Hits 9647 9725 +78
+ Misses 6374 6288 -86
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Mingun
left a comment
There was a problem hiding this comment.
Good work, and it's nice to see such a person in committers. I think, that we should try to improve it to cover all possible scenarios, but if that would be too many work (because maybe we need to switch to namespace-aware parser), the current implementation could be merged as the first step.
In any case, we need a changelog entry, and in case of using only xsi prefix this should be explicitly noted in the documentation somewhere here.
If you're not prepared to dive so deep to explore the things required, I can finish this PR myself, but that can take some time.
src/events/attributes.rs
Outdated
|
|
||
| pub(crate) fn is_xsi_nil(&self) -> bool { | ||
| self.clone().any(|attr| { | ||
| matches!(attr, Ok(crate::events::attributes::Attribute{ | ||
| key: QName(b"xsi:nil"), value | ||
| }) if &*value == b"true") | ||
| }) | ||
| } |
There was a problem hiding this comment.
While the xsi prefix usually bound to http://www.w3.org/2001/XMLSchema-instance namespace, this is not necessary true. I would like to resolve actual namespace and check against it.
Also, need to check is the true the only thing that represents truth value.
I think, it would be better to place such method to the BytesStart, because then we can avoid cloning and the name would better fit its purpose (currently it actually checks has_xsi_nil).
There was a problem hiding this comment.
While the xsi prefix usually bound to http://www.w3.org/2001/XMLSchema-instance namespace, this is not necessary true. I would like to resolve actual namespace and check against it.
I will try to have a look at that, if the idea I have works it shouldn't be too hard. Just to verify that I have the right understanding how this namespace thing works:
- I need to collect all
xmlns:xyzattributes from any parent tag - I then need to check if one or more of these attributes is set to
http://www.w3.org/2001/XMLSchema-instance - That then will me give the namespace name, so
xyzin the above examle - Finally I can use that name to check if an
nilattribute exists and then use that in the check
Also, need to check is the true the only thing that represents truth value.
Unless I miss something this already checks for the true value via a guard expression (line 242)
I think, it would be better to place such method to the BytesStart, because then we can avoid cloning and the name would better fit its purpose (currently it actually checks has_xsi_nil).
I'm not sure I can follow here. Attributes just borrows the buffer from BytesStart so no cloning should be happening here.
There was a problem hiding this comment.
- I need to collect all
xmlns:xyzattributes from any parent tag
Basically we need to use NsReader instead of a Reader in Deserializer, so here:
Lines 3080 to 3084 in 8f91a9c
...and here:
Lines 3149 to 3152 in 8f91a9c
Then make any necessary changes to make code compilable and then you may use resolve methods to resolve prefix. And it seems to me that that refactoring may be not so trivial.
Unless I miss something this already checks for the true value via a guard expression (line 242)
I mean, maybe not only the string "true" represents the true value. For example, XML Schema defines "true" and "1" as true and "false" and "0" as false. It seems that only "true" should be used, as stated here, but I did not check all occurrences of the nil word in the specification. The XML spec is a hard thing to read, so we must be carefull. I would like also have link to the specification (maybe the mentioned link is the right link) near to the check so everyone could verify that check is correct without much efforts.
I'm not sure I can follow here.
Attributesjust borrows the buffer fromBytesStartso no cloning should be happening here.
Yes, I think, actually no cloning is made here but that not obvious from the code. You need to look at the Attributes definition to be sure. As this method used only in start.attributes().is_xsi_nil() chain, it makes sence to move it to the BytesStart for clarity and call it as start.is_xsi_nil().
There was a problem hiding this comment.
Then make any necessary changes to make code compilable and then you may use resolve methods to resolve prefix. And it seems to me that that refactoring may be not so trivial.
Maybe I miss something obvious but that wasn't complicated. In fact it literally required changing these two types + adjusting my new added code to get access to the resolve methods.
I mean, maybe not only the string "true" represents the true value. For example, XML Schema defines "true" and "1" as true and "false" and "0" as false. It seems that only "true" should be used, as stated here, but I did not check all occurrences of the nil word in the specification. The XML spec is a hard thing to read, so we must be carefull. I would like also have link to the specification (maybe the mentioned link is the right link) near to the check so everyone could verify that check is correct without much efforts.
Done
Yes, I think, actually no cloning is made here but that not obvious from the code. You need to look at the Attributes definition to be sure. As this method used only in start.attributes().is_xsi_nil() chain, it makes sence to move it to the BytesStart for clarity and call it as start.is_xsi_nil().
Done
src/de/mod.rs
Outdated
| } | ||
|
|
||
| #[test] | ||
| fn deserialize_nil() { |
There was a problem hiding this comment.
I prefer to group tests using modules. This have some advantages, for example, code folding in the editor. In that case it seems reasonable to split this test to several in the nil module:
mod nil {
// <foo xsi:nil="true"/>
// <foo xsi:nil="true" attr="value"/>
// <foo xsi:nil="true"><tag/></foo>
fn true_() {}
// <foo xsi:nil="false"/>
// <foo xsi:nil="false" attr="value"/>
// <foo xsi:nil="false"><tag/></foo>
fn false_() {}
}
// and the same for the case of nested element with nilThose tests are better to place to the serde-de tests to reduce overall time of compiling library (which already not so good), because they do not use internal details.
There was a problem hiding this comment.
Moved the tests + added some more variants. What's still not tested is using a different namespace prefix, but I can also add tests for that if you want.
src/de/mod.rs
Outdated
| let data = r#"<foo xsi:nil="true"/>"#; | ||
|
|
||
| let res = super::from_str::<Option<Foo>>(data).unwrap(); | ||
| assert!(res.is_none()); |
There was a problem hiding this comment.
I prefer to use assert_eq checks where possible because it will output diff in case of failed check (project uses pretty_assertions crate for diffs).
8df645b to
fa3df2a
Compare
This commit fixes an issue that causes `quick-xml` trying to deserialize
empty tags via the serde interface even if these tags were explicitly
marked as `xsi:nil="true"`
For example the following XML failed to deserialize before this commit:
```xml
<bar>
<foo xsi:nil="true"/>
</bar>
```
into the following rust type:
```rust
#[derive(Deserialize)]
struct Bar {
foo: Option<Inner>,
}
#[derive(Deserialize)]
struct Foo {
baz: String,
}
```
Before this commit this failed to deserialize with an error message that
complained that the `baz` field was missing. After this commit this uses
the `xsi:nil` attribute to deserialize this into `foo: None` instead.
The standard (https://www.w3.org/TR/xmlschema-1/#xsi_nil) seems to
support this behaviour.
Fix tafia#497
fa3df2a to
ad89a00
Compare
|
Just want to inform you, that this PR is not abandoned, I working on improving testcases to ensure that everything is consistent. |
| #[derive(Debug, PartialEq, serde::Deserialize)] | ||
| struct Helper { | ||
| #[serde(rename = "@attr")] | ||
| attr: String, | ||
| #[serde(rename = "$value")] | ||
| inner: Option<Foo>, | ||
| } |
There was a problem hiding this comment.
Unfortunately, this is incorrect representation for <foo attr="..." xsi:nil="..."/>. Currently, you even cannot use $value name for annotating fields with non-enum type, the serialization of such fields would fail. So currently I searching a correct way to represent such XML by the Rust type.
Lines 351 to 360 in 8f91a9c
|
I think, that the correct handling of <!-- XML 1 -->
<any-tag xsi:nil="true" attr="value">
<element1>value 1</element1>
<element2>value 2</element2>
</any-tag><!-- XML 2 -->
<any-tag xsi:nil="false" attr="value">
<element1>value 1</element1>
<element2>value 2</element2>
<element3>value 3</element3>
</any-tag>should be mapped to #[derive(Deserialize)]
struct AnyName {
#[serde(rename = "@attr")]
attr: String,
element1: String,
element2: Option<String>,
element3: Option<String>,
element4: Option<String>,
}
// XML 1
let xml1 = AnyName {
// xsi:nil does not influence to attributes, only to nested elements
attr: "value",
// Non-Option fields deserialized as usual
element1: "value 1",
// Note: even while element is presented in XML,
// it is not deserialized because of xsi:nil="true".
// Without it it would be `Some("value 2")`
element2: None,
element3: None,
element4: None,
};
// XML 2
let xml2 = AnyName {
attr: "value",
element1: "value 1",
element2: Some("value 2"),
element3: Some("value 3"),
element4: None,
}; |
|
@weiznich, I plan to finish this PR myself, because I already done some prepare work of writing tests thats cover all possible (I hope) variants. So nothing from you is required here, but of course, you can give feedback, is that solution would fit your needs? I hope I can finish PR in next week. |
|
I'm happy with whatever works best for you. If you want to address things on your own that's fine for me, it's also fine to spend a bit more work on my side to implement this as long as I known what to do and what are the expectations. |
|
Because I cannot push my changes here, I've opened #850 |
This commit fixes an issue that causes
quick-xmltrying to deserialize empty tags via the serde interface even if these tags were explicitly marked asxsi:nil="true"For example the following XML failed to deserialize before this commit:
into the following rust type:
Before this commit this failed to deserialize with an error message that complained that the
bazfield was missing. After this commit this uses thexsi:nilattribute to deserialize this intofoo: Noneinstead. The standard (https://www.w3.org/TR/xmlschema-1/#xsi_nil) seems to support this behaviour.Fix #497