DataSetWriterOptions - handling of explicit length Sequences and Items#645
DataSetWriterOptions - handling of explicit length Sequences and Items#645Enet4 merged 4 commits intoEnet4:masterfrom
Conversation
|
Let me just throw some thoughts here. To implement the recalculation strategy, I think there are 2 ways to approach this:
encoder is strongly coupled with write target using the This makes it impossible for the If you would be willing to approve the recalculation strategy, how would you go about changing the current abstractions ? Could we just make the Thank you |
|
You are right, there is a rather tight connection between the encoder and the write destination at the moment. This is also the case for decoders, with its own set of problems alongside it. Performance-wise this has been mitigated through a I wouldn't mind relaxing this constraint (as in changing the encoder-writer type handshake), but I suspect this would come with substantial rewriting all the way down from the
impl<W, E, T> StatefulEncoder<W, E, T> {
pub fn for_writer<U>(&self, to: U) -> StatefulEncoder<U, E, T>
where
U: Write,
E: Clone,
T: Clone,
{
StatefulEncoder {
to,
encoder: self.encoder.clone(),
text: self.text.clone(),
bytes_written: self.bytes_written,
buffer: self.buffer.clone(),
}
}
}Let me know if you will want to reiterate on this PR of you'd like it to be admitted as is. |
There was a problem hiding this comment.
I had overlooked this before, but the changes to the constructor functions in DataSetWriter constitute breaking changes. This isn't a problem, though it will require me to postpone merging this until I start collecting everything for v0.9, unless the breaking changes are reverted in the meantime.
40cce73 to
7fc431f
Compare
- revert `new`, add `new_with_options` - revert `new_with_codec`, add `new_with_codec_options` - add simple documentation
0fcf0db to
50919fd
Compare
Enet4
left a comment
There was a problem hiding this comment.
Should be good now. Thank you again! 👍
This adds
DatasetWriterOptionsto theparsercrate and should handle #636 and similar situations.The options now contain an
ExplicitLengthSqItemStrategy.ExplicitLengthSqItemStrategy::NoChange- is the previous behavior, where explicit length items and sequences were left with their length unchanged, which in case of modifying the sequences or items content could lead to incorrect explicit length.As this situation would occur on a regular basis, and the resulting errors are hard to debug, this is no more the default strategy and risks resulting from it's use are documented.
ExplicitLengthSqItemStrategy::SetUndefinedis the new default strategy. It simply writes all Items and Sequences withLength::UNDEFINEDirrespective of the originalLengthenum variant.Upon inspection of the writer byte output, this will appear counterintuitive to the user.
This is documented.
The final strategy - recalculating the lengths - is not implemented.
Reasons are: