MsgPackReader: properly increment index in ext#370
MsgPackReader: properly increment index in ext#370htmldoug merged 3 commits intocom-lihaoyi:masterfrom
ext#370Conversation
The `parseExt` helper in `MsgPackReader` was not properly updating the position in the data source. Also override `Ext.equals` to compare the data arrays by contents instead of identity (which is consistent with the other subclasses of `Msg`). Added two regression tests. Note that these don't work without the `equals` changes, since structurally identical `Ext` messages won't compare equal unless they also happen to be the same object. Fixes com-lihaoyi#369
| } | ||
|
|
||
| override def hashCode: Int = | ||
| MurmurHash3.bytesHash(b, "Ext".hashCode) |
There was a problem hiding this comment.
tag should be represented in the hash somehow. We could probably just use it as the seed.
There was a problem hiding this comment.
The more I think about this, the less comfortable I am about updating equals and hashCode here. That could lead to some end-user changes and it isn't consistent with some of the other types (eg. Bytes). Maybe this is still desirable change, but probably not in a PR about fixing ext parsing.
I'm going to switch the round-tripping test I added to assert equivalence of the serialized and serialized-deserialized-reserialized payloads (that way we don't need to call equals on a Msg)
There was a problem hiding this comment.
👍 for smaller, well-scoped PRs.
Re: equals/hashCode, it's hard to imagine that any users are depending on the behavior being broken other than to work around it.
Good point about case class Binary(value: Array[Byte]) extends Msg being affected, too. A fix for one should probably fix both.
There was a problem hiding this comment.
The rabbit hole goes deeper. Given that 0.0d == 0L, should Float64(0.0) == Int64(0L)? Probably so! This is especially relevant since round-tripping Int64(0L) through messagepack will come back as Int32(0). I think you're wise to push equals/hashCode to a separate PR. Making assertions against the Array[Byte] sounds good for now.
There was a problem hiding this comment.
Given that
0.0d == 0L, shouldFloat64(0.0) == Int64(0L)?
Personally, I don't think this should be the case - the "int format family" and "float format family" are separate. I further wish that 0.0d == 0L would fail to typecheck instead of returning true (that's another rabbit hole though, and Scala is unlikely to ever change that).
This is especially relevant since round-tripping
Int64(0L)through messagepack will come back asInt32(0).
I agree Int64(0L) should roundtrip, but on the other hand: is there really a need for the distinction? Why not just a single case class Int(value: Long) extends Msg (and UInt for unsigned)? We don't distinguish Int16 or the fixint cases, so why distinguish the 32 and 64 bit cases?
| val (arr, i, j) = sliceArr(index + 1, n) | ||
| visitor.visitExt(getByteSafe(index), arr, i, j, index) | ||
| val res = visitor.visitExt(getByteSafe(index), arr, i, j, index) | ||
| index += n + 1 |
There was a problem hiding this comment.
This looks important!
If I'm reading correctly, looks like n is the array length, and the + 1 is the tag/ext-type byte. Seems like this fix should cover the fixext types as well. 👍
This rolls back changing `equals` and `hashCode`. Those changes may be worth it, but not in this PR. The test compares serialized and re-serialized values
|
I've switched back to If you are planning on squashing and merging, maybe use that as the commit message instead of the default concatenation of intermediate messages? |
The
parseExthelper inMsgPackReaderwas not properly updating theposition in the data source.
Added two regression tests. These compare serialized and re-serialized
values (instead of just initial and serialized) since structurally identical
Extmessages won't compare identical (due to the
Array[Byte]field whoseequality is determined by identity).
Fixes #369