rowcontainer: fix hash row container for some types #49851
craig[bot] merged 1 commit into cockroachdb:master from
Conversation
The explanation is that `HashDiskRowContainer` is implemented using `DiskRowContainer`, with the equality columns (i.e. the columns to hash) of the former being the ordering columns of the latter. Those ordering columns are used to compute the keys of the rows (in `encodeRow`) so that we can store the rows in sorted order. This is how we store the build (right) side of the join; for the probe (left) side we use `hashMemRowIterator` to compute the key of the probing row. The key computation methods must be the same in both places, otherwise the results of the join can be incorrect.

#45229 broke this synchronization by changing the key computation method in `hashMemRowIterator.computeKey` to use `Fingerprint`. So we have to either use `Fingerprint` in `encodeRow` or use `Encode` in `computeKey`. The first choice doesn't work because `Fingerprint` doesn't provide the ordering we need in `DiskRowContainer`, so we use the second approach.

The ordering property is necessary because `DiskRowContainer` implements a "hash row container" by sorting all rows on the ordering (i.e. hash) columns and using that ordering to provide the "hashing" behavior: we seek to the first row with the same hash columns and then iterate forward one row at a time for as long as the hash columns remain the same. Without the ordering property, the necessary invariant that all rows hashing to the same value are contiguous is not maintained.

Release note: None
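The build/probe agreement described above can be illustrated with a toy sketch. This is not CockroachDB code: `encodeKey`, `probeMatches`, and the string rows are made up for illustration, standing in for `encodeRow`/`computeKey`. The point is that the container is sorted by an order-preserving key, and the probe side must compute its key with the exact same function:

```go
package main

import (
	"fmt"
	"sort"
)

// encodeKey is a hypothetical stand-in for an ordering-key encoding: it
// must be order-preserving so that rows with equal hash columns end up
// adjacent once the container is sorted.
func encodeKey(row []string, eqCols []int) string {
	key := ""
	for _, c := range eqCols {
		// Length-prefix each datum so the concatenation stays order-preserving.
		key += fmt.Sprintf("%04d%s", len(row[c]), row[c])
	}
	return key
}

// probeMatches builds a tiny sorted "disk" container and probes it,
// computing BOTH keys with the same encodeKey. The bug fixed in this PR
// was the probe side using a different method (Fingerprint) than the
// build side.
func probeMatches() [][]string {
	eqCols := []int{0}
	build := [][]string{{"b", "2"}, {"a", "1"}, {"b", "3"}}

	// Build side: store rows sorted by their encoded key.
	sort.Slice(build, func(i, j int) bool {
		return encodeKey(build[i], eqCols) < encodeKey(build[j], eqCols)
	})

	// Probe side: seek to the first row with the same key, then scan
	// forward while the key stays equal (all equal-key rows are contiguous).
	probe := []string{"b", "9"}
	pk := encodeKey(probe, eqCols)
	i := sort.Search(len(build), func(i int) bool {
		return encodeKey(build[i], eqCols) >= pk
	})
	var matches [][]string
	for ; i < len(build) && encodeKey(build[i], eqCols) == pk; i++ {
		matches = append(matches, build[i])
	}
	return matches
}

func main() {
	for _, m := range probeMatches() {
		fmt.Println("match:", m)
	}
}
```

If the probe side hashed the row instead of using `encodeKey`, the binary search would land in the wrong place and matches would be silently missed, which is exactly the incorrect-results failure mode the PR description warns about.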
This issue was exposed by some changes in #49801, and I don't think the fix needs to be backported. However, it does raise a question related to #48130: should we be able to support joins on JSON columns? Currently, we don't seem to have key encoding for JSON, so we will error out when spilling to disk while performing the hash join. I don't see an easy way to fix that. Thoughts?
Regardless of whether we should, I think we have already said we do support them in past releases. I'm not sure we can walk back support for these things.
But the support has been broken (and is still broken in the same way after this PR): when a hash join spills to disk, if there is a JSON column among the equality columns, the "json doesn't have key encoding" error occurs. I'm not sure whether we want or need to remove this partially supported feature, but we should definitely add it to the list of known limitations.
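A rough sketch of this limitation (the type names, `hasKeyEncoding` table, and `checkSpillable` helper are all hypothetical, not the actual CockroachDB code path): spilling a hash join to disk requires a sortable key encoding for every equality column, and JSON has none, so the spill errors out rather than returning wrong results:

```go
package main

import (
	"errors"
	"fmt"
)

// hasKeyEncoding is a hypothetical table of which column types support a
// sortable key encoding; JSONB does not, which is why a disk-spilled hash
// join keyed on a JSON column errors out.
var hasKeyEncoding = map[string]bool{
	"INT":    true,
	"STRING": true,
	"JSONB":  false,
}

// checkSpillable (hypothetical helper) mimics the check that fails when
// the hash join tries to spill rows keyed on a type without key encoding.
func checkSpillable(eqColTypes []string) error {
	for _, t := range eqColTypes {
		if !hasKeyEncoding[t] {
			return errors.New("unable to spill: no key encoding for " + t)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkSpillable([]string{"INT", "STRING"})) // in-memory or spilled, both fine
	fmt.Println(checkSpillable([]string{"INT", "JSONB"}))  // errors only once spilling is needed
}
```

The practical consequence discussed below is that the join works while it fits in memory and fails only on spill, which is why the limitation is easy to miss.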
I agree with at least adding it to the known limitations. Does @jordanlewis have an opinion about possibly removing support for JSON joins?
We don't have to remove support unless we return incorrect results. I'd prefer we keep support for JSON joins, even given this limitation. If a query errors out when spilling to disk, that's unfortunate, but a user can safely complain to us at that point.
Ok, not removing the partial support makes sense to me - it's better than nothing. Filed a docs issue. Do I have a stamp for this PR?
TFTR! bors r+ |
Build succeeded |
Thanks for tracking this down and fixing it! |