This is a request to document a limitation we came across with Bigtable, plus some questions about using bigtable-beam-import to copy data from HBase to Bigtable and vice versa.
Documentation requests
In our HBase table we do Put operations with a reversed timestamp, i.e.,
Put p = new Put(row);
// family and qualifier are byte[]; the cell timestamp is the reversed time
p.addColumn(family, qualifier, Long.MAX_VALUE - timeNowMillis, value);
table.put(p);
We do this to enforce a particular ordering, and this has worked fine with HBase.
When we used the bigtable-hbase-1.x client to do the same Put in Bigtable, every cell returned by a subsequent Get carried a timestamp of Long.MAX_VALUE. We traced this to com.google.cloud.bigtable.hbase.util.TimestampConverter, which converts timestamps internally and does not handle a Put timestamp that exceeds TimestampConverter.HBASE_EFFECTIVE_MAX_TIMESTAMP. We are exploring ways to fix this.
I think it would be useful to document this issue in the CBT docs, especially since others may be following the HBase suggestions around reversed timestamps.
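To make the behaviour above concrete, here is a rough model of the kind of conversion that would produce it. This is a sketch under assumptions, not the library's actual code: the class and method names are made up for illustration, and it assumes a millisecond-to-microsecond factor of 1000 with values saturating at the maximum instead of overflowing.

```java
// Simplified, hypothetical model of a millisecond <-> microsecond conversion.
// FACTOR = 1000 and the saturating behaviour are assumptions for illustration;
// none of these names are taken from the bigtable-hbase client.
final class TimestampSaturationSketch {
    static final long FACTOR = 1000L;
    // Largest microsecond value that is a whole multiple of FACTOR.
    static final long MICROS_MAX = Long.MAX_VALUE - (Long.MAX_VALUE % FACTOR);
    // Largest millisecond value that still fits after multiplying by FACTOR.
    static final long EFFECTIVE_MAX_MILLIS = MICROS_MAX / FACTOR;

    // HBase-style millis to Bigtable-style micros, saturating at MICROS_MAX.
    static long millisToMicros(long millis) {
        return millis < EFFECTIVE_MAX_MILLIS ? millis * FACTOR : MICROS_MAX;
    }

    // Bigtable-style micros back to millis; the saturated value maps to Long.MAX_VALUE.
    static long microsToMillis(long micros) {
        return micros >= MICROS_MAX ? Long.MAX_VALUE : micros / FACTOR;
    }

    static long roundTrip(long millis) {
        return microsToMillis(millisToMicros(millis));
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long reversed = Long.MAX_VALUE - now;
        // An ordinary timestamp survives the round trip unchanged.
        System.out.println(roundTrip(now) == now);
        // A reversed timestamp is above the effective maximum and collapses.
        System.out.println(roundTrip(reversed) == Long.MAX_VALUE);
    }
}
```

Under this model every value of Long.MAX_VALUE - timeNowMillis lands above the effective maximum, so all such cells come back with the same timestamp, which matches what we observed.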
Questions
Will you make TimestampConverter.HBASE_EFFECTIVE_MAX_TIMESTAMP public? One fix we are exploring is to use HBASE_EFFECTIVE_MAX_TIMESTAMP - timeNowMillis instead of Long.MAX_VALUE - timeNowMillis in our Puts; we are wary of computing the value ourselves in case TimestampConverter.FACTOR changes.
Also, how does this work while using bigtable-beam-import:
a) Say, I have an HBase table with cells with timestamp in milliseconds. And I export this to sequencefiles using HBase's Export. When I import this sequencefile into Bigtable using bigtable-beam-import, will it do the hbase2bigtable() translation on the sequencefile cell timestamps?
b) If I export the data in Bigtable using bigtable-beam-import export, will it do the reverse translation using bigtable2hbase()? i.e., can I expect millisecond timestamps in the exported sequencefile or will it be microsecond timestamps?
c) How do these work if the HBase table had cells created with reverse timestamps, i.e., Long.MAX_VALUE - timeNowMillis?
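For the first question above, the workaround under consideration could look roughly like the following. It is only a sketch: ASSUMED_FACTOR and ASSUMED_EFFECTIVE_MAX_MILLIS are local stand-ins, hard-coded on the assumption that the conversion factor is 1000. That hard-coding is exactly what a public constant would let us avoid.

```java
// Hypothetical workaround: reverse timestamps against an effective maximum
// that still fits in a long after a millisecond -> microsecond conversion.
// Both constants are assumptions, not values read from the client library.
final class ReversedTimestampSketch {
    static final long ASSUMED_FACTOR = 1000L;
    static final long ASSUMED_EFFECTIVE_MAX_MILLIS = Long.MAX_VALUE / ASSUMED_FACTOR;

    // Later wall-clock times give smaller values, as with Long.MAX_VALUE - t.
    static long reversedTimestamp(long timeNowMillis) {
        return ASSUMED_EFFECTIVE_MAX_MILLIS - timeNowMillis;
    }

    public static void main(String[] args) {
        long earlier = 1_600_000_000_000L;
        long later = earlier + 1;
        // The reversed ordering is preserved.
        System.out.println(reversedTimestamp(later) < reversedTimestamp(earlier));
        // multiplyExact would throw ArithmeticException if scaling overflowed.
        System.out.println(Math.multiplyExact(reversedTimestamp(later), ASSUMED_FACTOR) > 0);
    }
}
```

Cells already written with Long.MAX_VALUE - timeNowMillis would not line up with cells written this way, so existing data would presumably need to be rewritten as well.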