Description
Describe the enhancement requested
We are currently rewriting some of our components to use Arrow. When appending data to a record, we want to deduplicate/aggregate the data before adding new rows. For that, we either need to keep track of the previously appended data outside of Arrow, or (and this is what we would like to enhance) read the data back from the underlying builders.
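To make the first option concrete, here is a minimal sketch of tracking appended keys outside of Arrow. Everything in it is hypothetical: `dedupAppender` and its fields are plain Go slices standing in for column builders, not Arrow types. It only shows the extra bookkeeping we would like to avoid.

```go
package main

import "fmt"

// dedupAppender is a hypothetical stand-in for a wrapper around a record
// builder. It remembers which keys were already appended so that repeats
// are aggregated into the existing row instead of producing a new one.
type dedupAppender struct {
	rowOf  map[string]int // key -> row index of its first append
	keys   []string       // stand-in for a string column builder
	counts []int64        // stand-in for an int64 column builder
}

func newDedupAppender() *dedupAppender {
	return &dedupAppender{rowOf: make(map[string]int)}
}

// Append adds n to the existing row for key, or appends a new row.
// It returns the index of the row that holds key.
func (d *dedupAppender) Append(key string, n int64) int {
	if row, ok := d.rowOf[key]; ok {
		d.counts[row] += n
		return row
	}
	row := len(d.keys)
	d.rowOf[key] = row
	d.keys = append(d.keys, key)
	d.counts = append(d.counts, n)
	return row
}

func main() {
	d := newDedupAppender()
	d.Append("GET /", 1)
	d.Append("POST /login", 2)
	d.Append("GET /", 3) // aggregated into row 0
	fmt.Println(d.keys, d.counts)
}
```

The `rowOf` map duplicates every key that the builder already holds, which is the overhead that reading back from the builders would remove.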
I was able to find several occurrences of reading values from builders.
Besides potentially adding Value(i int) functions to more *Builder types, we would mostly be interested in reading back values from a *BinaryDictionaryBuilder.
I've managed to read the indices back from the builder, but failed to read the values from the underlying BinaryMemoTable and stopped trying.
It would be super helpful to read back those strings from the Dictionary's BinaryMemoTable. Is this something we can somehow add? Is this something within the scope of this library?
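To illustrate the read-back behaviour being asked for, here is a simplified model in plain Go. It is not Arrow's BinaryDictionaryBuilder or BinaryMemoTable. The `dictBuilder` type is hypothetical, and it assumes a two-step lookup: `GetValueIndex` takes a row and returns its dictionary index, and `Value`/`ValueStr` take that dictionary index.

```go
package main

import "fmt"

// dictBuilder is a hypothetical, simplified model of a dictionary builder:
// a memo table of unique values plus one dictionary index per appended row.
type dictBuilder struct {
	memo    map[string]int // value -> dictionary index
	dict    [][]byte       // unique values in insertion order
	indices []int          // per-row dictionary index, -1 for null
}

func newDictBuilder() *dictBuilder {
	return &dictBuilder{memo: make(map[string]int)}
}

// Append stores v in the dictionary if it is new and records its index.
func (b *dictBuilder) Append(v []byte) {
	idx, ok := b.memo[string(v)]
	if !ok {
		idx = len(b.dict)
		b.memo[string(v)] = idx
		b.dict = append(b.dict, append([]byte(nil), v...)) // defensive copy
	}
	b.indices = append(b.indices, idx)
}

// AppendNull records a null row.
func (b *dictBuilder) AppendNull() { b.indices = append(b.indices, -1) }

// GetValueIndex returns the dictionary index stored for row i.
func (b *dictBuilder) GetValueIndex(i int) int { return b.indices[i] }

// Value returns the dictionary entry at idx, or nil if idx is out of range.
func (b *dictBuilder) Value(idx int) []byte {
	if idx < 0 || idx >= len(b.dict) {
		return nil
	}
	return b.dict[idx]
}

// ValueStr is Value converted to a string.
func (b *dictBuilder) ValueStr(idx int) string { return string(b.Value(idx)) }

func main() {
	b := newDictBuilder()
	b.Append([]byte("foo"))
	b.Append([]byte("bar"))
	b.Append([]byte("foo")) // reuses dictionary index 0
	for row := range b.indices {
		idx := b.GetValueIndex(row)
		fmt.Printf("row %d -> dict index %d -> %q\n", row, idx, b.ValueStr(idx))
	}
}
```

With something like this on the real builder, deduplication could check already appended values without a second copy of the data outside of Arrow.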
diff --git a/go/arrow/array/dictionary.go b/go/arrow/array/dictionary.go
index 2409e296c..622dece04 100644
--- a/go/arrow/array/dictionary.go
+++ b/go/arrow/array/dictionary.go
@@ -1208,6 +1209,54 @@ func (b *BinaryDictionaryBuilder) InsertStringDictValues(arr *String) (err error
 	return
 }
 
+func (b *BinaryDictionaryBuilder) GetValueIndex(i int) int {
+	switch b := b.idxBuilder.Builder.(type) {
+	case *Int64Builder:
+		return int(b.Value(i))
+	case *Uint64Builder:
+		return int(b.Value(i))
+	case *Float64Builder:
+		return int(b.Value(i))
+	case *Int32Builder:
+		return int(b.Value(i))
+	case *Uint32Builder:
+		return int(b.Value(i))
+	case *Float32Builder:
+		return int(b.Value(i))
+	case *Int16Builder:
+		return int(b.Value(i))
+	case *Uint16Builder:
+		return int(b.Value(i))
+	case *Int8Builder:
+		return int(b.Value(i))
+	case *Uint8Builder:
+		return int(b.Value(i))
+	case *TimestampBuilder:
+		return int(b.Value(i))
+	case *Time32Builder:
+		return int(b.Value(i))
+	case *Time64Builder:
+		return int(b.Value(i))
+	case *Date32Builder:
+		return int(b.Value(i))
+	case *Date64Builder:
+		return int(b.Value(i))
+	case *DurationBuilder:
+		return int(b.Value(i))
+	default:
+		return -1
+	}
+}
+
+func (b *BinaryDictionaryBuilder) Value(i int) []byte {
+	return []byte{}
+}
+
+func (b *BinaryDictionaryBuilder) ValueStr(i int) string {
+	//b.memoTable.(*hashing.BinaryMemoTable).
+	return ""
+}
+
 type FixedSizeBinaryDictionaryBuilder struct {
 	dictionaryBuilder
 	byteWidth int
Component(s)
Go