feat(spanner): add support for Proto Columns#6886
feat(spanner): add support for Proto Columns#6886rahul2393 merged 21 commits intogoogleapis:proto-column-enhancement-alphafrom
Conversation
olavloite
left a comment
There was a problem hiding this comment.
Looks generally good to me, with a couple of minor nits.
gauravpurohit06
left a comment
There was a problem hiding this comment.
LGTM with minor comments
spanner/protoutils.go
Outdated
| return &sppb.Type{Code: sppb.TypeCode_ENUM} | ||
| } | ||
|
|
||
| func messageProto(m proto.Message) *proto3.Value { |
There was a problem hiding this comment.
We are going to support only proto2 for Verily Private GA. Not sure if there is any implication to use proto3.Value here instead of proto2.Value. Can you verify?
There was a problem hiding this comment.
I guess proto3 here means the Spanner proto3 api definition not the protobuf proto3.
| if err != nil { | ||
| return errBadEncoding(v, err) | ||
| } | ||
| reflect.ValueOf(p).Elem().SetInt(y) |
There was a problem hiding this comment.
What happens if y is not a valid enum value? Is the behavior expected.
Please add a comment about the behavior.
There was a problem hiding this comment.
At client library end, as we do not have the schema of the enum it is difficult to validate the valid range of enum value.
I guess validation should be done from the backend side while inserting enum. If the enum is validated during insertion, then we can assume the value y is always valid.
However, if y is not a valid enum value, then there will not be any exception in above code. The integer value gets assigned to the enum and when user tries to print the enum it shows an integer.
For ex: Consider the following schema of Enum:
enum Genre {
POP = 0;
JAZZ = 1;
FOLK = 2;
ROCK = 3;
}
Here valid range is 0 to 3.
If y is 2, then fmt.Println(p) prints FOLK.
If y is 6 which is invalid, then fmt.Println(p) prints 6
spanner/value.go
Outdated
| pb.Kind = stringKind(strconv.FormatInt(int64(v.Number()), 10)) | ||
| protoEnumfqn = string(v.Descriptor().FullName()) | ||
| } | ||
| pt = enumType(protoEnumfqn) |
There was a problem hiding this comment.
If v is nil, it seems protoEnumfqn is not set. Is this expected?
Do we have a test for this? If so, which line. Thanks!
There was a problem hiding this comment.
If v is nil, then protoEnumfqn by default gets assigned with default value of string which is "".
There was a problem hiding this comment.
Is it expected? We need a test for this.
Can we get the descriptor even if the value is nil as the descriptor just depends on the type?
There was a problem hiding this comment.
The only way we can pass nil here is to directly pass nil while inserting data. We cannot create a nil instance of an enum (pb.Genre_Rock or pb.Genre) or proto (pb.Book{}) as they create default objects in every possibility of declaration. So the only possibility is to pass nil and we cannot get descriptor fn for nil. So assigning "" in that case.
There was a problem hiding this comment.
Also from above, if nil is passed it will be handled by client library as a NULL_VALUE at the beginning of switch case and it will not come inside the protoreflect.Enum case. This means protoEnumfqn will not be set to "" as there is no way to send a nil instance of an enum or proto.
There was a problem hiding this comment.
The only way to pass a nil type of proto enum is by type casting and passing like (*pb.Genre)(nil). As it is a nil type, calling v.Descriptor() raises a panic in golang with exception as follows:
panic: value method Genre.Descriptor called using nil *Genre pointer
So, the only option is to set protoEnumfqn as "".
I have added a unit test for the same Test
There was a problem hiding this comment.
So a nil value of type pb.Genre is possible.
In golang, is it possible to do something like:
v := pb.Genre_ROCK
vp := &v
vp = nil
If so, what is the type of v now?
Is it reasonable that a user passing a nil value of type pb.Genre would expect Spanner to know the right type? If that's the case and we cannot do it, maybe we should give them an error instead of silently doing something unexpected. Reasonable?
And is ENUM with empty fqn a good candidate and why?
There was a problem hiding this comment.
If we expect users to use nullProto instead of a pb.Genre nil value, we should point that out in the error msg.
There was a problem hiding this comment.
It seems that we can get descriptor without a concrete value with:
pb.Genre.Descriptor()
I will rely on you to help to see if a nil enum value would give us the descriptor.
There was a problem hiding this comment.
The following code will give the same type (*pb.Genre)(nil) as pointed above
v := pb.Genre_ROCK
vp := &v
vp = nil
golang throws an exception when any method is called using nil pointer. The only reasonable option would be to throw an error to user indicating that nil types cannot be used.
I have added the code changes @Shua1. Can you please have a look?
spanner/value.go
Outdated
| pb.Kind = stringKind(base64.StdEncoding.EncodeToString(bytes)) | ||
| protoMessagefqn = string(proto.MessageReflect(v).Descriptor().FullName()) | ||
| } | ||
| pt = protoType(protoMessagefqn) |
There was a problem hiding this comment.
If v is nil, then protoMessagefqn by default gets assigned with default value of string which is "".
There was a problem hiding this comment.
Added test.
As per protoreflect documentation, (*pb.SingerInfo)(nil) is an invalid message type. I verified it by using proto.MessageReflect(v).isValid() and it returns false for (*pb.SingerInfo)(nil) .So setting protoMessagefqn as "" is valid.
Text from Protoreflect document:
An invalid message often corresponds to a nil pointer of the concrete message type
spanner/value.go
Outdated
| if reflect.ValueOf(p).Kind() != reflect.Ptr { | ||
| return errNotAPointer(p) | ||
| } | ||
| if code != sppb.TypeCode_ENUM { |
There was a problem hiding this comment.
This is fine for now. But we need to keep in mind that some user might want to decode a INT64 value into enum.
There was a problem hiding this comment.
Maybe, we should make it happen for now. To be consistent with
- the write path, where integer can be written into ENUM column.
- the proto and bytes compatible behavior.
What do you think?
There was a problem hiding this comment.
@Shua1
I have added the code changes and integration tests to validate the following compatibility behaviours between Int64 and Enum. Can you please verify?
- Write Enum into Int64 column
- Write Int64 into Enum column
- Read Int64 column value using Enum
- Read Enum column value using Int64
Link to tests: Compatibility Behaviour Integration tests
There was a problem hiding this comment.
It looks good.
To make the case completely solid, could you please also verify this is the expected behavior internally on your side?
There was a problem hiding this comment.
Yes @Shua1. This behaviour is valid on our side also.
spanner/value.go
Outdated
| if reflect.ValueOf(p).Kind() != reflect.Ptr { | ||
| return errNotAPointer(p) | ||
| } | ||
| if code != sppb.TypeCode_PROTO { |
There was a problem hiding this comment.
Different from the ENUM case, we do need to support decoding bytes typed result into PROTO. BYTES values and PROTO values are compatible with each other.
Please also add tests for both direction.
There was a problem hiding this comment.
@Shua1
Supporting compatibility between BYTES and PROTO values involves code changes in the decoding function of BYTES also. This needs a bit more testing and handling corner cases.
I will make a note of this and will support this case in my subsequent PR. Does that sound good?
There was a problem hiding this comment.
@Shua1
I have added the code changes and integration tests to validate the following compatibility behaviours between Bytes and Proto.
- Write Bytes into Proto column
- Write Proto into Bytes column
- Read Proto column value using Bytes
- Read Bytes column value using Proto
Link to tests: Compatibility Behaviour Integration tests
…bility between Bytes and Proto
This reverts commit 484b00c.
spanner/protoutils.go
Outdated
| return &proto3.Value{Kind: &proto3.Value_NullValue{NullValue: proto3.NullValue_NULL_VALUE}} | ||
| } | ||
|
|
||
| func protoType(fqn string) *sppb.Type { |
There was a problem hiding this comment.
Somehow, I must have lost the comment I left here previously.
Consider making this consistent with the protoMessageProto method below.
There was a problem hiding this comment.
That being said, it does feel strange that in the API TypeCode_PROTO means proto message. But protoMessage does look more accurate. What do you think?
There was a problem hiding this comment.
I have followed the convention used in API. As it was TypeCode_PROTO, I have named the methods as protoType(). However, using protoMessageType() looks more meaningful.
I have added the necessary changes @Shua1
spanner/protoutils.go
Outdated
| return &sppb.Type{Code: sppb.TypeCode_ENUM} | ||
| } | ||
|
|
||
| func messageProto(m proto.Message) *proto3.Value { |
There was a problem hiding this comment.
I guess proto3 here means the Spanner proto3 api definition not the protobuf proto3.
| want interface{} | ||
| }{ | ||
| {col: "ProtoMessage", val: &singerProtoMessage, want: singerProtoMessage}, | ||
| {col: "ProtoEnum", val: pb.Genre_ROCK, want: singerProtoEnum}, |
There was a problem hiding this comment.
It seems that we are mixing usage of pb.Genre_ROCK and singerProtoEnum. What's the rationale for it? Is it easier to debug if we use Genre_ROCK in all places?
There was a problem hiding this comment.
This is to differentiate the data passed in val and want. pb.Genre_ROCK is passed in val, and singerProtoEnum is passed in want in all the tests. Also in some places (in future PR changes) we need to pass a pointer like &singerProtoEnum instead of just the constant in want. So to maintain consistency across all the tests I have used singerProtoEnum in want.
Also want here shows how the user passes a variable to read the data. This helps to debug from user's point of view.
spanner/protoutils.go
Outdated
| return &sppb.Type{Code: sppb.TypeCode_PROTO, ProtoTypeFqn: fqn} | ||
| } | ||
|
|
||
| func enumType(fqn string) *sppb.Type { |
| if err != nil { | ||
| return errBadEncoding(v, err) | ||
| } | ||
| reflect.ValueOf(p).Elem().SetInt(y) |
spanner/value.go
Outdated
| if reflect.ValueOf(p).Kind() != reflect.Ptr { | ||
| return errNotAPointer(p) | ||
| } | ||
| if code != sppb.TypeCode_ENUM { |
There was a problem hiding this comment.
It looks good.
To make the case completely solid, could you please also verify this is the expected behavior internally on your side?
spanner/value.go
Outdated
| pb.Kind = stringKind(strconv.FormatInt(int64(v.Number()), 10)) | ||
| protoEnumfqn = string(v.Descriptor().FullName()) | ||
| } | ||
| pt = enumType(protoEnumfqn) |
There was a problem hiding this comment.
Is it expected? We need a test for this.
Can we get the descriptor even if the value is nil as the descriptor just depends on the type?
* feat(spanner): add support for Proto Columns (#6886) * feat(spanner): Adding support for Proto Message and Proto Enum * feat(spanner): Add license header to new files * code refactoring and additional checks * nit: code changes * go.mod update to prevent failing builds * go.mod update to prevent failing builds * go.mod and build_samples.sh update to prevent failing builds * revert back grpc version * revert back changes * Using standard Singers example, refactoring redundant code * code and proto refactoring * Add proto_type_fqn for Proto Message and Proto Enum * code refactoring * go mod tidy: go.mod and go.sum version updates * add changes to support compatibility between Int64 and Enum & compatibility between Bytes and Proto * Revert "go mod tidy: go.mod and go.sum version updates" This reverts commit 484b00c. * add Integration Tests for Proto Message, Proto Enum, compatibility tests * code refactoring * code refactoring * add unit tests for nil proto types * Add error when nil proto message or nil enum is passed * feat(spanner): add support for handling null value in Proto columns (#6954) * feat(spanner): add support for handling null value in Proto columns * code refactor: get protoTypeFqn from user defined nil types * code refactoring * code refactoring * code refactoring * code refactoring * Add tests for MarshalJSON and UnmarshalJSON methods * refactoring test file * code refactoring * feat(spanner): add support for Array of Proto columns (#7014) * feat(spanner): add support for array of proto columns * refactoring comments and added negative test cases while reading array of protos * change decoding logic of handling array of proto columns * feat(spanner): add support for handling null values in array of proto columns (#7042) * feat(spanner): add support for handling null values in Array of Proto Columns * add comments for code readability * nit: change in error message * feat(spanner): Modify configuration for integration test and add license header (#7059) * feat(spanner): update go-genproto dependency (#7066) * feat(spanner): support for enum columns as keys, index and integration tests (#7091) * feat(spanner): support for proto columns as primary key and tests for parameterized queries, primary key and indexes * feat(spanner): close read-only transaction to prevent session leak * feat(spanner): update table schema to have GPK on proto field * feat(spanner): add proto changes to support proto columns feature * feat(spanner): remove gen-proto dep overwrite for proto column support * feat(spanner): remove gen-proto dep overwrite from kokoro build * feat(spanner): reset array of proto, enum when null from database (#7176) * feat(spanner): reset array of proto, enum when null from database * feat(spanner): fix license header to fix vet.sh build * feat(spanner): fix proto generated file to fix vet.sh build * feat(spanner): organize imports to fix vet.sh build * feat(spanner): golint changes to fix vet.sh build * feat(spanner): add support for Proto column DDL (#7292) * feat(spanner): add proto changes for Proto Columns DDL support * feat(spanner): add Proto descriptor file and integration tests for Proto DDL feature * feat(spanner): skip pg tests and code refactoring for proto ddl * feat(spanner): rename NewDatabaseAdminRESTClient to NewDatabaseAdminClient due to visibility label issue fix * feat(spanner): remove hardcoded cloud-devel host * feat(spanner): remove hardcoded project id * feat(spanner): revert autogenerated code * feat(spanner): just to validate integration tests running * feat(spanner): change copyright year * feat(spanner): set project id for integration tests * feat(spanner): use jsoniter instead of json for marshal and unmarshal * feat(spanner): revert presubmit.sh changes * chore(spanner): run integration tests in presubmit * chore(spanner): revert presubmit.sh changes * feat(spanner): fix json --------- Co-authored-by: rahul2393 <irahul@google.com>
This PR contains implementation changes to add support for Proto Messages and Proto Enum.