Add fromXContent method to GetResponse#22082
Conversation
There was a problem hiding this comment.
We're going to have many of those, I think we should have static methods like ParsingExceptions.expectFieldName(String name, Token current), ParsingExceptions.expectStartArray(...) etc to have unified error messages.
There was a problem hiding this comment.
will do once you pushed your PR, will reuse those methods that you are introducing
There was a problem hiding this comment.
Can we move
builder.field(_INDEX, index);
builder.field(_TYPE, type);
builder.field(_ID, id);
out of the if condition?
There was a problem hiding this comment.
We have assertion here but ParsingException in GetField... shouldn't we unify this?
There was a problem hiding this comment.
leftover, I never know whether I want to use asserts or exceptions. I also don't want to end up validating that our own response are well formed, cause we should have tests for that, but I guess these few situations need to be checked. The question is exceptions or asserts?
There was a problem hiding this comment.
I'd prefer exceptions in this case
There was a problem hiding this comment.
Same, I'm wondering if a static method ParsingExceptions.unsupportedField() would help
There was a problem hiding this comment.
I understand why you did that but I'd be happy if we could get rid of those. I think that all "root" Response objects that are passed to this method should ensure that a startObject() is done in their toXContent() methods
There was a problem hiding this comment.
I would love to get rid of those but the default here is starting an object externally, which is conceptually wrong. With this change there is one less of those situations. I expect that we clean this up as we go. objects should be started within toXContent as much as possible, otherwise the writing and parsing code will not be symmetric.
There was a problem hiding this comment.
It is not an issue in GetResult but maybe we'll need to introduce some kind of post-parsing validation? Like if we parse incompatible values (found == true but version is still -1) we should raise an exception?
There was a problem hiding this comment.
I wouldn't do this given that it's our own responses. Let's rather have tests that make sure these cases don't happen?
There was a problem hiding this comment.
I think we could have an abstract XContentESTestCase that contains a set of helpers and test methods, so that the following methods testGetResultToAndFromXContent/testGetResultEqualsAndHashcode etc could go in their own GetResultTests class.
There was a problem hiding this comment.
I agree in principle, I admit I am scared of starting with factoring things out and make things too generic and shoot ourselves in the foot. I'd need to look at how much copy pasting that would save us, which should be clearer as soon as we have added a few of these fromXContent methods. @cbuescher WDYT?
There was a problem hiding this comment.
At this point I would try to avoid too much hierarchy in the tests. In the query refactoring we ended up centralizing a lot of helpers in AbstractQueryTestCase but this ended up as a huge (>1000 loc) class and created a bunch of dependencies between the individual tests. To reuse freuqently needed helper methods I'd rather use utility classes with static methods where possible instead of creating abstract base classes too early. That said, we might need to revisit that decision at a later point.
There was a problem hiding this comment.
randomByte.intValue() -> randomByte
There was a problem hiding this comment.
randomShort.intValue() -> randomShort
There was a problem hiding this comment.
that wouldn't quite work. The list is typed <Object> if I add randomShort the short becomes Short, but I need an Integer in the map.
There was a problem hiding this comment.
text gets printed out as string, when reparsed it is parsed as string, not Text
cbuescher
left a comment
There was a problem hiding this comment.
I added a few comments, most are minor but I think we need to look into what happens when the GetField objects have an xContent representation that isn't a single value.
There was a problem hiding this comment.
nit: maybe different variable name? I was suprised by "fields" here, but doesn't matter much
There was a problem hiding this comment.
Trying to understand this: GetResult#toXContent outputs another start/endObject token pair now. Why doesn't this affect the current behaviour of GetResponse?
There was a problem hiding this comment.
because the only user of GetResponse#toxContent was RestGetAction which would output start and end object by itself, I moved that out of the rest action to the GetResult object.
There was a problem hiding this comment.
also MultiGetResponse was using this, but I just removed the startObject and endObject from it.
There was a problem hiding this comment.
I was looking at rendering/parsing List similar to this in the SearchHit. I'm not 100% sure but I think builder.value(value) can write Maps (via XContentBuilder#unknownValue, which supports many object types, e.g. if Object is GeoPoint it writes a whole json object) but objectText() cannot read this back. I don't know if these "complex" object values are supported in the GetResponse, but if so we should check this in tests explicitely.
There was a problem hiding this comment.
good point. I have tested geo points manually and we don't actually print anything out if any or those are requested via stored_fields, which makes me think that they are not supported in GetField, but I had exactly the same doubts. Printing things out is always easy as we have all the info (class name etc.) to decide what to do, when reading back it is much trickier as we only have the json data type information. We do end up not being able to parse back what we write. It may be better to change how we print things out... like simply printing out objects by calling their toString, but I am not sure if that's a good idea at the moment
There was a problem hiding this comment.
I just tested geo points manually on 5.1 and it does print out an array if type is set to "stored" and requested via "stored_fields". I will do some further testing, I never used this part of the api tbh.
There was a problem hiding this comment.
who knows what I tested then, I will have another look.
There was a problem hiding this comment.
What' really strange is that stored geo_point fields get printed as a flat array, so if you store two point like:
PUT twitter/tweet/1
{
"counter": 1,
"location": [
{
"lat": 0.45,
"lon": 0.33
},
{
"lon": 1,
"lat": 1.5
}
]
}
I get back:
GET twitter/tweet/1?stored_fields=location
{
"_index": "twitter",
"_type": "tweet",
"_id": "1",
"_version": 4,
"found": true,
"fields": {
"location": [
"0.45, 0.33",
"1.5, 1.0"
]
}
}
So it looks like its printed as array of plain values. From reading ContentBuilder#Writers -> #value(GeoPoint p) -> #latlon() would have expected an array of objects, each containing a lat/lon parameters. It seems tricky to see how these values actually get stored in the index and then written to the response, so its also hard to parse them? I'd at least add the option of parsing maps here, then I think we are on the safe side.
There was a problem hiding this comment.
MappedFieldType#valueForDisplay is called, we can either get strings, numbers of booleans, not more than that. We could potentially change the toXContent to not use XContentBuilder#value(Object) but that may end up breaking stuff. I say we just make the assumption that these are the types we can get, we test those and we parse accordingly.
There was a problem hiding this comment.
Should we have individual smaller tests for GetField, GetResult?
There was a problem hiding this comment.
I wondered too... I don't personally like having 3 classes with 2 test methods each that depend on each other. I also wonder whether it makes sense to test the GetResponse, as well as GetResult and GetField individually. Should we rather only test GetResponse?
There was a problem hiding this comment.
I think we need to add a value type here that doesn't render to a simple json value but to an object, e.g. GeoPoint.
There was a problem hiding this comment.
I think we are good, as I explained above.
fd7fa65 to
af60526
Compare
Moved field values toXContent logic to GetField (from GetResult), which outputs its own fields, and can also parse them now. Also added fromXContent to GetResult and GetResponse. The start object and end object for the get response output have been moved to GetResult#toXContent, from the corresponding rest action. This makes it possible to have toXContent and fromXContent completely symmetric, as parsing requires looping till an end object is found which is weird when the corresponding toXContent doesn't print that out.
Also handled flat issue with SMILE and documented stuff
Also refactored a bit the static methods exposed by XContentParserUtils
af60526 to
4b570ce
Compare
|
I pushed some updates, this should be ready now. |
| } | ||
| return t; | ||
| } | ||
|
|
There was a problem hiding this comment.
sorry @tlrx while using these methods I changed them quite a bit, let me know what you think. I noticed one of them was unused, so I removed it.
There was a problem hiding this comment.
Makes sense, I changed them myself multiple times :/ They are better now
| } | ||
| } | ||
|
|
||
| //TODO move this to some utility class? It should be useful in other tests too, whenever we parse stored fields |
There was a problem hiding this comment.
there's a couple of TODOs like this, I am not even sure it is worth doing, the main issue being I don't where to put the class and what to call it :)
tlrx
left a comment
There was a problem hiding this comment.
I left minor comments, otherwise it LGTM.
| } | ||
| return t; | ||
| } | ||
|
|
There was a problem hiding this comment.
Makes sense, I changed them myself multiple times :/ They are better now
| public static GetField fromXContent(XContentParser parser) throws IOException { | ||
| ensureExpectedToken(XContentParser.Token.FIELD_NAME, parser.currentToken(), parser::getTokenLocation); | ||
| String fieldName = parser.currentName(); | ||
| List<Object> values = new ArrayList<>(); |
There was a problem hiding this comment.
This can be initialized after the next ensureExpectedToken(array) - it is more readable and avoid an init if the XContent is not as expected.
| return fields.values().iterator(); | ||
| } | ||
|
|
||
| static final class Fields { |
There was a problem hiding this comment.
I'm fine with removing these static final Fieldsclasses, but is it something we want to do everywhere?
There was a problem hiding this comment.
I don't know, I think I did it because I wanted to reuse these fields and in general we tend to be removing these. It's a harmless change, not that we have to make it everywhere. I think I will :)
| builder.startObject(); | ||
| builder.field(_INDEX, index); | ||
| builder.field(_TYPE, type); | ||
| builder.field(_ID, id); |
| ensureExpectedToken(XContentParser.Token.FIELD_NAME, parser.currentToken(), parser::getTokenLocation); | ||
| String fieldName = parser.currentName(); | ||
| List<Object> values = new ArrayList<>(); | ||
| XContentParser.Token token = parser.nextToken(); |
There was a problem hiding this comment.
Looks like this reference is not really useful
| * | ||
| * @param xContentType the content type, used to determine what the expected values are for float numbers. | ||
| */ | ||
| public static Tuple<List<Object>, List<Object>> randomStoredFieldValues(XContentType xContentType) { |
There was a problem hiding this comment.
I think it can be private for now and the TODO removed... If really needed outside of this test, we could maybe create a RandomObjects utilities class in the test framework and add all this kind of method there.
There was a problem hiding this comment.
I already know it is needed outside of this test, other apis support stored_fields etc. I will do that
| originalValues.add(randomDouble); | ||
| expectedParsedValues.add(randomDouble); | ||
| break; | ||
| case 5: |
| assertEquals(expectedGetResult, parsedGetResult); | ||
| //print the parsed object out and test that the output is the same as the original output | ||
| BytesReference finalBytes = toXContent(parsedGetResult, xContentType, false); | ||
| assertSameOutput(originalBytes, finalBytes, xContentType); |
There was a problem hiding this comment.
Can we add a comment line like:
// GetResult can contain stored fields of various types, including floats, where order cannot be guaranteed.
// Since a byte-to-byte comparison cannot be done we use assertSameOutput.
There was a problem hiding this comment.
to clarify, I don't think that this is valid only for this case. Ordering doesn't have to do with floats. It has to do with using maps to store stuff. And we use maps in a lot of places. After all keys ordering doesn't matter in json, byte per byte comparison are not going to fly. That is why I proposed to move it out to some utility class.
There was a problem hiding this comment.
of course it also works around the SMILE issue with floats, but using maps is also needed with other formats whenever we have maps.
| } | ||
| } | ||
|
|
||
| private static void addFields(XContentBuilder builder, int currentDepth) throws IOException { |
There was a problem hiding this comment.
Can we add a comment on what the method does?
| } | ||
|
|
||
| private static Object randomFieldValue() { | ||
| switch(randomIntBetween(0, 3)) { |
There was a problem hiding this comment.
Can be changed to randomFrom(randomAsciiOfLength(5), randomIntBetween(0,3), etc);
There was a problem hiding this comment.
it's more concise, but we end up generating many random values that we don't use. We keep only one. I think I prefer the old fashioned switch.
| * Returns the bytes that represent the XContent output of the provided {@link ToXContent} object, using the provided | ||
| * {@link XContentType}. Wraps the output into a new anonymous object depending on the value of the wrapInObject argument. | ||
| */ | ||
| public static BytesReference toXContent(ToXContent toXContent, XContentType xContentType, boolean wrapInObject) throws IOException { |
Moved field values `toXContent` logic to `GetField` (from `GetResult`), which outputs its own fields, and can also parse them now. Also added `fromXContent` to `GetResult` and `GetResponse`. The start object and end object for `GetResponse` output have been moved to `GetResult#toXContent`, from the corresponding rest action. This makes it possible to have `toXContent` and `fromXContent` completely symmetric, as parsing requires looping till an end object is found which is weird when the corresponding `toXContent` doesn't print that out. This also introduces the foundation for testing retrieval of _source and stored field values.
| replaceBytesArrays(actualMap); | ||
| try (XContentParser expectedParser = xContentType.xContent().createParser(expected)) { | ||
| Map<String, Object> expectedMap = expectedParser.map(); | ||
| replaceBytesArrays(expectedMap); |
There was a problem hiding this comment.
I removed this shenanigans on master: 38914f1 cc @tlrx @cbuescher
Moved field values
toXContentlogic toGetField(fromGetResult), which outputs its own fields, and can also parse them now. Also addedfromXContenttoGetResultandGetResponse.The start object and end object for the get response output have been moved to
GetResult#toXContent, from the corresponding rest action. This makes it possible to havetoXContentandfromXContentcompletely symmetric, as parsing requires looping till an end object is found which is weird when the correspondingtoXContentdoesn't print that object out.