[Ingest] Simulate Endpoint#14572
Conversation
22b6077 to
13f9b32
Compare
There was a problem hiding this comment.
Something like this:
threadPool.executor(ThreadPool.Names.MANAGEMENT).execute(() -> {
payload.pipeline().execute(data);
responses.add(new SimulatedItemResponse(data));
});There was a problem hiding this comment.
cool.
then I think all of this can be moved into the SimulatePipelineRequestPayload class (might need to rename class to be shorter 😄)
and called like
payload.execute(threadPool)
226d809 to
8e1934f
Compare
There was a problem hiding this comment.
if we go for supporting get too, use the util RestActions#hasBodyContent and getRestContent instead, those support the body passed in as the source query string parameter, which all of the core apis should support when they support a get with body (for clients that don't support sending a body with get)
|
I did a round of review and left some comments and some questions, but it looks very good already! |
There was a problem hiding this comment.
Maybe move the pipeline factory to be a field of the factory class?
c393940 to
c844da3
Compare
There was a problem hiding this comment.
I believe this needs to be updated to include the metadata as it was given in the input
so something more like
Map<String, Object> dataDoc = ...
dataDoc.put("_index", data.getIndex());
dataDoc.put("_type", data.getType());
dataDoc.put("_id", data.getId());
builder.field(Fields.DOCUMENT, dataDoc);
Although, I think it would be better for Data to provide a getDocumentWithMetadata to do this sort of thing so it can be re-used and agnostic to any new fields we have in metadata.
what do you think @javanna, @martijnvg ?
There was a problem hiding this comment.
unfortunately Data can't implement ToXContent... maybe we should just have a static helper method that serialises the Data with an XContentBuilder?
There was a problem hiding this comment.
oh I just realized that I suggested below to make Data implement both Streamable and ToXContent...which is not correct as it would mess up dependencies. I think sooner or later we should extract these classes to a different module to prevent mistakes and confusion and enforce the no deps rule that we follow. We should have some shared code around outputting Data to ToXContent and serializing it too. Shall we have an enhanced version of Data as part of the plugin that accepts a Data instance but also implements ToXContent and Streamable?
There was a problem hiding this comment.
I don't see yet specific equals/hashcode tests, I think we could benefit from those. I know that they have been generated and are most likely fine at the moment, but pretty sure without tests they will get out of sync with future modifications.
There was a problem hiding this comment.
added a couple,
found a really annoying problem that needs to be worked around...
Throwable comparison does not work...
Objects.equals(Throwable, Throwable)
I added logic to say that two Throwables are equivalent so long as they are of the same class... I am not a fan of this solution, have any ideas?
54d2a90 to
cccad8d
Compare
There was a problem hiding this comment.
This needs cleaning up.
testing whether two Throwable objects are equal is not working on the object level, so I decided to just test whether they are of the same class type
|
@javanna I don't know what I've done! 😄 In trying to clean things up and making them more straightforward to test... I introduced a few things that are not ideal. I left them as inline comments. let me know what you think! I do not think they are that big of a deal, but they added a lot of code. |
e453fee to
154faf0
Compare
…To and streamable
154faf0 to
20384ae
Compare
Removed equals and hashcode whenever they wouldn't be reliable because of exception comparison. at the end of the day we use them for testing and we can simplify our tests without requiring equals and hashcode in prod code, which also would require more tests if maintained. Add equals/hashcode test for Data/TransportData and randomize existing serialization tests
|
@talevy what you did looks good. I did simplify a few things around serialization in my last commit. I also moved some of the classes to implement Writeable rather than Streamable so we can have final fields and we can drop the default constructors that were needed for serialization only. I removed equals/hashcode when they required exceptions comparisons as that wasn't ideal and used only for testing, which can be simplified. Can you please have a look and tell me what you think? |
|
++ LGTM that commit really addressed what I was trying to clean up |
Only MutateProcessor implemented equals / hashcode hence we would only use that one in our tests, since they relied on them. Better to not rely on equals/hashcode, drop them and mock processor/pipeline in our tests that need them. That also allow to make MutateProcessor constructor package private as the other processors.
There was a problem hiding this comment.
I didn't realize we would return NPE. Is that because a field is not there? In general I think we should return nicer errors as much as we can.
There was a problem hiding this comment.
I believe this NPE is thrown by the MutateProcessor.
Data#getProperty should wrap the exception is the suggestion?
There was a problem hiding this comment.
yea ok let's do it on a separate issue though, doesn't need to be part of the PR, although it will require changing these tests ;)
|
I pushed a few more more cleanup commits and left a comment inline on error messages. One more thing is about the fact that we continue processing in case of failure. I understand why but it seems inconsistent given that we do the opposite in reality (when not simulating) till we will add support for continue_on_failure. Seems like this api should behave the same as the real one? What do you think? |
|
continuing through the processors is only done during a Verbose execution. It is still the case that the pipeline halts during a regular non-verbose _simulate run |
ok I guess it's fine for now, we have to remember to adapt it to continue on failure once we add the flag I think |
I agree. I added a note to the |
I think it is ready.
Adds
_simulateendpoint to enable the processing of provided documents without indexing