[ML] Change inference cache to store only the inner part of results#2376
Merged
droberts195 merged 6 commits into elastic:main on Aug 3, 2022
Conversation
Hacky work in progress to test - do not review yet

Previously the inference cache stored complete results, including a request ID and the time taken. This was inefficient, as it meant the original response had to be parsed and modified before being sent back to the Java side.

This PR changes the cache to store just the inner portion of the inference result; the outer layer is then added per request after retrieval from the cache.
droberts195 added a commit to droberts195/elasticsearch that referenced this pull request on Aug 3, 2022
These tests will fail if elastic/ml-cpp#2376 is merged with them unmuted. elastic#88901 will follow up with the Java side changes.
This was referenced Aug 3, 2022
droberts195 added a commit to elastic/elasticsearch that referenced this pull request on Aug 3, 2022
These tests will fail if elastic/ml-cpp#2376 is merged with them unmuted. #88901 will follow up with the Java side changes.
Author commented: retest
droberts195 added a commit to elastic/elasticsearch that referenced this pull request on Aug 3, 2022
(#88901) This change will facilitate a performance improvement on the C++ side. The request ID and cache hit indicator are the parts that need to change when the C++ process responds to an inference request. Having them at the top level means we do not need to parse and manipulate the original response: we can simply cache the inner object of the response and add the outer fields around it when serializing it. Companion to elastic/ml-cpp#2376
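The shape of the change might look roughly like the following. This is an illustrative sketch only: the exact field names and payload are not spelled out in this PR, so the `request_id`, `cache_hit`, and `result` keys here are assumptions.

```json
// Before: per-request fields buried inside the result object,
// so the cached response had to be re-parsed to update them.
{"result": {"request_id": "abc", "time_ms": 5, "predicted_value": [0.1, 0.9]}}

// After: per-request fields at the top level, wrapped around an
// unmodified cached inner object.
{"request_id": "abc", "cache_hit": true, "result": {"predicted_value": [0.1, 0.9]}}
```

With the per-request fields at the top level, the cached inner object can be emitted verbatim on every hit.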
pull bot pushed a commit to abitmore/ml-cpp that referenced this pull request on Aug 4, 2022
In elastic#2376 the results writer for inference was changed to splice a preformatted cached section of the final result document into an outer wrapper that varies per request. This approach worked when assertions were disabled, but it tripped assertions in RapidJSON because the library's internal counters treated the spliced portion as a single value rather than a key and value together. This PR fixes the failure when assertions are enabled by taking advantage of the fact that the internal value count of the RapidJSON writer is accessible to derived classes: we add a method that writes a key and value together and increments the counter twice.
davidkyle added a commit that referenced this pull request on Jun 19, 2023
Adapt the test script for the result format changes in #2376
Previously the inference cache stored complete results, including
a request ID and time taken. This was inefficient as it then meant
the original response had to be parsed and modified before sending
back to the Java side.
This PR changes the cache to store just the inner portion of the
inference result. Then the outer layer is added per request after
retrieving from the cache.
Additionally, the result writing functions are moved into a class
of their own, which means they can be unit tested.
Companion to elastic/elasticsearch#88901