Skip to content

[7.x] [ML] Fixing inference stats race condition (#55163)#55486

Merged
benwtrent merged 1 commit intoelastic:7.xfrom
benwtrent:backport/7.x/pr-55163
Apr 20, 2020
Merged

[7.x] [ML] Fixing inference stats race condition (#55163)#55486
benwtrent merged 1 commit intoelastic:7.xfrom
benwtrent:backport/7.x/pr-55163

Conversation

@benwtrent
Copy link
Copy Markdown
Member

Backports the following commits to 7.x:

`updateAndGet` could actually call the internal method more than once on contention.
If I read the JavaDocs, it says:
```* @param updateFunction a side-effect-free function```
So, it could be getting multiple updates on contention, thus having a race condition where stats are double counted.

To fix, I am going to use a `ReadWriteLock`. The `LongAdder` objects allows fast thread safe writes in high contention environments. These can be protected by the `ReadWriteLock::readLock`.

When stats are persisted, I need to call reset on all these adders. This is NOT thread safe if additions are taking place concurrently. So, I am going to protect with `ReadWriteLock::writeLock`.

This should prevent race conditions while allowing high (ish) throughput in the highly contention paths in inference.

I did some simple throughput tests and this change is not significantly slower and is simpler to grok (IMO).

closes  elastic#54786
@benwtrent benwtrent added :ml Machine learning backport labels Apr 20, 2020
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent benwtrent merged commit cabff65 into elastic:7.x Apr 20, 2020
@benwtrent benwtrent deleted the backport/7.x/pr-55163 branch April 20, 2020 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport :ml Machine learning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants