[7.x][ML] Improve progress reportings for DF analytics (#45856) #45910

Merged
dimitris-athanasiou merged 1 commit into elastic:7.x from dimitris-athanasiou:better-progress-tracking-for-df-analytics-7x
Aug 23, 2019

@dimitris-athanasiou
Contributor

Previously, the stats API reported a progress percentage
only for DF analytics tasks that were running and in the
`reindexing` or `analyzing` state.

This means that when the task is `stopped`, no progress is
reported. Thus, one cannot distinguish a task that never
ran from one that completed.

In addition, there are blind spots in the progress reporting.
In particular, we do not account for when data is loaded into the
process. We also do not account for when results are written.

This commit addresses the above issues. It changes progress
to a list of objects, each one describing a phase
and its progress as a percentage. We currently have 4 phases:
`reindexing`, `loading_data`, `analyzing`, `writing_results`.
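
To illustrate the new shape, here is a minimal Python sketch of a per-phase progress list. The four phase names come from this description; the field names `phase` and `progress_percent` and the helper functions are assumptions made for illustration, not the actual response format or code from this commit.

```python
# Phase names as listed in the PR description, in pipeline order.
PHASES = ["reindexing", "loading_data", "analyzing", "writing_results"]


def new_progress():
    """Return a progress list with every phase at 0 percent.

    The keys "phase" and "progress_percent" are assumed names.
    """
    return [{"phase": name, "progress_percent": 0} for name in PHASES]


def has_started(progress):
    """True if any phase has made progress."""
    return any(entry["progress_percent"] > 0 for entry in progress)


def is_complete(progress):
    """True if every phase has reached 100 percent."""
    return all(entry["progress_percent"] == 100 for entry in progress)
```

With one entry per phase, a stopped task that never ran (all phases at 0) can be told apart from one that completed (all phases at 100), which a single percentage reported only while running could not express.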

When the task stops, progress is persisted as a document in the
state index. The stats API now reports progress from memory
if the task is running, or returns the persisted document
(if there is one).
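
The lookup order described above can be sketched roughly as follows. This is a hedged illustration only: plain dictionaries stand in for the running tasks and the state index, and `progress_doc_id`, `stop_task`, and `report_progress` are hypothetical names. Returning `None` when no document exists is also an assumption, since the description does not say what is reported in that case.

```python
def progress_doc_id(task_id):
    """Hypothetical document ID scheme for persisted progress."""
    return f"{task_id}_progress"


def stop_task(task_id, running_tasks, state_index):
    """Persist in-memory progress when a task stops.

    running_tasks: dict of task ID -> in-memory progress list (stand-in).
    state_index: dict of document ID -> persisted progress (stand-in).
    """
    progress = running_tasks.pop(task_id)
    state_index[progress_doc_id(task_id)] = progress


def report_progress(task_id, running_tasks, state_index):
    """Prefer in-memory progress; fall back to the persisted document."""
    if task_id in running_tasks:
        return running_tasks[task_id]
    # Assumption: None signals that nothing was ever persisted.
    return state_index.get(progress_doc_id(task_id))
```

In this sketch a stopped task keeps reporting the progress it had when it stopped, while a task that was never started has neither in-memory state nor a persisted document.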

@elasticmachine
Collaborator

Pinging @elastic/ml-core

dimitris-athanasiou merged commit be554fe into elastic:7.x on Aug 23, 2019
dimitris-athanasiou deleted the better-progress-tracking-for-df-analytics-7x branch on August 23, 2019 20:04