[7.x][ML] Validate at least one feature is available for DF analytics…#55914

Merged
dimitris-athanasiou merged 1 commit into elastic:7.x from
dimitris-athanasiou:validate-at-least-one-feature-available-7x
Apr 29, 2020

@dimitris-athanasiou
Contributor

… (#55876)

Previously, we checked that at least one supported field existed
when the _explain API was called. However, for analyses with
required fields (e.g. regression) we did not account for the fact
that the dependent variable is not a feature. Thus, if the source
index contains only the dependent variable field, there are no
features to train a model on.

This commit adds a validation that at least one feature is available
for analysis. It also moves that validation out of
ExtractedFieldsDetector and the _explain API and into the _start
API, so that the user can still call the _explain API to understand
why they are seeing such an error.
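
To make the check concrete, here is a minimal sketch of the idea, not the code in this commit. The class name, method names, and error message are all hypothetical; it only assumes that the selected fields and the required fields (such as a regression's dependent variable) are available as sets of field names.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch of the validation; not the actual Elasticsearch implementation. */
final class FeatureValidation {

    private FeatureValidation() {}

    /** Selected fields minus required fields (e.g. the dependent variable), which are not features. */
    static Set<String> featureFields(Set<String> selectedFields, Set<String> requiredFields) {
        Set<String> features = new LinkedHashSet<>(selectedFields);
        features.removeAll(requiredFields);
        return features;
    }

    /** Intended to run when the job is started; fails if nothing is left to train on. */
    static void validateAtLeastOneFeature(Set<String> selectedFields, Set<String> requiredFields) {
        if (featureFields(selectedFields, requiredFields).isEmpty()) {
            // The message text is illustrative only.
            throw new IllegalArgumentException(
                "at least one feature is required, but only required fields " + requiredFields + " were found");
        }
    }
}
```

With this sketch, an index containing only the dependent variable fails the check, while an index with the dependent variable plus one other supported field passes.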

For example, the user might be using an index whose fields are all
of unsupported types. If they start the job and get an error that
there are no features, they will wonder why that is. Calling the
_explain API will show them that all their fields are unsupported.
If the _explain API failed instead, there would be no way for the
user to understand why all those fields are ignored.
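
The reason for keeping _explain non-failing can be sketched as follows. This is a hypothetical illustration, not the real response format or the real list of supported types: the class name, the report strings, and the SUPPORTED_TYPES list are all assumptions. The point is that the report is produced even when every field ends up excluded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an explain-style report that never fails on "no features". */
final class FieldSelectionSketch {

    // Assumed for illustration only; the real supported types are defined by Elasticsearch.
    private static final List<String> SUPPORTED_TYPES = Arrays.asList("long", "double", "boolean", "keyword");

    private FieldSelectionSketch() {}

    /** Builds one report line per field (field name to mapping type), whether or not any field is usable. */
    static List<String> explain(Map<String, String> fieldTypes) {
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, String> field : fieldTypes.entrySet()) {
            boolean supported = SUPPORTED_TYPES.contains(field.getValue());
            report.add(field.getKey() + ": "
                + (supported ? "included" : "excluded (unsupported type [" + field.getValue() + "])"));
        }
        return report;
    }
}
```

Given an index with only unsupported field types, the sketch returns a line per field explaining the exclusion instead of throwing, which is the information the user needs after a failed start.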

Closes #55593

Backport of #55876

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

dimitris-athanasiou merged commit d9685a0 into elastic:7.x on Apr 29, 2020
dimitris-athanasiou deleted the validate-at-least-one-feature-available-7x branch on April 29, 2020 08:40

Labels

backport >bug :ml Machine learning
