[ML] Validate classification dependent_variable cardinality is at lea…#51232

Merged
dimitris-athanasiou merged 5 commits into elastic:master from
dimitris-athanasiou:validate-classification-dep-var-cardinality-at-least-two
Jan 22, 2020

@dimitris-athanasiou
Contributor

…st two

Data frame analytics classification currently supports only 2 classes for the
dependent variable. We were already checking that the field's cardinality is not
higher than 2, but we should also check that it is not lower than 2, as otherwise
the process fails.
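
To illustrate the check described above, here is a minimal sketch of a two-sided cardinality validation. It is not the code from this PR: the class `CardinalityCheckSketch`, the nested `CardinalityConstraint`, the helper `isValid`, the field name `label`, and the error messages are all hypothetical names chosen for the example. It only assumes that the cardinality of the dependent variable has already been computed as a `long`.

```java
import java.util.Objects;

/** Illustrative only: all names here are hypothetical, not the PR's actual classes. */
public class CardinalityCheckSketch {

    /** Inclusive [lowerBound, upperBound] constraint on the number of distinct values of a field. */
    static final class CardinalityConstraint {
        private final String field;
        private final long lowerBound;
        private final long upperBound;

        CardinalityConstraint(String field, long lowerBound, long upperBound) {
            if (lowerBound > upperBound) {
                throw new IllegalArgumentException("lowerBound must not exceed upperBound");
            }
            this.field = Objects.requireNonNull(field);
            this.lowerBound = lowerBound;
            this.upperBound = upperBound;
        }

        /** Throws if the observed cardinality falls outside the allowed range. */
        void check(long cardinality) {
            if (cardinality < lowerBound) {
                // The lower-bound case: too few distinct values.
                throw new IllegalArgumentException("Field [" + field + "] must have at least ["
                    + lowerBound + "] distinct values but there were [" + cardinality + "]");
            }
            if (cardinality > upperBound) {
                // The upper-bound case: too many distinct values.
                throw new IllegalArgumentException("Field [" + field + "] must have at most ["
                    + upperBound + "] distinct values but there were [" + cardinality + "]");
            }
        }
    }

    /** Returns true when check() accepts the cardinality; used only for the demo and tests. */
    static boolean isValid(CardinalityConstraint constraint, long cardinality) {
        try {
            constraint.check(cardinality);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Exactly two classes: lower and upper bound are both 2.
        CardinalityConstraint twoClasses = new CardinalityConstraint("label", 2, 2);
        for (long cardinality : new long[] {1, 2, 3}) {
            System.out.println(cardinality + " -> " + isValid(twoClasses, cardinality));
        }
    }
}
```

Expressing the rule as a single range with both bounds set to 2 keeps the "at most" and "at least" checks in one place, so a later change to the supported number of classes would only touch the bounds.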

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@przemekwitek przemekwitek left a comment

LGTM

Just a few minor comments

s/limits/constraints ?

s/Limits/Constraints ?

There is a "Matchers.empty()" matcher that could be used here.

There is a "Matchers.empty()" matcher that could be used here.

Contributor Author

Note that this is now necessary because we check the field cardinality before we call startAnalytics, which is what refreshes the dest index.

@dimitris-athanasiou
Contributor Author

@przemekwitek I have addressed all your points and also fixed a bug in the refreshing of the dest index, which was caught by the tests.

Should this be reverted?

Contributor Author

Ah, yes of course.

@przemekwitek

> @przemekwitek I have addressed all your points and also fixed a bug in the refreshing of the dest index, which was caught by the tests.

Please see my comment in ClassificationIT. Otherwise the PR is good to go.

dimitris-athanasiou force-pushed the validate-classification-dep-var-cardinality-at-least-two branch from e51393e to eab85d5 on January 22, 2020 12:03
dimitris-athanasiou merged commit a6fa577 into elastic:master on Jan 22, 2020
dimitris-athanasiou deleted the validate-classification-dep-var-cardinality-at-least-two branch on January 22, 2020 13:47
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 22, 2020
…t lea… (elastic#51232)

Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.

Backport of elastic#51232
dimitris-athanasiou added a commit that referenced this pull request Jan 22, 2020
…t lea… (#51232) (#51309)

Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.

Backport of #51232
dimitris-athanasiou added a commit that referenced this pull request Jan 22, 2020
…t lea… (#51232) (#51310)

Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.

Backport of #51232