Implement new analysis type: classification#46537
Implement new analysis type: classification#46537przemekwitek merged 4 commits intoelastic:masterfrom
Conversation
0193e12 to
6f92943
Compare
0f6518c to
1429840
Compare
ede5e3d to
07ffd52
Compare
e40d5a3 to
1d71028
Compare
|
Pinging @elastic/ml-core |
|
run elasticsearch-ci/bwc |
747d4bd to
96c48a1
Compare
...in/core/src/main/java/org/elasticsearch/xpack/core/ml/dataframe/analyses/Classification.java
Outdated
Show resolved
Hide resolved
...in/core/src/main/java/org/elasticsearch/xpack/core/ml/dataframe/analyses/Classification.java
Outdated
Show resolved
Hide resolved
...plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/dataframe/analyses/Regression.java
Outdated
Show resolved
Hide resolved
...plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/dataframe/analyses/Regression.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
It almost seems like we need a new interface for the different analysis.
Unsupervised vs supervised... But that might be a future refactoring
There was a problem hiding this comment.
It almost seems like we need a new interface for the different analysis.
Yes, we may end up doing that.
But that might be a future refactoring
Agree, let's not add more interfaces too early.
...asticsearch/xpack/ml/dataframe/process/customprocessing/DatasetSplittingCustomProcessor.java
Outdated
Show resolved
Hide resolved
benwtrent
left a comment
There was a problem hiding this comment.
might be good to have @dimitris-athanasiou give it a once over :). I don't see any major problems
dde7a9f to
4a2583f
Compare
c35932e to
ba8bd1d
Compare
dimitris-athanasiou
left a comment
There was a problem hiding this comment.
Looks good! Left a few minor comments.
There was a problem hiding this comment.
Clever trick for reusing code.
However, this made me wonder whether those params should be in a nested object. It'd be ugly though, wouldn't it?
There was a problem hiding this comment.
It's a matter of taste ;)
Parsing code would actually become a bit cleaner as I could just declare the BoostedTreeParams field here and it would have its own parser.
However, with nested object:
- we need to double-check which parameters we want to move there. I just picked the obvious ones but maybe e.g.
dependentVariableshould live there as well? - we need to add BWC handling
WDYT?
There was a problem hiding this comment.
Yeah, I agree we can leave it as is.
There was a problem hiding this comment.
Perhaps add a default constructor for BoostedTreeParams to avoid those nulls?
There was a problem hiding this comment.
Does num_top_classes have a fixed default value? If so we should set it explicitly.
There was a problem hiding this comment.
Done.
I think the default value should be "0".
There was a problem hiding this comment.
We need to add BWC handling here.
There was a problem hiding this comment.
I think the code (as it is written right now) is backward-compatible as the sequence of StreamInput reads in the old version is the same as in the new version (the new version has the reads wrapped in the new BoostedTreeParams(in) constructor.
It would change, however, if I introduced a nested object.
|
run elasticsearch-ci/bwc |
374f275 to
18ee05b
Compare
|
run elasticsearch-ci/bwc |
|
run elasticsearch-ci/default-distro |
18ee05b to
08a9fc1
Compare
08a9fc1 to
00541d9
Compare
Implement new analysis type:
Classification.Also, extract the common parameters between
ClassificationandRegressionto a separate class:BoostedTreeParams.This PR is not fully functional until changes on C++ are made (WIP).
However, I've sent it to review to gather feedback about the Java part.
Relates #46735