[ML] Label anomalies with multi_bucket_impact #34233
Conversation
Add the multi_bucket_impact field to record results.
Pinging @elastic/ml-core
droberts195
left a comment
I suggested a renaming change but I think IntelliJ's refactoring functionality (right click -> refactor -> rename) should make light work of it. It has the option to automatically rename arguments and accessors.
```java
 * Result fields (all detector types)
 */
public static final ParseField PROBABILITY = new ParseField("probability");
public static final ParseField IMPACT = new ParseField("multi_bucket_impact");
```
I think it would be more future-proof if the field were `MULTI_BUCKET_IMPACT` rather than just `IMPACT`. (You never know if we might need to add some other type of impact in the future.)
Same in the server-side file.
```java
private final String jobId;
private int detectorIndex;
private double probability;
private double impact;
```
Similar to above, I think this should be called `multiBucketImpact`, and the accessors should be renamed to match.
Also, I think it should be an object of type `Double` so that it can be `null`. If people get results from old versions the field won't exist, and having it as a primitive `double` would set it to 0.0 in those cases.
Same in the server-side file.
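The `Double` vs. `double` point can be sketched standalone. `Record` below is a hypothetical stand-in, not the actual `AnomalyRecord` class:

```java
// Hypothetical stand-in for the record class. The field is a boxed Double
// so that results from old versions, where the field is absent, stay null
// instead of silently defaulting to 0.0 as a primitive double would.
public class Record {
    private Double multiBucketImpact;

    public Double getMultiBucketImpact() {
        return multiBucketImpact;
    }

    public void setMultiBucketImpact(Double multiBucketImpact) {
        this.multiBucketImpact = multiBucketImpact;
    }

    public static void main(String[] args) {
        Record fromOldVersion = new Record(); // field never set
        System.out.println(fromOldVersion.getMultiBucketImpact()); // null, not 0.0
    }
}
```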
```java
&& this.detectorIndex == that.detectorIndex
&& this.bucketSpan == that.bucketSpan
&& this.probability == that.probability
&& this.impact == that.impact
```
This will need changing to `Objects.equals()` for type `Double`.
Same in the server-side file.
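The reason `==` stops working: once the field is a boxed `Double` it can be `null`, and `Objects.equals()` handles that case without a `NullPointerException`. A minimal standalone illustration:

```java
import java.util.Objects;

public class EqualsDemo {
    public static void main(String[] args) {
        Double a = null; // e.g. a record deserialized from an old version
        Double b = null;
        // Unboxing comparisons like (a == 0.5) would throw NullPointerException;
        // Objects.equals is null-safe and compares by value.
        System.out.println(Objects.equals(a, b));   // true: both null
        System.out.println(Objects.equals(a, 0.5)); // false, no exception
    }
}
```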
```java
jobId = in.readString();
detectorIndex = in.readInt();
probability = in.readDouble();
impact = in.readDouble();
```
To make this backwards compatible it needs to be:

```java
if (in.version().onOrAfter(Version.V_6_5_0)) {
    impact = in.readOptionalDouble();
}
```
```java
out.writeString(jobId);
out.writeInt(detectorIndex);
out.writeDouble(probability);
out.writeDouble(impact);
```
To make this backwards compatible it needs to be:

```java
if (out.version().onOrAfter(Version.V_6_5_0)) {
    out.writeOptionalDouble(impact);
}
```
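For context on what the optional write buys, here is a simplified standalone analogue of an optional-double encoding, using plain `DataOutputStream` rather than Elasticsearch's `StreamOutput`. The presence-flag-then-value layout is an assumption for illustration only:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class OptionalDoubleDemo {
    // Write a presence flag, then the value only if present,
    // so that null round-trips cleanly through the wire format.
    static void writeOptionalDouble(DataOutputStream out, Double v) throws IOException {
        out.writeBoolean(v != null);
        if (v != null) {
            out.writeDouble(v);
        }
    }

    static Double readOptionalDouble(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readDouble() : null;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeOptionalDouble(out, 0.42);
        writeOptionalDouble(out, null);

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(readOptionalDouble(in)); // 0.42
        System.out.println(readOptionalDouble(in)); // null
    }
}
```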
```java
builder.field(Job.ID.getPreferredName(), jobId);
builder.field(Result.RESULT_TYPE.getPreferredName(), RESULT_TYPE_VALUE);
builder.field(PROBABILITY.getPreferredName(), probability);
builder.field(IMPACT.getPreferredName(), impact);
```
When the type is changed to `Double`, this should be wrapped in a null check so we write nothing if it's `null`.
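A standalone sketch of that null guard, using a hand-built JSON string in place of the real `XContentBuilder`:

```java
public class NullGuardDemo {
    public static void main(String[] args) {
        Double multiBucketImpact = null; // absent in results from older versions
        StringBuilder json = new StringBuilder("{\"probability\":0.01");
        // Only emit the field when a value is present; a null field
        // is omitted from the serialized document entirely.
        if (multiBucketImpact != null) {
            json.append(",\"multi_bucket_impact\":").append(multiBucketImpact);
        }
        json.append('}');
        System.out.println(json);
    }
}
```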
```java
anomalyRecord.setActual(Collections.singletonList(randomDouble()));
anomalyRecord.setTypical(Collections.singletonList(randomDouble()));
anomalyRecord.setProbability(randomDouble());
anomalyRecord.setImpact(randomDouble());
```
To confirm `null` works once the type is changed to `Double`, surround this line with `if (randomBoolean()) {`.
Same for the server-side file.
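The suggested test change, sketched standalone with `java.util.Random` standing in for the test framework's `randomBoolean()`/`randomDouble()` helpers:

```java
import java.util.Random;

public class RandomImpactDemo {
    private static final Random RANDOM = new Random();

    public static void main(String[] args) {
        Double impact = null;
        // Roughly half the generated test records get a value;
        // the rest stay null, so both code paths are exercised.
        if (RANDOM.nextBoolean()) {
            impact = RANDOM.nextDouble();
        }
        System.out.println(impact == null || (impact >= 0.0 && impact < 1.0));
    }
}
```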
Will do - thanks for the quick review @droberts195
droberts195
left a comment
LGTM as long as the CI goes green.
If there's a failure it's most likely related to the BWC aspects.