[ML] Store anomaly detection config file and input on demand#110582

Merged
valeriy42 merged 10 commits into elastic:record-anomaly-detection-messages from valeriy42:record_ad_values
Jul 8, 2024

@valeriy42
Contributor

@valeriy42 valeriy42 commented Jul 8, 2024

DO NOT MERGE THIS INTO main!

This PR makes it possible to store the input data and configuration of an anomaly detection job in files, so that the job can be reproduced using the autodetect process without Elasticsearch.

To enable the storage, set the keep_job_data parameter inside the custom_settings object of the job config:

  "custom_settings": {
    "keep_job_data": "true"
  }
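
As a rough illustration of the setting above, the following Python sketch adds the flag to a job config held as a dictionary. The helper name `with_keep_job_data` and the example job body are hypothetical and not part of this PR; the value is written as the string "true" only because the snippet above does so.

```python
import json
from copy import deepcopy


def with_keep_job_data(job_config: dict) -> dict:
    """Return a copy of job_config with keep_job_data set in custom_settings."""
    updated = deepcopy(job_config)
    # custom_settings may be missing or null in an existing job config.
    settings = updated.get("custom_settings") or {}
    # The PR description uses the string "true", so this sketch does the same.
    settings["keep_job_data"] = "true"
    updated["custom_settings"] = settings
    return updated


# Hypothetical minimal job config, for illustration only.
example_job = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [{"function": "count"}],
    },
    "data_description": {"time_field": "timestamp"},
}

print(json.dumps(with_keep_job_data(example_job), indent=2))
```

Existing entries in custom_settings are preserved; only the keep_job_data key is added or overwritten.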

Now, start the job and watch for a log message with the autodetect command similar to the following:

[2024-06-19T16:03:38,248][INFO ][o.e.x.m.j.p.a.NativeAutodetectProcessFactory] [Elastic-MBP.fritz.box] Autodetect process command: [./autodetect, --lengthEncodedInput, --maxAnomalyRecords=500, --validElasticLicenseKeyConfirmed=true, --config=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/config10764979302390040373.json, --logPipe=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_log_45530, --input=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_input_45530, --inputIsPipe, --output=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_output_45530, --outputIsPipe, --persist=/var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_persist_45530, --persistIsPipe, --namedPipeConnectTimeout=10]

and

[2024-06-19T15:29:08,640][INFO ][o.e.x.m.p.w.LengthEncodedWriter]  Opening file: /var/folders/_j/gcj6z4b950bdzpw7_fzrmpf40000gn/T/elasticsearch-12668972032551307591/autodetect_test-2_input_45530 for writing.

Copy the config file and the persist file named in the first message, and the input file named in the second message.
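
Picking the three paths out of the log by hand is error-prone, so here is a minimal Python sketch of how they could be extracted. It assumes the log lines have the shape shown above, that arguments in the command are separated by ", ", and that paths contain no ", " themselves. The function names and the shortened /tmp/es paths are hypothetical.

```python
import re

# Matches the argument list logged by NativeAutodetectProcessFactory.
COMMAND_RE = re.compile(r"Autodetect process command: \[(.*)\]\s*$")
# Matches the input file path logged by LengthEncodedWriter.
INPUT_RE = re.compile(r"Opening file: (.+?) for writing")


def parse_command_flags(log_line: str) -> dict:
    """Return the --name=value arguments of an autodetect command log line."""
    match = COMMAND_RE.search(log_line)
    if match is None:
        return {}
    flags = {}
    for arg in match.group(1).split(", "):
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            flags[name] = value
    return flags


def parse_input_path(log_line: str):
    """Return the path from an 'Opening file: ... for writing' line, or None."""
    match = INPUT_RE.search(log_line)
    return match.group(1) if match else None


def files_to_copy(log_lines) -> dict:
    """Collect the config, persist, and input paths from a sequence of log lines."""
    found = {}
    for line in log_lines:
        flags = parse_command_flags(line)
        for key in ("config", "persist"):
            if key in flags:
                found[key] = flags[key]
        path = parse_input_path(line)
        if path is not None:
            found["input"] = path
    return found


# Shortened, hypothetical log lines in the same shape as the messages above.
example_lines = [
    "[2024-06-19T16:03:38,248][INFO ][o.e.x.m.j.p.a.NativeAutodetectProcessFactory] [node] "
    "Autodetect process command: [./autodetect, --lengthEncodedInput, "
    "--config=/tmp/es/config1.json, --input=/tmp/es/job_input_1, --inputIsPipe, "
    "--persist=/tmp/es/job_persist_1, --persistIsPipe, --namedPipeConnectTimeout=10]",
    "[2024-06-19T15:29:08,640][INFO ][o.e.x.m.p.w.LengthEncodedWriter]  "
    "Opening file: /tmp/es/job_input_1 for writing.",
]

for name, path in files_to_copy(example_lines).items():
    print(f"{name}: {path}")
```

The input path is taken from the second message rather than from the --input argument, following the instruction above.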

@valeriy42 requested a review from a team as a code owner Jul 8, 2024, 11:27
@elasticsearchmachine added the needs:triage (Requires assignment of a team area label) label Jul 8, 2024
@valeriy42 added the >non-issue and :ml (Machine learning) labels and removed the needs:triage label Jul 8, 2024
@elasticsearchmachine added the Team:ML (Meta label for the ML team) label Jul 8, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@valeriy42 valeriy42 merged commit 411ddcc into elastic:record-anomaly-detection-messages Jul 8, 2024
valeriy42 added a commit to valeriy42/elasticsearch that referenced this pull request Jul 24, 2024
valeriy42 added a commit to valeriy42/elasticsearch that referenced this pull request Aug 6, 2024

Labels

:ml (Machine learning), >non-issue, Team:ML (Meta label for the ML team), v8.16.0

2 participants