Found in 7.7.0-SNAPSHOT "build_hash" : "2f0aca992bb8c91c17603050807891cad2e41483", "build_date" : "2020-03-16T02:52:34.086738Z",
- 3-node cluster, all nodes acting as data, master, and ml nodes
- All nodes are co-located on the same 16GB VM
- "xpack.ml.max_machine_memory_percent" : 16
I have a script that creates 16 jobs in succession. Each job requires 2GB model memory.
The first 3 jobs open and the datafeeds start.
The 4th job returns opened:false and the datafeed fails to start with the following:
open job {"opened":false}
start datafeed {"error":{"root_cause":[{"type":"status_exception","reason":"Could not start datafeed, allocation explanation []"}],"type":"status_exception","reason":"Could not ...
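The reproduction script itself is not shown above; the loop below is a minimal sketch of the kind of script described. The job names, index name, and detector configuration are illustrative assumptions — only the 2GB model memory limit and the create/open/start sequence come from the report.

```shell
#!/bin/sh
# Illustrative reproduction sketch: create 16 anomaly detection jobs,
# each with a 2GB model memory limit, open each one, and start its datafeed.
# Job names, index name, and detector config are assumptions.
ES=localhost:9200
for i in $(seq 1 16); do
  # Create a job requiring 2GB of model memory.
  curl -s -XPUT "$ES/_ml/anomaly_detectors/test-job-$i" \
    -H 'Content-Type: application/json' -d '{
      "analysis_config": {
        "bucket_span": "15m",
        "detectors": [ { "function": "count" } ]
      },
      "analysis_limits": { "model_memory_limit": "2048mb" },
      "data_description": { "time_field": "@timestamp" }
    }'
  # Create a datafeed for the job.
  curl -s -XPUT "$ES/_ml/datafeeds/datafeed-test-job-$i" \
    -H 'Content-Type: application/json' \
    -d "{ \"job_id\": \"test-job-$i\", \"indices\": [ \"test-data\" ] }"
  # Open the job, then start the datafeed. From the 4th job onward,
  # the open call returns {"opened":false} and the start call fails.
  curl -s -XPOST "$ES/_ml/anomaly_detectors/test-job-$i/_open"
  curl -s -XPOST "$ES/_ml/datafeeds/datafeed-test-job-$i/_start"
done
```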
In the job list, the job state is opening and the datafeed state is stopped. No errors are visible.
When one of the first 3 jobs completes, one of the opening jobs transitions its state to opened. However, the datafeed remains stopped.

These are the job messages for a job that was lazy opening.

Expected behavior would be for the datafeed to be in the starting state, and for it to start once resources became available (which, in this scenario, would happen when one of the other jobs closed).
Once jobs have completed, I can manually start the datafeed on one of the opened jobs and it will complete without on-screen errors. (I cannot start one of the opening jobs, which is to be expected.)
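The manual workaround described above amounts to re-issuing the start request for a job that has reached the opened state (the job name here is an assumption carried over from the sketch above):

```shell
# Manually start the datafeed for a job that is now opened; the datafeed
# then runs to completion without errors. Job name is illustrative.
curl -s -XPOST "localhost:9200/_ml/datafeeds/datafeed-test-job-4/_start"
```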