[ML] Inference returns different predicted_value comparing with model prediction #55332

@wwang500

Spotted in master build 6892.

When using the seeds dataset, inference returned a different predicted_value than the model prediction. The model returns 1, 2, 3, but inference returns 1.0, 2.0, 0.0.

To reproduce:

  1. Create a classification job.
{
  "source": {
    "index": [
      "seeds"
    ]
  },
  "dest": {
    "index": "dest_seeds_80",
    "results_field": "ml"
  },
  "analysis": {
    "classification": {
      "dependent_variable": "seed_class",
      "num_top_feature_importance_values": 2,
      "class_assignment_objective": "maximize_minimum_recall",
      "num_top_classes": 2,
      "prediction_field_name": "seed_class_prediction",
      "training_percent": 80
    }
  }
}
  2. Run the job, then create a pipeline, then run inference.
  • Pipeline configuration:
PUT _ingest/pipeline/inference_pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "dfa_seeds_1587034738_000_0-1587049148665",
        "target_field": "class_prediction_infer",
        "field_map": {},
        "inference_config": {
          "classification": {
            "num_top_feature_importance_values": 3
          }
        }
      }
    }
  ]
}
  3. Now run aggregations on both predictions against the dest index. You will get:
"aggregations" : {
    "inference" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1.0,
          "doc_count" : 71
        },
        {
          "key" : 2.0,
          "doc_count" : 70
        },
        {
          "key" : 0.0,
          "doc_count" : 69
        }
      ]
    },
    "ml" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 71
        },
        {
          "key" : 2,
          "doc_count" : 70
        },
        {
          "key" : 3,
          "doc_count" : 69
        }
      ]
    }
  }
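
To make the mismatch in the output above explicit, here is a minimal Python sketch. It does two things: it builds a search body that could produce the two terms aggregations, and it pairs the `ml` and `inference` buckets by their doc counts. The bucket values are copied from the output above. The field names `ml.seed_class_prediction` and `class_prediction_infer.predicted_value` are assumptions derived from the job and pipeline configuration, not confirmed by the report, and the helper names are hypothetical. Pairing by doc count is also an assumption: equal counts suggest, but do not prove, that the buckets hold the same documents.

```python
# Bucket keys and doc counts copied from the aggregation output in the report.
INFERENCE_BUCKETS = [
    {"key": 1.0, "doc_count": 71},
    {"key": 2.0, "doc_count": 70},
    {"key": 0.0, "doc_count": 69},
]
ML_BUCKETS = [
    {"key": 1, "doc_count": 71},
    {"key": 2, "doc_count": 70},
    {"key": 3, "doc_count": 69},
]


def build_agg_request(ml_field, inference_field):
    """Build a search body with one terms aggregation per prediction field.

    The aggregation names "inference" and "ml" match the reported output;
    the field names passed in are assumptions (see the call below).
    """
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "aggs": {
            "inference": {"terms": {"field": inference_field}},
            "ml": {"terms": {"field": ml_field}},
        },
    }


def pair_by_doc_count(ml_buckets, inference_buckets):
    """Map each ml key to the inference key with the same doc_count.

    Raises ValueError when a doc_count is duplicated or has no partner,
    because the pairing would then be ambiguous.
    """
    by_count = {}
    for bucket in inference_buckets:
        if bucket["doc_count"] in by_count:
            raise ValueError("duplicate doc_count in inference buckets")
        by_count[bucket["doc_count"]] = bucket["key"]

    mapping = {}
    for bucket in ml_buckets:
        if bucket["doc_count"] not in by_count:
            raise ValueError("no inference bucket with matching doc_count")
        mapping[bucket["key"]] = by_count[bucket["doc_count"]]
    return mapping


def mismatches(mapping):
    """Keep only the pairs whose numeric values differ."""
    return {k: v for k, v in mapping.items() if float(k) != float(v)}


# Assumed field names: results_field "ml" plus prediction_field_name, and
# target_field plus the "predicted_value" field named in the issue title.
REQUEST = build_agg_request(
    "ml.seed_class_prediction",
    "class_prediction_infer.predicted_value",
)
MAPPING = pair_by_doc_count(ML_BUCKETS, INFERENCE_BUCKETS)
MISMATCHES = mismatches(MAPPING)

print(MAPPING)
print(MISMATCHES)
```

Under the doc-count pairing, classes 1 and 2 agree and only class 3 differs: it comes back from inference as 0.0. A per-document comparison of the two fields would be needed to confirm this.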

Labels: :ml (Machine learning)