[CI] DefaultEndPointsIT testInferDeploysDefaultRerank failing #144162

@elasticsearchmachine

Build Scans:

Reproduction Line:

./gradlew ":x-pack:plugin:inference:qa:inference-service-tests:javaRestTest" --tests "org.elasticsearch.xpack.inference.DefaultEndPointsIT.testInferDeploysDefaultRerank" -Dtests.seed=E4F8862B5D8BEFE2 -Dtests.locale=it-Latn-IT -Dtests.timezone=America/Indiana/Marengo -Druntime.java=25

Applicable branches:
main

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

org.elasticsearch.client.ResponseException: method [POST], host [http://[::1]:42505], URI [_inference/rerank/.rerank-v1-elasticsearch?error_trace&timeout=120s], status line [HTTP/1.1 500 Internal Server Error]
{"error":{"root_cause":[{"type":"exception","reason":"Error writing to inference process","stack_trace":"org.elasticsearch.ElasticsearchException: Error writing to inference process\n\tat org.elasticsearch.xcore@9.4.0-SNAPSHOT/org.elasticsearch.xpack.core.ml.utils.ExceptionsHelper.serverError(ExceptionsHelper.java:69)\n\tat org.elasticsearch.ml@9.4.0-SNAPSHOT/org.elasticsearch.xpack.ml.inference.deployment.InferencePyTorchAction.doRun(InferencePyTorchAction.java:157)\n\tat org.elasticsearch.server@9.4.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1114)\n\tat org.elasticsearch.server@9.4.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)\n\tat org.elastics
[truncated]
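
For manual triage, the failing call can be approximated outside the test harness. The following is a minimal sketch, not the test's own code: the endpoint id and query parameters come from the failure message above, while the base URL, the sample inputs, and the helper name `build_rerank_request` are hypothetical. It also assumes the request body uses `query` and `input` fields; the payload the test actually sends is not shown in this report. Authentication is omitted.

```python
import json
import urllib.parse
import urllib.request

# Endpoint id taken from the URI in the failure message.
ENDPOINT_ID = ".rerank-v1-elasticsearch"


def build_rerank_request(base_url, query, documents, timeout="120s"):
    """Build (but do not send) a POST shaped like the one that returned HTTP 500."""
    path = "/_inference/rerank/" + urllib.parse.quote(ENDPOINT_ID)
    # error_trace is a valueless flag in the failing URI; timeout mirrors it.
    params = "error_trace&" + urllib.parse.urlencode({"timeout": timeout})
    url = base_url.rstrip("/") + path + "?" + params
    # Assumption: the body uses "query" and "input"; the test's real payload
    # is not included in this report.
    body = json.dumps({"query": query, "input": documents}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical host and inputs, for illustration only.
    req = build_rerank_request("http://localhost:9200", "example query", ["doc a", "doc b"])
    print(req.get_method(), req.full_url)
    # To actually send it against a running node: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it possible to inspect the exact URL and body before pointing it at a node.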

Issue Reasons:

  • [main] 4 consecutive failures in test testInferDeploysDefaultRerank
  • [main] 2 consecutive failures in step debian-12_platform-support-unix
  • [main] 2 consecutive failures in step rhel-10_platform-support-unix
  • [main] 2 consecutive failures in step sles-15_platform-support-unix
  • [main] 7 consecutive failures in step part-2
  • [main] 2 consecutive failures in step amazonlinux-2023_platform-support-aws
  • [main] 5 consecutive failures in step part2
  • [main] 2 consecutive failures in step ubuntu-2404-aarch64_checkpart2_platform-support-arm
  • [main] 2 consecutive failures in step almalinux-8-aarch64_checkpart2_platform-support-arm
  • [main] 6 consecutive failures in pipeline elasticsearch-pull-request
  • [main] 5 consecutive failures in pipeline elasticsearch-intake
  • [main] 30 failures in test testInferDeploysDefaultRerank (9.6% fail rate in 312 executions)
  • [main] 2 failures in step debian-12_platform-support-unix (100.0% fail rate in 2 executions)
  • [main] 2 failures in step rhel-10_platform-support-unix (100.0% fail rate in 2 executions)
  • [main] 2 failures in step sles-15_platform-support-unix (100.0% fail rate in 2 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

Labels:

  • :ml (Machine learning)
  • >test-failure (Triaged test failures from CI)
  • Team:ML (Meta label for the ML team)
  • needs:risk (Requires assignment of a risk label: low, medium, blocker)
