Score threshold not applied to reranked score in hybrid search #35233

@d5devgodai-blip

Description

Self Checks

  • I have read the Contributing Guide and Language Policy.
  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report, otherwise it will be closed.
  • [Chinese users & non-English users] Please submit in English, otherwise the issue will be closed :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

1.13.3

Cloud or Self Hosted

Cloud

Steps to reproduce

Environment

  • Platform: Dify Cloud
  • Retrieval mode: Hybrid search
  • Embedding model: text-embedding-3-large (OpenAI)
  • Rerank model: jina-reranker-v1-base-en (Jina)

Describe the bug

When using hybrid search with a rerank model configured, setting a score threshold of 0.60 returns empty results. However, lowering the threshold to 0.45 returns chunks that display scores of 0.84-0.96.

If the returned chunks are displaying scores of 0.84-0.96, they should have easily passed a threshold of 0.60. This is contradictory and confusing behavior.

This indicates that the threshold is being applied against a different score than what is displayed in the result. The user sets a threshold expecting it to filter against the displayed score — but it does not.


Steps to reproduce

  1. Create a knowledge base with hybrid search enabled
  2. Configure Jina reranker
  3. Set score threshold to 0.60 on retrieval node
  4. Send a query
  5. Retrieval node returns empty result
  6. Lower threshold to 0.45 on a second retrieval node with the same query
  7. Returns 10 chunks with displayed scores of 0.84-0.96

Proof

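| Node   | Threshold | Chunks returned | Displayed score range |
|--------|-----------|-----------------|-----------------------|
| Node 1 | 0.60      | 0               | n/a                   |
| Node 2 | 0.45      | 10              | 0.84-0.96             |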

The same chunks that were blocked at threshold 0.60 display scores of 0.84-0.96 when retrieved at threshold 0.45. This means the threshold and the displayed score are not measuring the same thing.


✔️ Expected Behavior

The score threshold should filter against the reranked score — which is the score displayed to the user in the result.

If a chunk displays a score of 0.84, it must pass a threshold of 0.60. The threshold value and the displayed score must be consistent so that users can set a meaningful threshold based on what they see.


❌ Actual Behavior

The threshold filters against the raw hybrid score before reranking. The displayed score is the reranked score. These are two completely different scoring systems with different value distributions — but the same threshold value is silently applied to both without any indication to the user.

This means:

  • A user sets threshold 0.60 expecting to filter out chunks scoring below 0.60
  • But chunks scoring 0.84-0.96 (displayed) are blocked
  • The user has no visibility into the raw hybrid score that is actually used for filtering
  • The displayed score gives a false impression of what passed the threshold
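
The mismatch listed above can be reproduced in isolation. The following is a minimal sketch of the reported behavior, not Dify's actual code: the `Chunk` class, the `filter_then_rerank` function, and the raw hybrid scores are hypothetical, chosen only to mirror the numbers in this report.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    hybrid_score: float  # hypothetical raw hybrid score (before reranking)
    rerank_score: float  # hypothetical reranker score (the one shown in the UI)


def filter_then_rerank(chunks: list[Chunk], threshold: float) -> list[float]:
    """Filter on the raw hybrid score, then return the reranked scores for display."""
    kept = [c for c in chunks if c.hybrid_score >= threshold]
    kept.sort(key=lambda c: c.rerank_score, reverse=True)
    return [c.rerank_score for c in kept]


chunks = [
    Chunk("a", hybrid_score=0.52, rerank_score=0.96),
    Chunk("b", hybrid_score=0.48, rerank_score=0.84),
]

print(filter_then_rerank(chunks, 0.60))  # []
print(filter_then_rerank(chunks, 0.45))  # [0.96, 0.84]
```

With these made-up values, a threshold of 0.60 removes every chunk even though each one would display a score above 0.80, which matches what the two retrieval nodes show.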

Impact

  • Valid queries return empty results at reasonable threshold values
  • The score threshold setting is misleading — it does not behave as the UI implies
  • Users are forced to use unnecessary workarounds such as multiple retrieval nodes with different thresholds to compensate for this behavior

Current workaround

Two retrieval nodes and a code node are required:

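| Node      | Threshold | Purpose                                   |
|-----------|-----------|-------------------------------------------|
| Node 1    | 0.60      | Primary retrieval                         |
| Node 2    | 0.45      | Fallback when Node 1 returns empty        |
| Code node | 0.65      | Manual filter on displayed reranked score |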

This workaround should not be necessary.
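
For reference, the manual filter in the code node looks roughly like the sketch below. It assumes the retrieval node's output is passed in as a list of dicts with the displayed score at `metadata.score`; that layout, the input name `chunks`, and the output key `result` are assumptions and may need adjusting.

```python
THRESHOLD = 0.65  # applied to the displayed (reranked) score


def main(chunks: list) -> dict:
    """Keep only chunks whose displayed score meets THRESHOLD."""
    kept = []
    for chunk in chunks or []:
        # Assumed layout: {"content": ..., "metadata": {"score": ...}}
        score = (chunk.get("metadata") or {}).get("score")
        if score is not None and score >= THRESHOLD:
            kept.append(chunk)
    return {"result": kept}
```

Chunks without a score are dropped rather than passed through, so a missing field cannot bypass the filter.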


Suggested fix

When a rerank model is configured, the score threshold should be applied against the reranked score — the same score that is displayed to the user. The threshold and the displayed score must always refer to the same value.
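
In outline, the requested order of operations is rerank first, then threshold. The sketch below is illustrative only: `rerank_then_filter` and the `score_fn` callable are hypothetical stand-ins, not Dify or Jina APIs.

```python
from typing import Callable


def rerank_then_filter(
    query: str,
    docs: list[str],
    score_fn: Callable[[str, str], float],  # hypothetical reranker: (query, doc) -> score
    threshold: float,
    top_k: int = 10,
) -> list[tuple[str, float]]:
    """Score every candidate with the reranker, then threshold on that same score."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    passed = [(doc, s) for doc, s in scored if s >= threshold]
    passed.sort(key=lambda pair: pair[1], reverse=True)
    return passed[:top_k]


# Made-up reranker scores for three candidate chunks.
fake_scores = {"a": 0.96, "b": 0.84, "c": 0.30}
results = rerank_then_filter("q", list(fake_scores), lambda q, d: fake_scores[d], 0.60)
print(results)  # [('a', 0.96), ('b', 0.84)]
```

Because the score that is compared against the threshold is the same one returned to the caller, any chunk displayed with a score of 0.84 is guaranteed to have passed a threshold of 0.60.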


Related issue

#3146

Labels

🐞 bug (Something isn't working)
