Skip to content

SOLR-17319: CompoundQueryComponent scribbles continued with CompoundQueryComponentTest#3648

Draft
cpoerschke wants to merge 34 commits intoapache:mainfrom
cpoerschke:SOLR-17319-compound-query-component-continued
Draft

SOLR-17319: CompoundQueryComponent scribbles continued with CompoundQueryComponentTest#3648
cpoerschke wants to merge 34 commits intoapache:mainfrom
cpoerschke:SOLR-17319-compound-query-component-continued

Conversation

@cpoerschke
Copy link
Copy Markdown
Contributor

continuation of the earlier #2597 scribbles

https://issues.apache.org/jira/browse/SOLR-17319

Copy link
Copy Markdown
Contributor

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I freaking love the elegance / simplicity here. It re-uses / leverages the existing distributed search phases of the SearchComponent design (as I hinted in JIRA may be possible), and meanwhile making that design more flexible for this new use-case. And reuses TopDocs.rrf logic :-) I imagine this approach is doing query work in parallel without any extra code? That said, clearly this is a draft / POC. Perhaps there are serious flaws that aren't easily evident in a code review. Testing was minimal. I see no accommodations for faceting to ensure we don't wastefully compute the docSet.

@ercsonusharma Did you see this? Please give this a good review. I suspect the best solution may be a combination of both efforts. For example your (recently redone) tests, documentation, user experience (how to use it) are nice.

I'd be tempted to run both solutions with distributed tracing enabled to a local Zipkin/Jaeger to visualize what's going on. I used to have a shelved code snippet to instrument a test to do this.

@cpoerschke
Copy link
Copy Markdown
Contributor Author

I imagine this approach is doing query work in parallel without any extra code?

Yes, that's my understanding too.

I see no accommodations for faceting to ensure we don't wastefully compute the docSet.

Yeah, my understanding is that with this sort of approach there is conceptually no (obvious) support for faceting. Today, Solr's client would make two queries and then outside-of-Solr do the RRF locally -- conceptually that is what the approach here does too. Because the query work is in parallel and the doc sets are (I think) not passed back to where the fusion happens, then truly-correct faceting would not implicitly be available. ...

... having said that, currently the sub-queries are there and there are only sub-queries, so I imagine a fusion-aware faceting component might do an OR of the sub-queries and then send that for faceting purposes only i.e. with rows=0 to the shards.

final TopDocs[] hits = new TopDocs[crb.responseBuilders.size()];
for (int crb_idx = 0; crb_idx < crb.responseBuilders.size(); ++crb_idx) {

final SolrDocumentList sdl = crb.responseBuilders.get(crb_idx).getResponseDocs();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering where is this being populated?

if (rb instanceof CompoundResponseBuilder crb) {
for (var rb_i : crb.responseBuilders) {
if (rb_i.isThisFromMe(sreq)) {
super.handleResponses(rb_i, sreq);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using mergeIds on each of the crb and overriding the rb.resultIds?

@ercsonusharma
Copy link
Copy Markdown
Contributor

ercsonusharma commented Sep 19, 2025

Thanks for the draft @cpoerschke - appreciate it!
I did a quick review.

Below are my initial observations:
Honestly, I couldn't understand most of this part here, as this is a draft; however, I have added a few comments.
Creating a new stage in the distributed phase may create lots of complexity around other components. It makes sense when you are creating a new set of queries that have to be fired across the shards, but the ultimate goal of the phase introduced here is just to merge the results of all the shardresponse which can be done in the handleResponse method of QueryComponent without adding any additional phase. Rather, I see reusing the same stage of the distributed phase makes a lot of things easier to handle.

And reuses TopDocs.rrf logic :

I already checked TopDocs.rrf, and it is meant for rrf on TopDocs, especially for Lucene. I find it less effort in creating something ShardDoc.rrf rather than converting the SolrDocumentList to TopcDocs and back.

I imagine this approach is doing query work in parallel without any extra code?

Query work is already happening in parallel by design of the distributed process. The things that need to be parallelised is merging the docs per query post querying, which I see sequentially here as well.

I'm really doubtful at this point in time as to how the faceting & highlighting would work.

@@ -212,27 +212,31 @@ public void handleResponses(ResponseBuilder rb, ShardRequest sreq) {}

@Override
public void finishStage(ResponseBuilder rb) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Hide whitespace' diff mode makes it easier to see that the changes to this method are small (and extremely subtle, should add comments really, later).

@cpoerschke
Copy link
Copy Markdown
Contributor Author

Thanks @ercsonusharma for starting to take a look here!

Yes, there's relatively little code but I totally appreciate that it's not easy to understand.

... with distributed tracing enabled to a local Zipkin/Jaeger to visualize what's going on. ...

That's an interesting idea, thanks @dsmiley for sharing! @hossman's "Lifecycle of a Solr Search Request" talk slides and recording from quite-a-while-ago-now also spring to mind for me here.

Brief replies to some of your initial observations.

... Creating a new stage in the distributed phase may create lots of complexity around other components. ...

Yes, for each component it will need to be considered how it should behave in a fusion scenario and whether or not it should participate in the fusion stage, or just skip it, which is the implied default.

... Query work is already happening in parallel by design of the distributed process. ...

The distributed process provides parallelism on the shard level e.g. if we have 10 shards then all 10 will run in parallel. Within each shard however, as I understand the current #3418 code, the CombinedQueryComponent.process method will run one sub-query after the other, where the line 181 comment says "// TODO: to be parallelized" currently.

The parallelism here on #3648 is on the shard level and also on the sub-queries level:

  • CombinedQueryComponent.distributedProcess will for one sub-query after the other add a request to the outgoing queue of requests,
  • the search handler will send each added request to the 10 shards (so now we have 20 things happening in parallel!!),
  • the shards will process each request sent to them (running exactly only one of the sub-queries),
  • the 20 responses will be received and handled (with the subtlety of needing to make sure that the response is handled only by whoever added it).

... how the faceting & highlighting would work.

Answering only on the highlighting work at this time, I've advanced the proof-of-concept further so that the test case does cover highlighting. The highlighting test passes and it may help to develop an understanding on how that extra fusion stage fits into the picture.

... with distributed tracing enabled to a local Zipkin/Jaeger to visualize what's going on. I used to have a shelved code snippet to instrument a test to do this.

@dsmiley - I don't suppose you'd be in a position to dig up or dust off that code snippet?

@cpoerschke
Copy link
Copy Markdown
Contributor Author

I see no accommodations for faceting to ensure we don't wastefully compute the docSet.

... having said that, currently the sub-queries are there and there are only sub-queries, so I imagine a fusion-aware faceting component might do an OR of the sub-queries and then send that for faceting purposes only i.e. with rows=0 to the shards.

ca3088d explores this approach albeit without changes to the faceting component.

@dsmiley
Copy link
Copy Markdown
Contributor

dsmiley commented Oct 3, 2025

I don't suppose you'd be in a position to dig up or dust off that code snippet?

#3729


public class CompoundResponseBuilder extends ResponseBuilder {

public static final int STAGE_FUSION = STAGE_GET_FIELDS + 1;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, fusion would come before STAGE_GET_FIELDS so that we don't bother getting fields of docs that don't make the top-X. That's a 2x difference of doc field fetching.

Comment on lines +64 to +66
// undo any 'facets=true' that FacetComponent.modifyRequest might have done
sreq.params.remove(FacetParams.FACET, "true");
sreq.purpose &= ~ShardRequest.PURPOSE_GET_FACETS;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This special case is an unfortunate wrinkle of imperfection on the simplicity of the whole approach. But it's just a couple lines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants