Skip to content

Querier: Added native multi-tenancy support#4141

Closed
Abhishek357 wants to merge 11 commits intothanos-io:mainfrom
Abhishek357:querier-multi-tenancy
Closed

Querier: Added native multi-tenancy support#4141
Abhishek357 wants to merge 11 commits intothanos-io:mainfrom
Abhishek357:querier-multi-tenancy

Conversation

@Abhishek357
Copy link
Contributor

@Abhishek357 Abhishek357 commented May 3, 2021

Signed-off-by: Abhishek357 abhisheksinghchauhan357@gmail.com

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Fixes #3822 Related to Proposal
We take the tenant header and replace the tenant label if specified in the query, if not just append it in prometheus label matches using enforceMatcher(helper function defined in prom-label proxy). Then these labels are further used to fan out a query to store API which advertises these external tenant label. Therefore achieving tenant isolation.

We use the same injection code for raw matches from the prom-label proxy, so the injection and multi-tenancy / cross tenancy are coming from the single, well-tested library.

Alternative considered

  1. Using complete prom-label proxy as a library, importing it, and using it to modify the query itself. In Prom-label proxy we extract expression using parser from the query and then convert to matchers inside EnforceNode, then enforces label in enforceMatcher, then convert again to query. and these queries are again converted to matchers in Thanos and then further processed. Why such overhead if we are finally going to convert queries to the matcher inside Thanos and let's enforce the labels there itself using enforceMatcher.

  2. Having a dedicated thanos field in Info method in Store API.

    • Working smoothly by using tenant label as external label, didn't find a solid use case for this new field.

Verification

Wrote a e2e test.

@kakkoyun kakkoyun requested a review from brancz May 4, 2021 11:45
Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far.
It'd be amazing to create a wrapper around existing querier, such as QuerierWithTenant. It'd make the code cleaner and more maintainable.

@kakkoyun kakkoyun assigned bwplotka and unassigned bwplotka May 4, 2021
@kakkoyun kakkoyun requested a review from bwplotka May 4, 2021 11:55
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High-level questions (maybe even proposal)

  • Injecting identifier what tenant is making the call (to any API)

  • Injecting identifier of data that we have to scope to (single-tenant, multiple in future) - I guess this is the purpose of this PR?

  • Headers have to be configured, the same tenant label (avoid hardcoded strings)

  • Can we reuse promlabel. It's used mainly for PromQL mangling, however, we could decouple injection and have the same injection code for PromQL and raw matches, so the injection and multi-tenancy / cross tenancy are coming from the single, well-tested library. That we can use in Thanos, Observatorium, PromLabel proxy sidecar it. One source of truth.

@Abhishek357
Copy link
Contributor Author

Abhishek357 commented May 15, 2021

Blocked by #66
UPD: PR is merged :)

@Abhishek357 Abhishek357 marked this pull request as ready for review May 15, 2021 07:59
Copy link
Contributor

@yashrsharma44 yashrsharma44 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I need to test it out to cross-check the findings :)

@Abhishek357
Copy link
Contributor Author

Looks good, I need to test it out to cross-check the findings :)

We can not test it until we merge PR in prom-label-proxy because only then we will be able to use those enforcing helper functions.

@Abhishek357 Abhishek357 force-pushed the querier-multi-tenancy branch 3 times, most recently from 9fa939c to 4ef4624 Compare May 22, 2021 06:43
defaultRangeQueryStep time.Duration,
defaultInstantQueryMaxSourceResolution time.Duration,
defaultMetadataTimeRange time.Duration,
tenantHeader string,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we initiate TenantInfo where we call NewQueryAPI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we want to propagate only valuable TenantInfo, we get this when an HTTP request comes and we extract the tenant access according to the configurable tenant Header name (it's not necessary to propagate this label name).

@kakkoyun
Copy link
Member

kakkoyun commented Jun 1, 2021

Hey @Abhishek357 could you please address the review comments and resolve them afterwards? Also it'd be nice to rebase this PR.

Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
…dcoded strings

Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
Signed-off-by: Abhishek357 <abhisheksinghchauhan357@gmail.com>
@bwplotka
Copy link
Member

bwplotka commented Jul 1, 2021

This sonar type lift thingy is misleading. Those transient versions are not used in most cases, so there is no action required here.

@bwplotka
Copy link
Member

bwplotka commented Jul 1, 2021

Can you look how we can configure this to ignore those kind of false alarms? @Abhishek357 ?

@TomMD
Copy link
Contributor

TomMD commented Jul 1, 2021

Can you look how we can configure this to ignore those kind of false alarms? @Abhishek357 ?

@Abhishek357 and @bwplotka you can add a .lift/config.toml file with a field of:

ignoreRules = [ "Severe OSS Vulnerability", "Critical OSS Vulnerability" ]

Which should screen out these results. You can also look at the full configuration reference for more options.

@Abhishek357 Abhishek357 mentioned this pull request Jul 1, 2021
2 tasks
@stale
Copy link

stale bot commented Sep 3, 2021

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Sep 3, 2021
@stale stale bot closed this Sep 11, 2021
@ChanderG
Copy link

@bwplotka @Abhishek357 Looks like this should be merged in? Definitely a useful feature and the conversation above seems to indicate no problems/blockers.

@shaardie
Copy link

I think this is still relevant. I am also a bit confused this is closed and not merged?!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ability to require tenant in queries

7 participants