Skip to content

common, doc: document retirement of client-exposed feature bits#51488

Closed
rzarzynski wants to merge 1 commit intoceph:mainfrom
rzarzynski:wip-all-kickoff-s
Closed

common, doc: document retirement of client-exposed feature bits#51488
rzarzynski wants to merge 1 commit intoceph:mainfrom
rzarzynski:wip-all-kickoff-s

Conversation

@rzarzynski
Copy link
Contributor

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

* 2) For feature bits that were *anyhow* exposed to clients (even through
* a common part like OSDMap), the requirements are stricter: instead of
* the n-2 rule we have for the inter-daemon communication, for clients
* we *guarantee* compatibility up n-3 major upstream versions. To exemplify:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, if the DEPRECATED macros and MASK stuff doesn't make it "do the right thing" for random clients, I think feature bits which are client-visible are just...not reusable or able to be retired.

@athanatos @jdurgin do either of you remember much of the planning around feature bit retirement? I don't think I got involved in its development, but I think I would have noticed if we went from "all clients meeting the admin-configurable min_compat_client" to "3 versions, and incredibly messy for the kernel". I'm sure @idryomov has thoughts about that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, I don't remember, I'd need to take a bit of time and think it through.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Client-facing feature bits can't really be retired in an orderly fashion, so any N-<something> rule is bound to not work, exactly because

we have little control over kernel versions and user-space libraries within third party's containers (old librbd, ceph-fuse and so on).

We have done this only once before, slurping up all argonaut-ish feature bits. I suppose we could do this again for pre-hammer feature bits some time soon but this would have to be an ad-hoc, review each feature bit sort of thing. If the intent is to document the procedure, I would rather document that there is no procedure and go with something like

For feature bits that were anyhow exposed to clients (even through a common part like OSDMap): don't touch the ones that are already there, think hard before introducing a new one, assume a lifetime of at least 10 years.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the input, Ilya!

My intention is just to document the state of the thing – discover & document. I'm aware that another views (i.e. n - 3 as as hard guarantee) also exist. I think it would be good to describe also how mon_osd_initial_require_min_compat_client interplays here.

CC: @gregsfortytwo, @athanatos, @jdurgin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

require_min_compat_client is only tangentially related: it's a setting that is intended to prevent an administrator from accidentally enabling features that older clients would choke on (i.e. features that would prevent older clients from connection to the cluster). The fact that it's set to luminous by default doesn't mean much: in the end it's based on the same feature bits, many of which can be manipulated in isolation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear: it's possible to have e.g. hammer-era clients connect to a reef cluster after making some adjustments (mostly regarding CRUSH tunables) no matter what require_min_compat_client is set to.

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
@rzarzynski rzarzynski changed the title kickoff v19 squid common, doc: document retirement of client-exposed feature bits Aug 28, 2023
@rzarzynski
Copy link
Contributor Author

I dissected the kick off related commit into #53191 and renamed the PR to keep the discussion on about client-exposed feature bits.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Oct 27, 2023
@github-actions
Copy link

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants