Skip to content

smb: support custom ip address binds#64142

Merged
adk3798 merged 9 commits intoceph:mainfrom
phlogistonjohn:jjm-smb-ip-bind
Jul 25, 2025
Merged

smb: support custom ip address binds#64142
adk3798 merged 9 commits intoceph:mainfrom
phlogistonjohn:jjm-smb-ip-bind

Conversation

@phlogistonjohn
Copy link
Contributor

@phlogistonjohn phlogistonjohn commented Jun 24, 2025

Depends on #63924

Add a bind_addrs option to the smb cluster resource and smb service spec.

resources:
-   resource_type: ceph.smb.cluster
    cluster_id: mycluster
    # ... other cluster fields ...
    bind_addrs:
      - address: 192.168.76.200
      - address: 192.168.76.201
      - address: 192.168.76.202

OR

resources:
-   resource_type: ceph.smb.cluster
    cluster_id: mycluster
    # ... other cluster fields ...
    bind_addrs:
      - network: 192.168.78.0/24

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

@phlogistonjohn phlogistonjohn requested review from a team as code owners June 24, 2025 17:11
@phlogistonjohn phlogistonjohn marked this pull request as draft June 24, 2025 17:11
@phlogistonjohn phlogistonjohn force-pushed the jjm-smb-ip-bind branch 3 times, most recently from 5aec27d to a8ef39b Compare June 26, 2025 17:47
@phlogistonjohn
Copy link
Contributor Author

So, it seems like I can't test the IP address custom binds for smb in teuthology. Even after actually refactoring the vip code yesterday and adding features to it. It turns out that cephadm doesn't "see" all the IPs on the system. It doesn't recognize vips added to the node, at least the way teuthology does it. This seems to be due to how list-networks works.

The code for ipv4 in host_facts.py only looks at the output of ip route ls. Oddly, the ipv6 code looks at both ip route ls and ip addr ls. I don't know why this code is asymmetric in this way.

I don't really want to change how cephadm reads networks right now, just for this PR. Doubly so because I know that samba is also flakey and not 100% working well in all modes for this feature. So as far as my PR goes I may skip adding teuthology tests. I'll possibly follow it up later with changes to how VIPs work and general changes to cephadm networking detection - but probably after samba is improved.

I just wanted to complain a bit and justify why the final version of this PR will lack teuthology tests but the custom ports one had new tests.

@github-actions
Copy link

github-actions bot commented Jul 9, 2025

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@phlogistonjohn phlogistonjohn changed the title [WIP] smb: support custom ip address binds smb: support custom ip address binds Jul 10, 2025
@phlogistonjohn phlogistonjohn marked this pull request as ready for review July 10, 2025 14:19
@phlogistonjohn
Copy link
Contributor Author

This is ready for review, but this is still a bit of a best-effort feature at the moment, especially for CTDB enabled clusters. To make it reliable I think some samba work is needed - plus as I noted above it lacks teuthology tests. But IMO it's good enough to get into other people's hands for additional testing, etc.

@phlogistonjohn
Copy link
Contributor Author

jenkins test make check

@phlogistonjohn
Copy link
Contributor Author

jenkins test make check arm64

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@phlogistonjohn
Copy link
Contributor Author

There's a bunch of debug logging in here I forgot to clean up. Moving back to draft for now.

@phlogistonjohn
Copy link
Contributor Author

The debug logging was all in one commit and has been cleaned up. Moving back to reviewable.

@phlogistonjohn phlogistonjohn marked this pull request as ready for review July 17, 2025 13:30
@adk3798
Copy link
Contributor

adk3798 commented Jul 17, 2025

@adk3798
Copy link
Contributor

adk3798 commented Jul 17, 2025

jenkins test make check arm64

Copy link
Contributor

@adk3798 adk3798 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cephadm patches LGTM. Would we want some explicit teuthology test for this? It looks like in order to get some of these certain binds to work you're setting some ctdb specific options here and changing the network type of the smb container, which I don't think would get tested with our current set of tests.

addrs = [b.address for b in self._cfg.bind_to]
# filter out any endpoints that don't refer to the specific
# IP addresses our service will bind to
endpoints[:] = [ep for ep in endpoints if ep.ip in addrs]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why we're assigning to endpoints[:] instead of just endpoints here. Online search says the [:] format creates a shallow copy of the original list. Is that allowing this to act as a sort of pointer to the caller's version of the list? Which wouldn't work without this format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the customize_container_endpoints works by mutating the endpoints argument passed to the function. Typically, you can just append an item or whatever and when the function returns (None), the caller just uses the same list it passed to customize_container_endpoints. However, in this case we want to completely change the contents of this list. Since we want the caller to "see" our changes we can't just assign to endpoints (like endpoints = [...]) because that would create a new list that the local variable refers to but the caller would not see it. Here's a simple example:

>>> def foo(a):
...     a = [1,2,3,4]
...     print('foo:', a)
...     
>>> def bar(a):
...     a[:] = [1,2,3,4]
...     print('bar:', a)
...     
>>> 
>>> l = [0,0]
>>> foo(l)
foo: [1, 2, 3, 4]
>>> print(l)
[0, 0]
>>> 
>>> l = [0,0]
>>> bar(l)
bar: [1, 2, 3, 4]
>>> print(l)
[1, 2, 3, 4]
>>> 

It's a bit like running:

>>> l.clear()
>>> l.extend([...])  # imagine the whole list-comp here

but on one line.

Weirdly I tried to find a succinct page that references this specific technique but neither ddg or google turned up anything I'd be happy sharing. So you will need to take my word that this is idiomatic python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amusingly, I let the ddg "AI" box answer 'python replace entire list using slice' and it generated a pretty good response, but neither of it's references we're particularly clear!

It generated:

You can replace an entire list in Python using slicing by assigning a new list to the slice that covers the entire original list. For example, if you have my_list = [1, 2, 3], you can replace it with my_list[:] = [4, 5, 6], which will change my_list to [4, 5, 6]. labex.io sparkbyexamples.com

FWIW

@adk3798
Copy link
Contributor

adk3798 commented Jul 23, 2025

jenkins test make check arm64

Prepare schedule.py for per-service-type host filtering based on allowed
host addresses/networks. Add a new HostSelector protocol type to the
module defining what the filtering interface looks like.

This interface is intended allows CephService classes to "take over" the
network based filtering of nodes prior to placement and customize the
behavior of this step in cephamd's placement algorithm.

Note that the type must be passed in to the HostAssignment class as an
optional argument. If nothing is passed the class behaves as it did
before.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
A previous commit added a HostSelector protocol type to the schedule
code. This change makes it so the function calling upon the
HostAssignment class detects if a CephService provides a
filter_host_candidates method - meaning the service class can act as a
HostSelector. If the class can be a HostSelector pass it to the
HostAssignment so that the custom selection operation can be run.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add a `bind_addrs` field and `SMBClusterBindIPSpec` to the smb service
spec. If specified the `bind_addrs` field can contain one or more
SMBClusterBindIPSpec value. In JSON these values can contain either an
address `{"address": "192.168.76.10"}` or network `{"network":
"192.168.76.0/24"}`.

These specs will be used by cephadm to place the smb service only on
hosts that have IPs matching the supplied IP Address/Network values. It
will also instruct the smb services to only bind to these addresses.

A suggested future enhancement may be include an IP address range
representation for the SMBClusterBindIPSpec.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add a filter_host_candidates method to the smb service class allowing
that class to act as a HostSelector. The HostSelector was added in an
earlier commit to allow classes like this one to make specific host
selections based on unique to that class (or it's spec) criteria.

This method uses the newly added `bind_addrs` field of the smb service
spec to ensure only hosts that meet the desired set of
networks/addresses get used in placement.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Within the cephadm smb service class we have logic to help manage CTDB's
nodes. Ensure that this node handling logic also conforms to the recent
addition of the smb service's bind_addrs field.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Tell the cephadm binary deploying an smb service about the networks
this smb service will be binding to.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add a bunch of code to support specific IP address (and/or interface -
see below) binds for the smb service. When the smb service is not
clustered it is using container networking - in this case we use
publish options for the container manager to only listen on the supplied
addresses.

When the smb service is clustered we need to jump through a bunch of
hoops to configure each service individually. Many are easy with just
a short set of CLI options. CTDB only listens on the (first) node
address that it can bind to and only that. smbd has complex interactions
based on the `interfaces` and `bind interfaces only` config parameters.
Because these parameters may be unique to a node (addresses certainly
will be - and interfaces names could be) we can not store this in
the registry based conf. Instead, we take the slightly hacky approach
of generating a stub conf file with just the interfaces related params
in them and telling sambacc to generate a config that includes this
stub config.

IMPORTANT: When using ctdb with public addresses smbd doesn't know what
additional IPs it may need to listen to, so instead of binding to
a fixed IP we configure it to use an interface. This does have a
downside of possibly listening to another address on the same interface
we don't want it to. Additionally, I have observed that as addresses
are added or removed from the interface by ctdb, smbd doesn't
consistently start listening to those addresses.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add bind_addrs, which is largely a wrapper around the service spec's
bind_addrs to the smb cluster resource.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Add some documentation for the bind_addrs option including a warning
about how combining public_addrs and bind_addrs gets quirky.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Copy link
Contributor

@anoopcs9 anoopcs9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general approach lgtm.

@anoopcs9
Copy link
Contributor

jenkins test make check

@adk3798
Copy link
Contributor

adk3798 commented Jul 24, 2025

jenkins test make check arm64

@adk3798
Copy link
Contributor

adk3798 commented Jul 24, 2025

@phlogistonjohn
Copy link
Contributor Author

Somehow I Missed the comment Re: Teuthology until going through my mail. As I noted in #64142 (comment) I can't really test this with the current state of teuthology's VIP support and/or the way cephadm detects IP addresses. I'd like to enhance teuthology and cephadm to make these tests happen but right now my focus is on the features due to downstream deadlines. Plus, I know the IBM QE team is manually testing this stuff and most users are not going to immediately jump to using this new stuff and the existing tests have good coverage of the existing behavior. So I'm deferring adding teuthology tests for this feature for now.

Unfortunately, a lot of the smb features I am adding are in a similar situation because they all have somewhat complex external dependencies like needing TLS with CA certificates or a KMIP server, or a grpc client. So I want to get the feature PRs done and eventually come back to do a "teuthology refresh" for smb in the future.

@adk3798 adk3798 merged commit e01f17a into ceph:main Jul 25, 2025
14 checks passed
@phlogistonjohn phlogistonjohn deleted the jjm-smb-ip-bind branch August 15, 2025 16:40
@phlogistonjohn phlogistonjohn added the wip-spuiuk-tracking Sachin Prabhu - tracking label Sep 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants