Skip to content

Conversation

@fmoehler
Copy link
Contributor

@fmoehler fmoehler commented Apr 15, 2025

What is this change about?

As described in RFC https://github.com/cloudfoundry/community/blob/main/toc/rfc/rfc-0038-ipv6-dual-stack-for-cf.md bosh shall be enabled to allocate prefix ip addresses (ipv4 and ipv6). Currently bosh only supports attaching ip addresses with a /32 (ipv4) or /128 (ipv6) prefix, which are single ip addresses. To apply these changes a new property called 'prefix' is introduced in the cloud config networks section. Please refer to the example below for a manual network:

- name: diego-cells-ipv6
  subnets:
  - az: z1
    cloud_properties:
      security_groups:
      - sg-0005f94257313417d
      - sg-06acfe8fb0a6247f0
      - sg-05a8b2b2e26ac1d5d
      - sg-064a667fb375e2dac
      - sg-01bb6e1e1f821fe4c
      subnet: subnet-0beae7541e0ebf5f1
    dns:
    - 2600:1f18:7415:8f00:0000:0000:0000:0253
    prefix: 80
    gateway: 2600:1f18:7415:8f00:0000:0000:0000:0001
    range: 2600:1f18:7415:8f00:0000:0000:0000:0000/56
    reserved:
    - 2600:1f18:7415:8f00:0000:0000:0000:0002
    - 2600:1f18:7415:8f00:0000:0000:0000:0003
    - 2600:1f18:7415:8f00:ffff:ffff:ffff:ffff
  type: manual

This example network tells bosh that instead of assigning a single ip address, to assign a prefix. So to slash the /56 network into multiple /80 blocks.

The ip address allocation of the bosh director is adapted to consider these prefixes (previously the director was just counting up by 1)

One major change is how the ip addresses are stored inside the database. The address_str field in the ip_addresses table will change from storing the ip address as an integer representation of the ip to store the ip address in cidr notation. This change is necessary to not "lose" the prefix information when storing the ip address. Also it has the advantage that you can directly create an IpAddrOrCidr Object out of the string coming from the database.

This PR also changes the RPC Interface for the create_vm method. It will include a separate field called "prefix". We will send the Prefix information in a separate field to not break existing cpis. Older CPIs that do not support prefix allocation will just ignore this field. Below you can find an example network section of the create_vm call:

{"ha_proxy":{"type":"manual","ip":"10.0.72.1","prefix":"32","netmask":"255.255.224.0","cloud_properties":{"security_groups":"<redacted>","subnet":"<redacted>"},"default":["dns","gateway"],"dns":["10.0.0.2"],"gateway":"10.0.64.1"}

The prefix here is 32 indicating a single ip address.

Once the vm creation is done. The bosh director api will return prefix ips with the normal ips. E.g. the bosh cli would display this as it displayed single ip addresses before. Here one example with one ipv4, one ipv4 prefix, one ipv6 and one ipv6 prefix with two instances:

image

Documentation PR: cloudfoundry/docs-bosh#877

What tests have you run against this PR?

Unit tests
Acceptance Tests in progress

How should this change be described in bosh release notes?

Enable prefix allocation support for manual networks.

Does this PR introduce a breaking change?

No

Tag your pair, your PM, and/or team!

@anshrupani @DennisAhausSAP

@peanball
Copy link

peanball commented Apr 20, 2025

Hi @aramprice, I just watched the recording of the foundational infra meeting from April 17th.

To clarify a bit:
Diego is currently using Silk as overlay network that is independent from the IP address(es) of the Diego VM itself.

The idea for IPv6 is to use as much native network routing as possible, i.e. not use an overlay network for IPv6. This means that the IP addresses that would be assigned to the containers will be delegated from an IPv6 prefix that is assigned to the Diego VM.

Your question about whether those IP addresses would move in case of an evacuation: no. These IP addresses are not "sticky" to the app or app instance and would not move.

The goal is to make the networking setup for Diego simpler by using "native" IPv6 addressing. Traffic aimed at a particular container will reach the Diego VM via its CIDR range, and then the Diego VM's kernel can forward the traffic to the container's virtual NIC.

Please also note that the networks are supposed to be dual stack, not pure IPv6. So you would want to assign multiple (at least one IPv4 and one IPv6) networks to the same VM, as @fmoehler mentioned in the call already.

The "prefix" parameter is also the size of the prefix to delegate to each VM from a larger range. Have a look at the discussion while creating the RFC for a more extensive example.

The VM is supposed to self-assign an address to itself. Usually this is the x:y:z::1 address (i.e. the "first" address in the provided CIDR range). Addressing the VM is done via that IP address. The "remaining" addresses are then to do for the VM as it wishes.

@fmoehler fmoehler force-pushed the introduce-prefix-allocation branch 9 times, most recently from 354d3e8 to 7cdb704 Compare April 23, 2025 19:58
@aramprice
Copy link
Member

Hi @peanball,

Thanks for that additional context - the info and the RFC were helpful.

In thinking about the overall changes to Bosh a few goals or principles (maybe too strong a word) came to mind:

  • Make conditional behavior / IP-version specific code and settings as limited as possible
    • I really like that prefix: is used and not ipv6_prefix:
    • Should an IPv4 network allow/require a prefix: (one that selects a single IP) for consistency?
  • Reduce the back and forth conversion / formatting where possible (format_ip(), ip.to_i and similar)
    • Ideally a single IpAddrOrCidr (I apologize for the name) or IpAddress representation would be used, and conversions would be handled only where need by the class itself, ex: #to_db_value - for persistence, or #to_s - for logs
  • Avoid assumption in the code that bosh will always be dual-stack
    • I realize the system will start out dual-stack but this may not be forever

This isn't to imply disagreement about the current state of things, only to captures my thoughts at the moment.

@fmoehler
Copy link
Contributor Author

@aramprice thanks for your feedback!

I just want to elaborate a little on our current idea, but of course this is open for ideas. Regarding your points:

  1. We will introduce a "prefix" property in the subnet section of the network (Maybe we can also put it to the network section itself, so that it applies automatically to all subnets?). This property will be considered for ipv4 and ipv6 networks. However if the property is not defined (e.g. for existing networks), the director will consider the "prefix" as /32 (ipv4) or /128 (ipv6) and maintain this accordingly in the database.
  2. Yes we will try to get rid of as many conversion as possible. Sometimes it can be handy to have the ips in integer representation, but most of the time we will pass them as IpAddrOrCidr Objects
  3. Agree with that. I did not test this (yet), but it might not even be specific for a dual stack setup. Probably it would also need changes in the bosh agent.

@peanball
Copy link

Thanks @aramprice, I fully agree with you there. The logical unit of "IP address", with or without netmask (i.e. prefix) should be supported in either scenario and contain the logic of representation.

As @fmoehler mentioned, omitting the prefix will default to a single address (previous behavior).

And there should not be assumptions about dual stack but dual stack should be possible. So far, BOSH supported either v4 or v6, not both at the same time. The mechanism to support more than one network is a classic n+1 problem and can be solved as such where possible.

We just happen to choose n=2 with one v4 and one v6 network. This should be the mindset.

@peanball
Copy link

@fmoehler I have a general comment on the term prefix in the config. It's a bit terse and does not convey the meaning very clearly.

In the end this is the prefix (size) delegated to each instance attached to this subnet, right? Can we somehow express this better? My suggestion would be

  • delegate_prefix
  • delegate_prefix_size
  • delegate_cidr_size

alternatively s/delegate/assign/g.

@fmoehler fmoehler force-pushed the introduce-prefix-allocation branch 3 times, most recently from 9f17169 to 4b6f351 Compare May 6, 2025 13:47
@DennisAhausSAP
Copy link
Contributor

One thing regarding the display of the ip addresses. The bosh-cli is showing the ips as they come from the directors API. As @fmoehler mentioned the prefixes are stored in the db with their prefixes. This also means that the bosh-cli will show the ip addresses with their prefix. Because all the docs are showing examples with ipv4 only this would also mean that everybody has to be prepared that with this change the cli could possibly show the ips (i.e. with command bosh ... vms) in format "ip<v4\v6>/prefix".

@fmoehler
Copy link
Contributor Author

fmoehler commented Jun 2, 2025

One thing regarding the display of the ip addresses. The bosh-cli is showing the ips as they come from the directors API. As @fmoehler mentioned the prefixes are stored in the db with their prefixes. This also means that the bosh-cli will show the ip addresses with their prefix. Because all the docs are showing examples with ipv4 only this would also mean that everybody has to be prepared that with this change the cli could possibly show the ips (i.e. with command bosh ... vms) in format "ip<v4\v6>/prefix".

So far we ensured not to change the API and only return the ip (without the prefix), please refer: https://github.com/cloudfoundry/bosh/pull/2611/files#diff-c9cf4222f4ba33b6b3f35d7f32bfbb2c03a91470e18672bd3dda5468c38a0808R22 so it should not change the way ip addresses are displayed.

@fmoehler fmoehler force-pushed the introduce-prefix-allocation branch 2 times, most recently from 01c56da to acc71bd Compare June 18, 2025 08:32
fmoehler and others added 21 commits September 23, 2025 15:46
commit a52e682
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Tue Jun 24 13:24:21 2025 +0200

    adapt migration

commit 3cdec28
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Jun 23 15:50:05 2025 +0200

    fix unit tests

commit 3ed1a7a
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Fri Jun 20 11:50:46 2025 +0200

    cleanup and bug fix

commit 1c092e7
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Wed Jun 18 15:42:41 2025 +0200

    fixes after db migration

commit 0a1f304
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Wed Jun 18 15:41:07 2025 +0200

    fix bug subnet and prefix reservation

commit 925a6e6
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Wed Jun 18 10:22:45 2025 +0200

    perform migration

commit ebef432
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Tue Jun 17 17:57:47 2025 +0200

    fix bug where prefix between subnet and reservation did not match

commit a9ad2ff
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Jun 16 11:19:27 2025 +0200

    refactor ip_repo

commit 9f31e94
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Jun 16 11:13:45 2025 +0200

    refactor instance network reservations

commit 9097d56
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Jun 16 09:50:32 2025 +0200

    adapt dynamic network

commit fa2ad8f
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Jun 12 20:20:08 2025 +0200

    fix unit next part

commit d5c1dbe
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Jun 12 20:09:59 2025 +0200

    fix unit tests and error message for invalid address

commit fcaf710
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Jun 12 19:38:45 2025 +0200

    fix bug and unit tests

commit ff30c40
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Wed Jun 11 10:28:35 2025 +0200

    fix compatibility issue between new and old ip_address format

commit 48c90fa
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Tue Jun 10 16:41:06 2025 +0200

    dont update restricted ips in place

commit bceda51
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Tue Jun 10 11:39:39 2025 +0200

    return prefix for prefix addresses

commit 55bad9a
Author: Ansh Rupani <ansh.rupani@sap.com>
Date:   Mon Jun 2 14:53:00 2025 +0200

    fix unit tests

commit 776fb94
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Fri May 9 13:30:58 2025 +0200

    fix prefix for ipv6

commit acd894a
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu May 8 16:11:16 2025 +0200

    continue

commit 3a616e8
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Wed Apr 30 09:29:36 2025 +0200

    export ips_cidr in different field

commit 139e0ba
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Tue Apr 29 12:58:16 2025 +0200

    introduce static ip check for prefixes

commit f53be1a
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Apr 28 17:05:50 2025 +0200

    more work

commit d0af77f
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Apr 28 14:35:18 2025 +0200

    return prefix in create_vm_response

commit 5a6dfa4
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Apr 28 14:14:13 2025 +0200

    unit test fixes + adaptions

commit 5da086f
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Apr 24 16:05:03 2025 +0200

    fixes

commit bb89a65
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Apr 24 15:20:53 2025 +0200

    fixes

commit 97230a3
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Apr 24 15:09:07 2025 +0200

    fixes

commit 0a63fd8
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Apr 24 14:36:46 2025 +0200

    fixes

commit 473d022
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Apr 24 12:39:42 2025 +0200

    fixes

commit 14901f4
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Apr 24 09:48:42 2025 +0200

    fixes

commit 25d00c2
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Thu Apr 24 09:36:07 2025 +0200

    remove prefix from job network

commit a4f8444
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Tue Apr 15 19:46:19 2025 +0200

    introduce new prefix column in ip addresses table

commit ca54445
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Tue Apr 15 09:03:07 2025 +0200

    create seperate reservation for prefix

commit 00d71a5
Author: Felix Moehler <felix.moehler@sap.com>
Date:   Mon Apr 14 17:05:30 2025 +0200

    introduce prefix property for networks
Co-authored-by: Ned Petrov <71276794+neddp@users.noreply.github.com>
@fmoehler fmoehler force-pushed the introduce-prefix-allocation branch from c157a3e to 44535d1 Compare September 24, 2025 07:39
@fmoehler
Copy link
Contributor Author

Rebased to 282.0.9

Copy link
Contributor

@a-hassanin a-hassanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All review comments has been addressed, and it has been running stable for the last 4 weeks on SAP landscapes, with that it has my approval.

@github-project-automation github-project-automation bot moved this from Waiting for Changes | Open for Contribution to Pending Merge | Prioritized in Foundational Infrastructure Working Group Sep 24, 2025
@aramprice aramprice merged commit 5288ef2 into cloudfoundry:main Sep 25, 2025
8 checks passed
@github-project-automation github-project-automation bot moved this from Pending Merge | Prioritized to Done in Foundational Infrastructure Working Group Sep 25, 2025
fmoehler added a commit that referenced this pull request Sep 29, 2025
This issue was introduced with #2611 . It should only reserve the first entry of the range and not everything
@fmoehler fmoehler mentioned this pull request Sep 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

9 participants