Skip to content

VPC Subnet Routing#490

Merged
FelixMcFelix merged 14 commits into
masterfrom
custom-subnet-routing
Jun 13, 2024
Merged

VPC Subnet Routing#490
FelixMcFelix merged 14 commits into
masterfrom
custom-subnet-routing

Conversation

@FelixMcFelix

@FelixMcFelix FelixMcFelix commented Apr 9, 2024

Copy link
Copy Markdown
Collaborator

This PR introduces various features needed in OPTE side for VPC subnet routing:

  • Add ioctl for deletion of router rules.
  • Support for the 'system' vs 'custom' router distinction, and prioritisation of 'custom' rules of equal prefix length.
  • Allow IP spoof-detection to be disabled (and re-enabled) on a per-CIDR basis. This is necassary to enable rules such as 192.168.0.0/16 -> ip(instance_abc) for software routing or enabling VPN endpoints.

I'm leaving this in draft while the omicron-side pieces are in development.

This also adds in the `RouterClass` specifier, which does not yet do
anything.
@FelixMcFelix FelixMcFelix force-pushed the custom-subnet-routing branch from 9149747 to 1ebbbb6 Compare April 9, 2024 12:17
RFD 21 goes into some detail describing use cases where users might want
to use one instance as a tunnel endpoint for a VPN, and how traffic
towards an arbitrary CIDR can be routed to an instance for, e.g.,
software routing purposes.

To enable this, we need to be able to setup holes in the anti-spoof
filters in the gateway, which are added here as a new pair of ioctls to
add and remove these holes. It's not clear from the RFD whether this is
to be an on/off toggle or as fine-grained as the new ioctls allow, but
we can suoport either case here.
@FelixMcFelix FelixMcFelix self-assigned this Apr 22, 2024
@FelixMcFelix FelixMcFelix added this to the 9 milestone Apr 22, 2024

@zeeshanlakhani zeeshanlakhani left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor stuff while looking at this code in draft.

Comment thread lib/oxide-vpc/src/engine/gateway/mod.rs Outdated
Comment thread lib/oxide-vpc/src/engine/gateway/mod.rs Outdated
dest: IpCidr,
vpc_mappings: Arc<VpcMappings>,
) -> (Rule<Finalized>, Rule<Finalized>) {
let vpc_meta = Arc::new(VpcMeta::new(vpc_mappings));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n00b2 Q: Would it make sense to implement MetaAction (used with vpc_meta / Action::Meta below) on vpc_mappings directly or impl it on Arc<VpcMappings> so you don't have to initialize the Arc here (again)?

@FelixMcFelix FelixMcFelix May 31, 2024

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The double-Arc is really unfortunate for sure. We could impl MetaAction for VpcMappings, but we'd need to recast the design of the MetaAction to use something additional like RuleDisplay instead of Display, I think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FelixMcFelix ah interesting, I'll look further at this then.

@FelixMcFelix FelixMcFelix marked this pull request as ready for review May 31, 2024 16:27
@FelixMcFelix FelixMcFelix changed the title WIP: VPC Subnet Routing VPC Subnet Routing May 31, 2024
Breaks out the transit IPs/CIDR allowlist functions as requested.

@luqmana luqmana left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together Kyle! The demo was great and excited to see this land. Looks good overall but one question about dealing with partial failures during add/remove.

Comment thread lib/oxide-vpc/src/engine/router.rs Outdated
Comment thread lib/oxide-vpc/tests/integration_tests.rs
Comment thread lib/oxide-vpc/tests/integration_tests.rs
Comment thread bin/opteadm/src/bin/opteadm.rs Outdated
Comment thread lib/oxide-vpc/src/engine/gateway/transit.rs Outdated
Comment thread lib/oxide-vpc/src/engine/gateway/transit.rs Outdated

@FelixMcFelix FelixMcFelix left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review Luqman!

Comment thread bin/opteadm/src/bin/opteadm.rs Outdated
Comment thread lib/oxide-vpc/src/engine/gateway/transit.rs Outdated
Comment thread lib/oxide-vpc/src/engine/gateway/transit.rs Outdated
Comment thread lib/oxide-vpc/src/engine/router.rs Outdated
Comment thread lib/oxide-vpc/src/api.rs
//
// `prefix_len` comes from the passed in `cidr` argument.
//
// One bit is used to ensure that 'custom' router rules take precedence

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: is there a test that specifically tests precedence? It's probably already captured, but I thought to ask.

@FelixMcFelix FelixMcFelix Jun 12, 2024

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look. The new system vs. custom route precedence is tested in the new intra_subnet_routes_with_custom test, adding a second rule with an identical destination CIDR:

// Suppose the user now installs a 'custom' route in the first subnet to
// drop traffic towards the second subnet. This rule must take priority.
router::add_entry(
&g1.port,
cidr.clone(),
RouterTarget::Drop,
RouterClass::Custom,
)
.unwrap();
incr!(g1, ["epoch", "router.rules.out"]);
let mut pkt2 = gen_icmpv4_echo_req(
g1_cfg.guest_mac,
g1_cfg.gateway_mac,
g1_cfg.ipv4().private_ip,
dst_ip,
7777,
1,
data,
1,
);
let res = g1.port.process(Out, &mut pkt2, ActionMeta::new());
assert!(matches!(
res,
Ok(ProcessResult::Drop {
reason: DropReason::Layer { name: "router", .. }
})
));

I don't believe we have any explicit tests on e.g. the priority values we're using to get LPM-like behaviour, though.

@luqmana luqmana left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small suggestion for opteadm but otherwise lgtm! Thanks Kyle.

Comment thread bin/opteadm/src/bin/opteadm.rs Outdated
/// Control whether traffic on the CIDR should be allowed for
/// inbound or outbound traffic.
#[arg(long = "dir")]
direction: Direction,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given opteadm is primarily a dev tool, I think making this an Option<Direction> and doing the ioctl calls for both directions if it's None is fine.

@FelixMcFelix FelixMcFelix merged commit 915975f into master Jun 13, 2024
@FelixMcFelix FelixMcFelix deleted the custom-subnet-routing branch June 13, 2024 11:10
FelixMcFelix added a commit to oxidecomputer/omicron that referenced this pull request Jun 26, 2024
This PR wires up all the backing machinery for VPC subnet routing, and
automatically resolves and pushes updated rules to sleds using an RPW.
This allows instances in all subnets of a VPC to talk with one another
-- assuming no firewall rules have been configured otherwise. At a high
level, this works by a few changes:
* During the VPC create saga, we now push two rules explicitly to the
system router -- default routes from `(0.0.0.0/0, ::/0) ->
inetgw:outbound`.
* Any CRUD operation on a VPC subnet will reconcile the set of VPC
subnet routes within the system router to have one entry per subnet.
This takes the form `subnet:{name} -> subnet:{name}` for each subnet,
which are later resolved to both v4 and v6 entries.
* Ports are created using route information known to sled-agent -- this
defaults to an empty route set for instances/probes, and an internet
gateway rule for services to enable early NTP sync.
* Routes are sync'd with sleds using a new background task. Broadly,
this asks each sled for the set of VPCs and subnets it has ports on, and
a version for the current route set installed in each. The background
task will use this information to determine which routes must be
rebuilt, and will send updated versions out in response.

The most immediate consequence in this PR is that hosts within a subnet
-- on different VPCs -- will be able to talk with one another at last.
The user facing API (#2116) will be re-enabled in a concurrent PR --
#5823 -- as will NIC spoof detection hole-punching.

Depends on oxidecomputer/opte#490.

Closes #2232, Fixes #1336.

---

A few pieces will block tests passing & merge-readiness:
- [x] Creation of a `lab-2.0-opte-0.32` image.
- [x] Merge of oxidecomputer/maghemite#274 (and updating all the right
SHAs in this PR).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants