Conversation

@jiangliu
Member

This PR migrates the vhost_backend and vhost_gen code from the firecracker project into the vhost crate.
Support for the vhost-user protocol will be added later.

///
/// An eventfd is useful because it is sendable across processes and can be used for signaling in
/// and out of the KVM API. They can also be polled like any other file descriptor.
pub struct EventFd {
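
To illustrate the signaling pattern described in the doc comment above, here is a minimal sketch using libc's eventfd directly, rather than the wrapper type from this PR (whose API may differ):

```rust
use std::io;

fn eventfd_roundtrip() -> io::Result<()> {
    // Create an eventfd with an initial counter of 0.
    let fd = unsafe { libc::eventfd(0, 0) };
    if fd < 0 {
        return Err(io::Error::last_os_error());
    }

    // Signal: add 1 to the counter with an 8-byte write. Another process or a
    // poll()-based event loop holding this fd would see it become readable.
    let one: u64 = 1;
    if unsafe { libc::write(fd, &one as *const u64 as *const libc::c_void, 8) } != 8 {
        return Err(io::Error::last_os_error());
    }

    // Consume: read the counter back, which resets it to 0.
    let mut value: u64 = 0;
    if unsafe { libc::read(fd, &mut value as *mut u64 as *mut libc::c_void, 8) } != 8 {
        return Err(io::Error::last_os_error());
    }
    assert_eq!(value, 1);

    unsafe { libc::close(fd) };
    Ok(())
}
```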
Collaborator

What's the status of the vmm-sys-utils crate? I'm asking because we cannot merge the PR as it is now, right? I don't think we want to pull sys_utils/... into the vhost crate.

Member Author

Yes, that's a problem. This crate depends on the vmm-sys-utils and vm-memory crates.


/* Auto-generated by bindgen then manually modified for simplicity */

#![allow(non_upper_case_globals)]
Collaborator

This is being discussed on the ML, and this is not clear yet, but it seems you will have to put those bindings in a separate crate.

Member Author

I have replied to the ML thread. For the vhost case, I do feel it's better to include the bindings in the crate, with manual modifications.

Member

From the coverage test perspective it is better to have the bindings in a different crate. In Firecracker we are excluding auto-generated files when computing coverage. If we have them in the same crate, it means we will have to make test_coverage.py more flexible so that we can exclude individual files depending on the repository being tested. The context is the plan to have common integration tests in one repository and have all other repositories consume them.

But if we don't use most of the things that are auto-generated, then maybe we should just have a minimal vhost_bindings file inside the repository. I don't have a preference between these two.

Member Author

Good point about the test_coverage.py. Could we solve this issue by adding more test cases?

@sboeuf
Collaborator

sboeuf commented May 22, 2019

@jiangliu @bonzini @andreeaflorescu what's the status here?
I'm very interested in seeing this crate moving forward in order to base some vhost-user implementations off of it.

@jiangliu
Member Author

@jiangliu @bonzini @andreeaflorescu what's the status here?
I'm very interested in seeing this crate moving forward in order to base some vhost-user implementations off of it.

I'm refining the crate to better support reconnection; it's almost ready, and I plan to send it out early next week. I have also decoupled the Unix domain socket client/server roles from the vhost-user master/slave roles, to enable the use case where the vhost-user master acts as the server.

@sboeuf
Collaborator

sboeuf commented Jun 3, 2019

@jiangliu the PR LGTM! Are we waiting for review from some folks or do you still have some work planned on this PR?
/cc @andreeaflorescu @bonzini @sameo

@jiangliu
Member Author

jiangliu commented Jun 4, 2019

@jiangliu the PR LGTM! Are we waiting for review from some folks or do you still have some work planned on this PR?
/cc @andreeaflorescu @bonzini @sameo
Currently waiting for reviews; after that I need to refine the documentation and address other requirements of the rust-vmm project.

src/backend.rs Outdated
///
/// Vhost-based virtio devices are different from regular virtio devices because the vhost
/// backend takes care of handling all the data transfer. The device itself only needs to deal with
/// setting up the backend driver and managing the control channel.

@stefanha commented Jun 21, 2019


This comment is not generally applicable and relates to vhost-net only. It may cause confusion.

The comment is essentially saying that in vhost-net the rx/tx virtqueues are handled by vhost and the ctrl virtqueue is handled by the VMM.

In the general case the responsibility of which virtqueue is handled by vhost and which is handled by the VMM is decided when the first vhost implementation of a device is designed. Usually the dataplane is handled by vhost and anything else is handled by the VMM, but this is not a strict rule or enforced in any way.

I think a clearer way of explaining vhost is that a vhost device is a subset of a virtio device. Sometimes all virtqueues but often just those considered the dataplane. The purpose of vhost is to implement a subset of a virtio device's functionality outside the VMM process.

Member Author

The comment has been updated to:
/// Vhost devices are a subset of virtio devices, which improve virtio device performance by
/// delegating data plane operations to dedicated IO service processes. Vhost devices use the
/// same virtqueue layout as virtio devices, allowing vhost devices to be mapped directly to
/// virtio devices.
/// The purpose of vhost is to implement a subset of a virtio device's functionality outside the
/// VMM process. Typically, fast paths for IO operations are delegated to the dedicated IO service
/// processes, while slow paths for device configuration are still handled by the VMM process. It
/// may also be used to control access permissions of virtio backend devices.
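
To make the control-plane / data-plane split concrete, here is a hypothetical sketch of the kind of control channel the VMM drives at setup time; the trait and method names are illustrative only, not the API proposed in this PR:

```rust
use std::io;

/// Illustrative subset of the control operations a vhost backend exposes.
trait VhostControl {
    /// Claim exclusive ownership of the vhost device.
    fn set_owner(&mut self) -> io::Result<()>;
    /// Negotiate virtio feature bits with the backend.
    fn set_features(&mut self, features: u64) -> io::Result<()>;
    /// Describe guest memory so the backend can reach the virtqueues directly.
    fn set_mem_table(&mut self, regions: &[(u64, u64, u64)]) -> io::Result<()>;
    /// Point one virtqueue's descriptor/avail/used rings at guest addresses.
    fn set_vring_addr(&mut self, index: u32, desc: u64, avail: u64, used: u64) -> io::Result<()>;
}

/// The VMM runs this slow path once at device setup; the fast path (moving
/// buffers through the virtqueues) then happens entirely in the IO service
/// process, outside the VMM.
fn setup_device(ctrl: &mut dyn VhostControl) -> io::Result<()> {
    ctrl.set_owner()?;
    ctrl.set_features(0)?; // feature bits elided in this sketch
    ctrl.set_mem_table(&[])?; // memory regions elided in this sketch
    ctrl.set_vring_addr(0, 0x1000, 0x2000, 0x3000)?; // placeholder addresses
    Ok(())
}
```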

src/lib.rs Outdated
//!
//! Virtio devices use virtqueues to transport data efficiently. Virtqueue is a set of three
//! different single-producer, single-consumer ring structures designed to store generic
//! scatter-gather I/O.

This is outdated as of VIRTIO 1.1. The new Packed Virtqueue layout is a single ring.

This comment could be updated to:
Virtqueues are often implemented as VIRTIO 1.1 Split Virtqueues with a set of three ...

Eventually this crate should support Packed Virtqueues too for better performance.

Member Author

The comment has been updated to:
Virtio devices use virtqueues to transport data efficiently. The first generation of virtqueue
is a set of three different single-producer, single-consumer ring structures designed to store
generic scatter-gather I/O. The virtio specification 1.1 introduces an alternative compact
virtqueue layout named "Packed Virtqueue", which is more friendly to the memory cache system and
to hardware-implemented virtio devices. The packed virtqueue uses read-write memory, which means
the ring memory is both read and written by both the host and the guest. The new Packed Virtqueue
is preferred for performance.
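
For reference, the three single-producer, single-consumer structures of the split virtqueue mentioned above have the following layout in the VIRTIO specification (the Rust types here only illustrate the wire layout):

```rust
#[repr(C)]
struct VirtqDesc {
    addr: u64,  // guest-physical address of the buffer
    len: u32,   // length of the buffer in bytes
    flags: u16, // NEXT / WRITE / INDIRECT flags
    next: u16,  // index of the next descriptor when chained
}

#[repr(C)]
struct VirtqAvail {
    flags: u16,
    idx: u16,       // driver-side ring index
    ring: [u16; 0], // flexible array: one descriptor-head index per entry
}

#[repr(C)]
struct VirtqUsedElem {
    id: u32,  // head index of the completed descriptor chain
    len: u32, // total bytes written by the device into the chain
}

#[repr(C)]
struct VirtqUsed {
    flags: u16,
    idx: u16,                 // device-side ring index
    ring: [VirtqUsedElem; 0], // flexible array: one element per completed chain
}
```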

@stefanha

@sboeuf @jiangliu It seems that the max_queue_num item in node should not be configured by the VMM. max_queue_num is obtained from the slave by calling get_queue_num, to indicate the maximum number of queues the slave can support. It could be a gatekeeper to check whether the VMM sets up more queues than the slave can handle. I see the set_vring_* functions handle it like this. Thus, I thought the support added here should be removed. Correct me if I'm wrong :-P

This appears to be wrong. Here is what I've found:

In vhost-user "multiple virtqueues" and "multiqueue" are two separate concepts. Devices with multiple virtqueues have always been possible. From the beginning of vhost the vhost-net device had an rx virtqueue and a tx virtqueue. There was just no way for the master to ask the slave if more virtqueues are supported.

Devices with fixed virtqueue layouts are NOT required to advertise VHOST_USER_PROTOCOL_F_MQ because the VMM already knows the number of virtqueues.

When multiqueue devices were implemented it was necessary to extend the protocol so that the slave could report how many virtqueues are available. Multiqueue devices have a variable number of virtqueues that depends on the capabilities of the device (i.e. the vhost-user slave).

I think that ideally all vhost-user device backends should implement VHOST_USER_PROTOCOL_F_MQ and VHOST_USER_GET_QUEUE_NUM so the master can fetch max_queue_num from the vhost device backend.

However, according to the vhost-user specification and how existing devices are implemented, it is not possible to rely on this today. Here is how it works:

  1. The VMM configures max_queue_num based on the number of queues offered by its virtio device (NOT the vhost device).
  2. If the vhost-user device backend supports VHOST_USER_GET_QUEUE_NUM then max_queue_num can be fetched from the backend and its value updated.
  3. If the VMM user explicitly requested a specific multiqueue number (e.g. on the command-line or through a management API) then device creation fails if this number exceeds max_queue_num (e.g. "Cannot create virtio-net device with 1024 virtqueues, vhost-user slave only supports 64").

Therefore Sebastien's patch or something similar is necessary. The current master code conflates multiqueue with multiple virtqueues when they are actually separate concepts.

I have sent QEMU patches to make libvhost-user always advertise VHOST_USER_PROTOCOL_F_MQ but please do not rely on this.
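
A hedged sketch of the three-step flow above; the trait and function names are hypothetical, not this crate's API, and only the VHOST_USER_PROTOCOL_F_MQ bit position is taken from the vhost-user specification:

```rust
use std::io;

const VHOST_USER_PROTOCOL_F_MQ: u64 = 0; // protocol feature bit, per the spec

trait VhostUserMaster {
    fn get_protocol_features(&mut self) -> io::Result<u64>;
    fn get_queue_num(&mut self) -> io::Result<u64>;
}

fn resolve_max_queue_num(
    master: &mut dyn VhostUserMaster,
    vmm_queue_num: u64,    // step 1: queues offered by the VMM's virtio device
    requested_queues: u64, // step 3: queues explicitly requested by the user
) -> io::Result<u64> {
    let mut max_queue_num = vmm_queue_num;

    // Step 2: if the slave advertises VHOST_USER_PROTOCOL_F_MQ, fetch its limit.
    let proto = master.get_protocol_features()?;
    if proto & (1 << VHOST_USER_PROTOCOL_F_MQ) != 0 {
        max_queue_num = master.get_queue_num()?;
    }

    // Step 3: device creation fails if the user asked for more queues than the slave supports.
    if requested_queues > max_queue_num {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            format!(
                "cannot create device with {} virtqueues, vhost-user slave only supports {}",
                requested_queues, max_queue_num
            ),
        ));
    }
    Ok(max_queue_num)
}
```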

@bjzhjing
Contributor

@stefanha There is one thing I would like to confirm with you: max_queue_num represents the number of queue pairs, rather than the total number of virtqueues, right?

@stefanha

@stefanha There is one thing I would like to confirm with you: max_queue_num represents the number of queue pairs, rather than the total number of virtqueues, right?

max_queue_num is the number of virtqueues.

These interfaces are derived from the firecracker and crosvm projects.
They may be extended when enabling support for the vhost-user protocol.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
@jiangliu
Member Author

jiangliu commented Sep 2, 2019

Great, I'll let you know when this will be solved on the virtio-fs side of things.

I plan to integrate this workaround into vhost with comments stating that it's a temporary workaround.

@jiangliu changed the title from "Migrate vhost code from the firecracker project into the vhost crate" to "Introduce vhost crate to support vhost-kernel and vhost-user backend drivers" Sep 2, 2019
@jiangliu
Member Author

jiangliu commented Sep 2, 2019

In vhost-user "multiple virtqueues" and "multiqueue" are two separate concepts. Devices with multiple >virtqueues have always been possible. From the beginning of vhost the vhost-net device had an rx >virtqueue and a tx virtqueue. There was just no way for the master to ask the slave if more virtqueues are >supported.
Thanks for the explanation, it's easy to mix up these two concepts.
I will carry the patch from @sboeuf in next PR.

jiangliu and others added 8 commits September 2, 2019 14:22
The auto-generated vhost binding is small, it would be better to include
it into the vhost crate instead of building another dedicated crate for it.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
The implementation is derived from the crosvm project.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Implement vhost_kern::Vsock to control the in-kernel vhost-vsock driver.
The implementation is derived from the crosvm project.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Only basic messages are defined, and the vhost-user spec is also under
development. So feel free to add needed messages.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Daniel Prilik <daniel@prilik.com>
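
For context, a small illustrative subset of the master-to-slave request codes defined by the vhost-user specification looks like the following (values are worth double-checking against the spec before relying on them):

```rust
#[repr(u32)]
#[derive(Clone, Copy, Debug)]
enum MasterReq {
    GetFeatures = 1,
    SetFeatures = 2,
    SetOwner = 3,
    SetMemTable = 5,
    SetVringNum = 8,
    SetVringAddr = 9,
    SetVringKick = 12,
    SetVringCall = 13,
    GetProtocolFeatures = 15,
    SetProtocolFeatures = 16,
    GetQueueNum = 17,
}
```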
Implement a Unix domain socket based communication channel for vhost-user,
which can pass file descriptors between processes.

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel Prilik <daniel@prilik.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Daniel Prilik <daniel@prilik.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel Prilik <daniel@prilik.com>
Not all slave requests are implemented, only the subset of messages used by
virtio-fs.

The SlaveReqHandler struct creates the slave-master socketpair, and handles
all incoming / outgoing message validation and handler routing.
SlaveReqHandler::new accepts a struct that implements VhostUserSlaveReqHandler,
overriding the default handlers for relevant slave requests.
SlaveReqHandler::new also returns a UnixStream that must be sent to the slave
via Master::set_slave_request_fd.

As both Master and Slave requests share the same wire-format, it could be easy
to mix up the two when handling requests, so this commit introduces PhantomData
markers to Endpoints and VhostUserMsgHeader that specify if they operate on
MasterReqs or SlaveReqs.

Lastly, some library internals, notably the Listener and Endpoint structs, have
been marked as pub(crate). This necessitates a small refactor of how Slaves
are constructed, introducing a SlaveListener struct which returns a valid Slave
when the socket accepts a connection.

Signed-off-by: Daniel Prilik <danielprilik@gmail.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
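
The PhantomData-marker pattern described above can be sketched as follows; the names mirror the description, but the exact definitions in this PR may differ:

```rust
use std::marker::PhantomData;

/// Marker trait for a request-code family (master-initiated vs slave-initiated).
trait Req: Copy {}

#[derive(Clone, Copy)]
struct MasterReq(u32);
impl Req for MasterReq {}

#[derive(Clone, Copy)]
struct SlaveReq(u32);
impl Req for SlaveReq {}

/// Wire header shared by both directions. The zero-sized marker ties the header
/// to one request family at compile time without changing its layout.
#[repr(C)]
struct VhostUserMsgHeader<R: Req> {
    request: u32,
    flags: u32,
    size: u32,
    _marker: PhantomData<R>,
}

/// An endpoint parameterized the same way can only send and receive headers of
/// its own family, so master and slave requests cannot be mixed up by accident.
struct Endpoint<R: Req> {
    _marker: PhantomData<R>,
}
```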
jiangliu and others added 5 commits September 2, 2019 14:22
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We need the vhost-user protocol crate to be more generic to allow
communication with a configurable amount of virtqueues.

This is a workaround for the virtio-fs backend. According to the
vhost-user specification, the master should issue get_queue_num()
to get the maximum number of queues supported by the slave. Current
virtio-fs implementations need multiple virtqueues, but they do not
support the get_queue_num() request and do not set the MQ feature
bit either. So this is a workaround to hard-code max_queue_num on
the master side.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The latest vhost-user spec from the QEMU docs only defines two members in
VhostSetConfigType: master and live migration. These changes make
rust-vmm compatible with vhost-user backends.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
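
A minimal sketch of the two members mentioned above, following the values used in QEMU's vhost-user documentation (worth double-checking against the spec):

```rust
#[repr(u32)]
#[derive(Clone, Copy, Debug)]
enum VhostSetConfigType {
    Master = 0,
    Migration = 1,
}
```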
Use acked_protocol_features to replace acked_virtio_features in
get_config()/set_config() for protocol features like CONFIG.

This patch also fixes the wrong GET_CONFIG setting for set_config().

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
The VhostUserConfig carries a message with a payload, the contents of
which depend on the kind of device being emulated.

With this change, we calculate the offset of the payload within the
message, check that its size corresponds to the expected one, and pass it
to the backend as a reference to a slice adjusted to the payload
dimensions.

The backend will be responsible for validating the payload, as it's the
one aware of its expected contents.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
This patch replaces send_request_with_payload() with
send_request_with_body() to make get_config() work.

From the vhost-user spec:
A vhost-user message consists of 3 header fields and a payload.

+---------+-------+------+---------+
| request | flags | size | payload |
+---------+-------+------+---------+

but for the config space, the payload includes:

Virtio device config space
^^^^^^^^^^^^^^^^^^^^^^^^^^
+--------+------+-------+---------+
| offset | size | flags | payload |
+--------+------+-------+---------+

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
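
Putting the last two commits together, here is a hedged sketch of how the config-space payload can be carved out of the message body; the struct and function names are illustrative, not necessarily those used in this PR:

```rust
use std::mem::size_of;

/// Leading fields of the config-space payload, matching the
/// offset | size | flags | payload layout quoted above.
#[repr(C)]
#[derive(Clone, Copy)]
struct VhostUserConfigHdr {
    offset: u32, // offset into the device config space
    size: u32,   // number of config bytes that follow
    flags: u32,
}

/// Validate the sizes and hand the backend only the config bytes; the backend
/// is responsible for interpreting their contents.
fn config_payload<'a>(body: &'a [u8], hdr: &VhostUserConfigHdr) -> Result<&'a [u8], &'static str> {
    let payload_offset = size_of::<VhostUserConfigHdr>();
    if body.len() != payload_offset + hdr.size as usize {
        return Err("config payload size does not match header");
    }
    Ok(&body[payload_offset..])
}
```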
@sameo

sameo commented Mar 17, 2020

This is superseded by PR #9

@sameo closed this Mar 17, 2020