[ALICE] Expose some shm API to allow shm message metadata to be passed through a side channel#551
aalkin wants to merge 3 commits into
|
@rbx does this make sense to you? Do you have any better suggestions on how to deal with this? |
|
Can you elaborate (e.g. on a simplified pseudo topology) which device owns the msg and which will need which (read/write) access to it (also with regard to timing/parallel access)? I am trying to understand which features of the fmq memory mgmt you still need while you want to opt-out of the msg api. This is not clear to me yet. |
|
The objects that we want to cache have validity intervals expressed as timestamps in the data. An interval can span many timeframes or less than one timeframe; in both cases the border between two intervals, where the object has to change, is in general not a timeframe border. To complicate things further, there are several consuming devices with different processing rates. Our solution is a centralized cache that retrieves and stores the corresponding objects based on the timestamp tables of the timeframes currently in use. The objects are then transparently delivered to the consuming devices through a table isomorphic to the timestamp table. Delivering the objects through messages would be overly complicated: the same message might need to be sent for several consecutive timeframes or, conversely, several messages might be needed for a single timeframe, and that is for just one such object. Instead, the objects are allocated in shared memory as messages in the source device, using the transport of the channels pointing to the consuming devices, but are not sent. What is sent are Arrow tables, isomorphic to the timestamp tables, in which each row contains the metadata of the unsent messages holding the objects for that particular timestamp. This way control is not passed to the consuming devices and stays with the central cache, but the devices can still access the content of the unsent messages, provided the pointer can be inferred from the metadata. Specifically, we use the preconfigured channels, with their transport, to allocate the messages and then send their metadata in an unrelated message. The consumers, having access to the same channel, are able to use the contents of those unsent messages, while the cache is still managed by a single device.
The consumers need read-only access and are not concerned with the validity intervals or lifetime of the objects. The cache will drop everything that belongs to a timeframe already reported as consumed and will not send a new metadata table until all of the objects are ready. I hope this clarifies the intent. |
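To make the cache-side logic described above concrete, here is a minimal sketch of how a metadata table could be built from validity intervals. It is not code from this PR: `ObjectMeta`, `CachedObject`, `Lookup` and `BuildMetadataTable` are hypothetical names, a plain `std::vector` stands in for the Arrow table, and validity intervals are assumed to be half-open.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical metadata describing one unsent shared memory message.
struct ObjectMeta {
    std::uint16_t segmentId; // managed segment that holds the object
    std::int64_t handle;     // position-independent handle inside the segment
    std::size_t size;        // payload size in bytes
};

// One cached object together with its validity interval [validFrom, validUntil).
struct CachedObject {
    std::uint64_t validFrom;  // first timestamp for which the object is valid
    std::uint64_t validUntil; // first timestamp for which it is no longer valid
    ObjectMeta meta;
};

// Return the metadata of the object valid at `timestamp`, if it is cached.
std::optional<ObjectMeta> Lookup(const std::vector<CachedObject>& cache,
                                 std::uint64_t timestamp)
{
    for (const auto& object : cache) {
        if (timestamp >= object.validFrom && timestamp < object.validUntil) {
            return object.meta;
        }
    }
    return std::nullopt;
}

// Build one metadata row per timestamp row, so the result is isomorphic to
// the timestamp table. Returns nullopt if any object is missing, mirroring
// the rule that no table is sent until all objects are ready.
std::optional<std::vector<ObjectMeta>> BuildMetadataTable(
    const std::vector<CachedObject>& cache,
    const std::vector<std::uint64_t>& timestamps)
{
    std::vector<ObjectMeta> rows;
    rows.reserve(timestamps.size());
    for (const auto timestamp : timestamps) {
        const auto meta = Lookup(cache, timestamp);
        if (!meta) {
            return std::nullopt;
        }
        rows.push_back(*meta);
    }
    return rows;
}
```

Consecutive rows may carry the same metadata when one object covers several timestamps, and a single timeframe may reference several objects when an interval border falls inside it. Neither case needs special handling on the consumer side.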
|
Thx for the explanation, I think I got the constraints now. I don't object to your proposal in this PR. One alternative that still comes to mind is that you create a |
|
I'd rather keep the fairmq abstraction, actually. I do not want to have a parallel transport which needs to be configured and so on. |
|
Indeed, this could also be achieved with manually managed shared memory; however, going through the FairMQ API is easier simply because everything is already preconfigured in the workflow deployment, so we can just re-use the existing transport with minimal changes. |
|
I think the use case is well-motivated and keeping the cache centralised while letting consumers resolve pointers from metadata is fine. But there are some issues with this implementation:
I propose an alternative that covers the same use case without GetManager() or the refactor:
|
For a specific case where we want a cache of preallocated shared memory messages that is accessible by other devices through a time-based table, we send a table containing only the message metadata, which avoids complicated synchronization. To access the objects in shared memory referred to by this metadata, the target device needs to calculate the device-local pointer to the corresponding memory. This is achieved by exposing the shared memory manager API from the transport of the specific channel that is used to send the metadata table. This approach allows us to centralize the cache management and re-use the FairMQ-based shared memory management, meaning the client code remains largely unchanged.
To summarize: the shmem message now exposes its metadata through a public method, and this metadata is transferred to the target device; the shmem transport exposes its shmem manager, which in turn exposes the API to get the local pointer from the message handle and the managed segment id on the target device. This is, of course, a draft; any suggestions on how to handle this better (specifically, better aligned with the FairMQ architecture) are welcome.
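The pointer resolution on the target device can be illustrated with a small self-contained sketch. It does not use the FairMQ API: `ShmMeta`, `SegmentView` and `LocalSegments` are hypothetical names, and a plain byte offset stands in for whatever handle type the shared memory manager actually uses. The only assumption is that a handle is position-independent, so each process can translate it against its own mapping of the segment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical metadata row sent through the side channel.
struct ShmMeta {
    std::uint16_t segmentId; // which managed segment holds the object
    std::ptrdiff_t handle;   // offset of the object from the segment base
    std::size_t size;        // object size in bytes
};

// One process-local mapping of a shared segment (base address differs per process).
class SegmentView {
  public:
    SegmentView(unsigned char* base, std::size_t size) : fBase(base), fSize(size) {}

    // Producer side: convert a local address into a position-independent handle.
    std::ptrdiff_t HandleFromAddress(const void* address) const
    {
        return static_cast<const unsigned char*>(address) - fBase;
    }

    // Consumer side: convert a handle back into a read-only local pointer.
    // Returns nullptr if the object would not fit inside the mapping.
    const unsigned char* AddressFromHandle(std::ptrdiff_t handle, std::size_t size) const
    {
        if (handle < 0 || size > fSize ||
            static_cast<std::size_t>(handle) > fSize - size) {
            return nullptr;
        }
        return fBase + handle;
    }

  private:
    unsigned char* fBase;
    std::size_t fSize;
};

// Stand-in for the manager: the segments known to this device, keyed by id.
class LocalSegments {
  public:
    void Add(std::uint16_t id, const SegmentView& view) { fSegments.insert_or_assign(id, view); }

    // Resolve metadata to a device-local pointer; nullptr for unknown segments.
    const unsigned char* Resolve(const ShmMeta& meta) const
    {
        const auto it = fSegments.find(meta.segmentId);
        if (it == fSegments.end()) {
            return nullptr;
        }
        return it->second.AddressFromHandle(meta.handle, meta.size);
    }

  private:
    std::unordered_map<std::uint16_t, SegmentView> fSegments;
};
```

In this sketch the consumer only ever receives a `const` pointer, matching the read-only access the consumers need, while allocation and lifetime stay with the device that owns the cache.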
Since "shmem/TransportFactory.h" needs to be included in the client code, the class is out-of-lined so that we do not need to expose other internal headers or link to ZeroMQ directly.
@ktf