Skip to content

ARROW-2195: [Plasma] Return auto-releasing buffers#1807

Closed
pitrou wants to merge 11 commits intoapache:masterfrom
pitrou:ARROW-2195-plasma-buffers
Closed

ARROW-2195: [Plasma] Return auto-releasing buffers#1807
pitrou wants to merge 11 commits intoapache:masterfrom
pitrou:ARROW-2195-plasma-buffers

Conversation

@pitrou
Copy link
Copy Markdown
Member

@pitrou pitrou commented Mar 29, 2018

On the C++ side, add a new PlasmaClient::GetAuto() method to return buffers that release
the corresponding object on destruction.

On the Python side, return such buffers in PlasmaClient.get_buffers().

@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Mar 29, 2018

This changes/adds some APIs on the CUDA side so will need reviewing from someone familiar with that code base.

@pitrou pitrou force-pushed the ARROW-2195-plasma-buffers branch 2 times, most recently from 43d488f to e1a22da Compare March 29, 2018 14:05
@pitrou pitrou changed the title [WIP] ARROW-2195: [Plasma] Return auto-releasing buffers ARROW-2195: [Plasma] Return auto-releasing buffers Mar 29, 2018
@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Apr 2, 2018

@robertnishihara @pcmoritz @Wapaul1 Is one of you interested in reviewing this?

@pitrou pitrou force-pushed the ARROW-2195-plasma-buffers branch from 8a01b22 to c0db264 Compare April 2, 2018 10:39
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put this into https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/macros.h. It seems more broadly useful for the whole arrow project.

Copy link
Copy Markdown
Contributor

@pcmoritz pcmoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! On a high level, here is what we should attempt: Make it so the new APIs CreateAuto and GetAuto can be used as replacements for Create and Get and then deprecate the old ones and rename CreateAuto -> Create and GetAuto -> Get. I can give some more lower level comments about that in the PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have ARROW_UNUSED for this:

#define ARROW_UNUSED(x) (void)x

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be exposed through the Info method, see

Status Info(const ObjectID& object_id, int* object_status);

Given that is is private, we can leave it for now, but can you create a JIRA for it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, see ARROW-2379

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Info requires an IPC to the plasma store and seems too heavy-weight since this can all be determined with locally available information.

IsInUse is probably better named something like IsInUseByThisClient (that's an ugly name, but I think it's more clear).

We have a lot of code in client.cc that's already doing stuff like

auto elem = objects_in_use_.find(object_id);
if (elem == objects_in_use_.end()) {
  ...
}

so it may make sense for IsInUse to replace some of that code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get rid of this callback, and do the wrapping into PlasmaBuffer as a post-processing step to GetBuffers?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds a bit brittle. You want to have a single PlasmaBuffer per exported (e.g. mmap'ed) piece of memory, but both data and metadata share that piece of memory.

@pcmoritz
Copy link
Copy Markdown
Contributor

pcmoritz commented Apr 2, 2018

How do people feel about replacing the old Get and Create methods with new ones which return PlasmaBuffers, and make Release private (which means PlasmaBuffer needs to be friend of PlasmaClient). Given that 0.9 has just been released it might be the right time to do it.

@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Apr 2, 2018

I suppose that depends how you feel about breaking compatibility without notice :-)

@pcmoritz
Copy link
Copy Markdown
Contributor

pcmoritz commented Apr 2, 2018

Ok, let's do the following:

  • Deprecate the old Create and Get and Release

  • Make new ones with the following signature:

// sizes are already available through buffers, no need to store them separately
struct PlasmaObject {
  std::shared_ptr<Buffer> data;
  std::shared_ptr<Buffer> metadata;
  int device_num;
}

Status Get(const std::vector<ObjectID>& object_ids, int64_t timeout_ms, std::vector<PlasmaObject>* objects);

Status Create(const ObjectID& object_id, int64_t data_size, int64_t metadata_size, PlasmaObject* object);

// (the metadata would be written to shared memory directly by the client instead of passing it in)

which are backed by PlasmaBuffers and don't need explicit release

I'm slightly uncomfortable that the two overloaded ones have different behavior and the same name, but going forward that seems to be the best solution (we should try to avoid having two rounds of deprecations).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok for it to be sealed before it is released.

Maybe "The returned object must be released once it is done with. It must also be either sealed or aborted."?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it seems you're right.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be changed to use GetAuto?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I had overlooked that. Thank you.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we can get rid of

cdef class PlasmaBuffer(Buffer):

now?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we migrate Create() too, yes.

@pitrou pitrou force-pushed the ARROW-2195-plasma-buffers branch 2 times, most recently from 167dc0c to e869757 Compare April 3, 2018 11:03
@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Apr 3, 2018

@pcmoritz I would suggest tackling Create() in a separate PR, since there are additional complications in how it interacts with Seal() and Abort(). What do you think?

@pitrou pitrou force-pushed the ARROW-2195-plasma-buffers branch from e869757 to 04c1842 Compare April 3, 2018 11:27
@pitrou pitrou force-pushed the ARROW-2195-plasma-buffers branch from 04c1842 to ac2d84c Compare April 3, 2018 11:30
@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Apr 3, 2018

Note an ideal API would have the Seal(), Abort() and Release() methods on the PlasmaBuffer, not on PlasmaClient.


// ----------------------------------------------------------------------
// From googletest
// XXX also in parquet-cpp
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the XXX

ARROW_CHECK_OK(client2_.Disconnect());
system("killall plasma_store &");
// Kill all plasma_store processes
// XXX should only kill the processes we launched
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace XXX by TODO

@pcmoritz
Copy link
Copy Markdown
Contributor

pcmoritz commented Apr 3, 2018

Awesome, this is really great! Let's remove the XXX in the PR and then we can merge it.

@robertnishihara
Copy link
Copy Markdown
Contributor

Nice job @pitrou! This will be a lot cleaner. Especially once we can do the same thing for Create and then remove the old code paths.

@pcmoritz
Copy link
Copy Markdown
Contributor

pcmoritz commented Apr 3, 2018

I also created a JIRA for the Create API (very similar to what you propose, in addition it allows us to get rid of the Abort() method): https://issues.apache.org/jira/browse/ARROW-2386

@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Apr 4, 2018

@pcmoritz I fixed the XXX's.

@pcmoritz pcmoritz closed this in cf39686 Apr 4, 2018
@pitrou pitrou deleted the ARROW-2195-plasma-buffers branch April 4, 2018 17:55
@robertnishihara
Copy link
Copy Markdown
Contributor

@pitrou I'm seeing https://issues.apache.org/jira/browse/ARROW-2448 when using this PR.

It seems like we need each client to keep track of the buffers that it produces and to invalidate them when the client disconnects (or something like that). cc @pcmoritz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants