Add use_bfc_allocator parameter to StreamExecutor C API #47598

annarev · 2021-03-05T20:21:18Z

use_bfc_allocator parameter will decide whether to wrap pluggable device SubAllocator with BFC allocator or not.

I am thinking it can be used as follows:
See lines here:

tensorflow/tensorflow/core/common_runtime/pluggable_device/pluggable_device_process_state.cc

Line 115 in 2c31420

PluggableDeviceBFCAllocator* device_bfc_allocator =

They should be replaced with something like

Allocator * device_allocator = nullptr;
if (use_bfc_allocator) {
  PluggableDeviceBFCAllocator* device_bfc_allocator =
        new PluggableDeviceBFCAllocator(
            sub_allocator, total_bytes, options,
            strings::StrCat("PluggableDevice_", tf_device_id.value(), "_bfc"));
  device_allocator = device_bfc_allocator;
} else {
  PluggableDeviceSimpleAllocator* simple_allocator = new PluggableDeviceSimpleAllocator(sub_allocator);
  device_allocator = simple_allocator;
}

PluggableDeviceSimpleAllocator should just be a minimal wrapper over the sub_allocator (currently not implemented as a part of this PR). This seems much simpler to me than what I was thinking of originally (sorry for the delay).

PTAL @penpornk @kulinseth, @jzhoulon

jzhoulon · 2021-03-09T05:55:39Z

@annarev Thanks for the PR!
Do you mean the PluggableDeviceSimpleAllocator will directly call the stream_exec_->AllocateArray(), which calls SP_StreamExecutor::allocate(const SP_Device* device, uint64_t size, int64_t memory_space,SP_DeviceMemoryBase* mem ), and plugin authors decides whether allocate is build on top of a custom allocator? if it is, I think it is better to add an field such as void* custom_allocator in SP_Device? since the allocator should be per-device? Thanks

kulinseth · 2021-03-10T19:13:34Z

Thanks @annarev for the PR. SimpleAllocator can be something straightforward calling into AllocateRaw and that would be fine for us. We had brief discussion of this during the design phase. To keep it simple we can leverage SubAllocator currently. After the pluggable device implementation merges we can use the use_bfc_allocator flag to pass this information in device_process_state..

There are few places in pluggable_device_process_state like AllocatorParts will require changes to accomodate the SimpleAllocator design, currently it assumes the Device memory allocator is BFC. I can upload a PR which I have tested locally works with our design. Please let me know.

annarev · 2021-03-10T19:31:57Z

@annarev Thanks for the PR!
Do you mean the PluggableDeviceSimpleAllocator will directly call the stream_exec_->AllocateArray(), which calls SP_StreamExecutor::allocate(const SP_Device* device, uint64_t size, int64_t memory_space,SP_DeviceMemoryBase* mem ), and plugin authors decides whether allocate is build on top of a custom allocator? if it is, I think it is better to add an field such as void* custom_allocator in SP_Device? since the allocator should be per-device? Thanks

@jzhoulon Sounds about right. use_bfc_allocator will decide whether to use a simple allocator or a BFC allocator. So, a plugin that wants to use their own custom allocation strategy and doesn't want to wrap with BFC, can set this to true. Do you mean add use_bfc_allocator parameter per device? I was just thinking having one setting for all device types is easier to pass through on TF end. Do we have a usecase for using or not using bfc allocator per device?

Thanks @annarev for the PR. SimpleAllocator can be something straightforward calling into AllocateRaw and that would be fine for us. We had brief discussion of this during the design phase. To keep it simple we can leverage SubAllocator currently. After the pluggable device implementation merges we can use the use_bfc_allocator flag to pass this information in device_process_state..

@kulinseth Yep that makes sense. I misread it back then as custom allocator wrapper itself being something customized by the plugin.

There are few places in pluggable_device_process_state like AllocatorParts will require changes to accomodate the SimpleAllocator design, currently it assumes the Device memory allocator is BFC. I can upload a PR which I have tested locally works with our design. Please let me know.

Sounds good, you can upload the PR and we can comment on it if anything needs changing.

jzhoulon · 2021-03-11T01:14:29Z

@annarev Thanks for the PR!
Do you mean the PluggableDeviceSimpleAllocator will directly call the stream_exec_->AllocateArray(), which calls SP_StreamExecutor::allocate(const SP_Device* device, uint64_t size, int64_t memory_space,SP_DeviceMemoryBase* mem ), and plugin authors decides whether allocate is build on top of a custom allocator? if it is, I think it is better to add an field such as void* custom_allocator in SP_Device? since the allocator should be per-device? Thanks

@jzhoulon Sounds about right. use_bfc_allocator will decide whether to use a simple allocator or a BFC allocator. So, a plugin that wants to use their own custom allocation strategy and doesn't want to wrap with BFC, can set this to true. Do you mean add use_bfc_allocator parameter per device? I was just thinking having one setting for all device types is easier to pass through on TF end. Do we have a usecase for using or not using bfc allocator per device?

No, I mean an opaque handle (void* custom_allocator) in SP_Device. See the following code example:

void create_device (SP_Platform platform, SE_CreateDeviceParams params, status) {
 params->custom_allocator = new CustomAllocator()
}

void allocate(SP_Device device, unit64_t size, ...) {
  CustomAllocator* allocator = static_cast<CustomAllocator*>(device->custom_allocate);
  allocator->allocate_raw(size, ...)
}

annarev · 2021-03-11T01:51:52Z

@annarev Thanks for the PR!
Do you mean the PluggableDeviceSimpleAllocator will directly call the stream_exec_->AllocateArray(), which calls SP_StreamExecutor::allocate(const SP_Device* device, uint64_t size, int64_t memory_space,SP_DeviceMemoryBase* mem ), and plugin authors decides whether allocate is build on top of a custom allocator? if it is, I think it is better to add an field such as void* custom_allocator in SP_Device? since the allocator should be per-device? Thanks

@jzhoulon Sounds about right. use_bfc_allocator will decide whether to use a simple allocator or a BFC allocator. So, a plugin that wants to use their own custom allocation strategy and doesn't want to wrap with BFC, can set this to true. Do you mean add use_bfc_allocator parameter per device? I was just thinking having one setting for all device types is easier to pass through on TF end. Do we have a usecase for using or not using bfc allocator per device?

No, I mean an opaque handle (void* custom_allocator) in SP_Device. See the following code example:
void create_device (SP_Platform platform, SE_CreateDeviceParams params, status) {
 params->custom_allocator = new CustomAllocator()
}

void allocate(SP_Device device, unit64_t size, ...) {
  CustomAllocator* allocator = static_cast<CustomAllocator*>(device->custom_allocate);
  allocator->allocate_raw(size, ...)
} 

I think setting a custom allocator makes things more complicated. I think we should do it only if we think of a usecase that cannot be covered by just implementing allocate/deallocate in the StreamExecutor API: https://cs.opensource.google/tensorflow/tensorflow/+/master:tensorflow/c/experimental/stream_executor/stream_executor.h;l=245.

kulinseth · 2021-03-11T03:53:14Z

I think setting a custom allocator makes things more complicated. I think we should do it only if we think of a usecase that cannot be covered by just implementing allocate/deallocate in the StreamExecutor API: https://cs.opensource.google/tensorflow/tensorflow/+/master:tensorflow/c/experimental/stream_executor/stream_executor.h;l=245.

Agreed, I can queue up the PR after #45784 merges.

penpornk

I think this should work. I'm approving this now in the interest of time. We can continue the discussion here.

penpornk · 2021-03-12T19:34:28Z

Agreed, I can queue up the PR after #45784 merges.

That would be awesome. Thank you very much, Kulin!

Add use_bfc_allocator parameter to StreamExecutor C API

e065755

google-ml-butler bot added the size:S CL Change Size: Small label Mar 5, 2021

google-cla bot added the cla: yes label Mar 5, 2021

penpornk self-requested a review March 5, 2021 21:55

gbaned self-assigned this Mar 8, 2021

penpornk mentioned this pull request Mar 11, 2021

[PluggableDevice] PluggableDevice mechanism implementation #45784

Closed

penpornk approved these changes Mar 12, 2021

View reviewed changes

google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Mar 12, 2021

kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 12, 2021

copybara-service bot merged commit 844d24f into tensorflow:master Mar 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add use_bfc_allocator parameter to StreamExecutor C API #47598

Add use_bfc_allocator parameter to StreamExecutor C API #47598

Uh oh!

annarev commented Mar 5, 2021 •

edited by penpornk

Loading

Uh oh!

jzhoulon commented Mar 9, 2021

Uh oh!

kulinseth commented Mar 10, 2021

Uh oh!

annarev commented Mar 10, 2021 •

edited

Loading

Uh oh!

jzhoulon commented Mar 11, 2021

Uh oh!

annarev commented Mar 11, 2021

Uh oh!

kulinseth commented Mar 11, 2021

Uh oh!

penpornk left a comment

Uh oh!

penpornk commented Mar 12, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Add use_bfc_allocator parameter to StreamExecutor C API #47598

Add use_bfc_allocator parameter to StreamExecutor C API #47598

Uh oh!

Conversation

annarev commented Mar 5, 2021 • edited by penpornk Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jzhoulon commented Mar 9, 2021

Uh oh!

kulinseth commented Mar 10, 2021

Uh oh!

annarev commented Mar 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jzhoulon commented Mar 11, 2021

Uh oh!

annarev commented Mar 11, 2021

Uh oh!

kulinseth commented Mar 11, 2021

Uh oh!

penpornk left a comment

Choose a reason for hiding this comment

Uh oh!

penpornk commented Mar 12, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

annarev commented Mar 5, 2021 •

edited by penpornk

Loading

annarev commented Mar 10, 2021 •

edited

Loading