`optimize_cuda_cache` in `PPOConfig` is marked as deprecated, and `optimize_device_cache` is recommended instead.
However, when I use `optimize_device_cache` instead of `optimize_cuda_cache`, `time/ppo/optimize_step` in my runs increases from ~42 to ~230 seconds.
It seems that when `optimize_cuda_cache` is not set, the current code overwrites whatever value was assigned to `optimize_device_cache` with `False`:

    if optimize_cuda_cache is not None:
        warnings.warn(
            "The `optimize_cuda_cache` argument will be deprecated soon, please use `optimize_device_cache` instead."
        )
        optimize_device_cache = optimize_cuda_cache
    else:
        optimize_device_cache = False

(`trl/trl/trainer/ppo_config.py`, lines 142 to 148 in 151a452)
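
A possible fix, as a minimal sketch rather than the actual patch: only let `optimize_cuda_cache` take over when it was explicitly set, and otherwise keep the value passed as `optimize_device_cache`. The helper name `resolve_device_cache_flag` is hypothetical and not part of trl; it just isolates the branching logic so it can be checked on its own.

```python
import warnings
from typing import Optional


def resolve_device_cache_flag(
    optimize_cuda_cache: Optional[bool] = None,
    optimize_device_cache: bool = False,
) -> bool:
    """Hypothetical helper (not a trl function) mirroring the config logic."""
    if optimize_cuda_cache is not None:
        # The deprecated argument was set explicitly, so it still wins.
        warnings.warn(
            "The `optimize_cuda_cache` argument will be deprecated soon, "
            "please use `optimize_device_cache` instead."
        )
        return optimize_cuda_cache
    # No `else: False` branch: keep whatever the caller asked for.
    return optimize_device_cache
```

With this logic, `resolve_device_cache_flag(optimize_device_cache=True)` returns `True` instead of being silently reset to `False`.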