--userns=keep-id performance is slow for large images  #20845

@lastephey

Description

Feature request description

Dear Podman developers,

This is a request to improve the performance of --userns=keep-id. The slowdown seems to matter most for large images like the NVIDIA PyTorch image (~10 GB).

Here are some timing data from my laptop.

Without keep-id

(base) DOE-7616476:~ stephey$ time podman run --rm nvcr.io/nvidia/pytorch:23.10-py3 date

=============
== PyTorch ==
=============

NVIDIA Release 23.10 (build 71422337)
PyTorch Version 2.1.0a0+32f93b1

Container image Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copyright (c) 2014-2023 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU                      (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015      Google Inc.
Copyright (c) 2015      Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
   Use the NVIDIA Container Toolkit to start this container with GPU support; see
   https://docs.nvidia.com/datacenter/cloud-native/ .

Wed Nov 29 23:37:22 UTC 2023

real	0m1.267s
user	0m0.051s
sys	0m0.029s

With keep-id

(base) DOE-7616476:~ stephey$ time podman run --rm --userns=keep-id nvcr.io/nvidia/pytorch:23.10-py3 date

[... same PyTorch banner and NVIDIA driver warning as above ...]

Wed Nov 29 23:42:52 UTC 2023

real	5m22.367s
user	0m0.054s
sys	0m0.030s

It's better on NERSC hardware with podman-hpc (about 1 minute instead of 5), so I think my laptop is probably closer to a worst-case scenario.
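For anyone reproducing these numbers, the two runs above can be scripted into a minimal timing comparison. This is only a sketch: it uses the image name and flags from the report, guards on podman being installed and the image already being pulled (so network time is excluded), and runs `true` instead of `date` to keep the measurement to container start-up.

```shell
#!/bin/sh
# Sketch: compare container start-up time with and without --userns=keep-id.
IMAGE=nvcr.io/nvidia/pytorch:23.10-py3

# Only run the comparison if podman is available and the image is already
# in local storage, so we don't accidentally trigger a ~10 GB pull.
if command -v podman >/dev/null 2>&1 && podman image exists "$IMAGE"; then
    echo "== without keep-id =="
    time podman run --rm "$IMAGE" true

    echo "== with keep-id =="
    time podman run --rm --userns=keep-id "$IMAGE" true
else
    echo "podman not found or image not pulled; skipping comparison"
fi
```

Note that the first keep-id run for a given image is expected to be the slowest; timings on a warm run may differ, so it is worth repeating each measurement.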

Thanks very much for looking into this,
Laurie

cc @vrothberg @rhatdan @giuseppe @Dfulton @scanon

Suggest potential solution

No response

Have you considered any alternatives?

No response

Additional context

Motivating use case: at the NERSC computing center, users may wish to run rootless Podman containers as themselves rather than as root, for various reasons including interacting with our Slurm scheduler. Since we discussed this at our last meeting, we wanted to file an issue to help track it.
