I had difficulty launching this stack successfully out of the box, and had to manually increase the size of the container's /dev/shm by adding a memory-backed volume and a matching volumeMount:
volumeMounts:
  - name: {{ .Release.Name }}-storage
    mountPath: /data
  - name: shm
    mountPath: /dev/shm
volumes:
  - name: shm
    emptyDir:
      medium: Memory
      sizeLimit: 20Gi
This avoids the NCCL shared memory allocation issues described in vllm-project/vllm#6574. It might be nice to add this to the tutorial steps and expose a convenient configuration option here for anyone facing similar issues.
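As a rough sketch of what such an option could look like, the snippet below gates the shared-memory volume behind a values entry. The `sharedMemory.enabled` and `sharedMemory.sizeLimit` keys are hypothetical names I made up for illustration, not existing chart values, and the indentation would need to match the actual deployment template.

```yaml
# values.yaml (hypothetical keys, not part of the current chart)
sharedMemory:
  enabled: true
  sizeLimit: 20Gi

# Deployment template excerpt (assumed layout; adjust indentation to the chart).
# Container level: mount the volume at /dev/shm only when the option is on.
volumeMounts:
  - name: {{ .Release.Name }}-storage
    mountPath: /data
  {{- if .Values.sharedMemory.enabled }}
  - name: shm
    mountPath: /dev/shm
  {{- end }}

# Pod level: back /dev/shm with a memory-backed emptyDir of the requested size.
{{- if .Values.sharedMemory.enabled }}
volumes:
  - name: shm
    emptyDir:
      medium: Memory
      sizeLimit: {{ .Values.sharedMemory.sizeLimit | quote }}
{{- end }}
```

With something like this in place, the size could be set at install time via `--set sharedMemory.sizeLimit=20Gi` instead of editing the template by hand. Note that a memory-backed emptyDir counts against the container's memory limit, so the limit may need to be raised accordingly.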
Thanks!