Fine-tuning on a V100 GPU 

Hi, I'm trying to fine-tune _Llama-3 8B_ on a **V100 GPU**. As required by Unsloth, I upgraded torch to 2.1 and followed the recommended Google Colab installation steps from this [tutorial](https://colab.research.google.com/drive/1mPw6P52cERr93w3CMBiJjocdTnyPiKTX?usp=sharing). However, fine-tuning fails because xFormers reports that it requires a compute capability of 8.0, while my GPU has 7.0. Yet Unsloth can fine-tune _Llama-3 8B_ on a T4, which has a compute capability of 7.5. What am I missing? Is there a version of xFormers that is compatible with both my hardware and Unsloth's requirements?

My torch version is: `2.1.0+cu121`

This is my GPU setup:
```
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04             Driver Version: 535.171.04   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100S-PCIE-32GB          Off | 00000000:3B:00.0 Off |                    0 |
| N/A   46C    P0              28W / 250W |      5MiB / 32768MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
```
This is my code:
```python
import torch
print(torch.__version__)
major_version, minor_version = torch.cuda.get_device_capability()
# Must install separately since Colab has torch 2.2.1, which breaks packages
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8:
    # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)
    !pip install --no-deps packaging ninja einops flash-attn "xformers<0.0.26" trl peft accelerate bitsandbytes
else:
    # Use this for older GPUs (V100, Tesla T4, RTX 20xx)
    !pip install --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes
pass
```

This is the error I get:
```
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 555, 8, 4, 128) (torch.float16)
     key         : shape=(2, 555, 8, 4, 128) (torch.float16)
     value       : shape=(2, 555, 8, 4, 128) (torch.float16)
     attn_bias   : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
     p           : 0.0
`flshattF@0.0.0` is not supported because:
    xFormers wasn't build with CUDA support
    requires device with capability > (8, 0) but your GPU has capability (7, 0) (too old)
    operator wasn't built - see `python -m xformers.info` for more info
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
    operator wasn't built - see `python -m xformers.info` for more info
    operator does not support BMGHK format
    unsupported embed per head: 128
```
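Every operator in the traceback also lists "xFormers wasn't build with CUDA support", so the installed xFormers build may be worth checking alongside the compute capability. Below is a minimal sketch for collecting that information in one place. The helper names `collect_env_report` and `xformers_info` are hypothetical (they are not part of Unsloth or xFormers), and the sketch assumes only the standard library plus whichever of torch and xFormers happen to be installed.

```python
import importlib.metadata
import importlib.util
import subprocess
import sys


def collect_env_report():
    """Gather torch, CUDA, GPU capability and xFormers versions.

    Missing packages are reported as None instead of raising.
    """
    report = {"torch": None, "torch_cuda": None, "capability": None, "xformers": None}

    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        # torch.version.cuda is None when torch is a CPU-only build.
        report["torch_cuda"] = torch.version.cuda
        if torch.cuda.is_available():
            # (major, minor) tuple of the current device, e.g. (7, 0) on a V100.
            report["capability"] = torch.cuda.get_device_capability()

    try:
        # Read the installed version without importing xformers itself.
        report["xformers"] = importlib.metadata.version("xformers")
    except importlib.metadata.PackageNotFoundError:
        pass

    return report


def xformers_info():
    """Run `python -m xformers.info`, the command named in the error message."""
    result = subprocess.run(
        [sys.executable, "-m", "xformers.info"],
        capture_output=True,
        text=True,
    )
    return result.stdout or result.stderr


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
    print(xformers_info())
```

If `torch_cuda` and the CUDA version the xFormers wheel was built against disagree, that mismatch would be one possible explanation for the "wasn't build with CUDA support" lines, independent of the GPU's compute capability.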
