Custom Node Testing
Expected Behavior
GGUF Qwen models (e.g., Q4_K_M) should run with the --fast argument and not crash.
Actual Behavior
Even smaller GGUF Qwen models (e.g., Q4_K_M) that ran fine previously now produce the following error when launched with the --fast or --fast pinned_memory argument:
KSampler
CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
I'm aware the --fast argument "enables some untested and potentially quality deteriorating optimizations". The culprit appears to be the pinned_memory optimization.
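For context, pinned (page-locked) host memory is what makes fast, asynchronous host-to-device copies possible; a minimal sketch of the kind of transfer a pinned_memory optimization performs (generic PyTorch, not ComfyUI's actual implementation — the helper name and flag are illustrative):

```python
import torch

# Illustrative only: what a "pinned_memory" fast path typically does.
# Pinning a host tensor page-locks its memory, which lets CUDA copy it
# to the GPU asynchronously (non_blocking=True). An "invalid argument"
# CUDA error can surface if a copy path assumes a pinned/contiguous
# source that the quantized GGUF tensors don't actually provide.
def to_device(t: torch.Tensor, device: str, use_pinned: bool) -> torch.Tensor:
    if use_pinned and device.startswith("cuda"):
        t = t.pin_memory()                       # page-locked host allocation
        return t.to(device, non_blocking=True)   # async copy needs a pinned source
    return t.to(device)                          # plain synchronous copy
```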
Steps to Reproduce
Launch ComfyUI with the --fast or --fast pinned_memory argument. Run a simple workflow that includes a GGUF Unet loader node. Observe the (likely) CUDA crash.
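Per the error log's own advice, setting CUDA_LAUNCH_BLOCKING=1 forces synchronous kernel launches so the stack trace points at the failing call rather than a later API call. A small sketch that assembles the reproduction command (assuming ComfyUI's usual main.py entry point; adjust the path for your install):

```python
import os

# Build the launch command for reproducing the crash with synchronous
# CUDA launches, so the reported stack trace is trustworthy.
def repro_command(pinned: bool) -> list[str]:
    args = ["python", "main.py", "--fast"]  # main.py: assumed ComfyUI entry point
    if pinned:
        args.append("pinned_memory")        # the suspected culprit optimization
    return args

# CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous for debugging.
env = {**os.environ, "CUDA_LAUNCH_BLOCKING": "1"}
# subprocess.run(repro_command(True), env=env)  # run from the ComfyUI directory
```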
Debug Logs
Other
No response