Vulkan in koboldcpp-1.85 #1398

@Danik-droid

Describe the Issue
I have both an AMD and an NVIDIA card, so I often use Vulkan to load a larger model across them. Unfortunately, it has stopped working properly for me:

  • Llama 3.1 models (whether split across both cards or run on the AMD card only) produce nonsense immediately or after the first sentence. The GTX 1080 Ti alone works, but much slower than before. The Radeon alone sometimes works and sometimes doesn't (no errors in the console).
  • Mistral Small 24B: split across the two cards, it loops in the very first response, getting stuck on one word that it repeats endlessly.
  • In general, splitting a model across the two cards often breaks generation, either immediately or after some time. The AMD card seems to be affected more than the GTX. (The AMD card on Vulkan sometimes doesn't work properly even on its own, and I don't know why. Clearing the cache and reinstalling the drivers didn't change anything.)
  • The problem does not occur in the previous version.

Additional Information:
Windows 10 (64-bit, fully updated)
Latest NVIDIA and AMD drivers
GPU - Radeon RX 6900 XT
GPU - GTX 1080 Ti
