Name and Version
0.3.2 / 0.3.1 / 0.3.0
Operating systems
Linux
GGML backends
Vulkan
Hardware
7900 XTX (24 GB VRAM), 64 GB DDR5
Models
Qwen 27B Q4_K_XL / Qwen 35B A3B Q4_K_M
Problem description & steps to reproduce
Vulkan is far too slow: about 40 t/s with MTP, even at very low context, while llama.cpp gives 85 t/s at the start of a context.
HIP is even worse, at 25 t/s with MTP.
First Bad Commit
No response
Relevant log output
.