Eval bug: Vulkan/HIP is way too slow in comparison to llama.cpp #59

@Ezzz-dev

Name and Version

0.3.2 / 0.3.1 / 0.3.0

Operating systems

Linux

GGML backends

Vulkan

Hardware

Radeon RX 7900 XTX (24 GB VRAM), 64 GB DDR5 system RAM

Models

Qwen 27B Q4_K_XL / Qwen 35B A3B Q4_K_M

Problem description & steps to reproduce

Vulkan is far too slow: about 40 t/s with MTP, even at very low context, while llama.cpp gives 85 t/s at the start of a context.
HIP is even worse, at 25 t/s with MTP.
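
To put the gap in proportion, here is a minimal sketch that turns the figures reported above into ratios against the llama.cpp number. The dictionary keys and the `relative_throughput` helper are hypothetical names chosen for illustration; only the three t/s values come from this report, and they are rough observations rather than controlled benchmark results.

```python
# Throughput figures as reported in this issue, in tokens per second.
# The labels are illustrative; only the numbers come from the report.
reported_tps = {
    "llama.cpp": 85.0,     # baseline, at the start of a context
    "Vulkan + MTP": 40.0,  # this project, Vulkan backend
    "HIP + MTP": 25.0,     # this project, HIP backend
}


def relative_throughput(tps, baseline_key):
    """Return each entry's throughput as a fraction of the baseline entry."""
    baseline = tps[baseline_key]
    return {name: value / baseline for name, value in tps.items()}


ratios = relative_throughput(reported_tps, "llama.cpp")

for name, ratio in ratios.items():
    print(f"{name}: {reported_tps[name]:.0f} t/s ({ratio:.0%} of llama.cpp)")
```

On these numbers, Vulkan with MTP reaches a little under half of the llama.cpp throughput, and HIP with MTP a little under a third.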

First Bad Commit

No response

Relevant log output

.
