llama on My learning and diary

https://jackliusr.github.io/tags/llama/
Recent content in llama on My learning and diary

Running Qwen3.6 MTP GGUF on AMD AI MAX 395 with llama.cpp ROCm
https://jackliusr.github.io/posts/2026/06/running-qwen3.6-mtp-gguf-on-amd-ai-max-395-with-llama.cpp-rocm/
Mon, 01 Jun 2026 16:00:00 +0800

Background

MTP support was recently merged into llama.cpp through the following pull request: "MTP support merged into llama.cpp".

After the merge, I wanted to test MTP models on my mini-PC powered by the AMD AI MAX 395. I tried several approaches, including manually building llama.cpp and using Unsloth GGUF models directly. However, despite multiple attempts, I could not get a stable working setup. I also searched through GitHub issues and asked several AI assistants, including ChatGPT, Gemini, and DeepSeek.