Below you will find pages that utilize the taxonomy term “vllm”
Posts
Running vLLM on AMD AI MAX+ 395 (ROCm, Ubuntu 24.04)
Finally, I managed to get vLLM running on my AMD AI MAX+ 395 GPU on Ubuntu 24.04.
It was not straightforward — ROCm support on Ryzen AI (gfx1151) is still evolving, and I ran into multiple low-level GPU faults before finding a stable setup.
This post documents:
- What didn't work
- The errors I encountered
- The working configuration
Hopefully this saves you a few hours (or days).
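The working configuration itself isn't included in this excerpt. As a hedged sketch only (not the post's actual setup), launching vLLM's HTTP server on a ROCm machine generally looks like the following; the model name and the `HSA_OVERRIDE_GFX_VERSION` value are illustrative assumptions, not values verified for gfx1151.

```shell
# Hedged sketch, not the post's verified working configuration.
# HSA_OVERRIDE_GFX_VERSION tells ROCm to treat the GPU as a different
# target; the value below is a common RDNA3 override, shown only as
# an example and NOT confirmed for the AI MAX+ 395 (gfx1151).
export HSA_OVERRIDE_GFX_VERSION=11.0.0

# Start vLLM's OpenAI-compatible server; the model is a placeholder.
vllm serve Qwen/Qwen2.5-1.5B-Instruct --port 8000
```

If the server comes up cleanly, it listens on port 8000 and accepts OpenAI-style requests at `/v1/chat/completions`.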
Set up vLLM on MacBook M4
Introduction

Several days ago, I set up Ollama on my MacBook M4, and it works pretty well. At the time, I tried using it with Copilot and the local models codegemma:7b and qwen3:8b. My expectations were not high, since my MacBook Pro M4 is only an entry-level configuration; I just wanted to see how it worked. I also learned there are other options, such as vLLM. After comparing the two, I found vLLM to be more flexible, more powerful, production-ready, and widely used in enterprises.
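One practical difference worth noting: vLLM serves an OpenAI-compatible HTTP API, so a client request is just a JSON POST to `/v1/chat/completions`. The sketch below only builds and inspects such a request body; the base URL and model name are placeholders assuming a locally running server (e.g. started with `vllm serve`), and no request is actually sent.

```python
import json

# Placeholder values; a real server would be started separately,
# e.g. with `vllm serve <model> --port 8000`.
base_url = "http://localhost:8000/v1"
model = "qwen3-8b"  # illustrative model name, not verified

# Build a chat-completion request body in the OpenAI-compatible
# shape that vLLM's server accepts.
payload = {
    "model": model,
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Write a hello-world in Go."},
    ],
    "max_tokens": 128,
    "temperature": 0.2,
}

# Serialize to JSON, as an HTTP client library would before POSTing
# to f"{base_url}/chat/completions".
body = json.dumps(payload)
print(len(body) > 0)
```

Because the wire format matches OpenAI's, existing OpenAI client libraries can usually point at the local server simply by overriding the base URL.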