Recent Posts
Running Qwen3.6 MTP GGUF on AMD AI MAX 395 with llama.cpp ROCm
Background
MTP support was recently merged into llama.cpp through the following pull request:
MTP support merged into llama.cpp
After the merge, I wanted to test MTP models on my mini-PC powered by the AMD AI MAX 395. I tried several approaches, including manually building llama.cpp and using Unsloth GGUF models directly. However, despite multiple attempts, I could not get a stable working setup.
I also searched through GitHub issues and asked several AI assistants, including ChatGPT, Gemini, and DeepSeek.
Fault-Oblivious Stateful Workflows: Durable Execution Matters More Than Orchestration
Introduction
Last year, I spent some time studying Oracle Banking Microservices Architecture (OBMA), together with enterprise schedulers and orchestration platforms such as Control-M.
Part of the work involved understanding how to convert traditional Control-M jobs into Airflow DAGs. During this process, I started to observe an important architectural distinction:
Not all workflows are the same.
While studying OBMA, I noticed that Netflix Conductor was used as the workflow engine inside the architecture.
Working Around ROCm PyTorch Replacement Issues with uv and ComfyUI
Introduction
When working with AMD GPUs and ROCm-based AI workloads, one common issue appears when using uv for Python dependency management.
The problem becomes especially visible when setting up projects like ComfyUI on Linux with ROCm-enabled PyTorch builds.
Even when ROCm-specific wheels have been installed manually, running uv add or a dependency sync may silently replace the ROCm-enabled packages with the standard PyPI builds, which support only CUDA.
This leads to broken GPU acceleration and unexpected runtime failures.
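One quick way to notice such a replacement is to look at the local build tag in the PyTorch version string. The sketch below is only an illustration and is not taken from the post: the helper names `local_build_tag` and `is_rocm_build` are hypothetical, and it assumes the usual wheel naming where ROCm builds carry a `+rocm…` suffix and CUDA builds a `+cu…` suffix.

```python
def local_build_tag(version: str) -> str:
    """Return the part of a version string after '+', or '' if there is none."""
    _, _, local = version.partition("+")
    return local


def is_rocm_build(version: str) -> bool:
    """Guess whether a PyTorch version string names a ROCm build.

    Assumption: ROCm wheels use a local tag starting with 'rocm'
    (for example '2.4.0+rocm6.1'); anything else is treated as non-ROCm.
    """
    return local_build_tag(version).lower().startswith("rocm")


if __name__ == "__main__":
    # Sample strings for illustration; in a real environment pass torch.__version__.
    for sample in ("2.4.0+rocm6.1", "2.4.0+cu121", "2.4.0"):
        print(sample, is_rocm_build(sample))
```

In an actual environment, passing `torch.__version__` to `is_rocm_build` after each uv operation gives a cheap sanity check; `torch.version.hip` being `None` is another sign that a non-ROCm build is installed.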