Recent Posts
Exploring spec-driven development with Ontologies and Domain Design Models
Introduction
One of the more interesting recent developments in Retrieval-Augmented Generation (RAG) is Knowledge Graph–based RAG. Unlike traditional vector-based retrieval, this approach introduces structured relationships between entities, enabling deeper reasoning and better contextual understanding.
This naturally raises a question:
Can we build more robust and reliable software systems by combining the following?
Ontologies
Domain-specific design models
Spec-driven development (SDD)
This article explores that idea through a practical comparison.
read more
Running Qwen3.6 35B Locally with Ollama and VS Code Integration
Overview
Running large language models locally is becoming increasingly practical, even for developers without access to massive GPU clusters.
In this post, I walk through how to:
Run Qwen3.6 35B (A3B, Q4_K_M quantized) locally using Ollama
Integrate the model into VS Code
Use it as a local coding assistant
This setup is especially useful for:
Air-gapped environments
read more
Running vLLM on AMD AI MAX+ 395 (ROCm, Ubuntu 24.04)
Finally, I managed to get vLLM running on my AMD AI MAX+ 395 GPU on Ubuntu 24.04.
It was not straightforward: ROCm support on Ryzen AI (gfx1151) is still evolving, and I ran into multiple low-level GPU faults before finding a stable setup.
This post documents:
- What didn’t work
- The errors I encountered
- The working configuration
Hopefully this saves you a few hours (or days).
read more