Posts
From Conventional Commits to LLM-Generated Release Notes
Introduction
For several years, I used Conventional Commits across my software projects. The premise was straightforward: write commit messages in a structured, machine-readable format, then leverage tooling to generate changelogs and release notes automatically.
For example:
feat: add user login
fix: resolve payment retry issue
docs: update API usage guide
This approach served me well. Tools like Conventional Changelog could parse commit history and produce structured release notes with minimal manual effort.
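To make the idea concrete, here is a minimal sketch in Python of how structured commit subjects can be grouped into release-note sections. It is not how Conventional Changelog is implemented; the names `group_commits`, `render`, and the `SECTIONS` mapping are assumptions made up for this illustration.

```python
import re
from collections import defaultdict

# Matches "type(optional scope)!: description", the header shape used by
# Conventional Commits.
HEADER = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]*)\))?(?P<breaking>!)?: (?P<desc>.+)$"
)

# Hypothetical mapping from commit type to a release-notes heading.
SECTIONS = {"feat": "Features", "fix": "Bug Fixes", "docs": "Documentation"}


def group_commits(subjects):
    """Group commit subject lines by release-notes section."""
    grouped = defaultdict(list)
    for subject in subjects:
        match = HEADER.match(subject.strip())
        if match is None:
            continue  # skip commits that do not follow the convention
        section = SECTIONS.get(match["type"], "Other")
        grouped[section].append(match["desc"])
    return dict(grouped)


def render(grouped):
    """Render grouped commits as plain-text release notes."""
    lines = []
    for section, items in grouped.items():
        lines.append(section)
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)


if __name__ == "__main__":
    history = [
        "feat: add user login",
        "fix: resolve payment retry issue",
        "docs: update API usage guide",
    ]
    print(render(group_commits(history)))
```

Commits that do not match the header pattern are silently skipped here; a real tool would more likely report them.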
Posts
How I Keep Learning Without Forgetting Everything
Sometimes people look at my profile and wonder: “How can you keep learning so many things? What is your secret?”
The truth is, there is no magic. My approach is simple: I write things down, practise important skills repeatedly, and choose technologies carefully after real hands-on exploration.
1. I Write Down What I Learn
Whenever I learn something useful, I try to capture it.
Sometimes I write it as a blog post.
Posts
one GEPA report of DSPy
What is GEPA?
GEPA stands for Genetic-Pareto. It is a DSPy optimizer that automatically improves the prompts and instructions of a multi-module LLM program through evolutionary search. It iteratively mutates module instructions, evaluates the changes, and keeps the best-performing candidates on a Pareto front.
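The "keep candidates on a Pareto front" step can be illustrated with a small sketch. This is generic Pareto filtering over per-example scores, not DSPy's actual GEPA code; the function names and the candidate scores are invented for the example.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b on every example
    and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates not dominated by any other candidate."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
    ]


if __name__ == "__main__":
    # Hypothetical per-example scores for three prompt candidates.
    candidates = {
        "prompt_a": [0.9, 0.2, 0.5],
        "prompt_b": [0.4, 0.8, 0.5],
        "prompt_c": [0.3, 0.1, 0.4],
    }
    print(pareto_front(candidates))  # prints ['prompt_a', 'prompt_b']
```

Here `prompt_c` is dropped because `prompt_a` beats it on every example, while `prompt_a` and `prompt_b` both survive because each is best on a different example.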
What’s Happening in This Run
This file captures a GEPA optimization run on a financial news extraction system that classifies M&A (merger/acquisition) articles and extracts structured data from them.
Posts
Running Qwen3.6 MTP GGUF on AMD AI MAX 395 with llama.cpp ROCm
Background
MTP (multi-token prediction) support was recently merged into llama.cpp through the following pull request:
MTP support merged into llama.cpp
After the merge, I wanted to test MTP models on my mini-PC powered by the AMD AI MAX 395. I tried several approaches, including manually building llama.cpp and using Unsloth GGUF models directly. However, despite multiple attempts, I could not get a stable working setup.
I also searched through GitHub issues and asked several AI assistants, including ChatGPT, Gemini, and DeepSeek.
Posts
Fault-Oblivious Stateful Workflows: Durable Execution Matters More Than Orchestration
Introduction
Last year, I spent some time studying Oracle Banking Microservices Architecture (OBMA), together with enterprise schedulers and orchestration platforms such as Control-M.
Part of the work involved understanding how to convert traditional Control-M jobs into Airflow DAGs. During this process, I started to observe an important architectural distinction:
Not all workflows are the same.
While studying OBMA, I noticed that Netflix Conductor was used as the workflow engine inside the architecture.
Posts
Working Around ROCm PyTorch Replacement Issues with uv and ComfyUI
Introduction
When working with AMD GPUs and ROCm-based AI workloads, one common issue appears when using uv for Python dependency management.
The problem becomes especially visible when setting up projects like ComfyUI on Linux with ROCm-enabled PyTorch builds.
Although ROCm-specific wheels are manually installed, running commands such as uv add or dependency synchronization may silently replace ROCm-enabled packages with standard PyPI versions that only support CUDA.
This leads to broken GPU acceleration and unexpected runtime failures.
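One way to notice the replacement early is to check the installed package version after each dependency change. The sketch below assumes that ROCm builds of PyTorch carry a local version label beginning with `rocm` (for example `2.4.0+rocm6.1`); that naming is an assumption about the wheels in use, and the helper names are hypothetical.

```python
from importlib import metadata


def is_rocm_build(version: str) -> bool:
    """True if the local version label (the part after '+') starts with 'rocm'."""
    _, _, local = version.partition("+")
    return local.startswith("rocm")


def installed_torch_is_rocm() -> bool:
    """Check the installed torch distribution; False if torch is missing."""
    try:
        return is_rocm_build(metadata.version("torch"))
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    if not installed_torch_is_rocm():
        print("torch is missing or is not a ROCm build")
```

Running a check like this after `uv add` or a sync makes a silent swap visible before it shows up as a runtime failure.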
Posts
Reducing Architecture Drift in Spec-Driven Development with coding agents and LLMs
Introduction
Spec-driven development is becoming increasingly popular in the era of AI-assisted software engineering. Instead of starting directly from implementation, teams define specifications, domain rules, contracts, and architectural intentions first, allowing Large Language Models (LLMs) and automation tools to generate significant parts of the system.
This approach can dramatically improve development speed, documentation quality, and alignment between business and engineering.
However, one important challenge emerges quickly:
Architecture drift.
Posts
Exploring spec-driven development with Ontologies and Domain Design Models
Introduction
One of the more interesting recent developments in Retrieval-Augmented Generation (RAG) is Knowledge Graph–based RAG. Unlike traditional vector-based retrieval, this approach introduces structured relationships between entities, enabling deeper reasoning and better contextual understanding.
This naturally raises a question:
Can we build more robust and reliable software systems by combining:
Ontologies
Domain-specific design models
Spec-driven development (SDD)
This article explores that idea through a practical comparison.
Posts
Running Qwen3.6 35B Locally with Ollama and VS Code Integration
Overview
Running large language models locally is becoming increasingly practical, even for developers without access to massive GPU clusters.
In this post, I walk through how to:
Run Qwen3.6 35B (A3B, Q4_K_M quantized) locally using Ollama
Integrate the model into VS Code
Use it as a local coding assistant
This setup is especially useful for:
Air-gapped environments
Posts
Running vLLM on AMD AI MAX+ 395 (ROCm, Ubuntu 24.04)
Finally, I managed to get vLLM running on my AMD AI MAX+ 395 GPU on Ubuntu 24.04.
It was not straightforward — ROCm support on Ryzen AI (gfx1151) is still evolving, and I ran into multiple low-level GPU faults before finding a stable setup.
This post documents:
- What didn’t work
- The errors I encountered
- The working configuration
Hopefully this saves you a few hours (or days).