Linkblog

Fission-AI/OpenSpec: Spec-driven development (SDD) for AI coding assistants. is a lighter-weight variant of juxt/allium: A language for sharpening intent alongside implementation. Could also be quite interesting: the tooling is more developed, but the language is weaker. Allium is much more formal and oriented toward pseudocode.

juxt/allium: A language for sharpening intent alongside implementation. is a very exciting project: a specification language for the behavior of software systems, a language for which there is no runtime environment or compiler except the LLM, implemented as Agent Skills. Super exciting because it also includes Distill, with which you can analyze existing software and retroactively build specifications, or work out a specification in an interview process with the AI that an LLM can understand much more precisely than general English. A funny detail on the side: I tried Allium with Qwen3-Coder-Next, my current favorite model for local hosting, in pi.dev. I couldn't install the Allium binary (a syntax checker and linter) with Homebrew, so pi.dev simply downloaded the binary and installed it itself.

scitrera/cuda-containers: Scitrera builds of various CUDA containers for version consistency, starting primarily with NVIDIA DGX Spark Containers - I'm currently a big fan of eugr/vllm-node as a base image because it always provides up-to-date versions of vLLM, but if I want to play around with SGLang sometime, this is probably the most similar project. I'm particularly interested in EAGLE-3 speculative decoding: a very small draft model proposes several tokens ahead, and the main model verifies them in one go, keeping the ones that match and generating its own token where they diverge. This way, a large share of the tokens can come from a much faster simple model in the <3B range, and the large model only has to step in where the draft is wrong.
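
The draft-and-verify loop can be sketched in a few lines of Python. The two model functions below are toy stand-ins (no real inference), invented purely to show the control flow; in a real engine, verification of all drafted positions happens in a single forward pass of the large model.

```python
def small_model(context):
    """Hypothetical cheap draft model: proposes the next token."""
    return (sum(context) + 1) % 10

def large_model(context):
    """Hypothetical expensive target model (the ground truth).
    Disagrees with the draft at every 4th position, so that
    rejections actually show up in this sketch."""
    tok = (sum(context) + 1) % 10
    return tok if len(context) % 4 else (tok + 1) % 10

def speculative_decode(context, steps, draft_len=3):
    """Draft `draft_len` tokens with the small model, then let the
    large model verify them: keep the matching prefix for free and
    take the large model's own token at the first mismatch."""
    out = list(context)
    while len(out) - len(context) < steps:
        # 1) draft ahead with the cheap model
        drafts, tmp = [], list(out)
        for _ in range(draft_len):
            t = small_model(tmp)
            drafts.append(t)
            tmp.append(t)
        # 2) verify drafts against the expensive model
        tmp = list(out)
        for t in drafts:
            target = large_model(tmp)
            if t == target:
                tmp.append(t)       # accepted: token was (almost) free
            else:
                tmp.append(target)  # rejected: large model overrides
                break
        out = tmp
    return out[len(context):][:steps]
```

With greedy acceptance like this, the output is token-for-token identical to running the large model alone; the only thing that changes is how often the large model has to produce a token itself.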

thushan/olla: High-performance lightweight proxy and load balancer for LLM infrastructure. Intelligent routing, automatic failover and unified model discovery across local and remote inference backend might be the better choice following the LiteLLM debacle (hacked supply chain with data extractor in the package). For me, it's definitely interesting because I simply want to run two models and make them available under a single endpoint, and all the other packages are significantly overkill for that.

Running Mistral Small 4 119B NVFP4 on NVIDIA DGX Spark (GB10) - DGX Spark / GB10 User Forum / DGX Spark / GB10 - NVIDIA Developer Forums - lifesaver discussion in the NVIDIA forums. With what's in there, I got Mistral Small 4 running smoothly. And it runs cleanly with 150K context and 100 tokens/second in generation. Wow. This is the first time I've really noticed the power of this machine.

Zed: The Fastest AI Code Editor — Zed's Blog is a pretty cool editor that finally comes with a good Vim implementation. It supports coding agents and is open to open-weight and self-hosted models. To be honest, I really like it, and it's genuinely fast. And it's written in Rust, which excites me even more.

Introducing Mistral Small 4 | Mistral AI is another interesting candidate for the ASUS Ascent GX10||ASUS Deutschland, especially since I don't need side-car models for vision there, because the model itself already comes with vision capabilities built-in. And as a MoE model, it should also deliver good speed performance.

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 · Hugging Face will likely be the first large model on the ASUS Ascent GX10||ASUS Deutschland, because it was trained with NVFP4 in mind - so it suffers no "dumbing down" from quantization but behaves exactly as expected. And it is optimized for agentic workflows, which should benefit OpenClaw, as should the 1M context, which the model can probably even make real use of (a different architecture than classic transformer-based models).

WireGuard® for Enterprise sounds funny on a purely private blog with that name, but they have pretty affordable plans for VPN-like constructs that make it very easy to reach your local home devices while on the go. There are certainly other alternatives, but I think if I run my agent locally, I'd rather put the subscription money into a proper VPN service instead, where I get more value overall. Update: it's free for private users up to 3 users. Nice.

ASUS Ascent GX10||ASUS Deutschland is arriving in the next few days. AI powerhouse that will allow me to run larger models locally and, for example, operate an OpenClaw Agent autonomously at home without needing any subscriptions. I'm really looking forward to seeing what's possible with it.

Cognee is also something I’ll keep an eye on for later. Basically a knowledge graph controlled by an LLM to make memory available for another LLM. Certainly exciting to play around with when I have good local hardware to run larger models on. But for now, just a memory keeper.

Docker Model Runner Adds vLLM Support on macOS | Docker - just noting this for now; it could be interesting later because it lets me run models via Docker with vLLM while using Apple Silicon at the same time. What's interesting here is that it comes as a ready-to-use Docker image, so I don't have to fiddle with setup. I'm currently working more with my own rfc1437/MLXServer: a simple MLX based server for small models to run locally, simply because I only need it for offline operation, but vllm-metal could be very exciting later.

rfc1437/MLXServer: a simple MLX based server for small models to run locally is a tool that I built (with AI assistance) to run small models directly locally, without heavy overhead. It doesn't consume much memory, has a built-in local chat for personal experiments, and feels significantly more practical to me compared to the big alternatives—fewer knobs to adjust, but consequently less confusion. I just want to run a small model locally for my on-the-road blog.

mlx-community/Qwen3.5-9B-MLX-4bit · Hugging Face is another nice, small model — larger than the others, thus slightly more consistent in execution, but still pretty fast. And that's the upper limit of what you can run on a MacBook Air M4 with 16GB RAM without crashing the computer.

Google/gemma-3-4b-it · Hugging Face is a pretty nice model that has been trained on many European languages and is therefore well suited for local translations - it loads into memory at under 4 GB and occupies approximately 6.5 GB during inference. And it has vision capability, so it can also be used to get image descriptions. Ideal, for example, for local use with bDS when you want to be offline on the go. And significantly smaller than mlx-community/gemma-3-12b-it-4bit · Hugging Face - that one was borderline on my MacBook Air.

Inferencer | Run and Deeply Control Local AI Models is an interesting tool that allows you to run LLMs locally. Of course, LM Studio or Ollama or vllm-mlx can do this as well. But Inferencer has a feature called "Model streaming" that's pretty cool: it can run models that are actually too large for memory. Of course, you're trading time for memory, but for a local model for image captioning or similar smaller tasks, you could definitely use it. However, I have the feeling that the model becomes somewhat more fragile this way - for example, it suddenly doesn't use tools correctly anymore (I tried it with gemma3 12b, which is just scratching the memory limit of my laptop).

Pagefind | Pagefind — Static low-bandwidth search at scale is a static search engine for statically generated HTML, like my blog. And it will soon power the search on this blog. No external dependencies, no server, no infrastructure complexity - just a few additional files that get uploaded. And of course active JavaScript in the browser.

pi.dev is a minimalist harness for agentic coding whose focus is not on features, but on extensibility. The underlying idea is solid: a very simple harness that can be extended through TypeScript plugins, so the harness can adapt to your own workflow and requirements. Maybe I'll take a closer look at it soon.

steveyegge/beads: Beads - A memory upgrade for your coding agent is a to-do list tool for agents. Essentially a memory system for projects that agents can use to manage themselves (storing tasks and features) and to coordinate more complex workflows where an agent needs to work through a series of issues and resolve them. Tasks can have dependencies, and agents only see the set of tasks that can actually be worked on right now. Interesting for projects where you want to run agents in loops to solve larger tasks.
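
The "agents only see what can actually be worked on" mechanic is easy to sketch. The names and data structure below are hypothetical illustrations, not the beads API: a task is ready when it is not done and all of its dependencies are.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    id: str
    done: bool = False
    deps: list = field(default_factory=list)  # ids this task waits on

def ready_tasks(tasks):
    """Return the tasks an agent may pick up right now: not yet done,
    and every dependency already completed."""
    by_id = {t.id: t for t in tasks}
    return [t for t in tasks
            if not t.done
            and all(by_id[d].done for d in t.deps)]

# Example: "api" becomes workable once "design" is done; "ui" stays
# hidden until "api" is finished.
tasks = [
    Task("design", done=True),
    Task("api", deps=["design"]),
    Task("ui", deps=["api"]),
]
```

The nice effect of this filtering is that an agent never has to reason about the whole dependency graph; it only ever sees a short, currently actionable list.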

dolthub/doltgresql: DoltgreSQL - Version Controlled PostgreSQL is the PostgreSQL-flavored partner of dolthub/dolt: Dolt – Git for Data and offers the same features, but with PostgreSQL syntax and a binary interface that is compatible with PostgreSQL.

dolthub/dolt: Dolt – Git for Data is a SQL database that internally uses mechanisms similar to git, thereby supporting data branches and merges and generally providing the ability to version data in the database and work with history. And like git, it also allows data changes to be distributed via version control. In principle, "slow transaction shipping".

paperclipai/paperclip: Open-source orchestration for zero-human companies is an approach to orchestrating and managing agents. What's interesting here is that it works with an organization modeled after a company structure and uses means to depict agent communication that could provide good transparency. I haven't tried it yet, but alongside gastown, it's one of the more interesting projects on this topic. The whole field of agent orchestration is still quite new, so there's still a large area of experimentation, but it's exciting to watch.

Ghostty is the foundation that cmux — The Terminal for Multitasking was built on. Generally also a very nice terminal that responds noticeably faster than the standard terminal and already works very well. What I didn't like was that tabs weren't automatically reopened in the appropriate directories when the program was closed. I simply have too many persistent sessions that I keep coming back to. cmux just does that better.

cmux — The Terminal for Multitasking is a pretty brilliant terminal program for CLI workflows, which are experiencing a resurgence especially through agentic coding. And what I love about it: it has clean persistence of open workspaces with multiple tabs, so I don't have to keep my work environment open all the time. I've always liked having various directories open directly because I switch back and forth between several while programming, and this works much better with cmux than with anything else.

NuGet Gallery | Photino.Blazor.CustomWindow 1.3.1 is a library for Photino Blazor that allows you to make windows chrome-less. That is, you can remove the title bar and standard decorations. The idea behind it is to gain more control over the look and feel and create more compact UIs. With this library, you get back basic functions like window resizing and other standard elements that users expect, but under full control of the application.

OpenClaw Memory Masterclass: The complete guide to agent memory that survives • VelvetShark - an interesting compilation of the memory system and the pitfalls of compaction in OpenClaw. The agent is meant to run for a long time, but there is always the risk that compaction will strike right in the middle of complex situations. And OpenClaw runs autonomously, so you want to be sure it keeps going without interruption.

unum-cloud/USearch - the name says it already: a library that offers an index for vectors, which can come from embeddings for example, and can find semantically similar texts. Not similar in wording, but in meaning. An interesting topic: the models required for this are related to LLMs, but small rather than large - they don't need to fully understand and generate, because they only produce vectors that can then be compared against each other, and the higher the vector similarity, the closer the texts are in topic. A cool little feature for bDS.
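
A toy sketch of the underlying idea, with a deliberately primitive bag-of-characters "embedding" standing in for a real embedding model (USearch itself only provides the fast indexing and nearest-neighbor search over such vectors, not the embeddings):

```python
import math

def embed(text):
    """Toy stand-in for an embedding model: a 26-dim letter-count
    vector. Real embeddings come from a small dedicated model and
    capture meaning, not spelling."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0
```

A vector index replaces the brute-force pairwise comparison with an approximate nearest-neighbor structure, so the "find the most similar texts" query stays fast even over millions of vectors.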

waybarrios/vllm-mlx: OpenAI and Anthropic compatible server for Apple Silicon. I use this to run mlx-community/gemma-3-12b-it-4bit on my MacBook Air. It works very well, a small shell script to start the server and then I am autonomous. Not as comfortable as Ollama, but it perfectly supports Apple's MLX and thus makes good use of Silicon.

mlx-community/gemma-3-12b-it-4bit · Hugging Face is currently the best model for local operation, allowing me to implement image captioning and even local chat. It's not the fastest, as it's quite large, but it's absolutely suitable for offline operation if I come up with a few mechanisms for batch processing of images, etc. This could be super exciting for vacation times. An image description might take a minute, but hey, no dependencies.

Models.dev — An open-source database of AI models is a very practical site that provides framework parameters for all kinds of providers and all kinds of LLMs, including even API prices. And technical parameters such as input/output tokens.

Ollama - a runtime environment for LLMs that allows models to be run locally. My favorite model at the moment: qwen2.5vl:7b-q4_K_M. With only 6.6 GB in size, this runs smoothly on a MacBook Air M4 and still has enough memory and capacity to run programs alongside it. The model is surprisingly usable in chat and above all has excellent vision capabilities. Ideal for providing titles, alt text, or summaries for images without having to pay big providers for it. And an important building block to bring bDS back to full-offline.

mistralai/mistral-vibe: Minimal CLI coding agent by Mistral - to accompany AI Studio - Mistral AI, there is also Vibe, the coding interface to Devstral, as open source. Very nice, because the two make a good pair. I will definitely try it out, even if I will probably still reach for the powerhouses (Opus 4.6) for larger projects.

AI Studio - Mistral AI - as the situation in the USA becomes a bit more tense again, and simply because one should always check what is happening outside the USA, here is a link to a European alternative to the major US operators. Mistral offers a coding model with Mistral 2 that is not only open weights (i.e., freely available and operable if you have the necessary hardware), but also quite affordable when using Mistral itself. And the performance is slightly above Claude Haiku 4.5, and below Sonnet 4.5, but not by much. So quite usable and my first experiments were not bad. Unfortunately, no vision capability, so not very suitable for experiments with images (and therefore not ideal for my bDS), but still interesting enough to keep an eye on.

ZK I Zettel 1 (1) - Niklas Luhmann-Archiv - where the inspiration for my blog comes from, or what has always driven me beneath the technical surface.

If you, like me, want an overview of UI integration for LLMs and are wondering how A2UI and MCP Apps compare and what they offer: Agent UI Standards Multiply: MCP Apps and Google’s A2UI - Richard MacManus helps. I have implemented A2UI in bDS so that the LLM can also use visual elements in the internal chat, and I really like that. But the idea of embedding parts of my UI into external agents is also fascinating. Admittedly, "local HTML/JS in an iframe" sounds like a hack at first, but much in the LLM environment gives me that feeling right now, simply because everything is pushed through a normal text stream and you hope that the LLMs adhere to the formats (even A2UI works like this).

cloudflare/cobweb: COBOL to WebAssembly compiler - I'll just leave this here. Let someone else clean up the old stuff. What else can you say.

Pyodide is a bit of a lifesaver for me in bDS: I must admit, I'm not really super deep into TypeScript and I actually don't feel like writing it myself. If the AI does it, that's okay, there's enough knowledge for an LLM to draw on, but I don't really want to delve deep into it myself. And Python has been one of my favorite languages for a long time. And Pyodide offers exactly that: a port of CPython to WebAssembly. It provides a pleasant language for scripts and macros that can access everything the application does and - if I ever want to - can also load Python libraries.

Drizzle ORM - was suggested by the AI while building bDS and has proven very reliable. A clean ORM for TypeScript with quite a nice API that strongly reminds me of Django. On top of that, clean handling of migrations, which can also contain plain SQL, enabling more complex migrations. And so far completely unobtrusive in operation. Particularly interesting: there is a direct translation to GraphQL, so the objects can also be exposed as an API, which I think I should take a look at (or rather, I should nag the AI to take a look at it). I am always a fan of flexible standard integrations to connect external tools.

A2UI is an interesting project for a streaming-oriented protocol for applications with LLM integration. The basic idea is streaming JSON snippets from which the UI is incrementally built, with the LLM controlling what goes into the UI. It allows the LLM to produce more than just plain text or basic ASCII graphics, and it gives LLM queries a different look. The whole thing comes from Google and is maintained as an open GitHub project. I have integrated it in bDS as well, and it's quite something when you can get a heatmap of the distribution of your blog posts across the months.
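
A minimal sketch of the incremental-assembly idea. The message schema here ("op", "add", "text") is invented for illustration and is not the actual A2UI protocol; the point is only that each streamed JSON snippet mutates a UI tree a little further:

```python
import json

def apply_snippets(lines):
    """Build a tiny UI tree from a stream of JSON snippets:
    'add' creates a component, 'text' appends content to it."""
    ui = {}
    for line in lines:
        msg = json.loads(line)
        if msg["op"] == "add":
            ui[msg["id"]] = {"type": msg["type"], "text": ""}
        elif msg["op"] == "text":
            ui[msg["id"]]["text"] += msg["chunk"]
    return ui

# A stream as an LLM might emit it, one snippet per chunk
stream = [
    '{"op": "add", "id": "t1", "type": "heading"}',
    '{"op": "text", "id": "t1", "chunk": "Posts per "}',
    '{"op": "text", "id": "t1", "chunk": "month"}',
]
```

Because every message is independently parseable, the UI can render partial state while the model is still generating, which is exactly what makes the streaming feel responsive.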

Waggle Dance - oldie but goldie. It's always fun how this rather simple game can captivate people, even more than 10 years after its release.

OpenCode | The Open-Source AI Coding Agent - an open-source coding agent that could soon outperform Claude Code. Very good in execution and, above all, provider-agnostic. You can attach any model, even one self-hosted with LM Studio or Ollama. If you just want to play around, you can simply download it and get started, even without providing a credit card (then of course with limited tokens, but quite usable for initial experiments). And it is independent of the major providers (ok, except for their models if you want to use them - and for serious coding, those are unfortunately still necessary). OpenCode also offers its own AI models, which are then billed, but a few models are always freely available and thus allow quite serious experimentation without investment.

Big Damn Stupid - no, Blogging Desktop Server. Currently my favorite project where I play around with Vibe-Coding and build software with which I can run the blog. But this time with the clear perspective that I won't succumb to bit rot or the complexity spiral again. Simple software with a usable interface for maintaining blog posts, but storage as Markdown files with YAML front matter to be easily stable for the future. Then with sqlite for caching and full-text search and other comfort features and git for synchronizing the blog data, and git-lfs for the images. Feels quite usable right now.

Schotten Totten 2 is becoming one of our favorite two-player games. A remake of the old Schotten Totten (which unfortunately is no longer available under that name, only as Battle Line, and that only in English). Unlike the first game, it is now asymmetric, with just enough difference in gameplay that the two sides definitely feel different while still mostly doing the same things. Quick to set up, quick to play, and quick to put away, with enough tactics to keep you hooked for a few games. And the graphic style is just nice.

Tak is quite an interesting game: inspired by a fantasy novel by Patrick Rothfuss and brought to life as a fictional "classic" strategy game. The beauty of it: you can easily make it yourself if you want to and are skilled. The rules are super simple and easy to learn, but the game is tricky, with many opportunities to set traps for the opponent. It will probably come along on our next vacation because it's practical to play outdoors.

Because Google Authenticator annoys me: In-depth tutorial: How to set up 2FA TOTP with KeepassXC, Aegis and Authy. | Linux.org. KeePassXC is much nicer, in my opinion, and gives me far more control.

Prime Time for Prime Slime is the second of my "reinventing the past" deck lists. It is actually my first Commander precon deck - Mimeoplasm - just spiced up to 11. It's kind of funny how I stuck with the theme of the deck, transitioning it from the precon to Ooze tribal and later reanimator, then turned it into Muldrotha combo and back again to Mimeoplasm when the Prime Slime Secret Lair came out (I love the art style), going full-on Necrotic Ooze combo this time. It is a ton of fun to play, which is why some of the old stuff from my collection made its way into that deck. #EDH #MtG #MagicTheGathering

Kaalia of the Blast is one of my "let's recreate original Commander" decks. The idea is to stay close to the original structure - one commander as the focus of the deck, a supporting commander that could become the main one with some changes, and a big dragon in the same colors. And to have the game plan of one of the originals, too. So in a way an updated Kaalia with a bit more focus and dedication, but still Angel/Dragon/Demon beatdown. #EDH #MtG

Abdel, Agent of the Iron Throne is my latest high-power, near-competitive creation. The deck has a very linear combo plan with a lot of redundancy in the parts, and above all both combo element and combo payoff in the command zone. What is lacking in interaction due to the colors is replaced by resilience. I like linear combo because it gives the game a clear plan and, for example, makes mulligan decisions much easier.

Dargo for the Lulz is a deck that has positively surprised me with how strong it is. I would place it in a similar category as Godo, so definitely cEDH viable. Although the combo is not the commander alone, it only needs one more card (Phyrexian Altar) and it goes off if there are a few Treasures lying around or creatures to sacrifice. And the Plan B with Beatdown also works well.

Jhoira, Scrap That! is now my primary deck for cEDH and it has proven to be damn good so far. The deck actually always has a line, can react to almost everything and has good rebuild capabilities. Through multiple overlapping combos, it offers flexible ways to win even through Stax or Hate. And even if I don't win: the deck definitely leaves a lasting impression on the opponents.