The prediction machine
The models are good now. The engineering discipline is managing certainty vs uncertainty in their output — knowing when to trust, when to verify, and how to push the certainty floor higher.
17 posts tagged
Vibe coding is great — until your project outgrows a single conversation. For complex, long-lived systems, spec-driven development with OpenSpec gives AI assistants deterministic input instead of fuzzy chat history.
What a network MCP server is, why multi-device correlation is the real value, and the hard problems: two paths for different use cases, command whitelists as a safety model, and auth delegation.
AI coding agents start every session with zero knowledge of your project. A project constitution fixes that. Here's a copy-paste template for Python projects using UV.
A hands-on guide to installing and configuring Claude Code for network engineering — from CLAUDE.md setup to encoding BGP best practices as a skill and generating device configs.
The architecture of an AI agent maps onto a traditional computer. The LLM is the CPU, the agent runtime is the OS, and skills, commands, and MCPs are the applications you install.
Prompting is the foundation, not the ceiling. Four cumulative disciplines — prompt craft, context engineering, intent engineering, and specification engineering — define how humans communicate with autonomous AI agents.
A deep technical analysis of draft-yang-nmrg-mcp-nm and its constellation of companion specs — what the IETF is proposing for MCP in network management, what holds up, and what's missing.
How a .ai/ directory and project constitution turned ad-hoc AI coding sessions into a repeatable engineering workflow.
Simon Willison is writing an evolving guide on agentic engineering patterns — not a blog post, more like a living book. It covers principles for working with coding agents, red/green TDD workflows, subagent patterns, and includes annotated prompt examples you can steal.
The framing around “writing code is cheap now” and the emphasis on building personal knowledge repos to feed into agents resonates. Worth bookmarking and checking back — he’s clearly adding to it over time.
Andrej Karpathy on the No Priors podcast talking about agents, AutoResearch, and what he calls the “loopy era” of AI.
Two things stood out. First, the Frontier Lab vs. Outside framing — frontier labs have massive trusted compute, but the Earth has far more untrusted compute. If you design the right verification systems (discover is expensive, verify is cheap), a distributed swarm of outside contributors could outpace closed labs. There’s something appealing about that asymmetry as a balancing force.
Second, AutoResearch — fully autonomous research loops where an agent edits training code, runs experiments, evaluates results, and commits improvements via Git. No human in the loop. In a 2-day run it executed ~700 experiments and found 20 real optimizations on a single GPU. The human role shifts to writing evaluation criteria and research prompts, not the code itself.
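The core of that loop is simple to sketch. Here's a minimal, hedged illustration of the shape Karpathy describes — propose a change (expensive), verify it against an eval (cheap), keep it only if it improves — not his actual code; every name and the toy "config" are mine:

```typescript
// Sketch of an autonomous research loop: discovery is expensive, verification is cheap.
type Config = number;

function autoresearchLoop(
  baseline: Config,
  propose: (c: Config) => Config,   // expensive: agent edits training code
  evaluate: (c: Config) => number,  // cheap: run the eval suite
  commit: (c: Config) => void,      // e.g. a git commit in the real setup
  budget = 100,
): [Config, number] {
  let best = baseline;
  let bestScore = evaluate(best);
  for (let i = 0; i < budget; i++) {
    const candidate = propose(best);
    const score = evaluate(candidate);
    if (score > bestScore) {        // keep only verified improvements
      best = candidate;
      bestScore = score;
      commit(candidate);
    }
  }
  return [best, bestScore];
}

// Toy stand-in: the "config" is a learning-rate guess; the score peaks at 0.1.
let seed = 42;
const rand = () => (seed = (seed * 1103515245 + 12345) % 2 ** 31) / 2 ** 31;
const evaluate = (lr: Config) => -Math.abs(lr - 0.1);
const propose = (lr: Config) => lr + (rand() - 0.5) * 0.1;
const [best, score] = autoresearchLoop(0.5, propose, evaluate, () => {});
```

The human work lives in `evaluate` — writing the criteria the loop optimizes against — which is exactly the shift in role the episode describes.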
Dwarkesh Patel and Dylan Patel (SemiAnalysis) got an exclusive tour of Microsoft’s Fairwater 2 datacenter with Satya Nadella. Each Fairwater building has hundreds of thousands of GB200s & GB300s, with over 2 GW of total capacity across the interconnected sites — a single building already outscales any other AI datacenter that exists today.
The interview covers how Microsoft is preparing for AGI across the full stack: business models, the CAPEX explosion turning Microsoft into a capital-intensive industrial company, in-house chip development, the OpenAI partnership structure, and whether the world will trust US companies to lead AI. Worth the full watch.
The Pragmatic Engineer interviewed Mitchell Hashimoto about his new way of writing code. The bit that stuck with me: always have an agent running in the background. Don’t wait for it to finish — kick off a task, context-switch to something else, come back when it’s done. Treat agents like background jobs, not pair programmers.
It’s a subtle shift but it changes how you structure your work. You stop thinking sequentially and start thinking in parallel — like managing async workers instead of typing code yourself.
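The async-worker analogy maps almost literally onto promises. A hedged sketch — `runAgent` is a hypothetical stand-in for launching a real agent session, not any actual API:

```typescript
// Illustrative only: a coding agent modeled as a background job.
async function runAgent(task: string): Promise<string> {
  // In reality: spawn an agent CLI, track the session, collect its diff.
  return `done: ${task}`;
}

async function workday(): Promise<string[]> {
  // Kick off every task up front -- don't await one before starting the next.
  const jobs = ["refactor auth module", "write migration tests"].map(runAgent);
  // ...context-switch to your own work here...
  return Promise.all(jobs); // come back and collect results when they're done
}
```

The structural point is that the `await` happens at the end, not after each task — the same discipline Hashimoto describes for agents.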
Worth watching: The 5 Levels of AI Coding. A solid framework for thinking about where you actually sit on the AI-assisted development spectrum — from basic autocomplete all the way to fully autonomous agents. Honest about what each level demands from the engineer and where the real productivity gains (and risks) live.
Most of us are somewhere in the middle and kidding ourselves about it. Good gut check.
Watched the Cisco Live 2026 Amsterdam opening keynote. Cisco is going full AI mode — no surprise there, but interesting to see how they’re positioning it across the portfolio. Will be following the rest of Cisco Live remotely to catch the technical sessions and see what’s actually substance vs. slide deck hype.
Testing Claude Code’s new Agent Teams feature — spinning up multiple specialized agents that work in parallel. The tradeoff is clear: an order of magnitude more speed, an order of magnitude more risk of technical debt. But having multiple domain experts collaborating (instead of one generalist context-switching) does help catch things.
Already used it to build out this Astro blog. Next experiment: ChessKids — an AI-powered chess tutorial for my 6-year-old daughter. Teaching chess to kids feels like a good test case for agentic workflows: visual, structured rules, incremental difficulty.
Migrated the blog from Ghost to a custom Astro site built entirely with Claude Code. Went with a Cisco CLI-inspired dark-mode design — syslog timestamps, IOS command prompts, terminal aesthetics. The goal was readability first while keeping the network engineering feel baked into the UI.
The whole process was surprisingly straightforward. Astro’s content collections handle MDX well, and having an LLM pair on layout and component decisions made it fast to iterate. Happy with where it landed.
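For anyone considering the same move: the content-collections setup is a small amount of config. A minimal sketch of a `src/content/config.ts` in the pre-Astro-5 style — the schema fields here are illustrative, not this blog's actual schema:

```typescript
import { defineCollection, z } from 'astro:content';

// Each .md/.mdx file in src/content/posts/ is validated against this schema
// at build time, so a missing date or title fails the build instead of the page.
const posts = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string(),
    date: z.date(),
    tags: z.array(z.string()).default([]),
    draft: z.boolean().default(false),
  }),
});

export const collections = { posts };
```

The build-time schema validation is a big part of why the migration felt safe: frontmatter mistakes surface immediately rather than rendering as broken pages.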