AI Agent Engineering
Overview
Building production AI agents is still mostly unsolved engineering. The core architecture is deceptively simple - a tool-calling loop in ~200 lines - but production reality requires context management, agent coordination, and infrastructure that's still being figured out. This note collects sources and patterns relevant to the Agency of Agents essay.
The Core Architecture
An HN discussion (citing Thorsten Ball's "How to Build an Agent") revealed that the fundamental agent loop is just: send prompt + tools to LLM, execute tool calls, loop with updated context. The hard part isn't the loop - it's everything around it. Dynamic TODO lists prevent premature termination (disabling them drops performance by "1-2 grade jumps"). Context management is routinely underestimated - one commenter noted that DIY builders often "lose the rest of the year" on that step alone.
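The loop described above can be sketched in a few lines. This is a minimal illustration, not any provider's actual API: `call_model` stands in for an LLM request that returns either a tool call or a final answer, and the message format is an assumption.

```python
# Minimal sketch of the core agent loop: send context to the model,
# execute any requested tool call, append the result, and repeat.
def run_agent(call_model, tools, prompt, max_steps=10):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_steps):
        reply = call_model(messages, tools)
        if reply.get("tool"):                       # model wants a tool
            name, args = reply["tool"], reply.get("args", {})
            result = tools[name](**args)            # execute the tool call
            messages.append({"role": "tool", "name": name, "content": result})
        else:                                       # model is done
            return reply["content"]
    raise RuntimeError("agent exceeded max_steps without finishing")
```

Everything the sources above describe - TODO lists, context management, reinforcement - lives in what gets put into `messages` before each `call_model`, not in the loop itself.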
Production Lessons
Armin Ronacher's practical lessons from building agents are worth reading in full. The highlights: skip high-level SDK abstractions and target provider SDKs directly (Anthropic, OpenAI) because model differences are too significant. Manage caching explicitly rather than relying on platform magic. Use reinforcement (injecting reminders after each tool call) as the dynamic guidance mechanism. Isolate failures in subagents to avoid polluting the main context. Use a shared filesystem as the coordination layer between tools.
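The reinforcement idea - injecting a reminder after each tool call - can be sketched as a small helper. The message shapes and the reminder wording here are my assumptions, not Ronacher's code:

```python
# Sketch of the "reinforcement" pattern: after every tool result, append a
# standing reminder of the current objective so the agent doesn't drift.
def with_reinforcement(messages, tool_result, objective):
    messages = list(messages)  # don't mutate the caller's history
    messages.append({"role": "tool", "content": tool_result})
    messages.append({
        "role": "system",
        "content": f"Reminder: your current objective is: {objective}",
    })
    return messages
```

The point is that guidance is re-asserted dynamically on every loop iteration rather than stated once in the system prompt and forgotten.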
Parallel Agent Teams at Scale
Nicholas Carlini at Anthropic demonstrated 16 Claude instances working in parallel to build a 100,000-line C compiler that boots Linux - nearly 2,000 sessions, $20K in costs, 2 weeks of runtime. Key patterns: infinite loop harness in containers, git-based synchronization with task locks via text files, fresh container per session, specialized agent roles (code quality, performance, documentation), and no orchestration agent - emergent coordination via shared state. The most important insight: the test harness design matters more than the model. Source: github.com/anthropics/claudes-c-compiler.
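The "task locks via text files" pattern is worth making concrete. A hypothetical sketch (file names and layout are assumptions): each agent claims a task by atomically creating a lock file in a shared, git-synced directory.

```python
import os

# Sketch of a text-file task lock: an agent claims a task by creating a
# lock file atomically. O_CREAT | O_EXCL fails if the file already exists,
# so exactly one agent wins each task.
def claim_task(lock_dir, task_id, agent_id):
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"{task_id}.lock")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False          # another agent already claimed this task
    with os.fdopen(fd, "w") as f:
        f.write(agent_id)     # record the winner for debugging
    return True
```

With the lock files committed to git, coordination emerges from shared state alone - no orchestration agent needed, exactly as the compiler project reports.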
Agent Orchestration
In May 2025, before agent orchestration became widely discussed, I predicted the category:
Agents of agents - with Nimble Claude, I put Claude Code in a docker container and wrote an API around it that lets you send in commands from Slack or anywhere. Because he's in a docker container, he has a sandbox to create files and execute programs. I gave him his own access to GitHub and Heroku as well. The thought is: what if Nimble Claude had a tool that let him spin up/down other Claudes and then send them instructions? That would be a coding and infrastructure challenge, but I don't think it's terribly difficult. I'm not sure yet how I would position this as a product. It could just be selling access to a tool that does this or it could enable a particular workflow in a specific domain. I'd call this an "ahead of the curve" idea. I think people are probably already working on it and we'll see hype/a big product category in the next year or so - whenever agents become trustworthy enough to let loose on entire projects and coordinate with each other. There's a big safety and cyber security question in the middle of this that would be interesting to pontificate about.
- William Huster, May 2025
The Nimble Claude setup itself (March 2025) - a containerized agent with sandbox, GitHub/Heroku access, and Slack as the interface - predated and predicted the core pattern that Ramp's Inspect, Stripe's Minions, OpenClaw, and the background-agents framework all independently converged on months later. By February 2026, the orchestration prediction has also materialized in frameworks with sessions_spawn, subagent management, and isolated sessions. The key open questions I identified - safety, cybersecurity implications, and product positioning - remain very relevant.
Swarm-Native Architectures
Random Labs' Slate represents a different approach to orchestration: swarm-native rather than message-passing. Released March 2026, Slate is the first frontier agent to use a code environment (TypeScript DSL) for direct subagent orchestration. The key innovation is threads - not isolated subagent contexts (as in traditional multi-agent systems), but shared work streams that can be composed and delegated.
The threading model: Rather than isolating subagent context, Slate genuinely shares it with the main orchestration thread. The main agent "programs in action space" using the TypeScript DSL, delegating tactical work to threads one operation at a time. Each thread maintains its own "RAM" (borrowing Andrej Karpathy's LLM OS terminology), but the main thread retains visibility into thread state and can compose threads into complex working behaviors.
Why this matters for verification: The threading architecture directly addresses Verification Complexity. By delegating simple tactical actions to threads one at a time, it creates "an almost perfect boundary over which we can compress the context." This compression is the key to tractable verification - instead of verifying the full combinatorial explosion of agent interactions, you verify at thread boundaries. The system retains only the tool calls that contribute to success (episodic memory), filtering context to verified correct actions.
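The episodic-memory compression can be illustrated with a toy filter. The record shape (`call`/`ok` fields) is my assumption for illustration, not Slate's actual data model:

```python
# Sketch of episodic-memory compression: after a thread finishes, keep
# only the tool calls marked as contributing to a verified-correct result
# and drop the failed detours, so the parent thread sees a compressed
# transcript instead of the full exploration history.
def compress_thread(records):
    return [r["call"] for r in records if r.get("ok")]
```

Verification then happens over the short compressed transcript at the thread boundary, not over every action the thread ever attempted.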
Context engineering: Slate's "novel context engineering" maximizes caching through subthread reuse while keeping costs tractable. The architecture separates strategic knowledge (what high-level work to do) from tactical knowledge (how to execute specific operations). This separation lets different model tiers handle different verification loads: frontier models (Sonnet, Opus) orchestrate the swarm, while smaller/cheaper models (Codex, GLM, Haiku) execute bounded tactical operations. Slate automatically selects the right model for each job.
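The strategic/tactical split amounts to a routing decision. Slate's actual selection logic is not public; this is only a toy sketch of the shape of such a router, with placeholder tier names:

```python
# Toy sketch of strategic/tactical model routing: orchestration work goes
# to a frontier model, bounded tactical operations to a cheaper one.
TIERS = {
    "strategic": "frontier-model",   # plan, decompose, verify results
    "tactical": "small-model",       # execute one bounded operation
}

def select_model(task):
    kind = "strategic" if task.get("requires_planning") else "tactical"
    return TIERS[kind]
```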
Comparison to prior work: The threading approach shares principles with Cognition (Devin), Fundamental (formerly Altera), and ManusAI - all separate high-level strategy from low-level delegation and compress lower-level context for the strategizing agent. The difference: Slate's threads use the TypeScript DSL as the coordination layer rather than natural language message-passing. This makes orchestration logic auditable code rather than emergent conversation, shifting some verification burden from runtime to review-time.
Empirical results: A less flexible version of Slate's architecture passed 2/3 tests on the make-mips-interpreter task (Terminal Bench 2.0) - a task that Opus 4.5 and 4.6 solve <20% of the time in most harnesses. The team emphasizes they "do not believe in benchmaxxing," but the result suggests architectural choices (threading, strategic/tactical separation, context compression) may matter more than raw model capability for complex tasks.
Convergent evolution: Slate's threading architecture was developed independently but converged with RLM (by @a1zhang and @lateinteraction) on core ideas: use a REPL to decompose tasks into known operations, letting the model think strategically about the execution graph rather than being overwhelmed by context. The team introduces two useful terms: knowledge overhang (knowledge the model has but doesn't use during task execution) and expressivity (the interplay between interface expressiveness and the model's bias to use it). The TypeScript DSL is a deliberate expressivity trade-off - less flexible than natural language, but more amenable to verification.
Open question: Does swarm-native orchestration solve the trust gap identified above, or just push it to a different layer? The threading model makes verification boundaries explicit, but verifying the TypeScript orchestration logic itself is still a human review task. As William noted in the Trust Gap discussion: "launching agents in parallel is the easy part - the hard part is trusting them." Slate's contribution is making the trust boundary auditable code rather than opaque agent interactions.
- Claude (AI Assistant), March 2026
The Trust Gap
Orchestration tools like Conductor and claude-squad let you run parallel coding agents in isolated workspaces. But launching agents in parallel is the easy part - the hard part is trusting them. And there's a structural limit to parallelism:
The design → code → test loop is still fundamentally serial per feature. You can only use 100s of agents when your problem is embarrassingly parallel.
- @whusterj, February 2026
The Factory Model
The enterprises solving the trust gap are building proprietary agent infrastructure - what background-agents.com calls "the self-driving codebase." The pattern: isolated sandboxes, event-driven triggers, deterministic governance layers, and human review gates. Background agents "receive a trigger, reason about the problem, write code, run tests, and open a pull request" autonomously, excelling at repetitive, well-defined tasks with bounded blast radius.
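The trigger-to-PR pattern with a deterministic governance gate can be sketched as a small pipeline. All step functions here are injected stand-ins, not any company's actual infrastructure:

```python
# Sketch of the background-agent pattern: an event-driven pipeline where
# free-form agent steps (reason, write_code) are wrapped in a deterministic
# gate (tests must pass before a PR is opened).
def background_agent(trigger, reason, write_code, run_tests, open_pr):
    plan = reason(trigger)
    patch = write_code(plan)
    if not run_tests(patch):            # deterministic governance gate
        return {"status": "blocked", "reason": "tests failed"}
    return {"status": "pr_opened", "pr": open_pr(patch)}
```

The key property is that the agent cannot reach the PR step except through the deterministic gate - the "bounded blast radius" lives in the pipeline structure, not in the model.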
Ramp's Inspect (builders.ramp.com) writes ~30% of all PRs merged to their frontend and backend repos. Key insight: "Owning the tooling lets you build something significantly more powerful than an off-the-shelf tool will ever be." Agents access the same tools engineers use (Sentry, Datadog, LaunchDarkly, GitHub, Slack), run in Modal sandboxes with 30-minute repo rebuilds, and can spawn nested child sessions for parallel research. Multiplayer-first design lets teams collaborate in a live session.
Stripe's Minions (stripe.dev) merges 1,300+ PRs per week with zero human-written code. The "one-shot" model goes from Slack message to CI-passing PR with no human interaction in between. Architecture: devbox sandboxes (10-second spin-up), "blueprint" orchestration that interleaves deterministic nodes (git, linting, testing) with free-flowing agent nodes, and a "Toolshed" of ~500 MCP tools for internal context. Agents get an intentionally small subset of tools - deliberate constraint, not unlimited capability. Part 2 details the blueprint pattern and scaling lessons.
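The interleaving of deterministic and agent nodes is the interesting structural idea. Stripe's internal blueprint format is not public; this is only a guess at the shape, with node kinds as an assumption:

```python
# Sketch of blueprint-style orchestration: a pipeline that interleaves
# deterministic nodes (git, lint, test - which fail loudly) with
# free-flowing agent nodes (which transform state however they like).
def run_blueprint(nodes, state):
    for kind, fn in nodes:
        if kind == "deterministic":
            fn(state)             # checker step: raises on failure
        else:
            state = fn(state)     # agent step: transforms state freely
    return state
```

Determinism at the checkpoints is what makes agent freedom between them safe: the agent nodes can do anything, but the pipeline only advances past states the deterministic nodes accept.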
The common thread: these companies didn't adopt off-the-shelf orchestration tools. They built proprietary process definitions around their codebases - exactly the "smaller LLMs + proprietary process definitions" model predicted in the Trust Gap discussion above.
Agent Threads as Literate Programming
Gustav van Rooyen argues that agent conversation threads are the modern embodiment of Knuth's literate programming. Where Knuth's 1984 WEB system "wove" documentation and "tangled" code from the same source, agent threads capture both intent (user prompts) and implementation (agent reasoning + code) in a persistent, shareable format. Best practice emerging: one thread, one git commit - every line of code annotated with its reasoning.
Beyond LLMs: The Alberta Plan
Rich Sutton, Michael Bowling, and Patrick Pilarski's "The Alberta Plan for AI Research" argues for an agent-centric approach to AI grounded in reinforcement learning rather than language modeling. The core premise: intelligence emerges from agents that continuously interact with a complex world, learning to predict and control their sensory input over time. The agent architecture decomposes into perception (situational state), policy (state → action), value function (state → expected reward), and transition model (enabling planning). The plan extends this with feature-based subtasks and temporally extended options - a fundamentally different path from scaling transformer architectures. If LLMs hit the complexity walls described in Verification Complexity, Sutton's agent-first framework may point toward the architectural shift needed to break through.
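The four-part decomposition can be written down as a skeleton to make the contrast with the tool-calling loop concrete. The signatures here are illustrative interpretations of the paper's components, not code from the Alberta Plan:

```python
from dataclasses import dataclass
from typing import Callable

# Sketch of the Alberta Plan's agent decomposition: perception builds
# situational state from observations, the policy maps state to action,
# the value function scores states, and the transition model predicts the
# next state so the agent can plan.
@dataclass
class AlbertaAgent:
    perception: Callable   # observation -> situational state
    policy: Callable       # state -> action
    value: Callable        # state -> expected reward
    model: Callable        # (state, action) -> predicted next state

    def step(self, observation):
        state = self.perception(observation)
        action = self.policy(state)
        predicted = self.model(state, action)   # planning signal
        return action, self.value(predicted)
```

Note what is absent compared to the LLM agent loop above: no prompt, no context window - just continual state, action, and reward.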
Supporting Infrastructure
Telos (Daniel Miessler) provides structured templates for "Deep Context" - mission, goals, strategies, KPIs - that agents need to make aligned decisions. As AI becomes more capable, the bottleneck shifts from "what can AI do?" to "what should AI do in this specific context?"
Walkie enables zero-infrastructure P2P encrypted communication between agents using Hyperswarm DHT. No servers, no IP addresses - just a shared channel name and secret. Interesting primitive for multi-agent coordination, though production readiness (persistence, reliability) remains unclear.
The Inflection Point
SemiAnalysis argues Claude Code marks the "Web 2.0 moment" for AI: the shift from selling tokens (call-and-response) to orchestrating tokens into outcomes (agentic workflows). Claude Code accounts for 4% of GitHub public commits, projected to hit 20%+ by end of 2026. METR data shows autonomous task horizons doubling every 4-7 months. The pattern extends beyond coding: any READ-THINK-WRITE-VERIFY workflow is automatable, touching ~33% of the global workforce. Current adoption: 84% of developers use AI but only 31% use agents - the gap is where the next wave hits.