February 17, 2026

Agent Lightning

project

Microsoft Research · 2025

  • Decouples agent optimization from agent execution via a sidecar that collects traces non-intrusively — the same separation of concerns that Unix process boundaries already enforce between a program and its observer
  • The sidecar pattern is tee for training data: watch what the agent does, capture the interaction traces, optimize separately — no modification to the agent’s own code or workflow
  • Proves that optimization is orthogonal to orchestration; if your agent is a shell loop, the traces are already in your filesystem (stdout logs, exit codes, tool call records) and the training pipeline is just another downstream consumer
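
The pattern reduces to plain Unix plumbing. A minimal sketch, with run_agent standing in for any agent process and assuming nothing about Agent Lightning's actual trace format:

```shell
# run_agent stands in for the real agent; its behavior is untouched.
run_agent() {
  echo 'tool_call: grep -c TODO notes.txt'
  echo 'tool_result: 3'
}

run_agent | tee traces.log >/dev/null    # sidecar: capture without modifying the agent
grep -c '^tool_call' traces.log          # downstream consumer: count collected interactions
```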

AgentFS — The Missing Abstraction for AI Agents

post

Turso · November 2025

  • Everything an agent does — files, state, tool calls — lives in a single SQLite database exposed as a POSIX filesystem; the abstraction is not a new API, it is the filesystem itself
  • FUSE support lets agents use git, grep, and standard Unix tools directly against their state store with zero integration code; the trust boundary is the mount point, not a permission model in application code
  • Makes agent state portable (one file), auditable (SQL queries over history), and composable (multiple agents share a filesystem with conflict resolution) — the same properties Unix gives processes via /tmp and pipes
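
The "one portable, queryable file" property can be sketched with the sqlite3 CLI; the table layout below is illustrative, not AgentFS's actual schema:

```shell
# A single SQLite file as the agent's state store (schema is a stand-in).
db=agent.db
sqlite3 "$db" 'CREATE TABLE IF NOT EXISTS files(path TEXT, mtime INT, content TEXT);'
sqlite3 "$db" "INSERT INTO files VALUES('/notes/plan.md', 1700000000, 'draft');"
# Auditing is a SQL query, not a bespoke API:
sqlite3 "$db" 'SELECT path FROM files ORDER BY mtime DESC LIMIT 1;'
```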

Bash One-Liners for LLMs

post

Justine Tunney · December 2023

  • Treats LLMs as standard Unix filters: pipe data in via stdin, get structured output on stdout, chain with sed, curl, and the links text browser — the model is just another composable process
  • Uses --temp 0 to make LLM output deterministic, turning a stochastic model into a reproducible Unix tool suitable for scripting and automation
  • Demonstrates that llamafile turns an LLM into a single-file executable callable from bash — no Python, no framework, no daemon; the filesystem is the package manager
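
The filter pattern in miniature, with tr standing in for a real llamafile --temp 0 invocation so the sketch stays runnable and, like a zero-temperature model, deterministic:

```shell
# model is a placeholder for something like: ./llamafile --temp 0 ...
model() { tr '[:lower:]' '[:upper:]'; }
echo 'ship the patch' | model | sed 's/ /_/g'   # the model is just another stage in the pipe
```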

llm-functions

project

sigoden · 2024

  • Defines LLM tools as plain Bash functions with structured comments — the tool schema is generated automatically from the script itself, no SDK or serialization layer needed
  • Agents are composed from tools + prompts + documents, assembled at the filesystem level; adding a capability means dropping a shell script into a directory
  • Proves that function calling does not require a framework: a shell function, a naming convention, and a comment block are sufficient for an LLM to discover and invoke a tool
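
An illustrative tool in this style; the # @describe / # @option comment tags follow the project's argc-based convention, but treat the exact tag names and the argc_path variable as assumptions:

```shell
# @describe Count the words in a text file.
# @option --path! File to count.
word_count() {
  wc -w < "$argc_path"
}

# Simulated invocation (argc would normally parse --path into $argc_path):
printf 'one two three\n' > sample.txt
argc_path=sample.txt word_count
```

The schema the LLM sees is derived from those comments; dropping this script into the tools directory is the whole integration.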

From Commands to Prompts: LLM-based Semantic File System for AIOS

paper

Shi, Mei, Zhang et al. · 2025

  • Proposes replacing shell commands with natural-language prompts that compile down to the same POSIX file operations — the filesystem API is the stable interface, whether the caller is a human or an LLM
  • Demonstrates 15%+ retrieval accuracy gains and 2.1x speed improvement over traditional file systems by adding a semantic index layer, while preserving full POSIX semantics underneath
  • Includes safety mechanisms (confirmation before destructive ops, rollback) that map exactly to the trust-gradient argument: the OS already has the permission model, the agent just needs to respect it
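
The compile-down idea in miniature; the mapping below is illustrative, not the paper's actual compiler:

```shell
mkdir -p drafts
touch -t 202001010000 drafts/old.md          # simulate a stale file
# Intent: "delete drafts older than 30 days" compiles to POSIX file ops:
find drafts -name '*.md' -mtime +30 -print   # print first; destructive -delete only after confirmation
```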

smolagents

framework

Hugging Face · December 2024

  • The entire library is roughly 1,000 lines of code — a deliberate rejection of the sprawling framework approach; minimal abstraction means you can read the whole agent runtime in one sitting
  • Code agents complete tasks in roughly 30% fewer steps than JSON tool-calling agents because code is inherently composable: you can nest calls, define variables, and loop — the same properties that make shell scripts powerful
  • The core design insight maps directly to the shell thesis: agents should write executable actions (code), not describe desired actions (JSON) — the agent is a script
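
The composability claim in its smallest form: one code action nests calls, binds a variable, and loops, where JSON tool calling would need a model round trip per step. File names and contents are illustrative:

```shell
total=0
for f in a.txt b.txt; do
  printf 'alpha beta gamma\n' > "$f"   # stand-in tool output
  n=$(wc -w < "$f")                    # nest a call, bind a variable
  total=$((total + n))                 # accumulate without a round trip
done
echo "$total"
```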

The Unreasonable Effectiveness of an LLM Agent Loop with Tool Use

post

sketch.dev · May 2025

  • The entire agent pattern reduces to a 9-line while loop: read input, call tool, feed output back — this is a read-eval-print loop, the same pattern shells have used for fifty years
  • With just one general-purpose tool — bash — current models can solve many problems in a single shot; the agent does not need a framework, it needs a shell
  • Argues custom agent loops will replace tasks “too specific for general tools and too unstable to automate traditionally” — the exact niche shell scripts have always filled
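
The loop itself, sketched in shell with a stubbed model so it runs offline; model_reply stands in for the LLM call, everything else is the real shape:

```shell
model_reply() {                # stub: ask for one bash command, then stop
  [ -s transcript ] && echo 'DONE' || echo 'RUN: echo hello'
}
: > transcript
while :; do
  reply=$(model_reply)                               # read
  case "$reply" in
    RUN:*) sh -c "${reply#RUN: }" >> transcript ;;   # eval: execute the one tool, bash
    DONE)  break ;;                                  # model decides it is finished
  esac
done
cat transcript                                       # print
```

Swap model_reply for a real API call and the stub becomes the whole agent.
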
February 16, 2026

agent-browser

project

GitHub / Vercel · 2026

  • Browser automation CLI that reduces context usage by 93% through a “snapshot + refs” system — elements get short labels (@e1, @e2) instead of dumping the full accessibility tree into the LLM
  • Three-layer architecture (Rust CLI → Node.js daemon → Playwright) that looks like any other Unix tool from the agent’s perspective: commands in, structured output out
  • Same thesis as Playwright CLI but pushed further — the agent controls a browser through shell commands, reinforcing that tool use is just command execution with minimal data passing
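
The snapshot-plus-refs idea in one pipeline; the element list and labels are illustrative, not agent-browser's actual output:

```shell
# Replace a verbose element tree with short handles the model can act on.
printf '%s\n' 'button "Submit"' 'link "Docs"' 'textbox "Email"' \
  | awk '{ printf "@e%d %s\n", NR, $0 }'
```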

Building Effective Agents

post

Anthropic · December 2024

  • Argues the most effective agent architectures are augmented LLMs with simple tool loops, not multi-agent frameworks
  • Distinguishes “workflows” (predetermined tool orchestration) from “agents” (model-directed tool use) — both reduce to tool loops at different autonomy levels
  • Recommends starting with the simplest implementation and adding complexity only when measurably needed

Playwright CLI

project

GitHub / Microsoft · 2025

  • Browser automation as a CLI instead of MCP — agents discover commands from help output rather than tool schemas, proving that shell conventions are sufficient for tool integration
  • Deliberately “token-efficient” by not forcing page data into the LLM context, which is the Unix philosophy applied to agents: do one thing, pass minimal data between steps
  • Validates the thesis that agent tool use reduces to command execution — browser automation is just another program the model shells out to
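
The discovery pattern, with a stand-in CLI: the agent reads the help text instead of a schema file. The tool and its parsing are illustrative:

```shell
mytool() { echo 'usage: mytool open|click|snapshot'; }   # stand-in for a real CLI
mytool --help | awk '{ print $3 }' | tr '|' '\n'         # enumerate subcommands from help output
```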

Taste Is Not a Moat

post

sshh.io · 2026

  • Argues that taste is “alpha” (a decaying edge), not a “moat” — as AI baselines improve every few months, individual judgment only matters relative to what the tools do by default
  • Reframes the human role as “taste extractor”: articulating tacit preferences so tool loops can operationalize them, which is exactly the shell pattern of encoding intent into composable commands
  • Proposes concrete extraction techniques (A/B interviews, ghost writing, external reviews) that all reduce to the same structure — a human-in-the-loop refining outputs through iterative feedback cycles
February 15, 2026

Model Context Protocol (MCP)

protocol

Anthropic · November 2024

  • Open protocol for connecting AI assistants to external data sources and tools through a standardized JSON-RPC interface
  • Servers expose tools, resources, and prompts; clients (LLMs) discover and invoke them — the AI equivalent of USB-C for context
  • Keeps tool integration composable: each server is a single-purpose process, orchestrated by the model’s own tool loop
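
The shape of one exchange, a JSON-RPC 2.0 request the client sends to invoke a server-side tool; the tool name and arguments are illustrative:

```shell
mcp_call_request() {
  cat <<'EOF'
{"jsonrpc": "2.0", "id": 1, "method": "tools/call",
 "params": {"name": "read_file", "arguments": {"path": "notes.txt"}}}
EOF
}
mcp_call_request
```
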
February 14, 2026

ReAct: Synergizing Reasoning and Acting in Language Models

paper

arXiv · October 2022

  • Interleaves chain-of-thought reasoning traces with concrete actions in an observe-think-act loop
  • Outperforms pure reasoning (chain-of-thought) and pure acting (action-only) on knowledge-intensive tasks by grounding thoughts in tool outputs
  • Foundational pattern behind most modern agent frameworks — the shell-like “read, eval, print” loop applied to LLMs
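
The trace shape the paper introduces, with illustrative content; each thought is grounded by the observation that follows its action:

```shell
react_trace() {
  cat <<'EOF'
Thought: I need the capital of France before answering.
Action: search[capital of France]
Observation: Paris is the capital of France.
Thought: That settles it.
Action: finish[Paris]
EOF
}
react_trace
```
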
February 13, 2026

Toolformer: Language Models Can Teach Themselves to Use Tools

paper

arXiv · February 2023

  • Demonstrates that language models can learn when and how to call external tools (calculator, search, calendar) through self-supervised training
  • The model inserts API calls into its own text generation when doing so reduces perplexity — tool use emerges from utility, not instruction
  • Shows that tool augmentation is a natural extension of next-token prediction, not a bolted-on capability
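
The inline-call format in miniature, with illustrative content: the model emits an API call mid-generation and the result is spliced back into the text stream:

```shell
toolformer_line() {
  echo 'Out of 1400 participants, 400 [Calculator(400 / 1400) -> 0.29] passed the test.'
}
toolformer_line
```
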
February 12, 2026

LangChain

framework

GitHub · October 2022

  • Framework for composing LLM calls with tools, memory, and retrieval into multi-step chains and agents
  • Popularized the “chain” abstraction — sequential LLM calls where each step’s output feeds the next — and the “agent” pattern with dynamic tool selection
  • Useful as a reference for what complexity emerges when tool loops scale; argues for the shell thesis by showing what happens without simplicity constraints
February 11, 2026

Anthropic Tool Use Documentation

docs

Anthropic Docs · 2024

  • Reference for Claude’s native tool-use interface: define tools as JSON schemas, the model emits structured tool_use blocks, you execute and return results
  • The interaction pattern is a synchronous tool loop — exactly the shell paradigm of prompt → command → output → prompt
  • Supports forced tool use, parallel tool calls, and streaming, showing how the simple loop extends without changing its fundamental shape
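
The minimal shape of a tool definition per these docs (name, description, input_schema as JSON Schema); the weather tool itself is illustrative:

```shell
tool_def() {
  cat <<'EOF'
{
  "name": "get_weather",
  "description": "Get the current weather for a city",
  "input_schema": {
    "type": "object",
    "properties": { "city": { "type": "string" } },
    "required": ["city"]
  }
}
EOF
}
tool_def
```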