A Lisp with LLM primitives · Rust · MIT
Stop rewriting the agent loop.
Every LLM script grows the same scaffolding: retries, caching, cost caps, rate limits, tool dispatch, conversation state. Sema makes that scaffolding the runtime — your script stays the size of its idea, ships as a single binary, and your coding agent already speaks the language.
macOS · Linux · Windows · single static binary, no toolchain required
The argument
The same agent, twice.
A coding agent that reads files and runs commands, with a tool loop, retries, and a spend limit. Once with an SDK, once in Sema.
import anthropic, time

client = anthropic.Anthropic()

TOOLS = [{
    "name": "read_file",
    "description": "Read a file's contents",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}, {
    "name": "run_command",
    "description": "Run a shell command",
    "input_schema": {
        # ...same again
    },
}]

def call_with_retry(messages, attempt=0):
    try:
        return client.messages.create(
            model=MODEL, max_tokens=4096,
            tools=TOOLS, messages=messages)
    except anthropic.RateLimitError:
        if attempt > 5: raise
        time.sleep(2 ** attempt)
        return call_with_retry(messages, attempt + 1)

def dispatch(name, args):
    if name == "read_file":
        return open(args["path"]).read()
    if name == "run_command":
        # subprocess, capture stdout+stderr...

messages = [{"role": "user", "content": task}]
for turn in range(10):
    resp = call_with_retry(messages)
    track_cost(resp.usage)  # you wrote this too
    if resp.stop_reason != "tool_use":
        break
    results = []
    for block in resp.content:
        if block.type == "tool_use":
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": dispatch(block.name, block.input),
            })
    messages.append(...)
(deftool read-file
  "Read a file's contents"
  {:path {:type :string}}
  (lambda (path) (file/read path)))

(deftool run-command
  "Run a shell command"
  {:command {:type :string}}
  (lambda (command)
    (:stdout (shell "sh" "-c" command))))

(defagent coder
  {:system "You are a coding assistant."
   :tools [read-file run-command]
   :max-turns 10})

(llm/with-budget {:max-cost-usd 0.50}
  (lambda ()
    (agent/run coder "Find TODOs in src/")))
“Wait — a Lisp?”
You won't write most of it anyway.
Your coding agent will. And a Lisp is the language with the least surface for an agent to be wrong about.
- Sixty years of training data. Lisp predates nearly everything else in the corpus. Scheme, Common Lisp, Clojure, Racket — your agent has read all of it, and a Lisp is a Lisp.
- Nothing to hallucinate. One syntax rule. No borrow checker, no venv, no lockfiles, no build config, no framework versions that drifted since training. The agent can't misremember machinery that doesn't exist.
- The whole language fits in context. Point your agent at one short page — where Sema diverges from the dialects it already knows, and nothing else. Constraints, not a textbook.
- Errors self-correct. Dialect drift is the shallow kind of wrong: “oh, it's equal? here, not string=?” One check, one fix, moving on.
Sema is LLM-native in both directions: LLMs are primitives in the language — and the language is a target LLMs write without special training.
The other fair questions
“Why not just—”
…a Python script with the SDK?
That's where everyone starts, and it's fine — until the script matters. Then you bolt on retries, then a cache so dev runs stop costing money, then cost tracking, then the second provider. The scaffolding ends up bigger than the idea.
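For a sense of scale, here is a minimal sketch of just two of those bolt-ons, a spend cap and a response cache, in plain Python. Budget, TTLCache, and cached_call are hypothetical names, call_model stands in for the real SDK call, and the per-call cost is assumed to be known up front; a real script would have to derive it from token usage.

```python
import time


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spend past the cap."""


class Budget:
    # Hypothetical helper: tracks spend and enforces a hard cap.
    def __init__(self, max_cost_usd):
        self.max_cost_usd = max_cost_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd):
        if self.spent_usd + cost_usd > self.max_cost_usd:
            raise BudgetExceeded(
                f"cap {self.max_cost_usd:.2f} USD would be exceeded")
        self.spent_usd += cost_usd


class TTLCache:
    # Hypothetical helper: in-memory cache whose entries expire after ttl_seconds.
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (stored_at, value)

    def get_or_call(self, key, fn):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]  # fresh entry: skip the call entirely
        value = fn()
        self.entries[key] = (now, value)
        return value


def cached_call(prompt, call_model, cache, budget, cost_usd):
    """Return a cached response, or charge the budget and call the model."""
    def paid_call():
        budget.charge(cost_usd)  # assumption: cost is estimated before the call
        return call_model(prompt)
    return cache.get_or_call(prompt, paid_call)
```

That is about forty lines before retries, rate limits, or a second provider enter the picture, and none of it is the idea the script was written for.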
In Sema those are forms, not code you maintain: llm/with-cache, llm/with-budget, llm/with-fallback, defagent.
…a framework like LangChain?
Frameworks stack abstractions on a language that wasn't built for them — so a "chain" is a class, a prompt is a template object, a conversation is hidden inside an opaque memory wrapper.
Sema makes them language constructs instead. A conversation is an immutable value you can fork, diff, and inspect. A prompt is an s-expression. A tool is a lambda with a schema. There's nothing to wrap, because nothing is foreign.
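To make "an immutable value you can fork, diff, and inspect" concrete, here is the idea sketched in Python so it runs anywhere. This illustrates the concept only and is not Sema's implementation; the Conversation class, its say method, and the diff helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    # Hypothetical type: messages are (role, content) pairs in an immutable tuple.
    messages: tuple = ()

    def say(self, role, content):
        # Returns a new Conversation; the original is never modified.
        return Conversation(self.messages + ((role, content),))


def diff(a, b):
    """Return the messages of a and b that follow their longest shared prefix."""
    shared = 0
    for x, y in zip(a.messages, b.messages):
        if x != y:
            break
        shared += 1
    return a.messages[shared:], b.messages[shared:]


# Forking is just calling say twice on the same base value.
base = Conversation().say("user", "Summarize this repo.")
terse = base.say("user", "In one sentence.")
detailed = base.say("user", "In five bullet points.")
```

Because base is never mutated, both branches can be kept, compared with diff, or discarded without any defensive copying.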
The runtime, in one screen
Everything you'd otherwise hand-roll.
(llm/with-budget {:max-cost-usd 1.00} f)        hard spend cap, scoped to a block
(llm/with-cache {:ttl 3600} f)                  response cache; dev loops stop costing money
(llm/with-fallback [:anthropic :openai] f)      provider failover, in order
(llm/extract {:amount {:type :number}} text)    typed maps back, not strings to re-parse
(conversation/say conv "...")                   immutable history: fork it, replay it, inspect it
(llm/pmap prompt-fn items)                      parallel batch over a collection

Eleven providers, configured from environment variables: set the key and go.
Browse the LLM reference →
Then ship it
One file out the other end.
The part Python never solved. No virtualenv on the server, no dependency pinning, no container just to run a script.
- Standalone executables. sema build traces your imports, bundles assets, and emits a self-contained binary.
- Capability sandbox. --sandbox fences shell, filesystem, network, and LLM access per group.
- Starts in milliseconds. Fast enough for a git hook, a cron job, or a CI step: no JVM tax, no import dance.
Read this before adopting
Where Sema won't fit.
Knowing the boundaries up front beats discovering them in production.
- Single-threaded. Rc-based values, no cross-thread sharing. Parallelism is at the LLM-call level, not the compute level.
- No JIT. A bytecode compiler and a stack-based VM. If your bottleneck is number crunching, use Rust — or embed Sema in it.
- Not a full Scheme. No numeric tower, no call/cc, auto-gensym instead of syntax-rules.
- Young. Solid and tested, not battle-hardened at scale. Pin a version; read the changelog.
Your next LLM script, without the scaffolding.
Install it — or skip the tutorial and hand the docs to your agent.