Response Caching
Sema caches LLM responses so identical calls don't hit the API twice. The cache is persistent: responses are written to ~/.sema/cache/llm/ (one JSON file per entry, named by a SHA-256 key), so a re-run of a script — even in a new process — serves the answer recorded by an earlier run. An in-memory layer sits on top for the current session.
A call is a cache hit when its model, temperature, system prompt, and full message list all match a stored entry. :max-tokens and :tools are not part of the key.
Caching is off by default — turn it on for a block with llm/with-cache.
For replay that you commit and share (deterministic tests, offline demos), see Cassettes instead. They're a different tool: a cassette stores a tape next to your code rather than in your personal cache dir, and the response cache is turned off inside
llm/with-cassette.
Cache scope
llm/with-cache
Run a thunk with caching enabled for every LLM call inside it. The options map comes first when you pass one; with a single argument it's just the thunk. :ttl sets the time-to-live in seconds (default 3600). Previous cache settings are restored on exit.
;; thunk only
(llm/with-cache (lambda () (llm/complete "hello")))
;; with options — opts FIRST, then the thunk
(llm/with-cache {:ttl 7200} (lambda () (llm/complete "hello")))A cache hit costs nothing: it makes no provider call, so it reports zero token usage and spends nothing against a budget. The two calls below show a miss then a hit:
(llm/with-cache (lambda ()
(llm/complete "what is 2+2?") ; miss — calls the model, stores the answer
(llm/complete "what is 2+2?") ; hit — served from the cache, no API call
(llm/cache-stats))) ; => {:hits 1 :misses 1 :size 1}Inspection & debugging
llm/cache-key
Generate the SHA-256 cache key for a prompt and options — handy for debugging why two calls do or don't share a cache entry. Takes a prompt string and an optional options map.
(llm/cache-key "hello" {:model "gpt-4" :temperature 0.5})llm/cache-stats
Returns {:hits :misses :size}. Note that :size counts only the entries loaded into memory this session — a cold start can serve hits from disk before :size reflects them.
(llm/cache-stats) ; => {:hits 0 :misses 0 :size 0}Cache management
llm/cache-clear
Clear cached responses — both the in-memory entries and the files in ~/.sema/cache/llm/. Returns the number of entries cleared.
(llm/cache-clear) ; => 0