LLM Primitives
Sema's differentiating feature: LLM operations are first-class language primitives with prompts, conversations, tools, and agents as native data types.
Setup
Set one or more API keys as environment variables:
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
# or any other supported provider
Sema auto-detects and configures all available providers on startup. Use --no-llm to skip auto-configuration.
See Provider Management for the full list of supported providers and configuration options.
Features
Completion & Chat
Simple completions, multi-message chat, and streaming responses.
Prompts & Messages
Prompts as composable s-expressions, message construction, and prompt inspection.
Conversations
Persistent, immutable conversation state with automatic LLM round-trips.
Tools & Agents
Define tools the LLM can invoke, and build agents with system prompts, tools, and multi-turn loops.
Embeddings & Similarity
Generate embeddings (as bytevectors), compute cosine similarity, and access embedding dimensions.
Structured Extraction
Extract structured data from text and images, classify inputs, and work with multi-modal content.
Vector Store & Math
In-memory vector store for semantic search, plus vector math utilities (cosine similarity, dot product, normalize, distance).
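The vector math named above can be sketched in a few lines. This is a plain Python illustration of the underlying arithmetic, not Sema's API: the function names, the use of Python lists instead of bytevectors, the choice of Euclidean distance, and the dict-based `search` helper are all assumptions made for the example.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in a]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return dot(a, b) / (norm(a) * norm(b))

def euclidean_distance(a, b):
    """Straight-line distance (assumed here to be what 'distance' means)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(store, query, k=2):
    """Hypothetical in-memory store: a dict of id -> vector.

    Returns the ids of the k vectors most similar to the query.
    """
    scored = [(cosine_similarity(vec, query), key) for key, vec in store.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]
```

A semantic search is then a ranking of stored vectors by similarity to the query vector, for example `search({"a": [1, 0], "b": [0, 1]}, [1, 0.1], k=1)`.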
Caching
In-memory LLM response caching for iterative development and deduplication.
Cassettes (Record & Replay)
Record real LLM/agent responses to a file once, then replay them deterministically for keyless, offline tests and reproducible demos.
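The record-and-replay idea can be illustrated with a minimal sketch. Everything here is hypothetical: the `Cassette` class, the JSON file layout, and keying entries by a hash of the prompt are assumptions for the example and do not describe Sema's cassette format or API.

```python
import hashlib
import json
import os

class Cassette:
    """Hypothetical cassette: maps a prompt hash to a recorded response."""

    def __init__(self, path, mode="replay"):
        self.path = path
        self.mode = mode  # "record" calls the live function, "replay" never does
        self.entries = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.entries = json.load(f)

    @staticmethod
    def _key(prompt):
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def complete(self, prompt, live_call=None):
        key = self._key(prompt)
        if self.mode == "replay":
            # Replay needs no API key and no network: it only reads the file.
            if key not in self.entries:
                raise KeyError(f"no recorded response for prompt: {prompt!r}")
            return self.entries[key]
        # Record mode: make the real call once and persist the result.
        response = live_call(prompt)
        self.entries[key] = response
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.entries, f, indent=2)
        return response
```

In this sketch a test suite would run once in record mode with a real provider, commit the resulting file, and run in replay mode from then on.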
Resilience & Retry
Fallback provider chains, rate limiting, generic retry with exponential backoff, and convenience functions (llm/summarize, llm/compare).
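As a rough illustration of two of the patterns named above, the sketch below shows generic retry with exponential backoff and a simple fallback chain. It is written in Python under assumed names (`retry`, `first_success`) and assumed defaults; it is not Sema's implementation, and Sema's own function names and parameters may differ.

```python
import time

def retry(fn, max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call fn until it succeeds, doubling the wait after each failure.

    The delay before retry number i (0-based) is base_delay * 2**i,
    capped at max_delay. The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(min(max_delay, base_delay * 2 ** attempt))

def first_success(callables):
    """Fallback chain: try each callable in order, return the first result."""
    last_error = None
    for call in callables:
        try:
            return call()
        except Exception as err:
            last_error = err
    raise RuntimeError("all providers failed") from last_error
```

The two compose naturally: each entry in a fallback chain can itself be wrapped in `retry`, so a provider is abandoned only after its retries are exhausted. Production code would usually also add random jitter to the delay, which is omitted here to keep the timing deterministic.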
Provider Management
Auto-configuration, runtime provider switching, custom providers, and OpenAI-compatible endpoints.
Cost Tracking & Budgets
Usage tracking, budget enforcement, and batch/parallel operations.
Observability (OpenTelemetry)
Built-in, standards-compliant OpenTelemetry tracing and metrics for every LLM and agent run, with no manual instrumentation. Each completion and tool call is auto-traced (invoke_agent → chat → execute_tool) with tokens, cost, and latency, and can be exported to any OTLP backend or a JSONL file. Off by default, zero-cost when off.
- Tracing & Metrics: the GenAI spans and metrics, sessions, privacy controls, and embedding Sema in your own app.
- Backend Compatibility: use SEMA_OTEL_COMPAT to label the data so that tools with their own attribute names (Arize Phoenix, Langfuse, Traceloop, LangSmith) can read it too. Most other tools work with no extra setup.