Why Shim Exists
LLMs hallucinate. They truncate JSON mid-stream. They wrap valid objects in markdown fences. They forget closing brackets. Your options:
- Retry → Expensive (2/day)
- OutputFixingParser → Slow (waits for full output, 2-5s delay)
- Regex hacks → Brittle (breaks on nested objects)
Shim avoids these trade-offs with a <10ms API response.
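The failure modes listed above are easy to reproduce. The following is an illustrative sketch only: the sample strings and the helper names `parses` and `regex_extract` are made up for this example and are not part of Shim.

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # built at runtime so the sample needs no literal fence

# Three typical LLM outputs (hypothetical samples).
FENCED = FENCE + 'json\n{"name": "Ada", "age": 30}\n' + FENCE
TRUNCATED = '{"name": "Ada", "tags": ["a", "b"'
NESTED = 'Here you go: {"user": {"name": "Ada"}, "ok": true}'


def parses(text: str) -> bool:
    """Return True if text is valid JSON exactly as received."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def regex_extract(text: str) -> Optional[str]:
    """A typical regex hack: grab the first {...} span, non-greedy."""
    match = re.search(r"\{.*?\}", text, re.DOTALL)
    return match.group(0) if match else None


print(parses(FENCED))         # False: the markdown fence is not JSON
print(parses(TRUNCATED))      # False: brackets were never closed
print(regex_extract(NESTED))  # {"user": {"name": "Ada"}
```

The non-greedy pattern stops at the first closing brace, so the nested object is cut off and the extracted text still fails to parse.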
What Shim Does
Shim is a streaming reliability layer for LLM outputs. Route all your LLM traffic through Shim to get:
- Real-time repair (token-by-token, no buffering)
- Schema validation (type coercion, missing field detection)
- Confidence scoring (high/medium/low, never silent corruption)
- Zero data storage (privacy-first, GDPR compliant)
How It Works
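Shim's internals are not described here, so the following is only a minimal sketch of how token-by-token repair could work in principle. The class name `StreamingCloser` and its methods are hypothetical. It tracks open strings and brackets as chunks arrive, forwards every chunk immediately, and can emit the closing characters needed at any point. It deliberately ignores harder cases such as dangling commas, half-written literals, and incomplete escape sequences.

```python
import json


class StreamingCloser:
    """Hypothetical helper: tracks JSON nesting across streamed chunks."""

    def __init__(self) -> None:
        self.stack = []         # closers still owed, innermost last
        self.in_string = False  # currently inside a string literal?
        self.escaped = False    # previous character was a backslash?

    def feed(self, chunk: str) -> None:
        """Update state from one chunk; state is O(nesting depth)."""
        for ch in chunk:
            if self.in_string:
                if self.escaped:
                    self.escaped = False
                elif ch == "\\":
                    self.escaped = True
                elif ch == '"':
                    self.in_string = False
            elif ch == '"':
                self.in_string = True
            elif ch == "{":
                self.stack.append("}")
            elif ch == "[":
                self.stack.append("]")
            elif ch in "}]" and self.stack:
                self.stack.pop()

    def closing_suffix(self) -> str:
        """Characters that would close everything currently open."""
        quote = '"' if self.in_string else ""
        return quote + "".join(reversed(self.stack))


# A stream that is cut off inside a string, two objects and one array deep.
chunks = ['{"user": {"na', 'me": "Ada", "tags": ["x", "y']
closer = StreamingCloser()
forwarded = []
for chunk in chunks:
    closer.feed(chunk)
    forwarded.append(chunk)  # each chunk is passed on without waiting

repaired = "".join(forwarded) + closer.closing_suffix()
print(json.loads(repaired))
```

Because the only state kept is the bracket stack and two flags, nothing has to wait for the full output before the repair suffix can be produced.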
Key Features
1. Streaming-First Architecture
Batch repair is a tool. Streaming repair is infrastructure. Shim repairs JSON token-by-token without buffering delays.
2. Schema-Aware Repair
Provide a JSON Schema and Shim will:
- Coerce types ("30" → 30)
- Detect missing required fields
- Remove extra fields (strict mode)
- Validate nested objects
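The four behaviours above can be sketched with a small recursive walk over a JSON Schema. This is an illustration under stated assumptions, not Shim's implementation: the function `repair_with_schema`, its `strict` flag, and the log message wording are hypothetical, and only the `integer` and `object` types are handled.

```python
from typing import Any


def repair_with_schema(value: Any, schema: dict, log: list,
                       strict: bool = False, path: str = "$") -> Any:
    """Hypothetical sketch: coerce, check required fields, strip extras."""
    kind = schema.get("type")

    # Type coercion: "30" -> 30 when the schema asks for an integer.
    if kind == "integer" and isinstance(value, str):
        try:
            coerced = int(value)
        except ValueError:
            log.append(f"error: {path} is not an integer: {value!r}")
            return value
        log.append(f"coerced {path}: {value!r} -> {coerced!r}")
        return coerced

    if kind == "object" and isinstance(value, dict):
        props = schema.get("properties", {})
        result = {}
        for key, item in value.items():
            if key in props:
                # Nested objects are validated by recursing.
                result[key] = repair_with_schema(
                    item, props[key], log, strict, f"{path}.{key}")
            elif strict:
                log.append(f"removed extra field {path}.{key}")
            else:
                result[key] = item
        for key in schema.get("required", []):
            if key not in value:
                log.append(f"missing required field {path}.{key}")
        return result

    return value


SCHEMA = {
    "type": "object",
    "required": ["name", "age", "address"],
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "address": {
            "type": "object",
            "required": ["city"],
            "properties": {"city": {"type": "string"}},
        },
    },
}

log = []
data = {"name": "Ada", "age": "30", "address": {}, "debug": True}
fixed = repair_with_schema(data, SCHEMA, log, strict=True)
print(fixed)
for entry in log:
    print(entry)
```

Every change is appended to `log` rather than applied silently, which keeps the repair auditable.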
3. Never Fails Silently
Every response includes:
- Confidence score (high/medium/low)
- Exact repairs applied
- Warnings for risky changes
- Errors for unrecoverable issues
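A client can use these four pieces of metadata to decide whether to accept a repaired payload. The sketch below is hypothetical throughout: the `RepairResult` field names, the rule that maps warnings and errors to a confidence level, and the `unwrap` helper are assumptions made for illustration, not Shim's documented response format.

```python
from dataclasses import dataclass, field
from typing import Any, List

LEVELS = {"low": 0, "medium": 1, "high": 2}


@dataclass
class RepairResult:
    """Hypothetical response envelope; field names are assumptions."""
    data: Any
    repairs: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)

    @property
    def confidence(self) -> str:
        # Assumed rule: any error -> low, any warning -> medium, else high.
        if self.errors:
            return "low"
        if self.warnings:
            return "medium"
        return "high"


def unwrap(result: RepairResult, minimum: str = "medium") -> Any:
    """Return the data only if it meets the caller's confidence bar."""
    if result.errors:
        raise ValueError("unrecoverable: " + "; ".join(result.errors))
    if LEVELS[result.confidence] < LEVELS[minimum]:
        raise ValueError(f"confidence {result.confidence} below {minimum}")
    return result.data


clean = RepairResult(data={"age": 30}, repairs=["closed 1 bracket"])
risky = RepairResult(data={"age": 30}, warnings=["coerced '30' to 30"])

print(clean.confidence, unwrap(clean, minimum="high"))
print(risky.confidence, unwrap(risky))
```

Raising on a result that falls below the caller's threshold means a risky repair has to be handled explicitly instead of flowing into downstream code unnoticed.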
4. Zero Data Persistence
Shim never stores:
- Raw LLM outputs
- Repaired JSON
- Field names or values
5. Performance
Sub-millisecond repair, edge-deployed globally:

| Metric | Performance |
|---|---|
| Repair engine | <0.1ms (P99: 0.03ms) |
| Total API latency | 5-15ms (including auth, logging, network) |
| KV lookup | <1ms |
| Edge deployment | 330+ locations globally |
Who Uses Shim
Indie Developers: Building AI apps with LangChain/LlamaIndex
Startups: Shipping AI features to production
Enterprises: Deploying AI at scale with compliance requirements
Next Steps
Quick Start
Get your first repair working in 5 minutes
API Reference
Complete API documentation
Demo
Try Shim without signing up
TypeScript SDK
Install the official SDK
