Why Shim Exists

LLMs hallucinate. They truncate JSON mid-stream. They wrap valid objects in markdown fences. They forget closing brackets. Your options:

Retry → Expensive ($0.002/retry × 1000 failures = $2/day)
OutputFixingParser → Slow (waits for full output, 2-5s delay)
Regex hacks → Brittle (breaks on nested objects)
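
To make the "brittle" point concrete, here is a minimal sketch of one common regex hack. The pattern and the helper name `extractWithRegex` are illustrative assumptions, not code from any particular library. The pattern only matches an object with no braces inside it, so on nested input it silently returns the inner object and drops the outer fields.

```javascript
// Naive approach: grab the first "{...}" span that contains no inner braces.
const naiveObjectPattern = /\{[^{}]*\}/;

// Hypothetical helper: extract and parse the first match, or return null.
function extractWithRegex(text) {
  const match = text.match(naiveObjectPattern);
  return match ? JSON.parse(match[0]) : null;
}

const flat = 'Result: {"name": "John", "age": 30}';
const nested = 'Result: {"user": {"name": "John"}, "ok": true}';

// Flat input: the whole object is recovered.
const flatResult = extractWithRegex(flat);

// Nested input: only the inner object matches, so the outer
// "user" and "ok" keys are lost without any error being raised.
const nestedResult = extractWithRegex(nested);

console.log(flatResult, nestedResult);
```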

Shim: Receive. Repair. Return. Sub-millisecond repair, 5-15ms total API latency.

What Shim Does

Shim is a streaming reliability layer for LLM outputs. Route all your LLM traffic through Shim to get:

Real-time repair (token-by-token, no buffering)
Schema validation (type coercion, missing field detection)
Confidence scoring (high/medium/low, never silent corruption)
Zero data storage (privacy-first, GDPR compliant)

How It Works

// Before: Broken JSON from LLM
const raw = '{"name": "John", "age": "30"'; // Missing closing brace

// After: Shim repairs it
const response = await fetch('https://api.shim.so/v1/repair', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sk_xxx',   // your API key
    'Content-Type': 'application/json'  // the request body is JSON
  },
  body: JSON.stringify({ raw_output: raw })
});

const { repaired, metadata } = await response.json();
// repaired: { name: "John", age: 30 }
// metadata.confidence: "high"
// metadata.repairs_applied: ["closed_object_bracket", "type_coercion"]
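
In application code you will probably want to check the HTTP status before reading the body. The sketch below wraps the same request in a helper. The function name `repairWithShim` and the injectable `fetchImpl` parameter are assumptions made for illustration and are not part of an official SDK; the endpoint, request body and response fields are the ones shown above.

```javascript
// Hypothetical helper (not an official SDK function): sends raw LLM output
// to the repair endpoint and throws on a non-2xx response.
async function repairWithShim(raw, apiKey, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl('https://api.shim.so/v1/repair', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ raw_output: raw }),
  });

  // Surface HTTP failures instead of trying to destructure an error body.
  if (!response.ok) {
    throw new Error(`Shim repair request failed with HTTP ${response.status}`);
  }

  const { repaired, metadata } = await response.json();
  return { repaired, metadata };
}
```

Passing `fetchImpl` explicitly makes the helper easy to exercise with a stub in unit tests; in production the default global `fetch` is used.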

Key Features

1. Streaming-First Architecture

Batch repair is a tool. Streaming repair is infrastructure. Shim repairs JSON token-by-token without buffering delays.
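
To illustrate what token-by-token repair can mean in practice, here is a minimal sketch of an incremental bracket tracker. It is an assumption-laden illustration, not Shim's actual engine: the class name `BracketTracker` and its methods are hypothetical. State carries over between chunks, so the characters needed to close the document are available at any moment without re-reading earlier output.

```javascript
// Illustrative sketch only: tracks open brackets and string state across
// streamed chunks. Not Shim's real implementation.
class BracketTracker {
  constructor() {
    this.stack = [];        // closers still owed, innermost last
    this.inString = false;  // currently inside a "..." literal
    this.escaped = false;   // previous character was a backslash in a string
  }

  // Consume one streamed chunk; state persists between calls.
  push(chunk) {
    for (const ch of chunk) {
      if (this.inString) {
        if (this.escaped) this.escaped = false;
        else if (ch === '\\') this.escaped = true;
        else if (ch === '"') this.inString = false;
      } else if (ch === '"') this.inString = true;
      else if (ch === '{') this.stack.push('}');
      else if (ch === '[') this.stack.push(']');
      else if (ch === '}' || ch === ']') this.stack.pop();
    }
  }

  // Characters that would close everything currently open.
  closingSuffix() {
    const quote = this.inString ? '"' : '';
    return quote + [...this.stack].reverse().join('');
  }
}

// Simulate a stream that is cut off before the array and object are closed.
const tracker = new BracketTracker();
let received = '';
for (const chunk of ['{"name": "Jo', 'hn", "tags": ["a", ', '"b"']) {
  tracker.push(chunk);
  received += chunk;
}
const repairedText = received + tracker.closingSuffix();
const parsed = JSON.parse(repairedText);
console.log(parsed);
```

This sketch only closes open strings, arrays and objects. It does not handle trailing commas, a key with no value, or a stream that ends right after a backslash, all of which a production repair engine would need to cover.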

2. Schema-Aware Repair

Provide a JSON Schema and Shim will:

Coerce types ("30" → 30)
Detect missing required fields
Remove extra fields (strict mode)
Validate nested objects
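
The four behaviors above can be sketched locally for a small subset of JSON Schema (`type`, `properties`, `required`). This is a simplified illustration under stated assumptions, not Shim's implementation: the function name `applySchema`, the `strict` flag, and the `removed_extra_field` label are hypothetical, while `type_coercion` mirrors the repair name in the example response.

```javascript
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical sketch: returns the adjusted value, the repairs applied,
// and the paths of required fields that are missing.
function applySchema(value, schema, strict = false, path = '') {
  const repairs = [];
  const missing = [];

  // Coerce numeric strings when the schema asks for a number ("30" -> 30).
  if (schema.type === 'number' && typeof value === 'string' &&
      value.trim() !== '' && Number.isFinite(Number(value))) {
    repairs.push(`type_coercion:${path}`);
    return { value: Number(value), repairs, missing };
  }
  // Anything that is not a plain object under an object schema passes through.
  if (schema.type !== 'object' || value === null ||
      typeof value !== 'object' || Array.isArray(value)) {
    return { value, repairs, missing };
  }

  const properties = schema.properties || {};
  const out = {};
  for (const [key, fieldValue] of Object.entries(value)) {
    const fieldPath = path ? `${path}.${key}` : key;
    if (!has(properties, key)) {
      // Strict mode drops fields the schema does not declare.
      if (strict) repairs.push(`removed_extra_field:${fieldPath}`);
      else out[key] = fieldValue;
      continue;
    }
    // Recurse so nested objects are validated too.
    const child = applySchema(fieldValue, properties[key], strict, fieldPath);
    out[key] = child.value;
    repairs.push(...child.repairs);
    missing.push(...child.missing);
  }
  for (const key of schema.required || []) {
    if (!has(out, key)) missing.push(path ? `${path}.${key}` : key);
  }
  return { value: out, repairs, missing };
}

const schema = {
  type: 'object',
  required: ['name', 'age', 'email'],
  properties: {
    name: { type: 'string' },
    age: { type: 'number' },
    email: { type: 'string' },
    address: { type: 'object', required: ['city'], properties: { city: { type: 'string' } } },
  },
};
const input = { name: 'John', age: '30', nickname: 'JJ', address: {} };
const result = applySchema(input, schema, true);
console.log(result);
```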

3. Never Fails Silently

Every response includes:

Confidence score (high/medium/low)
Exact repairs applied
Warnings for risky changes
Errors for unrecoverable issues
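
One way to act on this metadata is a small gate in front of your business logic. The sketch below is a hypothetical policy, not a prescribed one: `confidence` and `repairs_applied` follow the example response earlier on this page, while the `warnings` and `errors` field names are assumptions about the response shape.

```javascript
// Hypothetical policy: map a repair response to accept / review / reject.
// Assumes metadata.warnings and metadata.errors are arrays when present.
function decide(result) {
  const metadata = result.metadata || {};
  const errors = metadata.errors || [];
  const warnings = metadata.warnings || [];

  if (errors.length > 0 || metadata.confidence === 'low') return 'reject';
  if (warnings.length > 0 || metadata.confidence === 'medium') return 'review';
  if (metadata.confidence === 'high') return 'accept';
  return 'review'; // unknown or missing confidence: stay conservative
}

const decision = decide({
  repaired: { name: 'John', age: 30 },
  metadata: { confidence: 'high', repairs_applied: ['closed_object_bracket', 'type_coercion'] },
});
console.log(decision);
```

Treating an unknown confidence value as "review" keeps the gate conservative if the response shape differs from what this sketch assumes.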

4. Zero Data Persistence

Shim never stores:

Raw LLM outputs
Repaired JSON
Field names or values

We only log metadata: repair types, confidence, latency.

5. Performance

Sub-millisecond repair, edge-deployed globally:

Metric	Performance
Repair engine	`<0.1ms` (P99: 0.03ms)
Total API latency	5-15ms (including auth, logging, network)
KV lookup	`<1ms`
Edge deployment	330+ locations globally

200-500x faster than OutputFixingParser (which waits 2-5 seconds for full output).
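
As a rough check on that range, the arithmetic can be sketched as below. It assumes the comparison divides the 2-5 second wait by a total round trip of about 10ms, which is an assumption (the midpoint of the 5-15ms row above); the page does not state the exact basis.

```javascript
// Speedup = baseline latency / Shim latency, both in milliseconds.
const speedup = (baselineMs, shimMs) => baselineMs / shimMs;

// Assuming a ~10ms round trip: 2000 / 10 = 200 and 5000 / 10 = 500.
const atTenMs = [speedup(2000, 10), speedup(5000, 10)];

// Across the full 5-15ms range the ratio spans roughly 133x to 1000x.
const acrossRange = [speedup(2000, 15), speedup(5000, 5)];

console.log(atTenMs, acrossRange.map(Math.round));
```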

Who Uses Shim

Indie Developers: Building AI apps with LangChain/LlamaIndex
Startups: Shipping AI features to production
Enterprises: Deploying AI at scale with compliance requirements

Next Steps

Quick Start: Get your first repair working in 5 minutes
API Reference: Complete API documentation
Demo: Try Shim without signing up
TypeScript SDK: Install the official SDK