Rate Limits Guide

Shim enforces two types of limits: requests per minute and monthly repairs. Behavior differs by tier.

Tier Limits

Tier | Requests/Min | Monthly Repairs | Overage Behavior
Free | 100 | 1,000 | Throttle at 110%
Pro | 1,000 | 100,000 | Throttle at 110%
Team | 10,000 | 1,000,000 | Bill $0.30/1K over 110%

Free and Pro Tiers

Grace Period (100-110%)

Continue at full speed. No degradation.
X-Shim-Usage: 1050
X-Shim-Limit: 1000
X-Shim-Usage-Percentage: 105.0
X-Shim-Overage: soft

Phase 1 Throttle (110-150%)

Throttle to reduced rate:
  • Free: 10 requests/min
  • Pro: 100 requests/min
X-Shim-Usage: 1200
X-Shim-Limit: 1000
X-Shim-Usage-Percentage: 120.0
X-Shim-Overage: throttled

Phase 2 Throttle (150%+)

Hard throttle:
  • Free: 1 request/min
  • Pro: 10 requests/min
X-Shim-Usage: 1600
X-Shim-Limit: 1000
X-Shim-Usage-Percentage: 160.0
X-Shim-Overage: limp
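
The phases above can be summarized as a lookup from usage to allowed request rate. The following is a minimal sketch, not Shim's implementation: the names `phaseFor` and `allowedRequestsPerMinute` are hypothetical, and it assumes each boundary value belongs to the lower phase (exactly 110% is still "soft", exactly 150% is still "throttled"), which this guide does not specify.

```typescript
type FixedTier = 'free' | 'pro';
type Phase = 'normal' | 'soft' | 'throttled' | 'limp';

// Requests per minute, taken from the tier table and throttle phases in this guide.
const RATES: Record<FixedTier, Record<Phase, number>> = {
  free: { normal: 100, soft: 100, throttled: 10, limp: 1 },
  pro: { normal: 1000, soft: 1000, throttled: 100, limp: 10 },
};

// Assumption: boundary values fall into the lower phase.
// Integer comparisons avoid floating point surprises such as 1.1 * 100 !== 110.
function phaseFor(usage: number, limit: number): Phase {
  if (usage <= limit) return 'normal';            // up to 100%
  if (usage * 10 <= limit * 11) return 'soft';    // up to 110%
  if (usage * 2 <= limit * 3) return 'throttled'; // up to 150%
  return 'limp';
}

function allowedRequestsPerMinute(tier: FixedTier, usage: number, limit: number): number {
  return RATES[tier][phaseFor(usage, limit)];
}
```

For the header examples above, `phaseFor(1200, 1000)` yields `'throttled'`, so a Free-tier client would be held to 10 requests/min.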

Why Throttle?

Free and Pro tiers are fixed-price. Throttling prevents runaway costs while keeping your app online.

Team Tier

Grace Period (100-110%)

Continue at full speed. No degradation.
X-Shim-Usage: 1050000
X-Shim-Limit: 1000000
X-Shim-Usage-Percentage: 105.0
X-Shim-Overage: soft

Metered Billing (110%+)

Full speed. Billed $0.30 per 1,000 repairs over 110%.
X-Shim-Usage: 1250000
X-Shim-Limit: 1000000
X-Shim-Usage-Percentage: 125.0
X-Shim-Overage: billing
Calculation:
Base limit: 1,000,000
Grace threshold (110%): 1,100,000
Current usage: 1,250,000
Overage: 1,250,000 - 1,100,000 = 150,000 repairs
Billed units: 150,000 / 1,000 = 150 units
Cost: 150 × $0.30 = $45
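
The same arithmetic as a small function, useful for estimating a bill from the `X-Shim-Usage` value. This is a sketch under assumptions: `teamOverageCostUsd` is a hypothetical helper, and it prorates partial blocks of 1,000 repairs to the nearest cent, which this guide does not confirm.

```typescript
const TEAM_MONTHLY_LIMIT = 1000000;
const CENTS_PER_1K_REPAIRS = 30; // $0.30 per 1,000 repairs

function teamOverageCostUsd(usage: number, limit: number = TEAM_MONTHLY_LIMIT): number {
  // Billing starts above the 110% grace threshold.
  const graceThreshold = (limit * 110) / 100;
  const billableRepairs = Math.max(0, usage - graceThreshold);
  // Work in cents to limit floating point drift.
  // Assumption: partial thousands are prorated and rounded to the nearest cent.
  const cents = Math.round((billableRepairs / 1000) * CENTS_PER_1K_REPAIRS);
  return cents / 100;
}

console.log(teamOverageCostUsd(1250000)); // 45, matching the calculation above
```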

Why Metered?

Team tier expects high volume. Metered billing scales with usage without throttling.

Response Headers

Per-Minute Rate Limit

Standard rate limit headers on every response:
X-RateLimit-Limit: 100          # Requests allowed this minute
X-RateLimit-Remaining: 87       # Requests remaining this minute
X-RateLimit-Reset: 1706000060   # Unix timestamp when window resets
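
A client can use these headers to decide how long to pause before the next request. The sketch below is illustrative only: `msUntilNextRequest` and the `HeaderReader` interface are hypothetical names, and `HeaderReader` is just the `get` method that a fetch `Headers` object already provides.

```typescript
// Minimal shape of a fetch Headers object, so the helper is easy to test.
interface HeaderReader {
  get(name: string): string | null;
}

function readInt(headers: HeaderReader, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null) return null;
  const parsed = Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? null : parsed;
}

// Returns 0 when requests remain in the current window (or the headers are
// missing), otherwise the milliseconds until the window resets.
function msUntilNextRequest(headers: HeaderReader, nowMs: number = Date.now()): number {
  const remaining = readInt(headers, 'X-RateLimit-Remaining');
  const resetSeconds = readInt(headers, 'X-RateLimit-Reset');
  if (remaining === null || resetSeconds === null || remaining > 0) return 0;
  return Math.max(0, resetSeconds * 1000 - nowMs);
}
```

Usage: `await new Promise(r => setTimeout(r, msUntilNextRequest(response.headers)))` before sending the next request.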

Monthly Usage

Usage tracking headers on every response:
X-Shim-Usage: 1200              # Repairs used this month (integer)
X-Shim-Limit: 1000              # Monthly limit for your tier
X-Shim-Usage-Percentage: 120.0  # Usage as percentage of limit
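
These three headers can be read into one typed value. A minimal sketch, assuming the header formats shown above; `readMonthlyUsage`, `MonthlyUsage`, and `HeaderReader` are hypothetical names, and `remaining` is a derived convenience field, not something the API returns.

```typescript
// Minimal shape of a fetch Headers object.
interface HeaderReader {
  get(name: string): string | null;
}

interface MonthlyUsage {
  usage: number;      // repairs used this month
  limit: number;      // monthly limit for the tier
  percentage: number; // usage as a percentage of the limit
  remaining: number;  // repairs left before the limit (never negative)
}

function readMonthlyUsage(headers: HeaderReader): MonthlyUsage | null {
  const usage = Number.parseInt(headers.get('X-Shim-Usage') ?? '', 10);
  const limit = Number.parseInt(headers.get('X-Shim-Limit') ?? '', 10);
  if (Number.isNaN(usage) || Number.isNaN(limit) || limit <= 0) return null;

  // Prefer the reported percentage; fall back to computing it locally.
  const reported = Number.parseFloat(headers.get('X-Shim-Usage-Percentage') ?? '');
  const percentage = Number.isNaN(reported) ? (usage / limit) * 100 : reported;

  return { usage, limit, percentage, remaining: Math.max(0, limit - usage) };
}
```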

X-Shim-Overage

Overage phase header (only present when usage exceeds 100%):
X-Shim-Overage: soft            # 100-110% grace period
X-Shim-Overage: throttled       # Free/Pro 110-150%
X-Shim-Overage: limp            # Free/Pro 150%+
X-Shim-Overage: billing         # Team 110%+ (metered)
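
Because the header is absent below 100%, client code needs to treat a missing value as the normal state. One possible way to model this, with hypothetical names (`OveragePhase`, `parseOveragePhase`, `isDegraded`) and an `'unknown'` fallback for values this guide does not list:

```typescript
type OveragePhase = 'normal' | 'soft' | 'throttled' | 'limp' | 'billing' | 'unknown';

// The four values documented in this guide.
const DOCUMENTED_PHASES: readonly string[] = ['soft', 'throttled', 'limp', 'billing'];

function parseOveragePhase(headerValue: string | null): OveragePhase {
  // No header means usage is at or below 100%.
  if (headerValue === null) return 'normal';
  const value = headerValue.trim().toLowerCase();
  return DOCUMENTED_PHASES.includes(value) ? (value as OveragePhase) : 'unknown';
}

// True when requests are being slowed down (Free/Pro above 110%).
function isDegraded(phase: OveragePhase): boolean {
  return phase === 'throttled' || phase === 'limp';
}
```

Usage: `parseOveragePhase(response.headers.get('X-Shim-Overage'))`.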

Handling Rate Limits

Check Headers

const response = await fetch('https://api.shim.so/v1/repair', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ raw_output })
});

const usage = response.headers.get('X-Shim-Usage');
const limit = response.headers.get('X-Shim-Limit');
const percentage = response.headers.get('X-Shim-Usage-Percentage');
const overage = response.headers.get('X-Shim-Overage');

console.log(`Usage: ${usage}/${limit} (${percentage}%)`);
console.log(`Overage phase: ${overage ?? 'normal'}`);

Exponential Backoff

async function repairWithBackoff(input: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await shim.repair({ raw_output: input });

    if (response.success) {
      return response.repaired;
    }

    // Retry only on rate limiting; any other failure is not retried
    const error = response.metadata?.errors?.[0];
    if (error?.code !== 'RATE_LIMIT_EXCEEDED') {
      throw new Error(`Repair failed: ${error?.code ?? 'unknown error'}`);
    }

    // Wait before the next attempt (1s, then 2s); skip the wait after the last one
    if (attempt < maxRetries - 1) {
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise(r => setTimeout(r, delay));
    }
  }

  throw new Error('Max retries exceeded');
}
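
When many clients hit the limit at once, fixed delays can make them all retry at the same moment. A common variation is to cap the delay and add random jitter. The sketch below uses "full jitter" (a uniform random delay between 0 and the capped exponential value); it is a swapped-in technique, not something this guide prescribes, and `backoffDelayMs` with its parameters is a hypothetical helper.

```typescript
// attempt: 0-based retry counter
// baseMs:  delay for the first retry before jitter (1s, as in the example above)
// capMs:   upper bound on the delay (30s is an arbitrary choice)
// random:  injectable for testing; defaults to Math.random
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30000,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(capMs, baseMs * Math.pow(2, attempt));
  // Full jitter: pick uniformly from [0, exponential).
  return Math.floor(random() * exponential);
}
```

To use it, replace the fixed `Math.pow(2, ...) * 1000` delay in a retry loop with `backoffDelayMs(attempt)`.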

Alert on High Usage

const response = await fetch('https://api.shim.so/v1/repair', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ raw_output: input })
});

const percentage = parseFloat(response.headers.get('X-Shim-Usage-Percentage') || '0');

if (percentage > 90) {
  console.warn(`High usage: ${percentage}% of monthly limit`);
  // Send alert
}

Console Monitoring

View real-time usage in your console:
  1. Go to console.shim.so
  2. Check usage meter
  3. View daily breakdown

Upgrading Tiers

Free → Pro

Increase limits:
  • 1K → 100K monthly repairs
  • 100 → 1K requests/min
Upgrade in the console: open https://console.shim.so and choose "Upgrade to Pro".

Pro → Team

Unlock metered billing:
  • 100K → 1M base limit
  • 1K → 10K requests/min
  • Overage billing instead of throttling
Upgrade in the console: open https://console.shim.so and choose "Upgrade to Team".

Best Practices

1. Monitor Usage Headers

Check X-Shim-Usage on every request:
const usage = response.headers.get('X-Shim-Usage');
logToMonitoring('shim.usage', usage);

2. Cache Repairs

Avoid repairing the same output twice:
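// In-memory cache keyed by the raw input string
const cache = new Map();

async function cachedRepair(input: string) {
  if (cache.has(input)) {
    return cache.get(input);
  }

  const result = await shim.repair({ raw_output: input });
  // Cache only successful repairs so failures (e.g. rate limits) can be retried
  if (result.success) {
    cache.set(input, result);
  }
  return result;
}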
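
A plain `Map` grows without bound and does not help when the same input is requested twice concurrently. The following sketch addresses both by caching the in-flight promise and evicting the least recently used entry. It is one possible design, not part of the Shim SDK: `createCachedRepair` and `maxEntries` are hypothetical, and the repair function is passed in so the helper stays independent of any client.

```typescript
function createCachedRepair<T>(
  repair: (input: string) => Promise<T>,
  maxEntries: number = 1000,
): (input: string) => Promise<T> {
  // Map preserves insertion order, so the first key is the least recently used.
  const cache = new Map<string, Promise<T>>();

  return function cachedRepair(input: string): Promise<T> {
    const hit = cache.get(input);
    if (hit !== undefined) {
      // Re-insert to mark this entry as most recently used.
      cache.delete(input);
      cache.set(input, hit);
      return hit;
    }

    // Store the promise itself so concurrent callers share one request.
    const pending = repair(input).catch((err) => {
      cache.delete(input); // rejected calls are not cached
      throw err;
    });
    cache.set(input, pending);

    if (cache.size > maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    return pending;
  };
}
```

Usage, assuming the `shim` client from the examples above: `const cachedRepair = createCachedRepair((input) => shim.repair({ raw_output: input }));`. Note that this version only skips caching for rejected promises; a resolved result with `success: false` would still be cached unless the wrapped function throws.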

3. Use Streaming for Large Outputs

Streaming doesn’t reduce repair count, but spreads requests over time:
// Instead of one large batch request
const { session_id } = await shim.stream.start();

for await (const chunk of llmStream) {
  await shim.stream.push({ session_id, chunk });
}

const result = await shim.stream.finalize({ session_id });

4. Set Alerts

Configure alerts at 80% and 100%:
if (percentage >= 80 && percentage < 100) {
  sendAlert('Warning: 80% of monthly limit reached');
}

if (percentage >= 100) {
  // 100-110% is the grace period; throttling (Free/Pro) or billing (Team) starts above 110%
  sendAlert('Critical: monthly limit exceeded, grace period ends at 110%');
}
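
Checking thresholds on every response would send the same alert on every request once usage passes 80%. A small sketch that fires each threshold only once per cycle; `createUsageAlerter`, `observe`, and `reset` are hypothetical names, and `notify` stands in for whatever alerting function you use.

```typescript
function createUsageAlerter(
  thresholds: number[],
  notify: (message: string) => void,
) {
  // Thresholds that have already produced an alert in this billing cycle.
  const fired = new Set<number>();

  return {
    // Call with the X-Shim-Usage-Percentage value from each response.
    observe(percentage: number): void {
      for (const threshold of thresholds) {
        if (percentage >= threshold && !fired.has(threshold)) {
          fired.add(threshold);
          notify(`Usage reached ${threshold}% of monthly limit (now ${percentage}%)`);
        }
      }
    },
    // Call when the monthly limit resets so alerts can fire again.
    reset(): void {
      fired.clear();
    },
  };
}
```

Usage: create one alerter with `createUsageAlerter([80, 100], sendAlert)` and call `observe(percentage)` after each request.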

FAQ

Does streaming count as one repair or many?

One repair. A streaming session (start → push × N → finalize) counts as one repair.

When do limits reset?

Monthly limits reset on your billing cycle date. You can find this date in your dashboard.

Can I request a limit increase?

Team tier is the highest standard tier. Contact sales for custom plans.

What happens if I downgrade?

Limits adjust immediately. If your current usage already exceeds the new tier's limit, throttling applies.

Next Steps

  • Authentication: View your API key and tier
  • Error Codes: Handle RATE_LIMIT_EXCEEDED
  • Upgrade: Upgrade your tier
  • Console: Monitor usage in real-time