> clade-rate-limits

Handle Anthropic rate limits: understand tiers, implement backoff, optimize throughput, and monitor usage. Use when working with rate-limit patterns. Trigger with "anthropic rate limit", "claude 429", "anthropic throttling", "anthropic usage limits", "claude tokens per minute".


Anthropic Rate Limits

Overview

Anthropic enforces three per-minute limits: requests (RPM), input tokens (input TPM), and output tokens (output TPM). The limits depend on your spend tier.

Rate Limit Tiers

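| Tier   | Qualification | RPM    | Input TPM | Output TPM |
|--------|---------------|--------|-----------|------------|
| Tier 1 | Free          | 50     | 40,000    | 8,000      |
| Tier 2 | $40+ spend    | 1,000  | 80,000    | 16,000     |
| Tier 3 | $200+ spend   | 2,000  | 160,000   | 32,000     |
| Tier 4 | $400+ spend   | 4,000  | 400,000   | 80,000     |
| Scale  | Custom        | Custom | Custom    | Custom     |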

Check your tier: console.anthropic.com → Settings → Limits
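
To sanity-check a planned workload against the tier table, the limits can be encoded as data. The sketch below copies the table's numbers as-is (they may be out of date; the console is authoritative) and assumes a steady request rate. `TIER_LIMITS` and `exceededLimits` are hypothetical names, not part of any SDK.

```typescript
// Limits copied from the tier table; illustrative, not authoritative.
interface TierLimits {
  rpm: number;       // requests per minute
  inputTpm: number;  // input tokens per minute
  outputTpm: number; // output tokens per minute
}

const TIER_LIMITS: Record<string, TierLimits> = {
  "Tier 1": { rpm: 50, inputTpm: 40_000, outputTpm: 8_000 },
  "Tier 2": { rpm: 1_000, inputTpm: 80_000, outputTpm: 16_000 },
  "Tier 3": { rpm: 2_000, inputTpm: 160_000, outputTpm: 32_000 },
  "Tier 4": { rpm: 4_000, inputTpm: 400_000, outputTpm: 80_000 },
};

// Hypothetical helper: lists which limits a steady workload would exceed.
function exceededLimits(
  tier: string,
  requestsPerMinute: number,
  avgInputTokens: number,
  avgOutputTokens: number,
): string[] {
  const limits = TIER_LIMITS[tier];
  if (!limits) throw new Error(`Unknown tier: ${tier}`);
  const exceeded: string[] = [];
  if (requestsPerMinute > limits.rpm) exceeded.push("rpm");
  if (requestsPerMinute * avgInputTokens > limits.inputTpm) exceeded.push("inputTpm");
  if (requestsPerMinute * avgOutputTokens > limits.outputTpm) exceeded.push("outputTpm");
  return exceeded;
}

// 40 requests/min at 1,500 input and 150 output tokens each, on Tier 1:
// 40 * 1,500 = 60,000 input tokens/min, which is over the 40,000 limit.
console.log(exceededLimits("Tier 1", 40, 1_500, 150)); // [ 'inputTpm' ]
```

In this example the request count fits Tier 1, but the input token volume does not, which is a common reason to hit 429s well below the RPM ceiling.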

Response Headers

Every API response includes rate limit headers:

anthropic-ratelimit-requests-limit: 1000
anthropic-ratelimit-requests-remaining: 998
anthropic-ratelimit-requests-reset: 2025-01-01T00:01:00Z
anthropic-ratelimit-tokens-limit: 80000
anthropic-ratelimit-tokens-remaining: 79500
anthropic-ratelimit-tokens-reset: 2025-01-01T00:01:00Z
retry-after: 5
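
A minimal sketch of reading these headers into a typed snapshot follows. It assumes the headers arrive as a plain, lowercase-keyed record and that the names use the documented `anthropic-ratelimit-` prefix (passed as a parameter so it can be adjusted). `readRateLimitHeaders` and `RateLimitSnapshot` are hypothetical names.

```typescript
interface RateLimitSnapshot {
  requestsRemaining: number;
  tokensRemaining: number;
  msUntilRequestsReset: number;
}

// Hypothetical helper: returns null when any expected header is missing or malformed.
function readRateLimitHeaders(
  headers: Record<string, string | undefined>,
  now: Date = new Date(),
  prefix = "anthropic-ratelimit", // assumed header prefix
): RateLimitSnapshot | null {
  const requests = headers[`${prefix}-requests-remaining`];
  const tokens = headers[`${prefix}-tokens-remaining`];
  const reset = headers[`${prefix}-requests-reset`];
  if (requests === undefined || tokens === undefined || reset === undefined) return null;
  const resetMs = Date.parse(reset); // reset values are timestamps such as 2025-01-01T00:01:00Z
  if (Number.isNaN(resetMs)) return null;
  return {
    requestsRemaining: Number(requests),
    tokensRemaining: Number(tokens),
    msUntilRequestsReset: Math.max(0, resetMs - now.getTime()),
  };
}

const snapshot = readRateLimitHeaders(
  {
    "anthropic-ratelimit-requests-remaining": "998",
    "anthropic-ratelimit-tokens-remaining": "79500",
    "anthropic-ratelimit-requests-reset": "2025-01-01T00:01:00Z",
  },
  new Date("2025-01-01T00:00:30Z"),
);
console.log(snapshot);
// { requestsRemaining: 998, tokensRemaining: 79500, msUntilRequestsReset: 30000 }
```

Passing `now` explicitly keeps the helper deterministic in tests; in production the default of the current time is usually what you want.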

Built-In SDK Retries

The SDK automatically retries 429 and 529 errors with exponential backoff:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  maxRetries: 3, // default: 2. Set to 0 to disable.
});
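
For intuition about what capped exponential backoff looks like, here is a small sketch of a delay schedule. The base delay, cap, and jitter range are illustrative assumptions and are not taken from the SDK's source; `backoffDelayMs` is a hypothetical helper.

```typescript
// Illustrative constants (assumptions, not the SDK's actual values).
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 8_000;

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at the
// maximum, then scaled by a jitter factor in (0.75, 1] so that many clients
// hitting the limit at once do not all retry at the same instant.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const capped = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  const jitterFactor = 1 - random() * 0.25;
  return Math.round(capped * jitterFactor);
}

// With jitter disabled (random always returns 0) the schedule doubles until the cap.
const schedule = [0, 1, 2, 3, 4, 5].map((attempt) => backoffDelayMs(attempt, () => 0));
console.log(schedule); // [ 500, 1000, 2000, 4000, 8000, 8000 ]
```

The cap matters: without it, a handful of retries would push the wait into minutes, well past the one-minute window after which limits normally reset.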

Custom Backoff

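// Retries only rate limit (429) errors; any other error is rethrown immediately.
// Note: the client's own maxRetries setting still applies inside each attempt.
async function callWithBackoff(params: Anthropic.MessageCreateParams, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.messages.create(params);
    } catch (err) {
      if (!(err instanceof Anthropic.RateLimitError)) throw err;
      // No point sleeping after the final attempt
      if (attempt === maxRetries - 1) break;
      // Prefer the server's retry-after hint (seconds); fall back to 2^attempt seconds.
      // This assumes err.headers is a plain object; if your SDK version exposes a
      // Headers instance instead, read it with err.headers?.get('retry-after').
      const retryAfter = Number(err.headers?.['retry-after'] || 2 ** attempt);
      const jitter = Math.random() * 1000; // up to 1s of jitter to spread out retries
      console.log(`Rate limited. Retry in ${retryAfter}s (attempt ${attempt + 1})`);
      await new Promise(r => setTimeout(r, retryAfter * 1000 + jitter));
    }
  }
  throw new Error('Exceeded max retries');
}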
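
The backoff function above converts `retry-after` with `Number(...)`, which assumes the header carries a number of seconds. HTTP also permits an HTTP-date in that header (RFC 9110), and `Number` turns such a value into `NaN`. Whether the Anthropic API ever sends the date form is not stated here, so the following is a defensive sketch only; `parseRetryAfterMs` is a hypothetical helper.

```typescript
// Returns the wait in milliseconds, or null when the header is missing or unparseable.
function parseRetryAfterMs(
  value: string | null | undefined,
  now: Date = new Date(),
): number | null {
  if (value === null || value === undefined) return null;
  const trimmed = value.trim();
  if (trimmed === "") return null;
  // Form 1: a delay in seconds, e.g. "5"
  if (/^\d+(\.\d+)?$/.test(trimmed)) return Number(trimmed) * 1000;
  // Form 2: an HTTP-date, e.g. "Wed, 01 Jan 2025 00:00:05 GMT"
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - now.getTime());
}

const midnight = new Date("2025-01-01T00:00:00Z");
console.log(parseRetryAfterMs("5"));                                       // 5000
console.log(parseRetryAfterMs("Wed, 01 Jan 2025 00:00:05 GMT", midnight)); // 5000
console.log(parseRetryAfterMs(undefined));                                 // null
```

A `null` result is the cue to fall back to a computed exponential delay instead of the server's hint.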

Throughput Optimization

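| Strategy | Impact |
|----------|--------|
| Use Message Batches API | Runs outside the Messages API's per-minute limits (asynchronous; batches have their own limits and up to a 24h processing window) |
| Use prompt caching | Tokens read from cache don't count toward input TPM on most models |
| Use smaller models for simple tasks | Lower token counts = more requests per minute |
| Pre-count tokens with countTokens | Avoid wasted requests that will fail |
| Queue and batch requests | Smooth out bursts |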
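
The "queue and batch requests" strategy is often implemented with a token bucket, which allows short bursts while holding the long-run rate under a limit. The sketch below is one possible client-side pacer, not something the SDK provides; `TokenBucket` and its parameters are hypothetical, and the clock is injectable so the behavior can be checked without waiting.

```typescript
// Token bucket: holds up to `capacity` permits and refills continuously.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly clock: () => number;
  private available: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, clock: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.clock = clock;
    this.available = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = clock();
  }

  // Try to take `cost` permits. Returns 0 on success, otherwise the number of
  // milliseconds to wait before trying again. A cost above `capacity` never succeeds.
  tryTake(cost = 1): number {
    const nowMs = this.clock();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.available = Math.min(this.capacity, this.available + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (cost <= this.available) {
      this.available -= cost;
      return 0;
    }
    return Math.ceil(((cost - this.available) / this.refillPerSecond) * 1000);
  }
}

// Burst of 2 requests, then 1 request per second, driven by a fake clock.
let fakeNowMs = 0;
const bucket = new TokenBucket(2, 1, () => fakeNowMs);
console.log(bucket.tryTake()); // 0    (first request goes out)
console.log(bucket.tryTake()); // 0    (second request uses the burst allowance)
console.log(bucket.tryTake()); // 1000 (bucket empty: wait one second)
fakeNowMs = 1000;
console.log(bucket.tryTake()); // 0    (one permit has refilled)
```

For a 50 RPM limit, a refill rate of `50 / 60` permits per second with a small capacity would keep traffic just under the ceiling; a second bucket with token counts as the cost could pace input TPM the same way.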

Token Counting

// Count before sending — avoid burning RPM on requests that'll fail
const count = await client.messages.countTokens({
  model: 'claude-sonnet-4-20250514',
  messages,
  system: systemPrompt,
});
console.log(`This request will use ${count.input_tokens} input tokens`);
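
One way to act on the count is a pre-flight decision: send now, wait for the window to reset, or split the request because it can never fit. The sketch below assumes `tokensRemaining`, `tokensLimit`, and the reset time were read from the rate limit headers of an earlier response; `decidePreflight` and all of its parameter names are hypothetical.

```typescript
type PreflightDecision =
  | { action: "send" }
  | { action: "wait"; waitMs: number }
  | { action: "split"; reason: string };

// Hypothetical helper. Times are epoch milliseconds.
function decidePreflight(
  inputTokens: number,     // e.g. count.input_tokens from countTokens
  tokensRemaining: number, // tokens left in the current window
  tokensLimit: number,     // the per-minute token limit
  resetAtMs: number,       // when the window resets
  nowMs: number = Date.now(),
): PreflightDecision {
  // A request larger than the whole per-minute limit cannot succeed by waiting.
  if (inputTokens > tokensLimit) {
    return { action: "split", reason: `needs ${inputTokens} tokens, limit is ${tokensLimit}` };
  }
  if (inputTokens <= tokensRemaining) return { action: "send" };
  return { action: "wait", waitMs: Math.max(0, resetAtMs - nowMs) };
}

// 1,200 tokens needed, 500 remaining, window resets 15 seconds from "now".
const decision = decidePreflight(1_200, 500, 80_000, 60_000, 45_000);
console.log(decision); // { action: 'wait', waitMs: 15000 }
```

The header values are only a snapshot, so other concurrent requests can still consume the remaining budget; backoff handling stays necessary even with a pre-flight check.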

Python

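import anthropic
import time

# Let the SDK retry rate limit and server errors up to five times
client = anthropic.Anthropic(max_retries=5)

# Or manual handling: wait for the server's retry-after hint, then retry once
try:
    message = client.messages.create(...)
except anthropic.RateLimitError as e:
    retry_after = float(e.response.headers.get("retry-after", 5))
    time.sleep(retry_after)
    message = client.messages.create(...)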

Output

  • Rate limit tier identified from response headers
  • SDK configured with appropriate maxRetries setting
  • Custom backoff implemented with jitter for high-throughput use cases
  • Throughput optimized using batches, caching, or model selection

Error Handling

| Error     | Cause                            | Solution                |
|-----------|----------------------------------|-------------------------|
| API Error | Check error type and status code | See clade-common-errors |

Examples

See Rate Limit Tiers table, Response Headers section, Built-In SDK Retries, Custom Backoff implementation, and Throughput Optimization strategies above.

Next Steps

See clade-cost-tuning for cost optimization strategies.

Prerequisites

  • Completed clade-install-auth
  • Understanding of HTTP response headers
  • Familiarity with exponential backoff patterns

Instructions

Step 1: Review the patterns below

Each section contains production-ready code examples. Copy and adapt them to your use case.

Step 2: Apply to your codebase

Integrate the patterns that match your requirements. Test each change individually.

Step 3: Verify

Run your test suite to confirm the integration works correctly.

Source: jeremylongshore/claude-code-plugins-plus-skills by jeremylongshore (GitHub stars: 1.7K; installs/wk: 0; first seen Mar 23, 2026)