> castai-rate-limits

Handle CAST AI API rate limits with backoff and request queuing. Use when hitting 429 errors, optimizing API call patterns, or implementing rate-aware batch operations. Trigger with phrases like "cast ai rate limit", "cast ai 429", "cast ai throttle", "cast ai API limits".

CAST AI Rate Limits

Overview

The CAST AI REST API enforces rate limits per API key. The autoscaler agent communicates cluster state at 15-second intervals. For custom API integrations, implement exponential backoff and request queuing to avoid hitting limits.

Prerequisites

  • CAST AI API key configured
  • Understanding of the API endpoints you call

Rate Limit Behavior

| Aspect | Value |
|---|---|
| Rate limit scope | Per API key |
| Response on limit | HTTP 429 with Retry-After header |
| Agent sync interval | Every 15 seconds |
| Recommended polling | No more than once per 30 seconds |
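
Per the HTTP specification, a Retry-After header can carry either a number of seconds or an HTTP date. Which form CAST AI sends is not stated here, so the following is a minimal sketch that accepts both. The name `parseRetryAfter` and its 5-second fallback are illustrative assumptions, not part of any CAST AI client.

```typescript
// Hypothetical helper: converts a Retry-After header value into milliseconds to wait.
// Accepts delta-seconds ("120") or an HTTP date ("Wed, 21 Oct 2015 07:28:00 GMT").
function parseRetryAfter(
  value: string | null,
  nowMs: number = Date.now(),
  fallbackMs: number = 5000 // assumed default when the header is missing or unparseable
): number {
  if (value === null) return fallbackMs;
  const trimmed = value.trim();

  // Form 1: a non-negative integer number of seconds
  if (/^\d+$/.test(trimmed)) {
    return Number.parseInt(trimmed, 10) * 1000;
  }

  // Form 2: an HTTP date; wait until that instant, never a negative duration
  const dateMs = Date.parse(trimmed);
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - nowMs);
  }

  return fallbackMs;
}
```

Passing `nowMs` explicitly keeps the function deterministic in tests; production callers can omit it.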

Instructions

Step 1: Detect Rate Limits from Response Headers

async function castaiRequest(path: string): Promise<Response> {
  const response = await fetch(`https://api.cast.ai${path}`, {
    headers: { "X-API-Key": process.env.CASTAI_API_KEY! },
  });

  // Log rate limit headers for monitoring
  const remaining = response.headers.get("X-RateLimit-Remaining");
  const reset = response.headers.get("X-RateLimit-Reset");
  if (remaining) {
    console.log(`Rate limit remaining: ${remaining}, resets: ${reset}`);
  }

  if (response.status === 429) {
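    // Retry-After may be absent or an HTTP date; fall back to 5s unless it is a whole number of seconds
    const parsed = Number.parseInt(response.headers.get("Retry-After") ?? "", 10);
    const retryAfter = Number.isNaN(parsed) ? 5 : parsed;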
    throw new RateLimitError(retryAfter);
  }

  return response;
}

class RateLimitError extends Error {
  constructor(public retryAfterSeconds: number) {
    super(`Rate limited. Retry after ${retryAfterSeconds}s`);
  }
}

Step 2: Exponential Backoff with Jitter

async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 5
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === maxRetries) throw err;

      let delayMs: number;
      if (err instanceof RateLimitError) {
        delayMs = err.retryAfterSeconds * 1000;
      } else {
        delayMs = Math.min(1000 * Math.pow(2, attempt), 30000);
      }
      // Add jitter to prevent thundering herd
      delayMs += Math.random() * 1000;

      console.log(`Retry ${attempt + 1}/${maxRetries} in ${Math.round(delayMs)}ms`);
      await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  throw new Error("Unreachable");
}
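
The delay rules in Step 2 can be isolated into a pure function, which makes the schedule easy to inspect and test. The sketch below uses the same constants as the code above (1s base, 30s cap, up to 1s of jitter). The name `backoffDelayMs` and the injectable `random` parameter are hypothetical, introduced only for illustration.

```typescript
// Hypothetical helper mirroring the Step 2 delay rules.
function backoffDelayMs(
  attempt: number, // zero-based retry attempt
  retryAfterSeconds?: number, // server hint from a 429, when available
  random: () => number = Math.random // injectable for deterministic tests
): number {
  const baseMs =
    retryAfterSeconds !== undefined
      ? retryAfterSeconds * 1000 // honor the server's hint
      : Math.min(1000 * Math.pow(2, attempt), 30000); // 1s, 2s, 4s, ... capped at 30s
  // Up to 1s of jitter so that clients do not retry in lockstep
  return baseMs + Math.floor(random() * 1000);
}

// Schedule for attempts 0..5 with jitter disabled
const schedule = [0, 1, 2, 3, 4, 5].map((a) =>
  backoffDelayMs(a, undefined, () => 0)
);
console.log(schedule); // [ 1000, 2000, 4000, 8000, 16000, 30000 ]
```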

Step 3: Request Queue for Batch Operations

import PQueue from "p-queue";

// Limit concurrent requests and enforce interval
const castaiQueue = new PQueue({
  concurrency: 3,
  interval: 1000,
  intervalCap: 5,   // Max 5 requests per second
});

async function queuedCastAIRequest<T>(fn: () => Promise<T>): Promise<T> {
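  // Some p-queue versions type add() as Promise<T | void>; the cast keeps this signature simple
  return castaiQueue.add(() => withBackoff(fn)) as Promise<T>;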
}

// Batch process multiple clusters
const clusterIds = ["id1", "id2", "id3", "id4", "id5"];
const savings = await Promise.all(
  clusterIds.map((id) =>
    queuedCastAIRequest(() =>
      // Go through castaiRequest so a 429 surfaces as RateLimitError and triggers backoff;
      // a bare fetch() resolves normally on 429 and would never be retried
      castaiRequest(`/v1/kubernetes/clusters/${id}/savings`).then((r) => r.json())
    )
  )
);
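
`Promise.all` rejects as soon as one cluster exhausts its retries, discarding the results of the others. When partial results are acceptable, `Promise.allSettled` is one alternative. The sketch below shows how settled results might be split into successes and failures. `partitionSettled` and the `Settled` type are hypothetical names, and the sample data is made up.

```typescript
// Structural stand-in for the result objects produced by Promise.allSettled
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

// Hypothetical helper: pairs each input id with its outcome
function partitionSettled<T>(
  ids: string[],
  results: Settled<T>[]
): { ok: Map<string, T>; failed: Map<string, unknown> } {
  const ok = new Map<string, T>();
  const failed = new Map<string, unknown>();
  results.forEach((result, i) => {
    if (result.status === "fulfilled") ok.set(ids[i], result.value);
    else failed.set(ids[i], result.reason);
  });
  return { ok, failed };
}

// Illustrative data only; in practice `results` would come from
// await Promise.allSettled(clusterIds.map((id) => queuedCastAIRequest(...)))
const sample: Settled<number>[] = [
  { status: "fulfilled", value: 12.5 },
  { status: "rejected", reason: new Error("Rate limited") },
];
const { ok, failed } = partitionSettled(["id1", "id2"], sample);
console.log(ok.size, failed.size); // 1 1
```

Failed ids can then be logged or re-queued later without losing the results that did arrive.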

Step 4: Polling Best Practice

// Do NOT poll cluster state more often than once every 30 seconds
// The agent syncs every 15s; polling faster adds no value

function pollClusterStatus(
  clusterId: string,
  intervalMs = 30000
): () => void {
  const timer = setInterval(async () => {
    try {
      const status = await queuedCastAIRequest(() =>
        // castaiRequest throws RateLimitError on 429, so backoff applies here too
        castaiRequest(`/v1/kubernetes/external-clusters/${clusterId}`).then(
          (r) => r.json()
        )
      );
      console.log(`Cluster ${clusterId}: ${status.agentStatus}`);
    } catch (err) {
      console.error("Poll failed:", err);
    }
  }, intervalMs);

  // Return a stop function so callers can end polling and release the timer
  return () => clearInterval(timer);
}
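
Because the recommended minimum applies no matter what a caller passes in, one option is to clamp the interval before starting the timer. This is a minimal sketch. `effectivePollIntervalMs` and `MIN_POLL_INTERVAL_MS` are hypothetical names, with the 30-second floor taken from the recommendation above.

```typescript
// Floor taken from the polling recommendation: no more than once per 30 seconds
const MIN_POLL_INTERVAL_MS = 30000;

// Hypothetical guard: returns an interval that never drops below the floor
function effectivePollIntervalMs(requestedMs: number): number {
  // NaN or Infinity would misbehave in setInterval, so fall back to the floor
  if (!Number.isFinite(requestedMs)) return MIN_POLL_INTERVAL_MS;
  return Math.max(MIN_POLL_INTERVAL_MS, Math.floor(requestedMs));
}

console.log(effectivePollIntervalMs(5000)); // 30000
console.log(effectivePollIntervalMs(60000)); // 60000
```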

Error Handling

| Scenario | Detection | Response |
|---|---|---|
| 429 with Retry-After | Check header | Wait exact duration |
| 429 without header | Status code only | Exponential backoff from 1s |
| 5xx errors | Status >= 500 | Retry up to 3 times |
| Connection timeout | Fetch throws | Retry with longer timeout |
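
The table can also be read as a decision function. The sketch below encodes each row as written, taking a status code (or `null` when `fetch` itself threw) and returning whether to retry and how long to wait. `retryDecision` and its return shape are hypothetical. The doubling backoff between 5xx retries and the fixed 1s wait after a connection failure are assumptions, since the table specifies only the retry count and a longer timeout.

```typescript
type RetryDecision = { retry: false } | { retry: true; delayMs: number };

// Hypothetical policy function following the error-handling table.
// status: HTTP status code, or null when fetch threw (e.g. connection timeout)
// retryAfterSeconds: parsed Retry-After value, if the response carried one
// attempt: zero-based count of retries already made
function retryDecision(
  status: number | null,
  retryAfterSeconds: number | undefined,
  attempt: number
): RetryDecision {
  const backoffMs = 1000 * Math.pow(2, attempt); // 1s, 2s, 4s, ...

  if (status === 429) {
    // With the header: wait the exact duration. Without it: back off from 1s.
    const delayMs =
      retryAfterSeconds !== undefined ? retryAfterSeconds * 1000 : backoffMs;
    return { retry: true, delayMs };
  }
  if (status !== null && status >= 500) {
    // Retry up to 3 times (the wait between tries is an assumption)
    return attempt < 3 ? { retry: true, delayMs: backoffMs } : { retry: false };
  }
  if (status === null) {
    // Connection-level failure: retry; the caller is expected to raise its timeout
    return { retry: true, delayMs: 1000 }; // assumed fixed wait
  }
  return { retry: false }; // other statuses are not retried
}
```

A caller would still cap total attempts, as `withBackoff` does with `maxRetries`.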

Next Steps

For security configuration, see castai-security-basics.
