
Optimize AssemblyAI API performance with caching, parallel processing, and model selection. Use when experiencing slow transcriptions, implementing caching strategies, or optimizing throughput for batch transcription workloads. Trigger with phrases like "assemblyai performance", "optimize assemblyai", "assemblyai latency", "assemblyai caching", "assemblyai slow", "assemblyai batch".


AssemblyAI Performance Tuning

Overview

Optimize AssemblyAI transcription performance through model selection, parallel processing, caching, and webhook-based architectures.

Prerequisites

  • assemblyai package installed
  • Understanding of async patterns
  • Redis or in-memory cache available (optional)

Latency Benchmarks (Typical)

Async Transcription

| Audio Duration | Approx. Processing Time | Notes |
| --- | --- | --- |
| 30 seconds | ~10-15 seconds | Includes queue time |
| 5 minutes | ~30-60 seconds | Scales sub-linearly |
| 1 hour | ~3-5 minutes | Depends on queue load |
| 10 hours | ~15-30 minutes | Max async duration |

Streaming

| Metric | Value |
| --- | --- |
| First partial transcript | ~300ms (P50) |
| Final transcript latency | ~500ms (P50) |
| End-of-turn detection | Automatic with endpointing |

Model Speed vs. Accuracy

| Model | Speed | Accuracy | Price/hr |
| --- | --- | --- | --- |
| nano | Fastest | Good | $0.12 |
| best (Universal-3) | Standard | Highest | $0.37 |
| nova-3 (streaming) | Real-time | High | $0.47 |
| nova-3-pro (streaming) | Real-time | Highest | $0.47 |
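The async rows of this table can be encoded as a small helper for batch pipelines. The function name and selection logic below are illustrative, not part of the AssemblyAI SDK — only the `'nano'` and `'best'` model identifiers come from the table above:

```typescript
// Hypothetical helper: map a workload's priority to an async speech_model value.
// 'nano' is the fastest and cheapest; 'best' (Universal-3) is the most accurate.
type SpeechModel = 'nano' | 'best';

function pickSpeechModel(opts: { prioritize: 'speed' | 'accuracy' }): SpeechModel {
  return opts.prioritize === 'speed' ? 'nano' : 'best';
}

// Example: throughput-sensitive batch jobs
const model = pickSpeechModel({ prioritize: 'speed' }); // 'nano'
```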

Instructions

Step 1: Choose the Right Model

import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

// For highest accuracy (default)
const accurate = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
});

// For fastest processing and lowest cost
const fast = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
});

Step 2: Parallel Batch Processing

import PQueue from 'p-queue';

const queue = new PQueue({ concurrency: 10 });

async function batchTranscribe(audioUrls: string[]) {
  const results = await Promise.all(
    audioUrls.map(url =>
      queue.add(() =>
        client.transcripts.transcribe({ audio: url, speech_model: 'nano' })
      )
    )
  );

  // queue.add() can resolve to undefined in recent p-queue versions, so guard with ?.
  return results.filter(t => t?.status === 'completed');
}

// Process 100 files with 10 concurrent jobs
const urls = Array.from({ length: 100 }, (_, i) => `https://storage.example.com/audio-${i}.mp3`);
const transcripts = await batchTranscribe(urls);
console.log(`Completed: ${transcripts.length}/${urls.length}`);

Step 3: Use Webhooks Instead of Polling

// SLOW: transcribe() polls every 3 seconds until done
const slow = await client.transcripts.transcribe({ audio: audioUrl });

// FAST: submit() returns immediately, webhook notifies on completion
const fast = await client.transcripts.submit({
  audio: audioUrl,
  webhook_url: 'https://your-app.com/webhooks/assemblyai',
});
// Your webhook handler processes the result — no polling overhead
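A minimal handler sketch, decoupled from any web framework. It assumes the webhook POST body carries `transcript_id` and `status` (check AssemblyAI's webhook docs for the exact payload shape); the helper name and the injected fetcher are ours:

```typescript
// Sketch of a webhook handler core: on completion, fetch the full transcript.
// The fetcher is injected so the logic stays testable without network access.
interface WebhookBody {
  transcript_id: string;
  status: 'completed' | 'error';
}

async function handleAssemblyAIWebhook<T>(
  body: WebhookBody,
  fetchTranscript: (id: string) => Promise<T>
): Promise<T | null> {
  if (body.status !== 'completed') {
    // Transcription failed — log and skip the fetch.
    console.error(`Transcript ${body.transcript_id} failed`);
    return null;
  }
  // Only now fetch the result — no polling ever happened.
  return fetchTranscript(body.transcript_id);
}

// Wiring it into Express might look like:
// app.post('/webhooks/assemblyai', async (req, res) => {
//   await handleAssemblyAIWebhook(req.body, id => client.transcripts.get(id));
//   res.sendStatus(200);
// });
```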

Step 4: Cache Transcript Results

import { LRUCache } from 'lru-cache';
import type { Transcript } from 'assemblyai';

const transcriptCache = new LRUCache<string, Transcript>({
  max: 500,
  ttl: 60 * 60 * 1000, // 1 hour
});

async function getCachedTranscript(transcriptId: string): Promise<Transcript> {
  const cached = transcriptCache.get(transcriptId);
  if (cached) return cached;

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    transcriptCache.set(transcriptId, transcript);
  }
  return transcript;
}

Step 5: Redis Cache for Distributed Systems

import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL!);

async function getCachedTranscriptRedis(transcriptId: string): Promise<Transcript> {
  const cached = await redis.get(`transcript:${transcriptId}`);
  if (cached) return JSON.parse(cached);

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    await redis.setex(
      `transcript:${transcriptId}`,
      3600, // 1 hour TTL
      JSON.stringify(transcript)
    );
  }
  return transcript;
}

Step 6: Minimize Feature Overhead

// Only enable features you actually need — each adds processing time

// Minimal (fastest)
const minimal = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
  punctuate: true,
  format_text: true,
});

// Full intelligence (slower, more expensive)
const full = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
  speaker_labels: true,
  sentiment_analysis: true,
  entity_detection: true,
  auto_highlights: true,
  content_safety: true,
  iab_categories: true,
  summarization: true,
  summary_type: 'bullets',
});

Step 7: Performance Monitoring

async function timedTranscribe(audioUrl: string, options: Record<string, any> = {}) {
  const start = Date.now();
  const transcript = await client.transcripts.transcribe({
    audio: audioUrl,
    ...options,
  });
  const durationMs = Date.now() - start;

  const stats = {
    transcriptId: transcript.id,
    status: transcript.status,
    audioDuration: transcript.audio_duration,
    processingTimeMs: durationMs,
    ratio: transcript.audio_duration
      ? (durationMs / 1000 / transcript.audio_duration).toFixed(2)
      : 'N/A',
    wordCount: transcript.words?.length ?? 0,
    model: options.speech_model ?? 'best',
  };

  console.log('Transcription stats:', stats);
  return { transcript, stats };
}

Output

  • Optimal model selection based on speed/accuracy/cost trade-offs
  • Parallel batch processing with concurrency control
  • Webhook-based architecture (eliminates polling overhead)
  • In-memory and Redis caching for transcript retrieval
  • Performance monitoring with processing time ratios

Error Handling

| Issue | Cause | Solution |
| --- | --- | --- |
| Slow transcription | Large file + best model | Use nano model or split audio |
| Queue backlog | Too many concurrent submissions | Limit concurrency with p-queue |
| Stale cache data | Transcript re-processed | Set appropriate TTL; invalidate on webhook |
| Polling overhead | Using transcribe() for many files | Switch to submit() + webhooks |
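The "invalidate on webhook" fix can be sketched against the key scheme from Step 5. The `del(key)` call is standard ioredis; the store interface and helper names below are ours, kept minimal so the logic is testable without a live Redis:

```typescript
// Invalidate a cached transcript when a webhook signals re-processing.
// Works with any client exposing del(key) — ioredis does.
interface KeyValueStore {
  del(key: string): Promise<number>;
}

// Same key scheme as the Redis cache in Step 5
const transcriptKey = (id: string) => `transcript:${id}`;

async function invalidateTranscript(store: KeyValueStore, transcriptId: string): Promise<void> {
  // Remove the stale entry; the next read repopulates it via client.transcripts.get()
  await store.del(transcriptKey(transcriptId));
}
```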

Next Steps

For cost optimization, see assemblyai-cost-tuning.
