> elevenlabs-core-workflow-b

Implement ElevenLabs speech-to-speech, sound effects, audio isolation, and speech-to-text. Use when converting voice to another voice, generating sound effects from text, removing background noise, or transcribing audio. Trigger: "elevenlabs speech to speech", "voice changer", "sound effects", "audio isolation", "remove background noise", "elevenlabs transcribe".

ElevenLabs Core Workflow B — Speech-to-Speech, Sound Effects & Audio Isolation

Overview

Secondary ElevenLabs workflows beyond TTS: (1) Speech-to-Speech voice conversion, (2) Sound Effects generation from text descriptions, (3) Audio Isolation for noise removal, and (4) Speech-to-Text transcription.

Prerequisites

Completed elevenlabs-install-auth setup
For STS: source audio file in MP3/WAV/M4A format
For audio isolation: noisy audio file to clean

Instructions

Step 1: Speech-to-Speech (Voice Changer)

Transform audio from one voice to another using POST /v1/speech-to-speech/{voice_id}:

import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { createReadStream, createWriteStream } from "fs";
import { Readable } from "stream";
import { pipeline } from "stream/promises";

const client = new ElevenLabsClient();

async function speechToSpeech(
  sourceAudioPath: string,
  targetVoiceId: string,
  outputPath: string
) {
  const audio = await client.speechToSpeech.convert(targetVoiceId, {
    audio: createReadStream(sourceAudioPath),
    model_id: "eleven_english_sts_v2",  // STS-specific model
    voice_settings: JSON.stringify({
      stability: 0.5,
      similarity_boost: 0.8,
      style: 0.0,
    }),
    remove_background_noise: true,  // Built-in noise removal
  });

  await pipeline(Readable.fromWeb(audio as any), createWriteStream(outputPath));
  console.log(`Voice-converted audio saved to ${outputPath}`);
}

// Convert your voice recording to sound like "Rachel"
await speechToSpeech(
  "my_recording.mp3",
  "21m00Tcm4TlvDq8ikWAM",
  "converted.mp3"
);

cURL equivalent:

curl -X POST "https://api.elevenlabs.io/v1/speech-to-speech/21m00Tcm4TlvDq8ikWAM" \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}" \
  -F "audio=@my_recording.mp3" \
  -F "model_id=eleven_english_sts_v2" \
  -F 'voice_settings={"stability":0.5,"similarity_boost":0.8}' \
  -F "remove_background_noise=true" \
  --output converted.mp3

Step 2: Sound Effects Generation

Generate cinematic sound effects from text descriptions using POST /v1/sound-generation:

async function generateSoundEffect(
  description: string,
  outputPath: string,
  options?: {
    duration?: number;        // 0.5-30 seconds (omit to let the API choose the length)
    promptInfluence?: number; // 0-1 (default 0.3, higher = follows prompt more closely)
    loop?: boolean;           // Seamless looping (default false)
  }
) {
  const audio = await client.textToSoundEffects.convert({
    text: description,
    duration_seconds: options?.duration,
    prompt_influence: options?.promptInfluence ?? 0.3,
    loop: options?.loop ?? false,  // Forward the loop option so `loop: true` takes effect
    // model_id: "eleven_text_to_sound_v2",  // default
  });

  await pipeline(Readable.fromWeb(audio as any), createWriteStream(outputPath));
  console.log(`Sound effect saved to ${outputPath}`);
}

// Generate various sound effects
await generateSoundEffect(
  "Heavy rain on a tin roof with distant thunder",
  "rain.mp3",
  { duration: 10, promptInfluence: 0.6 }
);

await generateSoundEffect(
  "Sci-fi laser gun firing three quick bursts",
  "laser.mp3",
  { duration: 3, promptInfluence: 0.8 }
);

await generateSoundEffect(
  "Gentle forest ambiance with birds chirping",
  "forest_loop.mp3",
  { duration: 15, loop: true }  // Seamless loop for background audio
);

cURL equivalent:

curl -X POST "https://api.elevenlabs.io/v1/sound-generation" \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Heavy rain on a tin roof with distant thunder",
    "duration_seconds": 10,
    "prompt_influence": 0.6
  }' \
  --output rain.mp3

Step 3: Audio Isolation (Voice Isolator)

Remove background noise from audio using POST /v1/audio-isolation:

async function isolateVoice(
  noisyAudioPath: string,
  cleanOutputPath: string
) {
  const cleanAudio = await client.audioIsolation.audioIsolation({
    audio: createReadStream(noisyAudioPath),
  });

  await pipeline(
    Readable.fromWeb(cleanAudio as any),
    createWriteStream(cleanOutputPath)
  );
  console.log(`Clean audio saved to ${cleanOutputPath}`);
}

// Remove background noise from a recording
await isolateVoice("noisy_interview.mp3", "clean_interview.mp3");

Streaming variant for large files (POST /v1/audio-isolation/stream):

async function isolateVoiceStreaming(
  noisyAudioPath: string,
  cleanOutputPath: string
) {
  const stream = await client.audioIsolation.audioIsolationStream({
    audio: createReadStream(noisyAudioPath),
  });

  // pipeline respects backpressure and resolves only once the file is fully written
  await pipeline(Readable.from(stream as any), createWriteStream(cleanOutputPath));
}

cURL equivalent:

curl -X POST "https://api.elevenlabs.io/v1/audio-isolation" \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}" \
  -F "audio=@noisy_interview.mp3" \
  --output clean_interview.mp3

Step 4: Speech-to-Text (Transcription)

Transcribe audio, optionally with speaker diarization and word-level timestamps, using POST /v1/speech-to-text:

async function transcribeAudio(audioPath: string) {
  const result = await client.speechToText.convert({
    file: createReadStream(audioPath),  // This endpoint's upload field is `file`, not `audio`
    model_id: "scribe_v1",  // ElevenLabs' STT model
    // language_code: "en",  // Optional: force language
    // diarize: true,        // Enable speaker detection
    // timestamps_granularity: "word",  // "none", "word", or "character"
  });

  console.log("Transcription:", result.text);

  // Word-level timestamps
  if (result.words) {
    for (const word of result.words) {
      console.log(`[${word.start.toFixed(2)}-${word.end.toFixed(2)}] ${word.text}`);
    }
  }

  return result;
}

await transcribeAudio("podcast_episode.mp3");
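
When diarization is enabled, it is often handy to collapse the word list into speaker turns. The following is a minimal sketch, not part of the SDK: it assumes each word carries an optional `speaker_id` field, and the `TranscriptWord` and `SpeakerTurn` types and the `groupBySpeaker` helper are hypothetical names introduced only for illustration.

```typescript
// Assumed shape of one transcript word; only the fields this sketch needs.
interface TranscriptWord {
  text: string;
  speaker_id?: string; // assumption: present when diarization is enabled
}

interface SpeakerTurn {
  speaker: string;
  text: string;
}

// Merge consecutive words from the same speaker into a single turn.
function groupBySpeaker(words: TranscriptWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const word of words) {
    const speaker = word.speaker_id ?? "unknown";
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.text += " " + word.text;
    } else {
      turns.push({ speaker, text: word.text });
    }
  }
  // Collapse repeated whitespace so separate spacing tokens do not double up.
  return turns.map((turn) => ({
    speaker: turn.speaker,
    text: turn.text.replace(/\s+/g, " ").trim(),
  }));
}
```

Calling `groupBySpeaker(result.words ?? [])` after `transcribeAudio` would yield one entry per uninterrupted stretch of speech, provided the response words match the assumed shape.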

API Endpoint Summary

| Feature | Method | Endpoint | Billing |
| --- | --- | --- | --- |
| Speech-to-Speech | POST | `/v1/speech-to-speech/{voice_id}` | Per character |
| Sound Effects | POST | `/v1/sound-generation` | Per generation |
| Audio Isolation | POST | `/v1/audio-isolation` | 1,000 chars/min of audio |
| Audio Isolation Stream | POST | `/v1/audio-isolation/stream` | 1,000 chars/min of audio |
| Speech-to-Text | POST | `/v1/speech-to-text` | Per audio minute |
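
The Audio Isolation rate of 1,000 characters per minute can be turned into a rough budget estimate. This is a sketch under stated assumptions: it assumes usage is prorated linearly by duration and rounded up to a whole character, which the table does not confirm, and `isolationCharacterCost` is a hypothetical helper name.

```typescript
const ISOLATION_CHARS_PER_MINUTE = 1000; // rate taken from the billing column above

// Estimate the character cost of isolating a clip of the given length.
// Assumption: linear proration, rounded up; the real rounding rule may differ.
function isolationCharacterCost(durationSeconds: number): number {
  if (durationSeconds < 0) {
    throw new RangeError("durationSeconds must not be negative");
  }
  return Math.ceil((durationSeconds * ISOLATION_CHARS_PER_MINUTE) / 60);
}
```

Under these assumptions a 90-second clip comes to 1,500 characters. Treat the figure as an estimate and check it against the usage dashboard.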

Sound Effect Tips

Be specific: "wooden door creaking slowly open in a quiet room" beats "door sound"
Specify quantity: "three quick gunshots" vs "gunshots"
Set mood: "eerie", "cheerful", "aggressive" changes the output character
Use prompt_influence: 0.6-0.8 for precise results, 0.2-0.4 for creative variation
Max duration: 30 seconds per generation
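
The numeric limits mentioned in these tips and in Step 2 can be enforced before a request is sent. The sketch below is illustrative only: `buildSoundEffectRequest` and `SoundEffectRequest` are hypothetical names, and clamping out-of-range values (rather than rejecting them) is a design choice assumed here, not something the API prescribes.

```typescript
// Hypothetical request shape mirroring the JSON body used in the cURL example.
interface SoundEffectRequest {
  text: string;
  duration_seconds?: number;
  prompt_influence: number;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Build a request body with duration kept in 0.5-30 s and prompt influence in 0-1.
function buildSoundEffectRequest(
  text: string,
  duration?: number,
  promptInfluence: number = 0.3
): SoundEffectRequest {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    throw new Error("Sound effect description must not be empty");
  }
  const request: SoundEffectRequest = {
    text: trimmed,
    prompt_influence: clamp(promptInfluence, 0, 1),
  };
  // Leave duration_seconds unset so the API can pick a length automatically.
  if (duration !== undefined) {
    request.duration_seconds = clamp(duration, 0.5, 30);
  }
  return request;
}
```

Clamping keeps a batch job running when one entry has a bad value. Throwing instead would surface configuration mistakes earlier.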

Audio Isolation Limits

| Aspect | Limit |
| --- | --- |
| Max file size | 500 MB |
| Max duration | 1 hour |
| Supported formats | MP3, WAV, M4A, FLAC, OGG, WEBM |
| PCM optimization | Use `file_format: "pcm_s16le_16"` for lowest latency |
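
A pre-flight check against these limits can avoid uploading a file that will be rejected. This is a minimal sketch: `checkIsolationInput` is a hypothetical helper, it assumes "500 MB" means 500 × 1024 × 1024 bytes, and it expects the caller to supply the file size and duration (how those are measured is outside this example).

```typescript
const MAX_ISOLATION_BYTES = 500 * 1024 * 1024; // assumption: 500 MB read as 500 MiB
const MAX_ISOLATION_SECONDS = 60 * 60;         // 1 hour
const SUPPORTED_EXTENSIONS = ["mp3", "wav", "m4a", "flac", "ogg", "webm"];

// Return a list of human-readable problems; an empty list means the input looks acceptable.
function checkIsolationInput(
  fileName: string,
  sizeBytes: number,
  durationSeconds: number
): string[] {
  const problems: string[] = [];
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (SUPPORTED_EXTENSIONS.indexOf(extension) === -1) {
    problems.push(`unsupported format: ${extension || "(none)"}`);
  }
  if (sizeBytes > MAX_ISOLATION_BYTES) {
    problems.push("file exceeds 500 MB");
  }
  if (durationSeconds > MAX_ISOLATION_SECONDS) {
    problems.push("audio exceeds 1 hour");
  }
  return problems;
}
```

The extension check is only a heuristic, since a file's name does not guarantee its actual encoding.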

Error Handling

| Error | HTTP | Cause | Solution |
| --- | --- | --- | --- |
| `model_can_not_do_voice_conversion` | 400 | Wrong model for STS | Use `eleven_english_sts_v2` |
| `audio_too_short` | 400 | STS input under 1 second | Use longer audio clip |
| `audio_too_long` | 400 | STS input over limit | Trim to under 5 minutes |
| `invalid_sound_prompt` | 400 | Nonsensical SFX description | Write descriptive, specific prompts |
| `file_too_large` | 413 | Audio isolation over 500 MB | Compress or split the file |
| `quota_exceeded` | 401 | Character/generation limit hit | Check usage dashboard |
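
The table above can be encoded as a lookup so that application code reports a suggested fix alongside the raw error. This is a sketch only: `ERROR_ADVICE` and `adviceFor` are hypothetical names, and because the way an error code is extracted from a failed response is not covered here, the function simply takes the code string as input.

```typescript
interface ErrorAdvice {
  http: number;
  cause: string;
  solution: string;
}

// Entries copied from the error-handling table.
const ERROR_ADVICE = new Map<string, ErrorAdvice>([
  ["model_can_not_do_voice_conversion", { http: 400, cause: "Wrong model for STS", solution: "Use eleven_english_sts_v2" }],
  ["audio_too_short", { http: 400, cause: "STS input under 1 second", solution: "Use longer audio clip" }],
  ["audio_too_long", { http: 400, cause: "STS input over limit", solution: "Trim to under 5 minutes" }],
  ["invalid_sound_prompt", { http: 400, cause: "Nonsensical SFX description", solution: "Write descriptive, specific prompts" }],
  ["file_too_large", { http: 413, cause: "Audio isolation over 500 MB", solution: "Compress or split the file" }],
  ["quota_exceeded", { http: 401, cause: "Character/generation limit hit", solution: "Check usage dashboard" }],
]);

// Return a "cause: solution" hint, or a fallback for codes not in the table.
function adviceFor(code: string): string {
  const entry = ERROR_ADVICE.get(code);
  return entry
    ? `${entry.cause}: ${entry.solution}`
    : `Unrecognized error code: ${code}`;
}
```

Keeping the mapping in one place makes it straightforward to extend as further error codes are encountered.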

Next Steps

For common errors, see elevenlabs-common-errors. For SDK patterns, see elevenlabs-sdk-patterns.

Stats: 0 installs/week, 2.1K GitHub stars, first seen Mar 23, 2026

Repo: jeremylongshore/claude-code-plugins-plus-skills (by jeremylongshore)