> elevenlabs-core-workflow-b
Implement ElevenLabs speech-to-speech, sound effects, audio isolation, and speech-to-text. Use when converting voice to another voice, generating sound effects from text, removing background noise, or transcribing audio. Trigger: "elevenlabs speech to speech", "voice changer", "sound effects", "audio isolation", "remove background noise", "elevenlabs transcribe".
curl "https://skillshub.wtf/jeremylongshore/claude-code-plugins-plus-skills/elevenlabs-core-workflow-b?format=md"ElevenLabs Core Workflow B — Speech-to-Speech, Sound Effects & Audio Isolation
Overview
Secondary ElevenLabs workflows beyond TTS: (1) Speech-to-Speech voice conversion, (2) Sound Effects generation from text descriptions, (3) Audio Isolation for noise removal, and (4) Speech-to-Text transcription.
Prerequisites
- Completed
elevenlabs-install-authsetup - For STS: source audio file in MP3/WAV/M4A format
- For audio isolation: noisy audio file to clean
Instructions
Step 1: Speech-to-Speech (Voice Changer)
Transform audio from one voice to another using POST /v1/speech-to-speech/{voice_id}:
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { createReadStream, createWriteStream } from "fs";
import { Readable } from "stream";
import { pipeline } from "stream/promises";
const client = new ElevenLabsClient();
async function speechToSpeech(
sourceAudioPath: string,
targetVoiceId: string,
outputPath: string
) {
const audio = await client.speechToSpeech.convert(targetVoiceId, {
audio: createReadStream(sourceAudioPath),
model_id: "eleven_english_sts_v2", // STS-specific model
voice_settings: JSON.stringify({
stability: 0.5,
similarity_boost: 0.8,
style: 0.0,
}),
remove_background_noise: true, // Built-in noise removal
});
await pipeline(Readable.fromWeb(audio as any), createWriteStream(outputPath));
console.log(`Voice-converted audio saved to ${outputPath}`);
}
// Convert your voice recording to sound like "Rachel"
await speechToSpeech(
"my_recording.mp3",
"21m00Tcm4TlvDq8ikWAM",
"converted.mp3"
);
cURL equivalent:
curl -X POST "https://api.elevenlabs.io/v1/speech-to-speech/21m00Tcm4TlvDq8ikWAM" \
-H "xi-api-key: ${ELEVENLABS_API_KEY}" \
-F "audio=@my_recording.mp3" \
-F "model_id=eleven_english_sts_v2" \
-F 'voice_settings={"stability":0.5,"similarity_boost":0.8}' \
-F "remove_background_noise=true" \
--output converted.mp3
Step 2: Sound Effects Generation
Generate cinematic sound effects from text descriptions using POST /v1/sound-generation:
async function generateSoundEffect(
description: string,
outputPath: string,
options?: {
duration?: number; // 0.5-30 seconds (null = auto)
promptInfluence?: number; // 0-1 (default 0.3, higher = follows prompt more closely)
loop?: boolean; // Seamless looping (default false)
}
) {
const audio = await client.textToSoundEffects.convert({
text: description,
duration_seconds: options?.duration,
prompt_influence: options?.promptInfluence ?? 0.3,
// model_id: "eleven_text_to_sound_v2", // default
});
await pipeline(Readable.fromWeb(audio as any), createWriteStream(outputPath));
console.log(`Sound effect saved to ${outputPath}`);
}
// Generate various sound effects
await generateSoundEffect(
"Heavy rain on a tin roof with distant thunder",
"rain.mp3",
{ duration: 10, promptInfluence: 0.6 }
);
await generateSoundEffect(
"Sci-fi laser gun firing three quick bursts",
"laser.mp3",
{ duration: 3, promptInfluence: 0.8 }
);
await generateSoundEffect(
"Gentle forest ambiance with birds chirping",
"forest_loop.mp3",
{ duration: 15, loop: true } // Seamless loop for background audio
);
cURL equivalent:
curl -X POST "https://api.elevenlabs.io/v1/sound-generation" \
-H "xi-api-key: ${ELEVENLABS_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"text": "Heavy rain on a tin roof with distant thunder",
"duration_seconds": 10,
"prompt_influence": 0.6
}' \
--output rain.mp3
Step 3: Audio Isolation (Voice Isolator)
Remove background noise from audio using POST /v1/audio-isolation:
async function isolateVoice(
noisyAudioPath: string,
cleanOutputPath: string
) {
const cleanAudio = await client.audioIsolation.audioIsolation({
audio: createReadStream(noisyAudioPath),
});
await pipeline(
Readable.fromWeb(cleanAudio as any),
createWriteStream(cleanOutputPath)
);
console.log(`Clean audio saved to ${cleanOutputPath}`);
}
// Remove background noise from a recording
await isolateVoice("noisy_interview.mp3", "clean_interview.mp3");
Streaming variant for large files (POST /v1/audio-isolation/stream):
async function isolateVoiceStreaming(
noisyAudioPath: string,
cleanOutputPath: string
) {
const stream = await client.audioIsolation.audioIsolationStream({
audio: createReadStream(noisyAudioPath),
});
const writer = createWriteStream(cleanOutputPath);
for await (const chunk of stream) {
writer.write(chunk);
}
writer.end();
}
cURL equivalent:
curl -X POST "https://api.elevenlabs.io/v1/audio-isolation" \
-H "xi-api-key: ${ELEVENLABS_API_KEY}" \
-F "audio=@noisy_interview.mp3" \
--output clean_interview.mp3
Step 4: Speech-to-Text (Transcription)
Transcribe audio with speaker diarization using POST /v1/speech-to-text:
async function transcribeAudio(audioPath: string) {
const result = await client.speechToText.convert({
audio: createReadStream(audioPath),
model_id: "scribe_v1", // ElevenLabs' STT model
// language_code: "en", // Optional: force language
// diarize: true, // Enable speaker detection
// timestamps_granularity: "word", // "word" or "character"
});
console.log("Transcription:", result.text);
// Word-level timestamps
if (result.words) {
for (const word of result.words) {
console.log(`[${word.start.toFixed(2)}-${word.end.toFixed(2)}] ${word.text}`);
}
}
return result;
}
await transcribeAudio("podcast_episode.mp3");
API Endpoint Summary
| Feature | Method | Endpoint | Billing |
|---|---|---|---|
| Speech-to-Speech | POST | /v1/speech-to-speech/{voice_id} | Per character |
| Sound Effects | POST | /v1/sound-generation | Per generation |
| Audio Isolation | POST | /v1/audio-isolation | 1,000 chars/min of audio |
| Audio Isolation Stream | POST | /v1/audio-isolation/stream | 1,000 chars/min of audio |
| Speech-to-Text | POST | /v1/speech-to-text | Per audio minute |
Sound Effect Tips
- Be specific: "wooden door creaking slowly open in a quiet room" beats "door sound"
- Specify quantity: "three quick gunshots" vs "gunshots"
- Set mood: "eerie", "cheerful", "aggressive" changes the output character
- Use
prompt_influence: 0.6-0.8for precise results,0.2-0.4for creative variation - Max duration: 30 seconds per generation
Audio Isolation Limits
| Aspect | Limit |
|---|---|
| Max file size | 500 MB |
| Max duration | 1 hour |
| Supported formats | MP3, WAV, M4A, FLAC, OGG, WEBM |
| PCM optimization | Use file_format: "pcm_s16le_16" for lowest latency |
Error Handling
| Error | HTTP | Cause | Solution |
|---|---|---|---|
model_can_not_do_voice_conversion | 400 | Wrong model for STS | Use eleven_english_sts_v2 |
audio_too_short | 400 | STS input under 1 second | Use longer audio clip |
audio_too_long | 400 | STS input over limit | Trim to under 5 minutes |
invalid_sound_prompt | 400 | Nonsensical SFX description | Write descriptive, specific prompts |
file_too_large | 413 | Audio isolation over 500MB | Compress or split the file |
quota_exceeded | 401 | Character/generation limit hit | Check usage dashboard |
Resources
Next Steps
For common errors, see elevenlabs-common-errors. For SDK patterns, see elevenlabs-sdk-patterns.
> related_skills --same-repo
> fathom-cost-tuning
Optimize Fathom API usage and plan selection. Trigger with phrases like "fathom cost", "fathom pricing", "fathom plan".
> fathom-core-workflow-b
Sync Fathom meeting data to CRM and build automated follow-up workflows. Use when integrating Fathom with Salesforce, HubSpot, or custom CRMs, or creating automated post-meeting email summaries. Trigger with phrases like "fathom crm sync", "fathom salesforce", "fathom follow-up", "fathom post-meeting workflow".
> fathom-core-workflow-a
Build a meeting analytics pipeline with Fathom transcripts and summaries. Use when extracting insights from meetings, building CRM sync, or creating automated meeting follow-up workflows. Trigger with phrases like "fathom analytics", "fathom meeting pipeline", "fathom transcript analysis", "fathom action items sync".
> fathom-common-errors
Diagnose and fix Fathom API errors including auth failures and missing data. Use when API calls fail, transcripts are empty, or webhooks are not firing. Trigger with phrases like "fathom error", "fathom not working", "fathom api failure", "fix fathom".