> azure-speech

Expert knowledge for Azure AI Speech development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building, debugging, or optimizing Azure AI Speech applications. Not for Azure AI services (use azure-ai-services), Azure AI Vision (use azure-ai-vision), Azure AI Custom Vision (use azure-custom-vision), Azure Translator (use azure-translator).

fetch
$ curl "https://skillshub.wtf/MicrosoftDocs/Agent-Skills/azure-speech?format=md"
SKILL.md (azure-speech)

Azure AI Speech Skill

This skill provides expert guidance for Azure AI Speech, covering troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching.

How to Use This Skill

IMPORTANT for Agent: Use the Category Index below to locate relevant sections. For categories with line ranges (e.g., L35-L120), use read_file with the specified lines. For categories with file links (e.g., [security.md](security.md)), use read_file on the linked reference file.

IMPORTANT for Agent: If metadata.generated_at is more than 3 months old, suggest that the user pull the latest version from the repository. If the mcp_microsoftdocs tools are not available, suggest that the user install them: Installation Guide

This skill requires network access to fetch documentation content:

  • Preferred: Use mcp_microsoftdocs:microsoft_docs_fetch with query string from=learn-agent-skill. Returns Markdown.
  • Fallback: Use fetch_webpage with query string from=learn-agent-skill&accept=text/markdown. Returns Markdown.

Category Index
| Category | Lines | Description |
| --- | --- | --- |
| Troubleshooting | L36-L46 | Diagnosing and fixing common Azure Speech issues (TTS, STT, SDK, containers, Voice Live, Foundry), including error codes, CRL/compatibility, and retrieving session/transcription IDs. |
| Best Practices | L47-L62 | Best practices for Azure AI Speech: data prep, custom voice recording/training, latency and memory tuning, Voice Live UX (interruptions, greetings), and improving recognition accuracy and hardware. |
| Decision Making | L63-L81 | Guidance on choosing speech features, evaluating models and devices, planning large-scale/batch use, and migrating between Speech/Voice API versions and related services. |
| Limits & Quotas | L82-L90 | Quotas, limits, and usage patterns for Azure Speech: batch TTS, custom/pro voice training & deployment, and short audio STT, plus throttling and capacity planning guidance. |
| Security | L91-L102 | Securing Azure AI Speech: auth with Entra ID, RBAC, network isolation (VNet, Private Link, sovereign clouds), BYOS storage, encryption/keys, and voice talent consent management. |
| Configuration | L103-L138 | Configuring Azure AI Speech behavior: audio I/O, regions, logging, storage, batch jobs, SSML, phonemes, custom speech/voice, and Voice Live/avatars settings and performance. |
| Integrations & Coding Patterns | L139-L160 | Patterns and APIs for integrating Azure Speech/Voice Live with apps and telephony: real-time agents, STT/TTS, translation, REST/SDK usage, OpenAI chat, function calling, and personal voice. |
| Deployment | L161-L172 | Deploying and scaling Azure AI Speech: Docker/Kubernetes containers, on-prem STT/TTS, custom speech models/endpoints, language ID, and batch/long-form synthesis workflows. |

Troubleshooting
| Topic | URL |
| --- | --- |
| Troubleshoot common Azure text to speech issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/faq-tts |
| Retrieve Speech to text session and transcription IDs for support | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-get-speech-session-id |
| Resolve common Azure Speech in Foundry issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/known-issues |
| Resolve Azure AI Speech SDK CRL compatibility issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-sdk-1-48-2 |
| Troubleshoot Speech service container deployments | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-faq |
| Troubleshoot common Azure Speech SDK issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/troubleshooting |
| Troubleshoot common Voice Live API questions and issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-faq |

Best Practices
| Topic | URL |
| --- | --- |
| Create high-quality human-labeled speech transcriptions | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-human-labeled-transcriptions |
| Prepare training data for professional custom voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-voice-training-data |
| Apply best practices to reduce Speech synthesis latency | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-lower-speech-synthesis-latency |
| Track and manage Azure Speech SDK memory usage | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-track-speech-sdk-memory-usage |
| Handle user interruptions and chat truncation in Voice Live | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-auto-truncation |
| Use interim responses in Voice Live to reduce latency gaps | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-interim-response |
| Configure proactive greetings for Voice Live agents | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-proactive-messages |
| Improve speech recognition with phrase lists | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/improve-accuracy-phrase-list |
| Apply keyword recognition design and accuracy guidelines | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/keyword-recognition-guidelines |
| Record high-quality samples for custom voice training | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/record-custom-voice-samples |
| Back up and recover custom Speech and Voice resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/resiliency-and-recovery-plan |
| Design microphone arrays optimized for Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-sdk-microphone |

Decision Making
| Topic | URL |
| --- | --- |
| Plan large-scale transcription with batch processing | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription |
| Evaluate custom voice lite before professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/custom-neural-voice-lite |
| Choose Embedded Speech for offline and hybrid scenarios | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/embedded-speech |
| Evaluate device suitability for embedded speech models | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/embedded-speech-performance-evaluations |
| Evaluate and compare custom speech model accuracy | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-inspect-data |
| Train custom speech models and understand cost behavior | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-train-model |
| Migrate Speech to text REST API from v3.2 to 2024-11-15 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-2024-11-15 |
| Migrate Speech-to-text REST from 2024-11-15 to 2025-10-15 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-2025-10-15 |
| Migrate from retired Speech intent recognition to Language or OpenAI | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-intent-recognition |
| Migrate from Long Audio API to Batch synthesis | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-batch-synthesis |
| Migrate from v3 text-to-speech to custom voice REST API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-custom-voice-api |
| Migrate Speech-to-text REST from v3.0 to v3.1 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-v3-0-to-v3-1 |
| Migrate Speech-to-text REST from v3.1 to v3.2 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-v3-1-to-v3-2 |
| Assess capabilities and regions for personal voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-overview |
| Decide when to use Whisper for speech tasks | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/whisper-overview |

Limits & Quotas
| Topic | URL |
| --- | --- |
| Manage custom speech model and endpoint lifecycle | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle |
| Deploy professional voice models to custom endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-deploy-endpoint |
| Train professional voice models and understand duration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-train-voice |
| Use Speech-to-text REST API for short audio | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text-short |
| Apply Azure Speech quotas, limits, and throttling guidance | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-quotas-and-limits |

Security
| Topic | URL |
| --- | --- |
| Configure BYOS storage for Azure Speech resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/bring-your-own-storage-speech-resource |
| Configure Microsoft Entra authentication for Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-azure-ad-auth |
| Manage voice talent consent for professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-create-consent |
| Assign Azure RBAC roles for Speech resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/role-based-access-control |
| Use Azure Speech service in sovereign clouds | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/sovereign-clouds |
| Manage Speech service data-at-rest encryption and keys | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-encryption-of-data-at-rest |
| Secure Speech service with Virtual Network service endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-service-vnet-service-endpoint |
| Configure Azure Private Link for Speech service | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-private-link |

Configuration
| Topic | URL |
| --- | --- |
| Configure Microsoft Audio Stack in Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/audio-processing-speech-sdk |
| Configure Batch synthesis properties for text-to-speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-synthesis-properties |
| Configure audio data locations for batch transcription | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-audio-data |
| Check status and retrieve batch transcription results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-get |
| Configure BYOS storage for Speech to text | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/bring-your-own-storage-speech-resource-speech-to-text |
| Define UPS phonetic pronunciations for Speech to text | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/customize-pronunciation |
| Configure OpenSSL on Linux for Azure Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-openssl-linux |
| Control and monitor Speech SDK service connections | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-control-connections |
| Create and manage custom speech fine-tuning projects | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-create-project |
| Prepare and upload datasets for custom speech training | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-upload-data |
| Configure real-time speech recognition inputs and options | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-recognize-speech |
| Select and configure audio input devices in Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-select-audio-input-devices |
| Use visemes for facial animation with Speech service | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis-viseme |
| Configure Speech SDK audio input streams | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-audio-input-streams |
| Configure compressed audio input for Speech SDK and CLI | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-codec-compressed-audio-input-streams |
| Enable and configure Speech SDK diagnostic logging | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-logging |
| Check Azure Speech language and voice availability | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support |
| Configure audio and transcription logging for Speech recognition | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/logging-audio-transcription |
| Upload and validate training datasets for professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-create-training-set |
| Use correct regional endpoints for Azure Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions |
| Configure Speech containers storage, logging, and security | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-configuration |
| Use Speech phonetic alphabets and IPA in SSML | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-ssml-phonetic-sets |
| Control speech output using SSML configuration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup |
| Configure pronunciation with SSML phonemes and lexicons | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-pronunciation |
| Structure SSML documents and events for Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-structure |
| Configure voice and sound using SSML in Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-voice |
| Configure Speech CLI datastore search order and files | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/spx-data-store-configuration |
| Configure output destinations for Speech CLI results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/spx-output-options |
| Configure batch synthesis properties for TTS avatars | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/batch-synthesis-avatar-properties |
| Reference Voice Live API events, models, and settings (2025-10-01) | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-api-reference-2025-10-01 |
| Reference Voice Live API events and settings (2026-01-01-preview) | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-api-reference-2026-01-01-preview |
| Customize Voice Live models and performance settings | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-how-to-customize |

Integrations & Coding Patterns
| Topic | URL |
| --- | --- |
| Integrate Speech service with call center telephony | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/call-center-telephony-integration |
| Use Speech SDK APIs to handle recognition results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-speech-recognition-results |
| Integrate custom models with Voice Live BYOM | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-bring-your-own-model |
| Implement text-to-speech synthesis with Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis |
| Implement speech translation with Azure Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-translate-speech |
| Build real-time voice agents with Voice Live and Foundry Agent Service | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-agent-integration |
| Implement function calling with Voice Live API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-function-calling |
| Call Azure LLM-speech API for transcription and translation | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/llm-speech |
| Integrate Azure Speech with Azure OpenAI chat | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/openai-speech |
| Add and manage user consent for personal voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-create-consent |
| Create personal voice projects via Custom Voice API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-create-project |
| Integrate batch transcription with Power Automate and Logic Apps | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/power-automate-batch-transcription |
| Integrate with Speech-to-text REST API 2025-10-15 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text |
| Call Text-to-speech REST API for voice synthesis | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-text-to-speech |
| Generate Speech service REST clients from Swagger | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/swagger-documentation |
| Control text to speech avatar gestures with SSML | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/avatar-gestures-with-ssml |
| Use Voice Live WebSocket events and properties | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-how-to |
| Integrate Voice Live with telephony using Call Center Accelerator | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-telephony |

Deployment
| Topic | URL |
| --- | --- |
| Use Batch synthesis API for long-form text-to-speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-synthesis |
| Deploy custom speech models and endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-deploy-model |
| Scale Speech containers with batch processing kit | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-batch-processing |
| Run custom speech to text containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-cstt |
| Deploy and run Speech containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-howto |
| Run Speech containers on Kubernetes with Helm | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-howto-on-premises |
| Deploy language identification containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-lid |
| Deploy neural text to speech containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-ntts |
| Deploy speech to text containers for on-premises use | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-stt |
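
Example: adding the fetch query string

The fetch guidance under "How to Use This Skill" asks for the query string from=learn-agent-skill (and accept=text/markdown for the fallback tool). The sketch below shows one way to append those parameters to any URL from the tables above. The helper name with_skill_params is hypothetical, and whether a given tool expects the parameters inside the URL or as a separate argument is an assumption to check against that tool's documentation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skill_params(url: str, fallback: bool = False) -> str:
    """Return `url` with the skill's query parameters appended.

    Hypothetical helper: the parameter names and values come from the
    skill text; everything else here is illustrative.
    """
    parts = urlsplit(url)
    # Keep any query parameters the URL already carries.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["from"] = "learn-agent-skill"
    if fallback:
        # The fallback tool (fetch_webpage) also asks for Markdown explicitly.
        query["accept"] = "text/markdown"
    # safe="/" keeps "text/markdown" readable instead of "text%2Fmarkdown".
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))


if __name__ == "__main__":
    base = "https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions"
    print(with_skill_params(base))
    print(with_skill_params(base, fallback=True))
```

Parsing and re-encoding the query, rather than concatenating strings, avoids producing a second "?" when a URL already has parameters.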

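Example: reading a Category Index line range

The Category Index gives each section as a line range such as L36-L46. As a minimal sketch, the function below turns such a range into the matching lines of the skill text. It assumes the ranges are 1-indexed and inclusive, which the skill does not state explicitly, and slice_line_range is a hypothetical name rather than part of any read_file tool.

```python
import re


def slice_line_range(text: str, spec: str) -> list[str]:
    """Return the lines of `text` covered by a spec such as "L36-L46".

    Assumption: line numbers are 1-indexed and the range is inclusive.
    """
    match = re.fullmatch(r"L(\d+)-L(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"not a line range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid line range: {spec!r}")
    lines = text.splitlines()
    # Convert the 1-indexed inclusive range to a Python slice.
    return lines[start - 1:end]


if __name__ == "__main__":
    sample = "first\nsecond\nthird\nfourth"
    print(slice_line_range(sample, "L2-L3"))
```

A range that runs past the end of the text simply returns the lines that exist, which mirrors how list slicing behaves.
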
> related_skills --same-repo

> azure-well-architected

Expert guidance for designing, assessing, and optimizing Azure workloads using Azure Well Architected. Covers design review checklists, recommendations, design principles, tradeoffs, service guides, workload patterns, and assessment questions. Use when architecting new solutions, reviewing existing workloads, or applying Well-Architected principles.

> azure-web-pubsub

Expert knowledge for Azure Web PubSub development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building, debugging, or optimizing Azure Web PubSub applications. Not for Azure SignalR Service (use azure-signalr-service), Azure Event Hubs (use azure-event-hubs), Azure Service Bus (use azure-service-bus), Azure Relay (use azure-relay).

> azure-web-application-firewall

Expert knowledge for Azure Web Application Firewall development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building, debugging, or optimizing Azure Web Application Firewall applications. Not for Azure Application Gateway (use azure-application-gateway), Azure Front Door (use azure-front-door), Azure Firewall (use azure-firewall), Azure DDos Protectio

> azure-vpn-gateway

Expert knowledge for Azure VPN Gateway development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building, debugging, or optimizing Azure VPN Gateway applications. Not for Azure Virtual Network (use azure-virtual-network), Azure Virtual WAN (use azure-virtual-wan), Azure ExpressRoute (use azure-expressroute), Azure Application Gateway (use azure-applica

┌ stats

installs/wk: 0
github stars: 425
first seen: Mar 17, 2026
└────────────

┌ repo

MicrosoftDocs/Agent-Skills
by MicrosoftDocs
└────────────
