Welcome to the Call Center Real-time Voice Agent solution accelerator — a lightweight template for building speech-to-speech voice agents powered by Azure Voice Live API. It supports multiple telephony providers out of the box, including Azure Communication Services (ACS), Twilio, Infobip, and Genesys Cloud (AudioHook), plus a web browser client for quick testing. Bring your own telephony provider or use the built-in options. Start locally, deploy to Azure Container Apps.
The Azure Voice Live API enables low-latency, high-quality speech-to-speech interactions for voice agents. It is designed for developers who need scalable, efficient voice-driven experiences, because it removes the need to orchestrate multiple components manually. By integrating speech recognition, generative AI, and text-to-speech into a single, unified interface, it provides an end-to-end solution for building voice experiences. Learn more about Azure Voice Live API.
The Azure Communication Services Call Automation APIs provide telephony integration and real-time event triggers, so you can perform actions based on business logic specific to your domain. Within the Call Automation APIs, developers can use simple AI-powered APIs to play personalized greeting messages, recognize conversational voice input to gather information on contextual questions and drive a more self-service model with customers, and use sentiment analysis to improve overall customer service. Learn more about Azure Communication Services (Call Automation).
Alternatively, telephony integration is supported through third-party providers, including Twilio, Infobip, and Genesys Cloud (AudioHook).
Note: With any AI solutions you create using these templates, you are responsible for assessing all associated risks, and for complying with all applicable laws and safety standards. Learn more in the transparency documents for Voice Live API and Azure Communication Services.
This sample demonstrates how to build a real-time voice agent using the Azure Speech Voice Live API.
The solution includes:
- A backend service that connects to the Voice Live API for real-time ASR, LLM, and TTS
- Multiple client options: the web browser client is always available. For telephony, choose one provider:
  - Web browser: microphone/speaker via WebSocket (always available, great for testing)
  - Azure Communication Services (ACS): enterprise PSTN with Call Automation (default)
  - Twilio: PSTN via Twilio Media Streams with webhook signature validation
  - Infobip: PSTN via Infobip Calls API with WebSocket audio streaming
  - Genesys Cloud: AudioHook (Audio Connector) for real-time call audio streaming

  Telephony selection: Only one telephony provider can be active at a time. The service automatically selects the provider based on the configured credentials. If no credentials are provided, Azure Communication Services is used by default.
- Ambient Scenes (optional): Add realistic background audio (office, call center) or use custom audio files to simulate real-world environments
- Flexible configuration to customize prompts, ASR, TTS, and behavior
- Easy extension to other client types
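The telephony selection rule described above can be sketched as follows. This is a minimal illustration, not the accelerator's actual code: the function name and the precedence order between providers are assumptions, and the environment variable names are the ones listed in the provider sections of this README.

```python
from typing import Mapping

# Provider -> environment variables that must all be set.
# Assumption: the order of this dict is the precedence order; the real
# service may resolve conflicts differently.
PROVIDER_CREDENTIALS = {
    "twilio": ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN"),
    "infobip": ("INFOBIP_API_KEY", "INFOBIP_API_BASE_URL"),
    "genesys": ("GENESYS_API_KEY",),
}


def select_telephony_provider(env: Mapping[str, str]) -> str:
    """Return the first provider whose credentials are all present, else ACS."""
    for provider, required in PROVIDER_CREDENTIALS.items():
        if all(env.get(name, "").strip() for name in required):
            return provider
    # No third-party credentials configured: fall back to the default.
    return "acs"
```

In a running service you would likely call `select_telephony_provider(os.environ)` once at startup and register only that provider's routes.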
You can also try the Voice Live API via Azure AI Foundry for quick experimentation before deploying this template to your own Azure subscription.
To deploy this solution accelerator, ensure you have access to an Azure subscription with the necessary permissions to create resource groups and resources. Follow the steps in Azure Account Set Up.
Check the Azure Products by Region page and select a region where the following services are available: Azure AI Foundry Speech, Azure Communication Services, Azure Container Apps, and Container Registry.
See Voice Live supported regions for a full list. Common choices include East US 2, Sweden Central, West US 2, and Southeast Asia. Pricing varies per region and usage, so it isn't possible to predict exact costs for your usage. The majority of the Azure resources used in this infrastructure are on usage-based pricing tiers. However, Azure Container Registry has a fixed cost per registry per day.
Use the Azure pricing calculator to calculate the cost of this solution in your subscription.
| Product | Description | Cost |
|---|---|---|
| Azure Speech Voice Live | Low-latency and high-quality speech to speech interactions | Pricing |
| Azure Communication Services | Server-based intelligent call workflows | Pricing |
| Azure Container Apps | Hosts the web application frontend | Pricing |
| Azure Container Registry | Stores container images for deployment | Pricing |
Set up the following developer tools as prerequisites:
- Azure CLI: `az`
- Azure Developer CLI: `azd`
- Python: `python`
- UV: `uv`
- Optionally Docker: `docker`
Pick from the options below to see step-by-step instructions for: GitHub Codespaces, VS Code Dev Containers, Local Environments, and Bicep deployments.
Deploy in GitHub Codespaces
You can run this solution using GitHub Codespaces. The button will open a web-based VS Code instance in your browser:
- Open the solution accelerator (this may take several minutes):
- Accept the default values on the create Codespaces page.
- Open a terminal window if it is not already open.
- Follow the instructions in the helper script to populate deployment variables.
- Continue with the deploying steps.
Deploy in VS Code Dev Containers
You can run this solution in VS Code Dev Containers, which will open the project in your local VS Code using the Dev Containers extension:
- Start Docker Desktop (install it, if not already installed)
- Open the project:
- In the VS Code window that opens, once the project files show up (this may take several minutes), open a terminal window.
- Follow the instructions in the helper script to populate deployment variables.
- Continue with the deploying steps.
Deploy in your local environment
If you're not using one of the above options for opening the project, then you'll need to:
- Make sure the following tools are installed:
- Download the project code:

      azd init -t Azure-Samples/call-center-voice-agent-accelerator/

  Note: the above command should be run in a new folder of your choosing. You do not need to run `git clone` to download the project source code. `azd init` handles this for you.
- Open the project folder in your terminal or editor.
- Continue with the deploying steps.
Once you've opened the project in Codespaces, in Dev Containers, or locally, you can deploy it to Azure by following these steps.
To change the azd parameters from the default values, follow the steps here.
- Log in to Azure:

      azd auth login

- Provision and deploy all the resources:

      azd up

  It will prompt you to provide an `azd` environment name (like "voice-agent-dev"), select a subscription from your Azure account, and select a Voice Live region.

  The setup wizard will then guide you through:
  - Model selection: choose from 12 fully managed models across Pro, Basic, and Lite tiers (or bring your own)
  - Telephony provider selection: choose ACS (default), Twilio, Infobip, or Genesys
  - Credential entry: securely prompts for tokens/keys only if you picked Twilio, Infobip, or Genesys

  After provisioning completes, you'll see a deployment summary with your webhook endpoint(s) and next steps.
- When `azd` has finished deploying, open the Application URL shown in the output to test in your browser. 🎉
- When you've made any changes to the app code, you can just run:

      azd deploy

- To switch models after deployment (no redeploy needed):

      az containerapp update -n <app-name> -g <resource-group> --set-env-vars "AZURE_VOICE_LIVE_MODEL=gpt-4.1-mini"

  The model is a runtime-only setting; changing it does not require `azd up` or any infrastructure changes. Update `azd env` too to keep future deploys consistent:

      azd env set AZURE_VOICE_LIVE_MODEL gpt-4.1-mini

- To view live logs:

      azd monitor --logs

- When done, clean up all resources:

      azd down
Note
- All supported models are fully managed — no deployment or capacity planning needed.
- Pricing is tiered (Pro, Basic, Lite) based on the model you choose. Default is `gpt-4o-mini` (Basic tier).
- Not all models are available in every region. The setup wizard validates your selection and will block incompatible model/region combinations. Models available in all regions include: `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-5`, `gpt-5-chat`, `gpt-5-mini`, `gpt-5-nano`.
- See Voice Live supported regions and models for the full compatibility matrix.
- Post-Deployment: Webhook configuration is handled automatically by the post-deploy script. For ACS telephony, you'll still need to acquire a PSTN phone number (see Testing the Agent below).
After deployment, you can verify that your Voice Agent is running correctly using the Web Client (quick browser test) or a telephony client for a real-world call center scenario.
Use this browser-based client to confirm your Container App is up and responding.
- Go to the Azure Portal and navigate to the Resource Group created by your deployment.
- Find and open the Container App resource.
- On the Overview page, copy the Application URL.
- Open the URL in your browser — a demo webpage should load.
- Click Start Talking to Agent to begin a voice session using your browser’s microphone and speaker.
- Click Stop Conversation to end the session.
⚠️ This web client is intended for testing purposes only. Use the ACS client below for production-like call flow testing.
This simulates a real inbound phone call to your voice agent using Azure Communication Services (ACS).
The IncomingCall Event Grid subscription is created automatically by the post-deploy script during azd up. No manual portal configuration is needed.
Manual setup (if needed)
- In the same resource group, find and open the Communication Services resource.
- In the left-hand menu, click Events.
- Click + Event Subscription and fill in the following:
  - Event Type: `IncomingCall`
  - Endpoint Type: `Web Hook`
  - Endpoint Address: `https://<your-container-app-url>/acs/incomingcall`
📸 Refer to the screenshot below for guidance:
If you haven't already, obtain a phone number for your ACS resource:
👉 How to get a phone number (Microsoft Docs)
Once the phone number is active:
- Dial the ACS number.
- Your call will connect to the real-time voice agent powered by Azure Voice Live.
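For context, an Event Grid webhook such as `/acs/incomingcall` generally has to handle two kinds of events: the one-time subscription validation handshake and the actual `IncomingCall` notification. The sketch below shows that dispatch on already-parsed JSON. The function name and the shape of the returned dictionary are hypothetical; the accelerator's real handler would go on to answer the call through the Call Automation SDK.

```python
from typing import Any

VALIDATION_EVENT = "Microsoft.EventGrid.SubscriptionValidationEvent"
INCOMING_CALL_EVENT = "Microsoft.Communication.IncomingCall"


def handle_event_grid_events(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Decide how to respond to a batch of Event Grid events (illustrative only)."""
    for event in events:
        event_type = event.get("eventType")
        data = event.get("data", {})
        if event_type == VALIDATION_EVENT:
            # Event Grid expects the validation code echoed back in the response body.
            return {"validationResponse": data["validationCode"]}
        if event_type == INCOMING_CALL_EVENT:
            # The call context is what a Call Automation client needs to answer the call.
            return {"incomingCallContext": data["incomingCallContext"]}
    # Unknown or empty batch: nothing to do.
    return {}
```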
Inbound calls are handled via Twilio Media Streams — the server validates the request, connects the caller's audio to the AI agent via a real-time WebSocket, and bridges it to Azure Voice Live.
- A Twilio account
- A phone number purchased in the Twilio Console
During `azd up`, the setup wizard prompts for Twilio credentials and stores the token securely in Azure Key Vault.
| Variable | Description | Where to find it |
|---|---|---|
| `TWILIO_ACCOUNT_SID` | Your Twilio Account SID | Twilio Console → Account Info |
| `TWILIO_AUTH_TOKEN` | Your Twilio Auth Token | Twilio Console → Account Info |
The Twilio webhook is configured automatically by the post-deploy script during azd up. It sets your phone number's voice URL to https://<your-container-app-url>/voice.
Manual setup (if needed)
- In the Twilio Console, go to your phone number's configuration.
- Under PhoneNumber → A Call Comes In, set:
  - Webhook URL: `https://<your-container-app-url>/voice`
  - HTTP Method: `POST`
- Save changes.
Dial your Twilio phone number. The call connects to the real-time voice agent powered by Azure Voice Live.
How it works:
- Twilio sends a request to `/voice`; the server validates it and returns TwiML to start a media stream
- Twilio opens a WebSocket to `/twilio/ws`; the server verifies the embedded token, then bridges audio to Azure Voice Live
- The AI agent hears the caller, generates a response, and audio is streamed back through the same connection
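The request validation in the first step is typically based on Twilio's `X-Twilio-Signature` header. The sketch below implements that documented scheme (HMAC-SHA1 over the URL plus the sorted POST fields, Base64-encoded) using only the standard library. Whether this accelerator computes it directly or relies on Twilio's helper library is not stated here, and both function names are hypothetical.

```python
import base64
import hashlib
import hmac
from typing import Mapping


def compute_twilio_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Compute the signature Twilio would send for a form-encoded POST."""
    # Twilio signs the full request URL followed by every POST field name and
    # value, with fields sorted by name and no separators in between.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str, url: str, params: Mapping[str, str], header_signature: str
) -> bool:
    """Compare the expected signature with the header value in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode("utf-8"), header_signature.encode("utf-8"))
```

The URL passed in must match exactly what Twilio called (scheme, host, path, and query string), which matters when the app sits behind a proxy or ingress that rewrites the host.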
Inbound calls are handled via the Infobip Calls API — the server answers the call, then bridges the caller's audio to Azure Voice Live via a WebSocket connection.
- An Infobip account with Voice capabilities enabled
- A phone number purchased in the Infobip Portal
During `azd up`, the setup wizard prompts for your Infobip API key and base URL, and stores the key securely in Azure Key Vault.
| Variable | Description | Where to find it |
|---|---|---|
| `INFOBIP_API_KEY` | Your Infobip API key | Infobip Portal → Homepage → API Key |
| `INFOBIP_API_BASE_URL` | Your account's API base URL (e.g. `https://xxxxx.api.infobip.com`) | Infobip Portal → Homepage → Base URL |
All Infobip configuration is set up automatically by the post-deploy script during azd up:
- Notification profile with webhook URL
- Media stream configuration with WebSocket URL
- Calls configuration
- Event subscription for call lifecycle events
Manual setup (if needed)
- In the Infobip Portal, go to Channels and Numbers → VOICE AND WEBRTC.
- Under Notification Profile, create or update a profile with:
  - Notify URL: `https://<your-container-app-url>/infobip/incoming`
- Under Calls API → Media streaming, create a new configuration with:
  - URL: `wss://<your-container-app-url>/infobip/ws`
  - Audio format: `audio/l16;rate=24000` (PCM 16-bit, 24kHz)
- Under Calls API → Calls Configuration, create a configuration linked to your notification profile and media stream config.
- Assign your Infobip phone number to this Calls Configuration.
- Under Event Subscription (via API: `POST /subscriptions/1/subscription/VOICE_VIDEO`), create a subscription with events:
  - `CALL_RECEIVED`, `CALL_ESTABLISHED`, `CALL_FINISHED`, `CALL_FAILED`, `CALL_STARTED`, `CALL_DISCONNECTED`
  - `MEDIA_STREAM_STARTED`, `MEDIA_STREAM_FAILED`, `MEDIA_STREAM_FINISHED`
  - `DIALOG_CREATED`, `DIALOG_ESTABLISHED`, `DIALOG_FAILED`, `DIALOG_FINISHED`
  - `DTMF_CAPTURED`, `CALL_RINGING`, `CALL_PRE_ESTABLISHED`
Dial your Infobip phone number. The call connects to the real-time voice agent powered by Azure Voice Live.
How it works:
- Infobip sends a `CALL_RECEIVED` webhook to `/infobip/incoming`; the server answers the call
- Once established, the server creates a Dialog that bridges the caller to the WebSocket endpoint
- Infobip connects to `/infobip/ws`; audio flows bidirectionally between the caller and Azure Voice Live
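To get a feel for the data rate on that WebSocket, the arithmetic for the `audio/l16;rate=24000` format can be sketched as below. The 20 ms frame length and the single (mono) channel are assumptions made for illustration; the actual chunk size Infobip sends is not specified in this README.

```python
def pcm_frame_bytes(
    sample_rate_hz: int = 24000,
    bits_per_sample: int = 16,
    channels: int = 1,
    frame_ms: int = 20,
) -> int:
    """Size in bytes of one uncompressed PCM frame of the given duration."""
    samples_per_frame = sample_rate_hz * frame_ms // 1000
    bytes_per_sample = bits_per_sample // 8
    return samples_per_frame * bytes_per_sample * channels
```

With the defaults this gives 960 bytes per 20 ms frame, which works out to 48,000 bytes per second of 24 kHz, 16-bit mono audio in each direction.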
Genesys AudioHook (Audio Connector) streams real-time call audio from Genesys Cloud to your deployed Container App via WebSocket. Unlike the other telephony options, Genesys does not route phone calls through this template — it forwards audio from calls already handled within Genesys Cloud to your AudioHook endpoint for AI processing.
Caller → PSTN → Genesys Cloud → AudioHook WebSocket → Container App → Voice Live AI
- A Genesys Cloud organization with the Audio Connector feature enabled
During `azd up`, the setup wizard prompts for a Genesys API key and stores it securely in Azure Key Vault.
| Variable | Description | Where to find it |
|---|---|---|
| `GENESYS_API_KEY` | A shared secret for authenticating AudioHook connections | You define this value: use the same key in both your deployment and Genesys Cloud integration settings |
After deployment, open the simulator page to test without a Genesys Cloud account:
https://<your-container-app-url>/genesys
The simulator mimics a Genesys AudioHook client in the browser — it sends your microphone audio as PCMU 8kHz and plays back the AI response. Enter the same API key you configured during setup.
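PCMU is G.711 μ-law, which packs each sample into one byte. If a server needs linear PCM from such a stream, the standard decoding step looks like the sketch below. This is a generic G.711 decoder written for illustration, not code taken from this accelerator; the function names are hypothetical, and resampling from 8 kHz to another rate would be a separate step that is not shown.

```python
import struct


def ulaw_byte_to_pcm16(value: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit linear sample."""
    value = ~value & 0xFF            # mu-law bytes are stored bit-inverted
    sign = value & 0x80
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    # 0x84 is the standard mu-law bias (132).
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Decode a mu-law buffer into little-endian 16-bit PCM bytes."""
    samples = [ulaw_byte_to_pcm16(byte) for byte in data]
    return struct.pack(f"<{len(samples)}h", *samples)
```

Each input byte becomes two output bytes, so a 160-byte μ-law frame (20 ms at 8 kHz) decodes to 320 bytes of PCM.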
- Add an AudioHook (Audio Connector) integration in your Genesys Cloud Admin console
- Set the Connection URI to `wss://<your-container-app-url>/audiohook/ws` and the API Key to the same value you configured in `GENESYS_API_KEY`
- Assign the integration to a call flow or queue so that matching calls stream audio to your endpoint
For protocol details, see the Genesys AudioHook developer documentation.
How it works:
- Genesys Cloud opens a WebSocket to `/audiohook/ws` and authenticates with the API key
- The caller's audio streams to the server, which bridges it to Azure Voice Live
- The AI response audio is streamed back to Genesys Cloud for the caller to hear
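The API key check in the first step can be sketched as a constant-time comparison against the configured secret. The `X-API-KEY` header name is an assumption based on the AudioHook protocol (check the Genesys documentation for the exact handshake), and the function name is hypothetical.

```python
import hmac
from typing import Mapping


def is_authorized_audiohook(headers: Mapping[str, str], expected_api_key: str) -> bool:
    """Return True only if the connection presents the configured API key."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    presented = normalized.get("x-api-key", "")
    if not expected_api_key or not presented:
        # Reject when either side is missing instead of comparing empty strings.
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented.encode("utf-8"), expected_api_key.encode("utf-8"))
```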
Once the environment has been deployed with azd up you can also run the application locally.
Please follow the instructions in the server README.
The Voice Live API supports connecting to an existing Azure AI Foundry Agent, allowing you to leverage pre-built capabilities, knowledge bases, and orchestration features alongside real-time voice interactions.
In the session.update configuration, you can set different properties such as the model, voice settings, turn detection, and agent connection. For detailed configuration options and step-by-step instructions, refer to the official documentation:
👉 Get started with Voice Live and Azure AI Foundry Agent Service
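As a rough illustration of what a `session.update` message can look like, the sketch below builds one as JSON. The field names and values (`azure_semantic_vad`, `azure_deep_noise_suppression`, `server_echo_cancellation`, `azure-standard`) follow examples in the Voice Live documentation as best understood here, so treat them as assumptions and confirm them against the current API reference. The helper function itself is hypothetical.

```python
import json


def build_session_update(voice_name: str, instructions: str) -> str:
    """Build a session.update message as a JSON string (illustrative values)."""
    message = {
        "type": "session.update",
        "session": {
            # System prompt for the agent.
            "instructions": instructions,
            # Assumed option names; verify against the Voice Live reference.
            "turn_detection": {"type": "azure_semantic_vad"},
            "input_audio_noise_reduction": {"type": "azure_deep_noise_suppression"},
            "input_audio_echo_cancellation": {"type": "server_echo_cancellation"},
            "voice": {"name": voice_name, "type": "azure-standard"},
        },
    }
    return json.dumps(message)
```

A call such as `build_session_update("en-US-Ava:DragonHDLatestNeural", "You are a helpful call center agent.")` returns a string that could be sent over the Voice Live WebSocket once the connection is open.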
After updating your configuration, deploy the changes to your Container App:
    azd deploy

Add realistic background audio to your voice agent to simulate real-world call center environments. This feature works for both web browser and phone (ACS) clients.
Available Presets:
| Preset | Description |
|---|---|
| `none` | Disabled (default) - clean audio with no background |
| `office` | Quiet office ambient (keyboard typing, soft murmurs) |
| `call_center` | Busy call center background (phones, conversations) |
| `custom` | Add your own audio files (see below) |
How to Enable:
- Set the `AMBIENT_PRESET` environment variable in your `.env` file:

      AMBIENT_PRESET=call_center

- For Azure deployment, set it before running `azd up`:

      azd env set AMBIENT_PRESET call_center
      azd up
Adjusting Volume:
The ambient volume is controlled by _ambient_gain in server/app/handler/ambient_mixer.py:

    self._ambient_gain = 0.08  # Default: subtle background

| Value | Effect |
|---|---|
| `0.05` | Very quiet (barely audible) |
| `0.08` | Subtle (default) |
| `0.12` | Moderate |
| `0.20` | Noticeable |
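To make the effect of the gain value concrete, here is a minimal sketch of how an ambient track can be mixed into 16-bit PCM speech. It is not the accelerator's `ambient_mixer.py`; the function names are hypothetical, and the approach (scale the ambient sample by the gain, add it to the voice sample, clip to the 16-bit range, loop the ambient track) is an assumption about how such a mixer typically works.

```python
import struct


def _unpack_pcm16(pcm: bytes) -> tuple:
    """Interpret bytes as little-endian signed 16-bit samples."""
    count = len(pcm) // 2
    return struct.unpack(f"<{count}h", pcm[: count * 2])


def mix_ambient(voice_pcm: bytes, ambient_pcm: bytes, gain: float = 0.08) -> bytes:
    """Add a gain-scaled, looping ambient track to the voice audio."""
    voice = _unpack_pcm16(voice_pcm)
    ambient = _unpack_pcm16(ambient_pcm)
    mixed = []
    for index, sample in enumerate(voice):
        # Loop the ambient track when it is shorter than the voice buffer.
        bed = ambient[index % len(ambient)] if ambient else 0
        value = int(round(sample + gain * bed))
        # Clip to the signed 16-bit range to avoid wrap-around distortion.
        mixed.append(max(-32768, min(32767, value)))
    return struct.pack(f"<{len(mixed)}h", *mixed)
```

At a gain of `0.08`, an ambient sample at full scale contributes at most about 8% of the 16-bit range, which is why the default reads as a subtle background.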
Using Custom Audio Files:
You can add your own ambient audio files:
- Prepare your audio file with these requirements:
  - Format: WAV (uncompressed PCM)
  - Sample Rate: 24000 Hz
  - Bit Depth: 16-bit signed
  - Channels: Mono
  - Duration: 30-60 seconds (will loop seamlessly)
- Place the file in `server/app/audio/`
- Register the preset in `server/app/handler/ambient_mixer.py`:

      PRESETS = {
          "none": {"file": None},
          "office": {"file": "office.wav"},
          "call_center": {"file": "callcenter.wav"},
          "my_custom": {"file": "my_audio.wav"},  # Add your preset
      }

- Set `AMBIENT_PRESET=my_custom` in your `.env` file
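Before registering a custom file, it can help to check it against the requirements listed above. The following sketch does so with Python's standard `wave` module. The function name is hypothetical and it is not part of the accelerator; it treats the 30-60 second duration as a hard limit, although that range may be a recommendation.

```python
import wave


def check_ambient_wav(path: str) -> list[str]:
    """Return a list of requirement violations; an empty list means the file looks usable."""
    issues = []
    # wave.open raises wave.Error for files that are not uncompressed PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            issues.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:
            issues.append(f"expected 16-bit samples, found {wav.getsampwidth() * 8}-bit")
        if wav.getframerate() != 24000:
            issues.append(f"expected 24000 Hz, found {wav.getframerate()} Hz")
        duration = wav.getnframes() / wav.getframerate()
        if not 30 <= duration <= 60:
            issues.append(f"expected 30-60 seconds, found {duration:.1f} seconds")
    return issues
```

Running `check_ambient_wav("server/app/audio/my_audio.wav")` before setting `AMBIENT_PRESET` gives a quick indication of whether the file needs to be converted first.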
- 📖 Docs: Voice live overview
- 📖 Blog: Upgrade your voice agent with Azure AI Voice Live API
- 📖 Docs: Azure Speech
- 📖 Docs: Azure Communication Services (Call Automation)
ACS currently does not support Managed Identity. The ACS connection string is stored securely in Key Vault and injected into the container app via its secret URL.
To the extent that the Software includes components or code used in or derived from Microsoft products or services, including without limitation Microsoft Azure Services (collectively, “Microsoft Products and Services”), you must also comply with the Product Terms applicable to such Microsoft Products and Services. You acknowledge and agree that the license governing the Software does not grant you a license or other right to use Microsoft Products and Services. Nothing in the license or this ReadMe file will serve to supersede, amend, terminate or modify any terms in the Product Terms for any Microsoft Products and Services.
You must also comply with all domestic and international export laws and regulations that apply to the Software, which include restrictions on destinations, end users, and end use. For further information on export restrictions, visit https://aka.ms/exporting.
You acknowledge that the Software and Microsoft Products and Services (1) are not designed, intended or made available as a medical device(s), and (2) are not designed or intended to be a substitute for professional medical advice, diagnosis, treatment, or judgment and should not be used to replace or as a substitute for professional medical advice, diagnosis, treatment, or judgment. Customer is solely responsible for displaying and/or obtaining appropriate consents, warnings, disclaimers, and acknowledgements to end users of Customer’s implementation of the Online Services.
You acknowledge the Software is not subject to SOC 1 and SOC 2 compliance audits. No Microsoft technology, nor any of its component technologies, including the Software, is intended or made available as a substitute for the professional advice, opinion, or judgement of a certified financial services professional. Do not use the Software to replace, substitute, or provide professional financial advice or judgment.
BY ACCESSING OR USING THE SOFTWARE, YOU ACKNOWLEDGE THAT THE SOFTWARE IS NOT DESIGNED OR INTENDED TO SUPPORT ANY USE IN WHICH A SERVICE INTERRUPTION, DEFECT, ERROR, OR OTHER FAILURE OF THE SOFTWARE COULD RESULT IN THE DEATH OR SERIOUS BODILY INJURY OF ANY PERSON OR IN PHYSICAL OR ENVIRONMENTAL DAMAGE (COLLECTIVELY, “HIGH-RISK USE”), AND THAT YOU WILL ENSURE THAT, IN THE EVENT OF ANY INTERRUPTION, DEFECT, ERROR, OR OTHER FAILURE OF THE SOFTWARE, THE SAFETY OF PEOPLE, PROPERTY, AND THE ENVIRONMENT ARE NOT REDUCED BELOW A LEVEL THAT IS REASONABLY, APPROPRIATE, AND LEGAL, WHETHER IN GENERAL OR IN A SPECIFIC INDUSTRY. BY ACCESSING THE SOFTWARE, YOU FURTHER ACKNOWLEDGE THAT YOUR HIGH-RISK USE OF THE SOFTWARE IS AT YOUR OWN RISK.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft’s Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party’s policies.
The software may collect information about you and your use of the software and send it to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may turn off the telemetry as described in the repository. There are also some features in the software that may enable you and Microsoft to collect data from users of your applications. If you use these features, you must comply with applicable law, including providing appropriate notices to users of your applications together with a copy of Microsoft’s privacy statement. Our privacy statement is located here. You can learn more about data collection and use in the help documentation and our privacy statement. Your use of the software operates as your consent to these practices.
Note:
- No telemetry or data collection is directly added in this accelerator project. Please review individual telemetry information from the included Azure services regarding their APIs.

