Build a conversational voice assistant for phone calls using LiveKit SIP with Twilio and Speechmatics speech recognition and text-to-speech.
A complete voice assistant that handles inbound phone calls using LiveKit's SIP integration with Twilio, with Speechmatics for speech recognition (STT) and text-to-speech (TTS), and OpenAI for generating responses.
- How to integrate Speechmatics STT and TTS with LiveKit Agents for telephony
- Setting up Twilio Elastic SIP Trunking with LiveKit
- Handling inbound phone calls with LiveKit SIP
- Using LiveKit's agent framework for phone-based conversations
- Voice Activity Detection (VAD) for natural turn-taking
- Speechmatics API Key: Get one from portal.speechmatics.com
- OpenAI API Key: Get one from platform.openai.com
- LiveKit Cloud Account: Get one from cloud.livekit.io
- Twilio Account: Get one from twilio.com
- Twilio Phone Number: Purchase a phone number in the Twilio Console
- Python 3.10+
Step 1: Create and activate a virtual environment
On Windows:
cd python
python -m venv .venv
.venv\Scripts\activate
On Mac/Linux:
cd python
python3 -m venv .venv
source .venv/bin/activate
Step 2: Install dependencies
pip install -r requirements.txt
Step 3: Configure your API keys
cp ../.env.example .env
Open the .env file and add your API keys:
SPEECHMATICS_API_KEY=your_speechmatics_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key_here
LIVEKIT_API_SECRET=your_livekit_api_secret_here
Important
Why .env? Never commit API keys to version control. The .env file keeps secrets out of your code.
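A missing key tends to surface only once a call is already in progress, so it may help to check the environment before starting the agent. The sketch below is not part of the sample: `missing_env_vars` is a hypothetical helper, and it assumes the five variables listed above are the only ones required and that the .env file has already been loaded into the process environment (for example with python-dotenv).

```python
import os
from typing import Mapping

# Variable names taken from the .env listing above.
REQUIRED_VARS = (
    "SPEECHMATICS_API_KEY",
    "OPENAI_API_KEY",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank (hypothetical helper)."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` at startup and exiting when the list is non-empty gives a clearer error than a failed API call mid-conversation.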
Step 4: Configure Twilio SIP Trunk
See Twilio + LiveKit SIP Setup below.
Step 5: Run the agent
python main.py dev
Step 6: Call your Twilio number
Call your Twilio phone number and start talking to Roxie!
flowchart LR
    subgraph Phone
        CALLER[Caller]
    end
    subgraph Twilio
        PSTN[PSTN Gateway]
        SIP[Elastic SIP Trunk]
    end
    subgraph LiveKit Cloud
        LKSIP[LiveKit SIP]
        ROOM[LiveKit Room]
    end
    subgraph Agent
        STT[Speechmatics STT]
        LLM[OpenAI LLM]
        TTS[Speechmatics TTS]
        VAD[Silero VAD]
    end
    CALLER <-->|Phone Call| PSTN
    PSTN <-->|SIP| SIP
    SIP <-->|SIP| LKSIP
    LKSIP <-->|WebRTC| ROOM
    ROOM --> STT
    STT --> LLM
    LLM --> TTS
    TTS --> ROOM
    VAD --> STT
- Incoming Call - Caller dials your Twilio phone number
- SIP Routing - Twilio routes call via SIP to LiveKit
- Room Connection - LiveKit SIP creates a room participant for the caller
- Speech-to-Text - Speechmatics transcribes the caller's speech
- LLM Processing - OpenAI generates a response
- Text-to-Speech - Speechmatics converts response to audio
- Audio Playback - Audio streams back through LiveKit SIP to the caller
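Once the caller has joined as a room participant, the agent can read details about the call from that participant. As a rough sketch (not part of the sample), the helper below pulls the caller's number out of a participant's attributes. It assumes LiveKit SIP exposes the number under the `sip.phoneNumber` attribute key; `caller_number` is a hypothetical name, and the key should be checked against the current LiveKit SIP documentation.

```python
from typing import Mapping, Optional

# Assumption: LiveKit SIP stores the caller's number under this attribute key.
CALLER_NUMBER_ATTR = "sip.phoneNumber"


def caller_number(attributes: Mapping[str, str]) -> Optional[str]:
    """Return the caller's phone number, or None if the attribute is absent or blank."""
    number = attributes.get(CALLER_NUMBER_ATTR, "").strip()
    return number or None
```

Inside `entrypoint`, this could be called as `caller_number(participant.attributes)` on the participant returned by `await ctx.wait_for_participant()`, assuming that method is available in your version of LiveKit Agents.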
| Component | Description |
|---|---|
| LiveKit Agents | Framework for building real-time voice AI applications |
| LiveKit SIP | Bridges PSTN calls to LiveKit rooms via SIP |
| Twilio Elastic SIP | Routes phone calls to LiveKit's SIP endpoint |
| Speechmatics STT | Real-time speech-to-text transcription |
| OpenAI GPT-4o-mini | Language model for generating responses |
| Speechmatics TTS | Text-to-speech for natural voice output |
| Silero VAD | Voice Activity Detection for turn-taking |
from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import openai, silero, speechmatics


class VoiceAssistant(Agent):
    def __init__(self) -> None:
        # The persona prompt is abbreviated here.
        super().__init__(instructions="You are Roxie, a hilarious standup comedian...")


async def entrypoint(ctx: agents.JobContext):
    # Join the room that LiveKit SIP created for the call.
    await ctx.connect()

    # Wire up the STT -> LLM -> TTS pipeline, with VAD for turn-taking.
    session = AgentSession(
        stt=speechmatics.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=speechmatics.TTS(),
        vad=silero.VAD.load(),
    )
    await session.start(room=ctx.room, agent=VoiceAssistant())
    # Greet the caller as soon as the session starts.
    await session.generate_reply(instructions="Say hello...")

First, create a SIP trunk in LiveKit to receive calls:
- Go to LiveKit Cloud Console
- Navigate to Telephony Configuration > SIP Trunks
- Click Create SIP Trunk
- Configure:
  - Trunk name: Give it a name (e.g., "Twilio Inbound")
  - Trunk direction: Select Inbound
  - Numbers: Enter your Twilio phone number (e.g., +14155551234)
  - Allowed addresses: Leave as 0.0.0.0/0 to allow all IPs
- Click Create and copy the SIP URI (e.g., sip:xxxxx.sip.livekit.cloud)
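The same trunk can also be described as JSON, which is convenient if you would rather script the setup than click through the console. The sketch below builds such a payload and rejects numbers that are not in E.164 format, a common cause of calls not matching a trunk. The `{"trunk": {...}}` shape is an assumption based on the JSON the LiveKit CLI accepts for inbound trunks, and `inbound_trunk_request` is a hypothetical helper; check the current LiveKit SIP documentation before relying on it.

```python
import json
import re

# E.164: a leading "+", then up to 15 digits, the first of which is not 0.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def inbound_trunk_request(name: str, numbers: list[str]) -> dict:
    """Build an inbound trunk payload (assumed shape), validating each number."""
    bad = [n for n in numbers if not E164.fullmatch(n)]
    if bad:
        raise ValueError(f"Not in E.164 format: {bad}")
    return {"trunk": {"name": name, "numbers": list(numbers)}}


# Example using the placeholder values from the steps above.
print(json.dumps(inbound_trunk_request("Twilio Inbound", ["+14155551234"]), indent=2))
```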
- Go to Twilio Console
- Navigate to Elastic SIP Trunking > Trunks
- Click Create new SIP Trunk
- Give it a name (e.g., "LiveKit Voice Assistant")
Point Twilio to your LiveKit SIP endpoint:
- In your Twilio SIP trunk, go to Origination
- Add an Origination URI:
  - Origination SIP URI: Paste the LiveKit SIP URI from Step 1
  - Priority: 1
  - Weight: 1
  - Enabled: Yes
- Save the configuration
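If you prefer to configure the origination URI in code, the settings above map onto a small set of parameters. The sketch below only assembles those parameters; `origination_params` is a hypothetical helper, and the assumption is that the resulting dict is passed to the Twilio Python SDK's origination URL `create` call for your trunk.

```python
def origination_params(livekit_sip_uri: str, friendly_name: str = "LiveKit") -> dict:
    """Assemble origination URL settings matching the console values above."""
    uri = livekit_sip_uri.strip()
    if not uri:
        raise ValueError("LiveKit SIP URI is empty")
    # Twilio expects a full SIP URI, so add the scheme if it was omitted.
    if not uri.lower().startswith("sip:"):
        uri = "sip:" + uri
    return {
        "sip_url": uri,
        "priority": 1,
        "weight": 1,
        "enabled": True,
        "friendly_name": friendly_name,
    }


# Assumed usage with the Twilio SDK (not executed here):
# client.trunking.v1.trunks(trunk_sid).origination_urls.create(**params)
params = origination_params("xxxxx.sip.livekit.cloud")
```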
- In your Twilio SIP trunk, go to Phone Numbers
- Click Add a Phone Number
- Select your Twilio phone number
- Save the configuration
Route incoming calls to your agent:
- Go to LiveKit Cloud Console
- Navigate to Telephony Configuration > Dispatch Rules
- Click Create Dispatch Rule
- Configure:
  - Trunk: Select the inbound trunk you created in Step 1
  - Rule Type: Individual (creates a room per call)
  - Room Prefix: call- (or any prefix you prefer)
- Save the configuration
Edit assets/agent.md to change the assistant's personality. The default configures Roxie as a standup comedian optimized for phone conversations.
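How main.py reads this file is not shown here, so the following is only a sketch of one way to load the prompt with a fallback when the file is missing or empty. `load_instructions` and `DEFAULT_INSTRUCTIONS` are hypothetical names, not part of the sample.

```python
from pathlib import Path

# Hypothetical fallback prompt, used only if the file is missing or empty.
DEFAULT_INSTRUCTIONS = "You are a helpful voice assistant on a phone call."


def load_instructions(path: Path, default: str = DEFAULT_INSTRUCTIONS) -> str:
    """Read the agent prompt from a Markdown file, falling back to a default."""
    try:
        text = path.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return default
    return text or default
```

The result could then be passed as `instructions=load_instructions(Path("assets/agent.md"))` when constructing the agent.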
For better call quality, enable Krisp noise cancellation in your LiveKit dispatch rule:
- Go to LiveKit Cloud Console
- Navigate to Telephony Configuration > Dispatch Rules
- Select your dispatch rule and click Edit
- Add krispEnabled: true to the roomConfig object:
{
  "sipDispatchRuleId": "SDR_yourRuleId",
  "rule": {
    "dispatchRuleIndividual": {
      "roomPrefix": "call-"
    }
  },
  "trunkIds": [
    "ST_yourTrunkId"
  ],
  "name": "Your Rule Name",
  "roomConfig": {
    "krispEnabled": true
  }
}
This enables Krisp AI noise cancellation, which filters out background noise from the caller's phone for clearer audio to your STT.
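When editing an existing rule by hand, it is easy to overwrite other roomConfig settings by accident. As a small illustration, the hypothetical helper below sets the flag on a copy of the rule and leaves every other key untouched; it works on plain dictionaries and makes no calls to LiveKit.

```python
import copy


def with_krisp(rule: dict, enabled: bool = True) -> dict:
    """Return a copy of a dispatch rule dict with roomConfig.krispEnabled set."""
    updated = copy.deepcopy(rule)
    # Create roomConfig if it is missing, keeping any existing keys.
    updated.setdefault("roomConfig", {})["krispEnabled"] = enabled
    return updated


rule = {
    "name": "Your Rule Name",
    "rule": {"dispatchRuleIndividual": {"roomPrefix": "call-"}},
    "trunkIds": ["ST_yourTrunkId"],
}
updated_rule = with_krisp(rule)
```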
| Mode | Command | Description |
|---|---|---|
| Dev | python main.py dev | Connects to LiveKit Cloud for testing |
| Console | python main.py console | Local testing with microphone (no phone) |
| Production | python main.py start | Production deployment |
| Feature | LiveKit SIP (This Sample) | Direct Twilio (01-voice-assistant) |
|---|---|---|
| Infrastructure | LiveKit Cloud handles SIP | Your server handles WebSocket |
| Scaling | Built-in via LiveKit | Manual server scaling |
| Audio Format | Handled by LiveKit | Manual mulaw conversion |
| Setup Complexity | SIP trunk configuration | Webhook + ngrok setup |
| Best For | Production deployments | Learning/prototyping |
Error: "Invalid API key"
- Verify all API keys in your .env file
- Check each service's portal for key validity
Calls not connecting
- Verify Twilio SIP trunk origination URI is correct
- Check LiveKit dispatch rules are configured
- Ensure LiveKit API credentials are valid
Agent doesn't respond
- Check OpenAI API key is valid
- Verify you have API credits available
No audio heard by caller
- Check Speechmatics API key
- Verify TTS is generating audio in logs
- Simple Voice Assistant - LiveKit - WebRTC-based voice assistant
- Voice Assistant - Direct Twilio - Compare with direct Twilio integration
- LiveKit SIP Documentation
- LiveKit Agents Documentation
- Twilio Elastic SIP Trunking
- Configuring Twilio Trunk for LiveKit
- Speechmatics API Docs
- OpenAI API Docs
- Time to Complete: 30 minutes
- Difficulty: Advanced
- Integration: LiveKit + Twilio SIP