Migration • Feature Comparison • Code Examples
Switching from Deepgram? This guide shows you equivalent features and code patterns to help you migrate smoothly.
Note
Migration Incentive: Get $200 free credit with code SWITCH200 when switching from Deepgram! Learn more
- Feature Mapping
- Why Switch?
- Code Migration Examples
- Response Structure
- Migration Checklist
- Common Gotchas
- Need Help?
| Feature | Deepgram | Speechmatics | Notes |
|---|---|---|---|
| Model Selection | `model="nova-3"` | `operating_point="enhanced"` | `enhanced` for best accuracy, `standard` for faster turnaround |
| Language | `language="en-US"` | `language="en"` | Speechmatics uses ISO 639-1 codes; no locale variants needed (all accents are handled automatically). Mandarin uses `cmn` with `output_locale` for Simplified/Traditional formatting |
| Sample Rate | `sample_rate=16000` | `sample_rate=16000` | Same parameter in `AudioFormat` |
| Encoding | `encoding="linear16"` | `encoding="pcm_s16le"` | Slightly different naming |
| Channels | `channels=1` | Via `diarization="channel"` + `AsyncMultiChannelClient` | Speechmatics uses separate streams per channel |
| API Key | `DEEPGRAM_API_KEY` | `SPEECHMATICS_API_KEY` | Environment variable naming |
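Most of the renames in the table above are mechanical, so a one-off translation shim can smooth the migration. The helper below is hypothetical (it is not part of either SDK) and only covers the rows shown; Mandarin (`cmn`) and other special cases would need their own handling:

```python
# Hypothetical helper: translates a few common Deepgram option names/values
# to their Speechmatics equivalents, following the mapping table above.

def map_deepgram_options(options: dict) -> dict:
    """Translate a Deepgram-style options dict to Speechmatics-style kwargs."""
    mapped = {}
    if "language" in options:
        # Speechmatics uses bare ISO 639-1 codes: "en-US" -> "en"
        mapped["language"] = options["language"].split("-")[0]
    if options.get("model") == "nova-3":
        # Closest equivalent per the table: the enhanced operating point
        mapped["operating_point"] = "enhanced"
    if options.get("encoding") == "linear16":
        mapped["encoding"] = "pcm_s16le"
    if "sample_rate" in options:
        # Same name, same units in AudioFormat
        mapped["sample_rate"] = options["sample_rate"]
    return mapped

print(map_deepgram_options(
    {"model": "nova-3", "language": "en-US", "encoding": "linear16", "sample_rate": 16000}
))
```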
Real-time Streaming & Voice Features
Speechmatics Packages:
`speechmatics-rt` for basic real-time streaming, `speechmatics-voice` for voice agent features (turn detection, segments, VAD events). The Voice SDK is built on top of the RT SDK.
| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| Interim Results | `interim_results=True` | `enable_partials=True` | rt, voice | Partial transcripts while processing |
| Endpointing | `endpointing=500` (ms) | `max_delay=0.5` (seconds) | rt, voice | Duration the engine waits to verify partial word accuracy before committing (0.7-4.0 s) |
| Max Delay Mode | Not available | `max_delay_mode="flexible"` or `"fixed"` | rt, voice | Flexible allows entity completion |
| Utterance End | `utterance_end_ms=1000` | `end_of_utterance_silence_trigger=1.0` | rt, voice | Reference silence duration (0-2 s); ADAPTIVE mode scales this based on speech patterns |
| Force End Utterance | `Finalize` message | `client.finalize(end_of_turn=True)` | voice | Manually trigger end of utterance |
| VAD Events | `vad_events=True` (Beta) | `AgentServerMessageType.SPEAKER_STARTED` / `SPEAKER_ENDED` | voice | Voice activity detection events |
| Diarization | `diarize=True` | `diarization="speaker"` | rt, voice | Speaker labeling |
| Speaker Config | Not available | `speaker_diarization_config=SpeakerDiarizationConfig(...)` | rt, voice | Fine-tune diarization |
| Known Speakers | Not available | `known_speakers=[SpeakerIdentifier(label, speaker_identifiers)]` | rt, voice | Pre-register speaker voices |
| Speaker Focus | Not available | `SpeakerFocusConfig(focus_speakers, ignore_speakers, focus_mode)` | voice | Focus on specific speakers; only focused speakers drive conversation flow |
| Multichannel | `multichannel=True` | `diarization="channel"` or `"channel_and_speaker"` | rt, voice | Channel-based diarization |
| Channel Labels | Not available | `channel_diarization_labels=["agent", "customer"]` | rt, voice | Label audio channels |
| Keywords/Keyterms | `keywords=["term"]`, `keyterm=["term"]` | `additional_vocab=[{"content": "term"}]` | rt, voice | Boost specific terms |
| Translation | Not available | `translation_config=TranslationConfig(target_languages=["es"])` | rt | Real-time translation |
| Audio Events | Not available | `audio_events_config=AudioEventsConfig(types=[...])` | rt | Detect laughter, applause, etc. |
| Domain | Not available | `domain="medical"` | rt, voice | Domain-optimized language pack |
Turn Detection (Voice SDK):
| Feature | Deepgram | Speechmatics | Notes |
|---|---|---|---|
| Fixed Delay | Via settings | `EndOfUtteranceMode.FIXED` | Waits exactly the configured silence duration every time |
| Adaptive Delay | Not available | `EndOfUtteranceMode.ADAPTIVE` | Scales wait time based on speech pace, filler words (um/uh), and punctuation |
| Smart Turn (ML) | Not available | `smart_turn_config=SmartTurnConfig(enabled=True)` | Uses an ML model to predict semantic turn completion (with ADAPTIVE mode) |
| External Control | Not available | `EndOfUtteranceMode.EXTERNAL` + `client.finalize(end_of_turn=True)` | Application controls turn endings (for Pipecat/LiveKit integration) |
| Silence Trigger | Via settings | `end_of_utterance_silence_trigger` | Reference duration (0-2 s); ADAPTIVE mode applies multipliers based on context |
| Presets | Not available | `preset="fast"`, `"fixed"`, `"adaptive"`, `"smart_turn"`, `"scribe"`, `"captions"`, `"external"` | Ready-to-use configurations optimized for specific use cases |
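One easy mistake when porting endpointing settings is the unit change: Deepgram configures milliseconds, Speechmatics seconds. A hypothetical pair of converters (not SDK code), clamped to the ranges quoted in the tables above (0.7-4.0 s for `max_delay`, 0-2 s for the silence trigger), makes the translation explicit:

```python
# Hypothetical converters: Deepgram millisecond settings to Speechmatics
# second-based settings, clamped to the ranges quoted in the tables above.

def to_max_delay(endpointing_ms: int) -> float:
    """Deepgram endpointing (ms) -> Speechmatics max_delay (s), clamped to 0.7-4.0."""
    return min(max(endpointing_ms / 1000.0, 0.7), 4.0)

def to_silence_trigger(utterance_end_ms: int) -> float:
    """Deepgram utterance_end_ms -> end_of_utterance_silence_trigger (s), clamped to 0-2."""
    return min(max(utterance_end_ms / 1000.0, 0.0), 2.0)

print(to_max_delay(500))        # 500 ms sits below the quoted 0.7 s floor -> 0.7
print(to_silence_trigger(1000)) # 1000 ms -> 1.0 s
```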
Server Message Types:
| Deepgram Event | Speechmatics Event | Package | Notes |
|---|---|---|---|
| `EventType.MESSAGE` (`is_final=True`) | `ServerMessageType.ADD_TRANSCRIPT` | rt | Final transcript |
| `EventType.MESSAGE` (`is_final=False`) | `ServerMessageType.ADD_PARTIAL_TRANSCRIPT` | rt | Partial results |
| `EventType.MESSAGE` (UtteranceEnd) | `ServerMessageType.END_OF_UTTERANCE` | rt | End of utterance |
| `EventType.MESSAGE` (SpeechStarted) | `AgentServerMessageType.SPEAKER_STARTED` | voice | Speech detected |
| `EventType.MESSAGE` (Metadata) | `ServerMessageType.RECOGNITION_STARTED` | rt, voice | Session metadata |
| Not available | `AgentServerMessageType.SPEAKER_ENDED` | voice | Speech ended |
| Not available | `AgentServerMessageType.ADD_SEGMENT` | voice | Final segment |
| Not available | `AgentServerMessageType.ADD_PARTIAL_SEGMENT` | voice | Partial segment |
| Not available | `AgentServerMessageType.START_OF_TURN` | voice | Turn started |
| Not available | `AgentServerMessageType.END_OF_TURN` | voice | Turn completed |
| Not available | `AgentServerMessageType.END_OF_TURN_PREDICTION` | voice | Turn prediction timing |
| Not available | `ServerMessageType.ADD_TRANSLATION` | rt | Translation result |
| Not available | `ServerMessageType.AUDIO_EVENT_STARTED` / `AUDIO_EVENT_ENDED` | rt | Audio events |
| Not available | `ServerMessageType.SPEAKERS_RESULT` | rt | Speaker identification |
Usage - Basic RT Streaming:

```python
from speechmatics.rt import AsyncClient, ServerMessageType, TranscriptionConfig, AudioFormat, AudioEncoding

async with AsyncClient(api_key="YOUR_KEY") as client:
    @client.on(ServerMessageType.ADD_TRANSCRIPT)
    def on_transcript(message):
        print(message['metadata']['transcript'])

    await client.transcribe(
        audio_file,
        transcription_config=TranscriptionConfig(language="en", diarization="speaker"),
        audio_format=AudioFormat(encoding=AudioEncoding.PCM_S16LE, sample_rate=16000)
    )
```

Usage - Voice SDK (Turn Detection):

```python
from speechmatics.voice import VoiceAgentClient, VoiceAgentConfig, EndOfUtteranceMode, AgentServerMessageType

config = VoiceAgentConfig(
    language="en",
    enable_diarization=True,
    end_of_utterance_mode=EndOfUtteranceMode.ADAPTIVE,
    end_of_utterance_silence_trigger=0.5
)

async with VoiceAgentClient(api_key="YOUR_KEY", config=config) as client:
    @client.on(AgentServerMessageType.ADD_SEGMENT)
    def on_segment(message):
        for segment in message['segments']:
            print(f"[{segment['speaker_id']}]: {segment['text']}")

    @client.on(AgentServerMessageType.END_OF_TURN)
    def on_turn_end(message):
        print("User finished speaking - ready for response")

    await client.send_audio(audio_chunk)
```

Batch Transcription Features
Speechmatics Package: `speechmatics-batch`
| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| Diarization | `diarize=True`, `diarize_version="latest"` | `diarization="speaker"` | batch | Speaker identification |
| Multichannel | `multichannel=True` | `diarization="channel"` or `"channel_and_speaker"` | batch | Channel-based diarization |
| Sentiment | `sentiment=True` | `sentiment_analysis_config=SentimentAnalysisConfig()` | batch | Sentiment analysis |
| Topic Detection | `topics=True` | `topic_detection_config=TopicDetectionConfig(topics=[...])` | batch | Automatic topic extraction |
| Summarization | `summarize=True` | `summarization_config=SummarizationConfig(content_type, summary_length, summary_type)` | batch | AI-powered summaries |
| Intent Recognition | `intents=True` | Not available | - | Detect user intents |
| Entity Detection | `detect_entities=True` | `enable_entities=True` | batch | Detect named entities |
| Utterances | `utterances=True`, `utt_split=0.8` | Not available | - | Split into utterances |
| Paragraphs | `paragraphs=True` | Not available | - | Paragraph segmentation |
| Dictation | `dictation=True` | Not available | - | Dictation mode formatting |
| Measurements | `measurements=True` | `enable_entities=True` | batch | Format measurements (e.g., "10 km/s") |
| Auto Chapters | Not available | `auto_chapters_config=AutoChaptersConfig()` | batch | Automatic chapter generation |
| Audio Events | Not available | `audio_events_config=AudioEventsConfig(types=[...])` | batch | Detect laughter, applause, etc. |
| Translation | Not available | `translation_config=TranslationConfig(target_languages=["es", "fr"])` | batch | Translate transcript |
| Language ID | `detect_language=True` | `language_identification_config=LanguageIdentificationConfig(expected_languages=[...])` | batch | Identify spoken language |
| Domain | Not available | `domain="medical"` | batch | Domain-optimized language pack |
| Output Locale | Not available | `output_locale="en-US"` | batch | RFC 5646 locale for output |
| Output Format | `?format=srt` | `get_transcript(job_id, format_type=FormatType.SRT)` | batch | JSON, TXT, SRT formats |
| Webhooks | `callback="url"` | `notification_config=[NotificationConfig(url, contents, method)]` | batch | Job completion notifications |
| Job Tracking | `extra=KEY:VALUE` | `tracking=TrackingConfig(title, reference, tags)` | batch | Custom job metadata |
| Fetch from URL | `url=...` | `fetch_data=FetchData(url, auth_headers)` | batch | Transcribe from URL |
Usage:

```python
from speechmatics.batch import AsyncClient, JobConfig, JobType, TranscriptionConfig, SummarizationConfig

async with AsyncClient(api_key="YOUR_KEY") as client:
    config = JobConfig(
        type=JobType.TRANSCRIPTION,
        transcription_config=TranscriptionConfig(
            language="en",
            diarization="speaker",
            enable_entities=True
        ),
        summarization_config=SummarizationConfig(
            content_type="conversational",
            summary_length="brief"
        )
    )
    result = await client.transcribe("audio.wav", config=config)
    print(result.transcript_text)
    print(result.summary)
```

Output Formatting & Filtering
Speechmatics Packages:
`speechmatics-batch`, `speechmatics-rt` - formatting features are available in both batch and real-time.

Note: parameters like `punctuation_overrides`, `transcript_filtering_config`, and `audio_filtering_config` accept `dict` objects. The SDK passes these directly to the API - refer to the API documentation for valid keys.
| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| Smart Formatting | `smart_format=True` | `enable_entities=True` | batch, rt | Dates, numbers, currencies, emails, etc. |
| Punctuation | `punctuate=True` | Enabled by default | batch, rt | Automatic punctuation |
| Punctuation Sensitivity | Not available | `punctuation_overrides={"sensitivity": 0.4}` | batch, rt | Control punctuation frequency (0-1) |
| Punctuation Marks | Not available | `punctuation_overrides={"permitted_marks": [".", ","]}` | batch, rt | Limit allowed punctuation marks |
| Output Locale | Not available | `output_locale="en-GB"` | batch, rt | Regional spelling (en-GB, en-US, en-AU) |
| Profanity | `profanity_filter=True` | Auto-tagged for en, it, es | batch, rt | Deepgram removes; Speechmatics tags as `$PROFANITY` |
| Disfluencies | `filler_words=True` (include) | `transcript_filtering_config={"remove_disfluencies": True}` | batch, rt | Deepgram includes by opt-in; Speechmatics auto-tags, optionally removes (EN only) |
| Word Replacement | `replace=["old:new"]` | `transcript_filtering_config={"replacements": [{"from": "old", "to": "new"}]}` | batch, rt | Find/replace with regex support |
| Redaction | `redact=["pci", "ssn", "numbers"]` | `transcript_filtering_config={"replacements": [...]}` | batch, rt | Use replacements to redact sensitive data |
| Audio Filtering | Not available | `audio_filtering_config={"volume_threshold": 3.4}` | batch, rt | Remove background speech by volume (0-100) |
| Custom Vocab | `keywords=["term"]`, `keyterm=["term"]` | `additional_vocab=[{"content": "term", "sounds_like": [...]}]` | batch, rt | Phonetic hints available |
Usage (Batch):

```python
from speechmatics.batch import AsyncClient, TranscriptionConfig

config = TranscriptionConfig(
    language="en",
    enable_entities=True,
    output_locale="en-GB",
    punctuation_overrides={"sensitivity": 0.4},
    transcript_filtering_config={"remove_disfluencies": True},
    additional_vocab=[
        {"content": "acetaminophen", "sounds_like": ["ah see tah min oh fen"]},
        {"content": "myocardial infarction", "sounds_like": ["my oh car dee al in fark shun"]}
    ]
)

async with AsyncClient(api_key="YOUR_KEY") as client:
    result = await client.transcribe("audio.wav", transcription_config=config)
    print(result.transcript_text)
```

Usage (Real-time):

```python
from speechmatics.rt import AsyncClient, TranscriptionConfig, AudioFormat, AudioEncoding

config = TranscriptionConfig(
    language="en",
    enable_entities=True,
    punctuation_overrides={"sensitivity": 0.4},
    transcript_filtering_config={"remove_disfluencies": True}
)

async with AsyncClient(api_key="YOUR_KEY") as client:
    await client.transcribe(
        audio_file,
        transcription_config=config,
        audio_format=AudioFormat(encoding=AudioEncoding.PCM_S16LE, sample_rate=16000)
    )
```

Text-to-Speech (TTS)
Speechmatics Package: `speechmatics-tts`
| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| API Style | REST + WebSocket | REST | tts | Both support audio output |
| Voices (EN) | Multiple voices | 4 curated voices (sarah, theo, megan, jack) | tts | Different voice selection approaches |
| Output Formats | Multiple encodings | `wav_16000`, `pcm_16000` | tts | Standard formats supported |
| Sample Rate | Configurable | 16 kHz (optimized for speech) | tts | Speech-optimized defaults |
| Bit Rate | Configurable | Optimized defaults | tts | Quality settings |
| Streaming TTS | WebSocket | HTTP chunked streaming | tts | Both support streaming audio output |
| Callback | `callback="url"` | Not available | - | Webhook support |
| Model Opt-out | `mip_opt_out=True` | Options available post-preview | tts | Privacy controls |
| Request Tags | `tag=["label"]` | Via API headers | tts | Request identification |
Usage:

```python
# Deepgram TTS
from deepgram import DeepgramClient

client = DeepgramClient(api_key="YOUR_KEY")
with client.speak.v1.audio.generate(
    text="Hello world",
    model="aura-asteria-en",
    encoding="linear16",
    sample_rate=16000
) as response:
    audio_data = response.data
```

```python
# Speechmatics TTS
from speechmatics.tts import AsyncClient, Voice, OutputFormat

async with AsyncClient(api_key="YOUR_KEY") as client:
    response = await client.generate(
        text="Hello world",
        voice=Voice.SARAH,
        output_format=OutputFormat.WAV_16000
    )
    audio_data = await response.read()
```

| Metric | Speechmatics | Deepgram |
|---|---|---|
| Word Error Rate (WER) | 6.8% | 16.5% |
| Medical Keyword Recall | 96% | - |
| Noisy Environments | Excellent | Standard |
| Accent Recognition | Market-leading | Standard |
| Multi-speaker Accuracy | Market-leading | Standard |
| Capability | Speechmatics | Deepgram |
|---|---|---|
| Languages Supported | 55+ | 30+ |
| Accuracy Consistency | Industry-leading across all | Varies by language |
| Bilingual Packs | Mandarin, Tamil, Malay, Tagalog + English | 10 European languages only |
| Real-time Translation | 30+ languages | ❌ |
| Auto Language Detection | ✅ | ✅ |
| Feature | Speechmatics | Deepgram |
|---|---|---|
| Domain-Specific Models | Medical, finance, and more | Limited |
| Custom Dictionary Size | 1,000 words included | 100 words |
| Speaker Diarization | Included | Extra charge |
| Speaker Identification | Known speaker pre-registration | ❌ |
| Speaker Focus | Focus/ignore specific speakers | ❌ |
| Deployment | Speechmatics | Deepgram |
|---|---|---|
| SaaS/Cloud | ✅ | ✅ |
| On-Premises | ✅ | Limited |
| On-Device | ✅ | ❌ |
| Air-Gapped | ✅ | ❌ |
- ISO 27001 certified
- GDPR compliant
- HIPAA compliant
Speechmatics excels in:
- Healthcare - 96% medical keyword recall with medical domain model
- Contact Centers - Speaker ID, focus, and multi-speaker accuracy
- Media & Captioning - High accuracy in noisy environments
- Finance - Enterprise security with air-gapped deployment
- Education - 55+ languages with consistent accuracy
Deepgram:

```python
from deepgram import DeepgramClient, PrerecordedOptions

client = DeepgramClient(api_key="YOUR_API_KEY")
with open("audio.wav", "rb") as audio_file:
    response = client.listen.prerecorded.transcribe_file(
        audio_file,
        PrerecordedOptions(
            model="nova-3",
            smart_format=True,
            diarize=True
        )
    )
transcript = response.results.channels[0].alternatives[0].transcript
```

Speechmatics:
```python
import asyncio
from speechmatics.batch import AsyncClient, TranscriptionConfig

async def transcribe():
    async with AsyncClient(api_key="YOUR_API_KEY") as client:
        config = TranscriptionConfig(
            language="en",
            operating_point="enhanced",
            diarization="speaker",
            enable_entities=True
        )
        with open("audio.wav", "rb") as audio_file:
            result = await client.transcribe(audio_file, transcription_config=config)
        transcript = result.transcript_text

asyncio.run(transcribe())
```

What Changed:
- Configuration now lives in a `TranscriptionConfig` object
- Simpler result access with `result.transcript_text`
- Async-first for better performance and resource management
Deepgram:

```python
from deepgram import DeepgramClient, LiveOptions
from deepgram.core.events import EventType

client = DeepgramClient(api_key="YOUR_API_KEY")
connection = client.listen.live.v("1")

def on_message(self, result, **kwargs):
    # Check if this is a final transcript result
    if hasattr(result, 'is_final') and result.is_final:
        sentence = result.channel.alternatives[0].transcript
        if len(sentence) > 0:
            print(sentence)

connection.on(EventType.MESSAGE, on_message)
connection.start(LiveOptions(model="nova-3", language="en-US", diarize=True))
connection.send(audio_chunk)
connection.finish()
```

Speechmatics:
```python
import asyncio
from speechmatics.rt import AsyncClient, ServerMessageType, TranscriptResult, AudioFormat, AudioEncoding, TranscriptionConfig

async def stream_audio():
    async with AsyncClient(api_key="YOUR_API_KEY") as client:
        @client.on(ServerMessageType.ADD_TRANSCRIPT)
        def on_transcript(message):
            result = TranscriptResult.from_message(message)
            print(result.metadata.transcript)

        @client.on(ServerMessageType.ADD_PARTIAL_TRANSCRIPT)
        def on_partial(message):
            result = TranscriptResult.from_message(message)
            print(f"Partial: {result.metadata.transcript}")

        with open("audio.wav", "rb") as audio_file:
            await client.transcribe(
                audio_file,
                transcription_config=TranscriptionConfig(
                    language="en",
                    operating_point="enhanced",
                    diarization="speaker",
                    enable_partials=True
                ),
                audio_format=AudioFormat(
                    encoding=AudioEncoding.PCM_S16LE,
                    sample_rate=16000
                )
            )

asyncio.run(stream_audio())
```

What Changed:
- Event-driven architecture with decorators
- Structured message types via the `ServerMessageType` enum
- Better type safety with `TranscriptResult` objects
- Separate events for final and partial transcripts
Deepgram:

```python
options = PrerecordedOptions(
    model="nova-3",
    diarize=True,
    utterances=True
)
response = client.listen.prerecorded.transcribe_file(audio_file, options)
for word in response.results.channels[0].alternatives[0].words:
    print(f"Speaker {word.speaker}: {word.word}")
```

Speechmatics:
```python
config = TranscriptionConfig(
    language="en",
    diarization="speaker",
    # max_speakers is optional - see note below
)
result = await client.transcribe(audio_file, transcription_config=config)
for item in result.results:
    if item.type == "word":
        print(f"Speaker {item.attaches_to}: {item.alternatives[0].content}")
```

Advantages:
- Higher accuracy in multi-speaker scenarios
- Automatic speaker count detection
- Fine-grained diarization controls via `speaker_diarization_config`
Note
`max_speakers`: When set, the system consolidates all detected speakers into the specified number of groups. For example, `max_speakers=2` with 4 actual speakers will merge them into just 2 speaker labels. Only use this when you're certain about the exact speaker count (e.g., a two-person interview). For most scenarios, omit this setting for automatic detection.
Speaker Focus allows you to designate primary speakers whose speech drives the conversation flow. This is useful for voice assistants where you want to focus on the user and ignore background speakers or the assistant's own voice.
Deepgram: Not available
Speechmatics (Voice SDK):
```python
from speechmatics.voice import VoiceAgentClient, VoiceAgentConfig, SpeakerFocusConfig, SpeakerFocusMode

config = VoiceAgentConfig(
    language="en",
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        focus_speakers=["S1"],              # Primary speaker(s) to focus on
        ignore_speakers=["__ASSISTANT__"],  # Speakers to completely exclude
        focus_mode=SpeakerFocusMode.RETAIN  # or IGNORE
    )
)

async with VoiceAgentClient(api_key="YOUR_KEY", config=config) as client:
    # Only S1 can drive conversation flow
    # Other speakers' words only appear alongside focused speaker's speech
    ...
```

Focus Mode Options:
| Mode | Behavior |
|---|---|
| `RETAIN` | Non-focused speakers' words are still emitted, but marked as passive. They only appear when a focused speaker is also speaking. |
| `IGNORE` | Non-focused speakers are completely excluded from output. |
Key Behavior: Only focused speakers can "drive" the conversation - their speech triggers VAD events, turn detection, and segment finalization. Non-focused speakers' words are processed but only emitted alongside active focused speaker content.
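The two modes can be pictured with a toy filter over labeled words. This is a simplified illustration of the semantics described above, not SDK code, and it ignores the timing aspect (RETAIN only surfaces passive words while a focused speaker is actually speaking):

```python
# Toy illustration of the two focus modes (not SDK code).
# Each word is a (speaker_id, text) pair; focus_speakers is a set of IDs.

def apply_focus(words, focus_speakers, mode):
    if mode == "IGNORE":
        # Non-focused speakers are dropped entirely.
        return [w for w in words if w[0] in focus_speakers]
    if mode == "RETAIN":
        # Non-focused words are kept but marked passive (third field False).
        return [(spk, text, spk in focus_speakers) for spk, text in words]
    raise ValueError(f"unknown mode: {mode}")

words = [("S1", "hello"), ("S2", "background"), ("S1", "world")]
print(apply_focus(words, {"S1"}, "IGNORE"))
print(apply_focus(words, {"S1"}, "RETAIN"))
```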
Deepgram:

```python
options = PrerecordedOptions(
    model="nova-3",
    keywords=["Speechmatics", "DeepSeek", "TechTerm:2"]  # keyword:boost
)
```

Speechmatics:
```python
config = TranscriptionConfig(
    language="en",
    additional_vocab=[
        {"content": "Speechmatics", "sounds_like": ["speech matics"]},
        {"content": "DeepSeek"},
        {"content": "TechTerm", "sounds_like": ["tek term", "tech term"]},
    ]
)
```

Features:
- Phonetic alternatives with `sounds_like` for pronunciation variants
- 1,000 words included (vs Deepgram's 100)
- Better recognition of domain-specific terms
Deepgram:

```python
options = PrerecordedOptions(
    model="nova-3",
    profanity_filter=True,  # Removes profanities
    filler_words=True,      # Includes filler words in the transcript
    replace=["SSN:REDACTED", "password:REDACTED"]
)
```

Speechmatics:
```python
# Profanity tagging is automatic for en, it, es
config = {
    "language": "en",
    "transcript_filtering_config": {
        "remove_disfluencies": True,  # Remove "um", "uh", etc.
        "replacements": [
            {"from": "SSN", "to": "REDACTED"},
            {"from": "password", "to": "REDACTED"}
        ]
    }
}
```

Key Differences:
- Profanity: Deepgram removes it; Speechmatics auto-tags it (appears as `$PROFANITY`)
- Disfluencies: both support removal of filler words
- Redaction: both support word replacement
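Because `replacements` supports regex (per the formatting table above), you can redact patterns rather than literal words. The snippet below is plain Python using `re.sub` to preview what a pattern-based rule does to a transcript; the exact regex dialect the API accepts is an assumption to verify against the API documentation:

```python
import re

# Plain-Python preview of a regex-based redaction rule, of the kind the
# transcript_filtering_config "replacements" list expresses.
# The SSN pattern is illustrative.

replacements = [
    {"from": r"\d{3}-\d{2}-\d{4}", "to": "[REDACTED-SSN]"},
]

def redact(text: str) -> str:
    """Apply each replacement rule in order."""
    for rule in replacements:
        text = re.sub(rule["from"], rule["to"], text)
    return text

print(redact("My SSN is 123-45-6789."))  # My SSN is [REDACTED-SSN].
```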
Deepgram response:

```json
{
  "metadata": {...},
  "results": {
    "channels": [{
      "alternatives": [{
        "transcript": "Full transcript text",
        "confidence": 0.98,
        "words": [
          {
            "word": "hello",
            "start": 0.0,
            "end": 0.5,
            "confidence": 0.99,
            "speaker": 0
          }
        ]
      }]
    }]
  }
}
```

Speechmatics response:

```json
{
  "transcript_text": "Full transcript text",
  "results": [
    {
      "type": "word",
      "start_time": 0.0,
      "end_time": 0.5,
      "alternatives": [
        {
          "content": "hello",
          "confidence": 0.99
        }
      ],
      "attaches_to": "speaker_1"
    }
  ],
  "metadata": {...}
}
```

Key Differences:
- Speechmatics provides `transcript_text` at the top level for quick access
- Results are a flat array instead of nested channels
- The speaker is referenced via the `attaches_to` field
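Given the flat results array, reassembling a per-speaker transcript is a simple walk. The sketch below is pure Python over the response shape shown above (field names are taken from that example, so verify them against real API output):

```python
# Group words by speaker from a Speechmatics-style flat results array.
# Field names follow the example response shown above.

def by_speaker(results):
    """Return {speaker_id: 'joined words'} from a flat results list."""
    grouped = {}
    for item in results:
        if item["type"] != "word":
            continue  # skip punctuation and other item types
        speaker = item.get("attaches_to", "unknown")
        grouped.setdefault(speaker, []).append(item["alternatives"][0]["content"])
    return {spk: " ".join(words) for spk, words in grouped.items()}

sample = [
    {"type": "word", "attaches_to": "speaker_1",
     "alternatives": [{"content": "hello", "confidence": 0.99}]},
    {"type": "word", "attaches_to": "speaker_2",
     "alternatives": [{"content": "hi", "confidence": 0.97}]},
    {"type": "word", "attaches_to": "speaker_1",
     "alternatives": [{"content": "there", "confidence": 0.98}]},
]
print(by_speaker(sample))  # {'speaker_1': 'hello there', 'speaker_2': 'hi'}
```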
- Text-to-text search/keyword boosting
- Phonetic hints (`sounds_like` in `additional_vocab`)
- Real-time translation (`TranslationConfig`)
- Turn detection for voice agents (Voice SDK) with FIXED, ADAPTIVE, and EXTERNAL modes, plus Smart Turn ML
- Comprehensive audio intelligence (sentiment + topics + summary together)
- More granular speaker diarization controls (`SpeakerDiarizationConfig`)
- Known speaker pre-registration (`speaker_diarization_config.speakers`)
- Speaker Focus configuration - designate primary speakers, ignore others (e.g., the assistant's voice)
- Voice SDK for conversational AI
- Auto-disfluency tagging (automatic for English)
- On-device and air-gapped deployment
- Review feature mapping table above
- Identify features you're currently using in Deepgram
- Check language support for your use case
- Sign up at portal.speechmatics.com
- Get API key from portal
- Apply code `SWITCH200` for $200 free credit
- Install the SDK: `pip install speechmatics-batch speechmatics-rt`
- Replace `DEEPGRAM_API_KEY` with `SPEECHMATICS_API_KEY`
- Update imports from `deepgram` to `speechmatics.batch` or `speechmatics.rt`
- Convert `PrerecordedOptions`/`LiveOptions` to `TranscriptionConfig`
- Update event handlers (replace `EventType` with `ServerMessageType`)
- Adjust result parsing (use `result.transcript_text`)
- Test with same audio files used in Deepgram
- Verify accuracy meets or exceeds previous results
- Test error handling and retry logic
- Performance testing for streaming use cases
- Update production environment variables
- Deploy to staging environment
- Monitor transcription quality
- Verify usage metrics in portal
Speechmatics SDK is async-first:

```python
import asyncio

async def main():
    async with AsyncClient(api_key="YOUR_API_KEY") as client:
        result = await client.transcribe(audio_file, transcription_config=config)
        print(result.transcript_text)

asyncio.run(main())
```

```python
# Deepgram
text = response.results.channels[0].alternatives[0].transcript

# Speechmatics - simpler
text = result.transcript_text
```

```python
# Deepgram - uses a generic MESSAGE event; check is_final for final vs partial
connection.on(EventType.MESSAGE, on_message)

# Speechmatics - separate events for final and partial
@client.on(ServerMessageType.ADD_TRANSCRIPT)
def on_transcript(message):
    ...
```

```python
# Deepgram - in options
options = LiveOptions(encoding="linear16", sample_rate=16000)

# Speechmatics - separate object
audio_format = AudioFormat(encoding=AudioEncoding.PCM_S16LE, sample_rate=16000)
```

```python
# Deepgram - requires locale variants
options = PrerecordedOptions(language="en-US")  # or "en-GB", "en-AU"

# Speechmatics - just the language code; handles all accents automatically
config = TranscriptionConfig(language="en")  # Works for US, UK, AU, etc.

# Mandarin uses output_locale for character formatting
config = TranscriptionConfig(
    language="cmn",
    output_locale="cmn-Hans"  # Simplified Chinese (or "cmn-Hant" for Traditional)
)
```

Speechmatics' models are trained on diverse accents and don't require locale specification. Use `output_locale` for region-specific formatting (e.g., "en-GB" vs "en-US" spelling, or "cmn-Hans" vs "cmn-Hant" for Mandarin characters).
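The language-code change above can be automated when porting many configurations. The helper below is hypothetical (not SDK code): it strips Deepgram locale variants down to bare codes and keeps the regional part as a candidate `output_locale`, with Mandarin special-cased per the guidance above (Simplified formatting is assumed for the `zh` case):

```python
# Hypothetical helper: convert a Deepgram-style language code to a
# Speechmatics (language, output_locale) pair, per the guidance above.

def convert_language(deepgram_code: str):
    if deepgram_code.lower().startswith("zh"):
        # The guide uses "cmn" for Mandarin; Simplified formatting assumed here
        # (use "cmn-Hant" for Traditional).
        return ("cmn", "cmn-Hans")
    base = deepgram_code.split("-")[0]
    # Keep the full code as output_locale for regional spelling, if it had one.
    locale = deepgram_code if "-" in deepgram_code else None
    return (base, locale)

print(convert_language("en-US"))  # ('en', 'en-US')
print(convert_language("en"))     # ('en', None)
```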
Deepgram:

```python
from deepgram import DeepgramClient, PrerecordedOptions
import os

def transcribe_audio():
    client = DeepgramClient(api_key=os.getenv("DEEPGRAM_API_KEY"))
    with open("audio.wav", "rb") as audio_file:
        response = client.listen.prerecorded.transcribe_file(
            audio_file,
            PrerecordedOptions(
                model="nova-3",
                smart_format=True,
                diarize=True,
                language="en-US",
                keywords=["ProductName", "TechTerm"]
            )
        )
    return response.results.channels[0].alternatives[0].transcript

print(transcribe_audio())
```

Speechmatics:

```python
import asyncio
import os
from speechmatics.batch import AsyncClient, TranscriptionConfig

async def transcribe_audio():
    async with AsyncClient(api_key=os.getenv("SPEECHMATICS_API_KEY")) as client:
        config = TranscriptionConfig(
            language="en",
            operating_point="enhanced",
            diarization="speaker",
            enable_entities=True,
            additional_vocab=[
                {"content": "ProductName"},
                {"content": "TechTerm"}
            ]
        )
        with open("audio.wav", "rb") as audio_file:
            result = await client.transcribe(audio_file, transcription_config=config)
        return result.transcript_text

print(asyncio.run(transcribe_audio()))
```

See complete working examples in:
- Batch vs Real-time - Understand API modes
- Voice Agent Turn Detection - Voice SDK with presets
- Email: devrel@speechmatics.com
- SDK Documentation
- Why Switch from Deepgram - Official comparison
- Hello World - Start here
- Batch vs Real-time - Understand API modes
- Configuration Guide - All config options
Help us improve this guide:
- Found an issue? Report it
- Have suggestions? Open a discussion
Time to Migrate: 30-60 minutes • Difficulty: Intermediate • Languages: Python