Migrating from Deepgram to Speechmatics

Migration · Feature Comparison · Code Examples

Switching from Deepgram? This guide shows you equivalent features and code patterns to help you migrate smoothly.

**Note:** Get $200 free credit with code `SWITCH200` when switching from Deepgram.


Feature Mapping

Core Configuration

| Feature | Deepgram | Speechmatics | Notes |
|---|---|---|---|
| Model Selection | `model="nova-3"` | `operating_point="enhanced"` | `"enhanced"` for best accuracy, `"standard"` for faster turnaround |
| Language | `language="en-US"` | `language="en"` | Speechmatics uses ISO 639-1 codes; no locale variants needed (handles all accents automatically). Mandarin uses `cmn` with `output_locale` for Simplified/Traditional formatting |
| Sample Rate | `sample_rate=16000` | `sample_rate=16000` | Same parameter, in `AudioFormat` |
| Encoding | `encoding="linear16"` | `encoding="pcm_s16le"` | Slightly different naming |
| Channels | `channels=1` | `diarization="channel"` + `AsyncMultiChannelClient` | Speechmatics uses separate streams per channel |
| API Key | `DEEPGRAM_API_KEY` | `SPEECHMATICS_API_KEY` | Environment variable naming |
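To make the mapping concrete, here is a typical Deepgram request (nova-3, en-US, 16 kHz linear16) alongside its Speechmatics counterpart, shown as plain dicts rather than SDK objects so each field from the table is visible. This is a sketch of the parameter correspondence, not the exact wire format:

```python
# Typical Deepgram options.
deepgram_request = {
    "model": "nova-3",
    "language": "en-US",
    "encoding": "linear16",
    "sample_rate": 16000,
}

# The equivalent values as they appear in Speechmatics configuration
# (the SDK wraps these in TranscriptionConfig / AudioFormat objects).
speechmatics_equivalent = {
    "transcription_config": {
        "operating_point": "enhanced",  # counterpart of model selection
        "language": "en",               # ISO 639-1, no locale variant needed
    },
    "audio_format": {
        "encoding": "pcm_s16le",        # linear16 -> pcm_s16le
        "sample_rate": 16000,           # same value, different home
    },
}
```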

Real-time Streaming & Voice Features

Speechmatics packages: `speechmatics-rt` for basic real-time streaming; `speechmatics-voice` for voice agent features (turn detection, segments, VAD events). The Voice SDK is built on top of the RT SDK.

| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| Interim Results | `interim_results=True` | `enable_partials=True` | rt, voice | Partial transcripts while processing |
| Endpointing | `endpointing=500` (ms) | `max_delay=0.5` (seconds) | rt, voice | Duration the engine waits to verify partial word accuracy before committing (0.7-4.0 s) |
| Max Delay Mode | Not available | `max_delay_mode="flexible"` or `"fixed"` | rt, voice | Flexible allows entity completion |
| Utterance End | `utterance_end_ms=1000` | `end_of_utterance_silence_trigger=1.0` | rt, voice | Reference silence duration (0-2 s); ADAPTIVE mode scales this based on speech patterns |
| Force End Utterance | `Finalize` message | `client.finalize(end_of_turn=True)` | voice | Manually trigger end of utterance |
| VAD Events | `vad_events=True` (Beta) | `AgentServerMessageType.SPEAKER_STARTED`<br>`AgentServerMessageType.SPEAKER_ENDED` | voice | Voice activity detection events |
| Diarization | `diarize=True` | `diarization="speaker"` | rt, voice | Speaker labeling |
| Speaker Config | Not available | `speaker_diarization_config=SpeakerDiarizationConfig(...)` | rt, voice | Fine-tune diarization |
| Known Speakers | Not available | `known_speakers=[SpeakerIdentifier(label, speaker_identifiers)]` | rt, voice | Pre-register speaker voices |
| Speaker Focus | Not available | `SpeakerFocusConfig(focus_speakers, ignore_speakers, focus_mode)` | voice | Focus on specific speakers; only focused speakers drive conversation flow |
| Multichannel | `multichannel=True` | `diarization="channel"` or `"channel_and_speaker"` | rt, voice | Channel-based diarization |
| Channel Labels | Not available | `channel_diarization_labels=["agent", "customer"]` | rt, voice | Label audio channels |
| Keywords/Keyterms | `keywords=["term"]`, `keyterm=["term"]` | `additional_vocab=[{"content": "term"}]` | rt, voice | Boost specific terms |
| Translation | Not available | `translation_config=TranslationConfig(target_languages=["es"])` | rt | Real-time translation |
| Audio Events | Not available | `audio_events_config=AudioEventsConfig(types=[...])` | rt | Detect laughter, applause, etc. |
| Domain | Not available | `domain="medical"` | rt, voice | Domain-optimized language pack |

Turn Detection (Voice SDK):

| Feature | Deepgram | Speechmatics | Notes |
|---|---|---|---|
| Fixed Delay | Via settings | `EndOfUtteranceMode.FIXED` | Waits exactly the configured silence duration every time |
| Adaptive Delay | Not available | `EndOfUtteranceMode.ADAPTIVE` | Scales wait time based on speech pace, filler words (um/uh), and punctuation |
| Smart Turn (ML) | Not available | `smart_turn_config=SmartTurnConfig(enabled=True)` | Uses an ML model to predict semantic turn completion (with ADAPTIVE mode) |
| External Control | Not available | `EndOfUtteranceMode.EXTERNAL` + `client.finalize(end_of_turn=True)` | Application controls turn endings (for Pipecat/LiveKit integration) |
| Silence Trigger | Via settings | `end_of_utterance_silence_trigger` | Reference duration (0-2 s); ADAPTIVE mode applies multipliers based on context |
| Presets | Not available | `preset="fast"`, `"fixed"`, `"adaptive"`, `"smart_turn"`, `"scribe"`, `"captions"`, `"external"` | Ready-to-use configurations optimized for specific use cases |
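As a sketch of how these options compose: you can start from a ready-made preset or set the turn-detection fields explicitly. Note that passing `preset` directly to `VoiceAgentConfig` is an assumption based on the Presets row above, and the numeric values are illustrative, not recommendations:

```python
from speechmatics.voice import VoiceAgentConfig, EndOfUtteranceMode, SmartTurnConfig

# Option 1: a ready-made preset (placement of the preset keyword is assumed
# from the Presets row above).
preset_config = VoiceAgentConfig(preset="adaptive")

# Option 2: explicit turn-detection settings (values are illustrative).
explicit_config = VoiceAgentConfig(
    language="en",
    end_of_utterance_mode=EndOfUtteranceMode.ADAPTIVE,  # scales with speech pace
    end_of_utterance_silence_trigger=0.7,               # reference silence (0-2 s)
    smart_turn_config=SmartTurnConfig(enabled=True),    # ML turn prediction
)
```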

Server Message Types:

| Deepgram Event | Speechmatics Event | Package | Notes |
|---|---|---|---|
| `EventType.MESSAGE` (`is_final=True`) | `ServerMessageType.ADD_TRANSCRIPT` | rt | Final transcript |
| `EventType.MESSAGE` (`is_final=False`) | `ServerMessageType.ADD_PARTIAL_TRANSCRIPT` | rt | Partial results |
| `EventType.MESSAGE` (UtteranceEnd) | `ServerMessageType.END_OF_UTTERANCE` | rt | End of utterance |
| `EventType.MESSAGE` (SpeechStarted) | `AgentServerMessageType.SPEAKER_STARTED` | voice | Speech detected |
| `EventType.MESSAGE` (Metadata) | `ServerMessageType.RECOGNITION_STARTED` | rt, voice | Session metadata |
| Not available | `AgentServerMessageType.SPEAKER_ENDED` | voice | Speech ended |
| Not available | `AgentServerMessageType.ADD_SEGMENT` | voice | Final segment |
| Not available | `AgentServerMessageType.ADD_PARTIAL_SEGMENT` | voice | Partial segment |
| Not available | `AgentServerMessageType.START_OF_TURN` | voice | Turn started |
| Not available | `AgentServerMessageType.END_OF_TURN` | voice | Turn completed |
| Not available | `AgentServerMessageType.END_OF_TURN_PREDICTION` | voice | Turn prediction timing |
| Not available | `ServerMessageType.ADD_TRANSLATION` | rt | Translation result |
| Not available | `ServerMessageType.AUDIO_EVENT_STARTED` / `ENDED` | rt | Audio events |
| Not available | `ServerMessageType.SPEAKERS_RESULT` | rt | Speaker identification |

Usage - Basic RT Streaming:

```python
from speechmatics.rt import AsyncClient, ServerMessageType, TranscriptionConfig, AudioFormat, AudioEncoding

async with AsyncClient(api_key="YOUR_KEY") as client:
    @client.on(ServerMessageType.ADD_TRANSCRIPT)
    def on_transcript(message):
        print(message['metadata']['transcript'])

    await client.transcribe(
        audio_file,
        transcription_config=TranscriptionConfig(language="en", diarization="speaker"),
        audio_format=AudioFormat(encoding=AudioEncoding.PCM_S16LE, sample_rate=16000)
    )
```

Usage - Voice SDK (Turn Detection):

```python
from speechmatics.voice import VoiceAgentClient, VoiceAgentConfig, EndOfUtteranceMode, AgentServerMessageType

config = VoiceAgentConfig(
    language="en",
    enable_diarization=True,
    end_of_utterance_mode=EndOfUtteranceMode.ADAPTIVE,
    end_of_utterance_silence_trigger=0.5
)

async with VoiceAgentClient(api_key="YOUR_KEY", config=config) as client:
    @client.on(AgentServerMessageType.ADD_SEGMENT)
    def on_segment(message):
        for segment in message['segments']:
            print(f"[{segment['speaker_id']}]: {segment['text']}")

    @client.on(AgentServerMessageType.END_OF_TURN)
    def on_turn_end(message):
        print("User finished speaking - ready for response")

    await client.send_audio(audio_chunk)
```

Batch Transcription Features

Speechmatics package: `speechmatics-batch`

| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| Diarization | `diarize=True`, `diarize_version="latest"` | `diarization="speaker"` | batch | Speaker identification |
| Multichannel | `multichannel=True` | `diarization="channel"` or `"channel_and_speaker"` | batch | Channel-based diarization |
| Sentiment | `sentiment=True` | `sentiment_analysis_config=SentimentAnalysisConfig()` | batch | Sentiment analysis |
| Topic Detection | `topics=True` | `topic_detection_config=TopicDetectionConfig(topics=[...])` | batch | Automatic topic extraction |
| Summarization | `summarize=True` | `summarization_config=SummarizationConfig(content_type, summary_length, summary_type)` | batch | AI-powered summaries |
| Intent Recognition | `intents=True` | Not available | - | Detect user intents |
| Entity Detection | `detect_entities=True` | `enable_entities=True` | batch | Detect named entities |
| Utterances | `utterances=True`, `utt_split=0.8` | Not available | - | Split into utterances |
| Paragraphs | `paragraphs=True` | Not available | - | Paragraph segmentation |
| Dictation | `dictation=True` | Not available | - | Dictation mode formatting |
| Measurements | `measurements=True` | `enable_entities=True` | batch | Format measurements (e.g., "10 km/s") |
| Auto Chapters | Not available | `auto_chapters_config=AutoChaptersConfig()` | batch | Automatic chapter generation |
| Audio Events | Not available | `audio_events_config=AudioEventsConfig(types=[...])` | batch | Detect laughter, applause, etc. |
| Translation | Not available | `translation_config=TranslationConfig(target_languages=["es", "fr"])` | batch | Translate transcript |
| Language ID | `detect_language=True` | `language_identification_config=LanguageIdentificationConfig(expected_languages=[...])` | batch | Identify spoken language |
| Domain | Not available | `domain="medical"` | batch | Domain-optimized language pack |
| Output Locale | Not available | `output_locale="en-US"` | batch | RFC 5646 locale for output |
| Output Format | `?format=srt` | `get_transcript(job_id, format_type=FormatType.SRT)` | batch | JSON, TXT, SRT formats |
| Webhooks | `callback="url"` | `notification_config=[NotificationConfig(url, contents, method)]` | batch | Job completion notifications |
| Job Tracking | `extra=KEY:VALUE` | `tracking=TrackingConfig(title, reference, tags)` | batch | Custom job metadata |
| Fetch from URL | `url=...` | `fetch_data=FetchData(url, auth_headers)` | batch | Transcribe from URL |

Usage:

```python
from speechmatics.batch import AsyncClient, JobConfig, JobType, TranscriptionConfig, SummarizationConfig

async with AsyncClient(api_key="YOUR_KEY") as client:
    config = JobConfig(
        type=JobType.TRANSCRIPTION,
        transcription_config=TranscriptionConfig(
            language="en",
            diarization="speaker",
            enable_entities=True
        ),
        summarization_config=SummarizationConfig(
            content_type="conversational",
            summary_length="brief"
        )
    )

    result = await client.transcribe("audio.wav", config=config)
    print(result.transcript_text)
    print(result.summary)
```

Output Formatting & Filtering

Speechmatics packages: `speechmatics-batch`, `speechmatics-rt`. Formatting features are available in both batch and real-time.

Note: Parameters like punctuation_overrides, transcript_filtering_config, and audio_filtering_config accept dict objects. The SDK passes these directly to the API - refer to API documentation for valid keys.

| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| Smart Formatting | `smart_format=True` | `enable_entities=True` | batch, rt | Dates, numbers, currencies, emails, etc. |
| Punctuation | `punctuate=True` | Enabled by default | batch, rt | Automatic punctuation |
| Punctuation Sensitivity | Not available | `punctuation_overrides={"sensitivity": 0.4}` | batch, rt | Control punctuation frequency (0-1) |
| Punctuation Marks | Not available | `punctuation_overrides={"permitted_marks": [".", ","]}` | batch, rt | Limit allowed punctuation marks |
| Output Locale | Not available | `output_locale="en-GB"` | batch, rt | Regional spelling (en-GB, en-US, en-AU) |
| Profanity | `profanity_filter=True` | Auto-tagged for en, it, es | batch, rt | Deepgram removes; Speechmatics tags as `$PROFANITY` |
| Disfluencies | `filler_words=True` (include) | `transcript_filtering_config={"remove_disfluencies": True}` | batch, rt | Deepgram includes by opt-in; Speechmatics auto-tags and optionally removes (EN only) |
| Word Replacement | `replace=["old:new"]` | `transcript_filtering_config={"replacements": [{"from": "old", "to": "new"}]}` | batch, rt | Find/replace with regex support |
| Redaction | `redact=["pci", "ssn", "numbers"]` | `transcript_filtering_config={"replacements": [...]}` | batch, rt | Use replacements to redact sensitive data |
| Audio Filtering | Not available | `audio_filtering_config={"volume_threshold": 3.4}` | batch, rt | Remove background speech by volume (0-100) |
| Custom Vocab | `keywords=["term"]`, `keyterm=["term"]` | `additional_vocab=[{"content": "term", "sounds_like": [...]}]` | batch, rt | Phonetic hints available |

Usage (Batch):

```python
from speechmatics.batch import AsyncClient, TranscriptionConfig

config = TranscriptionConfig(
    language="en",
    enable_entities=True,
    output_locale="en-GB",
    punctuation_overrides={"sensitivity": 0.4},
    transcript_filtering_config={"remove_disfluencies": True},
    additional_vocab=[
        {"content": "acetaminophen", "sounds_like": ["ah see tah min oh fen"]},
        {"content": "myocardial infarction", "sounds_like": ["my oh car dee al in fark shun"]}
    ]
)

async with AsyncClient(api_key="YOUR_KEY") as client:
    result = await client.transcribe("audio.wav", transcription_config=config)
    print(result.transcript_text)
```

Usage (Real-time):

```python
from speechmatics.rt import AsyncClient, TranscriptionConfig, AudioFormat, AudioEncoding

config = TranscriptionConfig(
    language="en",
    enable_entities=True,
    punctuation_overrides={"sensitivity": 0.4},
    transcript_filtering_config={"remove_disfluencies": True}
)

async with AsyncClient(api_key="YOUR_KEY") as client:
    await client.transcribe(
        audio_file,
        transcription_config=config,
        audio_format=AudioFormat(encoding=AudioEncoding.PCM_S16LE, sample_rate=16000)
    )
```

Text-to-Speech (TTS)

Speechmatics package: `speechmatics-tts`

| Feature | Deepgram | Speechmatics | Package | Notes |
|---|---|---|---|---|
| API Style | REST + WebSocket | REST | tts | Both support audio output |
| Voices (EN) | Multiple voices | 4 curated voices (sarah, theo, megan, jack) | tts | Different voice selection approaches |
| Output Formats | Multiple encodings | `wav_16000`, `pcm_16000` | tts | Standard formats supported |
| Sample Rate | Configurable | 16 kHz (optimized for speech) | tts | Speech-optimized defaults |
| Bit Rate | Configurable | Optimized defaults | tts | Quality settings |
| Streaming TTS | WebSocket | HTTP chunked streaming | tts | Both support streaming audio output |
| Callback | `callback="url"` | Not available | - | Webhook support |
| Model Opt-out | `mip_opt_out=True` | Options available post-preview | tts | Privacy controls |
| Request Tags | `tag=["label"]` | Via API headers | tts | Request identification |

Usage:

```python
# Deepgram TTS
from deepgram import DeepgramClient
client = DeepgramClient(api_key="YOUR_KEY")
with client.speak.v1.audio.generate(
    text="Hello world",
    model="aura-asteria-en",
    encoding="linear16",
    sample_rate=16000
) as response:
    audio_data = response.data
```

```python
# Speechmatics TTS
from speechmatics.tts import AsyncClient, Voice, OutputFormat
async with AsyncClient(api_key="YOUR_KEY") as client:
    response = await client.generate(
        text="Hello world",
        voice=Voice.SARAH,
        output_format=OutputFormat.WAV_16000
    )
    audio_data = await response.read()
```


Why Switch?

Superior Accuracy

| Metric | Speechmatics | Deepgram |
|---|---|---|
| Word Error Rate (WER) | 6.8% | 16.5% |
| Medical Keyword Recall | 96% | - |
| Noisy Environments | Excellent | Standard |
| Accent Recognition | Market-leading | Standard |
| Multi-speaker Accuracy | Market-leading | Standard |
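Rather than taking vendor-reported WER figures at face value, you can measure WER on your own audio during the migration test phase. A minimal, dependency-free word error rate implementation (word-level Levenshtein distance divided by reference length):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Edit distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the quick brown fox", "the quick brown fox"))  # 0.0
print(wer("the quick brown fox", "the quik brown"))       # 0.5 (1 sub + 1 del over 4 words)
```

Run the same files through both providers and compare scores against a human reference transcript; normalize casing and punctuation first for a fair comparison.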

More Languages

| Capability | Speechmatics | Deepgram |
|---|---|---|
| Languages Supported | 55+ | 30+ |
| Accuracy Consistency | Industry-leading across all | Varies by language |
| Bilingual Packs | Mandarin, Tamil, Malay, Tagalog + English | 10 European languages only |
| Real-time Translation | 30+ languages | Not available |
| Auto Language Detection | Supported | Supported |

Advanced Features

| Feature | Speechmatics | Deepgram |
|---|---|---|
| Domain-Specific Models | Medical, finance, and more | Limited |
| Custom Dictionary Size | 1,000 words included | 100 words |
| Speaker Diarization | Included | Extra charge |
| Speaker Identification | Known speaker pre-registration | Not available |
| Speaker Focus | Focus/ignore specific speakers | Not available |

Flexible Deployment Options

| Deployment | Speechmatics | Deepgram |
|---|---|---|
| SaaS/Cloud | Supported | Supported |
| On-Premises | Supported | Limited |
| On-Device | Supported | Not available |
| Air-Gapped | Supported | Not available |

Enterprise-Grade Security

  • ISO 27001 certified
  • GDPR compliant
  • HIPAA compliant

Industries & Use Cases

Speechmatics excels in:

  • Healthcare - 96% medical keyword recall with medical domain model
  • Contact Centers - Speaker ID, focus, and multi-speaker accuracy
  • Media & Captioning - High accuracy in noisy environments
  • Finance - Enterprise security with air-gapped deployment
  • Education - 55+ languages with consistent accuracy

Code Migration Examples

Batch Transcription

Deepgram:

```python
from deepgram import DeepgramClient, PrerecordedOptions

client = DeepgramClient(api_key="YOUR_API_KEY")

with open("audio.wav", "rb") as audio_file:
    response = client.listen.prerecorded.transcribe_file(
        audio_file,
        PrerecordedOptions(
            model="nova-3",
            smart_format=True,
            diarize=True
        )
    )

transcript = response.results.channels[0].alternatives[0].transcript
```

Speechmatics:

```python
import asyncio
from speechmatics.batch import AsyncClient, TranscriptionConfig

async def transcribe():
    async with AsyncClient(api_key="YOUR_API_KEY") as client:
        config = TranscriptionConfig(
            language="en",
            operating_point="enhanced",
            diarization="speaker",
            enable_entities=True
        )

        with open("audio.wav", "rb") as audio_file:
            result = await client.transcribe(audio_file, transcription_config=config)
            transcript = result.transcript_text

asyncio.run(transcribe())
```

What Changed:

  • Configuration is now in TranscriptionConfig object
  • Simpler result access with result.transcript_text
  • Async-first for better performance and resource management

Real-time Streaming

Deepgram:

```python
from deepgram import DeepgramClient, LiveOptions
from deepgram.core.events import EventType

client = DeepgramClient(api_key="YOUR_API_KEY")
connection = client.listen.live.v("1")

def on_message(self, result, **kwargs):
    # Check if this is a final transcript result
    if hasattr(result, 'is_final') and result.is_final:
        sentence = result.channel.alternatives[0].transcript
        if len(sentence) > 0:
            print(sentence)

connection.on(EventType.MESSAGE, on_message)
connection.start(LiveOptions(model="nova-3", language="en-US", diarize=True))
connection.send(audio_chunk)
connection.finish()
```

Speechmatics:

```python
import asyncio
from speechmatics.rt import AsyncClient, ServerMessageType, TranscriptResult, AudioFormat, AudioEncoding, TranscriptionConfig

async def stream_audio():
    async with AsyncClient(api_key="YOUR_API_KEY") as client:

        @client.on(ServerMessageType.ADD_TRANSCRIPT)
        def on_transcript(message):
            result = TranscriptResult.from_message(message)
            print(result.metadata.transcript)

        @client.on(ServerMessageType.ADD_PARTIAL_TRANSCRIPT)
        def on_partial(message):
            result = TranscriptResult.from_message(message)
            print(f"Partial: {result.metadata.transcript}")

        with open("audio.wav", "rb") as audio_file:
            await client.transcribe(
                audio_file,
                transcription_config=TranscriptionConfig(
                    language="en",
                    operating_point="enhanced",
                    diarization="speaker",
                    enable_partials=True
                ),
                audio_format=AudioFormat(
                    encoding=AudioEncoding.PCM_S16LE,
                    sample_rate=16000
                )
            )

asyncio.run(stream_audio())
```

What Changed:

  • Event-driven architecture with decorators
  • Structured message types via ServerMessageType enum
  • Better type safety with TranscriptResult objects
  • Separate events for final and partial transcripts

Speaker Diarization

Deepgram:

```python
options = PrerecordedOptions(
    model="nova-3",
    diarize=True,
    utterances=True
)

response = client.listen.prerecorded.transcribe_file(audio_file, options)

for word in response.results.channels[0].alternatives[0].words:
    print(f"Speaker {word.speaker}: {word.word}")
```

Speechmatics:

```python
config = TranscriptionConfig(
    language="en",
    diarization="speaker",
    # max_speakers is optional - see note below
)

result = await client.transcribe(audio_file, transcription_config=config)

for item in result.results:
    if item.type == "word":
        print(f"Speaker {item.attaches_to}: {item.alternatives[0].content}")
```

Advantages:

  • Higher accuracy in multi-speaker scenarios
  • Automatic speaker count detection
  • Fine-grained diarization controls via speaker_diarization_config

Note

**max_speakers**: When set, the system consolidates all detected speakers into the specified number of groups. For example, `max_speakers=2` with 4 actual speakers will merge them into just 2 speaker labels. Only use this when you're certain about the exact speaker count (e.g., a two-person interview). For most scenarios, omit this setting for automatic detection.
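A two-person interview is the one case where pinning the speaker count is safe. A sketch of that configuration (the dict shape passed to `speaker_diarization_config` is an assumption based on the feature table above; the SDK also exposes a `SpeakerDiarizationConfig` object):

```python
from speechmatics.batch import TranscriptionConfig

# Sketch: a two-person interview, where the speaker count is known for certain.
# The speaker_diarization_config shape is assumed from the feature table above.
config = TranscriptionConfig(
    language="en",
    diarization="speaker",
    speaker_diarization_config={"max_speakers": 2},  # omit entirely for auto-detection
)
```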


Speaker Focus (Voice SDK Only)

Speaker Focus allows you to designate primary speakers whose speech drives the conversation flow. This is useful for voice assistants where you want to focus on the user and ignore background speakers or the assistant's own voice.

Deepgram: Not available

Speechmatics (Voice SDK):

```python
from speechmatics.voice import VoiceAgentClient, VoiceAgentConfig, SpeakerFocusConfig, SpeakerFocusMode

config = VoiceAgentConfig(
    language="en",
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        focus_speakers=["S1"],              # Primary speaker(s) to focus on
        ignore_speakers=["__ASSISTANT__"],  # Speakers to completely exclude
        focus_mode=SpeakerFocusMode.RETAIN  # or IGNORE
    )
)

async with VoiceAgentClient(api_key="YOUR_KEY", config=config) as client:
    # Only S1 can drive conversation flow
    # Other speakers' words only appear alongside focused speaker's speech
    ...
```

Focus Mode Options:

| Mode | Behavior |
|---|---|
| `RETAIN` | Non-focused speakers' words are still emitted, but marked as passive. They only appear when a focused speaker is also speaking. |
| `IGNORE` | Non-focused speakers are completely excluded from output. |

Key Behavior: Only focused speakers can "drive" the conversation - their speech triggers VAD events, turn detection, and segment finalization. Non-focused speakers' words are processed but only emitted alongside active focused speaker content.


Custom Vocabulary

Deepgram:

```python
options = PrerecordedOptions(
    model="nova-3",
    keywords=["Speechmatics", "DeepSeek", "TechTerm:2"]  # keyword:boost
)
```

Speechmatics:

```python
config = TranscriptionConfig(
    language="en",
    additional_vocab=[
        {"content": "Speechmatics", "sounds_like": ["speech matics"]},
        {"content": "DeepSeek"},
        {"content": "TechTerm", "sounds_like": ["tek term", "tech term"]},
    ]
)
```

Features:

  • Phonetic alternatives with sounds_like for pronunciation variants
  • 1,000 words included (vs Deepgram's 100)
  • Better recognition of domain-specific terms

Content Filtering

Deepgram:

```python
options = PrerecordedOptions(
    model="nova-3",
    profanity_filter=True,  # Removes profanities
    filler_words=True,      # Includes filler words (excluded by default)
    replace=["SSN:REDACTED", "password:REDACTED"]
)
```

Speechmatics:

```python
# Profanity tagging is automatic for en, it, es
config = {
    "language": "en",
    "transcript_filtering_config": {
        "remove_disfluencies": True,  # Remove "um", "uh", etc.
        "replacements": [
            {"from": "SSN", "to": "REDACTED"},
            {"from": "password", "to": "REDACTED"}
        ]
    }
}
```

Key Differences:

  • Profanity: Deepgram removes, Speechmatics auto-tags (appears as $PROFANITY)
  • Disfluencies: Both support removal of filler words
  • Redaction: Both support word replacement
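If you are redacting many terms, building the replacements list by hand gets tedious. A small helper (hypothetical, not part of either SDK) that generates a `transcript_filtering_config` in the replacements shape shown above:

```python
def redaction_config(terms, replacement="[REDACTED]"):
    """Build a transcript_filtering_config dict that redacts the given terms,
    using the {"from": ..., "to": ...} replacements shape."""
    return {
        "replacements": [{"from": term, "to": replacement} for term in terms]
    }

cfg = redaction_config(["SSN", "password"])
# {'replacements': [{'from': 'SSN', 'to': '[REDACTED]'},
#                   {'from': 'password', 'to': '[REDACTED]'}]}
```

The resulting dict can be passed as `transcript_filtering_config` in either batch or real-time configuration.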

Response Structure

Deepgram Response

```json
{
  "metadata": {...},
  "results": {
    "channels": [{
      "alternatives": [{
        "transcript": "Full transcript text",
        "confidence": 0.98,
        "words": [
          {
            "word": "hello",
            "start": 0.0,
            "end": 0.5,
            "confidence": 0.99,
            "speaker": 0
          }
        ]
      }]
    }]
  }
}
```

Speechmatics Response

```json
{
  "transcript_text": "Full transcript text",
  "results": [
    {
      "type": "word",
      "start_time": 0.0,
      "end_time": 0.5,
      "alternatives": [
        {
          "content": "hello",
          "confidence": 0.99
        }
      ],
      "attaches_to": "speaker_1"
    }
  ],
  "metadata": {...}
}
```

Key Differences:

  • Speechmatics provides transcript_text at the top level for quick access
  • Results are flat arrays instead of nested channels
  • Speaker is referenced via attaches_to field
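Because the results array is flat, grouping it is a single pass. A sketch that collects words into per-speaker text, following the field layout shown in the Speechmatics response above:

```python
def by_speaker(results):
    """Group word items from the flat Speechmatics results array into
    per-speaker text, using the field layout shown above."""
    grouped = {}
    for item in results:
        if item["type"] != "word":
            continue  # skip punctuation and other non-word items
        speaker = item.get("attaches_to", "unknown")
        grouped.setdefault(speaker, []).append(item["alternatives"][0]["content"])
    return {speaker: " ".join(words) for speaker, words in grouped.items()}

results = [
    {"type": "word", "alternatives": [{"content": "hello"}], "attaches_to": "speaker_1"},
    {"type": "word", "alternatives": [{"content": "there"}], "attaches_to": "speaker_1"},
    {"type": "word", "alternatives": [{"content": "hi"}], "attaches_to": "speaker_2"},
]
print(by_speaker(results))  # {'speaker_1': 'hello there', 'speaker_2': 'hi'}
```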

Features Unique to Each Platform

Deepgram Only

  • Text-to-text search/keyword boosting

Speechmatics Only

  • Phonetic hints (sounds_like in additional_vocab)
  • Real-time translation (TranslationConfig)
  • Turn detection for voice agents (Voice SDK) with FIXED, ADAPTIVE, and EXTERNAL modes, plus Smart Turn ML
  • Comprehensive audio intelligence (sentiment + topics + summary together)
  • More granular speaker diarization controls (SpeakerDiarizationConfig)
  • Known speaker pre-registration (speaker_diarization_config.speakers)
  • Speaker Focus configuration - designate primary speakers, ignore others (e.g., assistant voice)
  • Voice SDK for conversational AI
  • Auto-disfluency tagging (automatic for English)
  • On-device and air-gapped deployment

Migration Checklist

Pre-Migration

  • Review feature mapping table above
  • Identify features you're currently using in Deepgram
  • Check language support for your use case
  • Sign up at portal.speechmatics.com
  • Get API key from portal
  • Apply code SWITCH200 for $200 free credit

Code Migration

  • Install SDK: pip install speechmatics-batch speechmatics-rt
  • Replace DEEPGRAM_API_KEY with SPEECHMATICS_API_KEY
  • Update imports from deepgram to speechmatics.batch or speechmatics.rt
  • Convert PrerecordedOptions/LiveOptions to TranscriptionConfig
  • Update event handlers (replace EventType with ServerMessageType)
  • Adjust result parsing (use result.transcript_text)
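The option-conversion step above can be partially automated. A hypothetical helper (not part of either SDK) that translates the most common Deepgram option names to their Speechmatics `TranscriptionConfig` equivalents per the mapping tables in this guide, and surfaces anything it cannot map for manual review:

```python
# Hypothetical migration helper. Each entry translates one Deepgram option
# into the equivalent Speechmatics TranscriptionConfig keyword arguments.
OPTION_MAP = {
    "language": lambda v: {"language": v.split("-")[0]},  # "en-US" -> "en"
    "diarize": lambda v: {"diarization": "speaker"} if v else {},
    "smart_format": lambda v: {"enable_entities": True} if v else {},
    "detect_entities": lambda v: {"enable_entities": True} if v else {},
    # "TechTerm:2" -> {"content": "TechTerm"} (boost weights have no direct equivalent)
    "keywords": lambda v: {"additional_vocab": [{"content": t.split(":")[0]} for t in v]},
}

def migrate_options(deepgram_options):
    """Return (speechmatics_config_kwargs, unmapped_options)."""
    config, unmapped = {}, {}
    for key, value in deepgram_options.items():
        if key in OPTION_MAP:
            config.update(OPTION_MAP[key](value))
        else:
            unmapped[key] = value  # review by hand against the tables above
    return config, unmapped

config, todo = migrate_options({"language": "en-US", "diarize": True, "model": "nova-3"})
# config -> {'language': 'en', 'diarization': 'speaker'}; todo -> {'model': 'nova-3'}
```

`model` is intentionally left unmapped here: Deepgram's model choice corresponds to Speechmatics' `operating_point`, which you should pick deliberately rather than translate mechanically.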

Testing

  • Test with same audio files used in Deepgram
  • Verify accuracy meets or exceeds previous results
  • Test error handling and retry logic
  • Performance testing for streaming use cases

Deployment

  • Update production environment variables
  • Deploy to staging environment
  • Monitor transcription quality
  • Verify usage metrics in portal

Common Gotchas

1. Async/Await Pattern

Speechmatics SDK is async-first:

```python
import asyncio
from speechmatics.batch import AsyncClient

async def main():
    async with AsyncClient(api_key="YOUR_API_KEY") as client:
        result = await client.transcribe(audio_file, transcription_config=config)
        print(result.transcript_text)

asyncio.run(main())
```

2. Response Structure

```python
# Deepgram
text = response.results.channels[0].alternatives[0].transcript

# Speechmatics - simpler
text = result.transcript_text
```

3. Event Types (Streaming)

```python
# Deepgram - uses generic MESSAGE event, check is_final for final vs partial
connection.on(EventType.MESSAGE, on_message)

# Speechmatics - separate events for final and partial
@client.on(ServerMessageType.ADD_TRANSCRIPT)
def on_transcript(message):
    ...
```

4. Audio Format

```python
# Deepgram - in options
options = LiveOptions(encoding="linear16", sample_rate=16000)

# Speechmatics - separate object
audio_format = AudioFormat(encoding=AudioEncoding.PCM_S16LE, sample_rate=16000)
```

5. Language Codes - No Locales Required

```python
# Deepgram - requires locale variants
options = PrerecordedOptions(language="en-US")  # or "en-GB", "en-AU"

# Speechmatics - just the language code, handles all accents automatically
config = TranscriptionConfig(language="en")  # Works for US, UK, AU, etc.

# Mandarin uses output_locale for character formatting
config = TranscriptionConfig(
    language="cmn",
    output_locale="cmn-Hans"  # Simplified Chinese (or "cmn-Hant" for Traditional)
)
```

Speechmatics' models are trained on diverse accents and don't require locale specification. Use output_locale for region-specific formatting (e.g., "en-GB" vs "en-US" spelling, or "cmn-Hans" vs "cmn-Hant" for Mandarin characters).


Complete Before/After Example

Before (Deepgram)

```python
from deepgram import DeepgramClient, PrerecordedOptions
import os

def transcribe_audio():
    client = DeepgramClient(api_key=os.getenv("DEEPGRAM_API_KEY"))

    with open("audio.wav", "rb") as audio_file:
        response = client.listen.prerecorded.transcribe_file(
            audio_file,
            PrerecordedOptions(
                model="nova-3",
                smart_format=True,
                diarize=True,
                language="en-US",
                keywords=["ProductName", "TechTerm"]
            )
        )

    return response.results.channels[0].alternatives[0].transcript

print(transcribe_audio())
```

After (Speechmatics)

```python
import asyncio
import os
from speechmatics.batch import AsyncClient, TranscriptionConfig

async def transcribe_audio():
    async with AsyncClient(api_key=os.getenv("SPEECHMATICS_API_KEY")) as client:
        config = TranscriptionConfig(
            language="en",
            operating_point="enhanced",
            diarization="speaker",
            enable_entities=True,
            additional_vocab=[
                {"content": "ProductName"},
                {"content": "TechTerm"}
            ]
        )

        with open("audio.wav", "rb") as audio_file:
            result = await client.transcribe(audio_file, transcription_config=config)
            return result.transcript_text

print(asyncio.run(transcribe_audio()))
```



**Time to Migrate:** 30-60 minutes · **Difficulty:** Intermediate · **Languages:** Python
Back to Academy Home