Real-time transcription and translation using microphone input - speak in English and see live translations in multiple languages.
Demonstrate multilingual capabilities by transcribing live audio and translating it into Spanish and Russian in real time.
- How to configure real-time transcription with translation
- Working with multiple target languages simultaneously
- Handling real-time translation events
- Understanding translation timing vs transcription timing
- Managing microphone input for live processing
- Speechmatics API Key: Get one from portal.speechmatics.com
- Python 3.8+
- Microphone: Built-in or external microphone
- PyAudio: For microphone access (installation instructions below)
Step 1: Create and activate a virtual environment
On Windows:
cd python
python -m venv .venv
.venv\Scripts\activate
On Mac/Linux:
cd python
python3 -m venv .venv
source .venv/bin/activate
Step 2: Install dependencies
pip install -r requirements.txt
Note
If PyAudio installation fails, see PyAudio Installation Issues in Troubleshooting.
Step 3: Configure API key
cp ../.env.example .env
# Edit .env and add your SPEECHMATICS_API_KEY
Step 4: Run the example
python main.py
Speak into your microphone and watch the real-time transcription and translations appear!
Note
This example demonstrates real-time translation by:
- Capturing microphone input - Uses PyAudio to stream audio from your microphone (see the sketch after this list)
- Configuring source language - Sets English as the transcription language
- Enabling translation - Configures Spanish and Russian as target languages
- Streaming to Speechmatics - Sends audio chunks via WebSocket
- Receiving real-time results - Processes both transcription and translation events
- Displaying results - Shows English transcription and translations as they arrive
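Under the hood, the capture step boils down to opening a PyAudio input stream that matches the audio format configured below. A minimal capture sketch (illustrative only; the example wraps this in its own microphone helper, and this assumes chunk_size counts bytes):
import pyaudio

# 16-bit PCM (paInt16), mono, 16 kHz - matching the AudioFormat below
p = pyaudio.PyAudio()
stream = p.open(
    format=pyaudio.paInt16,
    channels=1,
    rate=16000,
    input=True,
    frames_per_buffer=2048,  # 2048 frames x 2 bytes = 4096 bytes per read
)

data = stream.read(2048)  # one 4096-byte chunk of raw PCM, ready to stream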
1. Audio and Transcription Configuration
audio_format = AudioFormat(
    encoding=AudioEncoding.PCM_S16LE,
    chunk_size=4096,
    sample_rate=16000,
)

transcription_config = TranscriptionConfig(
    language="en",  # Source language: English
    enable_partials=True,  # Show partial results
)

translation_config = TranslationConfig(
    target_languages=["es", "ru"],  # Spanish and Russian
    enable_partials=True,
)
2. Event Handlers
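The handlers below append results to two shared accumulators that also feed the end-of-session summary. Their initialization is not shown in the snippets, but it is presumably along these lines:
transcript_parts = []                # finalized English segments
translations = {"es": [], "ru": []}  # finalized translations, keyed by language code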
The example registers four event handlers:
English Transcription (Final):
@client.on(ServerMessageType.ADD_TRANSCRIPT)
def handle_final_transcript(message):
    result = TranscriptResult.from_message(message)
    transcript = result.metadata.transcript
    if transcript:
        print(f"[EN]: {transcript}")
        transcript_parts.append(transcript.strip())
English Transcription (Partial):
@client.on(ServerMessageType.ADD_PARTIAL_TRANSCRIPT)
def handle_partial_transcript(message):
    result = TranscriptResult.from_message(message)
    transcript = result.metadata.transcript
    if transcript:
        print(f"[EN partial]: {transcript}")
Translations (Final):
@client.on(ServerMessageType.ADD_TRANSLATION)
def handle_final_translation(message):
    language = message.get("language")
    if "results" in message and message["results"]:
        translation = " ".join([r["content"] for r in message["results"]])
        if translation:
            lang_name = "ES" if language == "es" else "RU"
            print(f"[{lang_name}]: {translation}")
            translations[language].append(translation.strip())
Translations (Partial):
@client.on(ServerMessageType.ADD_PARTIAL_TRANSLATION)
def handle_partial_translation(message):
    language = message.get("language")
    if "results" in message and message["results"]:
        translation = " ".join([r["content"] for r in message["results"]])
        if translation:
            lang_name = "ES" if language == "es" else "RU"
            print(f"[{lang_name} partial]: {translation}")
3. Streaming Audio
async with AsyncClient() as client:
    await client.start_session(
        transcription_config=transcription_config,
        translation_config=translation_config,
        audio_format=audio_format,
    )
    # mic is the example's microphone source, started earlier in main.py
    while True:
        frame = await mic.read(audio_format.chunk_size)
        await client.send_audio(frame)
When you speak "Hello, how's it going?" you'll see:
Microphone started - speak now...
Press Ctrl+C to stop transcription
[EN]: Hello.
[ES]: Hola.
[RU]: Добрый день.
[EN partial]: How's it going?
[EN]: How's it
[EN partial]: going?
[EN]: going?
[ES]: ¿Cómo va todo?
[RU]: Как обстоят дела?
^C
Transcription session cancelled
Full transcript: Hello. How's it going?
Spanish: Hola. ¿Cómo va todo?
Russian: Добрый день. Как обстоят дела?
Real-time Processing:
- Live microphone input streaming
- Immediate transcription feedback
- Concurrent translation to multiple languages
Event Types:
- ADD_TRANSCRIPT: Finalized English transcription segments
- ADD_PARTIAL_TRANSCRIPT: Real-time English preview as you speak
- ADD_TRANSLATION: Complete translated sentences
- ADD_PARTIAL_TRANSLATION: Translation previews
Translation Behavior:
- English transcription is word-by-word (incremental)
- Translations wait for sentence context (complete phrases)
- This is by design - translations need context for accuracy
Why English shows fragments but translations show complete sentences:
English (Transcription):
- Fires as each word is finalized
- Example: "How's it", "going?"
- Incremental real-time updates
Spanish/Russian (Translation):
- Waits for enough context
- Example: "¿Cómo va todo?" (complete sentence)
- Better accuracy through context
This is expected behavior - the translation engine batches words together to provide coherent, accurate translations rather than word-for-word fragments.
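If the interleaved partial and final lines feel noisy, one common console pattern (a generic sketch, not part of this example) is to redraw partials in place and only commit finals to their own lines:
import sys

def show_partial(text: str) -> None:
    # Overwrite the current line with the latest partial ("\x1b[K" clears to end of line)
    sys.stdout.write(f"\r[EN partial]: {text}\x1b[K")
    sys.stdout.flush()

def show_final(text: str) -> None:
    # Clear any partial still on the line, then print the finalized segment
    sys.stdout.write("\r\x1b[K")
    print(f"[EN]: {text}")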
Modify the TranslationConfig to translate to different languages:
translation_config = TranslationConfig(
    target_languages=["fr", "de", "it"],  # French, German, Italian
    enable_partials=True,
)
Transcribe in a different language:
transcription_config = TranscriptionConfig(
    language="es",  # Spanish input
    enable_partials=True,
)
Only show final results:
transcription_config = TranscriptionConfig(
    language="en",
    enable_partials=False,  # No partial results
)

translation_config = TranslationConfig(
    target_languages=["es", "ru"],
    enable_partials=False,  # No partial translations
)
Source Languages (55+):
- English (en), Spanish (es), French (fr), German (de), Italian (it)
- Portuguese (pt), Dutch (nl), Russian (ru), Japanese (ja), Korean (ko)
- Chinese (zh), Arabic (ar), Hindi (hi), and 40+ more
Target Languages (55+):
- All major European languages
- Asian languages (Chinese, Japanese, Korean)
- Middle Eastern languages (Arabic, Hebrew)
- View full list: Supported Languages
- Text-to-Speech - Convert translated text back to speech
- Audio Intelligence - Extract insights from transcribed content
- Video Captioning - Generate subtitles with translation
- Call Center Analytics - Analyze multilingual customer calls
Windows:
# If pip install pyaudio fails, try:
pip install pipwin
pipwin install pyaudio
# Or download pre-built wheel from:
# https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio
pip install PyAudio-0.2.11-cp39-cp39-win_amd64.whl
Mac:
# Install portaudio first
brew install portaudio
pip install pyaudio
Linux (Ubuntu/Debian):
sudo apt-get install portaudio19-dev
pip install pyaudio
"Microphone not available" message
- Check that PyAudio is installed: pip list | grep PyAudio
- Verify microphone permissions in your system settings
- Test your microphone with another application
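You can also ask PyAudio directly which input devices it sees (a quick diagnostic sketch):
import pyaudio

p = pyaudio.PyAudio()
for i in range(p.get_device_count()):
    info = p.get_device_info_by_index(i)
    if info.get("maxInputChannels", 0) > 0:  # input-capable devices only
        print(i, info["name"])
p.terminate()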
"Authentication failed" error
- Verify your API key in the .env file
- Check your key at portal.speechmatics.com
- Ensure there are no extra spaces in the .env file
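To confirm the key actually loads (assuming the example reads it with python-dotenv, as the .env setup suggests):
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory
key = os.environ.get("SPEECHMATICS_API_KEY")
print("key loaded:", bool(key), "| length:", len(key or ""))  # never print the key itself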
No translations appearing
- Speak complete sentences (translations need context)
- Wait 1-2 seconds after speaking
- Check that target languages are supported for translation
Double spaces in final transcript
- This is handled by calling .strip() on each segment
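Because each segment is stripped on arrival, the end-of-session summary can safely join segments with single spaces; roughly:
full_transcript = " ".join(transcript_parts)
print(f"Full transcript: {full_transcript}")
for lang, parts in translations.items():
    print(f"{lang.upper()}: {' '.join(parts)}")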
Help us improve this guide:
- Found an issue? Report it
- Have suggestions? Open a discussion
Time to Complete: 15 minutes
Difficulty: Intermediate
API Mode: Real-time
Languages: Python