
Separator is not found, and chunk exceed the limit #7

@tath105

I tried to run the program on my cloud machine, but I got the error in the title (full log below). Any advice you can kindly offer?

Here is the test data I tried to use: https://github.com/fcbond/hkcancor/blob/master/sample/d1.mp3

========
2024-12-25 20:12:27,158 - main - INFO - handle_audio_file called.
2024-12-25 20:12:27,159 - main - INFO - allow_audio_files: True
2024-12-25 20:12:27,159 - main - INFO - Received an audio file from user 1610882654: d1.mp3
2024-12-25 20:12:27,159 - main - INFO - Extracted file extension: mp3
2024-12-25 20:12:27,159 - main - INFO - Allowed formats: ['mp3', 'wav', 'm4a', 'aac', 'flac', 'ogg', 'wma', 'aiff']
2024-12-25 20:12:28,570 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getFile "HTTP/1.1 200 OK"
2024-12-25 20:12:28,963 - httpx - INFO - HTTP Request: GET https://api.telegram.org/file/botIDDDDDDDDD%3AAAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/music/file_5.mp3 "HTTP/1.1 200 OK"
2024-12-25 20:12:29,994 - main - INFO - File downloaded to audio_messages/AgADqhQAAstBYFc.mp3
2024-12-25 20:12:29,995 - main - INFO - Processing task for user ID 1610882654: audio_messages/AgADqhQAAstBYFc.mp3
2024-12-25 20:12:29,997 - transcription_handler - INFO - No custom model for user 1610882654. Using default model: turbo
2024-12-25 20:12:29,997 - transcription_handler - INFO - Returning custom language for user 1610882654: Cantonese
2024-12-25 20:12:29,997 - main - INFO - Processing audio/video file: audio_messages/AgADqhQAAstBYFc.mp3
2024-12-25 20:12:29,999 - transcription_handler - ERROR - No GPUs found
2024-12-25 20:12:29,999 - main - INFO - No GPU available, using CPU for transcription.
2024-12-25 20:12:30,398 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/sendMessage "HTTP/1.1 200 OK"
2024-12-25 20:12:30,400 - main - INFO - File queued for transcription. Queue length: 1
2024-12-25 20:12:30,867 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/sendMessage "HTTP/1.1 200 OK"
2024-12-25 20:12:31,307 - transcription_handler - INFO - Estimating transcription time for model: turbo and audio duration: 58.932 seconds
2024-12-25 20:12:31,307 - transcription_handler - INFO - Estimated transcription time: 7.3665 seconds
2024-12-25 20:12:31,307 - main - INFO - Audio file length:
58s

Whisper model in use:
turbo

Model language set to:
Cantonese

Estimated transcription time:
1.0 minutes.

Time now:
2024-12-25 20:12:31

Time when finished (estimate):
2024-12-25 20:13:31

Transcribing audio...
2024-12-25 20:12:31,309 - asyncio - WARNING - Executing <Task pending name='Task-1' coro=<TranscriberBot.process_queue() running at /root/whisper-transcriber-telegram-bot/src/main.py:283> created at /root/whisper-transcriber-telegram-bot/src/main.py:788> took 0.442 seconds
2024-12-25 20:12:31,706 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/sendMessage "HTTP/1.1 200 OK"
2024-12-25 20:12:31,709 - transcription_handler - INFO - Using device: cpu for transcription
2024-12-25 20:12:31,710 - transcription_handler - INFO - Starting transcription with model 'turbo' and language 'Cantonese' for: audio_messages/AgADqhQAAstBYFc.mp3
2024-12-25 20:12:31,710 - transcription_handler - INFO - Transcription command: whisper audio_messages/AgADqhQAAstBYFc.mp3 --model turbo --output_dir transcriptions --device cpu --language Cantonese
2024-12-25 20:12:31,716 - asyncio - INFO - execute program 'whisper': <_UnixSubprocessTransport pid=174943 running stdout=<_UnixReadPipeTransport fd=14 polling> stderr=<_UnixReadPipeTransport fd=17 polling>>
2024-12-25 20:12:35,933 - transcription_handler - ERROR - Whisper stderr: /usr/local/lib/python3.10/dist-packages/whisper/__init__.py:69: UserWarning: /root/.cache/whisper/large-v3-turbo.pt exists, but the SHA256 checksum does not match; re-downloading the file
2024-12-25 20:12:35,933 - transcription_handler - ERROR - Whisper stderr: warnings.warn(
2024-12-25 20:12:37,371 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:12:47,577 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:12:57,782 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:13:07,995 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:13:18,217 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:13:28,420 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:13:38,626 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:13:48,838 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:13:59,040 - httpx - INFO - HTTP Request: POST https://api.telegram.org/botIDDDDDDDDD:AAFy2pmOMEgqLEwX2_i2s_F8iLSshDmJB5M/getUpdates "HTTP/1.1 200 OK"
2024-12-25 20:14:04,570 - transcription_handler - ERROR - An error occurred during transcription: Separator is not found, and chunk exceed the limit
2024-12-25 20:14:04,570 - main - INFO - Transcription paths returned: {}
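
A note on where this message comes from: "Separator is not found, and chunk exceed the limit" is the text that `asyncio.StreamReader.readline()` raises (as a `ValueError`) when it buffers more than its limit (64 KiB by default) without seeing a `\n`. One possible cause here, and this is only a guess from the log, is that the checksum mismatch triggered a model re-download whose progress output on stderr is redrawn with `\r` rather than `\n`, so the reader never finds a line ending. The sketch below reproduces the error on an in-memory stream and shows one way to read such output in fixed-size chunks instead. The helper names (`make_reader`, `readline_error`, `read_safely`) are made up for illustration and are not taken from the bot's code.

```python
import asyncio
import re


def make_reader(data: bytes, limit: int) -> asyncio.StreamReader:
    """Build an in-memory StreamReader that stands in for a subprocess pipe."""
    reader = asyncio.StreamReader(limit=limit)
    reader.feed_data(data)
    reader.feed_eof()
    return reader


async def readline_error(data: bytes, limit: int) -> str:
    """Return the error text readline() raises, or "" if it succeeds."""
    reader = make_reader(data, limit)
    try:
        await reader.readline()
    except ValueError as exc:
        return str(exc)
    return ""


async def read_safely(data: bytes, limit: int, chunk_size: int = 4096) -> list[str]:
    """Read in fixed-size chunks and split on either \\r or \\n."""
    reader = make_reader(data, limit)
    lines: list[str] = []
    pending = b""
    while True:
        chunk = await reader.read(chunk_size)  # read() is not bound by the line limit
        if not chunk:
            break
        pending += chunk
        # Everything before the last separator is complete; keep the tail.
        *complete, pending = re.split(rb"[\r\n]", pending)
        lines.extend(p.decode(errors="replace") for p in complete if p)
    if pending:
        lines.append(pending.decode(errors="replace"))
    return lines


if __name__ == "__main__":
    # Progress-bar style output: many updates separated by \r, never a \n.
    data = b"".join(b"\rstep %03d" % i for i in range(50))
    print(asyncio.run(readline_error(data, limit=64)))
    print(len(asyncio.run(read_safely(data, limit=64))))
```

If the stream is read line by line on purpose, another option may be to pass a larger `limit=` to `asyncio.create_subprocess_exec()`, which sets the buffer limit for the process's stdout and stderr readers. Whether either change fits the bot's actual reading loop is something the maintainer would need to confirm.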

Labels: bug (Something isn't working)
