Commit dc965a6

made a few changes to the video indexer and custom neural voice and also updated readme
1 parent 99b849f commit dc965a6

3 files changed: +46 -10 lines changed


3_custom_voices.py

Lines changed: 17 additions & 8 deletions
@@ -2,10 +2,11 @@
 import yaml
 import argparse
 
-def speak_to_azure(speech_key, speech_region, endpoint):
-    config = speechsdk.SpeechConfig(subscription=speech_key, region=speech_region)
-    config.endpoint_id = endpoint
-    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=config)
+def speak_to_azure(speech_key, speech_region, speech_voice, custom_endpoint):
+    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=speech_region)
+    speech_config.speech_synthesis_voice_name = speech_voice
+    speech_config.endpoint_id = custom_endpoint
+    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
     print("Type some text that you want to speak...")
     text = input()
     result = speech_synthesizer.speak_text_async(text).get()
@@ -32,9 +33,17 @@ def main():
     config = load_yaml(args.config_file)
     region = config['azure']['region']
     speech_key = config['azure']['speech_key']
-    endpoint = config['azure']['custom_endpoint_id']
+    custom_endpoint = config['azure']['custom_endpoint_id']
     print(f"Region: {region}")
     print(f"Speech Key: ****")
-    print(f"Endpoint ID: ****")
-    speak_to_azure(speech_key, region, endpoint)
-    print("Done!")
+
+    speak_to_azure(speech_key, region, "Richard VoiceNeural", custom_endpoint)
+    print("Done!")
+
+if __name__ == '__main__':
+    main()
+# python3 3_custom_voices.py richard.yaml
+
+"""
+This is the story of the book "The Hitchhiker's Guide to the Galaxy"—perhaps the most remarkable, certainly the most successful book ever to come out of the great publishing corporations of Ursa Minor. More popular than The Celestial Home Care Omnibus, better-selling than Fifty-Three More Things to Do in Zero Gravity, and more controversial than Oolon Colluphid's trilogy of philosophical blockbusters, Where God Went Wrong, Some More of God's Greatest Mistakes, and Who is This God Person Anyway?
+"""

6_video_indexing.py

Lines changed: 0 additions & 1 deletion
@@ -33,7 +33,6 @@ def get_video_summary(location, account_id, access_token, video_id, language="en
         'accessToken': access_token,
         'format': 'srt',
         'language': language
-
     }
 
     headers = {

README.md

Lines changed: 29 additions & 1 deletion
@@ -1,5 +1,33 @@
 # Summary
 
-This is based on an UK Azure User Group talk that I've done. Contains some sample files which I'll keep updating. A lot of this should make sense, however, I'll try and add some narratives and markdown for video indexing and speech studio as well.
+This is my talk on Speaking to Azure. As Azure grows in complexity and aligns to the age of AI, we are being gifted with a bunch of services which make quickstarting much easier. In this case, understanding the Speech SDK and what you can do with voice in your applications is key. This set of examples and slides shows the breadth but not the depth of how you can leverage speech services in Azure. Feel free to use as needed and just add a citation.
 
 # File descriptions
+
+There are seven examples in this repository. Here is a short description of each and how to configure them.
+
+**1_record_azure.py** - this contains a method to record your voice using pyaudio. To install, use pip to install all of the dependencies. I'll be adding a requirements file to help in a future release. If you are using Mac or Linux, use brew or apt-get to install portaudio first.
+**2_voice_of_azure.py** - this allows you to choose between three voices, Sonia, Jorge and Aria, from the Microsoft voice collection.
+**3_custom_voices.py** - this uses a custom neural voice that has been trained on your voice. At the time of writing, custom neural voice is in private preview, so you will need to request access. You need at least 20 training samples, recorded through *Speech Studio*, and this will give you the ability to use your voice as per sample 2 with a custom endpoint.
+**4_transcribing_voice.py** - this will allow you to transcribe a wav file and output the speech as text. It uses the batch speech to text API.
+**5_decomposing_video.py** - this takes parts of the first sample and allows you to record video and strip the audio channel from it so that you can use the *Speech service SDK*.
+**6_video_indexing.py** - takes an example from the Video Indexer service to show how you can use the API to get insights and audio data from the service.
+**7_speech_enrolment.py** - [TBC] uses the speech enrolment API, which allows you to enrol as a speaker using voice snippets and then verify or identify speakers in audio.
+
+To configure, use the following **config.yaml** file, replacing the values below.
+```
+azure:
+  region: <add your speech region here>
+  speech_key: <add your speech key here>
+  custom_endpoint_id: <add your custom endpoint id here>
+  video_key: <add your video key here>
+  video_account_id: <add your video account id here>
+```
+
+To complete this, make sure that you get the values from the right places.
+
+- *region* - the Azure region, generally written as a lower-case concatenation of the region name; for example, West Europe is **westeurope**.
+- *speech_key* - the key that comes from the Azure Speech Service. You can find this in the Azure Portal.
+- *custom_endpoint_id* - the endpoint of the trained custom neural voice, which you can find in the speech services portal after you've deployed your model. More details here: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/custom-neural-voice-lite
+- *video_key* - the key for the Video Indexer account, which can be found through the Azure Portal in the service blade.
+- *video_account_id* - the id of the video account, which can also be found through the Video Indexer Portal.
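
The configuration notes in the README could be checked in code before any of the samples call Azure. What follows is only a minimal sketch and is not part of this commit: `REQUIRED_KEYS`, `normalise_region` and `validate_config` are hypothetical names, and the sketch assumes that all five keys are required (an individual sample may only need some of them). It takes the already-parsed config dict, so it needs nothing beyond the standard library.

```python
# Hypothetical helper, not part of the repository: checks a parsed config.yaml.
# Assumption: every key listed in the README template is treated as required.
REQUIRED_KEYS = (
    "region",
    "speech_key",
    "custom_endpoint_id",
    "video_key",
    "video_account_id",
)


def normalise_region(display_name: str) -> str:
    """Turn a display name such as 'West Europe' into 'westeurope'."""
    return "".join(display_name.split()).lower()


def validate_config(config: dict) -> dict:
    """Return the cleaned 'azure' section, or raise ValueError listing problems."""
    azure = config.get("azure")
    if not isinstance(azure, dict):
        raise ValueError("config must contain a top-level 'azure' mapping")

    problems = []
    for key in REQUIRED_KEYS:
        value = azure.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key}: missing or empty")
        elif value.strip().startswith("<") and value.strip().endswith(">"):
            # The template placeholders look like "<add your speech key here>".
            problems.append(f"{key}: still holds the template placeholder")
    if problems:
        raise ValueError("invalid config: " + "; ".join(problems))

    cleaned = {key: azure[key].strip() for key in REQUIRED_KEYS}
    cleaned["region"] = normalise_region(cleaned["region"])
    return cleaned
```

One way to use it would be to pass in the dict that the samples already load from the YAML file and read `region`, `speech_key` and the other values from the returned mapping, so that a forgotten placeholder fails early with a readable message rather than as an authentication error from the service.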
