This is a broad question, so I have tried to include as much detail as possible. Any help is greatly appreciated.
This is my older code, which used Google's speech recognition from Python (the speech_recognition package)...
import speech_recognition as sr

def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = .5
        audio = r.listen(source)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-us')
        # f-string so the recognized text is actually printed
        print(f"User said: {query}\n")
    except Exception as e:
        print(e)
        # speak() is my own text-to-speech helper, defined elsewhere
        speak("I can't hear you sir.")
        print("I can't hear you sir.")
        return "None"
    return query
These are some commands...
import webbrowser
import wikipedia

query = takeCommand().lower()
# Everything the user says is stored in 'query'
# and converted to lower case so that commands
# are easier to match
if 'wikipedia' in query:
    speak('Searching Wikipedia...')
    query = query.replace("wikipedia", "")
    results = wikipedia.summary(query, sentences=3)
    speak("According to Wikipedia")
    print(results)
    speak(results)
elif 'open youtube' in query:
    speak("Here you go to Youtube\n")
    webbrowser.open("youtube.com")
elif 'open google' in query:
    speak("Here you go to Google\n")
    webbrowser.open("google.com")
elif 'open stackoverflow' in query:
    speak("Here is Stack Over Flow. Be sure to give me an upgrade!")
    webbrowser.open("stackoverflow.com")
Now I am using IBM Cloud text to speech for a more advanced voice. Here is my current code, which speaks to me when I run it from the PyCharm IDE...
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson import TextToSpeechV1
import vlc
import time

authenticator = IAMAuthenticator("API KEY")
text_to_speech = TextToSpeechV1(
    authenticator=authenticator
)
text_to_speech.set_service_url(
    'URL HERE')

with open('hello_world.wav', 'wb') as audio_file:
    audio_file.write(
        text_to_speech.synthesize(
            'Hello world',
            voice='en-US_AllisonV3Voice',
            accept='audio/wav'
        ).get_result().content)

instance = vlc.Instance()
# Create a MediaPlayer with the default instance
player = instance.media_player_new()
# Load the media file
media = instance.media_new('hello_world.wav')
# Add the media to the player
player.set_media(media)
# Play for 10 seconds then exit
player.play()
time.sleep(10)
It plays the audio file generated by IBM from within the IDE, then sleeps for 10 seconds so the script stays alive while the audio plays.
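The fixed 10 second sleep is only a guess at the clip length. Below is a rough sketch of how the wait could instead poll the player until playback ends. The helper name wait_for_playback and its parameters are hypothetical, and the commented usage assumes python-vlc's player.get_state() together with the vlc.State.Ended and vlc.State.Error values, so treat it as an untested idea rather than a confirmed fix.

```python
import time

def wait_for_playback(get_state, done_states, timeout=30.0, poll=0.1):
    """Poll get_state() until it returns a value in done_states.

    Returns True if a done state was seen, False if the timeout elapsed first.
    get_state is any zero-argument callable (hypothetical helper, not part of vlc).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_state() in done_states:
            return True
        time.sleep(poll)
    return False

# Assumed usage with python-vlc, in place of time.sleep(10):
#   player.play()
#   wait_for_playback(player.get_state, (vlc.State.Ended, vlc.State.Error))
```

The timeout is there so the script cannot hang forever if the player never reports an end state.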
I think I need to do something like this...
with open('hello_world.wav', 'wb') as audio_file:
    audio_file.write(
        text_to_speech.synthesize(
And then have the voice input trigger that call with a different wav file name for each command. For example, if "" is said, run the IBM text to speech and play the resulting audio file.
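To make the idea concrete, here is a minimal sketch of the mapping I am picturing, not a finished design. The COMMANDS table, the wav file names, and the find_command and respond helpers are all hypothetical. The synthesize and play arguments are placeholders for functions that would wrap the IBM text_to_speech.synthesize call and the vlc player from the code above.

```python
# Hypothetical command table: spoken phrase -> (text to speak, wav file name).
COMMANDS = {
    "open youtube": ("Here you go to Youtube", "open_youtube.wav"),
    "open google": ("Here you go to Google", "open_google.wav"),
}

def find_command(query, commands=COMMANDS):
    """Return (reply_text, wav_name) for the first phrase found in query, else None."""
    query = query.lower()
    for phrase, response in commands.items():
        if phrase in query:
            return response
    return None

def respond(query, synthesize, play, commands=COMMANDS):
    """Look up the command, synthesize its reply into its own wav file, then play it.

    synthesize(text, wav_name) and play(wav_name) are supplied by the caller.
    Returns the wav file name used, or None if no command matched.
    """
    match = find_command(query, commands)
    if match is None:
        return None
    text, wav_name = match
    synthesize(text, wav_name)
    play(wav_name)
    return wav_name

# Assumed wiring, reusing the objects from my code above:
#   def synthesize(text, wav_name):
#       with open(wav_name, 'wb') as audio_file:
#           audio_file.write(text_to_speech.synthesize(
#               text, voice='en-US_AllisonV3Voice', accept='audio/wav'
#           ).get_result().content)
#   respond(takeCommand(), synthesize, play)
```

Keeping the phrases in a dictionary would mean a new command is one new entry instead of another elif branch.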
How can I build a system like the one above so that, when I speak a specific command, it generates and plays a different audio file for that command? All the code needed to reproduce this is above. Thank you to anyone who can help me structure this better.