Your software can recognize a key press, a key sequence, or a mouse movement. It would be nice to be able to bind actions to voice commands as well.
"write a note"::Run notepad
or
^!"write a note"::Run notepad
Other programs can already recognize voice, but they don't offer all the functions yours does.
I also invite you to look into the GlovePIE software. GlovePIE handles more input devices, such as speech, joystick, joypad, Wiimote, remote devices, etc., but it doesn't offer as many output actions as your software does. May I hope that a collaboration between the two projects is possible?
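As a rough illustration of the request above, binding a spoken phrase to an action amounts to a lookup table from phrases to callbacks. The sketch below is in Python rather than AutoHotkey, and every name in it (`VoiceBindings`, `bind`, `dispatch`) is hypothetical. It assumes some recognizer already delivers the recognized phrase as text, and it uses a stand-in string where a real action would launch a program.

```python
from typing import Callable, Dict, Optional


class VoiceBindings:
    """Hypothetical phrase-to-action table; not part of AutoHotkey or GlovePIE."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[[], str]] = {}

    @staticmethod
    def _normalize(phrase: str) -> str:
        # Compare phrases case-insensitively and ignore extra whitespace.
        return " ".join(phrase.lower().split())

    def bind(self, phrase: str, action: Callable[[], str]) -> None:
        """Register the action to run when the phrase is recognized."""
        self._actions[self._normalize(phrase)] = action

    def dispatch(self, recognized: str) -> Optional[str]:
        """Run the action bound to the recognized phrase, or return None."""
        action = self._actions.get(self._normalize(recognized))
        return action() if action is not None else None


bindings = VoiceBindings()
# Stand-in for `Run notepad`; a real action would start the program instead.
bindings.bind("write a note", lambda: "run notepad")
```

Calling `bindings.dispatch(...)` with whatever text the recognizer reports would then play the role of the voice hotkey; unknown phrases are simply ignored.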
Voice recognition
Started by DranDane, Jun 26 2007 10:18 AM
45 replies to this topic
#1
-
Posted 26 June 2007 - 10:18 AM
Speech Recognition may be possible with COM.
I found an example of how it's done in Python.
from win32com.client import constants
import win32com.client
import pythoncom

"""Sample code for using the Microsoft Speech SDK 5.1 via COM in Python.
    Requires that the SDK be installed; it's a free download from
            http://microsoft.com/speech
    and that MakePy has been used on it (in PythonWin,
    select Tools | COM MakePy Utility | Microsoft Speech Object Library 5.1).

    After running this, then saying "One", "Two", "Three" or "Four" should
    display "You said One" etc on the console. The recognition can be a bit
    shaky at first until you've trained it (via the Speech entry in the Windows
    Control Panel."""
class SpeechRecognition:
    """ Initialize the speech recognition with the passed in list of words """
    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (ie we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add("wordsRule",
                        constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        [ self.wordsRule.InitialState.AddWordTransition(None, word) for word in wordsToAdd ]
        # Set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started using speech synthesis
        self.say("Started successfully")

    """Speak a word or phrase"""
    def say(self, phrase):
        self.speaker.Speak(phrase)


"""The callback class that handles the events raised by the speech object.
    See "Automation | SpSharedRecoContext (Events)" in the MS Speech SDK
    online help for documentation of the other events supported. """
class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    """Called when a word/phrase is successfully recognized - ie it is
        found in a currently open grammar with a sufficiently high
        confidence"""
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
        print "You said: ",newResult.PhraseInfo.GetText()

if __name__=='__main__':
    wordsToAdd = [ "One", "Two", "Three", "Four" , "boom"]
    speechReco = SpeechRecognition(wordsToAdd)
    while 1:
        pythoncom.PumpWaitingMessages()
I haven't had time to investigate how to translate it to AHK (if that is possible at all). Maybe someone else who has a good understanding of COM, knows a little Python, and has the time can investigate this and make it work, because I am busy exploring WinSock2 with AHK at the moment.
#2
-
Posted 26 June 2007 - 12:24 PM
Although it's outside my experience, I would like to extend AutoHotkey's input capabilities to include more types of remote controls, etc. Also, the ability to simulate joystick input (e.g. pressing of buttons) would be great because it would allow a keyboard or mouse to be used as a complete or partial substitute for a joystick.
Thanks for the suggestion.
#3
-
Posted 27 June 2007 - 10:53 AM
I should be able to put out a script within the next week. So keep an eye on this topic and the scripts and functions forum.
#4
-
Posted 28 June 2007 - 01:27 PM
I'm all excited!!!
"I should be able to put out a script within the next week. So keep an eye on this topic and the scripts and functions forum."
Can't wait!
#5
-
Posted 28 June 2007 - 06:39 PM
OK, I wanted to let you know that I got it working a couple of days ago, but I've had no time to clean up the mess, so you will need to wait a couple more days.
#6
-
Posted 04 July 2007 - 05:14 PM
Thanks for the update...
Still super-excited to see this!
#7
-
Posted 05 July 2007 - 03:49 PM
"Thanks for the update... Still super-excited to see this!"
As am I. Looking forward to your follow-up post, foom!
#8
-
Posted 05 July 2007 - 09:46 PM
Just a note: Vista comes with built-in voice recognition. Works quite well 8)
#9
-
Posted 25 July 2007 - 12:09 AM
How's it comin'?
I REALLY wanna play with this!
:idea: :!: :!: :idea:
#10
-
Posted 26 July 2007 - 05:18 PM
"Just a note: Vista comes with built-in voice recognition. Works quite well 8)"
And the voice recognition is easy to install on XP :wink:
More info here: http://clans.gameclu...t/downloads.php It's the website of Shoot 1.6.4, a voice recognition program I no longer use. The software is good but very limited.
Again, I invite you to look into the GlovePIE software. See my first message.
#12
-
Posted 09 August 2007 - 02:37 PM
Looks like foom is busy.
Anyway, here is the verbatim translation of the script foom linked.
I can't test it myself, as no microphone is installed on my machine.
If you hear "Starting Succeeded", try saying one/two/three.
Requires CoHelper.ahk.
#Persistent
OnExit, CleanUp

CoInitialize()

pspeaker := ActiveXObject("SAPI.SpVoice")
plistener:= ActiveXObject("SAPI.SpSharedRecognizer")
pcontext := Invoke(plistener, "CreateRecoContext")
pgrammar := Invoke(pcontext, "CreateGrammar")
Invoke(pgrammar, "DictationSetState", 0)
prules := Invoke(pgrammar, "Rules")
prulec := Invoke(prules, "Add", "wordsRule", 0x1|0x20)
Invoke(prulec, "Clear")
pstate := Invoke(prulec, "InitialState")

; Add here the words to be recognized! Looks like it understands the null pointer.
Invoke(pstate, "AddWordTransition", "+" . 0, "One")
Invoke(pstate, "AddWordTransition", "+" . 0, "Two")
Invoke(pstate, "AddWordTransition", "+" . 0, "Three")
;;

Invoke(prules, "Commit")
Invoke(pgrammar, "CmdSetRuleState", "wordsRule", 1)
Invoke(prules, "Commit")
ConnectObject(pcontext, "On")

If (pspeaker && plistener && pcontext && pgrammar && prules && prulec && pstate)
   Invoke(pspeaker, "Speak", "Starting Succeeded")
Else
   Invoke(pspeaker, "Speak", "Starting Failed")
Return

CleanUp:
Release(pstate)
Release(prulec)
Release(prules)
Release(pgrammar)
Release(pcontext)
Release(plistener)
Release(pspeaker)
CoUninitialize()
ExitApp

OnRecognition(prms, this)
{
   Global pspeaker
   presult := DispGetParam(prms, 3, 9)
   pphrase := Invoke(presult, "PhraseInfo")
   Invoke(pspeaker, "Speak", "You said " . Invoke(pphrase, "GetText"))
   Release(pphrase)
}

#Include CoHelper.ahk
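A side note on the `0x1|0x20` argument passed to `Add` in the script above: it appears to stand for the same rule attributes that the Python sample writes as `constants.SRATopLevel + constants.SRADynamic`. The sketch below only illustrates that bit arithmetic. The numeric values are inferred by comparing the two scripts, so treat them as an assumption and check them against the Speech SDK help; the class name `RuleAttributes` is made up for this example.

```python
from enum import IntFlag


class RuleAttributes(IntFlag):
    """Illustrative subset of rule attributes; values assumed from the AHK script."""
    SRATopLevel = 0x1   # rule can start a recognition
    SRADynamic = 0x20   # rule can be changed at runtime


# Bitwise OR, as in the AutoHotkey translation.
rule_flags = RuleAttributes.SRATopLevel | RuleAttributes.SRADynamic

# Addition, as in the Python sample. It gives the same number only because
# the two flags do not share any bits.
rule_sum = int(RuleAttributes.SRATopLevel) + int(RuleAttributes.SRADynamic)

print(hex(rule_flags), hex(rule_sum))
```

Bitwise OR is the safer habit of the two, since adding a flag that is already set would silently produce a different value.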
#13
-
Posted 10 August 2007 - 05:14 AM
Sean,
The script starts and I hear "Starting Succeeded". When I say "One" I hear "You said one", and one second later I get an error message. The script crashes and closes.
But most importantly... the voice recognition works :-D. Thank you.
eventvwr message:
Event Type: Error
Event Source: Application Error
Event Category: None
Event ID: 1000
Date: 11/08/2007
Time: 14:21:46
User: N/A
Computer: WS-P5GL-MX-300
Description:
Faulting application autohotkey.exe, version 1.0.47.3, faulting module sapi.dll, version 5.1.4111.0, fault address 0x0000c969.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 41 70 70 6c 69 63 61 74   Applicat
0008: 69 6f 6e 20 46 61 69 6c   ion Fail
0010: 75 72 65 20 20 61 75 74   ure  aut
0018: 6f 68 6f 74 6b 65 79 2e   ohotkey.
0020: 65 78 65 20 31 2e 30 2e   exe 1.0.
0028: 34 37 2e 33 20 69 6e 20   47.3 in
0030: 73 61 70 69 2e 64 6c 6c   sapi.dll
0038: 20 35 2e 31 2e 34 31 31    5.1.411
0040: 31 2e 30 20 61 74 20 6f   1.0 at o
0048: 66 66 73 65 74 20 30 30   ffset 00
0050: 30 30 63 39 36 39 0d 0a   00c969..
:lol:
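For anyone reading the Data section of that event: the hex bytes are just ASCII text, and decoding them shows that they repeat the faulting application, module, and offset from the Description. A minimal Python sketch, with the byte rows copied from the log above (the variable names are only for illustration):

```python
# The eight-byte rows from the Data section of the event log entry.
rows = [
    "41 70 70 6c 69 63 61 74",
    "69 6f 6e 20 46 61 69 6c",
    "75 72 65 20 20 61 75 74",
    "6f 68 6f 74 6b 65 79 2e",
    "65 78 65 20 31 2e 30 2e",
    "34 37 2e 33 20 69 6e 20",
    "73 61 70 69 2e 64 6c 6c",
    "20 35 2e 31 2e 34 31 31",
    "31 2e 30 20 61 74 20 6f",
    "66 66 73 65 74 20 30 30",
    "30 30 63 39 36 39 0d 0a",
]

# bytes.fromhex ignores the spaces between byte pairs.
message = bytes.fromhex(" ".join(rows)).decode("ascii")

# The message ends with a carriage return and line feed (0d 0a).
print(message.rstrip())
```

So the dump carries no information beyond the Description line; the offset 0000c969 inside sapi.dll is the only pointer to where the crash occurred.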
#14
-
Posted 11 August 2007 - 12:30 PM
I always feel helpless when facing a situation like this, where I can't test it myself...
"The script starts and I hear 'Starting Succeeded'. When I say 'One' I hear 'You said one', and one second later I get an error message. The script crashes and closes."
BTW, the address in the eventvwr message you posted looks somewhat odd to me.
Anyway, the only thing I can think of at the moment that could go wrong is in the following function Rec_Recognition, so I updated it.
Please recopy the script, try it again, and let me know the result.
#15
-
Posted 11 August 2007 - 02:01 PM