Voice recognition


45 replies to this topic
DranDane
  • Members
  • 53 posts
  • Last active: Feb 04 2009 04:30 PM
  • Joined: 26 Jun 2007
Your software can recognize a key press, a key sequence, or a mouse movement. It would be nice to be able to bind actions to voice commands as well.

"write a note"::Run notepad

or
^!"write a note"::Run notepad

Other programs can already recognize voice input, but they don't offer all the functions that yours does.

I also invite you to look into the GlovePIE software. GlovePIE handles more input devices (speech, joystick, joypad, Wiimote, remote devices, etc.) but doesn't offer as many output actions as your software does. May I hope that a collaboration between the two projects is possible?

foom
  • Members
  • 386 posts
  • Last active: Jul 04 2007 04:53 PM
  • Joined: 19 Apr 2006
Speech recognition may be possible with COM.
I found an example of how it's done in Python.
from win32com.client import constants
import win32com.client
import pythoncom

"""Sample code for using the Microsoft Speech SDK 5.1 via COM in Python.
    Requires that the SDK be installed; it's a free download from
            http://microsoft.com/speech
    and that MakePy has been used on it (in PythonWin,
    select Tools | COM MakePy Utility | Microsoft Speech Object Library 5.1).

    After running this, then saying "One", "Two", "Three" or "Four" should
    display "You said One" etc on the console. The recognition can be a bit
    shaky at first until you've trained it (via the Speech entry in the Windows
    Control Panel)."""
class SpeechRecognition:
    """ Initialize the speech recognition with the passed in list of words """
    def __init__(self, wordsToAdd):
        # For text-to-speech
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        # For speech recognition - first create a listener
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        # Then a recognition context
        self.context = self.listener.CreateRecoContext()
        # which has an associated grammar
        self.grammar = self.context.CreateGrammar()
        # Do not allow free word recognition - only command and control
        # recognizing the words in the grammar only
        self.grammar.DictationSetState(0)
        # Create a new rule for the grammar, that is top level (so it begins
        # a recognition) and dynamic (ie we can change it at runtime)
        self.wordsRule = self.grammar.Rules.Add("wordsRule",
                        constants.SRATopLevel + constants.SRADynamic, 0)
        # Clear the rule (not necessary first time, but if we're changing it
        # dynamically then it's useful)
        self.wordsRule.Clear()
        # And go through the list of words, adding each to the rule
        for word in wordsToAdd:
            self.wordsRule.InitialState.AddWordTransition(None, word)
        # Set the wordsRule to be active
        self.grammar.Rules.Commit()
        self.grammar.CmdSetRuleState("wordsRule", 1)
        # Commit the changes to the grammar
        self.grammar.Rules.Commit()
        # And add an event handler that's called back when recognition occurs
        self.eventHandler = ContextEvents(self.context)
        # Announce we've started using speech synthesis
        self.say("Started successfully")
    """Speak a word or phrase"""
    def say(self, phrase):
        self.speaker.Speak(phrase)


"""The callback class that handles the events raised by the speech object.
    See "Automation | SpSharedRecoContext (Events)" in the MS Speech SDK
    online help for documentation of the other events supported. """
class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    """Called when a word/phrase is successfully recognized  -
        ie it is found in a currently open grammar with a sufficiently high
        confidence"""
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
        # Parenthesized single-argument form works on both Python 2 and 3
        print("You said: " + newResult.PhraseInfo.GetText())
    
if __name__=='__main__':
    wordsToAdd = [ "One", "Two", "Three", "Four" , "boom"]
    speechReco = SpeechRecognition(wordsToAdd)
    while 1:
        pythoncom.PumpWaitingMessages()

I didn't have time to investigate how to translate it to AHK (if that's possible at all). Maybe someone who has a good understanding of COM, knows a little Python, and has the time can investigate this and make it work. I'm busy exploring WinSock2 with AHK at the moment. :)
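
To tie this back to the original request, the recognized text could be mapped to actions with a simple lookup table. A minimal sketch, assuming a hypothetical `dispatch` helper and `COMMANDS` table (neither is part of the sample above or of SAPI); the handlers only return strings so that it runs without the Speech SDK:

```python
# Hypothetical phrase-to-action table; all names here are illustrative only.
def open_notepad():
    # A real handler might call subprocess.Popen(["notepad.exe"]) on Windows.
    return "run notepad"


def say_number(n):
    # Build a handler that reports which number was spoken.
    return lambda: "number %d" % n


COMMANDS = {
    "write a note": open_notepad,
    "one": say_number(1),
    "two": say_number(2),
}


def dispatch(phrase, commands=COMMANDS):
    """Look up a recognized phrase (case-insensitive) and run its action.

    Returns the action's result, or None when the phrase is not bound.
    """
    action = commands.get(phrase.strip().lower())
    return action() if action is not None else None
```

Inside `OnRecognition`, something like `dispatch(newResult.PhraseInfo.GetText())` could then take the place of the print call.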

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004
Although it's outside my experience, I would like to extend AutoHotkey's input capabilities to include more types of remote controls, etc. Also, the ability to simulate joystick input (e.g. pressing of buttons) would be great because it would allow a keyboard or mouse to be used as a complete or partial substitute for a joystick.

Thanks for the suggestion.

foom
  • Members
  • 386 posts
  • Last active: Jul 04 2007 04:53 PM
  • Joined: 19 Apr 2006
I should be able to put out a script within the next week. So keep an eye on this topic and the scripts and functions forum.

SoggyDog
  • Members
  • 803 posts
  • Last active: Mar 04 2013 06:27 AM
  • Joined: 02 May 2006

I should be able to put out a script within the next week. So keep an eye on this topic and the scripts and functions forum.

I'm all excited !!!

Can't wait !

foom
  • Members
  • 386 posts
  • Last active: Jul 04 2007 04:53 PM
  • Joined: 19 Apr 2006
OK, I wanted to let you know that I got it working, ehhrrm, a couple of days ago, but I've had no time to clean up the mess, so you will need to wait a couple more days. :)

SoggyDog
  • Members
  • 803 posts
  • Last active: Mar 04 2013 06:27 AM
  • Joined: 02 May 2006
Thanks for the update...
Still super-excited to see this!
:p

  • Guests
  • Last active:
  • Joined: --

Thanks for the update...
Still super-excited to see this!
:p


As am I, looking forward to your follow up post foom!

Andreas
  • Guests
  • Last active:
  • Joined: --
Just a note: Vista comes with built-in voice recognition. Works quite well 8)

SoggyDog
  • Members
  • 803 posts
  • Last active: Mar 04 2013 06:27 AM
  • Joined: 02 May 2006
How's it comin'?

I REALLY wanna play with this!

:idea: :!: :D :!: :idea:

superbem
  • Members
  • 21 posts
  • Last active: Dec 23 2010 09:00 PM
  • Joined: 30 Jul 2007
can't wait..

DranDane
  • Members
  • 53 posts
  • Last active: Feb 04 2009 04:30 PM
  • Joined: 26 Jun 2007

Just a note: Vista comes with built-in voice recognition. Works quite well 8)


And the voice recognition is easy to install on XP :wink:

More info here: http://clans.gameclu...t/downloads.php It's the website of Shoot 1.6.4, a voice recognition program I no longer use. The software is good but very limited.

Again, I invite you to look into the GlovePIE software. See my first message.

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007
Looks like foom is busy.
Anyway, here is a verbatim translation of the script foom linked.
I can't test it myself because no microphone is installed on my machine.
If you hear "Starting Succeeded", try saying one/two/three.

Requires CoHelper.ahk.

#Persistent
OnExit, CleanUp

CoInitialize()
pspeaker := ActiveXObject("SAPI.SpVoice")
plistener:= ActiveXObject("SAPI.SpSharedRecognizer")
pcontext := Invoke(plistener, "CreateRecoContext")
pgrammar := Invoke(pcontext, "CreateGrammar")
Invoke(pgrammar, "DictationSetState", 0)
prules := Invoke(pgrammar, "Rules")
prulec := Invoke(prules, "Add", "wordsRule", 0x1|0x20)
Invoke(prulec, "Clear")
pstate := Invoke(prulec, "InitialState")

; Add the words to be recognized here! Looks like it understands the null pointer.
Invoke(pstate, "AddWordTransition", "+" . 0, "One")
Invoke(pstate, "AddWordTransition", "+" . 0, "Two")
Invoke(pstate, "AddWordTransition", "+" . 0, "Three")
;;

Invoke(prules, "Commit")
Invoke(pgrammar, "CmdSetRuleState", "wordsRule", 1)
Invoke(prules, "Commit")
ConnectObject(pcontext, "On")

If (pspeaker && plistener && pcontext && pgrammar && prules && prulec && pstate)
	Invoke(pspeaker, "Speak", "Starting Succeeded")
Else	Invoke(pspeaker, "Speak", "Starting Failed")
Return

CleanUp:
Release(pstate)
Release(prulec)
Release(prules)
Release(pgrammar)
Release(pcontext)
Release(plistener)
Release(pspeaker)
CoUninitialize()
ExitApp


OnRecognition(prms, this)
{
	Global	pspeaker
	presult := DispGetParam(prms, 3, 9)
	pphrase := Invoke(presult, "PhraseInfo")
	Invoke(pspeaker, "Speak", "You said " . Invoke(pphrase, "GetText"))
	Release(pphrase)
}

#Include CoHelper.ahk
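
The `0x1|0x20` passed to `Add` in the script appears to be the same value as `constants.SRATopLevel + constants.SRADynamic` in the Python sample. A minimal sketch of that arithmetic, assuming the SAPI 5.1 `SpeechRuleAttributes` values are 1 for SRATopLevel and 32 for SRADynamic (an assumption worth checking against the SDK documentation; the thread itself only shows the hex literals). The names `SRA_TOP_LEVEL`, `SRA_DYNAMIC` and `rule_flags` are made up for illustration:

```python
# Assumed SpeechRuleAttributes values; verify against the SAPI 5.1 documentation.
SRA_TOP_LEVEL = 0x1   # rule can begin a recognition
SRA_DYNAMIC = 0x20    # rule can be changed at runtime


def rule_flags(*flags):
    """Combine attribute flags with bitwise OR."""
    combined = 0
    for flag in flags:
        combined |= flag
    return combined


flags = rule_flags(SRA_TOP_LEVEL, SRA_DYNAMIC)
print(hex(flags))  # 0x21
```

Because the two flags occupy different bits, OR-ing them and adding them give the same result, which would explain why the Python version can use `+` while the AHK version uses `|`.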


DranDane
  • Members
  • 53 posts
  • Last active: Feb 04 2009 04:30 PM
  • Joined: 26 Jun 2007
Sean,

The script starts and I hear "Starting Succeeded". When I say "One" I hear "You said one", and about one second later I get an error message. The script crashes and closes.

But the most important... the voice recognition works :-D. Thank you.

Event Viewer (eventvwr) message:

Event Type:	Error
Event Source:	Application Error
Event Category:	None
Event ID:	1000
Date:		11/08/2007
Time:		14:21:46
User:		N/A
Computer:	WS-P5GL-MX-300
Description:
Faulting application autohotkey.exe, version 1.0.47.3, faulting module sapi.dll, version 5.1.4111.0, fault address 0x0000c969.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 41 70 70 6c 69 63 61 74   Applicat
0008: 69 6f 6e 20 46 61 69 6c   ion Fail
0010: 75 72 65 20 20 61 75 74   ure  aut
0018: 6f 68 6f 74 6b 65 79 2e   ohotkey.
0020: 65 78 65 20 31 2e 30 2e   exe 1.0.
0028: 34 37 2e 33 20 69 6e 20   47.3 in 
0030: 73 61 70 69 2e 64 6c 6c   sapi.dll
0038: 20 35 2e 31 2e 34 31 31    5.1.411
0040: 31 2e 30 20 61 74 20 6f   1.0 at o
0048: 66 66 73 65 74 20 30 30   ffset 00
0050: 30 30 63 39 36 39 0d 0a   00c969..
:lol:
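
The Data section of the event is just the description text stored as raw bytes, so it carries no extra detail about the crash. A minimal sketch of decoding it, assuming a hypothetical `decode_event_data` helper and the offset / hex / ASCII column layout shown above, with eight bytes per row:

```python
def decode_event_data(dump):
    """Decode an Event Viewer style hex dump ("0000: 41 70 ...   Applicat").

    Assumes each row holds exactly eight hex bytes after the offset,
    followed by an ASCII preview column that is ignored.
    """
    raw = bytearray()
    for line in dump.splitlines():
        if ":" not in line:
            continue
        # Drop the offset before the colon, then keep the first eight tokens.
        _, _, rest = line.partition(":")
        for token in rest.split()[:8]:
            raw.append(int(token, 16))
    return raw.decode("ascii")


sample = """0000: 41 70 70 6c 69 63 61 74   Applicat
0008: 69 6f 6e 20 46 61 69 6c   ion Fail"""
print(decode_event_data(sample))  # Application Fail
```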

Sean
  • Members
  • 2462 posts
  • Last active: Feb 07 2012 04:00 AM
  • Joined: 12 Feb 2007

The script starts and I hear "Starting Succeeded". When I say "One" I hear "You said one", and about one second later I get an error message. The script crashes and closes.

I always feel helpless in a situation like this, where I can't test it myself...
BTW, the address in the Event Viewer message you posted looks somewhat odd to me.
Anyway, the only thing I can think of at the moment that could go wrong is in the following function, Rec_Recognition, so I updated it.

Please copy the script again, try it, and let me know the result.