AutoHotkey Community

 Post subject: Voice Recognition COM
PostPosted: December 21st, 2007, 2:44 am 
Joined: February 12th, 2007, 7:54 am
Posts: 2462
I rewrote the script in the following topic to use the COM Standard Library, which is recommended over the CoHelper version:
http://www.autohotkey.com/forum/viewtopic.php?t=20493

Code:
#Persistent
OnExit, CleanUp

COM_Init()
; Attach to the shared Windows speech recognizer.
plistener := COM_CreateObject("SAPI.SpSharedRecognizer")
; paudioin is never assigned in this script, so "+0" is what gets passed.
COM_Invoke(plistener, "AudioInput", paudioin ? "+" . paudioin : "+0")
pcontext := COM_Invoke(plistener, "CreateRecoContext")
pgrammar := COM_Invoke(pcontext , "CreateGrammar")
; Turn dictation off so that only the words added below are matched.
COM_Invoke(pgrammar, "DictationSetState", 0)
prules := COM_Invoke(pgrammar, "Rules")
; 0x1|0x20 = top-level, dynamic rule.
prulec := COM_Invoke(prules, "Add", "wordsRule", 0x1|0x20)
COM_Invoke(prulec, "Clear")
pstate := COM_Invoke(prulec, "InitialState")

; Add here the words to be recognized!
COM_Invoke(pstate, "AddWordTransition", "+" . 0, "One")
COM_Invoke(pstate, "AddWordTransition", "+" . 0, "Two")
COM_Invoke(pstate, "AddWordTransition", "+" . 0, "Three")
;;

COM_Invoke(prules, "Commit")
; Activate the rule (1 = active), then commit again.
COM_Invoke(pgrammar, "CmdSetRuleState", "wordsRule", 1)
COM_Invoke(prules, "Commit")
; Route the context's events to script functions whose names start with "On".
pevent := COM_ConnectObject(pcontext, "On")
Return

CleanUp:
COM_Release(pevent)
COM_Release(pstate)
COM_Release(prulec)
COM_Release(prules)
COM_Release(pgrammar)
COM_Release(pcontext)
COM_Release(plistener)
COM_Term()
ExitApp

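; Called by the COM library when the recognizer reports a match.
OnRecognition(prms, this)
{
   ; Event argument at index 3, read as type 9 (VT_DISPATCH): the result object.
   presult := COM_DispGetParam(prms, 3, 9)
   pphrase := COM_Invoke(presult, "PhraseInfo")
   sText   := COM_Invoke(pphrase, "GetText")   ; the recognized word(s)
   COM_Release(pphrase)
;   Add custom operations from here!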
}
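
One way to fill in the "custom operations" placeholder is to hand sText to a small dispatcher. This is only a minimal sketch, assuming the three words registered above; HandleRecognizedWord is a hypothetical helper, not part of the COM library, and the actions are arbitrary examples:

```autohotkey
; Hypothetical helper: call it from OnRecognition() as HandleRecognizedWord(sText).
HandleRecognizedWord(sText)
{
    ; "=" compares case-insensitively in AutoHotkey v1 by default.
    if (sText = "One")
        Run, notepad.exe
    else if (sText = "Two")
        MsgBox, You said "Two".
    else if (sText = "Three")
        SoundBeep, 750, 300
    else
        return false   ; not one of the registered words
    return true
}
```

Keeping the actions in their own function leaves OnRecognition short and lets you test the word handling without a microphone.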


Last edited by Sean on November 22nd, 2008, 1:38 am, edited 1 time in total.

 Post subject:
PostPosted: March 12th, 2008, 9:43 pm 
Joined: March 12th, 2008, 8:45 pm
Posts: 62
Location: OR
OK, I'm having a problem with your script and I have no idea how to fix it. Here's the error:

Error: Call to nonexistent function.
Specifically: COM_Init()
(points to the line in code)
The program will exit.

How can I fix this? :?:


 Post subject:
PostPosted: March 12th, 2008, 9:49 pm 
Joined: April 22nd, 2007, 6:33 pm
Posts: 1833
Quote:
I rewrote the script in the following topic to use the COM Standard Library
:?:


 Post subject:
PostPosted: March 12th, 2008, 10:24 pm 
Joined: March 12th, 2008, 8:45 pm
Posts: 62
Location: OR
I have that, but it still doesn't work.


 Post subject:
PostPosted: March 14th, 2008, 7:48 am 
Read about Standard Library:
http://www.autohotkey.com/docs/Functions.htm#lib
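
In practice, the "Call to nonexistent function" error usually means AutoHotkey could not find the library file. Assuming the file is named COM.ahk (the library mechanism takes the file name from the prefix before the underscore in COM_Init), this sketch checks the two Lib folders listed in the v1 documentation. The variable names are only illustrative:

```autohotkey
; Check whether COM.ahk sits where the library auto-include looks for it.
SplitPath, A_AhkPath,, ahkDir                         ; folder containing AutoHotkey.exe
userLib := A_MyDocuments . "\AutoHotkey\Lib\COM.ahk"  ; user library
stdLib  := ahkDir . "\Lib\COM.ahk"                    ; standard library

if (FileExist(userLib))
    MsgBox, Found user library copy:`n%userLib%
else if (FileExist(stdLib))
    MsgBox, Found standard library copy:`n%stdLib%
else
    MsgBox, COM.ahk was not found in either Lib folder.
```

Alternatively, putting `#Include COM.ahk` at the top of the script should load a copy saved next to the script itself.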


 Post subject:
PostPosted: March 14th, 2008, 11:01 pm 
Joined: March 12th, 2008, 8:45 pm
Posts: 62
Location: OR
OK, I got it to kind of work, but now it says:

No Event Interface Exists! Now exit the application.

So how do I fix that now?


 Post subject:
PostPosted: March 15th, 2008, 12:46 am 
Read the original thread linked at the top. You need to install Speech SDK 5.1.


 Post subject:
PostPosted: March 17th, 2008, 5:25 pm 
Joined: June 26th, 2007, 10:52 am
Posts: 53
Thank you for this sample. It's good to see it's possible to use ahk with speech recognition.

BUT I'm still thinking that speech recognition should be added in the core C++ code of ahk and be implemented like hotstrings.


Code:
"What time is it"::
  ;gives the time
return


 Post subject:
PostPosted: March 17th, 2008, 10:21 pm 
Joined: October 17th, 2007, 10:57 pm
Posts: 17
Anonymous wrote:
Read the original thread linked at the top. You need to install Speech SDK 5.1.


Which parts of the SDK are needed?


PostPosted: March 30th, 2008, 7:11 pm 
Joined: December 29th, 2007, 9:40 pm
Posts: 142
Sean wrote:
I rewrote the script in the following topic to use the COM Standard Library, which is recommended over the CoHelper version:
http://www.autohotkey.com/forum/viewtopic.php?t=20493
^^^Nice Tool!^^^

Hey Sean.

Thanks for the script. I have been able to get it running and working standalone reasonably well. I occasionally get the no-interface dialog, but I suspect that is because I am being too impatient (opening a second instance of the script while the first one is still releasing its resources, etc.) and confusing the underlying MS framework.

At any rate, I wanted to take a second to ask you if you'd (or someone else who is also knowledgeable) be willing to take a minute and explain the script's functionality a bit (as a teaching/learning exercise, as I am still absorbing the various concepts...). I'd like to evaluate incorporating the speech recognition into my project, but not understanding how it is actually able to do what it is doing causes me some inability to properly evaluate it.

So, I'll pose my questions:

From the manual, I understand the following:

The Manual wrote:
A script that is not persistent and that lacks hotkeys, hotstrings, OnMessage, and GUI will terminate after the auto-execute section has completed. Otherwise, it will stay running in an idle state, responding to events such as hotkeys, hotstrings, GUI events, custom menu items, and timers.


So, with the #persistent after including the COM library, the script executes up to the evaluation/assignment of pevent, and then goes idle after same, waiting for events to occur.
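
That reading matches the manual. As a stand-alone illustration, here is a sketch in which a timer stands in for the speech event; the Tick label and the tickCount variable are made up for this example and have nothing to do with the COM library:

```autohotkey
#Persistent                 ; keep running after the auto-execute section ends
tickCount := 0
SetTimer, Tick, 1000        ; an event source, standing in for the COM event
Return                      ; auto-execute section ends here; the script idles

Tick:
tickCount += 1
ToolTip, Idle script woke up %tickCount% time(s)
if (tickCount >= 5)
    ExitApp
Return
```

Without #Persistent the script would exit at the first Return, before any event had a chance to arrive.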

This is what I am struggling with: How the hell is the script triggered to cause the OnRecognition function to kick off? i.e. what faculties of AHK are you employing to cause this functionality to exist. An explanation and/or a link to the manual would be greatly appreciated.

I suspect that it has to do with COM and COM events, but I am not able to glean how this is actually implemented from your code. I suspect that it is probably a concept that is easily grasped once 'verbalized'.
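
The handler name is the likely clue here: COM_ConnectObject(pcontext, "On") is given the prefix "On", and SAPI's recognition context raises an event named Recognition, so the library presumably joins the two and calls the script function with that name. The sketch below only imitates such a name-based lookup with AutoHotkey's dynamic function calls (IsFunc needs v1.0.48 or later). FireEvent and the placeholder arguments are hypothetical, and the real event sink inside COM.ahk may work differently:

```autohotkey
; Pretend the recognizer just raised two of its events.
FireEvent("On", "Recognition")
FireEvent("On", "Hypothesis")    ; no OnHypothesis() is defined, so nothing happens
Return

; Hypothetical stand-in for the event sink created by COM_ConnectObject.
FireEvent(prefix, eventName)
{
    handler := prefix . eventName      ; "On" . "Recognition" -> "OnRecognition"
    if (IsFunc(handler))               ; only call handlers the script defines
        result := %handler%("fake event arguments", 0)
}

OnRecognition(prms, sink)
{
    MsgBox, OnRecognition was called with: %prms%
}
```

Under that assumption, renaming the function or changing the prefix passed to COM_ConnectObject would break the link between the event and the handler.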

A 2nd question related to the Speech SDK: I suspect that the speech SDK facilities that are being employed are indeed NOT speaker independent - i.e. that some training would have to have taken place previously, so that the SDK's tool set has a frame of reference for the user. Is this indeed the case, or is it using generic enough speech recognition patterns to allow for user-independent voice recognition?

Thanks so much. Your time in reviewing my questions, considering them, and replying is greatly appreciated.

Have a great day.

-t

_________________
When replying, please feel free to address me as Tod. My AHK.net site...


 Post subject:
PostPosted: March 30th, 2008, 7:21 pm 
@ TodWulff
Just to let you know ... don't be angry if a German won't prefer to call you [Tod]


 Post subject:
PostPosted: March 30th, 2008, 7:24 pm 
Joined: December 29th, 2007, 9:40 pm
Posts: 142
BoBo¨ wrote:
@ TodWulff
Just to let you know ... don't be angry if a German won't prefer to call you [Tod]

OK, I am obviously a bit ignorant here, admittedly, so please do me the flavor of bringing me up to speed so as to remove any egg that I might have on my face... TIA. Anxiously anticipating an explanation... :)

[EDIT] Didn't notice the link. Sorry. Yikes. Fully understood. Thanks! [/EDIT]

_________________
When replying, please feel free to address me as Tod. My AHK.net site...


 Post subject:
PostPosted: April 24th, 2008, 6:47 am 
Joined: April 24th, 2008, 6:45 am
Posts: 1
Bravo on the conversion. But may I ask, what are some of the optional parameters I can pass in, to tweak and adjust the way things work? Any flexibility here.... does anybody know? :)


 Post subject:
PostPosted: April 27th, 2008, 5:06 am 
Joined: March 27th, 2008, 2:14 pm
Posts: 700
maximina wrote:
Which parts of the SDK are needed?

That download page says this:
Microsoft Website wrote:
Important File Download Details

* If you want to download sample code, documentation, SAPI, and the U.S. English Speech engines for development purposes, download the Speech SDK 5.1 file (SpeechSDK51.exe).

* If you want to use the Japanese and Simplified Chinese engines for development purposes, download the Speech SDK 5.1 Language Pack file (SpeechSDK51LangPack.exe) in addition to the Speech SDK 5.1 file.

* If you want to redistribute the Speech API and/or the Speech engines to integrate and ship as a part of your product (e.g. AHK) **Hint**Hint** ;), download the Speech 5.1 SDK Redistributables file (SpeechSDK51MSM.exe).

* If you want to get only the Mike and Mary voices redistributable for Windows XP, download Mike and Mary redistributables (Sp5TTIntXP.exe).

* If you only want the documentation, download the Documentation file (sapi.chm).
(2maximina: So you'd probably want to just go with the first one: "SpeechSDK51.exe")

Hey guys, look at that - Microsoft's dropping hints for AHK. And wink smileys too, interesting... :D:D **LooksAway&Whistles**

_________________
Scripts - License


 Post subject:
PostPosted: April 30th, 2008, 6:31 am 
Joined: March 27th, 2008, 2:14 pm
Posts: 700
This is really cool, thnx. :D I'll have to get a headset or something so it can understand me better. :P

Anyway, I have some questions about this. Any answers greatly appreciated. :)

1.__________
As with TodWulff, I don't understand how the "OnRecognition()" function is called. I tried looking through the COM lib, but it references itself so much that my head was spinning by the 3rd func. :P (Bit manipulation is a bit over my head anyway.)
It seems to work similarly to "OnMessage()". Does COM set that up, or is OnRecognition a built-in function in its own right?
I have a simplified idea of how it works; please correct me if I'm wrong: (I probably am, so that means "please correct me", I guess ;))
Quote:
You pass the word you want to recognize to COM along with some other info, COM tells the SDK which word to look for, and dynamically sets up an OnMessage based on that. Then when the SDK recognizes a word, it sends a message to the script telling it that the word was recognized; COM receives that message and sends it to OnRecognition, which uses COM again to turn it back into the text you get from the var "sText".
Is that even close? Even if it is, I still don't understand how OnRecognition is defined as the function to go to when a word is found, unless that's just part of COM.

2.__________
How would I stop words from being recognized? Can I do it individually, or does it have to be done all at once? If all at once, is it "COM_Release(pstate)"? (I ask because you use "pstate" in the COM_Invoke for recognizing a word, and that's in the OnExit subroutine.)

3.__________
Is there a way to tell it to receive all words the SDK recognizes? I know it would be unreliable, so I would only use it momentarily, but I would like to know. (Does it have to do with the "+" . 0 parameter passed to COM_Invoke?)
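
For questions 2 and 3, the script at the top of the thread already hints at the relevant calls: it empties the rule with Clear, switches it on with CmdSetRuleState, and switches dictation off with DictationSetState. Below is a rough, untested sketch built from those calls. It assumes pgrammar, prules and prulec are still valid (the script is running and CleanUp has not run), that 0 and 1 are SAPI's inactive and active states, and that DictationLoad can be called with its defaults; that last call comes from SAPI's automation interface, not from this thread. The function names are made up:

```autohotkey
; Requires COM.ahk and the pgrammar/prules/prulec pointers from the first post.

; Stop recognizing the command words without exiting the script.
StopWords(pgrammar)
{
    COM_Invoke(pgrammar, "CmdSetRuleState", "wordsRule", 0)   ; 0 = inactive
}

; Remove every word from the rule at once, then commit the change.
ClearWords(prules, prulec)
{
    COM_Invoke(prulec, "Clear")
    COM_Invoke(prules, "Commit")
}

; Let the engine report anything it hears (free dictation).
StartDictation(pgrammar)
{
    COM_Invoke(pgrammar, "DictationLoad")          ; assumption: default topic
    COM_Invoke(pgrammar, "DictationSetState", 1)   ; 1 = active
}
```

COM_Release(pstate) on its own would only give up the script's reference to the state object; it is not expected to remove any words. As far as I can tell there is no call to remove a single word, so the usual route would be to clear the rule and add back the words you still want.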

Thanks again for any help. :D Voice recognition is really cool, and I would love to understand it more so I can meet its potential better. 8)

_________________
Scripts - License

