UWP continuous speech recognition API. Win10

Post your working scripts, libraries and tools for AHK v1.1 and older
malcev

17 Nov 2021, 19:59

https://docs.microsoft.com/en-us/uwp/api/windows.media.speechrecognition
Only 8 languages are supported:
https://en.wikipedia.org/wiki/Cortana
You need a language with its speech recognition pack installed, an internet connection, and the privacy setting for online speech recognition enabled.
F11 starts recognition, F12 stops it.
There is also a small memory leak; to work around it, uncomment the DllCall("psapi.dll\EmptyWorkingSet", "ptr", -1) line in the script.
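
The privacy requirement above can be checked before starting recognition. This is only a rough sketch: the function name IsOnlineSpeechEnabled is made up for illustration, and the registry location is an assumption about where Windows 10 keeps the "Online speech recognition" toggle, not a documented API.

```autohotkey
; Hypothetical helper, not part of the script in this post.
; Assumption: the "Online speech recognition" toggle is stored per user in
; the registry value below (1 = on). If the value is missing, RegRead sets
; ErrorLevel and the function reports the setting as off.
IsOnlineSpeechEnabled()
{
   RegRead, accepted, HKEY_CURRENT_USER\Software\Microsoft\Speech_OneCore\Settings\OnlineSpeechPrivacy, HasAccepted
   return (ErrorLevel = 0 && accepted = 1)
}
```

If it returns 0, the script could show a message and open the settings page with Run, ms-settings:privacy-speech instead of waiting for StartAsync to fail with 0x80045509.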

Code:

lang := "en-us"
setbatchlines -1
#singleinstance, force
return

f11::
if RecordingStarted
   return
RecordingStarted := !RecordingStarted
global text := ""
if !init
{
   CreateClass("Windows.Globalization.Language", ILanguageFactory := "{9B0252AC-0C27-44F8-B792-9793FB66C63E}", LanguageFactory)
   CreateHString(lang, hString)
   DllCall(NumGet(NumGet(LanguageFactory+0)+6*A_PtrSize), "ptr", LanguageFactory, "ptr", hString, "ptr*", Language)   ; CreateLanguage
   DeleteHString(hString)
   CreateClass("Windows.Media.SpeechRecognition.SpeechRecognizer", ISpeechRecognizerFactory := "{60C488DD-7FB8-4033-AC70-D046F64818E1}", SpeechRecognizerFactory)
   CreateClass("Windows.Media.SpeechRecognition.SpeechRecognitionTopicConstraint", ISpeechRecognitionTopicConstraintFactory := "{6E6863DF-EC05-47D7-A5DF-56A3431E58D2}", SpeechRecognitionTopicConstraintFactory)
   CreateHString("Dictation", hString)
   DllCall(NumGet(NumGet(SpeechRecognitionTopicConstraintFactory+0)+6*A_PtrSize), "ptr", SpeechRecognitionTopicConstraintFactory, "int", Dictation := 1, "ptr", hString, "ptr*", SpeechRecognitionTopicConstraint)   ; SpeechRecognitionTopicConstraintFactory.Create
   DeleteHString(hString)
   SpeechRecognitionConstraint := ComObjQuery(SpeechRecognitionTopicConstraint, ISpeechRecognitionConstraint := "{79AC1628-4D68-43C4-8911-40DC4101B55B}")
   DllCall(NumGet(NumGet(SpeechRecognitionConstraint+0)+7*A_PtrSize), "ptr", SpeechRecognitionConstraint, "int", 1)   ; SpeechRecognitionConstraint.put_IsEnabled
   init := 1
}
hr := DllCall(NumGet(NumGet(SpeechRecognizerFactory+0)+6*A_PtrSize), "ptr", SpeechRecognizerFactory, "ptr", Language, "ptr*", SpeechRecognizer, "uint")   ; SpeechRecognizerFactory.Create
if (hr != 0)
{
   if (hr = 0x800455BC)
      msgbox Specified language is not supported
   else
      msgbox SpeechRecognizerFactory.Create error %hr%
   exitapp
}
DllCall(NumGet(NumGet(SpeechRecognizer+0)+7*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognitionConstraints)   ; SpeechRecognizer.get_Constraints
DllCall(NumGet(NumGet(SpeechRecognitionConstraints+0)+13*A_PtrSize), "ptr", SpeechRecognitionConstraints, "ptr", SpeechRecognitionConstraint)   ; IVector.Append(T)
DllCall(NumGet(NumGet(SpeechRecognizer+0)+8*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognizerTimeouts)   ; SpeechRecognizer.get_Timeouts
DllCall(NumGet(NumGet(SpeechRecognizerTimeouts+0)+7*A_PtrSize), "ptr", SpeechRecognizerTimeouts, "int64", 0)   ; SpeechRecognizerTimeouts.put_InitialSilenceTimeout
DllCall(NumGet(NumGet(SpeechRecognizer+0)+10*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognitionCompilationResult)   ; SpeechRecognizer.CompileConstraintsAsync
WaitForAsync(SpeechRecognitionCompilationResult)
DllCall(NumGet(NumGet(SpeechRecognitionCompilationResult+0)+6*A_PtrSize), "ptr", SpeechRecognitionCompilationResult, "uint*", status)   ; SpeechRecognitionCompilationResult.get_Status
if (status != 0)
{
   if (status = 1)
      msgbox SpeechRecognitionCompilation error`nA topic constraint was set for an unsupported language.
   else if (status = 2)
      msgbox SpeechRecognitionCompilation error`nThe language of the speech recognizer does not match the language of a grammar.
   else if (status = 3)
      msgbox SpeechRecognitionCompilation error`nA grammar failed to compile.
   else if (status = 4)
      msgbox SpeechRecognitionCompilation error`nAudio problems caused recognition to fail.
   else if (status = 5)
      msgbox SpeechRecognitionCompilation error`nUser canceled recognition session.
   else if (status = 6)
      msgbox SpeechRecognitionCompilation error`nAn unknown problem caused recognition or compilation to fail.
   else if (status = 7)
      msgbox SpeechRecognitionCompilation error`nA timeout due to extended silence or poor audio caused recognition to fail.
   else if (status = 8)
      msgbox SpeechRecognitionCompilation error`nAn extended pause, or excessive processing time, caused recognition to fail.
   else if (status = 9)
      msgbox SpeechRecognitionCompilation error`nNetwork problems caused recognition to fail.
   else if (status = 10)
      msgbox SpeechRecognitionCompilation error`nLack of a microphone caused recognition to fail.
   else
      msgbox SpeechRecognitionCompilation error`n status %status%
   exitapp
}
SpeechRecognizer2 := ComObjQuery(SpeechRecognizer, ISpeechRecognizer2 := "{63C9BAF1-91E3-4EA4-86A1-7C3867D084A6}")
DllCall(NumGet(NumGet(SpeechRecognizer2+0)+6*A_PtrSize), "ptr", SpeechRecognizer2, "ptr*", SpeechContinuousRecognitionSession)   ; SpeechRecognizer2.get_ContinuousRecognitionSession
DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+14*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr", SpeechContinuousRecognitionCompletedEvent := ISpeechContinuousRecognitionCompletedEvent_new(), "int64*", CompletedToken)   ; SpeechContinuousRecognitionSession.add_Completed
DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+16*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr", SpeechContinuousRecognitionResultGeneratedEvent := ISpeechContinuousRecognitionResultGeneratedEvent_new(), "int64*", ResultGeneratedToken)   ; SpeechContinuousRecognitionSession.add_ResultGenerated
hr := DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+8*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr*", AsyncAction, "uint")   ; SpeechContinuousRecognitionSession.StartAsync
if (hr != 0)
{
   if (hr = 0x80045509)
      msgbox Error. Turn on Online speech recognition
   else
      msgbox SpeechContinuousRecognitionSession.StartAsync error %hr%
   exitapp
}
WaitForAsync(AsyncAction)
if RecordingStarted
   tooltip Record Started
return

f12::
stop:
if !RecordingStarted
   return
tooltip Record is finishing
RecordingStarted := !RecordingStarted
sleep 200   ; for recognizing last word
hr := DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+10*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr*", AsyncAction)   ; SpeechContinuousRecognitionSession.StopAsync
if (hr = 0)
   WaitForAsync(AsyncAction)
tooltip
ObjReleaseClose(SpeechContinuousRecognitionCompletedEvent)
ObjReleaseClose(SpeechContinuousRecognitionResultGeneratedEvent)
if CompletedToken
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+15*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "int64", CompletedToken)   ; SpeechContinuousRecognitionSession.remove_Completed
   CompletedToken := ""
}
if ResultGeneratedToken
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+17*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "int64", ResultGeneratedToken)   ; SpeechContinuousRecognitionSession.remove_ResultGenerated
   ResultGeneratedToken := ""
}
ObjReleaseClose(SpeechContinuousRecognitionSession)
ObjReleaseClose(SpeechRecognizer2)
ObjReleaseClose(SpeechRecognitionCompilationResult)
ObjReleaseClose(SpeechRecognizerTimeouts)
ObjReleaseClose(SpeechRecognitionConstraints)
ObjReleaseClose(SpeechRecognizer)
; DllCall("psapi.dll\EmptyWorkingSet", "ptr", -1)
if (A_ThisLabel != "stop")
   msgbox % text
return



CreateClass(string, interface, ByRef Class)
{
   CreateHString(string, hString)
   VarSetCapacity(GUID, 16)
   DllCall("ole32\CLSIDFromString", "wstr", interface, "ptr", &GUID)
   result := DllCall("Combase.dll\RoGetActivationFactory", "ptr", hString, "ptr", &GUID, "ptr*", Class, "uint")
   if (result != 0)
   {
      if (result = 0x80004002)
         msgbox No such interface supported
      else if (result = 0x80040154)
         msgbox Class not registered
      else
         msgbox error: %result%
      ExitApp
   }
   DeleteHString(hString)
}

CreateHString(string, ByRef hString)
{
    DllCall("Combase.dll\WindowsCreateString", "wstr", string, "uint", StrLen(string), "ptr*", hString)
}

DeleteHString(hString)
{
   DllCall("Combase.dll\WindowsDeleteString", "ptr", hString)
}

WaitForAsync(ByRef Object)
{
   AsyncInfo := ComObjQuery(Object, IAsyncInfo := "{00000036-0000-0000-C000-000000000046}")
   loop
   {
      DllCall(NumGet(NumGet(AsyncInfo+0)+7*A_PtrSize), "ptr", AsyncInfo, "uint*", status)   ; IAsyncInfo.Status
      if (status != 0)
      {
         if (status != 1)
         {
            DllCall(NumGet(NumGet(AsyncInfo+0)+8*A_PtrSize), "ptr", AsyncInfo, "uint*", ErrorCode)   ; IAsyncInfo.ErrorCode
            msgbox AsyncInfo status error: %ErrorCode%
            ExitApp
         }
         ObjRelease(AsyncInfo)
         break
      }
      sleep 10
   }
   DllCall(NumGet(NumGet(Object+0)+8*A_PtrSize), "ptr", Object, "ptr*", ObjectResult)   ; GetResults
   ObjReleaseClose(Object)
   Object := ObjectResult
}

ObjReleaseClose(ByRef Object)
{
   if Object
   {
      if (Close := ComObjQuery(Object, IClosable := "{30D5A829-7FA4-4026-83BB-D75BAE4EA99E}"))
      {
         DllCall(NumGet(NumGet(Close+0)+6*A_PtrSize), "ptr", Close)   ; Close
         ObjRelease(Close)
      }
      ObjRelease(Object)
      Object := ""
   }
}


ISpeechContinuousRecognitionCompletedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechContinuousRecognitionCompletedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechContinuousRecognitionCompletedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechContinuousRecognitionCompletedEvent + heapOffset)
   Return ISpeechContinuousRecognitionCompletedEvent
}

ISpeechContinuousRecognitionCompletedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechContinuousRecognitionCompletedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechContinuousRecognitionCompletedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{8103c018-7952-59f9-9f41-23b17d6e452d}", "Ptr", &IID_ISpeechContinuousRecognitionCompletedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechContinuousRecognitionCompletedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechContinuousRecognitionCompletedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechContinuousRecognitionCompletedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechContinuousRecognitionCompletedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}

ISpeechContinuousRecognitionCompletedEvent_Invoke(this, sender, SpeechContinuousRecognitionCompleted)
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionCompleted+0)+6*A_PtrSize), "ptr", SpeechContinuousRecognitionCompleted, "uint*", status)   ; SpeechContinuousRecognitionCompleted.get_Status
   if (status != 0)
   {
      if (status = 1)
         msgbox A topic constraint was set for an unsupported language.
      else if (status = 2)
         msgbox The language of the speech recognizer does not match the language of a grammar.
      else if (status = 3)
         msgbox A grammar failed to compile.
      else if (status = 4)
         msgbox Audio problems caused recognition to fail.
      else if (status = 5)
         msgbox User canceled recognition session.
      else if (status = 6)
         msgbox An unknown problem caused recognition or compilation to fail.
      else if (status = 7)
         msgbox A timeout due to extended silence or poor audio caused recognition to fail.
      else if (status = 8)
         msgbox An extended pause, or excessive processing time, caused recognition to fail.
      else if (status = 9)
         msgbox Network problems caused recognition to fail.
      else if (status = 10)
         msgbox Lack of a microphone caused recognition to fail.
      else
         msgbox error status %status%
      settimer, stop, -1
   }
   return
}


ISpeechContinuousRecognitionResultGeneratedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechContinuousRecognitionResultGeneratedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechContinuousRecognitionResultGeneratedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechContinuousRecognitionResultGeneratedEvent + heapOffset)
   Return ISpeechContinuousRecognitionResultGeneratedEvent
}

ISpeechContinuousRecognitionResultGeneratedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechContinuousRecognitionResultGeneratedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechContinuousRecognitionResultGeneratedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{26192073-a2c9-527d-9bd3-911c05e0011e}", "Ptr", &IID_ISpeechContinuousRecognitionResultGeneratedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechContinuousRecognitionResultGeneratedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechContinuousRecognitionResultGeneratedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechContinuousRecognitionResultGeneratedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechContinuousRecognitionResultGeneratedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}

ISpeechContinuousRecognitionResultGeneratedEvent_Invoke(this, sender, SpeechContinuousRecognitionResultGenerated)
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionResultGenerated+0)+6*A_PtrSize), "ptr", SpeechContinuousRecognitionResultGenerated, "ptr*", SpeechRecognitionResult)   ; SpeechContinuousRecognitionResultGenerated.get_Result
   DllCall(NumGet(NumGet(SpeechRecognitionResult+0)+7*A_PtrSize), "ptr", SpeechRecognitionResult, "ptr*", htext)   ; SpeechRecognitionResult.get_Text
   buffer := DllCall("Combase.dll\WindowsGetStringRawBuffer", "ptr", hText, "uint*", length, "ptr")
   text .= StrGet(buffer, "utf-16")
   DeleteHString(hText)   ; get_Text returns an HSTRING owned by the caller, so free it
   ObjReleaseClose(SpeechRecognitionResult)
   return
}


/*
    RegisterSyncCallback

    A replacement for RegisterCallback for use with APIs that will call
    the callback on the wrong thread.  Synchronizes with the script's main
    thread via a window message.

    This version tries to emulate RegisterCallback as much as possible
    without using RegisterCallback, so shares most of its limitations,
    and some enhancements that could be made are not.

    Other differences from v1 RegisterCallback:
      - Variadic mode can't be emulated exactly, so is not supported.
      - A_EventInfo can't be set in v1, so is not supported.
      - Fast mode is not supported (the option is ignored).
      - ByRef parameters are allowed (but ByRef is ignored).
      - Throws instead of returning "" on failure.
*/
RegisterSyncCallback(FunctionName, Options:="", ParamCount:="")
{
    if !(fn := Func(FunctionName)) || fn.IsBuiltIn
        throw Exception("Bad function", -1, FunctionName)
    if (ParamCount == "")
        ParamCount := fn.MinParams
    if (ParamCount > fn.MaxParams && !fn.IsVariadic || ParamCount+0 < fn.MinParams)
        throw Exception("Bad param count", -1, ParamCount)

    static sHwnd := 0, sMsg, sSendMessageW
    if !sHwnd
    {
        Gui RegisterSyncCallback: +Parent%A_ScriptHwnd% +hwndsHwnd
        OnMessage(sMsg := 0x8000, Func("RegisterSyncCallback_Msg"))
        sSendMessageW := DllCall("GetProcAddress", "ptr", DllCall("GetModuleHandle", "str", "user32.dll", "ptr"), "astr", "SendMessageW", "ptr")
    }

    if !(pcb := DllCall("GlobalAlloc", "uint", 0, "ptr", 96, "ptr"))
        throw
    DllCall("VirtualProtect", "ptr", pcb, "ptr", 96, "uint", 0x40, "uint*", 0)

    p := pcb
    if (A_PtrSize = 8)
    {
        /*
        48 89 4c 24 08  ; mov [rsp+8], rcx
        48 89 54'24 10  ; mov [rsp+16], rdx
        4c 89 44 24 18  ; mov [rsp+24], r8
        4c'89 4c 24 20  ; mov [rsp+32], r9
        48 83 ec 28'    ; sub rsp, 40
        4c 8d 44 24 30  ; lea r8, [rsp+48]  (arg 3, &params)
        49 b9 ..        ; mov r9, .. (arg 4, operand to follow)
        */
        p := NumPut(0x54894808244c8948, p+0)
        p := NumPut(0x4c182444894c1024, p+0)
        p := NumPut(0x28ec834820244c89, p+0)
        p := NumPut(  0xb9493024448d4c, p+0) - 1
        lParamPtr := p, p += 8

        p := NumPut(0xba, p+0, "char") ; mov edx, nmsg
        p := NumPut(sMsg, p+0, "int")
        p := NumPut(0xb9, p+0, "char") ; mov ecx, hwnd
        p := NumPut(sHwnd, p+0, "int")
        p := NumPut(0xb848, p+0, "short") ; mov rax, SendMessageW
        p := NumPut(sSendMessageW, p+0)
        /*
        ff d0        ; call rax
        48 83 c4 28  ; add rsp, 40
        c3           ; ret
        */
        p := NumPut(0x00c328c48348d0ff, p+0)
    }
    else ;(A_PtrSize = 4)
    {
        p := NumPut(0x68, p+0, "char")      ; push ... (lParam data)
        lParamPtr := p, p += 4
        p := NumPut(0x0824448d, p+0, "int") ; lea eax, [esp+8]
        p := NumPut(0x50, p+0, "char")      ; push eax
        p := NumPut(0x68, p+0, "char")      ; push nmsg
        p := NumPut(sMsg, p+0, "int")
        p := NumPut(0x68, p+0, "char")      ; push hwnd
        p := NumPut(sHwnd, p+0, "int")
        p := NumPut(0xb8, p+0, "char")      ; mov eax, &SendMessageW
        p := NumPut(sSendMessageW, p+0, "int")
        p := NumPut(0xd0ff, p+0, "short")   ; call eax
        p := NumPut(0xc2, p+0, "char")      ; ret argsize
        p := NumPut((InStr(Options, "C") ? 0 : ParamCount*4), p+0, "short")
    }
    NumPut(p, lParamPtr+0) ; To be passed as lParam.
    p := NumPut(&fn, p+0)
    p := NumPut(ParamCount, p+0, "int")
    return pcb
}

RegisterSyncCallback_Msg(wParam, lParam)
{
    if (A_Gui != "RegisterSyncCallback")
        return
    fn := Object(NumGet(lParam + 0))
    paramCount := NumGet(lParam + A_PtrSize, "int")
    params := []
    Loop % paramCount
        params.Push(NumGet(wParam + A_PtrSize * (A_Index-1)))
    return %fn%(params*)
}
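
Every WinRT call in the script above uses the same pattern: read the object's vtable pointer, pick a method slot, and pass that address to DllCall. A minimal sketch of a helper that names this step is shown here; VTable is a hypothetical name and is not used by the script itself.

```autohotkey
; Hypothetical helper: returns the address of the method stored in the given
; zero-based vtable slot of a COM/WinRT object pointer.
; Slots 0-2 belong to IUnknown and slots 3-5 to IInspectable, which is why
; the first method of each WinRT interface in the script sits at index 6.
VTable(pObject, index)
{
   return NumGet(NumGet(pObject+0) + index*A_PtrSize)
}
```

With it, a call such as get_Constraints could be written as DllCall(VTable(SpeechRecognizer, 7), "ptr", SpeechRecognizer, "ptr*", SpeechRecognitionConstraints).
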

Example of detecting the phrases "run calculator" and "run notepad":

Code:

#singleinstance, force
setbatchlines -1
lang := "en-us"
GoSub, start
return

ISpeechContinuousRecognitionResultGeneratedEvent_Invoke(this, sender, SpeechContinuousRecognitionResultGenerated)
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionResultGenerated+0)+6*A_PtrSize), "ptr", SpeechContinuousRecognitionResultGenerated, "ptr*", SpeechRecognitionResult)   ; SpeechContinuousRecognitionResultGenerated.get_Result
   DllCall(NumGet(NumGet(SpeechRecognitionResult+0)+7*A_PtrSize), "ptr", SpeechRecognitionResult, "ptr*", htext)   ; SpeechRecognitionResult.get_Text
   buffer := DllCall("Combase.dll\WindowsGetStringRawBuffer", "ptr", hText, "uint*", length, "ptr")
   text := StrGet(buffer, "utf-16")
   DeleteHString(hText)   ; get_Text returns an HSTRING owned by the caller, so free it
   ObjReleaseClose(SpeechRecognitionResult)
   if (text = "run calculator")
      msgbox run calculator
   else if (text = "run notepad")
      msgbox run notepad
   return
}

ISpeechContinuousRecognitionCompletedEvent_Invoke(this, sender, SpeechContinuousRecognitionCompleted)
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionCompleted+0)+6*A_PtrSize), "ptr", SpeechContinuousRecognitionCompleted, "uint*", status)   ; SpeechContinuousRecognitionCompleted.get_Status
   if (status != 0)
   {
      if (status = 1)
         msgbox A topic constraint was set for an unsupported language.
      else if (status = 2)
         msgbox The language of the speech recognizer does not match the language of a grammar.
      else if (status = 3)
         msgbox A grammar failed to compile.
      else if (status = 4)
         msgbox Audio problems caused recognition to fail.
      else if (status = 5)   ; User canceled recognition session.
      {
         settimer, restart, -1
         return
      }
      else if (status = 6)   ;  An unknown problem caused recognition or compilation to fail.
      {
         ; msgbox An unknown problem caused recognition or compilation to fail.
         reload
      }
      else if (status = 7)   ; A timeout due to extended silence or poor audio caused recognition to fail.
      {
         ; msgbox A timeout due to extended silence or poor audio caused recognition to fail.
         reload
      }
      else if (status = 8)   ; An extended pause, or excessive processing time, caused recognition to fail.
      {
         ; msgbox An extended pause, or excessive processing time, caused recognition to fail.
         reload
      }
      else if (status = 9)
         msgbox Network problems caused recognition to fail.
      else if (status = 10)
         msgbox Lack of a microphone caused recognition to fail.
      else
         msgbox error status %status%
      exitapp
   }
   return
}

ISpeechRecognizerStateChangedEvent_Invoke(this, sender, SpeechRecognizerStateChangedEventArgs)
{
   DllCall(NumGet(NumGet(SpeechRecognizerStateChangedEventArgs+0)+6*A_PtrSize), "ptr", SpeechRecognizerStateChangedEventArgs, "uint*", state)   ; SpeechRecognizerStateChangedEventArgs.get_State
   if (state = 0)  ; Idle
      settimer, restart, -1
   return
}

CreateClass(string, interface, ByRef Class)
{
   CreateHString(string, hString)
   VarSetCapacity(GUID, 16)
   DllCall("ole32\CLSIDFromString", "wstr", interface, "ptr", &GUID)
   result := DllCall("Combase.dll\RoGetActivationFactory", "ptr", hString, "ptr", &GUID, "ptr*", Class, "uint")
   if (result != 0)
   {
      if (result = 0x80004002)
         msgbox No such interface supported
      else if (result = 0x80040154)
         msgbox Class not registered
      else
         msgbox error: %result%
      ExitApp
   }
   DeleteHString(hString)
}

CreateHString(string, ByRef hString)
{
   DllCall("Combase.dll\WindowsCreateString", "wstr", string, "uint", StrLen(string), "ptr*", hString)
}

DeleteHString(hString)
{
   DllCall("Combase.dll\WindowsDeleteString", "ptr", hString)
}

WaitForAsync(ByRef Object)
{
   AsyncInfo := ComObjQuery(Object, IAsyncInfo := "{00000036-0000-0000-C000-000000000046}")
   loop
   {
      DllCall(NumGet(NumGet(AsyncInfo+0)+7*A_PtrSize), "ptr", AsyncInfo, "uint*", status)   ; IAsyncInfo.Status
      if (status != 0)
      {
         if (status != 1)
         {
            DllCall(NumGet(NumGet(AsyncInfo+0)+8*A_PtrSize), "ptr", AsyncInfo, "uint*", ErrorCode)   ; IAsyncInfo.ErrorCode
            msgbox AsyncInfo status error: %ErrorCode%
            ExitApp
         }
         ObjRelease(AsyncInfo)
         break
      }
      sleep 10
   }
   DllCall(NumGet(NumGet(Object+0)+8*A_PtrSize), "ptr", Object, "ptr*", ObjectResult)   ; GetResults
   ObjReleaseClose(Object)
   Object := ObjectResult
}

ObjReleaseClose(ByRef Object)
{
   if Object
   {
      if (Close := ComObjQuery(Object, IClosable := "{30D5A829-7FA4-4026-83BB-D75BAE4EA99E}"))
      {
         DllCall(NumGet(NumGet(Close+0)+6*A_PtrSize), "ptr", Close)   ; Close
         ObjRelease(Close)
      }
      ObjRelease(Object)
      Object := ""
   }
}


ISpeechRecognizerStateChangedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechRecognizerStateChangedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechRecognizerStateChangedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechRecognizerStateChangedEvent + heapOffset)
   Return ISpeechRecognizerStateChangedEvent
}

ISpeechRecognizerStateChangedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechRecognizerStateChangedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechRecognizerStateChangedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{d1185e92-5c30-5561-b3e2-e82ddbd872c3}", "Ptr", &IID_ISpeechRecognizerStateChangedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechRecognizerStateChangedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechRecognizerStateChangedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechRecognizerStateChangedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechRecognizerStateChangedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}


ISpeechContinuousRecognitionCompletedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechContinuousRecognitionCompletedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechContinuousRecognitionCompletedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechContinuousRecognitionCompletedEvent + heapOffset)
   Return ISpeechContinuousRecognitionCompletedEvent
}

ISpeechContinuousRecognitionCompletedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechContinuousRecognitionCompletedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechContinuousRecognitionCompletedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{8103c018-7952-59f9-9f41-23b17d6e452d}", "Ptr", &IID_ISpeechContinuousRecognitionCompletedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechContinuousRecognitionCompletedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechContinuousRecognitionCompletedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechContinuousRecognitionCompletedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechContinuousRecognitionCompletedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}


ISpeechContinuousRecognitionResultGeneratedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechContinuousRecognitionResultGeneratedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechContinuousRecognitionResultGeneratedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechContinuousRecognitionResultGeneratedEvent + heapOffset)
   Return ISpeechContinuousRecognitionResultGeneratedEvent
}

ISpeechContinuousRecognitionResultGeneratedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechContinuousRecognitionResultGeneratedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechContinuousRecognitionResultGeneratedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{26192073-a2c9-527d-9bd3-911c05e0011e}", "Ptr", &IID_ISpeechContinuousRecognitionResultGeneratedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechContinuousRecognitionResultGeneratedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechContinuousRecognitionResultGeneratedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechContinuousRecognitionResultGeneratedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechContinuousRecognitionResultGeneratedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}



/*
    RegisterSyncCallback

    A replacement for RegisterCallback for use with APIs that will call
    the callback on the wrong thread.  Synchronizes with the script's main
    thread via a window message.

    This version tries to emulate RegisterCallback as much as possible
    without using RegisterCallback, so shares most of its limitations,
    and some enhancements that could be made are not.

    Other differences from v1 RegisterCallback:
      - Variadic mode can't be emulated exactly, so is not supported.
      - A_EventInfo can't be set in v1, so is not supported.
      - Fast mode is not supported (the option is ignored).
      - ByRef parameters are allowed (but ByRef is ignored).
      - Throws instead of returning "" on failure.
*/
RegisterSyncCallback(FunctionName, Options:="", ParamCount:="")
{
    if !(fn := Func(FunctionName)) || fn.IsBuiltIn
        throw Exception("Bad function", -1, FunctionName)
    if (ParamCount == "")
        ParamCount := fn.MinParams
    if (ParamCount > fn.MaxParams && !fn.IsVariadic || ParamCount+0 < fn.MinParams)
        throw Exception("Bad param count", -1, ParamCount)

    static sHwnd := 0, sMsg, sSendMessageW
    if !sHwnd
    {
        Gui RegisterSyncCallback: +Parent%A_ScriptHwnd% +hwndsHwnd
        OnMessage(sMsg := 0x8000, Func("RegisterSyncCallback_Msg"))
        sSendMessageW := DllCall("GetProcAddress", "ptr", DllCall("GetModuleHandle", "str", "user32.dll", "ptr"), "astr", "SendMessageW", "ptr")
    }

    if !(pcb := DllCall("GlobalAlloc", "uint", 0, "ptr", 96, "ptr"))
        throw
    DllCall("VirtualProtect", "ptr", pcb, "ptr", 96, "uint", 0x40, "uint*", 0)

    p := pcb
    if (A_PtrSize = 8)
    {
        /*
        48 89 4c 24 08  ; mov [rsp+8], rcx
        48 89 54'24 10  ; mov [rsp+16], rdx
        4c 89 44 24 18  ; mov [rsp+24], r8
        4c'89 4c 24 20  ; mov [rsp+32], r9
        48 83 ec 28'    ; sub rsp, 40
        4c 8d 44 24 30  ; lea r8, [rsp+48]  (arg 3, &params)
        49 b9 ..        ; mov r9, .. (arg 4, operand to follow)
        */
        p := NumPut(0x54894808244c8948, p+0)
        p := NumPut(0x4c182444894c1024, p+0)
        p := NumPut(0x28ec834820244c89, p+0)
        p := NumPut(  0xb9493024448d4c, p+0) - 1
        lParamPtr := p, p += 8

        p := NumPut(0xba, p+0, "char") ; mov edx, nmsg
        p := NumPut(sMsg, p+0, "int")
        p := NumPut(0xb9, p+0, "char") ; mov ecx, hwnd
        p := NumPut(sHwnd, p+0, "int")
        p := NumPut(0xb848, p+0, "short") ; mov rax, SendMessageW
        p := NumPut(sSendMessageW, p+0)
        /*
        ff d0        ; call rax
        48 83 c4 28  ; add rsp, 40
        c3           ; ret
        */
        p := NumPut(0x00c328c48348d0ff, p+0)
    }
    else ;(A_PtrSize = 4)
    {
        p := NumPut(0x68, p+0, "char")      ; push ... (lParam data)
        lParamPtr := p, p += 4
        p := NumPut(0x0824448d, p+0, "int") ; lea eax, [esp+8]
        p := NumPut(0x50, p+0, "char")      ; push eax
        p := NumPut(0x68, p+0, "char")      ; push nmsg
        p := NumPut(sMsg, p+0, "int")
        p := NumPut(0x68, p+0, "char")      ; push hwnd
        p := NumPut(sHwnd, p+0, "int")
        p := NumPut(0xb8, p+0, "char")      ; mov eax, &SendMessageW
        p := NumPut(sSendMessageW, p+0, "int")
        p := NumPut(0xd0ff, p+0, "short")   ; call eax
        p := NumPut(0xc2, p+0, "char")      ; ret argsize
        p := NumPut((InStr(Options, "C") ? 0 : ParamCount*4), p+0, "short")
    }
    NumPut(p, lParamPtr+0) ; To be passed as lParam.
    p := NumPut(&fn, p+0)
    p := NumPut(ParamCount, p+0, "int")
    return pcb
}

RegisterSyncCallback_Msg(wParam, lParam)
{
    if (A_Gui != "RegisterSyncCallback")
        return
    fn := Object(NumGet(lParam + 0))
    paramCount := NumGet(lParam + A_PtrSize, "int")
    params := []
    Loop % paramCount
        params.Push(NumGet(wParam + A_PtrSize * (A_Index-1)))
    return %fn%(params*)
}

start:
oRestart := []
CreateClass("Windows.Globalization.Language", ILanguageFactory := "{9B0252AC-0C27-44F8-B792-9793FB66C63E}", LanguageFactory)
CreateHString(lang, hString)
DllCall(NumGet(NumGet(LanguageFactory+0)+6*A_PtrSize), "ptr", LanguageFactory, "ptr", hString, "ptr*", Language)   ; CreateLanguage
DeleteHString(hString)
CreateClass("Windows.Media.SpeechRecognition.SpeechRecognizer", ISpeechRecognizerFactory := "{60C488DD-7FB8-4033-AC70-D046F64818E1}", SpeechRecognizerFactory)
CreateClass("Windows.Media.SpeechRecognition.SpeechRecognitionTopicConstraint", ISpeechRecognitionTopicConstraintFactory := "{6E6863DF-EC05-47D7-A5DF-56A3431E58D2}", SpeechRecognitionTopicConstraintFactory)
CreateHString("Dictation", hString)
DllCall(NumGet(NumGet(SpeechRecognitionTopicConstraintFactory+0)+6*A_PtrSize), "ptr", SpeechRecognitionTopicConstraintFactory, "int", Dictation := 1, "ptr", hString, "ptr*", SpeechRecognitionTopicConstraint)   ; SpeechRecognitionTopicConstraintFactory.Create
DeleteHString(hString)
SpeechRecognitionConstraint := ComObjQuery(SpeechRecognitionTopicConstraint, ISpeechRecognitionConstraint := "{79AC1628-4D68-43C4-8911-40DC4101B55B}")
DllCall(NumGet(NumGet(SpeechRecognitionConstraint+0)+7*A_PtrSize), "ptr", SpeechRecognitionConstraint, "int", 1)   ; SpeechRecognitionConstraint.put_IsEnabled

restart:
restart1:
if (A_ThisLabel = "restart")
   oRestart.Push(A_TickCount)
else if (A_ThisLabel = "restart1") and (oRestart.Count() = 0)
   return
if (restarting = 1)
   return
restarting := 1
oRestart.Pop()
Menu, Tray, Icon , %A_AhkPath%, 4, 1
ObjReleaseClose(SpeechContinuousRecognitionCompletedEvent)
ObjReleaseClose(SpeechContinuousRecognitionResultGeneratedEvent)
ObjReleaseClose(SpeechRecognizerStateChangedEvent)
if CompletedToken
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+15*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "int64", CompletedToken)   ; SpeechContinuousRecognitionSession.remove_Completed
   CompletedToken := ""
}
if ResultGeneratedToken
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+17*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "int64", ResultGeneratedToken)   ; SpeechContinuousRecognitionSession.remove_ResultGenerated
   ResultGeneratedToken := ""
}
if StateChangedToken
{
   DllCall(NumGet(NumGet(SpeechRecognizer+0)+16*A_PtrSize), "ptr", SpeechRecognizer, "int64", StateChangedToken)   ; SpeechRecognizer.remove_StateChanged
   StateChangedToken := ""
}
ObjReleaseClose(SpeechContinuousRecognitionSession)
ObjReleaseClose(SpeechRecognizer2)
ObjReleaseClose(SpeechRecognitionCompilationResult)
ObjReleaseClose(SpeechRecognizerTimeouts)
ObjReleaseClose(SpeechRecognitionConstraints)
ObjReleaseClose(SpeechRecognizer)
hr := DllCall(NumGet(NumGet(SpeechRecognizerFactory+0)+6*A_PtrSize), "ptr", SpeechRecognizerFactory, "ptr", Language, "ptr*", SpeechRecognizer, "uint")   ; SpeechRecognizerFactory.Create
if (hr != 0)
{
   if (hr = 0x800455BC)
      msgbox Specified language is not supported
   else
      msgbox SpeechRecognizerFactory.Create error %hr%
   exitapp
}
DllCall(NumGet(NumGet(SpeechRecognizer+0)+15*A_PtrSize), "ptr", SpeechRecognizer, "ptr", SpeechRecognizerStateChangedEvent := ISpeechRecognizerStateChangedEvent_new(), "int64*", StateChangedToken)   ; SpeechRecognizer.add_StateChanged
DllCall(NumGet(NumGet(SpeechRecognizer+0)+7*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognitionConstraints)   ; SpeechRecognizer.get_Constraints
DllCall(NumGet(NumGet(SpeechRecognitionConstraints+0)+13*A_PtrSize), "ptr", SpeechRecognitionConstraints, "ptr", SpeechRecognitionConstraint)   ; IVector.Append(T)
DllCall(NumGet(NumGet(SpeechRecognizer+0)+8*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognizerTimeouts)   ; SpeechRecognizer.get_Timeouts
DllCall(NumGet(NumGet(SpeechRecognizerTimeouts+0)+7*A_PtrSize), "ptr", SpeechRecognizerTimeouts, "int64", 0)   ; SpeechRecognizerTimeouts.put_InitialSilenceTimeout
DllCall(NumGet(NumGet(SpeechRecognizer+0)+10*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognitionCompilationResult)   ; SpeechRecognizer.CompileConstraintsAsync
WaitForAsync(SpeechRecognitionCompilationResult)
DllCall(NumGet(NumGet(SpeechRecognitionCompilationResult+0)+6*A_PtrSize), "ptr", SpeechRecognitionCompilationResult, "uint*", status)   ; SpeechRecognitionCompilationResult.get_Status
if (status != 0)
{
   if (status = 1)
      msgbox SpeechRecognitionCompilation error`nA topic constraint was set for an unsupported language.
   else if (status = 2)
      msgbox SpeechRecognitionCompilation error`nThe language of the speech recognizer does not match the language of a grammar.
   else if (status = 3)
      msgbox SpeechRecognitionCompilation error`nA grammar failed to compile.
   else if (status = 4)
      msgbox SpeechRecognitionCompilation error`nAudio problems caused recognition to fail.
   else if (status = 5)
      msgbox SpeechRecognitionCompilation error`nUser canceled recognition session.
   else if (status = 6)
      msgbox SpeechRecognitionCompilation error`nAn unknown problem caused recognition or compilation to fail.
   else if (status = 7)
      msgbox SpeechRecognitionCompilation error`nA timeout due to extended silence or poor audio caused recognition to fail.
   else if (status = 8)
      msgbox SpeechRecognitionCompilation error`nAn extended pause, or excessive processing time, caused recognition to fail.
   else if (status = 9)
      msgbox SpeechRecognitionCompilation error`nNetwork problems caused recognition to fail.
   else if (status = 10)
      msgbox SpeechRecognitionCompilation error`nLack of a microphone caused recognition to fail.
   else
      msgbox SpeechRecognitionCompilation error`n status %status%
   exitapp
}
SpeechRecognizer2 := ComObjQuery(SpeechRecognizer, ISpeechRecognizer2 := "{63C9BAF1-91E3-4EA4-86A1-7C3867D084A6}")
DllCall(NumGet(NumGet(SpeechRecognizer2+0)+6*A_PtrSize), "ptr", SpeechRecognizer2, "ptr*", SpeechContinuousRecognitionSession)   ; SpeechRecognizer2.get_ContinuousRecognitionSession
DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+14*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr", SpeechContinuousRecognitionCompletedEvent := ISpeechContinuousRecognitionCompletedEvent_new(), "int64*", CompletedToken)   ; SpeechContinuousRecognitionSession.add_Completed
DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+16*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr", SpeechContinuousRecognitionResultGeneratedEvent := ISpeechContinuousRecognitionResultGeneratedEvent_new(), "int64*", ResultGeneratedToken)   ; SpeechContinuousRecognitionSession.add_ResultGenerated
hr := DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+8*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr*", AsyncAction, "uint")   ; SpeechContinuousRecognitionSession.StartAsync
if (hr != 0)
{
   if (hr = 0x80045509)
      msgbox Error. Turn on Online speech recognition
   else
      msgbox SpeechContinuousRecognitionSession.StartAsync error %hr%
   exitapp
}
WaitForAsync(AsyncAction)
Menu, Tray, Icon , %A_AhkPath%, 1, 1
restarting := ""
SetTimer, restart1, -1
return
Example using a custom SRGS grammar file as a dictionary of your own words.
A primitive grammar.xml can look like this:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" mode="voice" tag-format="semantics/1.0" xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" root="commands">
<rule id="commands">
   <one-of>
      <item>malcev</item>
      <item>autohotkey</item>
   </one-of>
</rule>
</grammar>
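
A slightly larger grammar can be sketched the same way. This is only an illustration, assuming the recognizer accepts the standard SRGS 1.0 elements `repeat` and `ruleref`; the words and the rule name "apps" are made up for the example. It should match phrases like "open notepad" or "please close calculator". The xml:lang value has to match the lang variable of the script, otherwise compilation reports that the language of the recognizer does not match the language of the grammar.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" mode="voice" tag-format="semantics/1.0" xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" root="commands">
<rule id="commands">
   <!-- optional prefix: the phrase matches with or without "please" -->
   <item repeat="0-1">please</item>
   <!-- exactly one of the verbs -->
   <one-of>
      <item>open</item>
      <item>close</item>
   </one-of>
   <!-- followed by one entry of the "apps" rule defined below -->
   <ruleref uri="#apps"/>
</rule>
<rule id="apps">
   <one-of>
      <item>notepad</item>
      <item>calculator</item>
   </one-of>
</rule>
</grammar>
```

Save it as grammar.xml next to the script, the same as the primitive version.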

Code: Select all

#singleinstance, force
setbatchlines -1
lang := "en-us"
grammarfile := A_ScriptDir "\grammar.xml"
GoSub, start
return

ISpeechContinuousRecognitionResultGeneratedEvent_Invoke(this, sender, SpeechContinuousRecognitionResultGenerated)
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionResultGenerated+0)+6*A_PtrSize), "ptr", SpeechContinuousRecognitionResultGenerated, "ptr*", SpeechRecognitionResult)   ; SpeechContinuousRecognitionResultGenerated.get_Result
   DllCall(NumGet(NumGet(SpeechRecognitionResult+0)+7*A_PtrSize), "ptr", SpeechRecognitionResult, "ptr*", hText)   ; SpeechRecognitionResult.get_Text
   buffer := DllCall("Combase.dll\WindowsGetStringRawBuffer", "ptr", hText, "uint*", length, "ptr")
   text := StrGet(buffer, "utf-16")
   DeleteHString(hText)   ; free the HSTRING returned by get_Text once its contents are copied
   ObjReleaseClose(SpeechRecognitionResult)
   msgbox % text
   return
}

ISpeechContinuousRecognitionCompletedEvent_Invoke(this, sender, SpeechContinuousRecognitionCompleted)
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionCompleted+0)+6*A_PtrSize), "ptr", SpeechContinuousRecognitionCompleted, "uint*", status)   ; SpeechContinuousRecognitionCompleted.get_Status
   if (status != 0)
   {
      if (status = 1)
         msgbox A topic constraint was set for an unsupported language.
      else if (status = 2)
         msgbox The language of the speech recognizer does not match the language of a grammar.
      else if (status = 3)
         msgbox A grammar failed to compile.
      else if (status = 4)
         msgbox Audio problems caused recognition to fail.
      else if (status = 5)   ; User canceled recognition session.
      {
         settimer, restart, -1
         return
      }
      else if (status = 6)   ;  An unknown problem caused recognition or compilation to fail.
      {
         ; msgbox An unknown problem caused recognition or compilation to fail.
         reload
      }
      else if (status = 7)   ; A timeout due to extended silence or poor audio caused recognition to fail.
      {
         ; msgbox A timeout due to extended silence or poor audio caused recognition to fail.
         reload
      }
      else if (status = 8)   ; An extended pause, or excessive processing time, caused recognition to fail.
      {
         ; msgbox An extended pause, or excessive processing time, caused recognition to fail.
         reload
      }
      else if (status = 9)
         msgbox Network problems caused recognition to fail.
      else if (status = 10)
         msgbox Lack of a microphone caused recognition to fail.
      else
         msgbox error status %status%
      exitapp
   }
   return
}

ISpeechRecognizerStateChangedEvent_Invoke(this, sender, SpeechRecognizerStateChangedEventArgs)
{
   DllCall(NumGet(NumGet(SpeechRecognizerStateChangedEventArgs+0)+6*A_PtrSize), "ptr", SpeechRecognizerStateChangedEventArgs, "uint*", state)   ; SpeechRecognizerStateChangedEventArgs.get_State
   if (state = 0)  ; Idle
      settimer, restart, -1
   return
}

CreateClass(string, interface, ByRef Class)
{
   CreateHString(string, hString)
   VarSetCapacity(GUID, 16)
   DllCall("ole32\CLSIDFromString", "wstr", interface, "ptr", &GUID)
   result := DllCall("Combase.dll\RoGetActivationFactory", "ptr", hString, "ptr", &GUID, "ptr*", Class, "uint")
   if (result != 0)
   {
      if (result = 0x80004002)
         msgbox No such interface supported
      else if (result = 0x80040154)
         msgbox Class not registered
      else
         msgbox error: %result%
      ExitApp
   }
   DeleteHString(hString)
}

CreateHString(string, ByRef hString)
{
    DllCall("Combase.dll\WindowsCreateString", "wstr", string, "uint", StrLen(string), "ptr*", hString)
}

DeleteHString(hString)
{
   DllCall("Combase.dll\WindowsDeleteString", "ptr", hString)
}

WaitForAsync(ByRef Object)
{
   AsyncInfo := ComObjQuery(Object, IAsyncInfo := "{00000036-0000-0000-C000-000000000046}")
   loop
   {
      DllCall(NumGet(NumGet(AsyncInfo+0)+7*A_PtrSize), "ptr", AsyncInfo, "uint*", status)   ; IAsyncInfo.Status
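      if (status != 0)   ; AsyncStatus 0 = Started (still running)
      {
         if (status != 1)   ; 1 = Completed; anything else (2 = Canceled, 3 = Error) is a failure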
         {
            DllCall(NumGet(NumGet(AsyncInfo+0)+8*A_PtrSize), "ptr", AsyncInfo, "uint*", ErrorCode)   ; IAsyncInfo.ErrorCode
            msgbox AsyncInfo status error: %ErrorCode%
            ExitApp
         }
         ObjRelease(AsyncInfo)
         break
      }
      sleep 10
   }
   DllCall(NumGet(NumGet(Object+0)+8*A_PtrSize), "ptr", Object, "ptr*", ObjectResult)   ; GetResults
   ObjReleaseClose(Object)
   Object := ObjectResult
}

ObjReleaseClose(ByRef Object)
{
   if Object
   {
      if (Close := ComObjQuery(Object, IClosable := "{30D5A829-7FA4-4026-83BB-D75BAE4EA99E}"))
      {
         DllCall(NumGet(NumGet(Close+0)+6*A_PtrSize), "ptr", Close)   ; Close
         ObjRelease(Close)
      }
      ObjRelease(Object)
      Object := ""
   }
}


ISpeechRecognizerStateChangedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechRecognizerStateChangedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechRecognizerStateChangedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechRecognizerStateChangedEvent + heapOffset)
   Return ISpeechRecognizerStateChangedEvent
}

ISpeechRecognizerStateChangedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechRecognizerStateChangedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechRecognizerStateChangedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{d1185e92-5c30-5561-b3e2-e82ddbd872c3}", "Ptr", &IID_ISpeechRecognizerStateChangedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechRecognizerStateChangedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechRecognizerStateChangedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechRecognizerStateChangedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechRecognizerStateChangedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}


ISpeechContinuousRecognitionCompletedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechContinuousRecognitionCompletedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechContinuousRecognitionCompletedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechContinuousRecognitionCompletedEvent + heapOffset)
   Return ISpeechContinuousRecognitionCompletedEvent
}

ISpeechContinuousRecognitionCompletedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechContinuousRecognitionCompletedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechContinuousRecognitionCompletedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{8103c018-7952-59f9-9f41-23b17d6e452d}", "Ptr", &IID_ISpeechContinuousRecognitionCompletedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechContinuousRecognitionCompletedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechContinuousRecognitionCompletedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechContinuousRecognitionCompletedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechContinuousRecognitionCompletedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}


ISpeechContinuousRecognitionResultGeneratedEvent_new() {
   static VTBL := [ "QueryInterface"
                  , "AddRef"
                  , "Release"
                  , "Invoke" ]
        , heapSize := A_PtrSize*10
        , heapOffset := A_PtrSize*9
        
        , flags := (HEAP_GENERATE_EXCEPTIONS := 0x4) | (HEAP_NO_SERIALIZE := 0x1)
        , HEAP_ZERO_MEMORY := 0x8
   
   hHeap := DllCall("HeapCreate", "UInt", flags, "Ptr", 0, "Ptr", 0, "Ptr")
   addr := ISpeechContinuousRecognitionResultGeneratedEvent := DllCall("HeapAlloc", "Ptr", hHeap, "UInt", HEAP_ZERO_MEMORY, "Ptr", heapSize, "Ptr")
   addr := NumPut(addr + A_PtrSize, addr + 0)
   for k, v in VTBL
      addr := NumPut(RegisterSyncCallback("ISpeechContinuousRecognitionResultGeneratedEvent_" . v), addr + 0 )
   NumPut(hHeap, ISpeechContinuousRecognitionResultGeneratedEvent + heapOffset)
   Return ISpeechContinuousRecognitionResultGeneratedEvent
}

ISpeechContinuousRecognitionResultGeneratedEvent_QueryInterface(this, riid, ppvObject)
{
   static IID_IUnknown, IID_ISpeechContinuousRecognitionResultGeneratedEvent
   if (!VarSetCapacity(IID_IUnknown))
   {
      VarSetCapacity(IID_IUnknown, 16), VarSetCapacity(IID_ISpeechContinuousRecognitionResultGeneratedEvent, 16)
      DllCall("ole32\CLSIDFromString", "WStr", "{00000000-0000-0000-C000-000000000046}", "Ptr", &IID_IUnknown)
      DllCall("ole32\CLSIDFromString", "WStr", "{26192073-a2c9-527d-9bd3-911c05e0011e}", "Ptr", &IID_ISpeechContinuousRecognitionResultGeneratedEvent)
   }
   if (DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_ISpeechContinuousRecognitionResultGeneratedEvent) || DllCall("ole32\IsEqualGUID", "Ptr", riid, "Ptr", &IID_IUnknown))
   {
      NumPut(this, ppvObject+0, "Ptr")
      ISpeechContinuousRecognitionResultGeneratedEvent_AddRef(this)
      return 0 ; S_OK
   }
   NumPut(0, ppvObject+0, "Ptr")
   return 0x80004002 ; E_NOINTERFACE
}

ISpeechContinuousRecognitionResultGeneratedEvent_AddRef(this) {
   static refOffset := A_PtrSize*8
   NumPut(refCount := NumGet(this + refOffset, "UInt") + 1, this + refOffset, "UInt")
   Return refCount
}

ISpeechContinuousRecognitionResultGeneratedEvent_Release(this) {
   static refOffset := A_PtrSize*8
        , heapOffset := A_PtrSize*9
   NumPut(refCount := NumGet(this + refOffset, "UInt") - 1, this + refOffset, "UInt")
   if (refCount = 0) {
      hHeap := NumGet(this + heapOffset)
      DllCall("HeapDestroy", "Ptr", hHeap)
   }
   Return refCount
}



/*
    RegisterSyncCallback

    A replacement for RegisterCallback for use with APIs that will call
    the callback on the wrong thread.  Synchronizes with the script's main
    thread via a window message.

    This version tries to emulate RegisterCallback as much as possible
    without using RegisterCallback, so shares most of its limitations,
    and some enhancements that could be made are not.

    Other differences from v1 RegisterCallback:
      - Variadic mode can't be emulated exactly, so is not supported.
      - A_EventInfo can't be set in v1, so is not supported.
      - Fast mode is not supported (the option is ignored).
      - ByRef parameters are allowed (but ByRef is ignored).
      - Throws instead of returning "" on failure.
*/
RegisterSyncCallback(FunctionName, Options:="", ParamCount:="")
{
    if !(fn := Func(FunctionName)) || fn.IsBuiltIn
        throw Exception("Bad function", -1, FunctionName)
    if (ParamCount == "")
        ParamCount := fn.MinParams
    if (ParamCount > fn.MaxParams && !fn.IsVariadic || ParamCount+0 < fn.MinParams)
        throw Exception("Bad param count", -1, ParamCount)

    static sHwnd := 0, sMsg, sSendMessageW
    if !sHwnd
    {
        Gui RegisterSyncCallback: +Parent%A_ScriptHwnd% +hwndsHwnd
        OnMessage(sMsg := 0x8000, Func("RegisterSyncCallback_Msg"))
        sSendMessageW := DllCall("GetProcAddress", "ptr", DllCall("GetModuleHandle", "str", "user32.dll", "ptr"), "astr", "SendMessageW", "ptr")
    }

    if !(pcb := DllCall("GlobalAlloc", "uint", 0, "ptr", 96, "ptr"))
        throw
    DllCall("VirtualProtect", "ptr", pcb, "ptr", 96, "uint", 0x40, "uint*", 0)

    p := pcb
    if (A_PtrSize = 8)
    {
        /*
        48 89 4c 24 08  ; mov [rsp+8], rcx
        48 89 54'24 10  ; mov [rsp+16], rdx
        4c 89 44 24 18  ; mov [rsp+24], r8
        4c'89 4c 24 20  ; mov [rsp+32], r9
        48 83 ec 28'    ; sub rsp, 40
        4c 8d 44 24 30  ; lea r8, [rsp+48]  (arg 3, &params)
        49 b9 ..        ; mov r9, .. (arg 4, operand to follow)
        */
        p := NumPut(0x54894808244c8948, p+0)
        p := NumPut(0x4c182444894c1024, p+0)
        p := NumPut(0x28ec834820244c89, p+0)
        p := NumPut(  0xb9493024448d4c, p+0) - 1
        lParamPtr := p, p += 8

        p := NumPut(0xba, p+0, "char") ; mov edx, nmsg
        p := NumPut(sMsg, p+0, "int")
        p := NumPut(0xb9, p+0, "char") ; mov ecx, hwnd
        p := NumPut(sHwnd, p+0, "int")
        p := NumPut(0xb848, p+0, "short") ; mov rax, SendMessageW
        p := NumPut(sSendMessageW, p+0)
        /*
        ff d0        ; call rax
        48 83 c4 28  ; add rsp, 40
        c3           ; ret
        */
        p := NumPut(0x00c328c48348d0ff, p+0)
    }
    else ;(A_PtrSize = 4)
    {
        p := NumPut(0x68, p+0, "char")      ; push ... (lParam data)
        lParamPtr := p, p += 4
        p := NumPut(0x0824448d, p+0, "int") ; lea eax, [esp+8]
        p := NumPut(0x50, p+0, "char")      ; push eax
        p := NumPut(0x68, p+0, "char")      ; push nmsg
        p := NumPut(sMsg, p+0, "int")
        p := NumPut(0x68, p+0, "char")      ; push hwnd
        p := NumPut(sHwnd, p+0, "int")
        p := NumPut(0xb8, p+0, "char")      ; mov eax, &SendMessageW
        p := NumPut(sSendMessageW, p+0, "int")
        p := NumPut(0xd0ff, p+0, "short")   ; call eax
        p := NumPut(0xc2, p+0, "char")      ; ret argsize
        p := NumPut((InStr(Options, "C") ? 0 : ParamCount*4), p+0, "short")
    }
    NumPut(p, lParamPtr+0) ; To be passed as lParam.
    p := NumPut(&fn, p+0)
    p := NumPut(ParamCount, p+0, "int")
    return pcb
}

RegisterSyncCallback_Msg(wParam, lParam)
{
    if (A_Gui != "RegisterSyncCallback")
        return
    fn := Object(NumGet(lParam + 0))
    paramCount := NumGet(lParam + A_PtrSize, "int")
    params := []
    Loop % paramCount
        params.Push(NumGet(wParam + A_PtrSize * (A_Index-1)))
    return %fn%(params*)
}

start:
oRestart := []
CreateClass("Windows.Globalization.Language", ILanguageFactory := "{9B0252AC-0C27-44F8-B792-9793FB66C63E}", LanguageFactory)
CreateHString(lang, hString)
DllCall(NumGet(NumGet(LanguageFactory+0)+6*A_PtrSize), "ptr", LanguageFactory, "ptr", hString, "ptr*", Language)   ; CreateLanguage
DeleteHString(hString)
CreateClass("Windows.Media.SpeechRecognition.SpeechRecognizer", ISpeechRecognizerFactory := "{60C488DD-7FB8-4033-AC70-D046F64818E1}", SpeechRecognizerFactory)
CreateClass("Windows.Media.SpeechRecognition.SpeechRecognitionTopicConstraint", ISpeechRecognitionTopicConstraintFactory := "{6E6863DF-EC05-47D7-A5DF-56A3431E58D2}", SpeechRecognitionTopicConstraintFactory)
CreateHString("Dictation", hString)
DllCall(NumGet(NumGet(SpeechRecognitionTopicConstraintFactory+0)+6*A_PtrSize), "ptr", SpeechRecognitionTopicConstraintFactory, "int", Dictation := 1, "ptr", hString, "ptr*", SpeechRecognitionTopicConstraint)   ; SpeechRecognitionTopicConstraintFactory.Create
DeleteHString(hString)
SpeechRecognitionConstraint1 := ComObjQuery(SpeechRecognitionTopicConstraint, ISpeechRecognitionConstraint := "{79AC1628-4D68-43C4-8911-40DC4101B55B}")
DllCall(NumGet(NumGet(SpeechRecognitionConstraint1+0)+7*A_PtrSize), "ptr", SpeechRecognitionConstraint1, "int", 1)   ; SpeechRecognitionConstraint.put_IsEnabled
CreateClass("Windows.Storage.StorageFile", IStorageFileStatics := "{5984C710-DAF2-43C8-8BB4-A4D3EACFD03F}", StorageFileStatics)
CreateHString(grammarfile, hString)
DllCall(NumGet(NumGet(StorageFileStatics+0)+6*A_PtrSize), "ptr", StorageFileStatics, "ptr", hString, "ptr*", StorageFile)   ; StorageFile.GetFileFromPathAsync
WaitForAsync(StorageFile)
DeleteHString(hString)
CreateClass("Windows.Media.SpeechRecognition.SpeechRecognitionGrammarFileConstraint", ISpeechRecognitionGrammarFileConstraintFactory := "{3DA770EB-C479-4C27-9F19-89974EF392D1}", SpeechRecognitionGrammarFileConstraintFactory)
DllCall(NumGet(NumGet(SpeechRecognitionGrammarFileConstraintFactory+0)+6*A_PtrSize), "ptr", SpeechRecognitionGrammarFileConstraintFactory, "ptr", StorageFile, "ptr*", SpeechRecognitionGrammarFileConstraint)   ; SpeechRecognitionGrammarFileConstraint.Create
SpeechRecognitionConstraint2 := ComObjQuery(SpeechRecognitionGrammarFileConstraint, ISpeechRecognitionConstraint := "{79AC1628-4D68-43C4-8911-40DC4101B55B}")
DllCall(NumGet(NumGet(SpeechRecognitionConstraint2+0)+7*A_PtrSize), "ptr", SpeechRecognitionConstraint2, "int", 1)   ; SpeechRecognitionConstraint.put_IsEnabled

restart:
restart1:
if (A_ThisLabel = "restart")
   oRestart.Push(A_TickCount)
else if (A_ThisLabel = "restart1") and (oRestart.Count() = 0)
   return
if (restarting = 1)
   return
restarting := 1
oRestart.Pop()
Menu, Tray, Icon , %A_AhkPath%, 4, 1
ObjReleaseClose(SpeechContinuousRecognitionCompletedEvent)
ObjReleaseClose(SpeechContinuousRecognitionResultGeneratedEvent)
ObjReleaseClose(SpeechRecognizerStateChangedEvent)
if CompletedToken
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+15*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "int64", CompletedToken)   ; SpeechContinuousRecognitionSession.remove_Completed
   CompletedToken := ""
}
if ResultGeneratedToken
{
   DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+17*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "int64", ResultGeneratedToken)   ; SpeechContinuousRecognitionSession.remove_ResultGenerated
   ResultGeneratedToken := ""
}
if StateChangedToken
{
   DllCall(NumGet(NumGet(SpeechRecognizer+0)+16*A_PtrSize), "ptr", SpeechRecognizer, "int64", StateChangedToken)   ; SpeechRecognizer.remove_StateChanged
   StateChangedToken := ""
}
ObjReleaseClose(SpeechContinuousRecognitionSession)
ObjReleaseClose(SpeechRecognizer2)
ObjReleaseClose(SpeechRecognitionCompilationResult)
ObjReleaseClose(SpeechRecognizerTimeouts)
ObjReleaseClose(SpeechRecognitionConstraints)
ObjReleaseClose(SpeechRecognizer)
hr := DllCall(NumGet(NumGet(SpeechRecognizerFactory+0)+6*A_PtrSize), "ptr", SpeechRecognizerFactory, "ptr", Language, "ptr*", SpeechRecognizer, "uint")   ; SpeechRecognizerFactory.Create
if (hr != 0)
{
   if (hr = 0x800455BC)
      msgbox Specified language is not supported
   else
      msgbox SpeechRecognizerFactory.Create error %hr%
   exitapp
}
DllCall(NumGet(NumGet(SpeechRecognizer+0)+15*A_PtrSize), "ptr", SpeechRecognizer, "ptr", SpeechRecognizerStateChangedEvent := ISpeechRecognizerStateChangedEvent_new(), "int64*", StateChangedToken)   ; SpeechRecognizer.add_StateChanged
DllCall(NumGet(NumGet(SpeechRecognizer+0)+7*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognitionConstraints)   ; SpeechRecognizer.get_Constraints
loop 2
   DllCall(NumGet(NumGet(SpeechRecognitionConstraints+0)+13*A_PtrSize), "ptr", SpeechRecognitionConstraints, "ptr", SpeechRecognitionConstraint%A_Index%)   ; IVector.Append(T)
DllCall(NumGet(NumGet(SpeechRecognizer+0)+8*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognizerTimeouts)   ; SpeechRecognizer.get_Timeouts
DllCall(NumGet(NumGet(SpeechRecognizerTimeouts+0)+7*A_PtrSize), "ptr", SpeechRecognizerTimeouts, "int64", 0)   ; SpeechRecognizerTimeouts.put_InitialSilenceTimeout
DllCall(NumGet(NumGet(SpeechRecognizer+0)+10*A_PtrSize), "ptr", SpeechRecognizer, "ptr*", SpeechRecognitionCompilationResult)   ; SpeechRecognizer.CompileConstraintsAsync
WaitForAsync(SpeechRecognitionCompilationResult)
DllCall(NumGet(NumGet(SpeechRecognitionCompilationResult+0)+6*A_PtrSize), "ptr", SpeechRecognitionCompilationResult, "uint*", status)   ; SpeechRecognitionCompilationResult.get_Status
if (status != 0)
{
   if (status = 1)
      msgbox SpeechRecognitionCompilation error`nA topic constraint was set for an unsupported language.
   else if (status = 2)
      msgbox SpeechRecognitionCompilation error`nThe language of the speech recognizer does not match the language of a grammar.
   else if (status = 3)
      msgbox SpeechRecognitionCompilation error`nA grammar failed to compile.
   else if (status = 4)
      msgbox SpeechRecognitionCompilation error`nAudio problems caused recognition to fail.
   else if (status = 5)
      msgbox SpeechRecognitionCompilation error`nUser canceled recognition session.
   else if (status = 6)
      msgbox SpeechRecognitionCompilation error`nAn unknown problem caused recognition or compilation to fail.
   else if (status = 7)
      msgbox SpeechRecognitionCompilation error`nA timeout due to extended silence or poor audio caused recognition to fail.
   else if (status = 8)
      msgbox SpeechRecognitionCompilation error`nAn extended pause, or excessive processing time, caused recognition to fail.
   else if (status = 9)
      msgbox SpeechRecognitionCompilation error`nNetwork problems caused recognition to fail.
   else if (status = 10)
      msgbox SpeechRecognitionCompilation error`nLack of a microphone caused recognition to fail.
   else
      msgbox SpeechRecognitionCompilation error`n status %status%
   exitapp
}
SpeechRecognizer2 := ComObjQuery(SpeechRecognizer, ISpeechRecognizer2 := "{63C9BAF1-91E3-4EA4-86A1-7C3867D084A6}")
DllCall(NumGet(NumGet(SpeechRecognizer2+0)+6*A_PtrSize), "ptr", SpeechRecognizer2, "ptr*", SpeechContinuousRecognitionSession)   ; SpeechRecognizer2.get_ContinuousRecognitionSession
DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+14*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr", SpeechContinuousRecognitionCompletedEvent := ISpeechContinuousRecognitionCompletedEvent_new(), "int64*", CompletedToken)   ; SpeechContinuousRecognitionSession.add_Completed
DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+16*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr", SpeechContinuousRecognitionResultGeneratedEvent := ISpeechContinuousRecognitionResultGeneratedEvent_new(), "int64*", ResultGeneratedToken)   ; SpeechContinuousRecognitionSession.add_ResultGenerated
hr := DllCall(NumGet(NumGet(SpeechContinuousRecognitionSession+0)+8*A_PtrSize), "ptr", SpeechContinuousRecognitionSession, "ptr*", AsyncAction, "uint")   ; SpeechContinuousRecognitionSession.StartAsync
if (hr != 0)
{
   if (hr = 0x80045509)
      msgbox Error. Turn on Online speech recognition
   else
      msgbox SpeechContinuousRecognitionSession.StartAsync error %hr%
   exitapp
}
WaitForAsync(AsyncAction)
Menu, Tray, Icon , %A_AhkPath%, 1, 1
restarting := ""
SetTimer, restart1, -1
return
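
A note on the recurring `DllCall(NumGet(NumGet(obj+0)+N*A_PtrSize), ...)` pattern in the script above: it calls method number N of the object's COM vtable. WinRT interfaces derive from IInspectable, so slots 0-2 hold the IUnknown methods and slots 3-5 the IInspectable methods, which is why the first method of each interface sits at index 6. Below is a minimal sketch of a wrapper; `VTable` is a hypothetical helper name that is not part of the original script, and the fragment assumes `SpeechRecognizer` has already been created as in the script.

```autohotkey
; Hypothetical helper (not in the original script): returns the address of
; method number `index` (0-based) in the vtable of the COM object at `ptr`.
VTable(ptr, index)
{
   return NumGet(NumGet(ptr+0), index*A_PtrSize)
}

; Same call as in the script (slot 8 = SpeechRecognizer.get_Timeouts),
; with the HRESULT captured so a failure does not pass silently.
hr := DllCall(VTable(SpeechRecognizer, 8), "ptr", SpeechRecognizer, "ptr*", SpeechRecognizerTimeouts, "uint")
if (hr != 0)
   msgbox % "get_Timeouts failed, HRESULT " Format("0x{:08X}", hr)
```

The script checks `hr` only for `SpeechRecognizerFactory.Create` and `StartAsync`; whether the other calls need the same check is a judgment call.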
Last edited by malcev on 13 Jan 2022, 10:23, edited 1 time in total.
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

Re: UWP continuous speech recognition API. Win10

21 Nov 2021, 19:13

Added some examples.
gregster
Posts: 9224
Joined: 30 Sep 2013, 06:48

Re: UWP continuous speech recognition API. Win10

09 Jan 2022, 12:11

Very cool. Thank you. That's interesting.

A few questions, if you don't mind: in examples 2 and 3, how can I tell whether listening has already timed out (which doesn't seem to take very long, perhaps 15 or 20 seconds of silence)?
And how could that duration be changed, or listening be restarted as soon as the timeout occurs? So far, my frankensteined attempts error out a bit too often.

The documentation on the Microsoft page seems quite vast - so any additional insights you can provide about usage options - and your code - would surely be valuable for the AHK community (incl. me ;) ). Thank you!
malcev
Posts: 1769
Joined: 12 Aug 2014, 12:37

Re: UWP continuous speech recognition API. Win10

13 Jan 2022, 10:30

You cannot set the duration of the timeout.
Also, this API is quite strange.
We need to reload SpeechRecognizer not only when it goes into the idle state, but also when we switch to a window of a different process.
I added a reload event, but did not test it much, so it may be buggy.
While SpeechRecognizer reloads, the tray icon turns red.
I also made the script reload when these errors occur:

Code: Select all

      else if (status = 6)   ;  An unknown problem caused recognition or compilation to fail.
      {
         ; msgbox An unknown problem caused recognition or compilation to fail.
         reload
      }
      else if (status = 7)   ; A timeout due to extended silence or poor audio caused recognition to fail.
      {
         ; msgbox A timeout due to extended silence or poor audio caused recognition to fail.
         reload
      }
      else if (status = 8)   ; An extended pause, or excessive processing time, caused recognition to fail.
      {
         ; msgbox An extended pause, or excessive processing time, caused recognition to fail.
         reload
      }
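
The same handling can be written more compactly. This is only a sketch: `StatusText` and `IsTransient` are hypothetical helper names that are not part of the script, and the status values and texts are copied from the script's msgbox chain.

```autohotkey
; Hypothetical helpers, not part of the original script.
; Maps a status value to the message text used in the script.
StatusText(status)
{
   static msgs := {1: "A topic constraint was set for an unsupported language."
      , 2: "The language of the speech recognizer does not match the language of a grammar."
      , 3: "A grammar failed to compile."
      , 4: "Audio problems caused recognition to fail."
      , 5: "User canceled recognition session."
      , 6: "An unknown problem caused recognition or compilation to fail."
      , 7: "A timeout due to extended silence or poor audio caused recognition to fail."
      , 8: "An extended pause, or excessive processing time, caused recognition to fail."
      , 9: "Network problems caused recognition to fail."
      , 10: "Lack of a microphone caused recognition to fail."}
   return msgs.HasKey(status) ? msgs[status] : "status " status
}

; Statuses 6, 7 and 8 are the ones handled with a reload above.
IsTransient(status)
{
   return (status >= 6 && status <= 8)
}

status := 7   ; demo value only
if (status != 0)
   msgbox % (IsTransient(status) ? "would reload: " : "fatal: ") . StatusText(status)
```

In the real handler, the `msgbox` in the transient branch would be replaced by `reload`.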
gregster
Posts: 9224
Joined: 30 Sep 2013, 06:48

Re: UWP continuous speech recognition API. Win10

14 Jan 2022, 19:49

Thanks a lot! I will have a look.

Yep, I also noticed that a reload/restart was necessary after switching to windows of a different process, and I tried to work around it with timed restarts all over the place - and also occasional reloads, especially in case of errors. I got it working okay-ish, at least in the short run. Your updated code is probably more coherent and efficient. I will report back, if I get any new insights.
william_ahk
Posts: 639
Joined: 03 Dec 2018, 20:02

Re: UWP continuous speech recognition API. Win10

09 Mar 2024, 08:56

Very nice! This is a lot more accurate than SAPI.SpInprocRecognizer :dance:
Thanks for implementing this! :thumbup:
peculiar_x
Posts: 22
Joined: 07 Dec 2022, 01:20

Re: UWP continuous speech recognition API. Win10

05 Sep 2024, 02:44

Has anybody ported this magical piece of script to v2?
