Change Text-To-Speech-Voice (in Windows) Topic is solved

Tiramisu
Posts: 3
Joined: 20 Aug 2017, 22:33

Change Text-To-Speech-Voice (in Windows)

20 Aug 2017, 22:55

Does anybody know how to change the default Text-To-Speech voice in Windows 7/10 automatically with an AHK script? I have tried ControlSend, ControlClick and the Control commands on the Speech Properties window, but it does not react. It only reacts when I open the window manually and, for example, press the Up or Down key myself to change the voice in the dropdown list. None of my scripts can do the same.

Please help!
qwerty12
Posts: 468
Joined: 04 Mar 2016, 04:33

Re: Change Text-To-Speech-Voice (in Windows)  Topic is solved

21 Aug 2017, 13:43

You can try the following to change it without needing to manipulate the Speech control panel window itself:

Code: Select all

#NoEnv  ; Recommended for performance and compatibility with future AutoHotkey releases.

if (SUCCEEDED(SpGetCategoryFromId(SPCAT_VOICES := "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices", cpSpObjectTokenCategory)))
{
	hr := DllCall(NumGet(NumGet(cpSpObjectTokenCategory+0)+18*A_PtrSize), "Ptr", cpSpObjectTokenCategory, "Ptr", 0, "Ptr", 0, "Ptr*", cpSpEnumTokens)

	if (SUCCEEDED(hr))
	{
		hr := DllCall(NumGet(NumGet(cpSpEnumTokens+0)+8*A_PtrSize), "Ptr", cpSpEnumTokens, "UInt*", tokenCount)
		if (SUCCEEDED(hr))
		{
			voices := Object()
			Loop %tokenCount% {
				hr := DllCall(NumGet(NumGet(cpSpEnumTokens+0)+7*A_PtrSize), "Ptr", cpSpEnumTokens, "UInt", A_Index - 1, "Ptr*", pToken)
				if (FAILED(hr)) {
					MsgBox Bailing out
					ExitApp 1
				}
				hr := DllCall(NumGet(NumGet(pToken+0)+6*A_PtrSize), "Ptr", pToken, "Ptr", 0, "Ptr*", pszValue)
				if (FAILED(hr)) {
					MsgBox Bailing out
					ExitApp 2
				}
				hr := DllCall(NumGet(NumGet(pToken+0)+16*A_PtrSize), "Ptr", pToken, "Ptr*", pszCoMemTokenId)
				if (FAILED(hr)) {
					MsgBox Bailing out
					ExitApp 3
				}
				voices[StrGet(pszCoMemTokenId, "UTF-16")] := StrGet(pszValue, "UTF-16")
				DllCall("ole32\CoTaskMemFree", "Ptr", pszValue)
				DllCall("ole32\CoTaskMemFree", "Ptr", pszCoMemTokenId)
				ObjRelease(pToken)
			}
			prompt := "Pick a voice by its number:"
			for k, v in voices
				prompt .= "`r`n" . A_Index . ": " v
			InputBox, TheChosenOne,, %prompt%
			if (ErrorLevel == 0) {
				for k, v in voices {
					if (A_Index == TheChosenOne) {
						hr := DllCall(NumGet(NumGet(cpSpObjectTokenCategory+0)+19*A_PtrSize), "Ptr", cpSpObjectTokenCategory, "WStr", k)
						break
					}
				}
			}
		}
		ObjRelease(cpSpEnumTokens)
	}

	ObjRelease(cpSpObjectTokenCategory)
}

SpGetCategoryFromId(pszCategoryId, ByRef ppCategory, fCreateIfNotExist := False)
{
	static CLSID_SpObjectTokenCategory := "{A910187F-0C7A-45AC-92CC-59EDAFB77B53}"
		, ISpObjectTokenCategory := "{2D3D3845-39AF-4850-BBF9-40B49780011D}"

	hr := 0
	try {
		cpTokenCategory := ComObjCreate(CLSID_SpObjectTokenCategory, ISpObjectTokenCategory)
	} catch e {
		; No, A_LastError or ErrorLevel doesn't contain the error code on its own and I CBA to use CoCreateInstance directly
		if (RegExMatch(e.Message, "0[xX][0-9a-fA-F]+", errCode)) { ; https://stackoverflow.com/a/9221391
			hr := errCode + 0
			if (hr > 0x7FFFFFFF) ; AHK integers are 64-bit, so make this a negative 32-bit HRESULT that FAILED() recognises
				hr -= 0x100000000
		} else {
			hr := -2147467259 ; E_FAIL (0x80004005) as a signed 32-bit value
		}
	}

	if (SUCCEEDED(hr))
	{
		; slot 15 of the category object: point it at the given category ID
		hr := DllCall(NumGet(NumGet(cpTokenCategory+0)+15*A_PtrSize), "Ptr", cpTokenCategory, "WStr", pszCategoryId, "Int", fCreateIfNotExist)
	}

	if (SUCCEEDED(hr))
	{
		ppCategory := cpTokenCategory
	}
	else
	{
		if (cpTokenCategory)
			ObjRelease(cpTokenCategory)
	}

	return hr
}

; https://github.com/maul-esel/AHK-Util-Funcs/blob/master/FAILED.ahk
SUCCEEDED(hr)
{
	return hr != "" && hr >= 0x00
}

FAILED(hr)
{
	return hr == "" || hr < 0
}
Tiramisu

Re: Change Text-To-Speech-Voice (in Windows)

21 Aug 2017, 15:42

Hey, wow! Your script worked perfectly qwerty! It would be nice if I could fully understand your code some day, but as long as it works I will use it for ATC voices in Xplane 11. Thanks!
qwerty12

Re: Change Text-To-Speech-Voice (in Windows)

21 Aug 2017, 16:49

Tiramisu wrote: Hey, wow! Your script worked perfectly qwerty! It would be nice if I could fully understand your code some day, but as long as it works I will use it for ATC voices in Xplane 11. Thanks!
Hey,

Glad to hear it works :-) A lot of that relies on the official speech interface, which is provided over COM. To this day I don't know what COM actually is, so I'll leave you with a quote:
https://www.codeproject.com/Articles/633/Introduction-to-COM-What-It-Is-and-How-to-Use-It wrote: COM is, simply put, a method for sharing binary code across different applications and languages. This is unlike the C++ approach, which promotes reuse of source code. ATL is a perfect example of this. While source-level reuse works fine, it only works for C++. It also introduces the possibility of name collisions, not to mention bloat from having multiple copies of the code in your projects.
I don't know if any of this will actually help, but:

SpGetCategoryFromId is my attempt at a copy of a function found in the Windows SDK. It tries to create the COM object tasked with managing all these voices. Back at the top of the script, the first DllCall line asks said object to return another object - a token enumerator - that can be used to go through a list of "tokens" representing the available voices on your PC.
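As a side note on the NumGet(NumGet(obj+0)+N*A_PtrSize) pattern that every DllCall in the script uses: a COM interface pointer points at a pointer to a table of method addresses, so that expression fetches the address of method number N, counting from zero. A minimal sketch of a wrapper follows; the helper name vtable is an assumption for illustration and is not part of the script above.

```autohotkey
; Hypothetical helper (not in the original script): returns the address of
; the n-th method (zero-based) in the method table of the COM pointer ptr.
vtable(ptr, n)
{
	return NumGet(NumGet(ptr+0), n*A_PtrSize)
}

; With it, the script's first DllCall (slot 18, which returns the token
; enumerator) could be written as:
; hr := DllCall(vtable(cpSpObjectTokenCategory, 18), "Ptr", cpSpObjectTokenCategory, "Ptr", 0, "Ptr", 0, "Ptr*", cpSpEnumTokens)
```

The object pointer is still passed as the first DllCall argument, because COM methods expect it there.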

The next DllCall simply asks the token enumerator for the number of available tokens/voices. voices is an associative array (I'm sure there are more efficient options here, but whatever).

Inside the Loop, we ask the token enumerator object to return one token. With this token, I get the ID of the voice/token (used to store your chosen voice setting in the registry) and the descriptive name you see in the control panel. The descriptive name gets added to the voices array with the ID as the key.
voices[StrGet(pszCoMemTokenId, "UTF-16")] := StrGet(pszValue, "UTF-16") means new entry in the voices array with the ID as the key (the voices[StrGet(pszCoMemTokenId, "UTF-16")] part) and the description as the value (the := StrGet(pszValue, "UTF-16") part)
The CoTaskMemFree lines free the memory that COM allocated for the ID and description strings, and ObjRelease releases the token object itself. This is repeated once for each token, tokenCount times in total.
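For comparison, the same list of voices can probably be read with much less code through SAPI's scripting-friendly automation object. This is only a sketch under that assumption: it lists the installed voices and does not change the system default, which is what the DllCall route above is for. The variable names are illustrative.

```autohotkey
; Sketch only: list installed voices via the SAPI automation object.
; This reads the list; it does not set the system-wide default voice.
spVoice := ComObjCreate("SAPI.SpVoice")
tokens := spVoice.GetVoices()
list := ""
Loop % tokens.Count
	list .= A_Index . ": " . tokens.Item(A_Index - 1).GetDescription() . "`r`n"
MsgBox % list
```

The order of this list may differ from the order shown by the script above, since the script sorts voices by their ID through the associative array.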

Now, at this point, I don't know what voices you want so I ended up asking you with InputBox. For all the voices in said array, this loop gets the descriptive names of all of them, prepends each one with a number, and then tacks them onto the prompt shown by InputBox.

If you clicked OK, it goes through the voices array again until the index of an item matches the number you entered. If that is the case, it asks the token manager to set the new default voice from the ID that's part of the item.
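To check that the change took effect, one option is to read the stored ID back. The sketch below assumes the per-user default is kept in the DefaultTokenId value under HKEY_CURRENT_USER\SOFTWARE\Microsoft\Speech\Voices; that location is an assumption here, so verify it on your own system before relying on it.

```autohotkey
; Assumption: the per-user default voice ID lives in the DefaultTokenId value
; of HKEY_CURRENT_USER\SOFTWARE\Microsoft\Speech\Voices.
RegRead, defaultTokenId, HKEY_CURRENT_USER\SOFTWARE\Microsoft\Speech\Voices, DefaultTokenId
if (ErrorLevel)
	MsgBox Could not read the default voice ID from the registry.
else
	MsgBox Default voice ID:`n%defaultTokenId%
```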

This part is quite inefficient, but keeping the loop logic separate means it can quite easily be changed to fit your needs. For example, to swap between two voices every time the script is run (something I didn't do because I'm unsure of what's on your system and I thought a more generic script was the way to go), you could replace this section:

Code: Select all

			prompt := "Pick a voice by its number:"
			for k, v in voices
				prompt .= "`r`n" . A_Index . ": " v
			InputBox, TheChosenOne,, %prompt%
			if (ErrorLevel == 0) {
				for k, v in voices {
					if (A_Index == TheChosenOne) {
						hr := DllCall(NumGet(NumGet(cpSpObjectTokenCategory+0)+19*A_PtrSize), "Ptr", cpSpObjectTokenCategory, "WStr", k)
						break
					}
				}
			}
with something like this:

Code: Select all

			hr := DllCall(NumGet(NumGet(cpSpObjectTokenCategory+0)+20*A_PtrSize), "Ptr", cpSpObjectTokenCategory, "Ptr*", currVoicePtr) ; ask token manager for ID of current T2S voice
			if (SUCCEEDED(hr)) {
				currentVoiceID := StrGet(currVoicePtr, "UTF-16") ; get an actual string that's usable from within AHK
				DllCall("ole32\CoTaskMemFree", "Ptr", currVoicePtr) ; AHK sets aside its own memory for the resulting string so get COM to free the memory for its copy
				
				currentVoiceName := voices[currentVoiceID] ; look back at the voices array and get the descriptive name for the voice that is currently set
				if (currentVoiceName == "Microsoft Hazel Desktop - English (Great Britain)") ; simple swap
					newVoiceName := "Microsoft Zira Desktop - English (United States)"
				else if (currentVoiceName == "Microsoft Zira Desktop - English (United States)")
					newVoiceName :=  "Microsoft Hazel Desktop - English (Great Britain)"

				if (newVoiceName) { ; should be unset if the active voice isn't Zira or Hazel
					for k, v in voices { ; go through the array
						if (newVoiceName == v) { ; if the new voice name matches a name already present in the value part of the array *exactly* for each voice
							DllCall(NumGet(NumGet(cpSpObjectTokenCategory+0)+19*A_PtrSize), "Ptr", cpSpObjectTokenCategory, "WStr", k) ; then get its ID (the key part) and set it as the new voice
							break
						}
					}
				}
			}
(These are the only two voices I have installed on my system - change the Hazel and Zira part as necessary :-))

It could be sped up a bit (some things aren't efficient) but unless you have tons and tons of installed voices, you should be OK...

It's a bit off the usual AHK path, but I wouldn't have been able to write this script were it not for this tutorial: https://maul-esel.github.io/tutorials/C ... faces.html
Tiramisu

Re: Change Text-To-Speech-Voice (in Windows)

22 Aug 2017, 15:47

Ok, that tutorial makes it much clearer for me. Thx!
Btw. do you know any methods to change the pitch of the default speech voice (or at least its speed/volume)? E.g. the program eSpeak allows me to use some pitch settings like "8kHz 8 Bit Mono", which sounds a lot more like an ATC tone. Now unfortunately I was not able to use eSpeak for my ATC plugin in Xplane, but if I could synthesize the default text to speech voice directly with an AHK script, that would be just awesome.
qwerty12

Re: Change Text-To-Speech-Voice (in Windows)

22 Aug 2017, 17:35

I'm sorry, I haven't a clue. I don't use MS's (or anybody else's) text-to-speech. The only thing I could find on pitch suggests that it's up to your application to adjust it for itself only (if my quick reading of MSDN proved correct). Your question interested me because the only answers I could find on the Internet were about changing the voice for one text-to-speech session.
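For the per-application case described above, a minimal sketch might look like the following. It assumes the SAPI automation object (SAPI.SpVoice) with its Rate and Volume properties and SAPI's XML pitch tag; the values and the spoken sentence are placeholders. It only affects speech produced by this script, not other programs and not the system default.

```autohotkey
; Sketch only: these settings apply to this script's own SpVoice object,
; not to other applications and not to the system default voice.
spVoice := ComObjCreate("SAPI.SpVoice")
spVoice.Rate := 3      ; speaking rate, -10 (slowest) to 10 (fastest)
spVoice.Volume := 80   ; volume, 0 to 100
; Pitch is set with SAPI XML markup inside the text; flag 8 is SVSFIsXML.
spVoice.Speak("<pitch absmiddle=""-5"">Cleared for takeoff.</pitch>", 8)
```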
Guest

Re: Change Text-To-Speech-Voice (in Windows)

23 Aug 2017, 00:09

Ok, never mind! You already helped me a lot. I am using your script with Voice Attack and now I can talk to 3 different TTS voices in the ATC. It is really immersive. :thumbup:
