"Transform Unicode" fails with MS Word

 
Laszlo



Joined: 14 Feb 2005
Posts: 3877
Location: Pittsburgh

Posted: Thu Jan 24, 2008 12:11 am    Post subject: "Transform Unicode" fails with MS Word

"Transform, Unicode" fails in MS Word after save/restore ClipBoardAll

It could be another MS Word peculiarity... Run the following script (I tried under Vista-32):
Code:
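#u::                        ; Win-U: show the clipboard's Unicode text as UTF-8
   Transform UC, Unicode
   MsgBox %UC%
Return

#z::                        ; Win-Z: save the whole clipboard, then restore it
   ClipA := ClipBoardAll
   Sleep 50
   ClipBoard := ClipA
Return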

When I copy some Unicode text to the clipboard in Word 2007 (e.g. Ű, entered with Alt+NumPad 0368) and press Win-U, the message box shows the corresponding UTF-8 characters, as expected (Å°). This can be repeated. However, after saving and restoring ClipBoardAll with Win-Z, Transform Unicode fails: the message box shows only “?”. The original clipboard content is still there and can be pasted, but AHK can no longer transform it to UTF-8. The same happens with "FileAppend %ClipboardAll%, c:\clip" followed by "FileRead ClipBoard, *c c:\clip" in the Win-Z hotkey subroutine.

If you put some Unicode text into Notepad and select and copy it from there, Win-Z does not prevent AHK from transforming it to UTF-8. It looks like Word attaches some special clipboard formats that AHK cannot handle properly.
YMP



Joined: 23 Dec 2006
Posts: 263
Location: Russia

Posted: Thu Jan 24, 2008 6:43 pm

I could not reproduce the question marks in the message box, but the results really were different before and after #z. I tested on XP, copying Russian text from Avant Browser (I don't have Word).

Perhaps my observations can help in understanding the problem, perhaps not. This is what was on the clipboard before #z:
Quote:

DataObject
CF_TEXT
CF_UNICODETEXT
HTML Format
Rich Text Format
Ole Private Data
CF_LOCALE
CF_OEMTEXT

And after #z:
Quote:

DataObject
CF_TEXT
HTML Format
Rich Text Format
Ole Private Data
CF_LOCALE
CF_OEMTEXT
CF_UNICODETEXT

You see that CF_UNICODETEXT has moved beyond CF_LOCALE. I'm not 100% sure, but from my experiments with the clipboard, I think that text formats displayed below CF_LOCALE are not actually present there.

Win32 Programmer's Reference wrote:

The operating system performs implicit data format conversions between certain clipboard formats when an application calls the GetClipboardData function. For example, if the CF_OEMTEXT format is on the clipboard, a window can retrieve data in the CF_TEXT format. The format on the clipboard is converted to the requested format on demand.
<...>
If the operating system provides an automatic type conversion for a particular clipboard format, there is no advantage to placing the conversion format(s) on the clipboard.


I suspect that AutoHotkey saves only the first plain text format it finds on the clipboard and drops all the rest. When you copy from Notepad, CF_UNICODETEXT comes first, so all is OK later. With Avant Browser it's CF_TEXT. When that is later converted to Unicode, the code page used depends on CF_LOCALE, which in turn depends on the input language of the window you copied from. So you may not get back the Unicode you put in if you copied from a window switched to English and the copied characters are not in the 1252 charset.

That's how it goes on my system. If yours is not multilingual, then I don't know. However it seems that ClipboardAll is not literally All.
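
The format lists above can be reproduced from AutoHotkey itself. This is a minimal sketch that assumes nothing about the tool used for those lists: it walks the clipboard with the Win32 functions OpenClipboard, EnumClipboardFormats and GetClipboardFormatName through DllCall. The #f hotkey and the variable names are arbitrary choices. Predefined formats have no registered name, so they show up by number only: 1 is CF_TEXT, 7 is CF_OEMTEXT, 13 is CF_UNICODETEXT and 16 is CF_LOCALE.

```autohotkey
; Win-F: list the formats currently on the clipboard, in enumeration order.
#f::
   List =
   if (!DllCall("OpenClipboard", "UInt", 0))
   {
      MsgBox Could not open the clipboard.
      Return
   }
   Fmt = 0
   Loop
   {
      ; Passing the previous format ID returns the next one; 0 means no more formats.
      Fmt := DllCall("EnumClipboardFormats", "UInt", Fmt, "UInt")
      if (!Fmt)
         Break
      ; Registered formats (e.g. "HTML Format") have names; predefined ones do not.
      VarSetCapacity(Name, 512, 0)
      if (DllCall("GetClipboardFormatName", "UInt", Fmt, "Str", Name, "Int", 255))
         List .= Fmt . "`t" . Name . "`n"
      else
         List .= Fmt . "`t(predefined format)`n"
   }
   DllCall("CloseClipboard")
   MsgBox %List%
Return
```

Pressing Win-F once before and once after Win-Z should show whether the position of format 13 (CF_UNICODETEXT) changes relative to format 16 (CF_LOCALE).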
Laszlo



Joined: 14 Feb 2005
Posts: 3877
Location: Pittsburgh

Posted: Thu Jan 24, 2008 7:21 pm

Thx. This may very well be the reason. If it is, the problem should be easy to fix (if anyone were still working on AHK).
Chris
Site Admin


Joined: 02 Mar 2004
Posts: 10450

Posted: Sun Mar 02, 2008 11:34 pm

This probably has something to do with the fixes in the AutoHotkey code to work around ClipboardAll's problems with MS Word. I seem to remember that these fixes omit certain parts (formats) of the clipboard when MS Word or Excel is involved.

There might not be a solution to this other than to remove those fixes. However, it's probably best to leave them in because I seem to remember the behavior they fix was worse.
Laszlo



Joined: 14 Feb 2005
Posts: 3877
Location: Pittsburgh

Posted: Mon Mar 03, 2008 12:13 am

Thanks, Chris, for the reply. Unfortunately, this behavior prevents AHK scripts from implementing good Unicode keyboard managers. I know of no universal way to get the character at the caret other than selecting it and copying it to the clipboard. To be able to process it, we have to transform the Unicode character to UTF-8, which fails if we save the clipboard before use and restore it afterwards.

Hopefully, someone will write a workaround using Windows system calls.
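
Short of a system-call workaround, one limited option when only the text matters is to save and restore the clipboard through Transform itself instead of ClipboardAll. The sketch below assumes the ANSI AutoHotkey 1.0.x builds discussed in this thread and has not been tested against Word 2007. SavedUTF8 is a hypothetical variable name, and the hotkey would replace the Win-Z subroutine from the first post. It discards every non-text format (RTF, HTML, images), so it is not a general replacement for ClipboardAll.

```autohotkey
#z::
   ; Save the clipboard's Unicode text as a UTF-8 string (plain text only).
   Transform SavedUTF8, Unicode
   ; ... use the clipboard for something else here,
   ;     e.g. copy the character at the caret ...
   Sleep 50
   ; Put the saved text back on the clipboard as Unicode text.
   Transform Clipboard, Unicode, %SavedUTF8%
Return
```

Because the text goes back as Unicode text rather than through the saved ClipboardAll data, a later "Transform UC, Unicode" reads what was stored here and does not depend on which formats ClipboardAll kept.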