AutoHotkey Community


All times are UTC [ DST ]




Posted by TheLeO: August 8th, 2009, 1:48 pm

Joined: June 11th, 2005, 9:34 am
Posts: 264
Location: England ish
It's kinda weird, and I can't quite figure out why it's happening,

instead of the ßäöü characters I get

ö = Ã¶
ü = Ã¼
ß = ÃŸ
ä = Ã¤

Here are two examples.

This works great.
Code:
msg = In der Hölle!
msgbox, %msg%



This shows funny:
Code:
FileAppend, In der Hölle!, temp.txt
FileRead, umlaute, temp.txt
MsgBox %umlaute%



---------------------------
iTunesAutoLyrics.ahk
---------------------------
In der HÃ¶lle!!
---------------------------
OK
---------------------------

or in my script it's also often shown as "In der Hölle!"

a lot of the time I get stuff like this:
weiÃŸ
plÃ¶tzlich
HÃ¶lle!
zu mir hÃ¤lt
falschem StÃ¼ck



Any thoughts?
Help or any known workarounds would really be appreciated! :D




Here is the full script, by the way, which really shows the problem: in the message box that comes up, you get a lot of weird characters instead of the umlauts.


Code:


URLDownloadToFile,http://lyricwiki.org/Farin_Urlaub:Dusche, TEMPiTunesAutoLyricsCurrentsong.xml
Sleep 100
FileRead, RawXMLFileVAR, TEMPiTunesAutoLyricsCurrentsong.xml


Loop, parse, RawXMLFileVAR, `n, `r  ; Specifying `n prior to `r allows both Windows and Unix files to be parsed.
{
  FailedString = There is currently no text in this page
  IfInString, A_LoopField,%FailedString% 
    {
    qtt(":-(")
    Return
    }
  IfInString, A_LoopField,class`='lyricbox'  ;note I'm escaping the equal sign.
    RawLineWithSongLyric = %A_LoopField%

}


Needle = class`='lyricbox'

PositionOfNeedle := InStr(RawLineWithSongLyric, Needle)
FinalPositionOfNeedle := (PositionOfNeedle + 17)

StringTrimLeft, RawlineMinusStuffAtTheBeggining, RawLineWithSongLyric, %FinalPositionOfNeedle%

StringReplace, SongLyricFinal, RawlineMinusStuffAtTheBeggining, <br />, `n, All

msgbox, %SongLyricFinal%


_________________
::
I Have Spoken
::


Last edited by TheLeO on August 9th, 2009, 3:57 pm, edited 1 time in total.

Posted: August 8th, 2009, 2:07 pm

Joined: May 27th, 2007, 9:41 am
Posts: 4999
TheLeO wrote:
Code:
FileAppend, In der Hölle!, temp.txt
FileRead, umlaute, temp.txt
MsgBox %umlaute%
This also works great for me. Make sure the AHK script file itself is saved as ANSI, not UTF-8/Unicode. If you use Notepad++, for example, you can change the encoding via the Format menu :-)

_________________
AHK FAQ
TF : Text files & strings lib, TF Forum


Posted by Sean: August 8th, 2009, 3:20 pm

Joined: February 12th, 2007, 7:54 am
Posts: 2462
I suspect the opposite. Assuming you're in the German locale, I think the downloaded XML file is in UTF-8, not in the German ANSI code page. MsgBox cannot properly display text that is in UTF-8.
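
The effect described here can be reproduced outside AutoHotkey. Below is a minimal Python sketch (Python is used only because it makes the byte-level steps easy to check; the function name `misread_utf8_as_ansi` is made up for illustration). It assumes the active ANSI code page is windows-1252, as on English and Western European Windows, and decodes UTF-8 bytes with it:

```python
def misread_utf8_as_ansi(text, ansi_codepage="cp1252"):
    """Encode text as UTF-8, then wrongly decode the bytes with an ANSI code page."""
    utf8_bytes = text.encode("utf-8")
    # Each umlaut occupies two bytes in UTF-8; a single-byte code page
    # shows each of those bytes as a separate character.
    return utf8_bytes.decode(ansi_codepage)


if __name__ == "__main__":
    for word in ["Hölle", "weiß", "plötzlich", "hält", "Stück"]:
        print(word, "->", misread_utf8_as_ansi(word))
```

Every umlaut turns into a two-character sequence starting with Ã, which is the pattern reported in the first post.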


Posted: August 8th, 2009, 8:27 pm

Joined: October 7th, 2006, 8:45 am
Posts: 3330
Location: Simi Valley, CA
Coincidentally, I posted a function yesterday that can read plain text files in UTF-8 format. The link is here. Non-ANSI characters are converted into &#12345; format. You may be interested to know that ï»¿ is the BOM that denotes UTF-8 format in a text file.

[Edit:] I updated my 'FileReadU' function to have a 'manual override' for the file's encoding.
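
As a rough illustration of the BOM remark, here is a Python sketch. It is not the FileReadU function, whose internals are not shown in this thread; the helper names and the windows-1252 fallback are assumptions for the example, and replacing every non-ASCII character is a simplification of "non-ANSI". The three BOM bytes EF BB BF show up as ï»¿ when viewed in windows-1252, and a reader can test for them to detect UTF-8:

```python
import codecs


def decode_with_bom_check(raw, fallback="cp1252"):
    """Decode bytes as UTF-8 if they start with the UTF-8 BOM, else use the fallback."""
    if raw.startswith(codecs.BOM_UTF8):
        # Drop the 3-byte BOM (EF BB BF) before decoding the rest.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode(fallback)


def to_numeric_entities(text):
    """Replace every non-ASCII character with a decimal &#NNNN; reference."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


if __name__ == "__main__":
    print(codecs.BOM_UTF8.decode("cp1252"))  # how the BOM looks when misread
    sample = codecs.BOM_UTF8 + "In der Hölle!".encode("utf-8")
    print(to_numeric_entities(decode_with_bom_check(sample)))
```

Note that a file without a BOM can still be UTF-8, which is presumably why a manual override for the encoding is useful.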

_________________
Ternary (a ? b : c) guide     TSV Table Manipulation Library
Post code inside [code][/code] tags!


Posted by TheLeO: August 9th, 2009, 11:52 am

Joined: June 11th, 2005, 9:34 am
Posts: 264
Location: England ish
Sean wrote:
I suspect the opposite. Assuming you're in the German locale, I think the downloaded XML file is in UTF-8, not in the German ANSI code page. MsgBox cannot properly display text that is in UTF-8.



Thanks for the reply guys.

I think the MsgBox thing might be the issue. However, I can't "send" the variable correctly either, which is really restricting me.
I need to be able to send the content of the variable into a standard control somehow.

For example, try the script below; it demonstrates my point:

Open Notepad and press F9.
It will type the text incorrectly.

>>However<<
It uses the same variable to write into Lyrics.txt; if you open that, the text is displayed correctly.

E.g. the text that is sent looks like this (i.e. corrupted format):
"
Sie sollen brennenSie sollen brennenIn der HölleStirbStirb, Fernseher, stirbStirb
"

The text in the lyrics.txt looks like this: (format is intact)
"
Sie sollen brennen!
Sie sollen brennen!
In der Hölle!
Stirb!
Stirb, Fernseher, stirb!
Stirb!
"


I'm running an English (US) version of Windows 7 and I type my text using a UK keyboard (if that makes any difference?). The website the lyrics are downloaded from is in English as well, i.e.:
http://lyricwiki.org/Farin_Urlaub:Dusche



I tried FileReadU, but it didn't seem to make a difference: the text is read correctly, but it's not sent correctly.

This is a "bang-head-on-Wall" type situation for me, so close to the goal but I can't quite get there... ..>??< frustrating..

---edit
I also tried:

ControlSend, RichEdit20W1, %SongLyricFinal%, 88
into the iTunes control that holds the lyrics.

But I get a mess:
N der hLle
Tirb, FErnseher, stirb

Instead of:
In der Hölle!
Stirb, Fernseher, stirb!

:(
Open Notepad, then press F9. Then open Lyrics.txt in the script folder to compare.
Code:
return

f9::

;---------- download file
URLDownloadToFile,http://lyricwiki.org/Farin_Urlaub:Dusche, TEMPiTunesAutoLyricsCurrentsong.xml
Sleep 100

;----------- read xml file
FileRead, RawXMLFileVAR, TEMPiTunesAutoLyricsCurrentsong.xml

;RawXMLFileVAR := FileReadU("TEMPiTunesAutoLyricsCurrentsong.xml", ForceType="Auto: Unicode" )
;RawXMLFileVAR := FileReadU("TEMPiTunesAutoLyricsCurrentsong.xml", ForceType="Auto: UTF-8" )


;---------------- retrieve the line with the lyric
Loop, parse, RawXMLFileVAR, `n, `r  ; Specifying `n prior to `r allows both Windows and Unix files to be parsed.
{
  FailedString = There is currently no text in this page
  IfInString, A_LoopField,%FailedString% 
    {
    qtt(":-(")
    Return
    }
  IfInString, A_LoopField,class`='lyricbox'  ;note I'm escaping the equal sign.
    RawLineWithSongLyric = %A_LoopField%

}
;---------------------- format it correctly.
Needle = class`='lyricbox'
PositionOfNeedle := InStr(RawLineWithSongLyric, Needle)
FinalPositionOfNeedle := (PositionOfNeedle + 17)
StringTrimLeft, RawlineMinusStuffAtTheBeggining, RawLineWithSongLyric, %FinalPositionOfNeedle%
StringReplace, SongLyricFinal, RawlineMinusStuffAtTheBeggining, <br />, `n, All



;--------------->>>>>>>>>>>. THE OUT PUT PART<<<<<<<<<<<<<<<<<<<<<<,
FileAppend, %SongLyricFinal%, Lyrics.txt
Clipboard = %SongLyricFinal%
send, %SongLyricFinal%
;msgbox, %SongLyricFinal%

Posted by Sean: August 9th, 2009, 2:10 pm

Joined: February 12th, 2007, 7:54 am
Posts: 2462
Your post is confusing. So you're in an English locale, yet can still read German text? I think all of this essentially reduces to Unicode vs. ANSI. If you need to handle text in a locale-independent manner, everything has to be handled in Unicode. Unfortunately, AHK is not a Unicode app, so it is tied to the ANSI code page currently selected. You therefore cannot send arbitrary text with AHK's built-in Send functions, especially to a Unicode window. My suggestion in this case: first convert the text from UTF-8 to UTF-16, then send it, e.g. using the SendInput API with the KEYEVENTF_UNICODE flag.
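
The data preparation for that suggestion can be outlined as follows. This is a Python sketch only and does not call SendInput; the function names are made up for the example, while KEYEVENTF_KEYUP (0x0002) and KEYEVENTF_UNICODE (0x0004) are the Win32 flag values. Each UTF-16 code unit becomes one key-down and one key-up event whose scan-code field carries the code unit:

```python
import struct

KEYEVENTF_KEYUP = 0x0002    # Win32 flag: key is being released
KEYEVENTF_UNICODE = 0x0004  # Win32 flag: wScan holds a UTF-16 code unit


def utf8_to_utf16_units(utf8_bytes):
    """Convert UTF-8 bytes into a list of UTF-16 code units (integers)."""
    utf16 = utf8_bytes.decode("utf-8").encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(utf16) // 2), utf16))


def unicode_key_events(utf8_bytes):
    """Build (wScan, dwFlags) pairs: a press and a release per code unit."""
    events = []
    for unit in utf8_to_utf16_units(utf8_bytes):
        events.append((unit, KEYEVENTF_UNICODE))
        events.append((unit, KEYEVENTF_UNICODE | KEYEVENTF_KEYUP))
    return events


if __name__ == "__main__":
    for scan, flags in unicode_key_events("Hö".encode("utf-8")):
        print(hex(scan), hex(flags))
```

Characters outside the Basic Multilingual Plane yield two code units (a surrogate pair), so they produce four events rather than two.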


Posted by YMP: August 9th, 2009, 3:36 pm

Joined: December 23rd, 2006, 6:02 pm
Posts: 424
Location: Russia
Yes, the downloaded file is in UTF-8. I suggest converting its text to ANSI (windows-1252) after FileRead. At least, after I did that, the umlauts were sent to Notepad correctly.
Code:
f9::

;---------- download file
URLDownloadToFile,http://lyricwiki.org/Farin_Urlaub:Dusche, TEMPiTunesAutoLyricsCurrentsong.xml
Sleep 100

;----------- read xml file
FileRead, RawXMLFileVAR, TEMPiTunesAutoLyricsCurrentsong.xml

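;---------- convert from UTF-8 to ANSI ----------

RawLen := StrLen(RawXMLFileVAR)   ; length of the UTF-8 data in bytes
BufSize := (RawLen + 1) * 2       ; bytes for the UTF-16 copy: 2 per wide character
VarSetCapacity(Buf, BufSize, 0)

; UTF-8 (code page 65001) -> UTF-16; the buffer size is given in wide characters
DllCall("MultiByteToWideChar", "uint", 65001, "int", 0, "str", RawXMLFileVAR
                             , "int", -1, "uint", &Buf, "uint", RawLen + 1)
; UTF-16 -> windows-1252 (code page 1252), written back into the original variable
DllCall("WideCharToMultiByte", "uint", 1252, "int", 0, "uint", &Buf, "int", -1
                             , "str", RawXMLFileVAR, "uint", RawLen + 1
                             , "int", 0, "int", 0)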

;---------------- retrieve the line with the lyric
Loop, parse, RawXMLFileVAR, `n, `r  ; Specifying `n prior to `r allows both Windows and Unix files to be parsed.
{
  FailedString = There is currently no text in this page
  IfInString, A_LoopField,%FailedString% 
    {
    qtt(":-(")
    Return
    }
  IfInString, A_LoopField,class`='lyricbox'  ;note I'm escaping the equal sign.
    RawLineWithSongLyric = %A_LoopField%

}
;---------------------- format it correctly.
Needle = class`='lyricbox'
PositionOfNeedle := InStr(RawLineWithSongLyric, Needle)
FinalPositionOfNeedle := (PositionOfNeedle + 17)
StringTrimLeft, RawlineMinusStuffAtTheBeggining, RawLineWithSongLyric, %FinalPositionOfNeedle%
StringReplace, SongLyricFinal, RawlineMinusStuffAtTheBeggining, <br />, `n, All



;--------------->>>>>>>>>>>. THE OUT PUT PART<<<<<<<<<<<<<<<<<<<<<<,
FileAppend, %SongLyricFinal%, Lyrics.txt
Clipboard = %SongLyricFinal%
send, %SongLyricFinal%
;msgbox, %SongLyricFinal%
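
The two DllCall steps in the script above amount to a decode from UTF-8 followed by an encode to windows-1252. As a cross-check outside AutoHotkey, here is a small Python sketch of the same round trip. The function name and the choice of `errors="replace"` (which substitutes `?` for characters that windows-1252 cannot represent) are assumptions for the example; WideCharToMultiByte has its own default-character handling.

```python
def utf8_to_ansi(utf8_bytes, ansi_codepage="cp1252"):
    """Re-encode UTF-8 bytes into a single-byte ANSI code page."""
    text = utf8_bytes.decode("utf-8")  # comparable to MultiByteToWideChar(65001, ...)
    # Comparable to WideCharToMultiByte(1252, ...); unmappable characters become "?".
    return text.encode(ansi_codepage, errors="replace")


if __name__ == "__main__":
    downloaded = "In der Hölle!<br />Stirb, Fernseher, stirb!".encode("utf-8")
    ansi = utf8_to_ansi(downloaded)
    print(len(downloaded), len(ansi))  # the ANSI form is one byte shorter here
    print(ansi.decode("cp1252"))
```

Text containing characters outside windows-1252 (for example Cyrillic or CJK lyrics) would lose those characters in this conversion, which is the limitation of staying in ANSI.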


Last edited by YMP on August 12th, 2009, 4:50 pm, edited 1 time in total.

Posted by TheLeO: August 9th, 2009, 3:55 pm

Joined: June 11th, 2005, 9:34 am
Posts: 264
Location: England ish
YMP wrote:
Yes, the downloaded file is in UTF-8. I suggest converting its text to ANSI (windows-1252) after FileRead. At least, after I did that, the umlauts were sent to Notepad correctly.

Thank you soooo much. that did it!!!!!!

Posted by YMP: August 12th, 2009, 4:52 pm

Joined: December 23rd, 2006, 6:02 pm
Posts: 424
Location: Russia
I forgot that the size of the Unicode buffer should be specified in wide characters and not in bytes. Fixed that.
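
To make the distinction concrete, here is a small Python sketch (the function name is invented for the example). It compares the number of UTF-16 code units, which is the unit the last argument of MultiByteToWideChar is counted in, with the number of bytes those units occupy:

```python
def utf16_sizes(utf8_bytes):
    """Return (wide_chars, byte_count) for the UTF-16 form, both including the terminator."""
    utf16 = utf8_bytes.decode("utf-8").encode("utf-16-le")
    wide_chars = len(utf16) // 2 + 1  # +1 for the terminating null character
    byte_count = wide_chars * 2       # each UTF-16 code unit is 2 bytes
    return wide_chars, byte_count


if __name__ == "__main__":
    raw = "In der Hölle!".encode("utf-8")
    wide_chars, byte_count = utf16_sizes(raw)
    # Allocate byte_count bytes, but tell the API the buffer holds wide_chars characters.
    print(len(raw), wide_chars, byte_count)
    # The code-unit count never exceeds the UTF-8 byte count, so RawLen + 1 is enough.
    assert wide_chars <= len(raw) + 1
```

Passing the byte count where a character count is expected tells the API the buffer is twice as large as it really is.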

