[SOLVED] How can a unicode ligature character be found and replaced with a script?
Re: How can a unicode ligature character be found and replaced with a script?
.
Last edited by Klarion on 20 Apr 2019, 00:50, edited 1 time in total.
Re: How can a unicode ligature character be found and replaced with a script?
Thanks for the help.
Win 10 Professional 64bit 21H2 16Gb Ram AHK current as of 2021-12-26 .
Re: How can a unicode ligature character be found and replaced with a script?
In case you run into such a problem once again:
Copy the character you want to replace into an AHK Unicode script and use a simple StrReplace():
Code:
#NoEnv
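; example: "ﬁne" (written with the "ﬁ" ligature, U+FB01)
SearchText := "ﬁ" ; the ligature character itself; save the script as UTF-8 with BOM
ReplaceText := "fi"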
Original =
(
Soon after receiving the instructions to advance, Taylor had
given notice of his orders to influential citizens of Matamoros
then at Corpus Christi, explaining that his march would be
entirely pacific, and that he expected the pending questions to
be settled by negotiation; and similar assurances were con-
veyed to the Mexican customhouse office at “ Brazos Santiago,”
near Point Isabel‘ March 8 a more formal announcement
appeared in General Orders No. 30. Taylor here expressed
the hope that his movement would be “beneficial to all con-
cerned,” insisted upon a scrupulous regard for the civil and
religious rights of the people, and commanded that everything
required for the use of the army should be paid for “at the
highest market price." These orders, which merely antici-
pated instructions then on their way from Washington, were
translated into Spanish, and placed in circulation along the
border.<ref>18</ref>
)
Replaced := StrReplace(Original, SearchText, ReplaceText, Counter)
MsgBox, 0, Replaced %Counter% occurrences of %SearchText%, %Replaced%
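If the pasted character does not survive saving the script (for example, when the file is not stored as UTF-8 with BOM), one variant is to build the search character from its code point instead. This is only a minimal sketch, assuming AHK v1.1 Unicode; the sample word and the variable names are made up for illustration:

```autohotkey
#NoEnv
Ligature := Chr(0xFB01)                  ; U+FB01, the "fi" ligature
Sample := "bene" . Ligature . "cial"     ; hypothetical sample text
; Count receives the number of replacements made
Cleaned := StrReplace(Sample, Ligature, "fi", Count)
MsgBox, % "Replaced " . Count . " occurrence(s): " . Cleaned
```

Because the script then contains only ASCII characters, the encoding of the .ahk file no longer matters for the search string.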
Re: How can a unicode ligature character be found and replaced with a script?
.
Last edited by Klarion on 20 Apr 2019, 00:54, edited 2 times in total.
Re: How can a unicode ligature character be found and replaced with a script?
.
Last edited by Klarion on 20 Apr 2019, 00:49, edited 1 time in total.
Re: How can a unicode ligature character be found and replaced with a script?
@just me Thanks for the advice. Please check my opening post in which my script is displayed. This is exactly what I did, and it doesn't work.
Perhaps it's my global settings at the top of the page that cause the problem, so I am pasting it here:
Code:
#SingleInstance force
#NoEnv
#Warn
SetWorkingDir, D:\ahk-functions-library
~LWin::vk07
~RWin::Return
~AppsKey::Return
#Include D:\ahk-functions-library\eng-pr.ahk
Re: How can a unicode ligature character be found and replaced with a script?
Many thanks again. Contrary to what you think, I have been analyzing your code step by step and looking up everything in the AHK Help. I am slow because my programming interest is focused on what I need for proofreading. I understood your concept, but the output is empty. I understood the concatenation and the range filter applied to each character, but nothing seems to be passed to the function because the (x) is empty.
Here is my latest effort with msgboxes placed to follow the values.
Code:
;test.ahk
;alt-u ligature "fi" or - chr(0xEF)chr(0xAC)chr(0x81) or {U+FB01}
!u::
autotrim, on
clipboard =
dirtyText =
cleanResult =
sendinput, ^a^c
clipwait, 0
dirtyText = %clipboard%
Msgbox, % dirtyText ;INPUT OK
clipwait, 0
Loop, Parse, % dirtyText
;msgbox, "loopfield", %A_LoopField% ; THIS IS OK
cleanResult . = A_LoopField ~ = "[\x{0000}-\x{007F}]" ? A_LoopField : NormalizationFormKC(A_LoopField)
msgbox, "cleanresult", %cleanResult% ;NOTHING
sendinput, %cleanResult%
return
NormalizationFormKC(x)
msgbox, 1, "x_value", %x%
{
a := StrLen(x) * 6
VarSetCapacity(b, a)
DllCall("Normaliz.dll\NormalizeString", "int", 5, "wstr", x, "int", StrLen(x), "ptr", &b, "int", a)
Return StrGet(&b, a, "UTF-16")
}
Re: How can a unicode ligature character be found and replaced with a script? Topic is solved
Using MsgBox to debug is useful, and you have pinpointed the problematic line.
cleanResult . = A_LoopField ~ = "[\x{0000}-\x{007F}]" ? A_LoopField : NormalizationFormKC(A_LoopField)
You have a syntax error there: you wrote . = and ~ = with a space inside each operator, which won't work. Remove the spaces so that they read
.= and ~=
Change that line and it should work (or at least that code should produce a result; whether it is the desired result is another question).
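For reference, a stripped-down version of the loop with both operators written without spaces could look like this. It is only a sketch, assuming AHK v1.1 Unicode: the "?" placeholder stands in for the NormalizationFormKC() call, and the sample string is made up.

```autohotkey
#NoEnv
dirtyText := "o" . Chr(0xFB03) . "ce"    ; hypothetical input containing U+FB03
cleanResult := ""
Loop, Parse, dirtyText
{
    ; .= appends to the variable, ~= is shorthand for RegExMatch();
    ; neither operator may contain a space
    cleanResult .= (A_LoopField ~= "[\x{0000}-\x{007F}]") ? A_LoopField : "?"
}
MsgBox, % cleanResult
```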
Re: How can a unicode ligature character be found and replaced with a script?
NormalizeString() should work with the whole string at once:
@Klarion:
Code:
#NoEnv
DirtyText =
(
Soon after receiving the instructions to advance, Taylor had
given notice of his orders to influential citizens of Matamoros
then at Corpus Christi, explaining that his march would be
entirely pacific, and that he expected the pending questions to
be settled by negotiation; and similar assurances were con-
veyed to the Mexican customhouse office at “ Brazos Santiago,”
near Point Isabel‘ March 8 a more formal announcement
appeared in General Orders No. 30. Taylor here expressed
the hope that his movement would be “beneficial to all con-
cerned,” insisted upon a scrupulous regard for the civil and
religious rights of the people, and commanded that everything
required for the use of the army should be paid for “at the
highest market price." These orders, which merely antici-
pated instructions then on their way from Washington, were
translated into Spanish, and placed in circulation along the
border.<ref>18</ref>
)
CleanResult := NormalizeFormKC(DirtyText)
MsgBox, 0, CleanResult, %CleanResult%
ExitApp
NormalizeFormKC(Str)
{
    ; docs.microsoft.com/en-us/windows/desktop/api/winnls/nf-winnls-normalizestring
    ; MsgBox, 1, Str Value, %Str%
    ; First call: 5 = NormalizationKC; with a null buffer it returns the estimated length in characters
    Length := DllCall("Normaliz.dll\NormalizeString", "Int", 5, "WStr", Str, "Int", -1, "Ptr", 0, "Int", 0, "Int")
    If (Length > 0)
    {
        VarSetCapacity(Out, Length * 2, 0) ; 2 bytes per UTF-16 code unit
        ; Second call: write the normalized string into Out
        DllCall("Normaliz.dll\NormalizeString", "Int", 5, "WStr", Str, "Int", -1, "Ptr", &Out, "Int", Length, "Int")
        Return StrGet(&Out, "UTF-16")
    }
}
ineuw wrote: Thanks for the advice. Please check my opening post in which my script is displayed. This is exactly what I did, and it doesn't work.
Do you run AHK Unicode? I copied the text from your post and StrReplace() is working for me. The character is U+FB01, which is the single code unit 0xFB01 in UTF-16; 0xEF 0xAC 0x81 is its UTF-8 byte sequence.
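To see the two encodings of the character side by side, something like the following can be used. It is a sketch, assuming AHK v1.1 Unicode (Ord() needs v1.1.21 or later); the variable names are illustrative only.

```autohotkey
#NoEnv
Char := Chr(0xFB01)
CodeUnit := Format("0x{:04X}", Ord(Char))      ; the UTF-16 code unit
VarSetCapacity(Buf, 8, 0)
; StrPut returns the number of bytes written, including the null terminator
ByteCount := StrPut(Char, &Buf, "UTF-8") - 1
Utf8 := ""
Loop, %ByteCount%
    Utf8 .= Format("{:02X} ", NumGet(Buf, A_Index - 1, "UChar"))
MsgBox, % "UTF-16: " . CodeUnit . "`nUTF-8: " . Utf8
```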
@Klarion:
It's just an option for people like you not willing to learn about C/C++ function calls. We, usually, call that kind of programming style as ReInventing the Wheels
Re: [SOLVED] How can a unicode ligature character be found and replaced with a script?
@ineuw
very nice
I am asking you "kindly" to delete all of my code from your computer and your head and never use it
I do not want you to use my code
If you want to do something, write your own code
Regards
Re: [SOLVED] How can a unicode ligature character be found and replaced with a script?
@Klarion, will do as you wish, but I have no clue what happened here. Before marking the topic solved, I posted a thank you note to everyone who helped but this note is missing. Can you please explain what got you so upset?
Re: [SOLVED] How can a unicode ligature character be found and replaced with a script?
Not sure if this might help others
Code:
SearchText := "\x{FB00}"
ReplaceText := "ff"
txtHold := RegExReplace(txtHold, SearchText, ReplaceText)
SearchText := "\x{FB01}"
ReplaceText := "fi"
txtHold := RegExReplace(txtHold, SearchText, ReplaceText)
SearchText := "\x{FB02}"
ReplaceText := "fl"
txtHold := RegExReplace(txtHold, SearchText, ReplaceText)
SearchText := "\x{FB03}"
ReplaceText := "ffi"
txtHold := RegExReplace(txtHold, SearchText, ReplaceText)
SearchText := "\x{FB04}"
ReplaceText := "ffl"
txtHold := RegExReplace(txtHold, SearchText, ReplaceText)
SearchText := "\x{FB05}"
ReplaceText := "st" ; U+FB05 is the long s + t ligature, not f + t
txtHold := RegExReplace(txtHold, SearchText, ReplaceText)
SearchText := "\x{FB06}"
ReplaceText := "st"
txtHold := RegExReplace(txtHold, SearchText, ReplaceText)
Re: [SOLVED] How can a unicode ligature character be found and replaced with a script?
Or more directly
Code:
;Replace ligatures
txtHold := RegExReplace(txtHold, "\x{FB00}", "ff")
txtHold := RegExReplace(txtHold, "\x{FB01}", "fi")
txtHold := RegExReplace(txtHold, "\x{FB02}", "fl")
txtHold := RegExReplace(txtHold, "\x{FB03}", "ffi")
txtHold := RegExReplace(txtHold, "\x{FB04}", "ffl")
txtHold := RegExReplace(txtHold, "\x{FB05}", "st") ; U+FB05 is the long s + t ligature, not f + t
txtHold := RegExReplace(txtHold, "\x{FB06}", "st")
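Since U+FB00 to U+FB06 are consecutive code points, the same replacements can also be driven from a list. This is a minimal sketch, assuming AHK v1.1 Unicode; the sample text is made up, and U+FB05 (long s + t) is mapped to "st" here, which is what NFKC normalization yields:

```autohotkey
#NoEnv
; Replacement for U+FB00, U+FB01, ... U+FB06, in code point order
Replacements := ["ff", "fi", "fl", "ffi", "ffl", "st", "st"]
txtHold := "o" . Chr(0xFB03) . "ce su" . Chr(0xFB00) . "ix"   ; hypothetical sample text
for Index, Replacement in Replacements
    txtHold := StrReplace(txtHold, Chr(0xFB00 + Index - 1), Replacement)
MsgBox, % txtHold
```

StrReplace() is used instead of RegExReplace() because each search term is a single literal character, so no pattern matching is needed.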
Re: [SOLVED] How can a unicode ligature character be found and replaced with a script?
Thanks for your solution, and happy holidays.