_wcsrev problem Topic is solved

Get help with using AutoHotkey (v2 or newer) and its commands and hotkeys
Descolada
Posts: 1202
Joined: 23 Dec 2021, 02:30

_wcsrev problem

10 Oct 2022, 13:02

Does somebody know why the following doesn't work?

Code:

MsgBox("Forward: " (str := "💩emoji"))
DllCall("msvcrt\_wcsrev", "str", str, "CDecl str")
MsgBox("Reversed: " str)
I was under the impression that the poop emoji is a valid UTF-16 character, so it should be supported...
The script is saved as UTF-8 with BOM.
iseahound
Posts: 1472
Joined: 13 Aug 2016, 21:04

Re: _wcsrev problem  Topic is solved

10 Oct 2022, 17:26

It is working correctly, as you can see when the reversed string is passed back in to get the original data.

The poop emoji (💩, U+1F4A9) uses more than one UTF-16 code unit. Each UTF-16 code unit is 2 bytes, and this character is encoded as a surrogate pair: 2 code units, or 4 bytes in total. A wstr is a Unicode string made up of 2-byte units. The algorithm performs a simple reversal of the individual units and does not take multi-unit characters into account. In this case you see two boxes because the reversal swaps the two halves of the pair, and each half on its own is not a valid character. That is the naïveté of the algorithm used.

---------------------------
💩
---------------------------
<U+1F4A9> PILE OF POO
---------------------------
OK
---------------------------
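
To look at the two code units directly, here is a minimal sketch, assuming AutoHotkey v2 on Windows. Chr(0x1F4A9) is used instead of a literal emoji so the script's file encoding does not matter, and the variable names are only for illustration.

```autohotkey
#Requires AutoHotkey v2.0

str := Chr(0x1F4A9)      ; PILE OF POO, built from its code point
p := StrPtr(str)         ; address of the UTF-16 data held by the variable

; StrLen counts UTF-16 code units, while Ord decodes the whole surrogate pair.
MsgBox(Format("StrLen: {1}`nCode point: U+{2:X}`nUnit 1: 0x{3:X}`nUnit 2: 0x{4:X}"
    , StrLen(str)             ; 2
    , Ord(str)                ; 1F4A9
    , NumGet(p, 0, "UShort")  ; D83D (high surrogate)
    , NumGet(p, 2, "UShort")))  ; DCA9 (low surrogate)
```

Both units fall in the surrogate range 0xD800-0xDFFF, which is why neither of them can be displayed on its own once _wcsrev has swapped them.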

Also, I don't know if that function is safe: does it return allocated memory which should be manually freed?
safetycar
Posts: 435
Joined: 12 Aug 2017, 04:27

Re: _wcsrev problem

11 Oct 2022, 08:33

Not my thread, but it made me think about it.
That reasoning makes sense, but is there an easy way around it? For example, StrLen(Str) reports a length of 7, and StrSplit breaks the emoji into 2 parts too, probably for the same reason. Is the workaround too complicated?
Descolada
Posts: 1202
Joined: 23 Dec 2021, 02:30

Re: _wcsrev problem

11 Oct 2022, 09:05

@iseahound, thanks for the explanation. The implementation of _wcsrev itself does indeed seem to be the problem.
About the safety of the function: why do you think it returns allocated memory? I thought it acts directly on the value pointed to by the pointer, and since AHK owns that value and knows about it, it doesn't need to be freed.

@safetycar, that is a good observation: StrLen("💩emoji") == 7. Though I don't get the behavior you mentioned with StrSplit: MsgBox(StrSplit("💩emoji", "mo")[1]) is displayed correctly for me. It does affect SubStr though: SubStr("💩", 2) == �. Damn Unicode :D
safetycar
Posts: 435
Joined: 12 Aug 2017, 04:27

Re: _wcsrev problem

11 Oct 2022, 09:12

Descolada wrote:
11 Oct 2022, 09:05
StrSplit: MsgBox(StrSplit("💩emoji", "mo")[1]) is displayed correctly for me
I meant splitting by chars like:

Code:

arr := StrSplit("💩")
MsgBox(arr[1] "`n" arr[2])
It's also curious that when the two parts are concatenated without the line break, they are displayed as a single icon again.
iseahound
Posts: 1472
Joined: 13 Aug 2016, 21:04

Re: _wcsrev problem

11 Oct 2022, 09:40

For StrLenUnicode see: viewtopic.php?p=106284#p106284

You'll have to dig deep into Unicode normalization. Lexikos mentions combining characters, and I haven't looked into it, but I imagine he means ŝ̷̨o̵̟͂m̴͎̈́e̷̢͆t̵͍̿h̶̡̉i̸̧̓n̵̬̈́g̸̢͗ l̸̢̂i̶̢̾k̴̗̍e̷̢͆ t̶̨̄h̶̬̾i̷̦̇s̶͓̀ or ก็็็็็็็็็็๊๊๊๊๊๊๊๊๊๊
safetycar
Posts: 435
Joined: 12 Aug 2017, 04:27

Re: _wcsrev problem

11 Oct 2022, 09:52

Thanks for the link. I see there that regex handles these things better.
This is able to recognize the emoji without breaking it: MsgBox RegExReplace("💩", "(.)", "<$1>")
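
Building on that observation, a reversal that keeps surrogate pairs intact could be sketched as follows. This is only a sketch, assuming AutoHotkey v2 and a well-formed input string. ReverseByCodePoint is a hypothetical helper name, not a built-in function.

```autohotkey
#Requires AutoHotkey v2.0

; Hypothetical helper: reverses a string one code point at a time
; instead of one UTF-16 code unit at a time (which is what _wcsrev does).
ReverseByCodePoint(str) {
    out := "", pos := 1
    ; The "s)" option lets "." match newline characters as well.
    ; Each match is one whole code point, so a surrogate pair stays together.
    while RegExMatch(str, "s).", &m, pos) {
        out := m[0] . out         ; prepend, which reverses the order
        pos := m.Pos + m.Len      ; continue right after this match
    }
    return out
}

; Chr(0x1F4A9) avoids depending on the script's file encoding.
MsgBox(ReverseByCodePoint(Chr(0x1F4A9) "emoji"))  ; ijome followed by the emoji
```

Note that this only handles surrogate pairs. Combining characters are separate code points, so they would still be detached from their base character by this approach.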
Helgef
Posts: 4709
Joined: 17 Jul 2016, 01:02

Re: _wcsrev problem

14 Oct 2022, 11:54

Descolada wrote:
11 Oct 2022, 09:05
I thought it acts directly on the value pointed to by the pointer, and since AHK owns that value and knows about it, it doesn't need to be freed.
This is correct.
DllCall("msvcrt\_wcsrev", "str", str, "CDecl str")
It is not necessary to specify str for the return value, [edit: see lexikos' correction below] in fact it is a bit wasteful since it will yield a copy of the string and then immediately discard it.

Cheers.
Last edited by Helgef on 15 Oct 2022, 11:18, edited 1 time in total.
lexikos
Posts: 9690
Joined: 30 Sep 2013, 04:07

Re: _wcsrev problem

15 Oct 2022, 03:31

Helgef wrote:
14 Oct 2022, 11:54
It is not necessary to specify str for the return value, in fact it is a bit wasteful since it will yield a copy of the string and then immediately discard it.
DllCall doesn't copy the string or allocate any memory; it just returns the pointer. Because the call is the last operation of an expression statement (i.e. the value isn't going to be used), no memory is allocated and the string isn't copied.

If there was more to the expression, the string would be copied into temporary memory, since the expression evaluator must assume that the original string could be freed or overwritten as a side-effect of subsequent operations.
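
The two situations can be sketched like this, assuming AutoHotkey v2 and the same msvcrt call used earlier in the thread. The variable names are only for illustration.

```autohotkey
#Requires AutoHotkey v2.0

str := "abc"

; Return type omitted: only the in-place change to str is of interest.
DllCall("msvcrt\_wcsrev", "Str", str, "Cdecl")
MsgBox(str)   ; cba

; Here the "Str" return value is actually used, so the returned string
; is copied into the variable copy. The call also reverses str back.
copy := DllCall("msvcrt\_wcsrev", "Str", str, "Cdecl Str")
MsgBox(str " / " copy)   ; abc / abc
```

Since _wcsrev works in place on the buffer it is given, nothing here needs to be freed manually.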
Helgef
Posts: 4709
Joined: 17 Jul 2016, 01:02
Contact:

Re: _wcsrev problem

15 Oct 2022, 11:17

Very good, my mistake :thumbup:

Cheers.
