Custom string sorter has performance issues. Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Custom string sorter has performance issues.

12 May 2023, 16:57

Today was apparently the day when I had enough of my computer telling me I can't spell, so I made a custom string sorter :P I made it in Python, since I'm not that good at ahk, and then attempted to port it. This is where it the troubles began. I finally got it to work, but it is not running smoothly. It's about three orders of magnitude slower than my Python code, so I suspect that something is getting lost in translation. The function:

Code: Select all

CustomSort(ListofStrings, MyOrder){
	; Create a copy of the list so it doesn't get edited
	ListofStringsCopy := ListofStrings.Clone()
	; Create a dictionary with scores for each character in MyOrder
	scores := {}
	For index, char in MyOrder
		scores[char] := index
	
	; Sort the array using bubble sort
	n := ListofStrings.Length()
	While (n > 0) {
		newn := 0
		Loop, % n-1 {
			i := A_Index
			s1 := ListofStringsCopy[i]
			s2 := ListofStringsCopy[i+1]
			len_s1 := StrLen(s1)
			len_s2 := StrLen(s2)
			len_min := len_s1 < len_s2 ? len_s1 : len_s2
			temp := 0
			Loop, % len_min {
				c1 := SubStr(s1, A_Index, 1)
				c2 := SubStr(s2, A_Index, 1)
				If (!scores.HasKey(c1) && !scores.HasKey(c2))
					Continue
				If (!scores.HasKey(c1)) {
					temp := 1
					Break
				}
				If (!scores.HasKey(c2)) {
					temp := -1
					Break
				}
				If (scores[c1] < scores[c2]) {
					temp := -1
					Break
				}
				If (scores[c1] > scores[c2]) {
					temp := 1
					Break
				}
			}
			If (!temp)
				temp := len_s1 - len_s2
				
			If (temp > 0) {
				; Swap items
				ListofStringsCopy[i] := s2
				ListofStringsCopy[i+1] := s1
				newn := i + 1
			}
		}
		n := newn
	}
Return ListofStringsCopy
}
It's just a function that sorts strings like in dictionaries/libraries (physical ones) and lets you decide which characters go in what order. Stupid example:

Code: Select all

MyOrder := ["@", "B", "C"]
ListofStrings := ["B@c", "Bb", "b@BC"]

CustomSort(ListofStrings, MyOrder)

>> ["b@BC", "B@c", "Bb"] 
This might seem useless if you are from an English speaking country, but I'm guessing that Swedish is not the only language that gets mangled when trying to sort strings, so there might be some interest in a function like this, if it were a bit more efficient. I don't actually need the function, since I have a Python version that takes thousands more words before getting sluggish, but I'm very curious as to what went wrong when trying to port it. I know bubble sort is not a very fast algorithm, but I gave up on trying to write an implementation of quicksort after a while. Btw, the Ahk version is not 3 orders of magnitude slower than the Python quicksort one. I made a somewhat fair comparison and timed the Python version also using bubble sort. I'm just curious to find out where I went wrong, but if you see any usefulness in the function, feel free to improve it.

If you are from an English speaking country and wanna relate to my dilemma, you can imagine most of your apps and algorithms telling you that "e" comes before "a" in the alphabet :headwall:
teadrinker
Posts: 4358
Joined: 29 Mar 2015, 09:41
Contact:

Re: Custom string sorter has performance issues.

12 May 2023, 23:26

Perhaps this approach will work faster:

Code: Select all

CustomSort(ListofStrings, MyOrder) {
    static order := ""
    if (order = "") {
        order := MyOrder
        stringToSort := ""
        for k, v in ListofStrings {
            stringToSort .= (stringToSort = "" ? "" : "`n") . v
        }
        Sort, stringToSort, F %A_ThisFunc%
        order := ""
        return StrSplit(stringToSort, "`n")
    }
    for k, v in ["ListofStrings", "MyOrder"] {
        for i, j in order {
            %v% := StrReplace(%v%, j, Chr(i))
        }
    }
    return ListofStrings > MyOrder ? 1 : ListofStrings < MyOrder ? -1 : 0
}
I think it could still be improved.
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

13 May 2023, 12:40

@teadrinker One order of magnitude down, two to go ;) Unfortunally I found a sorting error in both mine and your function when testing them now. Given the order and list below, both our functions will sort the list wrong.

Code: Select all

MyOrder := ["Å", "Ä", "Ö"]
ListofStrings := ["å", "Å", "ä", "Ä", "Ö", "ö"]
"ListofStrings" is already sorted, according to "MyOrder", but both gets messed up when going through our functions. My Python function did not have this issue, so I'll have to go over the logic of my sort again.
teadrinker
Posts: 4358
Joined: 29 Mar 2015, 09:41
Contact:

Re: Custom string sorter has performance issues.

13 May 2023, 15:08

Kotte wrote: "ListofStrings" is already sorted
Why then is å in front of Å, while Ö is in front of ö? What is the principle here?
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

13 May 2023, 15:41

@teadrinker Ah, maybe should have clarified that å, ä, ö are the lowercase version of Å, Ä, Ö respectively. So since å and Å is the same letter the order remains unchanged. The "ListofStrings := ["å", "Å", "ä", "Ä", "Ö", "ö"]" was an example of a input to the sorter that gets sorted wrong. So it is an example of an input that should be returned as is.
teadrinker
Posts: 4358
Joined: 29 Mar 2015, 09:41
Contact:

Re: Custom string sorter has performance issues.  Topic is solved

13 May 2023, 18:08

If MyOrder is always uppercase, this should work:

Code: Select all

CustomSort(ListofStrings, MyOrder) {
    static order := ""
    if (order = "") {
        order := MyOrder
        stringToSort := ""
        for k, v in ListofStrings {
            stringToSort .= (stringToSort = "" ? "" : "`n") . v
        }
        Sort, stringToSort, F %A_ThisFunc%
        order := ""
        return StrSplit(stringToSort, "`n")
    }
    for k, v in ["ListofStrings", "MyOrder"] {
        %v% := Format("{:U}", %v%)
        for i, j in order {
            %v% := StrReplace(%v%, j, Chr(i))
        }
    }
    return ListofStrings >= MyOrder ? 1 : -1
}
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

14 May 2023, 03:49

@teadrinker Ah, of course! I dropped the key sensitivity part when trying to implement quicksort and then forgot to take it back when going back to bubble sort. Thanks man!
teadrinker
Posts: 4358
Joined: 29 Mar 2015, 09:41
Contact:

Re: Custom string sorter has performance issues.

14 May 2023, 14:38

However, this fix slows down the script, as the test showed.
just me
Posts: 9511
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Custom string sorter has performance issues.

15 May 2023, 03:20

@Kotte,
... This might seem useless if you are from an English speaking country, but I'm guessing that Swedish is not the only language that gets mangled when trying to sort strings, ...
What about the CL option of the Sort command?
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

17 May 2023, 01:18

@just me Only works for English, as far as I can tell.
just me
Posts: 9511
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Custom string sorter has performance issues.

17 May 2023, 02:49

@Kotte,

it works for the 'current user's locale'. Did you choose English as your default locale?
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

17 May 2023, 06:27

@just me Yeah, changing locale changes the sorting, but doesn't solve the problem since that sorting is also wrong. It might have to do with the ASCII codes for the last three characters. The alphabetical order is "Å Ä Ö", but the order of their ASCII codes (DEC) are "Ä Å Ö", Å = 197, Ä = 196, Ö = 214.
just me
Posts: 9511
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Custom string sorter has performance issues.

17 May 2023, 07:02

@Kotte,
I thought that the locale also defines the sorting order. Using the German locale sorts the ä (228) directly after the a (97).
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

19 May 2023, 05:42

@just me Yeah, it seems to define a different sorting order, just not the right one. But I believe this is not really an issue in this case, since the idea is to sort a list based on the sorting order given as an input. If the sorting relied on locale, wouldn't that cause the function to return different things for different locales?
just me
Posts: 9511
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Custom string sorter has performance issues.

19 May 2023, 08:40

@Kotte,
If the sorting relied on locale, wouldn't that cause the function to return different things for different locales?
Yes, of course.
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

19 May 2023, 13:10

@just me Well that would kinda defeat the "custom" part of the function, which is its entire reason for existing.
just me
Posts: 9511
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Custom string sorter has performance issues.

20 May 2023, 04:42

@Kotte, you need the function if you want to sort the strings in an arbitrary order. Why do you need this?
Kotte
Posts: 46
Joined: 03 May 2021, 08:33

Re: Custom string sorter has performance issues.

21 May 2023, 07:36

@just me What drove me to writing it was that I was having trouble with Swedish characters getting sorted wrong in a Python GUI. I found a solution but remembered that I had had similar problems before and also have needed to sort strings containing non-standard character representations. Instead of solving the individual problem again I thought I'd just write a function that takes the sorting order as input. The Python function's overhead is small enough that it doesn't really affect the overall performance, for my usecases, but when I tried to implement it in AutoHotkey it got way slower and I got curious as to why.

Return to “Ask for Help (v1)”

Who is online

Users browsing this forum: Bing [Bot], CoffeeChaton, GEOVAN, LazyVovchik, peter_ahk, rubeusmalfoy, supplementfacts and 113 guests