
map() is so slow?

Posted: 17 Feb 2024, 11:12
by cgx5871
Map is very slow.
I read a 5000-line text file.
Assigning each line to a Map takes about 2000 ms.
Assigning to a plain variable takes only about 100 ms.

Code: Select all

Contents := FileRead(File,"UTF-8")
mp:=Map()
Loop Parse,Contents,'`n', "`r"
{
	s := StrSplit(A_LoopField, '`t', " `t")
	if s.Length>1
	{
		s[1]:=A_LoopField			;  100ms
		; mp[s[1]]:=A_LoopField		;~ 2000ms+
	}
}

Re: map() is so slow?

Posted: 17 Feb 2024, 12:09
by Rohwedder
Hello,
I don't see any real difference here when I hold down Hotkey q:

Code: Select all

#Requires AutoHotkey v2.0
DllCall("QueryPerformanceFrequency","Int64*", &Freq:=0)
q::
{
Static Time:=0, Count:=0, mp:=Map()
Contents := FileRead("D:\zusätzliches\Adresse.txt","UTF-8")
DllCall("QueryPerformanceCounter","Int64*", &Count1:=0)
; Time duration measurement of this:
Loop Parse,Contents,'`n', "`r"
{
	s := StrSplit(A_LoopField, '`t', " `t")
	if s.Length>1
	{
		; s[1]:=A_LoopField			;  121?? µs
		mp[s[1]]:=A_LoopField		;  116?? µs
	}
}
; Multiple execution increases accuracy by means of averaging!
DllCall("QueryPerformanceCounter","Int64*", &Count2:=0)
ToolTip "Time: " Round((Time+=(Count2-Count1)/Freq*1.E6)/++Count) " µs"
}

Re: map() is so slow?

Posted: 17 Feb 2024, 12:31
by mikeyww

Code: Select all

#Requires AutoHotkey v2.0
filePath := 'test3.txt'
mp       := Map()
start    := A_TickCount
FileEncoding 'UTF-8'
Loop Read filePath {
 s := StrSplit(A_LoopReadLine, '`t', ' `t')
 If s.Length > 1
  mp[s[1]] := A_LoopReadLine
}
MsgBox A_TickCount - start ' ms', 'Elapsed time (N=' mp.Count ')', 'Iconi'
[Attachment: image240217-1231-001.png (output screenshot)]

Re: map() is so slow?

Posted: 17 Feb 2024, 12:37
by cgx5871
@mikeyww
I was wrong: the file has 80000 lines, not 5000.
File size = 1.25 MB
[Attachment: image.png]

Re: map() is so slow?

Posted: 17 Feb 2024, 12:40
by mikeyww
281 ms here. If your CPU is busy doing other things, your mileage may vary.

Re: map() is so slow?

Posted: 17 Feb 2024, 12:47
by geek
This is what my HashMap library is for: viewtopic.php?f=83&t=124727

With Map replaced by HashMap, the script completes on my system in 200 ms instead of 2000 ms, against the same file (downloaded from https://github.com/KyleBing/rime-wubi86-jidian). It is even faster if I avoid using StrSplit:

Code: Select all

#Requires AutoHotkey v2.0
#Include "C:\Users\User\git\HashMap.ahk\Dist\HashMap.ahk"
filePath := "C:\Users\User\Downloads\rime-wubi86-jidian-master\wubi86_jidian.dict.yaml"
mp       := HashMap()
start    := A_TickCount
FileEncoding 'UTF-8'
Loop Read filePath {
	if p := InStr(A_LoopReadLine, "`t")
		mp[SubStr(A_LoopReadLine, 1, p - 1)] := A_LoopReadLine
}
MsgBox A_TickCount - start ' ms', 'Elapsed time (N=' mp.Count ')', 'Iconi'

Re: map() is so slow?

Posted: 17 Feb 2024, 12:53
by cgx5871
@mikeyww

Code: Select all

  mp["abc"] := A_LoopReadLine
I found the reason.
When I use "abc" as the key, the elapsed time is 140 ms.
The slowdown happens because s[1] is a Chinese character.
Why is Map so slow for Chinese characters, even when the key is only a single character?
I don't quite understand.
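
A rough way to check whether the key content matters is to time the same number of insertions with ASCII keys and with single-character CJK keys. The sketch below is only an illustration under assumptions: TimeInserts is a hypothetical helper, N = 20000 and both key patterns are made up for the test, and the timings will differ between systems.

Code: Select all

#Requires AutoHotkey v2.0
N := 20000   ; number of distinct keys (assumed); keeps U+4E00 + i inside the CJK block

asciiMs := TimeInserts(i => "k" i)            ; ASCII keys such as "k123"
cjkMs   := TimeInserts(i => Chr(0x4E00 + i))  ; single-character CJK keys
MsgBox "ASCII keys: " asciiMs " ms`nCJK keys: " cjkMs " ms"

; Hypothetical helper: fills a fresh Map with N keys built by makeKey
; and returns the elapsed time in milliseconds.
TimeInserts(makeKey) {
	mp := Map()
	start := A_TickCount
	Loop N
		mp[makeKey(Mod(A_Index * 7919, N))] := A_Index   ; 7919 is coprime to N, so keys are distinct and arrive unsorted
	return A_TickCount - start
}

If both timings come out similar, the difference seen with the real file more likely comes from the number or order of the keys than from the characters themselves.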

Re: map() is so slow?

Posted: 17 Feb 2024, 13:01
by mikeyww
I added a Chinese character to each line, with no change in results.

Re: map() is so slow?

Posted: 17 Feb 2024, 13:14
by cgx5871
@mikeyww
@geek
I uploaded the file so you can test it.
It shouldn't be a problem with my computer.

Code: Select all

#Requires AutoHotkey v2.0
filePath := 'C:\Users\Enigma\AppData\Roaming\Rime\Dict\wubi86_jidian.dict.yaml'
mp       := Map()
start    := A_TickCount
FileEncoding 'UTF-8'
Loop Read filePath {
 s := StrSplit(A_LoopReadLine, '`t', ' `t')
 If s.Length > 1
  mp[s[1]] := A_LoopReadLine
}
MsgBox A_TickCount - start ' ms', 'Elapsed time (N=' mp.Count ')', 'Iconi'

Re: map() is so slow?

Posted: 17 Feb 2024, 15:00
by flyingDman
I would not use Loop Read for that.

Re: map() is so slow?  Topic is solved

Posted: 17 Feb 2024, 15:11
by iseahound
I tested this using FileOpen and got exactly the same results. Unfortunately, it's a trade-off between the O(log n) access of a map and the linear access of an array. If you want to do a lookup like aaaa, you could use the file pointer and a binary search, as I do here:

Code: Select all

UnicodeData("🍅")

UnicodeData(s, filepath := "UnicodeData.txt", show := True) {
   if (s == "")
      return

   ; Download UnicodeData from backup sites if needed.
   if not FileExist(filepath)
      try Download "https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt", filepath
      catch
         try Download "http://www.unicode.org/Public/UNIDATA/UnicodeData.txt", filepath
         catch
            Download "https://raw.githubusercontent.com/latex3/unicode-data/main/UnicodeData.txt", filepath

   ; Open UnicodeData.txt
   database := FileOpen(filepath, "r`n", "UTF-8")

   ; Binary Search.
   l := 0                                                    ; lower bound
   h := database.length                                      ; higher bound
   while (n := (l + h) / 2, n != l && n != h) {              ; Allow 0.5 so it breaks after this loop.
      database.Seek(n)                                       ; Move file pointer to middle of file.
      (database.Pos != 0) && database.ReadLine()             ; Ensure a full line can be read below.
      row := database.ReadLine()                             ; Read a full line of text.
      codepoint := RegExReplace(row, "^(.*?);.*$", "0x$1")   ; Extract and convert the unicode hex to decimal.

      ; Limit min or max bound of binary search.
      if (Ord(s) < codepoint)
         h := Floor(n)                                       ; Converges h == l
      else if (Ord(s) > codepoint)
         l := Ceil(n)                                        ; Converges h == l
      else
         break
   }

   database.Close()
   r := StrSplit(row, ";")                                   ; Split row into an array.
   desc := (r[2] = "<control>") ? r[11] : r[2]               ; Retrieve alternate description if control character.

   if not show
      return Format("<U+{:X}> " desc, Ord(s))

   MsgBox Format(" <U+{:X}> " desc, Ord(s)), " " Chr(Ord(s)) ; Show a Message Box.
}
Actually, a Map can't handle repeated keys, so I assume you're doing a lookup like aaaa where you can take advantage of the fact that the data comes pre-sorted already! :)
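
The same idea also works in memory: if the lines are already sorted by key, a binary search over an array finds a row without building a Map at all. This is a minimal sketch, not taken from the thread. FindRow is a hypothetical helper, the sample rows are invented, and it assumes every row has the form key, tab, value and that the rows are sorted by key in the same case-sensitive order that StrCompare uses.

Code: Select all

#Requires AutoHotkey v2.0
; Invented sample data in "key<TAB>value" form, already sorted by key.
rows := ["aaaa`tfirst", "bbbb`tsecond", "cccc`tthird"]
MsgBox FindRow(rows, "bbbb")

; Hypothetical helper: returns the row whose key equals key, or "" if none matches.
; Assumes every row contains a tab and that rows is sorted by key.
FindRow(rows, key) {
	lo := 1, hi := rows.Length
	while lo <= hi {
		mid := (lo + hi) // 2                                  ; middle index of the remaining range
		row := rows[mid]
		cmp := StrCompare(SubStr(row, 1, InStr(row, "`t") - 1), key, true)
		if cmp = 0
			return row
		if cmp < 0
			lo := mid + 1                                      ; key sorts after this row
		else
			hi := mid - 1                                      ; key sorts before this row
	}
	return ""
}

Each lookup costs O(log n) comparisons and nothing is inserted, so repeated keys in the file are not a problem. The search simply returns one of the matching rows.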