Dead hotstring locator
Posted: 04 Apr 2023, 14:10
I have a personal version of AutoCorrect.ahk that I've customized over the years. I wanted a way to locate duplicate hotstring triggers, and also to locate short no-end-char triggers (the :*: kind) that are prefixes of longer triggers and so prevent the longer ones from ever being usable. For example, given ::ccc::Biz and :*:cc::Wap,
you can never expand the Biz one, because as soon as you press the second "c", Wap expands. And, of course, when there are duplicates, only the first one is seen.
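The blocking rule can be sketched as a plain prefix test. This is only an illustration for AHK v1: IsBlockedBy and its parameter names are made up here and are not part of the script, and it only looks at the * option.

```autohotkey
; Hypothetical helper (not part of the original script).
; Returns 1 if a hotstring with options shortOpts and trigger shortTrig
; fires before longTrig can be typed in full, otherwise 0.
IsBlockedBy(longTrig, shortTrig, shortOpts)
{
    if (!InStr(shortOpts, "*"))                 ; Needs an end char, so it can't fire mid-word.
        return false
    if (StrLen(shortTrig) >= StrLen(longTrig))  ; Same length would be a duplicate, not a block.
        return false
    return (SubStr(longTrig, 1, StrLen(shortTrig)) = shortTrig) ; Case-insensitive prefix test.
}

MsgBox, % IsBlockedBy("ccc", "cc", ":*:")   ; shows 1
MsgBox, % IsBlockedBy("ccc", "cc", "::")    ; shows 0
```

Using SubStr instead of a regex means characters in the trigger that happen to be regex metacharacters need no escaping.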
Running the script below as written reports the duplicate and unreachable pairs in its built-in sample list.
It's worth noting that AHK can have "context specific" hotstrings, where the same trigger is defined more than once under different #IfWin criteria. See
https://www.autohotkey.com/docs/v1/Hotstrings.htm#variant
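For reference, a context-specific pair looks roughly like this (the example is the one from the linked v1 docs page; both lines use the trigger btw, but they are variants, not true duplicates):

```autohotkey
#IfWinActive ahk_class Notepad
::btw::This replacement text will appear only in Notepad.
#IfWinActive
::btw::This replacement text appears in windows other than Notepad.
```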
The script can't differentiate those, and will flag them as dupes. It does ignore ;commented-out lines, but only per line. It can't ignore hotstrings inside /* block comments */.
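One possible way to skip block comments is to track an "inside a block" flag while walking the lines. This is just a sketch, not something the script does: the sample string and the names inBlock and kept are made up, and it assumes /* and */ each start a line.

```autohotkey
; Hypothetical sample; line 3 sits inside a block comment.
sample := "::aaa::one`n/*`n::aaa::two`n*/`n::bbb::three"

inBlock := false
kept := ""
Loop, Parse, sample, `n, `r
{
    line := LTrim(A_LoopField)               ; Tolerate leading spaces/tabs.
    if (inBlock)
    {
        if (SubStr(line, 1, 2) = "*/")       ; Block comment ends on this line.
            inBlock := false
        continue
    }
    if (SubStr(line, 1, 2) = "/*")           ; Block comment starts on this line.
    {
        inBlock := true
        continue
    }
    if (SubStr(line, 1, 1) = ":")            ; Same "starts with colon" test as the script.
        kept .= A_Index . "`t" . line . "`n"
}
MsgBox, % kept                               ; Lists lines 1 and 5 only.
```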
I processed AutoCorrect.ahk (the original one), in case anyone is curious.
FYI: before processing it, I commented out the ambiguous items line by line (with a leading ;).
Edit:
-It doesn't work well with triplicates. Those get flagged as three different duplicate pairs. Might try to fix that in the future.
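One idea for the triplicate problem would be to group line numbers by trigger in an associative array, so each trigger is reported once no matter how often it repeats. The following is only a sketch under assumptions: the sample string, the variable names, and the cut-down regex are made up for illustration, and it is not how the script currently works.

```autohotkey
; Hypothetical sample list with a triplicate trigger.
sample := "::aaa::one`n::bbb::two`n:*:aaa::three`n::aaa::four"

seen := {}                                   ; "=" . trigger -> line numbers seen so far
Loop, Parse, sample, `n, `r
{
    ; m1 receives the trigger; the options part is matched but not captured.
    if (!RegExMatch(A_LoopField, "^:(?:\*|\?|\w)*:(.+?)::", m))
        continue
    key := "=" . m1                          ; Prefix avoids clashes with names such as "base".
    if (ObjHasKey(seen, key))
        seen[key] .= ", " . A_Index
    else
        seen[key] := A_Index
}

report := ""
for key, lineList in seen
{
    if (InStr(lineList, ","))                ; More than one line used this trigger.
        report .= SubStr(key, 2) . "`tlines " . lineList . "`n"
}
MsgBox, % report                             ; aaa is listed once, with lines 1, 3, 4.
```

Because each line is visited once, this also avoids comparing every line against every other line. Object keys in v1 are case-insensitive, which matches the case-insensitive = comparison the script uses.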
Edit 4-5-2023: I see an error in the autocorrect "unreachable" list: :*:lsit::list (line 264) was reported as blocking ::realsitic::realistic (line 3818).
The top one doesn't make the bottom one unreachable, because "lsit" is not at the start of "realsitic". If it were a "middle-of-word" replacement (:*?:) it would make it redundant though. This is now fixed. (It was two of the regexes in the inner loop.)
Maybe will add a third check for redundant items...
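A third check could tell the two cases apart by where the short trigger sits inside the long one. A rough sketch follows; Classify and its return strings are invented for illustration, and it only looks at the * and ? options.

```autohotkey
; Hypothetical helper: how does hotstring A (optsA, trigA) affect the longer trigger trigB?
Classify(optsA, trigA, trigB)
{
    if (!InStr(optsA, "*") || StrLen(trigA) >= StrLen(trigB))
        return "none"
    pos := InStr(trigB, trigA)               ; 0 if absent; case-insensitive by default.
    if (pos = 1)
        return "unreachable"                 ; A fires before B can be finished.
    if (pos > 1 && InStr(optsA, "?"))
        return "redundant"                   ; A already fires in the middle of the word.
    return "none"
}

MsgBox, % Classify(":*:", "lsit", "realsitic")    ; none
MsgBox, % Classify(":*?:", "lsit", "realsitic")   ; redundant
MsgBox, % Classify(":*:", "cc", "ccc")            ; unreachable
```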
Edit 2 on 4-5-2023: I rearranged the logic in ver 4-5-2023-b, circumventing the need for a couple of the regexes. By doing this I was able to shave a minute off the time needed to process Autocorrect.ahk (from 3.15 minutes to 2.14). Also, I simplified the regex pattern. It occurred to me that I don't need to ensure that the hotstring options are valid... I just need to differentiate them from the rest of the hotstring.
-Side note: With the new logic, I'm assuming that ANY line starting with a colon is a hotstring (or at least a hotstring trigger). Hopefully this is a valid assumption(?)
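To show what the pattern pulls out of a line, here is a small check against one made-up hotstring line. The regex is the one from ver 4-5-2023-c; the sample line and the output variable H are just for illustration.

```autohotkey
regex := "(:(?:\*|\?|\w)*:)(..*)::(.*?)(?=\s;|$)\s*(;.*)?"
line := ":*B0:gg::Bar `;note"    ; The backtick keeps " ;" from starting a comment in this script.

if (RegExMatch(line, regex, H))
    MsgBox, % "options:`t" . H1 . "`ntrigger:`t" . H2 . "`nexpansion:`t" . H3 . "`ncomment:`t" . H4
; Expected groups: H1 = :*B0:   H2 = gg   H3 = Bar   H4 = ;note
```

The options group accepts any mix of *, ? and word characters between the first two colons, so it separates the options from the trigger without checking that they are valid.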
Edit 3 on 4-5-2023: Used the regex from here viewtopic.php?f=76&t=115786&p=516098#p516098 in
ver 4-5-2023-c. It sped things up a bit.
Mildly interesting notes: My timings for processing Autocorrect.ahk:
The original 2007 Autocorrect.ahk has 5295 lines (though not all are hotstrings).
-The first version of this script took 3.15 minutes.
-Changing the logic so If InStr circumvented non-hotstring lines: 2.14 mins.
-Updated regex so that unneeded groups were no longer captured: 1.81 mins.
-Added "S)" regex option (studies the pattern to try to improve its performance). It wasn't able to improve this particular regex though: 1.82 mins.
-Ran it again the same way: 1.82 mins again.
-Took the "S)" back off: 1.75 this time.
The example pair:
::ccc::Biz ;unreachable
:*:cc::Wap ; makes above unreachable
The script:
#SingleInstance, Force
; A script to find dead hotstrings. Ver 4-5-2023-c.
; for AHK v1
varMain =
(
:T:aaa::blah
::bbb::blah
::ccc::Biz ;unreachable
::aaa::blua
:*:cc::Wap; makes above unreachable
::ddd::bluab
::eee::Boo
::ccc::Biz
::ccc::Biz
::fff::blah
::ggg::Foo ;unreachable
::eee::Bang
:*B0:gg::Bar; makes above unreachable
)
MainList := "the built-in sample list" ; Name shown in the results box.
;~ MainList := "Autocorrect.ahk" ; Must be in same folder as this script.
;~ FileRead, varMain, %MainList% ; load entire file contents into variable.
regex := "(:(?:\*|\?|\w)*:)(..*)::(.*?)(?=\s;|$)\s*(;.*)?" ; swagfag-based
;~ OutArrays
;~ 1 = :options:
;~ 2 = trigger
;~ 3 = expansion when present
;~ 4 = comment when present
duplicates := "" ; Checking for trigger string, not expansion text.
dupeCount := 0
unreachable := "" ; This means that an ':*:immediate trigger' string is blocking it.
unreachCount := 0
StartTime := A_TickCount
;MsgBox, varMain is: |%varMain%|
Loop, parse, varMain, `n, `r ; Check contents line-by-line.
{ ; 88888888888888888 OUTER LOOP 88888888888888888888888888
    If (SubStr(A_LoopField, 1, 1) != ":") ; If line does not start with colon, then skip rest of loop.
        continue
    ;MsgBox, A_LoopField is: |%A_LoopField%|
    RegExMatch(A_LoopField, regex, Hotty)
    LineNum := A_Index ; Save index of outer loop before A_Index gets reused below.
    LineText := A_LoopField ; Save text before A_LoopField gets reused below.
    varMain := StrReplace(varMain, A_LoopField, "",, 1) ; Remove Hotty from list so we don't compare it with itself.
    Loop, parse, varMain, `n, `r ; Check contents line-by-line.
    { ; ########## INNER LOOP #####################
        If (SubStr(A_LoopField, 1, 1) != ":") ; If line does not start with colon, then skip rest of loop.
            continue
        RegExMatch(A_LoopField, regex, subHotty)
        if (subHotty2 = Hotty2) ; Same trigger on two lines.
        {
            duplicates := duplicates . "lines " . LineNum . "`t" . Hotty1 . "`t" . Hotty2 . "::" . Hotty3 . Hotty4 . "`n-and " . A_Index . "`t" . subHotty1 . "`t" . subHotty2 . "::" . subHotty3 . subHotty4 . "`n"
            dupeCount++
        }
        ; Note: the trigger is used as a regex below, so metacharacters in a trigger are not escaped.
        else if (instr(subHotty1,"*") && RegExMatch(Hotty2, "^" . subHotty2 . ".*"))
            or (instr(Hotty1,"*") && RegExMatch(subHotty2, "^" . Hotty2 . ".*"))
        {
            unreachable := unreachable . "lines " . LineNum . "`t" . Hotty1 . "`t" . Hotty2 . "::" . Hotty3 . Hotty4 . "`n-and " . A_Index . "`t" . subHotty1 . "`t" . subHotty2 . "::" . subHotty3 . subHotty4 . "`n"
            unreachCount++
        }
        ;soundbeep, , 250
        continue
    } ; #######################################
} ; 8888888888888888888888888888888888
ElapsedTime := (A_TickCount - StartTime) / 60000
ElapsedTime := Round(ElapsedTime, 2)
dupeSum := % " ---- " dupeCount " Duplicates: ---- `n" duplicates
unreachSum := % " ---- " unreachCount " Unreachable items: --- `n" unreachable
MsgBox, Search of %MainList% took %ElapsedTime% minutes.`n---------------------------`n%dupeSum%`n%unreachSum%
Esc::ExitApp
The wrongly reported pair from the "unreachable" list (see the Edit of 4-5-2023):
lines 264 :*: lsit::list
-and 3818 :: realsitic::realistic