[SOLVED] Any faster ways than InStr() to search huge text?

tmplinshi

02 Mar 2015, 11:11

I'm trying to build an exclude list: the entries that appear in a full list but not in a sub list.

For example:
Full List: contains 100,000 numbers (00000,00001,00002...99999). 599999 characters.
Sub List: contains 60,000 numbers (00000,00001,00002...59999). 359999 characters.
I want to get the remaining numbers (from 60000 to 99999). With InStr() this takes about 50 seconds on my computer, because every one of the 100,000 lookups scans the sub list string from the start. Is there a faster way, such as using MCode?
Test code:

Code:

CreateTestData(100000Numbers, 10) ; contains 100000 numbers (00000,...,99999)
CreateTestData(60000Numbers, 6) ; contains 60000 numbers (00000,...,59999)

startTime := A_TickCount

40000Numbers := ""
Loop, Parse, 100000Numbers, CSV ; loop 100000 times
	If !InStr(60000Numbers, A_LoopField) ; each call scans up to 359999 characters
		40000Numbers .= "," A_LoopField
40000Numbers := Trim(40000Numbers, ",") ; drop the leading comma

MsgBox, % (A_TickCount-startTime)//1000 ; it takes 51 seconds on my computer
Return

; ===========================================================
CreateTestData(ByRef result, n) {
	result := ""
	Loop, % n
	{
		n1 := A_Index - 1
		Loop, 10
		{
			n2 := A_Index - 1
			Loop, 10
			{
				n3 := A_Index - 1
				Loop, 10
				{
					n4 := A_Index - 1
					Loop, 10
					{
						n5 := A_Index - 1
						result .= "," n1 n2 n3 n4 n5
					}
				}
			}
		}
	}

	result := Trim(result, ",")
}
Thanks.
Last edited by tmplinshi on 02 Mar 2015, 12:08, edited 1 time in total.
strobo

02 Mar 2015, 11:45

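It is easiest to set up the blacklist as a "set": an object whose keys are the blacklisted values, so each membership test is a key lookup instead of a scan through a long string.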

Code:

list := {} ; incoming data
loop 10000
	list[a_index] := a_index
blacklist := {}
loop 6000
	blacklist[a_index] := true
diff := {} ; filtered data
k := 1
for i, v in list
	if (!blacklist[v])
		diff[k] := v, k++
msgbox,% diff[1] . ", " . diff[2] . ", ..."
Note that object keys are case-insensitive, so the blacklist matches strings regardless of case. If that is unwanted, you can use a Scripting.Dictionary COM object or, maybe better, try out Coco's wrapper around that dictionary type.
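
For the comma-separated strings from the original question, the same idea could look roughly like the sketch below. It is untested against the full 100,000-entry data, and the variable names (fullList, subList, seen, remaining) and the tiny sample lists are made up for illustration. It assumes AutoHotkey v1.1 and that case-insensitive keys are acceptable, which is the case for purely numeric entries.

```autohotkey
; Sketch (AutoHotkey v1.1): entries of fullList that are not in subList.
; fullList, subList, seen and remaining are illustrative names only.
fullList := "00000,00001,00002,00003,00004"
subList  := "00000,00001,00002"

; Build the "set": one key per entry of the sub list.
seen := {}
Loop, Parse, subList, CSV
	seen[A_LoopField] := true

; Keep every entry of the full list whose key is not in the set.
remaining := ""
Loop, Parse, fullList, CSV
	If !seen[A_LoopField]
		remaining .= "," A_LoopField
remaining := Trim(remaining, ",") ; drop the leading comma

MsgBox, % remaining
```

Each list is parsed only once, so the work grows with the combined length of the two lists rather than with their product, as it does when InStr() is called once per entry.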
tmplinshi

02 Mar 2015, 12:07

Wow! 0 seconds! Thank you so much, strobo!
