How to improve the efficiency of querying the general table based on sublists?

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
gdqb521
Posts: 13
Joined: 15 Aug 2015, 08:01

How to improve the efficiency of querying the general table based on sublists?

Post by gdqb521 » 24 Aug 2022, 19:22

Description of the problem: There are 3 fields and 63 rows in the subtable. To find the row data in the total table (10 fields and 100002 rows) that contains three fields of the subtable, and copy it to a new document.
The three fields of the subtable correspond to the 2nd, 6th, and 7th fields of the general table, respectively.
The main question is how to improve the efficiency of search? Based on the sample data on my computer, it takes about 5 seconds to filter through the traversal loop. When the data increases, the filtering time also increases exponentially. Is there a more efficient way to deal with it?
The number of filtered rows is 1027 rows.
data:
data.zip
(1005.74 KiB) Downloaded 19 times

Code: Select all

T_start:=A_TickCount		
fulllist:=A_ScriptDir "\fulllist.csv"
sublist:=A_ScriptDir "\sublist.csv"
SplitPath, fulllist, name, dir, ext, name_no_ext, Drive
Outputfile:=dir "\out.csv"
FileDelete,%outputfile%
FileRead,fulllist0,%fulllist%
ofulllist:=StrSplit(fulllist0,"`n")
out:=""
loop, Read, %sublist%
{
	Array6:= StrSplit(A_LoopReadLine,"`,")
	for index, element in ofulllist
	{
		if (InStr(ofulllist[A_Index],Array6[1]) && InStr(ofulllist[A_Index],Array6[2]) && InStr(ofulllist[A_Index],Array6[3]) )
			out.= ofulllist[A_Index] "`n"
	}
}
FileAppend, %out%,%Outputfile%,utf-8
MsgBox, % A_TickCount-T_start "ms"     ;7326ms
ExitApp
Last edited by gdqb521 on 24 Aug 2022, 19:59, edited 1 time in total.

User avatar
boiler
Posts: 17404
Joined: 21 Dec 2014, 02:44

Re: How to improve the efficiency of querying the general table based on sublists?

Post by boiler » 24 Aug 2022, 19:29

I haven't unzipped your zip file (and am not going to), so I don't know if it contains the script, but it looks like it might just be the data and not the script. You should post your script not in a zip file so that others can suggest how improve the efficiency.

My general advice would be to use the right tool for the job. It's really a job for a relational database. It would be possible to interface with the db and run queries from an AHK script if that's what you wanted to do.

User avatar
flyingDman
Posts: 2848
Joined: 29 Sep 2013, 19:01

Re: How to improve the efficiency of querying the general table based on sublists?

Post by flyingDman » 24 Aug 2022, 22:41

You might improve by a second or 2 with this:

Code: Select all

T_start:=A_TickCount
fulllist:=A_ScriptDir "\fulllist.csv"
sublist:=A_ScriptDir "\sublist.csv"
SplitPath, fulllist, name, dir, ext, name_no_ext, Drive	; ?
outputfile:=dir "\out.csv"								; A_ScriptDir?
FileDelete,%outputfile%
FileAppend, ,%outputfile%,utf-8
FileRead,fulllist0,%fulllist%
ofulllist:=StrSplit(fulllist0,"`n")
loop, Read, %sublist%, %outputfile%
	{
	Array6 := StrSplit(A_LoopReadLine,",")
	for x,y in ofulllist
		if (InStr(y,Array6[1]) && InStr(y,Array6[2]) && InStr(y,Array6[3]))
			FileAppend, % y "`n"
	}
MsgBox, % A_TickCount-T_start "ms"     ;7326ms
ExitApp
on my laptop it runs in ~1875ms
14.3 & 1.3.7

Post Reply

Return to “Ask for Help (v1)”