remove lines present in both long sorted lists
Re: remove lines present in both long sorted lists
To remove from both strings, set var3 equal to var1. When you remove a string from var2, remove it from var3. When I tested this, the time elapsed was the same.
It's possible that your supercomputer is old, and you just need to get a new one.
Re: remove lines present in both long sorted lists
@Terka
Code: Select all
SetBatchLines, -1
C := FileOpen("LISTC.txt", "w") ; Output
D := {}
TK := A_TickCount
Loop, Read, LISTA.txt ; Input A
    D["" A_LoopReadLine ""] := ""
Loop, Read, LISTB.txt ; Input B
    D["" A_LoopReadLine ""] := ""
For Line in D
    C.Write(Line "`n")
C := ""
Msgbox, % (A_TickCount - TK) / 1000 " s"
Last edited by Smile_ on 04 Dec 2022, 10:24, edited 4 times in total.
Re: remove lines present in both long sorted lists
It's fast, but I don't think it meets the goal.
Re: remove lines present in both long sorted lists
Is yours doing a union? I think it's supposed to be removing the duplicates.
This revision does seem to get the elapsed time down to approximately zero seconds.
Code: Select all
SetBatchLines -1
dir = %A_ScriptDir%
out := FileOpen(dir "\result.txt", "w `n")
line := {}
start := A_TickCount
Loop, Read, %dir%\LISTA.txt
    line[A_LoopReadLine] := True
Loop, Read, %dir%\LISTB.txt
    (!line.HasKey(A_LoopReadLine)) && out.WriteLine(A_LoopReadLine)
out := ""
MsgBox, 64, Elapsed time, % A_TickCount - start " ms"
Re: remove lines present in both long sorted lists
You don't have to worry about duplication, because a key's value is simply overwritten when that key is already defined (as I noticed).
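To illustrate the point about overwritten keys, here is a minimal AutoHotkey v1 sketch. The sample arrays are hypothetical stand-ins for the two files, and `Count()` assumes a reasonably recent v1.1 release. Pooling both lists into one object leaves each distinct line once, which is a union of the lists rather than a removal of the shared lines.

```autohotkey
; Hypothetical sample data standing in for LISTA.txt and LISTB.txt.
listA := ["apple", "banana", "cherry"]
listB := ["banana", "cherry", "date"]

D := {}
For _, item in listA
    D[item ""] := ""
For _, item in listB
    D[item ""] := ""    ; a key that already exists is simply overwritten

; Collect the distinct keys for display.
out := ""
For key in D
    out .= key "`n"
MsgBox, % "Distinct lines: " D.Count() "`n" out
```

With this sample data the object should end up holding four keys. Note that string keys in v1 objects are not case sensitive, so lines that differ only by case would also collapse into one key.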
Re: remove lines present in both long sorted lists
It seems that your approach pools the lines instead of omitting any of them. My script omits the "intersecting" lines.
Re: remove lines present in both long sorted lists
Maybe. This is what I understood.
Would like to remove lines that are identical in both lists.
Re: remove lines present in both long sorted lists
mikeyww, you have a bug in your code:
Code: Select all
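a := 0
b := 00
line := {}
line[a] := True
; 00 was never stored as a key, but numeric-looking keys are treated as integers,
; so 0 and 00 collide and HasKey reports a match.
msgbox % line.HasKey(b)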
Re: remove lines present in both long sorted lists
Fair enough.
Credit to Smile_ & malcev.
Code: Select all
SetBatchLines -1
dir = %A_ScriptDir%
out := FileOpen(dir "\result.txt", "w `n")
line := {}
start := A_TickCount
Loop, Read, %dir%\LISTA.txt
    line[A_LoopReadLine ""] := True
Loop, Read, %dir%\LISTB.txt
    (!line.HasKey(A_LoopReadLine "")) && out.WriteLine(A_LoopReadLine)
out := ""
MsgBox, 64, Elapsed time, % A_TickCount - start " ms"
Re: remove lines present in both long sorted lists
Another thing: try it with this example (I got a blank result).
LISTA.txt:
Code: Select all
1
2
4
5
11
12
LISTB.txt:
Code: Select all
1
2
11
12
It's supposed to give this, right?
LISTA.txt:
Code: Select all
4
5
Last edited by Smile_ on 04 Dec 2022, 11:54, edited 1 time in total.
Re: remove lines present in both long sorted lists
No. The script removes from list B the lines that are in list A.
When the script loops through list B, it writes the output only if list A did not contain that line (key).
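The one-directional behavior described above can be sketched with the sample data posted earlier in the thread. This is only an in-memory illustration in AutoHotkey v1; the helper names `Without` and `Join` are hypothetical and not part of the posted script.

```autohotkey
; Sample data taken from the LISTA.txt / LISTB.txt example in this thread.
listA := [1, 2, 4, 5, 11, 12]
listB := [1, 2, 11, 12]

text := "B without A: " Join(Without(listB, listA))
text .= "`nA without B: " Join(Without(listA, listB))
MsgBox, % text

; Hypothetical helper: returns the items of source that do not occur in exclude.
Without(source, exclude) {
    seen := {}
    For _, v in exclude
        seen[v ""] := True          ; appending "" as in the revised script
    kept := []
    For _, v in source {
        if (!seen.HasKey(v ""))
            kept.Push(v)
    }
    return kept
}

; Hypothetical helper: joins the array items with ", ".
Join(arr) {
    out := ""
    For _, v in arr
        out .= v ", "
    return SubStr(out, 1, -2)
}
```

With this data, "B without A" should come out empty, which matches the blank result reported above, while "A without B" should list 4 and 5. Getting both sides therefore takes one pass in each direction.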
Re: remove lines present in both long sorted lists
Yes, I got you: LISTB.txt no longer has any lines identical to those in LISTA.txt, so what is left is entirely different lines.
By "Supposed to give this right?" I meant what the OP wants:
"Hi all, have 2 lists, both sorted. Would like to remove lines that are identical in both lists."
So I thought he would like to remove the identical lines from both sides (LISTA.txt & LISTB.txt) and keep only the lines that are not shared.
Last edited by Smile_ on 04 Dec 2022, 12:14, edited 1 time in total.
Re: remove lines present in both long sorted lists
Yes, you are right. I showed an example for one side. The other side could be done in another 32 ms! Or could have an array with the intersection, and then write the lines without those.
Technique below: for each text line, a new array dimension is added, representing each list. After lists are read, text lines in the array with two items in the second dimension are present in both lists. Therefore, the output is written only when the number of items in the second dimension is less than two, because that means that the text line is not present in both lists.
Code: Select all
SetBatchLines -1
dir = %A_ScriptDir%
out := [], line := [], start := A_TickCount
For each, fn in input := ["LISTA", "LISTB"] {
    Loop, Read, %dir%\%fn%.txt
        line[A_LoopReadLine "", fn] := True
    out[fn] := FileOpen(dir "\" fn "-result.txt", "w `n")
}
For each, fn in input
    Loop, Read, %dir%\%fn%.txt
        (line[A_LoopReadLine ""].Count() < input.Count()) && out[fn].WriteLine(A_LoopReadLine)
out := ""
MsgBox, 64, Time elapsed, % A_TickCount - start " ms"
Re: remove lines present in both long sorted lists
My naive approach which probably sucks:
Code: Select all
SetBatchLines -1
; Assumes sorted arrays with no duplicates
arr1 := [1,2,4,5,11,12], arr2 := [1,2,11,12], new1 := [], new2 := []
;arr1 := ["line 1", "line 2", "line3"], arr2 := ["line 1", "line 2", "line8"]
i := 1, j := 1
Loop {
    ; arr1 is exhausted: everything left in arr2 is unique to arr2
    if (i > arr1.MaxIndex()) {
        j--
        loop, % arr2.MaxIndex()-j
            new2.Push(arr2[j+A_Index])
        break
    }
    ; arr2 is exhausted: everything left in arr1 is unique to arr1
    if (j > arr2.MaxIndex()) {
        i--
        loop, % arr1.MaxIndex()-i
            new1.Push(arr1[i+A_Index])
        break
    }
    if (arr1[i] == arr2[j])
        i++, j++ ; present in both lists: skip it on both sides
    else {
        if (arr1[i] < arr2[j])
            new1.Push(arr1[i]), i++
        else
            new2.Push(arr2[j]), j++
    }
}
out1 := ""
for _, v in new1
    out1 .= v ", "
out1 := SubStr(out1, 1, -2)
out2 := ""
for _, v in new2
    out2 .= v ", "
out2 := SubStr(out2, 1, -2)
MsgBox, % "First array: " out1 "`nSecond array: " out2
Re: remove lines present in both long sorted lists
For some reason, it did not seem to work when I tried it with text files.
Re: remove lines present in both long sorted lists
@mikeyww 400 ms, great!! Thank you very much!
If you ever need to learn a sport, send me a message; I can help.
Re: remove lines present in both long sorted lists
Even if you tried to teach me, it wouldn't help in my case!
Re: remove lines present in both long sorted lists
Everybody is an expert in a different area.