detect/Remove duplicate values in array
Posted: 09 Nov 2017, 15:04
by derz00
Hello,
I would like to remove duplicates from an array (or rather, create a new array containing only one occurrence of each name in the list).
I have seen the function InStr, but this is an array, not a string. I have seen HasKey, but I need to check for a value, not a key. Can anyone help me out?
Re: detect/Remove duplicate values in array
Posted: 09 Nov 2017, 15:29
by Exaskryz
This is probably a rather inefficient way, but I've done this (or something like it; I'm not sure which script of mine did it...) in the past:
Code: Select all
object:=[], secondobject:=[]
object.Push("a","b","c","c","b","d")
Loop % object.Length()
{
    value:=object.RemoveAt(1) ; otherwise object.Pop() would work from right to left
    Loop % secondobject.Length()
        If (value=secondobject[A_Index])
            Continue 2 ; jump to the top of the outer loop, we found a duplicate, discard it and move on
    secondobject.Push(value)
}
MsgBox % secondobject.Length()
Loop % secondobject.Length()
    MsgBox % secondobject[A_Index]
return
Re: detect/Remove duplicate values in array
Posted: 09 Nov 2017, 15:38
by derz00
Interesting. Since posting, I also thought of this. I would be really interested in knowing if there is a better way.
Code: Select all
RemoveDup(obj) {
    for i, value in obj
        str.=value "`n"
    nodupArray:={}
    nodup:=""
    loop parse, str, `n
        if !InStr(nodup, A_LoopField)
        {
            nodup.=A_LoopField "`n"
            nodupArray.Push(A_LoopField)
        }
    Return nodupArray
}
Re: detect/Remove duplicate values in array
Posted: 09 Nov 2017, 17:26
by Helgef
If you don't mind it being sorted, try this,
Code: Select all
uniqueArr(arr, del := ""){
    return sortArr(arr, (del!=""?"D" del : "D" chr(1)) . " U C")
}
sortArr(arr,opt:=""){
    ; Sort array using sort options
    ; https://autohotkey.com/docs/commands/Sort.htm
    static guess:=0
    local delimiter,k,v,str
    if guess
        VarSetCapacity(str,arr.length()*guess,0) ; Speed can be improved by making a guess on the needed size.
    RegExMatch(opt,"O)\bD(.)\b",delimiter) ? delimiter:=delimiter[1] : (delimiter:="`n", opt.=" D`n")
    for k, v in arr
        str.=v . delimiter
    str:=RTrim(str,delimiter)
    Sort,str, % opt
    return StrSplit(str,delimiter)
}
; Example,
for k, v in uniqueArr(["a","a","x","x","b",1,2,3,3,4,9,9,9,9,9,9,9,9])
    str .= v "`n"
msgbox % str
Edit: Also if you don't mind case insensitivity,
Code: Select all
unique(arr){
    local temp := [], out := []
    local k, v
    for k, v in arr
        temp[v] := ""
    for k in temp
        out[A_Index] := k
    return out
}
Re: detect/Remove duplicate values in array
Posted: 09 Nov 2017, 17:41
by teadrinker
Or like this:
Code: Select all
arr := ["a","b","c","c","b","d"]
newArr := [], testArr := []
for k, v in arr
    if !testArr.HasKey(v)
        testArr[v] := true, newArr.Push(v)
for k, v in newArr
    MsgBox, % v
Re: detect/Remove duplicate values in array
Posted: 09 Nov 2017, 20:00
by jeeswg
A warning about the case where an object key is called 'HasKey':
Code: Select all
q:: ;arrays and HasKey
;if an array has a key called HasKey,
;HasKey() will fail, so use ObjHasKey instead
oArray := {hello:0}
MsgBox, % oArray.HasKey("hello") ;1
oArray := {Haskey:0}
MsgBox, % oArray.HasKey("HasKey") ;(blank)
MsgBox, % oArray.HasKey("abc") ;(blank)
MsgBox, % ObjHasKey(oArray, "HasKey") ;1
MsgBox, % ObjHasKey(oArray, "abc") ;0
return
Example scripts to remove duplicates from an array, case sensitive and case insensitive, and that handle numeric v. string keys/values.
Code: Select all
q:: ;array - remove duplicates (case insensitive)
oArray := ["a","B","c","A","B","C",1,1.0,"1","1.0"]
oArray2 := [], oTemp := {}
for vKey, vValue in oArray
{
if (ObjGetCapacity([vValue], 1) = "") ;is numeric
{
if !ObjHasKey(oTemp, vValue+0)
oArray2.Push(vValue+0), oTemp[vValue+0] := ""
}
else
{
if !ObjHasKey(oTemp, "" vValue)
oArray2.Push("" vValue), oTemp["" vValue] := ""
}
}
vOutput := ""
for vKey, vValue in oArray2
vOutput .= vKey " " vValue "`r`n"
MsgBox, % vOutput
return
w:: ;array - remove duplicates (case sensitive)
oArray := ["a","B","c","A","B","C",1,1.0,"1","1.0"]
oArray2 := [], oTemp := ComObjCreate("Scripting.Dictionary")
for vKey, vValue in oArray
if !oTemp.Exists(vValue)
oArray2.Push(vValue), oTemp.Item(vValue) := ""
vOutput := ""
for vKey, vValue in oArray2
vOutput .= vKey " " vValue "`r`n"
MsgBox, % vOutput
return
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 04:22
by Helgef
@Exaskryz, pop is faster.
Edit: Although RemoveAt is slower, Helgef is even slower: I finally realised that you use RemoveAt instead of Pop because the former maintains the original order.
@derz00, you need to delimit A_LoopField, otherwise you will find that, e.g., InStr(nodup, "a") is true for nodup := "aa". Hence something like this might fix that:
Code: Select all
RemoveDup(obj) {
    for i, value in obj
        str.=value "`n"
    str:=RTrim(str, "`n") ; drop the trailing delimiter so the loop does not yield an empty last field
    nodupArray:={}
    nodup:= "`n" ; Added delimiter
    loop parse, str, `n
        if !InStr(nodup, "`n" A_LoopField "`n" ) ; Added delimiter
        {
            nodup.=A_LoopField "`n"
            nodupArray.Push(A_LoopField)
        }
    Return nodupArray
}
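To illustrate the false positive described above, here is a minimal sketch in AutoHotkey v1 syntax. The sample strings are made up for illustration only.

```autohotkey
; Undelimited search: "a" is reported as already present because it occurs inside "aa".
nodup := "aa`n"
MsgBox % InStr(nodup, "a")        ; 1 (found at position 1)

; Delimited search: only a whole entry surrounded by `n can match.
nodup := "`naa`n"
MsgBox % InStr(nodup, "`na`n")    ; 0 (no entry is exactly "a")
```

Note that InStr is case insensitive by default, so "A" and "a" still count as the same entry in both variants.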
@teadrinker, you could use ObjHasKey, as pointed out by jeeswg.
@jeeswg, FYI, your second script behaves differently on v2. (The first one does too, but that is more obvious.)
Cheers.
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 05:02
by teadrinker
I've seen objhaskey and similar functions for the first time. Where are they described?
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 05:04
by Helgef
I think you will find them if you search the help file index. I see my link went to the method.
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 05:13
by teadrinker
Hmm, for me the link leads to Object.HasKey(Key).
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 05:23
by teadrinker
Found:
Each method also has an equivalent function, which can be used to bypass any custom behaviour implemented by the object -- it is recommended that these functions only be used for that purpose.
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 06:15
by Helgef
Thanks teadrinker.
It could be worth noting that using the functions can improve performance, presumably because the functions don't imply any array look-ups. Also, there are a few other ObjXXX functions which have their own spot in the documentation,
Code: Select all
objAddRef()
objRelease()
objBindMethod()
objRawSet()
Cheers.
Edit:
@derz00, I'm sorry that your topic is slightly derailing.
jeeswg wrote:@Helgef: Are you serious about the approaches, both of them, not being two-way compatible? Did you try to find a fix? I'll look into it
I am serious.
Two-way compatibility isn't something I consider needs to be fixed, so no, I didn't try. And I don't think you can, since, e.g., obj[1 ""] := obj[1] := value yields two key/value pairs (one integer key and one string key) in v1, while in v2 it yields only an integer key. For the first script, you can use try - catch to avoid the exception when you try to call a non-existent method.
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 06:16
by jeeswg
@Helgef: Are you serious about the approaches, both of them, not being two-way compatible? Did you try to find a fix? I'll look into it.
@teadrinker: I try to list the ObjXXX functions at both of these links:
[note: I intend to improve the objects tutorial significantly in future.]
jeeswg's objects tutorial - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=29232
list of every command/function/variable from across all versions - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 42#p131642
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 06:52
by teadrinker
Thanks, it's very informative!
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 08:34
by derz00
Helgef wrote:
Edit:
@derz00, I'm sorry that your topic is slightly derailing :oops:
I don't forgive you, because I am thankful for it! :thumbup: My fears quickly died that I might not get a response. :) I should bring all these suggestions together into a dependable function and post it.
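As a rough starting point, here is a sketch of what such a combined function might look like. It assumes AutoHotkey v1.1 and that case-insensitive comparison is acceptable. It is teadrinker's approach with the ObjHasKey suggestion applied, and the name UniqueValues is hypothetical.

```autohotkey
; Hypothetical helper (the name is not from the thread); AutoHotkey v1.1 syntax.
; Keeps the first occurrence of each value and preserves the original order.
; Values are compared by v1 object-key rules, so strings match case insensitively.
UniqueValues(arr) {
    seen := {}, out := []
    for k, v in arr
        if !ObjHasKey(seen, v)    ; function form still works if a value is "HasKey"
            seen[v] := true, out.Push(v)
    return out
}

; Usage: shows a, b, c, d (one per line).
for k, v in UniqueValues(["a","b","c","c","b","d"])
    list .= v "`n"
MsgBox % list
```

This sketch does not try to normalise numbers versus numeric strings; jeeswg's scripts above handle that case.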
Re: detect/Remove duplicate values in array
Posted: 10 Nov 2017, 19:33
by jeeswg
For the HasKey warning script: AHK v2 is more strict and the script ends. As Helgef said, you can use 'try' to avoid this.
For the remove duplicates scripts, they are both working on both AHK v1 and AHK v2. But there are some differences relating to key names that look numeric.
The results I'm getting with AHK v2 here are surprising.
How can I get a numeric key name '1', and a string key name '1', in the same array?
Code: Select all
oArray := {}
oArray[1] := "a"
oArray[1.0] := "b"
oArray["1"] := "c"
oArray["1.0"] := "d"
vOutput := ""
for vKey, vValue in oArray
vOutput .= vKey " " vValue "`r`n"
MsgBox, % vOutput
;MsgBox(vOutput)
return
;AHK v1
;1 a
;1 c
;1.0 d
;AHK v2
;1 c
;1.0 d
[EDIT:] Further tests on AHK v2:
Code: Select all
;AHK v2
MsgBox(Type(1)) ;Integer
MsgBox(Type("1")) ;String (as expected)
MsgBox(Type(1+0)) ;Integer
oArray := {}
oArray[1] := ""
for vKey, vValue in oArray
MsgBox Type(vKey) ;Integer
oArray := ""
oArray := {}
oArray["1"] := ""
for vKey, vValue in oArray
MsgBox Type(vKey) ;Integer (surprising)
oArray := ""
return
Re: detect/Remove duplicate values in array
Posted: 11 Nov 2017, 04:09
by Helgef
jeeswg wrote:; Integer (surprising)
Please refer to the v2 documentation, objects -> keys.
Cheers
Re: detect/Remove duplicate values in array
Posted: 11 Nov 2017, 04:16
by jeeswg
Thanks so much, it's such a relief to have an explanation for what was going on. This is exactly the sort of behaviour I thought AHK v2 was supposed to eliminate. Do you find this behaviour surprising? How can I specify a number stored as a string? Do you know? Thanks.
I'll reserve judgement for now, but presently this behaviour is very concerning.
[EDIT:] Here's a legitimate usage scenario that gets messed up by the 'string looks number' assumption. It works fine in AHK v1, but not in AHK v2.
Code: Select all
q::
oArray := {1:0,2:0,3:0,1a:0,2a:0,3a:0}
vOutput := ""
for vKey, vValue in oArray
vOutput .= vKey " " vValue "`r`n"
MsgBox(vOutput)
;MsgBox, % vOutput
oArray := {"1":0,"2":0,"3":0,1a:0,2a:0,3a:0}
vOutput := ""
for vKey, vValue in oArray
vOutput .= vKey " " vValue "`r`n"
MsgBox(vOutput)
;MsgBox, % vOutput
return
;for the 2nd example:
;AHK v1
;1 0
;1a 0
;2 0
;2a 0
;3 0
;3a 0
;AHK v2
;1 0
;2 0
;3 0
;1a 0
;2a 0
;3a 0
[EDIT:] And another one, re. dealing with hex strings generally:
Code: Select all
q::
oArray := {"00":0,"40":64,"80":128,"C0":192}
vOutput := ""
for vKey, vValue in oArray
vOutput .= vKey " " vValue "`r`n"
MsgBox(vOutput)
;MsgBox, % vOutput
return
;AHK v1
;00 0
;40 64
;80 128
;C0 192
;AHK v2
;40 64
;80 128
;00 0
;C0 192
Re: detect/Remove duplicate values in array
Posted: 11 Nov 2017, 08:14
by Helgef
If I had a dollar for every second I waited for your scripts to finish before I realised I didn't hit q... Shame on me for not learning, but indenting the hotkey routine makes it clearer, imho.
jeeswg wrote:This is exactly the sort of behaviour I thought AHK v2 was supposed to eliminate.
It does eliminate this sort of code:
Code: Select all
if !ObjHasKey(oTemp, "" vValue)
oArray2.Push("" vValue), oTemp["" vValue] := ""
Although I can appreciate the fun of tinkering with script language specifics, it is probably better if the above is never necessary.
jeeswg wrote:Do you find this behaviour surprising?
I do not find it surprising that the behaviour is according to the documentation.
I do not find it surprising that v2 behaves differently from v1.
I do not find it surprising that the choice to change the behaviour was made. I am not claiming the following is the (only) reason for the change, but integer keys perform better than string keys and take less space (on average, I guess). If it looks like a number, perhaps it is a number.
jeeswg wrote:How can I specify a number stored as a string? Do you know? Thanks.
If you are desperate,
Code: Select all
arr[key:="01"] := "" ; type(key) = string, key+0 = "01" + 0 = 1
jeeswg wrote: Here's a legitimate usage scenario that gets messed up by the 'string looks number' assumption. It works fine in AHK v1, but not in AHK v2.
Legitimate, sure, but I think it is rare enough not to cast any doubt on whether the change in key handling causes an unacceptable loss of features compared with the improvements it brings.
jeeswg wrote:I'll reserve judgement for now, but presently this behaviour is very concerning.
That is sound, I will do that too, however, I am not presently concerned. But I would like to hear any arguments, both pros and cons. This thread is probably not the place though.
Cheers.
Re: detect/Remove duplicate values in array
Posted: 11 Nov 2017, 09:04
by jeeswg
- I want an object with obj[1] and obj["1"], can you achieve that?
- Do you have any examples of hotkey labels and indentation? I haven't seen anyone consistently do this on the forum, so I couldn't imitate the style if I wanted to. And I have some grey areas re. (a) what's considered standard, (b) rules for automating indentation.
- While writing those 2 lines of code, I was actually pleased that I could handle all of the issues to do with string v. numeric key names so succinctly and effectively, and with no ambiguity.
- Re. surprise, I think it's been changed because 'users might be stupid', and I respect the intent, but it could quite easily cause more mistakes and not fewer. Anyone who uses arrays has to understand that there are string keys and numeric keys, it's fundamental. In situations like this you improve the documentation, you don't 'improve' (dumb-down and over-complicate) the language. It will annoy the power users and it won't help the newbies.
- Re. surprise. Almost everything brought in in AHK v1.1, i.e. by lexikos, has remained consistent in AHK v2, this would be an exception.
- I like to use integer keys also, but sometimes you intend them to be numbers stored as strings for good reasons, especially when you are handling strings and doing loops and sorts. Crucially, I at least need *a* way to do it, which I haven't so far seen.
[EDIT:]
- Ultimately, the number one thing that annoys and confuses both power users and newbies is ambiguity. I.e. fiddliness and special exceptions. Things that are clear and consistent work best.
- Currently I have no complaints about how AHK v1 handles the string/numeric key issue.
- One solution when handling strings could be to use a prefix character, so what you gain with integers, you lose with strings.
- Like I said the jury is still out, it partly depends on if there are ways to directly create key names e.g. '1', '2', '3', which are numbers stored as strings.
- There are situations in Explorer where, in one object, you refer to items by number and by name at the same time, e.g. the nth file (integer key) and a file name (string key). However, you may have a folder with a name comprised solely of digits, e.g. a datestamp, or simply folders named '1', '2', etc.
- You could also have issues relating to numbers with/without leading zeros.
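The prefix-character idea from the list above could be sketched roughly as follows. This is only an illustration in AutoHotkey v1 syntax; the "_" prefix and the helper names SetByName and GetByName are assumptions, not anything from the documentation.

```autohotkey
; Sketch only: store every name under a prefixed key so that it can never look numeric.
; The "_" prefix and both helper names are illustrative assumptions.
SetByName(obj, name, value) {
    obj["_" name] := value
}
GetByName(obj, name) {
    return obj["_" name]
}

files := {}
files[1] := "first item by position"       ; integer key 1
SetByName(files, "1", "folder named 1")    ; string key "_1"
MsgBox % files[1] "`n" GetByName(files, "1")
```

The trade-off is the one noted above: every access by name has to go through the prefix, so what is gained for integer keys is lost in convenience for string keys.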
[EDIT:] So Helgef, you've revealed to me (1) the deref in force an expression/can be an expression issue, (2) the no assume local issue, and now (3) the string keys in AHK v2 issue. Hmm, what next?