detect/Remove duplicate values in array

derz00
Posts: 497
Joined: 02 Feb 2016, 17:54
GitHub: derz00
Location: Middle of the round cube

detect/Remove duplicate values in array

09 Nov 2017, 15:04

Hello,

I would like to remove duplicates from an array (or rather, create a new array containing only one occurrence of each name in the list).

I have seen the function InStr but this is an array, not a string. I have seen HasKey but this is the value, not the key. Can anyone help me out?
Exaskryz
Posts: 2876
Joined: 17 Oct 2015, 20:28

Re: detect/Remove duplicate values in array

09 Nov 2017, 15:29

This is probably a rather inefficient way, but I've done this (or something like it; not sure which script of mine did this...) in the past:

Code: Select all

object:=[], secondobject:=[]
object.Push("a","b","c","c","b","d")
Loop % object.Length()
{
	value:=object.RemoveAt(1) ; otherwise object.Pop() would work from right to left
	Loop % secondobject.Length()
		If (value=secondobject[A_Index])
			Continue 2 ; jump to the top of the outer loop, we found a duplicate, discard it and move on
	secondobject.Push(value)
}

MsgBox % secondobject.Length()
Loop % secondobject.Length()
	MsgBox % secondobject[A_Index]
return
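
The loop above empties the original array as it goes, and the comment about Pop() hints that the direction of removal decides the order of the result. A minimal sketch of both points, assuming AutoHotkey v1.1; the function name Dedupe and its fromEnd parameter are hypothetical, not part of the original post:

```autohotkey
; Hypothetical wrapper around the nested-loop approach.
; fromEnd := true swaps RemoveAt(1) for Pop(), so items are processed right to left.
Dedupe(source, fromEnd := false) {
	work := source.Clone()	; shallow copy, so the caller's array is not emptied
	result := []
	Loop % work.Length()
	{
		value := fromEnd ? work.Pop() : work.RemoveAt(1)
		Loop % result.Length()
			if (value = result[A_Index])	; "=" compares case-insensitively
				continue 2	; duplicate found, move on to the next source item
		result.Push(value)
	}
	return result
}

items := ["a", "b", "c", "c", "b", "d"]
front := Dedupe(items)		; a, b, c, d (first-seen order)
back := Dedupe(items, true)	; d, b, c, a (processed from the end)
```

Working on a clone costs one extra copy of the array but leaves items intact for later use.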
derz00
Posts: 497
Joined: 02 Feb 2016, 17:54
GitHub: derz00
Location: Middle of the round cube

Re: detect/Remove duplicate values in array

09 Nov 2017, 15:38

Interesting. Since posting, I also thought of this. I would be really interested in knowing if there is a better way.

Code: Select all

RemoveDup(obj) {
for i, value in obj
	str.=value "`n"
nodupArray:={}
nodup:=""
loop parse, str, `n
	if !InStr(nodup, A_LoopField)
	{
		nodup.=A_LoopField "`n"
		nodupArray.Push(A_LoopField)
	}
Return nodupArray
}
Helgef
Posts: 3703
Joined: 17 Jul 2016, 01:02

Re: detect/Remove duplicate values in array

09 Nov 2017, 17:26

If you don't mind it being sorted, try this,

Code: Select all

uniqueArr(arr, del := ""){
	return sortArr(arr, (del!=""?"D" del : "D" chr(1)) . " U C")
}
sortArr(arr,opt:=""){
	; Sort array using sort options
	; https://autohotkey.com/docs/commands/Sort.htm
	static guess:=0
	local delimiter,k,v,str
	if guess
		VarSetCapacity(str,arr.length()*guess,0)	; Speed can be improved by making a guess on the needed size.
	RegExMatch(opt,"O)\bD(.)\b",delimiter) ? delimiter:=delimiter[1] : (delimiter:="`n", opt.=" D`n")
	for k, v in arr
		str.=v . delimiter
	str:=RTrim(str,delimiter)
	Sort,str, % opt
	return StrSplit(str,delimiter)
}
; Example,
for k, v in uniqueArr(["a","a","x","x","b",1,2,3,3,4,9,9,9,9,9,9,9,9])
	str .= v "`n" 
msgbox % str
Edit: Also if you don't mind case insensitivity,

Code: Select all

unique(arr){
	local temp := [], out := []
	local k, v
	for k, v in arr
		temp[v] := ""
	for k in temp
		out[A_Index] := k
	return out
}
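
Because unique() collects the values as object keys, the result comes back in key order rather than input order. A small sketch of that side effect, assuming AutoHotkey v1.1 key handling (integer keys enumerate first, then string keys alphabetically, with string keys matched case-insensitively); the sample values are made up for illustration:

```autohotkey
; unique() repeated from the post above so the sketch is self-contained.
unique(arr){
	local temp := [], out := []
	local k, v
	for k, v in arr
		temp[v] := ""
	for k in temp
		out[A_Index] := k
	return out
}

list := ""
for k, v in unique(["b", "a", "B", 2, 1])
	list .= v "`n"
; On v1.1, list should hold four lines: 1, 2, a, then a single entry for b/B.
```

So this variant suits cases where neither the original order nor the letter case of the values matters.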
teadrinker
Posts: 669
Joined: 29 Mar 2015, 09:41

Re: detect/Remove duplicate values in array

09 Nov 2017, 17:41

Or like this:

Code: Select all

arr := ["a","b","c","c","b","d"]
newArr := [], testArr := []

for k, v in arr
   if !testArr.HasKey(v)
      testArr[v] := true, newArr.Push(v)
   
for k, v in newArr
   MsgBox, % v
jeeswg
Posts: 6371
Joined: 19 Dec 2016, 01:58
Location: UK

Re: detect/Remove duplicate values in array

09 Nov 2017, 20:00

A warning about the case where an object key is called 'HasKey':

Code: Select all

q:: ;arrays and HasKey
;if an array has a key called HasKey,
;HasKey() will fail, so use ObjHasKey instead
oArray := {hello:0}
MsgBox, % oArray.HasKey("hello") ;1
oArray := {Haskey:0}
MsgBox, % oArray.HasKey("HasKey") ;(blank)
MsgBox, % oArray.HasKey("abc") ;(blank)
MsgBox, % ObjHasKey(oArray, "HasKey") ;1
MsgBox, % ObjHasKey(oArray, "abc") ;0
return
Example scripts that remove duplicates from an array, one case insensitive and one case sensitive, both handling numeric vs. string keys/values.

Code: Select all

q:: ;array - remove duplicates (case insensitive)
oArray := ["a","B","c","A","B","C",1,1.0,"1","1.0"]
oArray2 := [], oTemp := {}
for vKey, vValue in oArray
{
	if (ObjGetCapacity([vValue], 1) = "") ;is numeric
	{
		if !ObjHasKey(oTemp, vValue+0)
			oArray2.Push(vValue+0), oTemp[vValue+0] := ""
	}
	else
	{
		if !ObjHasKey(oTemp, "" vValue)
			oArray2.Push("" vValue), oTemp["" vValue] := ""
	}
}
vOutput := ""
for vKey, vValue in oArray2
	vOutput .= vKey " " vValue "`r`n"
MsgBox, % vOutput
return

w:: ;array - remove duplicates (case sensitive)
oArray := ["a","B","c","A","B","C",1,1.0,"1","1.0"]
oArray2 := [], oTemp := ComObjCreate("Scripting.Dictionary")
for vKey, vValue in oArray
	if !oTemp.Exists(vValue)
		oArray2.Push(vValue), oTemp.Item(vValue) := ""
vOutput := ""
for vKey, vValue in oArray2
	vOutput .= vKey " " vValue "`r`n"
MsgBox, % vOutput
return
Last edited by jeeswg on 10 Nov 2017, 19:15, edited 1 time in total.
Helgef
Posts: 3703
Joined: 17 Jul 2016, 01:02

Re: detect/Remove duplicate values in array

10 Nov 2017, 04:22

@ Exaskryz, Pop is faster. Edit: RemoveAt is slower, but Helgef is even slower: I finally realised that you use RemoveAt instead of Pop because the former maintains the original order :oops:
@ derz00, you need to delimit A_LoopField; otherwise you will find that, e.g., InStr(nodup, "a") is true for nodup := "aa". Something like this might fix that:

Code: Select all

RemoveDup(obj) {
	for i, value in obj
		str.=value "`n"
	str:=SubStr(str, 1, -1)						; Drop the trailing delimiter so the parsing loop gets no empty last field
	nodupArray:={}
	nodup:= "`n" 									; Added delimiter
	loop parse, str, `n
		if !InStr(nodup,  "`n"  A_LoopField "`n" )	; Added delimiter
		{
			nodup.=A_LoopField "`n"
			nodupArray.Push(A_LoopField)
		}
	Return nodupArray
}
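
The substring pitfall behind the added delimiters can be checked in isolation. A minimal sketch, assuming AutoHotkey v1.1; the variable names are made up for illustration:

```autohotkey
seen := "`naa`nb`n"	; delimited list that already holds "aa" and "b"
plainHit := InStr(seen, "a")	; 2: the bare needle matches inside "aa"
delimitedHit := InStr(seen, "`na`n")	; 0: no whole entry equals "a"
```

With the bare needle, "a" would wrongly be treated as already seen; wrapping the needle in delimiters restricts matches to whole entries.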
@ teadrinker, :thumbup: You could use objhaskey, as pointed out by jeeswg.
@ jeeswg :thumbup: . Fyi, your second script behaves differently on v2. (The first one too, but that is more obvious)
Cheers.
Last edited by Helgef on 10 Nov 2017, 18:32, edited 1 time in total.
teadrinker
Posts: 669
Joined: 29 Mar 2015, 09:41

Re: detect/Remove duplicate values in array

10 Nov 2017, 05:02

This is the first time I've seen ObjHasKey and similar functions. Where are they described?
Helgef
Posts: 3703
Joined: 17 Jul 2016, 01:02

Re: detect/Remove duplicate values in array

10 Nov 2017, 05:04

I think you will find them if you search the help file index. I see my link went to the method.
teadrinker
Posts: 669
Joined: 29 Mar 2015, 09:41

Re: detect/Remove duplicate values in array

10 Nov 2017, 05:13

Hmm, for me the link leads to Object.HasKey(Key).
teadrinker
Posts: 669
Joined: 29 Mar 2015, 09:41

Re: detect/Remove duplicate values in array

10 Nov 2017, 05:23

Found:
Each method also has an equivalent function, which can be used to bypass any custom behaviour implemented by the object -- it is recommended that these functions only be used for that purpose.
Helgef
Posts: 3703
Joined: 17 Jul 2016, 01:02

Re: detect/Remove duplicate values in array

10 Nov 2017, 06:15

Thanks teadrinker.
It may be worth noting that using the functions can improve performance, presumably because the functions don't involve any array look-ups. Also, there are a few other ObjXXX functions which have their own spot in the documentation:

Code: Select all

objAddRef()
objRelease()
objBindMethod()
objRawSet()
Cheers.
Edit:
@ derz00, I'm sorry that your topic is slightly derailing :oops:
jeeswg wrote:@Helgef: Are you serious about the approaches, both of them, not being two-way compatible? Did you try to find a fix? I'll look into it
I am serious :beard:. Two-way compatibility isn't something I consider in need of fixing, so no, I didn't try. And I don't think you can, since, e.g., obj[1 ""] := obj[1] := value yields two key/value pairs (one integer key and one string key) in v1, while in v2 it yields only an integer key. For the first script, you can use try/catch to avoid the exception when you try to call a non-existent method.
Last edited by Helgef on 10 Nov 2017, 06:51, edited 1 time in total.
jeeswg
Posts: 6371
Joined: 19 Dec 2016, 01:58
Location: UK

Re: detect/Remove duplicate values in array

10 Nov 2017, 06:16

@Helgef: Are you serious about the approaches, both of them, not being two-way compatible? Did you try to find a fix? I'll look into it.

@teadrinker: I try to list the ObjXXX functions at both of these links:
[note: I intend to improve the objects tutorial significantly in future.]
jeeswg's objects tutorial - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=29232
list of every command/function/variable from across all versions - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 42#p131642
derz00
Posts: 497
Joined: 02 Feb 2016, 17:54
GitHub: derz00
Location: Middle of the round cube

Re: detect/Remove duplicate values in array

10 Nov 2017, 08:34

Helgef wrote: Edit:
@ derz00, I'm sorry that your topic is slightly derailing :oops:
I don't forgive you, because I am thankful for it! :thumbup: My fears quickly died that I might not get a response. :) I should bring all these suggestions together into a dependable function and post it.
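
One way the suggestions so far could be combined is sketched below, assuming AutoHotkey v1.1. The function name RemoveDuplicates and its caseSensitive parameter are hypothetical; the case-insensitive branch follows teadrinker's lookup-object idea with the ObjHasKey and ObjRawSet functions, and the case-sensitive branch reuses the Scripting.Dictionary approach from jeeswg's second script:

```autohotkey
; Hypothetical consolidation, keeps the first occurrence of each value in its original position.
RemoveDuplicates(arr, caseSensitive := false) {
	out := []
	if (caseSensitive)
	{
		seen := ComObjCreate("Scripting.Dictionary")	; Windows COM object
		for index, value in arr
			if !seen.Exists(value)
				out.Push(value), seen.Item(value) := ""
	}
	else
	{
		seen := {}
		for index, value in arr
			if !ObjHasKey(seen, value)	; function form, unaffected by a value named "HasKey"
				out.Push(value), ObjRawSet(seen, value, "")
	}
	return out
}

result := RemoveDuplicates(["a", "B", "c", "A", "B", "C"])	; a, B, c
```

Passing true as the second argument should keep "A" and "C" as well, since the dictionary is expected to compare keys case-sensitively, as in jeeswg's script. Numeric-looking values are not given any special treatment here.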
jeeswg
Posts: 6371
Joined: 19 Dec 2016, 01:58
Location: UK

Re: detect/Remove duplicate values in array

10 Nov 2017, 19:33

For the HasKey warning script: AHK v2 is more strict and the script ends. As Helgef said, you can use 'try' to avoid this.

For the remove duplicates scripts, they are both working on both AHK v1 and AHK v2. But there are some differences relating to key names that look numeric.

The results I'm getting with AHK v2 here are surprising.
How can I get a numeric key name '1', and a string key name '1', in the same array?

Code: Select all

oArray := {}
oArray[1] := "a"
oArray[1.0] := "b"
oArray["1"] := "c"
oArray["1.0"] := "d"
vOutput := ""
for vKey, vValue in oArray
	vOutput .= vKey " " vValue "`r`n"
MsgBox, % vOutput
;MsgBox(vOutput)
return

;AHK v1
;1 a
;1 c
;1.0 d

;AHK v2
;1 c
;1.0 d
[EDIT:] Further tests on AHK v2:

Code: Select all

;AHK v2
MsgBox(Type(1)) ;Integer
MsgBox(Type("1")) ;String (as expected)
MsgBox(Type(1+0)) ;Integer

oArray := {}
oArray[1] := ""
for vKey, vValue in oArray
	MsgBox Type(vKey) ;Integer
oArray := ""

oArray := {}
oArray["1"] := ""
for vKey, vValue in oArray
	MsgBox Type(vKey) ;Integer (surprising)
oArray := ""
return
Helgef
Posts: 3703
Joined: 17 Jul 2016, 01:02

Re: detect/Remove duplicate values in array

11 Nov 2017, 04:09

jeeswg wrote:; Integer (surprising)
Please refer to the v2 documentation, objects -> keys.

Cheers
jeeswg
Posts: 6371
Joined: 19 Dec 2016, 01:58
Location: UK

Re: detect/Remove duplicate values in array

11 Nov 2017, 04:16

Thanks so much, it's such a relief to have an explanation for what was going on. This is exactly the sort of behaviour I thought AHK v2 was supposed to eliminate. Do you find this behaviour surprising? How can I specify a number stored as a string? Do you know? Thanks.

I'll reserve judgement for now, but presently this behaviour is very concerning.

[EDIT:] Here's a legitimate usage scenario that gets messed up by the 'string looks number' assumption. It works fine in AHK v1, but not in AHK v2.

Code: Select all

q::
oArray := {1:0,2:0,3:0,1a:0,2a:0,3a:0}
vOutput := ""
for vKey, vValue in oArray
	vOutput .= vKey " " vValue "`r`n"
MsgBox(vOutput)
;MsgBox, % vOutput

oArray := {"1":0,"2":0,"3":0,1a:0,2a:0,3a:0}
vOutput := ""
for vKey, vValue in oArray
	vOutput .= vKey " " vValue "`r`n"
MsgBox(vOutput)
;MsgBox, % vOutput
return

;for the 2nd example:
;AHK v1
;1 0
;1a 0
;2 0
;2a 0
;3 0
;3a 0

;AHK v2
;1 0
;2 0
;3 0
;1a 0
;2a 0
;3a 0
[EDIT:] And another one, re. dealing with hex strings generally:

Code: Select all

q::
oArray := {"00":0,"40":64,"80":128,"C0":192}
vOutput := ""
for vKey, vValue in oArray
	vOutput .= vKey " " vValue "`r`n"
MsgBox(vOutput)
;MsgBox, % vOutput
return

;AHK v1
;00 0
;40 64
;80 128
;C0 192

;AHK v2
;40 64
;80 128
;00 0
;C0 192
Helgef
Posts: 3703
Joined: 17 Jul 2016, 01:02

Re: detect/Remove duplicate values in array

11 Nov 2017, 08:14

If I had a dollar for every second I waited for your scripts to finish before I realise I didn't hit q... :D. Shame on me for not learning, but indenting the hotkey routine makes it clearer, imho.
jeeswg wrote:This is exactly the sort of behaviour I thought AHK v2 was supposed to eliminate.
It does eliminate this sort of code:

Code: Select all

if !ObjHasKey(oTemp, "" vValue)
	oArray2.Push("" vValue), oTemp["" vValue] := ""
Although I can appreciate the fun of tinkering with script language specifics, it is probably better if the above is never necessary.
jeeswg wrote:Do you find this behaviour surprising?
I do not find it surprising that the behaviour is according to the documentation.
I do not find it surprising that v2 behaves differently from v1.
I do not find it surprising that the choice to change the behaviour was made. I'm not claiming the following is the (only) reason for the change, but integer keys perform better than string keys, and take less space (on average, I guess). If it looks like a number, perhaps it is a number ;)
jeeswg wrote:How can I specify a number stored as a string? Do you know? Thanks.
If you are desperate,

Code: Select all

arr[key:="01"] := "" ; type(key) = string, key+0 = "01" + 0 = 1

jeeswg wrote: Here's a legitimate usage scenario that gets messed up by the 'string looks number' assumption. It works fine in AHK v1, but not in AHK v2.
Legitimate, sure, but I think it is rare enough not to cast any doubt on whether the change in key handling causes an unacceptable loss of features compared with the improvements the change brings.
jeeswg wrote:I'll reserve judgement for now, but presently this behaviour is very concerning.
That is sound, and I will do that too; however, I am not presently concerned. But I would like to hear any arguments, both pros and cons. This thread is probably not the place though.

Cheers.
jeeswg
Posts: 6371
Joined: 19 Dec 2016, 01:58
Location: UK

Re: detect/Remove duplicate values in array

11 Nov 2017, 09:04

- I want an object with obj[1] and obj["1"], can you achieve that?
- Do you have any examples of hotkey labels and indentation? I haven't seen anyone consistently do this on the forum, so I couldn't imitate the style if I wanted to. And I have some grey areas re. (a) what's considered standard, (b) rules for automating indentation.
- While writing those 2 lines of code, I was actually pleased that I could handle all of the issues to do with string v. numeric key names so succinctly and effectively, and with no ambiguity.
- Re. surprise, I think it's been changed because 'users might be stupid', and I respect the intent, but it could quite easily cause more mistakes and not fewer. Anyone who uses arrays has to understand that there are string keys and numeric keys, it's fundamental. In situations like this you improve the documentation, you don't 'improve' (dumb-down and over-complicate) the language. It will annoy the power users and it won't help the newbies.
- Re. surprise. Almost everything brought in in AHK v1.1, i.e. by lexikos, has remained consistent in AHK v2, this would be an exception.
- I like to use integer keys also, but sometimes you intend them to be numbers stored as strings for good reasons, especially when you are handling strings and doing loops and sorts. Crucially, I at least need *a* way to do it, which I haven't so far seen.

[EDIT:]
- Ultimately, the number one thing that annoys and confuses both power users and newbies is ambiguity. I.e. fiddliness and special exceptions. Things that are clear and consistent work best.
- Currently I have no complaints about how AHK v1 handles the string/numeric key issue.
- One solution when handling strings could be to use a prefix character, so what you gain with integers, you lose with strings.
- Like I said the jury is still out, it partly depends on if there are ways to directly create key names e.g. '1', '2', '3', which are numbers stored as strings.
- There are situations in Explorer where, in one object, you refer to items by number and by name at the same time, e.g. the nth file (integer key) and the file name (string key); however, you may have a folder with a name comprised solely of digits, e.g. a datestamp, or simply folders named '1', '2' etc.
- You could also have issues relating to numbers with/without leading zeros.

[EDIT:] So Helgef, you've revealed to me (1) the deref in force an expression/can be an expression issue, (2) the no assume local issue, and now (3) the string keys in AHK v2 issue. Hmm, what next?
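
The prefix-character idea from the list above could look roughly like this. It is only a sketch in AutoHotkey v1.1 syntax; KeyFor(), NameOf() and the "#" prefix are hypothetical choices, not an established convention:

```autohotkey
; Every name is stored under a prefixed key, so "1", "01" and 1 never collapse into one integer key.
folders := {}
for index, name in ["1", "01", "2017"]
	folders[KeyFor(name)] := index	; keys become "#1", "#01", "#2017"

count := 0, names := ""
for key in folders
	count++, names .= NameOf(key) "`n"
; count is 3: "1" and "01" stay separate because neither key looks numeric.

KeyFor(name) {
	return "#" name	; any character that cannot start a number would do
}
NameOf(key) {
	return SubStr(key, 2)	; strip the one-character prefix again
}
```

The cost is the one mentioned in the list: every access has to go through the prefix, and keys enumerate in string order rather than numeric order.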