VarSetCapacity

Discuss the future of the AutoHotkey language
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

VarSetCapacity

13 May 2019, 01:50

What is the plan for VarSetCapacity?

Is it to be similar to this? Or something different?

Code: Select all

size/length:
SizeOf(Var) [bytes]
StrLen(Var) [chars] [already exists]
VarSetSize(Var, Size, FillByte) [bytes]
StrSetLen(Var, Len, FillChar) [chars]

capacity:
VarGetCapacity(Var) [bytes]
StrGetCapacity(Var) [chars]
VarSetCapacity(Var, Capacity, FillByte) [bytes] [already exists]
StrSetCapacity(Var, Capacity, FillChar) [chars]
Note: perhaps all of the 'Set' functions should maintain existing data.

Based on:
conversion logic, v1 = -> v1 := -> v2, two-way compatibility - Page 7 - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=37&t=27069&p=215816#p215816
homepage | tutorials | wish list | fun threads | donate
WARNING: copy your posts/messages before hitting Submit as you may lose them due to CAPTCHA
Helgef
Posts: 4031
Joined: 17 Jul 2016, 01:02
Contact:

Re: VarSetCapacity

13 May 2019, 04:32

VarSetCapacity wrote:Deprecated: This function may be changed or removed in a future version of the program.
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: VarSetCapacity

13 May 2019, 09:06

How are we then to create a variable of a certain size for structs or for long strings, without using a Buffer object? We could rename the function, but to what? It *sets the capacity of a variable*.

Code: Select all

VarSetCapacity(vOutput, 1000*2)
Loop, 100
	vOutput .= "abcdefghij"
MsgBox, % vOutput

VarSetCapacity(vData, 16, 0)
Loop, 4
	NumPut(0x11*A_Index, &vData+A_Index*4-4, "UInt")
MsgBox, % Format("0x{:016X}{:016X}", NumGet(&vData+8, "UInt64"), NumGet(&vData, "UInt64"))

I read this:
VarSetCapacity - Syntax & Usage | AutoHotkey v2
https://lexikos.github.io/v2/docs/commands/VarSetCapacity.htm#Remarks
VarSetCapacity controls the size of a variable's string buffer, which should not be used to create structures or store other binary data. A future update might remove this function or change it to accept the new size in characters rather than bytes.
VarSetCapacity sets the capacity of a *variable*, not a *string*, so why would it be measured in characters? The natural unit for a variable's contents is bytes. (And ListVars lists variables, not strings only.)

If VarSetCapacity used characters (please not this), then what XXXSetCapacity function would handle bytes?

VarSetCapacity can be changed behind the scenes, but if it can create a variable of a certain size for structs or for long strings, and if people don't have to update VarSetCapacity in their old scripts, then everybody who uses it will be happy.

If for some reason I had to replace 'VarSetCapacity(' with another string, that wouldn't be so bad.

I had thought for AHK v1 (and v2): VarSetCapacity for bytes, StrSetCapacity for strings (to avoid having to check A_IsUnicode).
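
For illustration, the character-based wrapper described above could be sketched in AutoHotkey v1 roughly as follows. StrSetCapacity here is a hypothetical user-defined helper, not a built-in; it only hides the A_IsUnicode check around the existing byte-based VarSetCapacity.

```autohotkey
; AutoHotkey v1 sketch. StrSetCapacity is a hypothetical helper, not a built-in.
StrSetCapacity(ByRef vVar, vChars)
{
	; VarSetCapacity works in bytes; a Unicode build stores 2 bytes per character.
	vChrSize := A_IsUnicode ? 2 : 1
	; Return the granted capacity converted back to characters.
	return VarSetCapacity(vVar, vChars * vChrSize) // vChrSize
}

StrSetCapacity(vOutput, 1000)  ; reserve room for 1000 characters up front
Loop, 100
	vOutput .= "abcdefghij"
MsgBox, % StrLen(vOutput)  ; 1000
```

The call site no longer mentions bytes or A_IsUnicode, which is the readability gain being argued for; struct code would keep calling VarSetCapacity directly.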
Helgef
Posts: 4031
Joined: 17 Jul 2016, 01:02
Contact:

Re: VarSetCapacity

13 May 2019, 09:38

jeeswg wrote:
13 May 2019, 09:06
I read this:
VarSetCapacity controls the size of a variable's string buffer
And then you write this:
jeeswg wrote:VarSetCapacity sets the capacity of a *variable*, not a *string*
:?:
jeeswg wrote:The natural unit for a variable's contents is bytes.
You should read about variables in Concepts and conventions.
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: VarSetCapacity

13 May 2019, 10:09

As long as VarSetCapacity can set the capacity of a variable (to at least the required capacity), ready for binary data/a string, and can do what it's documented to do, none of the other details are of any significance from the user's standpoint. It's a black box.

Integers/floats/strings/binary data use bytes. SizeOf, to get the size of a variable, uses bytes. You were probably thinking 'oh but objects', fair point, although object references and object data require bytes.

If you saw a 'VarGetCapacity' function, in a programming language, what unit would you expect it to return? And SizeOf is effectively 'VarGetSize'. The natural unit for SizeOf is bytes.

You should read about [insert inane recommendation here].
lexikos
Posts: 6668
Joined: 30 Sep 2013, 04:07
GitHub: Lexikos

Re: VarSetCapacity

13 May 2019, 16:55

jeeswg wrote:How are we then to create a variable of a certain size for structs [...], without using a Buffer object.
You are not.
VarSetCapacity wrote:The Buffer object offers superior clarity and flexibility when dealing with binary data, structures, DllCall and similar. For instance, a Buffer object can be assigned to a property or array element or be passed to or returned from a function without copying its contents.
NumPut wrote:Deprecated: If Target is a variable such as MyVar and it contains a string (not a pure number), the address and size of the variable's string buffer is used. Passing a variable containing a string may be treated as an error in a future release. Buffer objects should be used for structs and other binary data.
File.RawRead wrote:Reading into a Buffer created by BufferAlloc is recommended. [...]
Deprecated: To read into a variable, pass a variable which is empty or contains a string. If Bytes is omitted, it defaults to the capacity of the variable. If Bytes is greater than the capacity of the variable, the variable is expanded. After the data is read, the variable's length is set to the string length of the data (rounded up to a whole number).
Using a Buffer adds clarity, both to the code and to the interface: if you pass a variable, that variable's value is used. Currently, either the variable's value or the variable's buffer is used, and it is impossible to use the return value of a function, property or array element in the same manner, even if a function just accepts the variable and returns it directly. This is inflexible and counter-intuitive.
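
A rough sketch of the composability point, assuming the syntax of the later v2.0 release (where, as far as I know, BufferAlloc became Buffer and NumPut takes the type before the number). FillQuad and holder are hypothetical names used only for illustration.

```autohotkey
#Requires AutoHotkey v2.0
; FillQuad is a hypothetical helper: it receives a Buffer, writes into it
; and returns the same object, so the 16 bytes are never copied.
FillQuad(buf) {
    Loop 4
        NumPut("UInt", 0x11 * A_Index, buf, A_Index * 4 - 4)
    return buf
}

data := FillQuad(Buffer(16, 0))   ; 16 zero-filled bytes
holder := {payload: data}         ; a Buffer can be stored in a property
MsgBox Format("0x{:08X}", NumGet(holder.payload, 12, "UInt"))  ; 0x00000044
```

With a variable's string buffer, neither the return value nor the property could be used this way, since the buffer stays bound to the variable that owns it.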

jeeswg wrote:VarSetCapacity sets the capacity of a *variable*, not a *string*, so why would it be measured in characters?
The capacity of a variable for what? It sets the variable's capacity to hold a string. The variable's string buffer.

The problem is this concept of a string buffer tightly bound to the variable. It cannot be transferred to any other container. It implies that the string data type is copied on assignment, which could be very inefficient. My intention is to decouple the value (a string or string buffer) from the container (a variable, array element or property) for clarity and to enable future optimization. Without this and Buffer, there may be:

Code: Select all

VarSetByteCapacity(var, bytes [, byteFill])
VarSetCharCapacity(var, chars [, charFill])
myObject.SetPropertyByteCapacity(name, bytes [, byteFill])
myObject.SetPropertyCharCapacity(name, chars [, charFill])
myArray.SetItemByteCapacity(index, bytes [, byteFill])
myArray.SetItemCharCapacity(index, chars [, charFill])
myMap.SetItemByteCapacity(index, bytes [, byteFill])
myMap.SetItemCharCapacity(index, chars [, charFill])
VarGetByteCapacity(var)
VarGetCharCapacity(var)
myObject.GetPropertyByteCapacity(name)
myObject.GetPropertyCharCapacity(name)
myArray.GetItemByteCapacity(index)
myArray.GetItemCharCapacity(index)
myMap.GetItemByteCapacity(index)
myMap.GetItemCharCapacity(index)
instead of:

Code: Select all

var := BufferAlloc(bytes [, byteFill])  ; Hypothetical. Currently no byteFill parameter.
bytes := var.Size
var.Size := bytes  ; or var.SetCapacity(bytes [, byteFill])

var := StrBuffer(chars [, charFill])  ; Hypothetical.
chars := var.Size
var.Size := chars  ; or var.SetCapacity(chars [, charFill])

myObject.%name% := var
myArray[name] := var
myMap[name] := var
Composability is key.
jeeswg wrote:[...] if people don't have to update VarSetCapacity in their old scripts, then everybody who uses it will be happy.
I'll be happier if you stop flogging this dead horse. The use of VarSetCapacity is a very minor thing to update, with clear benefits. No one has to update VarSetCapacity or anything else in their old scripts, unless they want to use v2. If they use v2, they will need to update the script, in ways sometimes much more significant than replacing VarSetCapacity with BufferAlloc.

Code: Select all

vData := BufferAlloc(16)
Loop, 4
	NumPut(0x11*A_Index, vData, A_Index*4-4, "UInt")
MsgBox Format("0x{:016X}{:016X}", NumGet(vData, 8, "UInt64"), NumGet(vData, "UInt64"))
Notice I only changed the first line, removed & and replaced + with ,. Using the offset parameter and passing a variable reference (in v1) or Buffer (in v2) is safer in this case, as NumPut/NumGet will perform bounds checking, which prevents you from making mistakes that are difficult to debug. It also performs better, and is arguably more readable. If you had done it that way in the first place, only the first line would have needed to change.
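
For comparison, a sketch of the original v1 struct example written with a variable reference and the Offset parameter, as suggested above. The out-of-range offset of 4096 is an arbitrary illustrative value; based on the bounds checking described in this post, that call is expected to be rejected rather than written.

```autohotkey
; AutoHotkey v1 sketch: pass the variable itself plus an offset, not &vData+offset.
VarSetCapacity(vData, 16, 0)
Loop, 4
	NumPut(0x11*A_Index, vData, A_Index*4-4, "UInt")
MsgBox, % Format("0x{:08X} 0x{:08X}", NumGet(vData, 0, "UInt"), NumGet(vData, 12, "UInt"))

; Far outside the 16-byte variable: with a variable reference, NumPut can compare
; offset + size against the variable's capacity and refuse the write.
vRet := NumPut(1, vData, 4096, "UInt")
MsgBox, % (vRet = "") ? "rejected" : "written"
```

With &vData+4096 the function would only see a raw address, so no such check would be possible.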
jeeswg
Posts: 6904
Joined: 19 Dec 2016, 01:58
Location: UK

Re: VarSetCapacity

06 Jun 2019, 20:10

- I'm grateful for your interesting and comprehensive summary.

- Here's how I've seen the situation:
- I was happy with VarSetCapacity for *structs*.
- I wanted 'StrSetCapacity' or similar, for *big strings*: to remove the A_IsUnicode check, for readability, and for clarifying where structs/big strings had been created.

- I had written something similar to BufferAlloc, but still preferred VarSetCapacity 99% of the time:
- Func(&vData+4) beats Func(oData.Ptr+4) for readability, for functions that need pointers (only NumGet/NumPut have an Offset parameter).
- ... I'd be happy to rewrite old code for a benefit, or even where it's cost neutral, but removing & and adding .Ptr etc in every script would be painstaking/careful work for less readable code.
- I almost never need to pass binary data around a script. And can use XXXPut/XXXGet functions to achieve this.
- I have never needed a safety check of address plus size. So for me, that doesn't count as an argument.
- (I'm not too fond of 'smart parameters', e.g. NumPut/NumGet/StrPut/StrGet. Something which BufferAlloc appears to be encouraging the use of: a separate Offset parameter. I'd rather parameters had a consistent location, and were either used or not used. Are they common in any programming languages? Does the parameter checking affect performance? I'd considered A_ZBIndex/A_ZeroIndex, as an aid to removing the NumGet/NumPut Offset parameter.)

- If VarSetCapacity were removed, I might create 'VarSetSize' (or 'StrSetSize') like so:

Code: Select all

JEE_VarSetSize(ByRef vVar, oParams*)
{
	if oParams.Length()
	{
		vSize := oParams.1
		vFillByte := oParams.2 ? oParams.2 : 0
		vChars := Ceil(vSize / (A_IsUnicode?2:1))
		vVar := Format("{:" vChars "}", "")
		DllCall("ntdll\RtlFillMemory", "Ptr",&vVar, "UPtr",vSize, "UChar",vFillByte)
	}
	return StrLen(vVar)*(A_IsUnicode?2:1)
}
JEE_VarSetSize(vData, 16)
MsgBox(JEE_VarSetSize(vData))
- Something like that would be useful built-in, or VarSetCapacity maintained, to clearly signal when structs were created, or if need be:

Code: Select all

;if StrRept and A_ChrSize were added:
vData := StrRept("_", vSize/A_ChrSize+1)

- A key need re. data is handling for hex/base64.
- Would it be worth submitting these as pull requests?
- Otherwise you're welcome to adapt the code or suggest edits in the thread.
C++: AHK source code: Base64Get/Base64Put and HexGet/HexPut - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=75&t=64694
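
As a script-level stand-in for the kind of hex handling mentioned above, a v1 sketch might look like this. BytesToHex is a hypothetical name chosen for illustration; it is not the HexGet function from the linked thread.

```autohotkey
; AutoHotkey v1 sketch. BytesToHex is hypothetical, not a built-in.
BytesToHex(ByRef vData, vSize)
{
	vHex := ""
	Loop, % vSize
		vHex .= Format("{:02X}", NumGet(vData, A_Index-1, "UChar"))  ; one byte per pass
	return vHex
}

VarSetCapacity(vData, 4, 0)
NumPut(0x44332211, vData, 0, "UInt")
MsgBox, % BytesToHex(vData, 4)  ; bytes are stored little-endian: 11223344
```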

- I think that rearranging the order of the NumPut parameters would be a bit of a 'disaster', and forum members would be dealing with the confusion forever afterwards.
- I would suggest an additional function be created if desired.
- One alternative would be to allow the Number parameter (and Type parameter) to accept an array.
NumPut(Number, VarOrAddress [, Offset := 0][, Type := "UPtr"])
- E.g. Number [1, 2, 3, 4], plus Type "Int" would write 4 Ints.
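
A minimal v1 sketch of how the array idea could behave if written as a separate script function. NumPutArray and its parameter order are hypothetical; only a single Type for all items is handled here, not an array of types.

```autohotkey
; AutoHotkey v1 sketch. NumPutArray is hypothetical, not a built-in.
NumPutArray(oNums, ByRef vVar, vOffset := 0, vType := "Int")
{
	static oSizes := {Char:1, UChar:1, Short:2, UShort:2, Int:4, UInt:4, Float:4, Int64:8, Double:8}
	vSize := (vType = "Ptr" || vType = "UPtr") ? A_PtrSize : oSizes[vType]
	for _, vNum in oNums
	{
		NumPut(vNum, vVar, vOffset, vType)  ; write one item
		vOffset += vSize                    ; advance by the item's size
	}
	return vOffset  ; offset just past the last item written
}

VarSetCapacity(vData, 16, 0)
NumPutArray([1, 2, 3, 4], vData)    ; writes 4 Ints
MsgBox, % NumGet(vData, 8, "Int")   ; third Int: 3
```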

lexikos wrote:The use of VarSetCapacity is a very minor thing to update, with clear benefits. No one has to update VarSetCapacity or anything else in their old scripts, unless they want to use v2.
...
Notice I only changed the first line, removed & and replaced + with ,.
- Replacing strings with Buffer objects is probably the most involved automated conversion task I've seen yet. It would affect VarSetCapacity/DllCall/NumGet/NumPut/custom function calls and other miscellaneous lines, and any dynamic references to variables would add to the complexity.
- I did the cost-benefit analysis, and decided it's better to use string buffers to store binary data, 99% of the time. E.g. as mentioned above, it would just decrease script readability to convert the scripts.
- I fully welcome the addition of BufferAlloc, e.g. for passing binary data via function return values, and for passing pointers instead of data.

- One curio, oBuffer[4] might be handy cf. oBuffer.Ptr+4, although possibly this idea was already in the aether.
- I'd be interested in any advantages of StrBuffer over current string handling, but am fine with the current string functionality.
Helgef
Posts: 4031
Joined: 17 Jul 2016, 01:02
Contact:

Re: VarSetCapacity

07 Jun 2019, 00:33

jeeswg wrote:I have never needed a safety check of address plus size. So for me, that doesn't count as an argument.
Since you pass &var, it would never apply. It is a good feature.
