Need experts opinion/advice on Random command

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Need experts opinion/advice on Random command

Post by SKAN » 04 May 2021, 07:42

When was the Random command last updated?
I see only the following in chm doc.
 
1.0.46.15 - May 9, 2007
Fixed FileInstall causing the Random command to become non-random in compiled scripts. [thanks Velocity]
 
I need some clarification on the following quotes from doc.
 

Code: Select all

Random, , NewSeed
 
IIUC, the concept of Seed is gone from API from Windows 8. Windows now automatically does it.
 
The lowest allowed value is -2147483648
the largest number will be 2147483647
 
Why not a higher range?
 
If either Min or Max contains a decimal point, the end result will be a floating point number in the format set by SetFormat.
 
SetFormat is deprecated. If Random was function, then I would use my RoundT() to control the precision.
Without SetFormat, I'm stuck at max precision of 0.6. Is there any other way, that I am not seeing?
 
occasionally a result can be slightly greater than the specified Max (this is caused in part by the imprecision inherent in floating point numbers).
 
So: If max is critical to me I have to do an additional validation with Min()?. Why can't Min() be done by AHK itself?
 
This function uses the Mersenne Twister random number generator, MT19937, written I need to send by Takuji Nishimura and Makoto Matsumoto, Shawn Cokus, Matthe Bellew and IsakuWada.
....
When you use this, send an email to: [email protected] with an appropriate reference to your work.
 
The license for all my code is PD-Self
So: If I include Random in any of my functions ( for example: RandowPW() ),
I need to send email, as well as put an note in my topic that any users of my function should do the same? :roll:
 
Do NOT use for CRYPTOGRAPHY without securely hashing several returned values together,
otherwise the generator state can be learned after reading 624 consecutive values.
 
That's very discouraging. I have my own Random function, but I don't know enough about this subject.
So, If I share my native implementation of Random I need to use a more stronger disclaimer? :(
 
How fantastic is this MT19937? Can't it be replaced? Can't we do it with Windows API?.
If there is any hesitation owing to a a breaking change, why not an additional Random function that doesn't impose such a usage condition?

Thanks.

User avatar
jNizM
Posts: 3183
Joined: 30 Sep 2013, 01:33
Contact:

Re: Need experts opinion/advice on Random command

Post by jNizM » 04 May 2021, 08:42

Not sure if useful RtlRandomEx function (MSDN) (returns a random number in the range [0..MAXLONG-1]) (But MAXLONG = 2147483647)
or BCryptGenRandom function (MSDN) (cryptographically secure pseudorandom number generator)

CryptGenRandom (Wiki)
[AHK] v2.0.5 | [WIN] 11 Pro (Version 22H2) | [GitHub] Profile

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Need experts opinion/advice on Random command

Post by SKAN » 04 May 2021, 09:07

Hi @jNizM
 
I already wrote: IIUC, the concept of Seed is gone from API from Windows 8. Windows now automatically does it.
 

Code: Select all

#NoEnv
#Warn
#SingleInstance, Force

; XP+
DllCall("Advapi32.dll\SystemFunction036", "UIntP",Num:=0, "Int",4)
MsgBox % Num

; Vista+
DllCall("Bcrypt.dll\BCryptGenRandom", "Ptr",0, "UIntP",Num:=0, "Int",4, "Int",0x2)
MsgBox % Num
About RtlGenRandom ( SystemFunction036 )
That function is deprecated, but will not be removed from API because CRT function rand_s uses it.

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Need experts opinion/advice on Random command

Post by SKAN » 05 May 2021, 03:10

Random was inherited from AU2?
https://www.autoitscript.com/autoit3/docs/functions/Random.htm
 
Ok. I will write/post a CSPRNG version using BCryptGenRandom


Any thoughts/advise please.
@lexikos
@just me
@Helgef

just me
Posts: 9424
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Need experts opinion/advice on Random command

Post by just me » 05 May 2021, 05:04

Hi SKAN,

the Random command syntax was most probably inherited from AutoIt 2. AHK's Mersenn Twister code is located in mt19937ar-cok.h/cpp. That's all I guess/know.

lexikos
Posts: 9559
Joined: 30 Sep 2013, 04:07
Contact:

Re: Need experts opinion/advice on Random command

Post by lexikos » 05 May 2021, 06:00

It is before my time.

I have a note to self to look into replacing MT19937 with one of those system functions, to reduce code size. I don't know whether there would be any disadvantages or observable differences in behaviour.

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Need experts opinion/advice on Random command

Post by SKAN » 05 May 2021, 08:24

@just me
@lexikos

I thank you both for the replies.
lexikos wrote:
05 May 2021, 06:00
I don't know whether there would be any disadvantages or observable differences in behaviour.
I'm still testing. Both look same to me. But secure numbers can't be predicted, though I don't know how to confirm it.
I'm posting below my CSPRN function, in case anybody wants to compare results against the built-in Random.

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

CSPRN() : Generates a Cryptographically-secure pseudorandom number

Post by SKAN » 05 May 2021, 08:24

The function along with test code:

Code: Select all

CSPRN(Min, Max, RAND_MAX:=0x7fffffffffffffff) {  ; v0.32 by SKAN on D44R/D455 @ tiny.cc/csprn
Static r := DllCall("Bcrypt.dll\BCryptGenRandom", "Ptr",0, "Int64P",r:=0, "Int",8, "Int",0x2)
            DllCall("Bcrypt.dll\BCryptGenRandom", "Ptr",0, "Int64P",r:=0, "Int",8, "Int",0x2)
Return Round( Min + ( ((r & RAND_MAX)/RAND_MAX) * (Max-Min) )
            , Max( (r := InStr(Min, ".")) ? StrLen(SubStr(Min, r+1)) : 0
                 , (r := InStr(Max, ".")) ? StrLen(SubStr(Max, r+1)) : 0 ) )
}

#NoEnv
#Warn
#SingleInstance, Force

Min := 1
Max := 100

Loop
{
  A := ""
  Loop 100
    A .= CSPRN(Min, Max) . "`t"
  Sort, A, D`t U                               ; Unique sort
  Sort, A, D`t                                 ; Alphabetical sort
  StrReplace(A, "`t", "`t", Count:=0)
  MsgBox % "Unique:`t" . Count . "`n`n" . A
}
Notes:
Maximum value for Min and Max should be within 52 bits. MsgBox, % CSPRN(-4503599627370495, 4503599627370495)
If either of the number exceeds 52 bits the result is always an even number. For example: CSPRN(0x1, 0x7fffffffffffffff)
The first call to any of these CNG functions is always slow owing to seed generation (which is best done slow). Hence the dummy Static init.
If either Min or Max is a float, output will match the (better) precision of one of them.
For example, CSPRN(0, 1.00000000000000000) will output a float with .17 precision.

RtGenRandom() is faster than BCryptGenRandom(), but again, I think, slower is better for RNG.
My Scripts and Functions: V1  V2

lexikos
Posts: 9559
Joined: 30 Sep 2013, 04:07
Contact:

Re: Need experts opinion/advice on Random command

Post by lexikos » 12 Jun 2021, 03:51

SKAN wrote:The first call to any of these CNG functions is always slow owing to seed generation (which is best done slow).
How sure are you about that? ;)

Some simple measurement with QueryPerformanceCounter on my system showed the first call (with DllCall) to be around 220 µs, unless DllCall("LoadLibrary", "str", "Bcrypt.dll") was called first, in which case it took around 60 µs. Subsequent calls took around 6 µs. So how much of that overhead is actually seed generation?

This is from the go project, which presumably uses RtlGenRandom:
Looks like BoringSSL, Chromium, Firefox, and Rust all use RtlGenRandom unless they are targeting the UWP, and indeed we seem to understand it better than BCryptGenRandom, let's stick with that.
Source: crypto/rand: Currently using deprecated API for random number generation on Windows · Issue #33542 · golang/go
I also found
just use the CNG functions (BCryptGenRandom[1]), which is what both RtlGenRandom and CryptGenRandom use under the covers. BCrypt.dll is going to get loaded either way.
Source: The recommendation for windows is hopelessly out of date. On Vista and up CryptG... | Hacker News
Both RtlGenRandom and BCryptGenRandom will be slow on their first call:
  • RtlGenRandom (part of advapi32.dll) has to load bcrypt.dll which contains the Windows cryptographic code
  • BCryptGenRandom is part of bcrypt.dll, so no additional DLL loading is necessary, but it has to find the "system preferred" RNG, which might require reading values from the windows registry
Source: Windows: RtlGenRandom vs BCryptGenRandom · Issue #65 · rust-random/getrandom
However,
  • Even after calling SystemFunction036, GetModuleHandle("BCrypt.dll") returns null, and the first call to BCryptGenRandom still incurs the same cost.
  • I did observe that the first call to SystemFunction036 was "slower" than subsequent calls, but on my system the difference (and total time) was < 10 µs.
  • Using BCRYPT_RNG_ALG_HANDLE instead of the BCRYPT_USE_SYSTEM_PREFERRED_RNG flag did take the first call to BCryptGenRandom down to around 40 µs (from 60 µs), so I guess around 20 µs was for identifying the "preferred" RNG algorithm, probably by reading the registry. (I think prior to Windows 10 it is necessary to use BCryptOpenAlgorithmProvider, not pass the pseudo-handle directly as I did.)
Another source indicates that it's actually Bcryptprimitives.dll that ultimately does the work in both cases on Windows 8. But to use BCryptGenRandom, you have to load the rest of CNG as well.

Maybe it's seeded differently depending on which API you use, and that's part of the cost. Or maybe it's just doing extra unnecessary initialization of CNG.


Of some interest but not necessarily useful, there is the Intel RDRAND instruction. Seems to work on my i7-10700...

Code: Select all

#Requires AutoHotkey v2.0-a136
rdrand32 := Buffer(4)
NumPut("uint", 0xC3F0C70F, rdrand32)
DllCall("VirtualProtect", "ptr", rdrand32.ptr, "ptr", rdrand32.size, "uint", 0x40, "uint*", 0)
Loop 10
    MsgBox DllCall(rdrand32.ptr)
Above is the encoding for a 32-bit random number on both 32-bit and 64-bit platforms. On x64 there is also an encoding for 64-bit numbers:

Code: Select all

#Requires AutoHotkey v2.0-a136
rdrand64 := Buffer(8)
NumPut("int64", 0xC3F0C70F48, rdrand64)
DllCall("VirtualProtect", "ptr", rdrand64.ptr, "ptr", rdrand64.size, "uint", 0x40, "uint*", 0)
Loop 10
    MsgBox DllCall(rdrand64.ptr, "int64")
I assembled it "by hand", referring to the "Intel 64 and IA-32 Architectures Software Developer's Manual". I'm not really using it correctly;
The Carry Flag indicates whether a random value is available at the time the instruction is executed. CF=1 indicates that the data in the destination is valid. Otherwise CF=0 and the data in the destination operand will be returned as zeros for the specified width. All other flags are forced to 0 in either situation. Software must check the state of CF=1 for determining if a valid random value has been returned, otherwise it is expected to loop and retry execution of RDRAND

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Need experts opinion/advice on Random command

Post by SKAN » 12 Jun 2021, 05:35

lexikos wrote:
12 Jun 2021, 03:51
SKAN wrote:The first call to any of these CNG functions is always slow owing to seed generation (which is best done slow).
How sure are you about that? ;)
 
I already wrote:Need experts opinion/advice on Random command
 
I'm just a layman. :)

 

Thanks for the insights and links.
I think I went through most of them.. with limited understanding.
 
lexikos wrote:
12 Jun 2021, 03:51
Another source indicates that it's actually Bcryptprimitives.dll that ultimately does the work in both cases on Windows 8. But to use BCryptGenRandom, you have to load the rest of CNG as well.
The source code is available at the first link you posted. : https://github.com/golang/go/issues/33542#issuecomment-564234063
 
Of some interest but not necessarily useful, there is the Intel RDRAND instruction. Seems to work on my i7-10700...

Code: Select all

...
Above is the encoding for a 32-bit random number on both 32-bit and 64-bit platforms. On x64 there is also an encoding for 64-bit numbers:
 

I get a mix of signed/unsigned numbers in
AMD Ryzen 3 3200G (1 year old)
AMD Athlon 200GE (2 year old)

Fails in
AMD Athlon(tm) 5150 (8 year old)
image.png
image.png (11.55 KiB) Viewed 851 times

lexikos
Posts: 9559
Joined: 30 Sep 2013, 04:07
Contact:

Re: Need experts opinion/advice on Random command

Post by lexikos » 12 Jun 2021, 07:01

SKAN wrote:
04 May 2021, 07:42
occasionally a result can be slightly greater than the specified Max (this is caused in part by the imprecision inherent in floating point numbers).
So: If max is critical to me I have to do an additional validation with Min()?. Why can't Min() be done by AHK itself?
It appears that Random is using the same formula that you are using, so if it can be affected by this issue, so can your code. I don't see how it can be true, but perhaps it is a matter of perspective. For instance, if Max is specified as a decimal string, it might not be possible to exactly represent that decimal number as a binary floating-point number, so the conversion to binary floating-point might be what actually makes it "slightly greater than specified", and passing in binary floating-point numbers in the first place will always produce a number within that range. Min() won't do any good if there's no way to represent the original value in binary.

But if that's "in part", I don't know about the other part.

Without SetFormat, I'm stuck at max precision of 0.6.
Without SetFormat, you have the full precision of a 64-bit binary floating-point number available (in general, although not from the RNG). You only lose precision when you convert the number to a string, just as you would lose precision if you converted double to float or int. Think of the default conversion to string as a Round(n, 6) call. If you want a string representing the number with higher precision, use a different operation. Format is the replacement for SetFormat; you just have to call it explicitly instead of using implicit conversion to string.

Why not a higher range?
The RNG implementation provides a 32-bit number. Of course, we could (in script or C++) just call it twice to produce a 64-bit integer. For floating-point numbers, there's a formula (commented out) in the source code:

Code: Select all

// generates a random number on [0,1) with 53-bit resolution
double genrand_res53(void) 
{ 
    unsigned long a=genrand_int32()>>5, b=genrand_int32()>>6;

    return(a*67108864.0+b)*(1.0/9007199254740992.0);
} 
// These real versions are due to Isaku Wada, 2002/01/09 added */
I suppose that a*67108864.0+b = a*(2.0**26)+b = ((a << 26) | b), so it's just making a random 53-bit integer and dividing it by 2**53. It took me a while to figure out.

I'm just a layman.
I'm no expert on RNG.

The source code is available
That's code, but not the source. It was produced by reverse engineering, which is presumably also how the other guy got his information.

0xc000001d
I assumed that would be the result if the CPU doesn't support RDRAND, but wasn't sure as I only saw it when I encoded an instruction incorrectly. Thanks for confirmation.

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Need experts opinion/advice on Random command

Post by SKAN » 12 Jun 2021, 08:43

lexikos wrote:
12 Jun 2021, 07:01
It appears that Random is using the same formula that you are using, so if it can be affected by this issue, so can your code. I don't see how it can be true, but perhaps it is a matter of perspective. For instance, if Max is specified as a decimal string, it might not be possible to exactly represent that decimal number as a binary floating-point number, so the conversion to binary floating-point might be what actually makes it "slightly greater than specified", and passing in binary floating-point numbers in the first place will always produce a number within that range. Min() won't do any good if there's no way to represent the original value in binary.
 
Thank you.
 
lexikos wrote:
12 Jun 2021, 07:01
Without SetFormat, I'm stuck at max precision of 0.6.
Without SetFormat, you have the full precision of a 64-bit binary floating-point number available (in general, although not from the RNG). You only lose precision when you convert the number to a string, just as you would lose precision if you converted double to float or int.
I wrongly presumed RANDOM result will be a string formatted by SetFormat
I should've tested.
 

Code: Select all

Random, R, 0.0, 1.0
MsgBox % [R].GetCapacity(1)
 
Thanks for the clarification.
 
lexikos wrote:
12 Jun 2021, 07:01
Think of the default conversion to string as a Round(n, 6) call. If you want a string representing the number with higher precision, use a different operation. Format is the replacement for SetFormat; you just have to call it explicitly instead of using implicit conversion to string.
 
Edit: OK. Crystal clear.
For float, round() seems to be better.

Code: Select all

MsgBox % Format( "{:0.3f}", 1.2345 ) ; 1.234 <- incorrect?
         . "`n" . Round( 1.2345, 3 ) ; 1.235 <- correct!
 
lexikos wrote:
12 Jun 2021, 07:01
The source code is available
That's code, but not the source. It was produced by reverse engineering, which is presumably also how the other guy got his information.
 
:D.
OK. Pseudo-source code.
 
lexikos wrote:
12 Jun 2021, 07:01
0xc000001d
I assumed that would be the result if the CPU doesn't support RDRAND, but wasn't sure as I only saw it when I encoded an instruction incorrectly. Thanks for confirmation.
 
I expected to get an error on all three AMD machines. :)

lexikos
Posts: 9559
Joined: 30 Sep 2013, 04:07
Contact:

Re: Need experts opinion/advice on Random command

Post by lexikos » 12 Jun 2021, 17:08

SKAN wrote:
12 Jun 2021, 08:43
For float, round() seems to be better.
It does seem more useful, but maybe not more correct, because the number you are giving it is not 1.2345, but the nearest binary approximation, which is < 1.2345.

Code: Select all

MsgBox % Format( "{:0.17f}", 1.2345 ) ; 1.23449999999999990
Actually, the last digit differs between v1 and v2, probably due to the compiler.

Round uses _stprintf, but after doing some "rounding". Basically it does this (but with fewer printf digits):

Code: Select all

MsgBox % Format( "{:0.17f}", Floor(1.2345 * 10**3 + 0.5) / 10**3 ) ; 1.23500000000000010

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Need experts opinion/advice on Random command

Post by SKAN » 13 Jun 2021, 02:19

lexikos wrote:
12 Jun 2021, 17:08
Thanks for all these useful info. :) :thumbup:

lexikos
Posts: 9559
Joined: 30 Sep 2013, 04:07
Contact:

Re: Need experts opinion/advice on Random command

Post by lexikos » 13 Jun 2021, 06:47

SKAN wrote:
12 Jun 2021, 08:43
I wrongly presumed RANDOM result will be a string formatted by SetFormat
Maybe you rightly but incorrectly presumed...
The format of stored floating point numbers is determined by SetFormat.
Source: Random - Syntax & Usage | AutoHotkey
I'll fix that.


Regarding the "result > Max" problem, I think we have only Laszlo's word to go on. I experimented for a while (generating billions of random numbers in several arbitrary ranges) and searched for hours, but ultimately failed to find any specific example input which would produce a number greater than Max. Some of my experimentation was done with the "random number" being actually the hard-coded maximum integer, but I didn't go so far as trying to iterate through all possible floating-point numbers.

I did find a bunch of RNG functions from different languages which returned a number in the range 0.0 (inclusive) to 1.0 (exclusive), with a slight exception in JavaScript,
If extremely large bounds are chosen (2^53 or higher), it's possible in extremely rare cases to calculate the usually-excluded upper bound.
which makes me wonder if this is the same issue Laszlo was talking about. I also found several instances of the same formula used in your script and in the built-in Random, some with no mention of possible rounding errors.

I think when the upper bound is "excluded" (like Min + ( ((r & RAND_MAX)/(RAND_MAX+1)) * (Max-Min)), it is impossible for a rounding error to produce a value greater than Max. For 0.0 to 1.0 in current versions, the chance of actually getting exactly 1.0 is very low anyway; I think it would be 1 in 2**32. On top of that,
Laszlo wrote:Another problem is the division by 2**32-1 (it should be 2**32). 0 ≤ y ≤ 2**32-1 to start with, so the return value can be 0.0 and 1.0, both having the same mantissa (0). Consequently, there will be at least one missing 32-bit mantissa, which is just blown up to 53-bits (the precision of doubles) by the division.
So I'll be changing v2 to divide by 2**53 (after lifting limits), which will generally exclude Max for floating-point and hopefully remove any possibility of retval > Max.


An interesting problem I've come across (and solved) is bias when the range of numbers approaches the maximum supported:

Code: Select all

max := 2000000000  ; Pick a max close to 2**31.
iterations := 10000
low_10_pct := 0
Loop % iterations
{
    Random n, 0, max
    if (n < max//10)  ; Count the numbers in the bottom 10%.
        low_10_pct += 1
}
; Show percentage of iterations that fell within the bottom 10%.
MsgBox % (low_10_pct / iterations) * 100
Currently shows around 14%, but should be around 10%. This is due to the use of modulo to limit the random number to the specified range.

User avatar
SKAN
Posts: 1551
Joined: 29 Sep 2013, 16:58

Re: Need experts opinion/advice on Random command

Post by SKAN » 13 Jun 2021, 07:38

lexikos wrote:
13 Jun 2021, 06:47
Maybe you rightly but incorrectly presumed...
The format of stored floating point numbers is determined by SetFormat.
Source: Random - Syntax & Usage | AutoHotkey
I'll fix that.
 
Thank you.
 
lexikos wrote:
13 Jun 2021, 06:47
If extremely large bounds are chosen (2^53 or higher), it's possible in extremely rare cases to calculate the usually-excluded upper bound.
which makes me wonder if this is the same issue Laszlo was talking about. I also found several instances of the same formula used in your script and in the built-in Random, some with no mention of possible rounding errors.
 
My function was a result of learning and adapting from Mr.Laszlo's RandI()
https://autohotkey.com/board/topic/5362-more-secure-random-numbers/

Code: Select all

RandI(min,max) ; Random integer (< 53-bit) from Globally Unique ID
{
   VarSetCapacity(A,16)
   DllCall("rpcrt4\UuidCreate","Str",A)
   Address := &A - 1
   r = 0
   Loop 8
      r := (r + (*(Address+A_Index) ^ *(Address+A_Index+8)))/256.0
   Return min + Floor(r*(max-min+1))
}
 
lexikos wrote:
13 Jun 2021, 06:47
So I'll be changing v2 to divide by 2**53 (after lifting limits), which will generally exclude Max for floating-point and hopefully remove any possibility of retval > Max.
 
:) :thumbup:
 

Post Reply

Return to “Ask for Help (v1)”