RegEx challenge: thousands separator Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
teadrinker
Posts: 4326
Joined: 29 Mar 2015, 09:41

Re: RegEx challenge: thousands separator

09 Nov 2017, 15:41

It was really good to learn what "?+" and "??" mean.
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx challenge: thousands separator

09 Nov 2017, 15:57

I was reading here:
pcre.txt
http://www.pcre.org/pcre.txt

But it still isn't clear to me what 'possessive' is, or how it can be useful.

pcresyntax specification
http://www.pcre.org/original/doc/html/p ... .html#SEC9

Code: Select all

QUANTIFIERS

  ?           0 or 1, greedy
  ?+          0 or 1, possessive
  ??          0 or 1, lazy
  *           0 or more, greedy
  *+          0 or more, possessive
  *?          0 or more, lazy
  +           1 or more, greedy
  ++          1 or more, possessive
  +?          1 or more, lazy
  {n}         exactly n
  {n,m}       at least n, no more than m, greedy
  {n,m}+      at least n, no more than m, possessive
  {n,m}?      at least n, no more than m, lazy
  {n,}        n or more, greedy
  {n,}+       n or more, possessive
  {n,}?       n or more, lazy
Do you have any good examples that show where ?+, ?? or *+ are useful? Thanks.

Btw I have a link here:
jeeswg's RegEx tutorial (RegExMatch, RegExReplace) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=28031

One thing I'm interested in, is collecting examples of RegEx syntax that aren't mentioned in the AutoHotkey documentation, I have a list there which is currently: \G, *ACCEPT, *SKIP.

[EDIT:] Btw which do you think is the more difficult challenge, my question (where any method is allowed), or your question?

[EDIT:] Also, do you think you can solve my question without a \G?
Last edited by jeeswg on 09 Nov 2017, 16:05, edited 1 time in total.
homepage | tutorials | wish list | fun threads | donate
WARNING: copy your posts/messages before hitting Submit as you may lose them due to CAPTCHA
FanaticGuru
Posts: 1906
Joined: 30 Sep 2013, 22:25

Re: RegEx challenge: thousands separator

09 Nov 2017, 16:02

teadrinker wrote:
It looks like you changed the \A to a ^ in an edit of your post but still nice to know that option if I ever need it in the future.
As I understood, there is no difference between \A and ^.

Code: Select all

str := "abcabc"

RegExMatch(str, "^abc", match1)
RegExMatch(str, "\Aabc", match2)

MsgBox, % match1 "`n" match2
^ matches the beginning of a line (when the m) multiline option is set)
\A matches the beginning of the string

Without the m) option the two are one and the same; they only differ when m) is used on a multi-line string.

Code: Select all

str := "abc`r`ndef"

RegExMatch(str, "m)^def", match1)
RegExMatch(str, "m)\Adef", match2)
MsgBox, % "Match 1 = " match1 "`nMatch 2 = " match2
The first matches because "def" is at the beginning of a line.
The second does not match because "def" is not at the beginning of the string.

\A and \z are to the string what ^ and $ are to a line.
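
The same comparison can be made from the end-of-string side. This is only a minimal sketch, assuming AutoHotkey v1.1 and its default `r`n newline convention; the variable names are made up for illustration.

```autohotkey
str := "abc`r`ndef"

; Without m), ^ only matches at the start of the string, so this finds nothing.
noMultiline := RegExMatch(str, "^def")

; With m), $ can match at the end of any line, so "abc" at the end of line 1 is found.
endOfLine := RegExMatch(str, "m)abc$")

; \z only matches at the very end of the string, even with m).
endOfString := RegExMatch(str, "m)abc\z")

MsgBox, % "^def without m): " noMultiline "`nabc$ with m): " endOfLine "`nabc\z with m): " endOfString
```

RegExMatch returns the position of the match, or 0 when nothing is found, so this should report 0, 1 and 0.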

FG
Hotkey Help - Help Dialog for Currently Running AHK Scripts
AHK Startup - Consolidate Multiply AHK Scripts with one Tray Icon
Hotstring Manager - Create and Manage Hotstrings
[Class] WinHook - Create Window Shell Hooks and Window Event Hooks
Delta Pythagorean
Posts: 627
Joined: 13 Feb 2017, 13:44
Location: Somewhere in the US

Re: RegEx challenge: thousands separator

09 Nov 2017, 16:04

Use this, you're welcome 8-)

Code: Select all

Th_Sep(X, s := ",") {
	Return RegExReplace(X, "(?(?<=\.)(*COMMIT)(*FAIL))\d(?=(\d{3})+(\D|$))", "$0" s)
}
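
A short usage sketch for the function above, assuming AutoHotkey v1.1. The function is repeated only so the snippet runs on its own; the sample numbers and the variable names are made up for illustration.

```autohotkey
; Same function as posted above, repeated so the snippet is self-contained.
Th_Sep(X, s := ",") {
	Return RegExReplace(X, "(?(?<=\.)(*COMMIT)(*FAIL))\d(?=(\d{3})+(\D|$))", "$0" s)
}

a := Th_Sep(1234567)            ; default separator
b := Th_Sep("1234567.891234")   ; digits after the decimal point are left alone
c := Th_Sep(1234567, " ")       ; custom separator passed as the second parameter

MsgBox, % a "`n" b "`n" c
```

This should show 1,234,567 then 1,234,567.891234 then 1 234 567.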

[AHK]......: v2.0.12 | 64-bit
[OS].......: Windows 11 | 23H2 (OS Build: 22621.3296)
[GITHUB]...: github.com/DelPyth
[PAYPAL]...: paypal.me/DelPyth
[DISCORD]..: tophatcat

teadrinker
Posts: 4326
Joined: 29 Mar 2015, 09:41

Re: RegEx challenge: thousands separator

09 Nov 2017, 16:10

@Delta Pythagorean, look at this, please. :)
Delta Pythagorean
Posts: 627
Joined: 13 Feb 2017, 13:44
Location: Somewhere in the US

Re: RegEx challenge: thousands separator

09 Nov 2017, 16:14

teadrinker wrote:@Delta Pythagorean, look at this, please. :)
If what you're implying is that I took it, I most certainly did not lol
My apologies

jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx challenge: thousands separator

09 Nov 2017, 16:20

Well done Delta Pythagorean :salute:, you've completed my challenge of a thousands separator RegEx one-liner that doesn't use the \G anchor. The function passes my 12-number challenge.

So we have 2 champions, and I've got some bits of RegEx to learn. This is great because I haven't really learnt any new RegEx in a long time, and this challenge has done just what I hoped it would do, given me some new ideas to look at.

Btw as I intimated earlier, I was expecting that solutions would probably use RegEx techniques not mentioned in the AutoHotkey documentation. So I'm interested, Delta, in where you came across the bits of RegEx needed to solve the problem. Cheers.

[EDIT:] @Delta Pythagorean: I think he's implying that you take a look at the second of the two challenges.

[EDIT:] One of the toughest RegEx challenges that I've looked at/solved:
[remove items from a list if they start with a particular character]
Help with RegExReplace - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=33768
FanaticGuru
Posts: 1906
Joined: 30 Sep 2013, 22:25

Re: RegEx challenge: thousands separator

09 Nov 2017, 16:52

jeeswg wrote:But it still isn't clear to me what 'possessive' is, or how it can be useful.
From what I understand, 'possessive' is only useful for increasing performance. But it is easy to create a pattern that will always fail.

.*x = greedy, match as much as you can with an x at the end
.*?x = ungreedy, match as little as you can with an x at the end
.*+x = possessive, match as much as you can regardless of anything else: match everything that . will match, then look for an x

The last one will always fail, as .*+ matches x and gobbles it up, so there cannot possibly be an x left after .*+
.*+ is like super greedy. It will match until it can't match any more, never give anything back, and only then look for what comes after it in the pattern. That is not often useful, but it is fast. It can only safely be used when the part before the + is something that will not match the part after the +; otherwise it will always fail.
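
The same three behaviours apply to the ? quantifier (?, ?? and ?+). A minimal sketch, assuming AutoHotkey v1.1; the haystack and the variable names are made up for illustration.

```autohotkey
; Greedy ?: takes the optional "b" when it can.
RegExMatch("ab", "ab?", greedy)

; Lazy ??: skips the optional "b" unless the rest of the pattern needs it.
RegExMatch("ab", "ab??", lazy)

; Greedy ? followed by a literal b: ? takes the "b", then gives it back so the final b can match.
RegExMatch("ab", "ab?b", givesBack)

; Possessive ?+: takes the "b" and never gives it back, so the final b has nothing left to match.
RegExMatch("ab", "ab?+b", keeps)

MsgBox, % "Greedy: " greedy "`nLazy: " lazy "`nGives back: " givesBack "`nPossessive: " keeps
```

This should show ab, a, ab, and an empty value for the possessive case, because that pattern does not match at all.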

FG
FanaticGuru
Posts: 1906
Joined: 30 Sep 2013, 22:25

Re: RegEx challenge: thousands separator

09 Nov 2017, 17:13

jeeswg wrote:Well done Delta Pythagorean :salute:, you've completed my challenge of a thousands separator RegEx one-liner that doesn't use the \G anchor. The function passes my 12-number challenge.
I'm confused. Delta Pythagorean posted the exact same RegEx needle as I did.
https://autohotkey.com/boards/viewtopic ... 55#p180055

The original challenge of putting commas to the left of a decimal is pretty simple.

The second challenge of putting commas to the left and right of a decimal in reversed order is fairly straightforward with an "or" solution, but a branching solution like the one used in the original challenge is elusive. I toyed with it but could not solve it with a branching/conditional solution. The inability to use variable-width quantifiers in lookbehinds has stymied me many times when working with RegEx.
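
To illustrate the lookbehind limitation, here is a minimal sketch, assuming AutoHotkey v1.1, where a pattern that fails to compile makes RegExMatch return a blank value and puts the error text in ErrorLevel. The haystack and the variable names are made up for illustration.

```autohotkey
haystack := "abc123"

; A variable-width lookbehind ([a-z]+ can be any length) is rejected when the pattern is compiled.
RegExMatch(haystack, "(?<=[a-z]+)\d+", m1)
err := ErrorLevel   ; holds the compile error text reported by the RegEx engine

; A fixed-width lookbehind (exactly three letters) compiles and matches the digits.
RegExMatch(haystack, "(?<=[a-z]{3})\d+", m2)

MsgBox, % "Variable width: " err "`nFixed width: " m2
```

The first line of the message should carry a compile error, and the second should show 123.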

I have yet to wrap my head around all the branching and conditional abilities of Regex.

The good thing is that RegEx is not specific to AHK so there is a ton of information and pre-made RegEx patterns out there.

FG
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx challenge: thousands separator

09 Nov 2017, 17:25

Ah, so indeed you did solve problem 1, FanaticGuru :salute:. So how did you come across those esoteric bits of RegEx? I.e. *COMMIT and *FAIL. Cheers.

Btw thanks for your comments re. possessive, I thought it seemed to be talking about performance rather than anything else. Although of course, some example code showing a different result to using greedy, if there is one, would be interesting to see.
teadrinker
Posts: 4326
Joined: 29 Mar 2015, 09:41

Re: RegEx challenge: thousands separator

09 Nov 2017, 17:26

FanaticGuru wrote:I'm confused. Delta Pythagorean posted the exact same RegEx needle as I did.
My bad :) I was so carried away by the second challenge that I forgot about the first, and I confused you, sorry :facepalm:
FanaticGuru
Posts: 1906
Joined: 30 Sep 2013, 22:25

Re: RegEx challenge: thousands separator

10 Nov 2017, 02:54

jeeswg wrote:Ah, so indeed you did solve problem 1, FanaticGuru :salute:. So how did you come across those esoteric bits of RegEx? I.e. *COMMIT and *FAIL. Cheers.

Btw thanks for your comments re. possessive, I thought it seemed to be talking about performance rather than anything else. Although of course, some example code showing a different result to using greedy, if there is one, would be interesting to see.
(*COMMIT) and (*FAIL) are called Backtracking Control Verbs. There are others.

The actual needle used in my code is from the forum many years ago. I have used it in a function for a long time. It was the first RegEx that got me to learn about Backtracking Control Verbs.
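
To see what the two verbs contribute to that needle, it can be compared with the same lookahead without the conditional in front. A minimal sketch, assuming AutoHotkey v1.1; the sample number and the variable names are made up for illustration.

```autohotkey
num := "1234567.891234"

; Lookahead only: any digit followed by complete groups of three digits gets a comma,
; including digits after the decimal point.
plain := RegExReplace(num, "\d(?=(\d{3})+(\D|$))", "$0,")

; With the conditional: if the current position is right after a ".", (*COMMIT)(*FAIL)
; makes the match fail outright instead of retrying at later positions,
; so the digits after the decimal point are left alone.
guarded := RegExReplace(num, "(?(?<=\.)(*COMMIT)(*FAIL))\d(?=(\d{3})+(\D|$))", "$0,")

MsgBox, % "Plain: " plain "`nGuarded: " guarded
```

This should show 1,234,567.891,234 for the plain needle and 1,234,567.891234 for the guarded one.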

Here are some examples for possessive.

Code: Select all

; Greedy: .* first matches all of 123X4, then backtracks to find the X, so the pattern succeeds.
RegExMatch("123X4", ".*X", Match1)
; Possessive: .*+ matches all of 123X4 and cannot backtrack to find the X, so the pattern fails.
RegExMatch("123X4", ".*+X", Match2)
; Possessive: \d*+ matches 123; no backtracking is needed to find the X, so it succeeds and is faster.
RegExMatch("123X4", "\d*+X", Match3)
MsgBox % "First: " Match1 "`nSecond: " Match2 "`nThird: " Match3
Basically there are no examples that I know of where possessive does anything useful other than being quicker when you know there will be no need to backtrack into a variable-width quantifier. It is basically just a performance tweak. Some flavors of RegEx automatically do this tweak for you when they analyze the pattern. I don't know if AHK's flavor does.
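
One situation where a possessive quantifier can change the result rather than just the speed is in front of a negative lookahead. The following is only a sketch, assuming AutoHotkey v1.1; the "px" haystack and the variable names are made up for illustration.

```autohotkey
haystack := "100px 250"

; Greedy: \d+ first takes "100", sees "px" after it, then backtracks to "10",
; where the lookahead no longer sees "px", so "10" is reported.
RegExMatch(haystack, "\d+(?!px)", greedy)

; Possessive: \d++ refuses to give back digits, so "100" is rejected as a whole
; and the first number not followed by "px" is "250".
RegExMatch(haystack, "\d++(?!px)", possessive)

MsgBox, % "Greedy: " greedy "`nPossessive: " possessive
```

This should show 10 for the greedy pattern and 250 for the possessive one.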

FG
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx challenge: thousands separator

10 Nov 2017, 06:22

@FanaticGuru: Great comments and examples, many thanks.
sachin24
Posts: 9
Joined: 19 May 2020, 07:25

Re: RegEx challenge: thousands separator

18 Dec 2020, 09:10

Thanks, it's what I wanted :bravo:
FanaticGuru wrote:
02 Nov 2017, 18:14
Here is the RegExReplace that I use for commas and money.

Code: Select all

MsgBox % Format_Commas(123455)
MsgBox % Format_Commas(Round(123455,2))
MsgBox % Format_Commas(.12)
MsgBox % Format_Commas(Round(.12,2))
MsgBox % Format_Money(123455)
MsgBox % Format_Money(.12)

Format_Commas(val) 
{ 
	Return RegExReplace(val, "(?(?<=\.)(*COMMIT)(*FAIL))\d(?=(\d{3})+(\D|$))", "$0,") 
} 

Format_Money(val)
{
	return "$ " RegExReplace(Round(val,2), "(?(?<=\.)(*COMMIT)(*FAIL))\d(?=(\d{3})+(\D|$))", "$0,")
}
FG
