AutoHotkey Homepage AutoHotkey Community
RegEx Analyzer
Titan



Joined: 11 Aug 2004
Posts: 5009
Location: imaginationland

Posted: Wed May 02, 2007 2:22 pm    Post subject: RegEx Analyzer

This parses regex and shows modifiers, subpatterns, alternations and repetitions in a hierarchical format. Special characters like \b and $ are defined and all groups are matched and displayed in a ListView.

This is just an alpha test; it was originally a modification of toralf's script but became something quite different.

Screenshot:


Download
_________________

RegExReplace("irc.freenode.net/autohotkey", "^(?=(.(?=[\0-r\[]*((?<=\.).))))(?:[c-\x73]{2,8}(\S))+((2)|\b[^\2-]){2}\D++$", "$u3$1$3$4$2")


Last edited by Titan on Fri May 04, 2007 10:00 pm; edited 1 time in total
PhiLho



Joined: 27 Dec 2005
Posts: 6721
Location: France (near Paris)

Posted: Wed May 02, 2007 2:44 pm

That's funny, I thought regexes were a good opportunity to test the treeview component, but never managed to work on such a script. Now you've removed all motivation to do it... :)

Might I suggest you name it RegExAnalyzer? It might better describe your program (or Analyzer & Tester). Anyway, the name conflict with toralf's script is annoying (I have to unzip it somewhere else and rename one...).

I see a typo in the screenshot: \w: [...] charcter

OK, now I will test it a bit... :)

[EDIT] Looks fine after a short test. It will be a great learning tool! Congratulations.
_________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")
Titan



Joined: 11 Aug 2004
Posts: 5009
Location: imaginationland

Posted: Wed May 02, 2007 3:00 pm

PhiLho wrote:
That's funny, I thought regexes were a good opportunity to test the treeview component, but never managed to work on such a script.
lol, feel free to post your mods/scripts.

PhiLho wrote:
Might I suggest you name it RegExAnalyzer?
I like it. It'll be renamed in the next version (cba to take another screenshot).

PhiLho wrote:
I see a typo in the screenshot: \w: [...] charcter
Cheers. There might be a whole lot more; I'll have to double-check the Analyze function :?

PhiLho wrote:
Looks fine after a short test. It will be a great learning tool! Congratulations.
Thanks :)
PhiLho



Joined: 27 Dec 2005
Posts: 6721
Location: France (near Paris)

Posted: Wed May 02, 2007 3:47 pm

First bug report:
- It automatically saved my test Context, but I lost the leading spaces...
- It updates the tree in real time, but there is some weird stuff:
. I type ^\s+(?P<a>\w+), it is OK. If I type x after ), still OK. If I type x before \w, it is displayed with 3 weird chars after it (slashed o).
. I type (\d) after \w+, the program ceases to update the tree... The tree seems to have crashed, as it doesn't refresh. I have to restart the program.
It is an issue with the real-time update: if I paste ^\s+(?P<a>\w(\d)) into the freshly opened window, it works.

Not sure of the difference between string and literal string...
\x85 isn't recognized as a single char.

^>\s*(?P<a>\w(?<b>\s*(?'c'\d+([A-Z]+?)))) gives weird results...
^>\s*(\w+(-*(\d+([A-Z]+?)))) is better, but it seems to miss deeply nested captures.
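
To make the expected numbering of nested captures concrete, here is a minimal sketch in Python (not AutoHotkey, and not code from the script under test). It assumes Python's `re` module numbers groups the way PCRE does, by the position of the opening parenthesis. The sample string and the helper name `nested_groups` are hypothetical.

```python
import re

# The second (unnamed) pattern quoted above; the sample input is made up.
PATTERN = r"^>\s*(\w+(-*(\d+([A-Z]+?))))"
SAMPLE = ">  abc--12XY"


def nested_groups(pattern: str, text: str) -> list:
    """Return (group number, matched text, span) for every capture group.

    Groups are numbered by the position of their opening parenthesis,
    so an outer group always gets a lower number than the groups inside it.
    """
    match = re.match(pattern, text)
    if match is None:
        return []
    return [
        (number, match.group(number), match.span(number))
        for number in range(1, match.re.groups + 1)
    ]


if __name__ == "__main__":
    for number, value, span in nested_groups(PATTERN, SAMPLE):
        print(number, repr(value), span)
```

For this sample each deeper group is a suffix of its parent, which is the hierarchy a tree view of the captures would be expected to show.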

You started an ambitious program; it will be hard (but interesting) to bring it out of the alpha state... ;)
Titan



Joined: 11 Aug 2004
Posts: 5009
Location: imaginationland

Posted: Wed May 02, 2007 4:04 pm

PhiLho wrote:
- It automatically saved my test Context, but I lost the leading spaces...
If I can't fix it I'll switch to using an XML settings file.

PhiLho wrote:
I type (\d) after \w+, the program ceases to update the tree...
Strange, I'll take a look.

PhiLho wrote:
Not sure of the difference between string and literal string...
Literal string should say 'escaped'. I'll change this.

PhiLho wrote:
\x85 ... ^>\s*(?P<a>\w(?<b>\s*(?'c'\d+([A-Z]+?)))) ... ^>\s*(\w+(-*(\d+([A-Z]+?)))) [do not work as expected]
Sigh, even a bunch of expressions like ((?:(?<!\\)\((?:(?:[^\)]*\([^\)]*\)[^\)]*)|[^\)]*)(?<!\\)\)|(?<!\\)(?:\[[^]]*(?<!\\)\]|\{[^\}](?<!\\)\})|(?<!\\)\\[^\.aAbBzZ]|(?<!\\)\.)(?:\{\d+(?:,\d+)?\}|[\*\+\?])?\??) will not cover everything. I'll see if I can make amends.

Thanks for the feedback.
majkienetor!
Guest





Posted: Wed May 02, 2007 4:59 pm

Very nice Titan.
SKAN



Joined: 26 Dec 2005
Posts: 5574

Posted: Wed May 02, 2007 8:37 pm

Great tool. Thanks for creating/sharing this. :)
Sean



Joined: 12 Feb 2007
Posts: 1240

Posted: Thu May 03, 2007 2:33 am

I always felt it was misleading to regard `a as merely CRLF/LF/CR, though I supposed it was OK in practice.
I bring it up now because I noticed that PCRE was recently updated to 7.1, which introduced a new flag
that does exactly the above:

Code:
PCRE_NEWLINE_ANYCRLF
Chris
Site Admin


Joined: 02 Mar 2004
Posts: 10463

Posted: Thu May 03, 2007 2:13 pm

RegExAnalyzer is amazing! :)
Titan



Joined: 11 Aug 2004
Posts: 5009
Location: imaginationland

Posted: Fri May 04, 2007 9:57 pm

Thanks everyone.

Changes in Alpha 2:
  • Sets are more descriptive
  • \Q...\E now supported
  • Drastic improvements to regex parser (took me a while to accept the fact that nested parentheses can't be extracted with a single expression)
  • Alternations include subpatterns on either side where appropriate
  • Regexes can be saved and deleted (with autosave on exit)
  • New control showing replacement text (in Context tab)
  • Several description changes
  • ListView now shows output variables
  • Minor UI tweaks
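
One common way around the single-expression limit mentioned in the list above is to scan the pattern once and keep a stack of open parentheses. The Python sketch below illustrates that idea only. It is not the parser used in this script, the function name `group_spans` is made up, and it deliberately ignores \Q...\E and POSIX classes such as [[:alpha:]].

```python
def group_spans(pattern: str) -> list:
    """Return (open_index, close_index) for every parenthesised group.

    The result is ordered by opening parenthesis. Escaped characters and
    parentheses inside character classes are not treated as group delimiters.
    Raises ValueError if the parentheses are unbalanced.
    """
    spans = []
    stack = []          # indices of '(' that are still open
    in_class = False    # True while inside [...]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2      # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            # A ']' directly after '[' or '[^' is a literal class member.
            j = i + 1
            if j < len(pattern) and pattern[j] == "^":
                j += 1
            if j < len(pattern) and pattern[j] == "]":
                i = j
        elif ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at index {i}")
            spans.append((stack.pop(), i))
        i += 1
    if stack:
        raise ValueError(f"unmatched '(' at index {stack[-1]}")
    return sorted(spans)


if __name__ == "__main__":
    pattern = r"^>\s*(\w+(-*(\d+([A-Z]+?))))"
    for start, end in group_spans(pattern):
        print(start, end, pattern[start:end + 1])
```

Because each closing parenthesis is paired with the most recent open one, nesting depth is unlimited, which a single regular expression without recursion cannot offer.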

I haven't moved up to beta because there are some more features I'd like to add, such as a regex builder.

A couple of bugs I noticed: AutoHotkey crashes when switching regexes if you omit the braces for the If on line 376, and `a/`r`/`n aren't parsed as options in certain strange conditions - try removing the second RegExReplace on line 68 to see what I mean.

Sean wrote:
I always felt it was misleading to regard `a as merely CRLF/LF/CR, though I supposed it was OK in practice.
I just copied the docs, which say it "Recognizes any type of newline, namely `r, `n, or `r`n".
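
The distinction can be illustrated outside AutoHotkey. According to the PCRE documentation, PCRE_NEWLINE_ANY accepts any Unicode newline sequence, while PCRE_NEWLINE_ANYCRLF accepts only CR, LF and CRLF. The Python sketch below is only an analogy: `str.splitlines()` stands in for the "any newline" behaviour (its separator set is Python's own, not PCRE's), and an explicit alternation stands in for ANYCRLF. The sample string and function names are hypothetical.

```python
import re

# Made-up sample: CRLF, CR, LF, then a vertical tab and a Unicode line separator.
SAMPLE = "one\r\ntwo\rthree\nfour\x0bfive\u2028six"

# Only CR, LF and CRLF count as line breaks; CRLF is listed first so it is
# consumed as one break rather than two.
ANYCRLF = re.compile(r"\r\n|\r|\n")


def split_anycrlf(text: str) -> list:
    """Split on CR, LF or CRLF only."""
    return ANYCRLF.split(text)


def split_any(text: str) -> list:
    """Split on Python's broader set of line boundaries."""
    return text.splitlines()


if __name__ == "__main__":
    print(split_anycrlf(SAMPLE))
    print(split_any(SAMPLE))
```

The narrow splitter leaves the vertical tab and U+2028 inside the last field, while the broad one treats them as line breaks too.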
Chris
Site Admin


Joined: 02 Mar 2004
Posts: 10463

Posted: Tue May 08, 2007 2:40 am

Titan wrote:
AutoHotkey crashes when switching regexes if you omit the braces for the If on line 376
You probably realize that omitting the braces causes the ELSE to become owned by the inner IF and not the outer one. This seems to cause an infinite loop (as evidenced by the CPU staying maxed on my system). When combined with your use of Critical in the auto-execute section, the missing braces seem to hang the script (though it doesn't crash on my system). Removing Critical solves the hang but the GUI window doesn't seem to function much at all.

Quote:
`a/`r`/`n aren't parsed as options in certain strange conditions - try removing the second RegExReplace on line 68 to see what I mean.
It's hard to analyze in its current state because I don't understand the script well enough. If you ever happen to reproduce it with a simpler example, please let me know.

Thanks.
azure



Joined: 07 Jun 2007
Posts: 296

Posted: Fri Jul 13, 2007 10:57 am

thanks very much for the script, but it needs improvement imo

A)

for example, I type "." and it replies: "matches any character"

this is not what I want, because it's not accurate:

1) how many instances of the character?
2) what are "characters" ? you have to tell me all the characters
3) you mean alphanumeric characters too? letters AND numbers or only letters? in other words how do you define characters? are αβγ characters as well? or only abc?

so a better reply would be:

. matches any character (parenthesis a,b,c,d,etc) of one instance, etc of any case, etc

B)

also, in addition to the explanation of the regex, I would like it to make a list of the strings that match this regex

eg if I type to it .. it should reply:

1) matches two instances of a character (these are meant to be characters: abc,etc)

AND

2) this is the list that this regex matches:

aa,ab,ac, etc

this can be very big, so it should display some only, but to have the ability to build the matching strings in a txt file

thanks
Titan



Joined: 11 Aug 2004
Posts: 5009
Location: imaginationland

Posted: Fri Jul 13, 2007 11:54 am

azure wrote:
1) how many instances of the character?
2) what are "characters" ? you have to tell me all the characters
3) you mean alphanumeric characters too? letters AND numbers or only letters? in other words how do you define characters? are αβγ characters as well? or only abc?
'Character' is singular so it implies only one instance. Alphanumeric, letter, number, symbol all have restrictive meanings and by definition 'character' is all of them.

azure wrote:
have the ability to build the matching strings in a txt file
While I like the idea, a reverse regex would be extremely complex. Instead, the program will show you the matching set in the listview, so you can work with the context.
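
A full reverse regex is a much harder problem, but for a small alphabet and a fixed length the matching strings can be enumerated by brute force. The Python sketch below does that and shows how quickly the list grows. The alphabet, the length limit and the function name `matches_of_fixed_length` are assumptions for illustration, and Python's `re` is used in place of PCRE.

```python
import re
from itertools import islice, product


def matches_of_fixed_length(pattern: str, alphabet: str, length: int):
    """Lazily yield every string of `length` characters drawn from
    `alphabet` that the pattern matches in full."""
    compiled = re.compile(pattern)
    for chars in product(alphabet, repeat=length):
        candidate = "".join(chars)
        if compiled.fullmatch(candidate):
            yield candidate


if __name__ == "__main__":
    # Show only the first few matches, as the request suggested.
    print(list(islice(matches_of_fixed_length("..", "abc", 2), 5)))
    # Number of candidates that had to be tested: len(alphabet) ** length.
    print(len("abc") ** 2)
```

With three letters, `..` already yields 9 strings; over the 95 printable ASCII characters it would be 9025, and `.*` has no upper bound at all.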
ManaUser



Joined: 24 May 2007
Posts: 900

Posted: Fri Jul 13, 2007 2:53 pm

Please provide a list of everything this matches: .*

;)
azure



Joined: 07 Jun 2007
Posts: 296

Posted: Fri Jul 13, 2007 3:04 pm

ManaUser wrote:
Please provide a list of everything this matches: .*

;)

a(anything)
b(anything)
etc

;)
All times are GMT
Page 1 of 2

 