AutoHotkey Community

All times are UTC [ DST ]




 Post subject: RegEx Analyzer
PostPosted: May 2nd, 2007, 2:22 pm 

Joined: August 11th, 2004, 1:47 am
Posts: 5346
Location: UK
This parses regex and shows modifiers, subpatterns, alternations and repetitions in a hierarchical format. Special characters like \b and $ are defined and all groups are matched and displayed in a ListView.

This is just an alpha test, it was originally a modification for toralf's script but became something quite different.

Screenshot:
Image

Download

_________________
GitHubScriptsIronAHK Contact by email not private message.


Last edited by polyethene on May 4th, 2007, 10:00 pm, edited 1 time in total.

PostPosted: May 2nd, 2007, 2:44 pm

Joined: December 27th, 2005, 1:46 pm
Posts: 6837
Location: France (near Paris)
That's funny, I thought regexes were a good opportunity to test the treeview component, but never managed to work on such a script. Now, you removed all motivation to do it... :-)

Might I suggest you name it RegExAnalyzer? It might better describe your program (or Analyzer & Tester). Anyway, the name conflict with toralf's script is annoying (I have to unzip somewhere else and rename one...).

I see a typo in the screenshot: \w: [...] charcter

OK, now, I will test it a bit... :-)

[EDIT] Looks fine after a short test. It will be a great learning tool! Congratulations.

_________________
Image vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")


PostPosted: May 2nd, 2007, 3:00 pm

Joined: August 11th, 2004, 1:47 am
Posts: 5346
Location: UK
PhiLho wrote:
That's funny, I thought regexes were a good opportunity to test the treeview component, but never managed to work on such a script.
lol, feel free to post your mods/scripts.

PhiLho wrote:
Might I suggest you name it RegExAnalyzer?
I like it. It'll be renamed in the next version (cba to take another screenshot).

PhiLho wrote:
I see a typo in the screenshot: \w: [...] charcter
Cheers. There might be a whole lot more, I'll have to double check the Analyze function :?

PhiLho wrote:
Looks fine after a short test. It will be a great learning tool! Congratulations.
Thanks :)

PostPosted: May 2nd, 2007, 3:47 pm

Joined: December 27th, 2005, 1:46 pm
Posts: 6837
Location: France (near Paris)
First bug report:
- It automatically saved my test Context, but I lost the leading spaces...
- It updates the tree in real time, but there is some weird stuff:
  . I type ^\s+(?P<a>\w+), it is OK. If I type x after ), still OK. If I type x before \w, it is displayed with 3 weird chars after it (slashed o).
  . I type (\d) after \w+, and the program ceases to update the tree... The tree seems to have crashed, as it doesn't refresh. I have to restart the program.
  It is an issue with the real-time update: if I paste ^\s+(?P<a>\w(\d)) in the freshly opened window, it works.

Not sure of the difference between string and literal string...
\x85 isn't recognized as a single char.

^>\s*(?P<a>\w(?<b>\s*(?'c'\d+([A-Z]+?)))) gives weird results...
^>\s*(\w+(-*(\d+([A-Z]+?)))) is better, but it seems to miss deeply nested captures.
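For reference, here is a quick sketch of what the second pattern should capture. It uses Python's re module rather than PCRE (an assumption, though the two agree for this pattern), and the sample input is made up purely for illustration. It lists all four nested groups, which is what the tree would ideally show:

```python
import re

# The second test pattern from the bug report, unchanged.
pattern = re.compile(r"^>\s*(\w+(-*(\d+([A-Z]+?))))")

# Hypothetical sample input, chosen only for illustration.
match = pattern.match(">abc--12AB")

# Number of capturing groups the engine sees in the pattern.
print(pattern.groups)

# Each group is nested inside the previous one, so every span
# ends at the same offset while the start moves to the right.
for index in range(1, pattern.groups + 1):
    print(index, repr(match.group(index)), match.span(index))
```

Note that the lazy [A-Z]+? stops after a single letter because nothing after it forces it to take more.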

You started an ambitious program, it will be hard (but interesting) to bring it out of alpha state... ;-)

PostPosted: May 2nd, 2007, 4:04 pm

Joined: August 11th, 2004, 1:47 am
Posts: 5346
Location: UK
PhiLho wrote:
- It automatically saved my test Context, but I lost the leading spaces...
If I can't fix it I'll switch to using an XML settings file.

PhiLho wrote:
I type (\d) after \w+, the program ceases to update the tree...
Strange, I'll take a look.

PhiLho wrote:
Not sure of the difference between string and literal string...
Literal string should say 'escaped'. I'll change this.

PhiLho wrote:
\x85 ... ^>\s*(?P<a>\w(?<b>\s*(?'c'\d+([A-Z]+?)))) ... ^>\s*(\w+(-*(\d+([A-Z]+?)))) [do not work as expected]
Sigh, even a bunch of expressions like ((?:(?<!\\)\((?:(?:[^\)]*\([^\)]*\)[^\)]*)|[^\)]*)(?<!\\)\)|(?<!\\)(?:\[[^]]*(?<!\\)\]|\{[^\}](?<!\\)\})|(?<!\\)\\[^\.aAbBzZ]|(?<!\\)\.)(?:\{\d+(?:,\d+)?\}|[\*\+\?])?\??) will not cover everything. I'll see if I can make amends.

Thanks for the feedback.

PostPosted: May 2nd, 2007, 4:59 pm 
Very nice Titan.


PostPosted: May 2nd, 2007, 8:37 pm

Joined: December 26th, 2005, 4:40 pm
Posts: 8775
Great tool. Thanks for creating/sharing this. :)


PostPosted: May 3rd, 2007, 2:33 am

Joined: February 12th, 2007, 7:54 am
Posts: 2462
I always felt it was misleading to treat `a as merely CRLF/LF/CR, though I supposed it was OK in practice.
I bring it up now because I noticed PCRE was recently updated to 7.1, which introduces a new flag
that does exactly the above:

Code:
PCRE_NEWLINE_ANYCRLF


PostPosted: May 3rd, 2007, 2:13 pm

Joined: March 2nd, 2004, 3:36 pm
Posts: 10720
RegExAnalyzer is amazing! :)


PostPosted: May 4th, 2007, 9:57 pm

Joined: August 11th, 2004, 1:47 am
Posts: 5346
Location: UK
Thanks everyone.

Changes in Alpha 2:
  • Sets are more descriptive
  • \Q...\E now supported
  • Drastic improvements to regex parser (took me a while to accept the fact that nested parentheses can't be extracted with a single expression)
  • Alternations include subpatterns on either side where appropriate
  • Regexes can be saved and deleted (with autosave on exit)
  • New control showing replacement text (in Context tab)
  • Several description changes
  • ListView now shows output variables
  • Minor UI tweaks
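
To illustrate the point about nested parentheses: a classic regular expression cannot count arbitrarily deep nesting, so the usual workaround is a small scanner with a stack. The following is only a minimal sketch in Python, not the code this script uses; the function name find_groups is hypothetical, and it deliberately ignores \Q...\E and other corner cases:

```python
def find_groups(pattern):
    """Return sorted (start, end, depth) tuples, one per parenthesised group."""
    open_positions = []  # stack of indexes of "(" still waiting for a ")"
    groups = []
    in_class = False     # True while inside a [...] character class
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2       # skip the escaped character, e.g. \( or \]
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            # A "]" placed first in a class (optionally after "^") is literal.
            if pattern[i + 1:i + 2] == "^":
                i += 1
            if pattern[i + 1:i + 2] == "]":
                i += 1
        elif ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at index {i}")
            start = open_positions.pop()
            groups.append((start, i, len(open_positions)))
        i += 1
    if open_positions:
        raise ValueError(f"unmatched '(' at index {open_positions[-1]}")
    return sorted(groups)


# One of the patterns from the earlier bug report.
source = r"^\s+(?P<a>\w(\d))"
for start, end, depth in find_groups(source):
    print("  " * depth + source[start:end + 1])
```

The depth value is what makes a hierarchical display possible: each group is indented under the group that encloses it.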

I haven't moved up to beta because there's some more features I'd like to add, such as a regex builder.

A couple bugs I noticed: AutoHotkey crashes when switching regexes if you omit the braces for the If on line 376, and `a/`r`/`n aren't parsed as options in certain strange conditions - try removing the second RegExReplace on line 68 to see what I mean.

Sean wrote:
I always felt it was misleading to treat `a as merely CRLF/LF/CR, though I supposed it was OK in practice.
I just copied the docs, which say it "Recognizes any type of newline, namely `r, `n, or `r`n".
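
As a rough illustration of the distinction being discussed, here is a sketch in Python (an assumption: Python's re and str.splitlines stand in for PCRE here, they are not the same engine). The first split recognises only the three sequences named in the docs quote, which is the set PCRE_NEWLINE_ANYCRLF covers; the second uses a broader notion of "newline" that also includes characters such as \x85:

```python
import re

# Exactly the three sequences from the docs quote: `r`n, `r, or `n.
# "\r\n" comes first so a CRLF pair counts as one newline, not two.
ANYCRLF = re.compile(r"\r\n|\r|\n")

# Made-up sample text mixing all three conventions plus \x85 (NEL).
text = "one\r\ntwo\rthree\nfour\x85five"

crlf_only = ANYCRLF.split(text)
print(crlf_only)

# str.splitlines() follows a wider, Unicode-aware list of line
# boundaries, so it also breaks at \x85.
broader = text.splitlines()
print(broader)
```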

PostPosted: May 8th, 2007, 2:40 am

Joined: March 2nd, 2004, 3:36 pm
Posts: 10720
Titan wrote:
AutoHotkey crashes when switching regexes if you omit the braces for the If on line 376
You probably realize that omitting the braces causes the ELSE to become owned by the inner IF and not the outer one. This seems to cause an infinite loop (as evidenced by the CPU staying maxed on my system). When combined with your use of Critical in the auto-execute section, the missing braces seem to hang the script (though it doesn't crash on my system). Removing Critical solves the hang but the GUI window doesn't seem to function much at all.

Quote:
`a/`r`/`n aren't parsed as options in certain strange conditions - try removing the second RegExReplace on line 68 to see what I mean.
It's hard to analyze in its current state because I don't understand the script well enough. If you ever happen to reproduce it with a simpler example, please let me know.

Thanks.


PostPosted: July 13th, 2007, 10:57 am

Joined: June 7th, 2007, 1:33 pm
Posts: 1018
thanks very much for the script, but it needs improvement imo

A)

for example, I type "." and it replies: "matches any character"

this is not what I want, because it's not accurate:

1) how many instances of the character?
2) what are "characters" ? you have to tell me all the characters
3) you mean alphanumeric characters too? letters AND numbers or only letters? in other words how do you define characters? are αβγ characters as well? or only abc?

so a better reply would be:

. matches any character (parenthesis a,b,c,d,etc) of one instance, etc of any case, etc

B)

also in addition to the explanation of the regex, I would like it to make a list of the strings that match this regex

eg if I type to it .. it should reply:

1) matches two instances of a character (these are meant to be characters: abc,etc)

AND

2) this is the list that this regex matches:

aa,ab,ac, etc

this can be very big, so it should display some only, but to have the ability to build the matching strings in a txt file

thanks


PostPosted: July 13th, 2007, 11:54 am

Joined: August 11th, 2004, 1:47 am
Posts: 5346
Location: UK
azure wrote:
1) how many instances of the character?
2) what are "characters" ? you have to tell me all the characters
3) you mean alphanumeric characters too? letters AND numbers or only letters? in other words how do you define characters? are αβγ characters as well? or only abc?
'Character' is singular so it implies only one instance. Alphanumeric, letter, number, symbol all have restrictive meanings and by definition 'character' is all of them.

azure wrote:
have the ability to build the matching strings in a txt file
While I like the idea, a reverse regex would be extremely complex. Instead the program will show you the matching set in the listview - so you can work with the context.
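
To give a feel for why listing every match does not scale, here is a brute-force sketch in Python (not something this program does; the function name enumerate_matches and the tiny alphabet are assumptions for illustration). It tries every string over the alphabet up to a length limit and keeps those the regex fully matches:

```python
import re
from itertools import product


def enumerate_matches(pattern, alphabet, max_length):
    """Yield every string over `alphabet`, up to `max_length` chars, that fully matches."""
    compiled = re.compile(pattern)
    for length in range(max_length + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                yield candidate


# ".." over a three-letter alphabet: 3 * 3 candidates match.
two_chars = list(enumerate_matches("..", "abc", 2))
print(two_chars)

# "." really is any one character (newline aside), not only a-z.
print(bool(re.fullmatch(".", "α")), bool(re.fullmatch(".", "7")))

# ".*" has no upper bound, so the list only ends because we cap the length.
print(len(list(enumerate_matches(".*", "ab", 2))))
```

With a realistic alphabet the number of candidates grows exponentially with the length, which is why showing matches against a given context is the more practical route.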

PostPosted: July 13th, 2007, 2:53 pm

Joined: May 24th, 2007, 3:45 am
Posts: 1121
Please provide a list of everything this matches: .*

:wink:


PostPosted: July 13th, 2007, 3:04 pm

Joined: June 7th, 2007, 1:33 pm
Posts: 1018
ManaUser wrote:
Please provide a list of everything this matches: .*

:wink:


a(anything)
b(anything)
etc

:wink:

