ahk lexers and parsers [was syntax highlighting]
I'm looking at creating a grammar for parsing AHK (or a subset of AHK). I've got some simple stuff working, but the unquoted interpolated strings are proving tricky for the lexer because it seems I need the parser to determine when the code should be treated as an interpolated string, and that information needs to direct the lexer. Lex can use start states to change the tokenization rules, and ANTLR can apparently do the same with predicates, but I don't think information can flow backward from parser to lexer without multiple passes.
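To make the start-state idea a bit more concrete, here is a rough sketch in plain Python (not Lex or ANTLR). It assumes a tiny made-up command set and treats commands as case-sensitive, which real AHK does not. When a line starts with a command name, the rest of the line is lexed in a separate mode as unquoted text with %var% references. `COMMANDS`, `tokenize` and `tokenize_args` are hypothetical names used only for illustration.

```python
import re

# Hypothetical, tiny command set; real AHK has many more (and ignores case).
COMMANDS = {"MsgBox", "Send", "Run"}

WORD = re.compile(r"[A-Za-z_#@$][A-Za-z0-9_#@$]*")
VAR_REF = re.compile(r"%([A-Za-z0-9_#@$]+)%")


def tokenize_args(text):
    """'Command argument' state: literal text with %var% references."""
    pos = 0
    for m in VAR_REF.finditer(text):
        if m.start() > pos:
            yield ("TEXT", text[pos:m.start()])
        yield ("VARREF", m.group(1))
        pos = m.end()
    if pos < len(text):
        yield ("TEXT", text[pos:])


def tokenize(source):
    """Pick a state per line from its first word, then tokenize."""
    for line in source.splitlines():
        stripped = line.strip()
        m = WORD.match(stripped)
        if m and m.group(0) in COMMANDS:
            yield ("COMMAND", m.group(0))
            rest = stripped[m.end():].lstrip(" ,")
            yield from tokenize_args(rest)
        elif stripped:
            # Everything else would go to the expression rules (not shown).
            yield ("EXPR", stripped)
```

The lexer decides the mode from what it has already seen on the line, so nothing has to flow back from the parser. Whether that is enough for all of AHK is a separate question.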
AutoHotkey itself only reads the code once...
I kind of gave up / lost interest for the time being...
Take a look at the C++ source for AutoHotkey.exe, or the IronAHK code, to see how string interpolation is done...
Excellent work, by the way.
Edit: also tried to ask on #pocoo, but seems pretty quiet there...
It wasn't accepted, because there were major issues with it and it wasn't ready. I have learnt a lot more Python since my last attempt, so I will give it another shot... I am restarting from scratch: source. Look at ahklexer.py; it just supports comments and continuation sections for now.
tinku99: Pygments has not included your lexer yet. I was hoping you could maybe dredge up the topic again? ...
To test it,
1. install pygments : easy_install Pygments
2. git clone https://github.com/t...9/ahklexers.git
4. cd ahklexer/pygments
5. python testAhkLexer.py
fincs, I will be working with AutoHotkey_L.
Actually it was accepted on 2011-01-03 for pygments 1.4: commit.
But I will submit the new and improved version, once all the comments are in.
I have added variable declarations to commands.
I only noticed two issues:
- "Static" doesn't appear to be highlighted (as in static variable declarations).
- For "else return", only "else" is highlighted.
For now, I am just treating else, return, static, and friends all as commands. They are only recognized as commands when they are the first word on a line.
Properly lexing command parameters is going to require more complex logic, I think. I will try.
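A small sketch of what "first word on a line" means in regex terms, in plain Python rather than the actual lexer rules. The keyword list and the names `KEYWORDS`, `FIRST_WORD` and `leading_commands` are made up for illustration; the real lexer's rule may look different.

```python
import re

# Hypothetical keyword subset, for illustration only.
KEYWORDS = ["else", "return", "static", "MsgBox"]

# (?im): ignore case, and let ^ match at the start of every line.
FIRST_WORD = re.compile(
    r"(?im)^[ \t]*(" + "|".join(map(re.escape, KEYWORDS)) + r")\b"
)


def leading_commands(source):
    """Return the keyword found at the start of each line, if any."""
    return [m.group(1) for m in FIRST_WORD.finditer(source)]
```

With a rule anchored like this, `else return` only yields `else`, because `return` is not at the start of the line. That matches the behaviour reported above.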
- no longer cheating with (r'.', Text) rule to eat any unmatched characters.
- this means now a few scripts in toralf's stdlib fail on non-ascii characters.
- fixed allowed variable names to include @$#.
- added all ahk escape sequences
- changed labels and hotkeys rules to not match (". As a workaround, you could use +9 or +' as hotkeys if you really want them colored.
- added more easily browsable samples of highlighted scripts here
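To illustrate the first few changes above, here is a rough standalone sketch, not the code from ahklexer.py. It assumes that names may contain letters, digits, underscore and @ $ #, and that a backtick followed by one of a few characters is an escape. The escape list here is a partial guess, and `strict_tokens` is a hypothetical name. With no catch-all rule, anything unmatched raises instead of being swallowed.

```python
import re

# Assumption: names may contain letters, digits, underscore and @ $ #.
VARIABLE = r"[A-Za-z0-9_@$#]+"
# Assumption: backtick plus one of these characters is an escape (partial list).
ESCAPE = r"`[,%`;nrbtvaf]"

TOKEN = re.compile(
    rf"(?P<ESCAPE>{ESCAPE})|(?P<VARIABLE>{VARIABLE})|(?P<SPACE>[ \t]+)"
)


def strict_tokens(text):
    """Tokenize without a catch-all rule: unmatched input raises."""
    pos = 0
    out = []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"no rule matches {text[pos]!r} at {pos}")
        out.append((m.lastgroup, m.group(0)))
        pos = m.end()
    return out
```

This also shows why dropping the catch-all makes scripts with non-ASCII characters fail: an ASCII-only character class simply has no rule for them.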
send a pull request to the pygments mercurial repo.
pull request 20: updated autohotkey lexer.
but command parameters are highlighted as strings rather than generic text now.
sample output: http://golguppe.com/.../pygments/demo/
source: https://github.com/tinku99/ahklexers
In pygments v1.4, if is highlighted as a function:
if (A_Username = "a") MsgBox a
Output:
<span class='nf'>if</span><span class='w'> </span><span class='p'>(</span><span class='nb'>A_Username</span><span class='w'> </span><span class='o'>=</span><span class='w'> </span><span class='s'>"a"</span><span class='p'>)</span> <span class='w'> </span><span class='k'>MsgBox</span><span class='w'> </span>a
Sorry if this has already been fixed.
Also, I'm wondering why the apostrophe is highlighted as an error?
MsgBox He's very clever.
<span class='k'>MsgBox</span><span class='w'> </span>He<span class='err'>'</span>s<span class='w'> </span>very<span class='w'> </span>clever<span class='p'>.</span>
For now, I am just treating "if", "for" and friends as functions rather than commands, since they expect expressions. I could change them to highlight as builtins, but was deferring that to a full parser. I have been able to feed the Pygments lexer output to both PLY and ANTLR, but haven't gone beyond a proof of concept. https://github.com/t.../antlr-pygments
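For anyone curious what feeding lexer output to a parser can look like, here is a minimal sketch, not the code in the repo. It relies on a PLY-style parser pulling tokens through a `token()` method and reading `type`, `value`, `lineno` and `lexpos` from each one. The input is assumed to be (index, kind, value) triples shaped like the output of Pygments' `get_tokens_unprocessed`, with the kinds already mapped to plain strings. `Tok`, `TokenStreamAdapter` and the kind names are hypothetical.

```python
class Tok:
    """Minimal token object with the attributes a PLY-style parser reads."""

    def __init__(self, type_, value, lineno, lexpos):
        self.type = type_
        self.value = value
        self.lineno = lineno
        self.lexpos = lexpos


class TokenStreamAdapter:
    """Wrap (index, kind, value) triples behind a token() method."""

    def __init__(self, triples, skip=("WS",)):
        self._it = iter(triples)
        self._skip = set(skip)
        self._lineno = 1

    def token(self):
        # Resumes where the previous call stopped, since _it is an iterator.
        for index, kind, value in self._it:
            lineno = self._lineno
            self._lineno += value.count("\n")
            if kind in self._skip:
                continue  # e.g. whitespace the grammar does not care about
            return Tok(kind, value, lineno, index)
        return None  # end of input
```

Skipping whitespace in the adapter is just one choice; a grammar where line breaks matter would want to keep newline tokens.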
if is highlighted as a function:...
why the apostrophe is highlighted as an error?
https://github.com/t... ... tAhkPly.py
The apostrophe error probably comes from "He's very clever" not being lexed correctly as a string after a command. I probably need to stop treating the apostrophe as punctuation or something...
Single-line comments are enclosed by
<span class="c-Singleline"> ; resolve type</span>
This causes them not to be highlighted properly on GitHub. GitHub support says that's an error in Pygments and this class should not be there.
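A guess at where that class name comes from, as a simplified model rather than Pygments' actual code. As far as I understand it, the HTML formatter looks the token type up in a table of standard short names; for a type that is not in the table, it falls back to the nearest known parent and appends the remaining name parts. The table below is a tiny hypothetical excerpt and `css_class` is a made-up name.

```python
# Simplified model; the real table in Pygments is much larger.
STANDARD_TYPES = {
    (): "",
    ("Comment",): "c",
    ("Comment", "Single"): "c1",
    ("Comment", "Multiline"): "cm",
}


def css_class(ttype):
    """Short name for a known type, else nearest known parent plus suffix."""
    suffix = ""
    while ttype not in STANDARD_TYPES:
        suffix = "-" + ttype[-1] + suffix
        ttype = ttype[:-1]
    return STANDARD_TYPES[ttype] + suffix
```

Under that model a lexer emitting a non-standard `Comment.Singleline` would produce `c-Singleline`, while the standard `Comment.Single` gives `c1`, which stylesheets do know about. If that is what is happening here, switching the lexer to the standard token type should be enough.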