Format the wikicode of infoboxes… Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
SyntaxTerror
Posts: 53
Joined: 23 May 2017, 12:55

Format the wikicode of infoboxes…

Post by SyntaxTerror » 16 Nov 2022, 11:42

Hello

I am a contributor to Wikipedia and I would like to make an AutoHotkey script that formats the wikicode of infoboxes and other similar templates.

Infoboxes are templates that display a box on the side of articles and show the values of the parameters entered (the parameters are numerous, and they differ in number, length and type of characters used depending on the infobox).

Parameters are always preceded by a pipe (|) and end with an equal sign (=). On rare occasions, multiple parameters can be put on the same line, but I can sort this manually before running the script.

A typical infobox will be like this:

Code: Select all

{{Infobox XYZ
 | first parameter  = foo
 | second_parameter = 
 | 3rd parameter    = bar
 | 4th              = bazzzzz
 | 5th              = 
 | etc.             = 
}}
But sometimes (lazy) contributors write them like this:

Code: Select all

{{Infobox XYZ
|first parameter=foo
|second_parameter= 
|3rd parameter=bar
|4th=bazzzzz
|5th= 
|etc.= 
}}
Which isn't very easy to read and modify.

I would like to know if it is possible to make a regex (or a series of regexes) that would transform the second example into the first.

Each line should start with a space, then a pipe, then another space, then the parameter name, then as many spaces as needed to line up the equal signs across all lines, then an equal sign, then another space and, if present, the parameter value.

I tried some things using multiple capturing groups, but I'm getting nowhere... (this is the best I could come up with: https://regex101.com/r/GunrUg/1).

Would someone have an idea on how to make it work?

I have also asked on Stack Overflow (https://stackoverflow.com/questions/74448042/regex-to-format-wikipedias-infoboxes-code?noredirect=1#comment131424643_74448042), and Taazar told me about a JS script I could add to my common.js page on Wikipedia, but it doesn't seem to work for me (the script is here: https://en.wikipedia.org/wiki/User:Taavi/Aligner.js ).

Thank you for your time.

picklepissjar
Posts: 20
Joined: 22 Oct 2022, 10:03

Re:

Post by picklepissjar » 16 Nov 2022, 12:06

I'm not very good with regex, but here's a start. Idk how to get the tabs to align like in the example.
https://regex101.com/r/eQ1R0Y/1

SyntaxTerror
Posts: 53
Joined: 23 May 2017, 12:55

Re: Re:

Post by SyntaxTerror » 16 Nov 2022, 12:14

picklepissjar wrote:
16 Nov 2022, 12:06
I'm not very good with regex, but here's a start. Idk how to get the tabs to align like in the example.
https://regex101.com/r/eQ1R0Y/1
Thank you, but tabs don't work in the Wikipedia editor (only normal spaces do).
Note: The font used in the Wikipedia editing window is monospaced (like Courier New), so every character has the same width.

picklepissjar
Posts: 20
Joined: 22 Oct 2022, 10:03

Re: Format the wikicode of infoboxes…

Post by picklepissjar » 16 Nov 2022, 13:14

Code: Select all

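length := 20								; column width for parameter names
FileRead, text, input.txt					; placeholder names: replace input.txt with the text file to parse
stuff := ""									; collects the formatted lines
While RegExMatch(text, "O)\n\|(.*)=(.*)", m, prevPos ? prevPos : 1) {
	str := " | " m.Value(1)
	Loop, % length - m.Len(1)				; pad the name with spaces up to the column width
		str .= A_Space
	str .= "= " m.Value(2)
	stuff .= str "`n"
	prevPos := m.Pos() + m.Len()			; continue searching after this match
	MsgBox % str
}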
The length is set manually, since I don't know what it should be; this also assumes that all characters have the same width.
length must be greater than the length of the longest parameter name before the =
Edit: The one below is way better :lol:
Last edited by picklepissjar on 16 Nov 2022, 23:38, edited 1 time in total.

geek
Posts: 1055
Joined: 02 Oct 2013, 22:13
Location: GeekDude

Re: Format the wikicode of infoboxes…

Post by geek » 16 Nov 2022, 13:54

Check out this code from Mordecai on the Discord:

Code: Select all

regex := "O)\s*\|\s*(.*?)\s*=\s*(.*)", width := 1
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		width := Max(width, StrLen(_[1]))
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		out .= Format(" | {:-" width "} = {2}", _[1],_[2]) "`n"
	else
		out .= A_LoopField "`n"
Clipboard := out

/*
input:
{{Infobox XYZ
|first parameter=foo
|second_parameter= 
|3rd parameter=bar
|4th=bazzzzz
|5th= 
|etc.= 
}}
output:
{{Infobox XYZ
 | first parameter  = foo
 | second_parameter = 
 | 3rd parameter    = bar
 | 4th              = bazzzzz
 | 5th              = 
 | etc.             = 
}}
*/
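
The same two-pass approach could also be wrapped in a function, so that it can be reused without touching the clipboard directly. This is only a sketch: the name AlignInfobox and its text parameter are hypothetical and do not come from the code above; the regex and the Format call are the ones already used there.

```autohotkey
; Sketch only: AlignInfobox is a hypothetical helper name, not part of the original script.
AlignInfobox(text) {
	regex := "O)\s*\|\s*(.*?)\s*=\s*(.*)"
	width := 1
	out := ""
	; First pass: find the length of the longest parameter name.
	Loop, Parse, text, `n, `r
		If RegExMatch(A_LoopField, regex, m)
			width := Max(width, StrLen(m[1]))
	; Second pass: left-align each name in a field of that width.
	; Lines that are not parameters are copied unchanged.
	Loop, Parse, text, `n, `r
	{
		If RegExMatch(A_LoopField, regex, m)
			out .= Format(" | {:-" width "} = {}", m[1], m[2]) "`n"
		Else
			out .= A_LoopField "`n"
	}
	Return SubStr(out, 1, -1)	; drop the newline appended after the last line
}
```

It could then be called as Clipboard := AlignInfobox(Clipboard). Because width, out and m are local to the function, nothing needs to be reset between calls.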

SyntaxTerror
Posts: 53
Joined: 23 May 2017, 12:55

Re: Format the wikicode of infoboxes…

Post by SyntaxTerror » 17 Nov 2022, 09:30

geek wrote:
16 Nov 2022, 13:54
Check out this code from Mordecai on the Discord:

Code: Select all

^i::
^x
regex := "O)\s*\|\s*(.*?)\s*=\s*(.*)", width := 1
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		width := Max(width, StrLen(_[1]))
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		out .= Format(" | {:-" width "} = {2}", _[1],_[2]) "`n"
	else
		out .= A_LoopField "`n"
Clipboard := out
Sleep, 500
^v
Return
*/
Hello geek and thank you for your answers.

I tried the script above but couldn't make it work:
I guess that the script takes what is inside the clipboard and replaces it with the modified code with added spaces, but when I select the lines needing to be modified and press Ctrl+i, it just erases everything selected...
Sorry but I'm a bit of a noob sometimes...

picklepissjar
Posts: 20
Joined: 22 Oct 2022, 10:03

Re: Format the wikicode of infoboxes…  Topic is solved

Post by picklepissjar » 17 Nov 2022, 14:52

Code: Select all

^i::
^x
regex := "O)\s*\|\s*(.*?)\s*=\s*(.*)", width := 1
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		width := Max(width, StrLen(_[1]))
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		out .= Format(" | {:-" width "} = {2}", _[1],_[2]) "`n"
	else
		out .= A_LoopField "`n"
Clipboard := out
Sleep, 500
^v
Return
*/
The issue is that ^x and ^v on their own are not valid syntax.
They should be Send, ^x and Send, ^v respectively.
Also, I don't know if the Sleep is needed; I removed it on my end and it worked perfectly. :D
Complete code:

Code: Select all

^i::
out := ""
Send, ^x
regex := "O)\s*\|\s*(.*?)\s*=\s*(.*)", width := 1
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		width := Max(width, StrLen(_[1]))
Loop, Parse, Clipboard, `n, `r
	If RegExMatch(A_LoopField, regex, _)
		out .= Format(" | {:-" width "} = {2}", _[1],_[2]) "`n"
	else
		out .= A_LoopField "`n"
Clipboard := out
Send, ^v
Return
Edit: Added out := "" to reset value between runs (otherwise it keeps adding onto itself)
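
A possible hardening, not tested by anyone in this thread: reading Clipboard right after Send, ^x relies on the cut having finished already. The sketch below empties the clipboard first and uses ClipWait to wait for the new text. The 1-second timeout and the early Return are assumptions added for illustration; the rest is the same code as above.

```autohotkey
; Sketch only: the ClipWait timeout and the early Return are assumptions,
; not part of the script posted above.
^i::
	Clipboard := ""				; empty the clipboard so ClipWait can detect new data
	Send, ^x
	ClipWait, 1					; wait up to 1 second for the cut text to arrive
	If ErrorLevel				; timed out: nothing was cut, so do nothing
		Return
	regex := "O)\s*\|\s*(.*?)\s*=\s*(.*)", width := 1, out := ""
	Loop, Parse, Clipboard, `n, `r
		If RegExMatch(A_LoopField, regex, _)
			width := Max(width, StrLen(_[1]))
	Loop, Parse, Clipboard, `n, `r
	{
		If RegExMatch(A_LoopField, regex, _)
			out .= Format(" | {:-" width "} = {2}", _[1], _[2]) "`n"
		Else
			out .= A_LoopField "`n"
	}
	Clipboard := out
	Send, ^v
Return
```

With this variant, pressing Ctrl+i with nothing selected should leave the text untouched instead of pasting an empty result.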

SyntaxTerror
Posts: 53
Joined: 23 May 2017, 12:55

Re: Format the wikicode of infoboxes…

Post by SyntaxTerror » 17 Nov 2022, 15:28

picklepissjar wrote:
17 Nov 2022, 14:52
The issue is with the ^x and ^v being improper syntax
Gee, I'm ashamed of myself... :oops:

Thank you very much, it works very well!
