jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

jeeswg

11 Feb 2017, 22:16

[this page was called 'RegEx handy examples (RegExMatch, RegExReplace)']

I've tried to collect all of the most important RegEx techniques that I've used or may like to use.

Please notify me of any corrections or improvements/simplifications.
Do post any handy examples not included, or links to other examples.
Also do mention any useful techniques doable in RegEx but not mentioned in AutoHotkey's help.

Btw if other people want to create their own lists of RegEx handy samples, including some that are closely/loosely based on mine, please go ahead and post your link here.
For example at the extreme end: copying the code, but only changing the variable names, is fine.

Here is a RegEx quick tutorial, outlining various basic concepts with example code:

;RegEx quick tutorial

;CONCEPTS: STRING STARTS/ENDS/CONTAINS
;CONCEPTS: CASE SENSITIVE/CASE INSENSITIVE
;CHARACTERS: ^ (STARTS WITH)
;CHARACTERS: $ (ENDS WITH)
;CHARACTERS: | (OR)
;CHARACTERS: ()
;STRINGS: i) (CASE INSENSITIVE) (AHK-SPECIFIC, NONSTANDARD REGEX)

MsgBox, % RegExMatch(vText, "abc") ;does string contain abc (case sensitive)
MsgBox, % RegExMatch(vText, "ABC") ;does string contain ABC (case sensitive)
MsgBox, % RegExMatch(vText, "i)abc") ;does string contain abc (case insensitive)

MsgBox, % RegExMatch(vText, "abc") ;does string contain abc
MsgBox, % RegExMatch(vText, "^abc") ;does string start with abc
MsgBox, % RegExMatch(vText, "abc$") ;does string end with abc
MsgBox, % RegExMatch(vText, "^abc$") ;does string equal abc

MsgBox, % RegExMatch(vText, "abc|def") ;does string contain abc or def (or both)
MsgBox, % RegExMatch(vText, "(abc|def)") ;does string contain abc or def (or both) (same as line above)
MsgBox, % RegExMatch(vText, "^(abc|def)") ;does string start with abc or def
MsgBox, % RegExMatch(vText, "(abc|def)$") ;does string end with abc or def
MsgBox, % RegExMatch(vText, "^(abc|def)$") ;does string equal abc or def

MsgBox, % RegExMatch(vText, "^(Sun|Mon|Tues|Wednes|Thurs|Fri|Satur)day$") ;is the string a day of the week

;note: you can also use the ~= operator, to use RegExMatch with 2 parameters:
MsgBox, % ("ABCDEFGHI" ~= "i)def") ;does string contain def (case insensitive)
MsgBox, % RegExMatch("ABCDEFGHI", "i)def") ;does string contain def (case insensitive)

;CONCEPTS: CHARACTERS/STRINGS/NEEDLES MUST APPEAR N TIMES
;CONCEPTS: USE THE FOUND TEXT IN THE REPLACE TEXT
;CHARACTERS: ? (0 OR 1)
;CHARACTERS: * (0 OR MORE)
;CHARACTERS: + (1 OR MORE)
;CHARACTERS: . (ANY CHARACTER) (NOTE: IF DOTALL OFF, DOESN'T MATCH NEWLINES)
;CHARACTERS: {}
;STRINGS: \d (DIGIT)
;STRINGS: $0 (TO USE FOUND TEXT IN THE REPLACE TEXT)

MsgBox, % RegExMatch(vText, "colou?r") ;does string contain color/colour
MsgBox, % RegExMatch(vText, "color|colour") ;does string contain color/colour (same as line above)
MsgBox, % RegExMatch(vText, "^\d*$") ;does the string only contain digits (it can be blank)
MsgBox, % RegExMatch(vText, "^\d+$") ;does the string only contain digits (at least one digit)
MsgBox, % RegExMatch(vText, "^-?\d+$") ;does the string only contain digits (at least one digit) (it can have a leading minus sign)
MsgBox, % RegExReplace(vText, " +", " ") ;replace multiple spaces with single spaces
MsgBox, % RegExReplace(vText, ".", "$0,") ;add a comma after every character
MsgBox, % RegExReplace(vText, ".{3}", "$0,") ;add a comma after every 3 characters
MsgBox, % RegExMatch(vText, "^(...)*$") ;string must have 0/3/6/9/... characters (a multiple of 3)
MsgBox, % RegExMatch(vText, "^(...)+$") ;string must have 3/6/9/... characters (a multiple of 3)
MsgBox, % RegExMatch(vText, "^(abc)*$") ;string must contain abc zero or more times
MsgBox, % RegExMatch(vText, "^(abc)+$") ;string must contain abc one or more times

MsgBox, % RegExMatch(vText, "^\d{2}$") ;is it a 2-digit number
MsgBox, % RegExMatch(vText, "^\d{4}$") ;is it a 4-digit number
MsgBox, % RegExMatch(vText, "^\d{2,}$") ;is it a 2-digit number (or longer)
MsgBox, % RegExMatch(vText, "^\d{0,4}$") ;is it a 4-digit number (or shorter) (or blank)
MsgBox, % RegExMatch(vText, "^\d{2,4}$") ;is it a number between 2 and 4 digits long

MsgBox, % RegExReplace(A_Now, "(....)(..)(..)(..)(..)(..)", "$1-$2-$3 $4:$5:$6") ;format the date

;CONCEPTS: SOME SPECIAL REGEX CHARACTERS NEED ESCAPING TO BE USED LITERALLY
;CHARACTERS: \
;STRINGS: \Q \E (text between \Q and \E is treated literally)
;NOTE: 12 characters that need escaping in RegEx generally: \.*?+[{|()^$

MsgBox, % RegExReplace("a.b", ".") ;replaces all characters
MsgBox, % RegExReplace("a.b", "\.") ;replaces dots
MsgBox, % RegExReplace("a.b", "\Q.\E") ;replaces dots
MsgBox, % RegExReplace("a...b", "\.\.\.") ;replaces dots
MsgBox, % RegExReplace("a...b", "\Q...\E") ;replaces dots

;CONCEPTS: STORE OUTPUT IN AN OBJECT (NAMED SUBPATTERNS)
;STRINGS: O) (STORE OUTPUT IN AN OBJECT) (AHK V1 ONLY) (AHK-SPECIFIC, NONSTANDARD REGEX)
;STRINGS: (?P<name>needle) (STORE OUTPUT IN AN OBJECT)

;note: for AHK v2, remove the 'O)'
;note: it outputs a RegExMatchObject, not a standard AHK array
RegExMatch(A_Now, "O)^(?P<Year>\d{4})(?P<Month>\d{2})(?P<Day>\d{2})", oMatch)
MsgBox, % oMatch.Year " " oMatch.Month " " oMatch.Day

;CONCEPTS: CHARACTER CLASSES
;CHARACTERS: ^ (CHARACTER CLASS: DOESN'T CONTAIN)
;CHARACTERS: - (CHARACTER CLASS: RANGE OF CHARACTERS)
;CHARACTERS: []
;NOTE: 4 characters that need escaping in a RegEx character class: ^-]\

MsgBox, % RegExReplace("abcdefghij", "[a-e]") ;replace characters a to e
MsgBox, % RegExReplace("abcdefghij", "[f-j]") ;replace characters f to j
MsgBox, % RegExReplace("abcdefghij", "[aeiou]") ;replace vowels
MsgBox, % RegExReplace("abcdefghij", "[^aeiou]") ;replace anything that isn't a vowel
MsgBox, % RegExReplace("abcdefghij", "[ac-hj]") ;replace characters a, c to h, and j

MsgBox, % RegExReplace(vText, "^[ `t]+") ;replace leading whitespace
MsgBox, % RegExReplace(vText, "[ `t]+$") ;replace trailing whitespace

;CONCEPTS: LOOK-AHEAD AND LOOK-BEHIND ASSERTIONS
;STRINGS: (?<=) (?=) (POSITIVE LOOK BEHIND/AHEAD)
;STRINGS: (?<!) (?!) (NEGATIVE LOOK BEHIND/AHEAD)

MsgBox, % RegExMatch(vText, "(?<= )abc") ;does string contain abc preceded by a space
MsgBox, % RegExMatch(vText, "(?<! )abc") ;does string contain abc not preceded by a space
MsgBox, % RegExMatch(vText, "abc(?= )") ;does string contain abc followed by a space
MsgBox, % RegExMatch(vText, "abc(?! )") ;does string contain abc not followed by a space
MsgBox, % RegExMatch(vText, "(?<= )abc(?= )") ;does string contain abc both preceded and followed by a space
MsgBox, % RegExMatch(vText, "(?<! )abc(?! )") ;does string contain abc neither preceded nor followed by a space

Here is the RegEx tutorial proper: [522 lines (initially)]

;[updated: 2018-02-25]
;==================================================

;CONTENTS

;QUICK REFERENCE: ESCAPED CHARACTERS
;LINKS
;NOTES: REGEX EQUIVALENTS FOR ? AND * WILDCARDS IN WINDOWS
;NOTES: + VERSUS *
;NOTES: ?
;NOTES: <.+> VERSUS <.+?>
;NOTES: ^
;NOTES: NEWLINES (LINE BREAKS)
;NOTES: CHARACTER CLASSES
;NOTES: CHARACTERS
;NOTES: CHARACTER TYPES
;NOTES: CHARACTER TYPES (SCRIPT NAMES)
;NOTES: POSIX NAMED SETS
;NOTES: ESCAPED CHARACTERS
;COLUMNS/CROP: REMOVE COLUMNS (CROP BEFORE/AFTER FIRST/LAST OCCURRENCE IN EACH LINE)
;COLUMNS: INCREASE WHITESPACE BETWEEN COLUMNS
;FIND/REMOVE CHARACTERS
;REMOVE TAGS (REMOVE HTML TAGS)
;STRING COMPARE (WILDCARDS): ? AND *
;KEEP/REMOVE LINES THAT CONTAIN STRING
;CROP CHARACTERS
;GET NTH ITEM
;LOOK UP ITEMS IN A TABLE
;IF VAR IN LIST
;IF VAR CONTAINS(/STARTS/ENDS) LIST
;TRIM LEADING/MULTIPLE/TRAILING CHARACTERS
;TRIM: RECREATING AUTOHOTKEY'S TRIM/LTRIM/RTRIM FUNCTIONS
;SEPARATE LEADING WHITESPACE / CODE / COMMENTS
;PUT STRINGS BEFORE/AFTER OCCURRENCES OF NEEDLE
;SLICE STRING / PAD STRING
;WHOLE WORD MATCH/REPLACE
;UPPERCASE / LOWERCASE / TITLE CASE
;REMOVE URLS FROM STRING
;SPLIT PATH
;DATES
;BACKREFERENCES
;BINARY SEARCH
;REGULAR EXPRESSION CALLOUTS (GET ALL MATCHES/OCCURRENCES)
;GET ALL MATCHES
;SYNTAX SPECIFIC TO AUTOHOTKEY
;QUERIES

;==================================================

;QUICK REFERENCE: ESCAPED CHARACTERS

;12 characters that need escaping in RegEx generally: \.*?+[{|()^$

;not brackets (8): \ .*?+ | ^$
;open brackets (3): [ { (
;close brackets (1): )

;4 characters that need escaping in a RegEx character class: ^-]\

;==================================================

;LINKS

;Regular Expressions (RegEx) - Quick Reference
;https://www.autohotkey.com/docs/misc/RegEx-QuickRef.htm

;RegExMatch
;https://www.autohotkey.com/docs/commands/RegExMatch.htm

;RegExReplace
;https://www.autohotkey.com/docs/commands/RegExReplace.htm

;Regular Expression Callouts
;https://autohotkey.com/docs/misc/RegExCallout.htm

;SetTitleMatchMode
;https://www.autohotkey.com/docs/commands/SetTitleMatchMode.htm#RegEx

;pcre.txt
;http://www.pcre.org/pcre.txt

;==================================================

;NOTES: REGEX EQUIVALENTS FOR ? AND * WILDCARDS IN WINDOWS

;In Windows:
;? usually means 1 character [RegEx equivalent: .]
;* usually means 0 or more characters [RegEx equivalent: .*]

;In RegEx:
;? usually means 0 or 1 of the preceding item; it can also make a quantifier ungreedy (discussed below), or have a special meaning inside round brackets
;* usually means 0 or more of the preceding character, class, or subpattern

;note:
;a* (0 or more a's)
;a+ (1 or more a's)
;a{2,} (2 or more a's)
;a{3} (3 a's)
;a{4,6} (between 4 and 6 a's)
;a{0,7} (between 0 and 7 a's)
;a{,7} (warning: not treated as a quantifier, PCRE reads it as literal text; use a{0,7} instead)

;==================================================

;NOTES: + VERSUS *

;a+ 1 or more a's
;a* 0 or more a's
;.+ 1 or more characters
;.* 0 or more characters

;==================================================

;NOTES: ?

;zero or one of the preceding character, class, or subpattern:
;?

;Greed:
;see 'NOTES: <.+> VERSUS <.+?>' below

;To use the parentheses without the side-effect of capturing a subpattern
;e.g. (?:.*)

;Look-ahead and look-behind assertions:
;(?=...) pattern exists to right
;(?!...) pattern does not exist to right
;(?<=...) pattern exists to left (\K is similar and more versatile)
;(?<!...) pattern does not exist to left

;Callouts provide a means of temporarily passing control to the script in the middle of regular expression pattern matching.
;(Perform RegExMatch multiple times, a function retrieves the results one by one.)
;e.g. (?CCallout)

;Change options on-the-fly.
;e.g. (?im) turns on the case-insensitive and multiline options
;e.g. (?-im) would turn them both off
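
;a brief sketch of two of the items above (the sample strings are for illustration only):
;(?:...) groups without capturing, so subpattern 1 is the digits
RegExMatch("abcabc123", "(?:abc)+(\d+)", vMatch)
MsgBox, % vMatch1 ;123
;(?i) turns on case-insensitive matching from within the needle
MsgBox, % RegExMatch("ABC", "(?i)abc") ;1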

;==================================================

;NOTES: <.+> VERSUS <.+?>
;also: <.*> versus <.*?>

;https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#UCP
;Greed: By default, *, ?, +, and {min,max} are greedy because they consume all characters up through the last possible one that still satisfies the entire pattern.
;To instead have them stop at the first possible character, follow them with a question mark.
;For example, the pattern <.+> (which lacks a question mark) means: "search for a <, followed by one or more of any character, followed by a >".
;To stop this pattern from matching the entire string <em>text</em>, append a question mark to the plus sign: <.+?>.
;This causes the match to stop at the first '>' and thus it matches only the first tag <em>.

;==================================================

;NOTES: ^

;check if string starts with 'a'
vPos := RegExMatch(vText, "^a")
;check if string contains a character that is not 'a'
;(returns 0 if string is blank or only contains a's)
vPos := RegExMatch(vText, "[^a]")

;==================================================

;NOTES: NEWLINES (LINE BREAKS)
;e.g. crop 3 characters from the start of each line
;the examples in this document assume CRLF-delimited text,
;but can be adjusted as shown in the examples below

;CRLF (`r`n)
vText := RegExReplace(vText, "m)^...")
;LF (`n)
vText := RegExReplace(vText, "`nm)^...")
;CR (`r)
vText := RegExReplace(vText, "`rm)^...")
;CR, LF, and CRLF
vText := RegExReplace(vText, "m)(*ANYCRLF)^...")
;any type of newline, namely `r, `n, `r`n, `v/VT/vertical tab/chr(0xB), `f/FF/formfeed/chr(0xC), and NEL/next-line/chr(0x85)
vText := RegExReplace(vText, "`am)^...")
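
;a small worked example of the LF case, assuming LF-only sample text
;(the sample string is an assumption, for illustration only):
vText := "abcdef`nghijkl"
MsgBox, % RegExReplace(vText, "`nm)^...") ;def`njkl (3 characters cropped from both lines)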

;==================================================

;NOTES: CHARACTER CLASSES

;special characters: ^-]\
;e.g. [\^], [\-], [\]], and [\\]

;[abc] ;3 characters
;[a-z] ;lowercase letters
;[A-Za-z0-9] ;letters or digits
;[A-Za-z0-9\-_] ;valid characters in a YouTube video ID
;[aeiou] ;vowels
;[bcdfghjklmnpqrstvwxyz] ;consonants
;[b-df-hj-np-tv-z] ;consonants
;[^abc] ;not one of 3 characters
;[^a-z] ;not a lowercase letter
;[^A-Za-z0-9] ;not a letter or a digit

;note: [A-z] would include:
;ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
;i.e. every character from 'A' (Chr(65)) to 'z' (Chr(122))
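
;a quick check of the difference, using an underscore as the sample character:
MsgBox, % RegExMatch("_", "[A-z]") ;1 (the underscore lies between 'Z' and 'a')
MsgBox, % RegExMatch("_", "[A-Za-z]") ;0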

;==================================================

;NOTES: CHARACTERS

;e.g. double quotes: \x22
vText := "abc" Chr(34) "def"
MsgBox, % RegExReplace(vText, "\x22")

;remove characters 1 to 31
vText := ""
Loop, 255
	vText .= Chr(A_Index)
MsgBox, % ">>>" vText
MsgBox, % ">>>" RegExReplace(vText, "[\ca-\c_]")
MsgBox, % ">>>" RegExReplace(vText, "[\x01-\x1F]")

;==================================================

;NOTES: CHARACTER TYPES

;e.g.
;\p{xx} 'a character with the xx property'
;\P{xx} 'a character without the xx property'
;e.g. \p{Ll} a lowercase letter (a-z, plus non-ASCII lowercase letters)
;e.g. \P{Ll} not a lowercase letter

;for more details see:
;pcresyntax specification
;http://www.pcre.org/original/doc/html/pcresyntax.html#SEC5

;examples:

;[A-Z]
;\p{Lu}

;[a-z]
;\p{Ll}

;[A-Za-z]
;\p{L}
;[^\W0-9_] ;i.e. a '\w' (see below) but not an _ or digit

;[A-Za-z0-9]
;(\p{L}|\d)
;\p{Xan}
;[^\W_] ;i.e. a '\w' (see below) but not an _

;[A-Za-z0-9_]
;\p{Xwd}
;\w

;==================================================

;NOTES: CHARACTER TYPES (SCRIPT NAMES)

;example code to list every Unicode character (1 to 65535) for a certain script/'alphabet'/language:

;list of scripts (i.e. loosely speaking 'alphabets' and related characters) from:
;pcresyntax specification
;http://www.pcre.org/original/doc/html/pcresyntax.html
vList := "Arabic, Armenian, Avestan, Balinese, Bamum, Bassa_Vah, Batak, Bengali, Bopomofo, Brahmi, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Caucasian_Albanian, Chakma, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Duployan, Egyptian_Hieroglyphs, Elbasan, Ethiopic, Georgian, Glagolitic, Gothic, Grantha, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Imperial_Aramaic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Khojki, Khudawadi, Lao, Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu, Lycian, Lydian, Mahajani, Malayalam, Mandaic, Manichaean, Meetei_Mayek, Mende_Kikakui, Meroitic_Cursive, Meroitic_Hieroglyphs, Miao, Modi, Mongolian, Mro, Myanmar, Nabataean, New_Tai_Lue, Nko, Ogham, Ol_Chiki, Old_Italic, Old_North_Arabian, Old_Permic, Old_Persian, Old_South_Arabian, Old_Turkic, Oriya, Osmanya, Pahawh_Hmong, Palmyrene, Pau_Cin_Hau, Phags_Pa, Phoenician, Psalter_Pahlavi, Rejang, Runic, Samaritan, Saurashtra, Sharada, Shavian, Siddham, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Tirhuta, Ugaritic, Vai, Warang_Citi, Yi"
vList := StrReplace(vList, ", ", ",")

vOutput := ""
VarSetCapacity(vOutput, 100000*2)
Loop, Parse, vList, % ","
{
	vNeedle := "\p{" A_LoopField "}"
	vTemp := ""
	VarSetCapacity(vTemp, 65535*2)
	Loop, 65535
		if RegExMatch(Chr(A_Index), vNeedle)
			vTemp .= Chr(A_Index)
	vOutput .= StrLen(vTemp) "`t" A_LoopField "`t" vTemp "`r`n"
}
Clipboard := vOutput
MsgBox, % "done"

;==================================================

;NOTES: POSIX NAMED SETS

;https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#class
;The following POSIX named sets are also supported via the form [[:xxx:]],
;where xxx is one of the following words:
;alnum, alpha, ascii (0-127), blank (space or tab), cntrl (control character),
;digit (0-9), xdigit (hex digit), print, graph (print excluding space),
;punct, lower, upper, space (whitespace), word (same as \w).

vText := "|123abc$£¥€√|"
MsgBox, % RegExReplace(vText,"[^[:ascii:]]")
MsgBox, % RegExReplace(vText,"[[:ascii:]]")
MsgBox, % RegExMatch(vText,"[[:ascii:]]")
MsgBox, % RegExMatch(vText,"[^[:ascii:]]")
return

;==================================================

;NOTES: ESCAPED CHARACTERS
;escaped characters: \.*?+[{|()^$

;https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#Options
;the characters \.*?+[{|()^$ must be preceded by a backslash to be seen as literal

;prepare a literal needle:
;12 characters that need escaping in RegEx generally: \.*?+[{|()^$
vNeedle := RegExReplace(vNeedle, "[\Q\.*?+[{|()^$\E]", "\$0")

;also:
vNeedle := "\Q" RegExReplace(vNeedle, "\\E", "\E\\E\Q") "\E"

;use literal characters in a character class:
;4 characters that need escaping in a RegEx character class: ^-]\
MsgBox, % RegExReplace("abc^-]\", "[abc]")
MsgBox, % RegExReplace("abc^-]\", "[a-z]")
MsgBox, % RegExReplace("abc^-]\", "[\^\-\]\\]")
MsgBox, % RegExReplace("abc^-]\", "[\Q^-]\\E]")

;this function will prepare a string to be searched literally
;in a needle for RegExMatch/RegExReplace:
JEE_StrRegExLiteral(vText)
{
	vOutput := ""
	VarSetCapacity(vOutput, StrLen(vText)*2*2)

	Loop, Parse, vText
		if InStr("\.*?+[{|()^$", A_LoopField)
			vOutput .= "\" A_LoopField
		else
			vOutput .= A_LoopField
	return vOutput
}
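
;usage sketch for the function above (the sample strings are assumptions):
vNeedle := JEE_StrRegExLiteral("1+1=2 (approx.)")
MsgBox, % vNeedle ;1\+1=2 \(approx\.\)
MsgBox, % RegExMatch("so 1+1=2 (approx.)", vNeedle) ;4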

;==================================================

;COLUMNS/CROP: REMOVE COLUMNS (CROP BEFORE/AFTER FIRST/LAST OCCURRENCE IN EACH LINE)
;note: see 'NOTES: newlines' above [use (*ANYCRLF) or `a to handle more types of line break]

;(where n is the number of columns in the line)
;keep column 1 (remove columns 2 to n)
vText := RegExReplace(vText, "`t.*")
;keep columns 2 to n (remove column 1)
vText := RegExReplace(vText, "m)^.*?`t")
;vText := RegExReplace(vText, "mU)^.*`t") ;equivalent to line above
;keep columns 1 to n-1 (remove column n)
vText := RegExReplace(vText, "m)`t(?!.*`t).*")
;keep column n (remove columns 1 to n-1)
vText := RegExReplace(vText, ".*`t")

;ini get keys (keep column 1) (remove columns 2 to n)
vText := RegExReplace(vText, "=.*")
;ini get values (keep columns 2 to n) (remove column 1)
vText := RegExReplace(vText, "m)^.*?=")
;vText := RegExReplace(vText, "mU)^.*=") ;equivalent to line above

;AutoHotkey script remove comments (keep column 1) (remove columns 2 to n)
vText := RegExReplace(vText, " `;.*")
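
;a worked sketch of the column needles above, assuming tab-separated, CRLF-delimited sample text:
vText := "a`tb`tc`r`nd`te`tf"
MsgBox, % RegExReplace(vText, "`t.*") ;keeps column 1: a`r`nd
MsgBox, % RegExReplace(vText, "m)^.*?`t") ;keeps columns 2 to n: b`tc`r`ne`tf
MsgBox, % RegExReplace(vText, "m)`t(?!.*`t).*") ;keeps columns 1 to n-1: a`tb`r`nd`te
MsgBox, % RegExReplace(vText, ".*`t") ;keeps column n: c`r`nf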

;==================================================

;COLUMNS: INCREASE WHITESPACE BETWEEN COLUMNS

;RegExReplace: expand nth gap
vText := " ;continuation section
(
a	b	c	d	e
f	g	h	i	j
k	l	m	n	o
p	q	r	s	t
u	v	w	x	y
z
)"
MsgBox, % vText
Loop, 4
{
	vNum := A_Index
	MsgBox, % RegExReplace(vText, "(^|\n)(\t*[^\t]*){" vNum "}\K\t", "`t`t")
}
MsgBox, % vText
Loop, 4
{
	vNum := A_Index
	MsgBox, % vText := RegExReplace(vText, "(^|\n)(\t*[^\t]*){" vNum "}\K\t", "`t`t")
}

;==================================================

;FIND/REMOVE CHARACTERS
;note: some of these are similar to 'if var is type'

;check if contains/doesn't contain certain characters
vPos := RegExMatch(vText, "[A-Za-z0-9_]") ;alphanumeric characters or underscore
vPos := RegExMatch(vText, "\w") ;alphanumeric characters or underscore
vPos := RegExMatch(vText, "[A-Za-z0-9_\-]") ;valid characters in a YouTube video ID
vPos := RegExMatch(vText, "[\w\-]") ;valid characters in a YouTube video ID
vPos := RegExMatch(vText, "[A-Za-z0-9]") ;alphanumeric characters
vPos := RegExMatch(vText, "i)[a-z0-9]") ;alphanumeric characters
vPos := RegExMatch(vText, "[A-Za-z]") ;letters
vPos := RegExMatch(vText, "[A-Z]") ;uppercase letters
vPos := RegExMatch(vText, "[a-z]") ;lowercase letters
vPos := RegExMatch(vText, "\d") ;digits
vPos := RegExMatch(vText, "[0-9]") ;digits
vPos := RegExMatch(vText, "[aeiou]") ;individual characters (e.g. vowels)

vPos := RegExMatch(vText, "[^A-Za-z0-9_]") ;not alphanumeric characters or underscore
vPos := RegExMatch(vText, "\W") ;not alphanumeric characters or underscore
vPos := RegExMatch(vText, "[^A-Za-z0-9_\-]") ;invalid characters in a YouTube video ID
vPos := RegExMatch(vText, "[^\w\-]") ;invalid characters in a YouTube video ID
vPos := RegExMatch(vText, "[^A-Za-z0-9]") ;non-alphanumeric characters
vPos := RegExMatch(vText, "i)[^a-z0-9]") ;non-alphanumeric characters
vPos := RegExMatch(vText, "[^A-Za-z]") ;non-letters
vPos := RegExMatch(vText, "[^A-Z]") ;not uppercase letters
vPos := RegExMatch(vText, "[^a-z]") ;not lowercase letters
vPos := RegExMatch(vText, "\D") ;non-digits
vPos := RegExMatch(vText, "[^0-9]") ;non-digits
vPos := RegExMatch(vText, "[^aeiou]") ;not individual characters (e.g. vowels)

;replace/remove characters (see just above for more RegEx needles)
vText := RegExReplace(vText, "[aeiou]")  ;individual characters (e.g. vowels)

;remove multiple strings
vText := RegExReplace(vText, "aa|bb|cc")
vText := RegExReplace(vText, "(aa|bb|cc)")

;further examples:

;if letters or digits only
if !RegExMatch(vText, "[^A-Za-z0-9]")
	MsgBox, % "letters/digits only"

;if letters/uppercase letters/lowercase letters only
if !RegExMatch(vText, "[^A-Za-z]")
	MsgBox, % "letters only"
if !RegExMatch(vText, "[^A-Z]")
	MsgBox, % "uppercase letters only"
if !RegExMatch(vText, "[^a-z]")
	MsgBox, % "lowercase letters only"

;check for invalid filename characters [Chr(1) to Chr(31) and \/:*?"<>|]
if RegExMatch(vName, "[" Chr(1) "-" Chr(31) "\\/:*?""<>|]") ;invalid file name characters (40)
	MsgBox, % "invalid file name"
if RegExMatch(vPath, "[" Chr(1) "-" Chr(31) "/*?""<>|]") ;invalid file path characters (38) (allow : and \)
	MsgBox, % "invalid file path"

;if digits only (with optional leading hyphen) (note: a blank string also matches)
;if !RegExMatch(vNum, "[^0-9\-]") && !InStr(SubStr(vNum, 2), "-")
if RegExMatch(vNum, "^(-\d|)\d*$") ;improved version
	MsgBox, % "digits only (or blank)"

;check for datestamp e.g. '03:02 04/05/2006' ('dd:dd dd/dd/dddd')
vPos := RegExMatch(vText, "\d\d:\d\d \d\d/\d\d/\d\d\d\d")

;word/phrase to initials
;e.g. 'light-emitting diode' to 'led' (LED)
vText := "light-emitting diode"
MsgBox, % RegExReplace(vText, "[A-Za-z]\K[A-Za-z]+|[^A-Za-z]") ;with +
MsgBox, % RegExReplace(vText, "[A-Za-z]\K[A-Za-z]|[^A-Za-z]")  ;without + (doesn't work)
MsgBox, % RegExReplace(vText, "(?<=[A-Za-z])[A-Za-z]+|[^A-Za-z]") ;with +
MsgBox, % RegExReplace(vText, "(?<=[A-Za-z])[A-Za-z]|[^A-Za-z]") ;without +

;==================================================

;REMOVE TAGS (REMOVE HTML TAGS)

;html: remove tags
vText := RegExReplace(vText, "<.+?>")

;html: remove 'a' tags
vText := RegExReplace(vText, "<a .+?>")

;old AutoHotkey forum: remove 'color' tags ('colour' tags)
vText := StrReplace(vText, "[/color]")
vText := RegExReplace(vText, "\[color=.+?]")

;==================================================

;STRING COMPARE (WILDCARDS): ? AND *
;compare strings using ? and * as wildcards

;prepare a literal string but with ? and * as wildcards:
;deal with special characters: \.*?+[{|()^$
vText := RegExReplace(vText, "[\Q\.+[{|()^$\E]", "\$0")
vText := StrReplace(vText, "?", ".")
vText := StrReplace(vText, "*", ".*")

;e.g.
q::
vText := "qwertyuiopasdfghjklzxcvbnm"
vNeedle := "qw?rty*m"
MsgBox, % JEE_StrMatchWildcards(vText, vNeedle)
return

JEE_StrMatchWildcards(vText, vNeedle, vCaseSen=0)
{
	vOpt := "s"
	(vCaseSen) ? "" : (vOpt .= "i")

	;escaped characters: \.*?+[{|()^$
	vNeedle2 := ""
	VarSetCapacity(vNeedle2, StrLen(vNeedle)*2*2)
	Loop, Parse, vNeedle
	{
		vTemp := A_LoopField
		(InStr("\.+[{|()^$", vTemp)) ? (vTemp := "\" vTemp) : ""
		(vTemp = "?") ? (vTemp := ".") : ""
		(vTemp = "*") ? (vTemp := ".*") : ""
		vNeedle2 .= vTemp
	}
	return (RegExMatch(vText, vOpt ")^" vNeedle2 "$") = 1)
}

;==================================================

;KEEP/REMOVE LINES THAT CONTAIN STRING

;note: if a line contains the needle, the line and the line break after it are removed,
;this will affect whether there are any trailing line breaks

;remove lines that contain string
vText := RegExReplace(vText, "m)^.*\Q" vNeedle "\E.*(\R|$)")
;remove lines that start with string
vText := RegExReplace(vText, "m)^\Q" vNeedle "\E.*(\R|$)")
;remove lines that end with string
vText := RegExReplace(vText, "m)^.*\Q" vNeedle "\E(\R|$)")

;make lines blank that contain string
vText := RegExReplace(vText, "m)^.*\Q" vNeedle "\E.*")
;make lines blank that start with string
vText := RegExReplace(vText, "m)^\Q" vNeedle "\E.*")
;make lines blank that end with string
vText := RegExReplace(vText, "m)^.*\Q" vNeedle "\E$")

;remove lines that contain YouTube webpage titles (e.g. prior to spellcheck)
;(i.e. remove lines that end with ' - YouTube')
vText := RegExReplace(vText, "m)^.* - YouTube(\R|$)")

vText := "abc.abc`r`nabc.txt`r`nabc.abc`r`nabc.txt`r`nabc.abc`r`nabc.txt`r`n"
;remove lines that end in .txt
MsgBox, % RegExReplace(vText, "m)^.*\.txt(\R|$)")
;only keep lines that end in .txt (remove lines that don't end in .txt) (note: uses look behind)
MsgBox, % RegExReplace(vText, "m)^.*(?<!\.txt)(\R|$)")

vText := "abc`r`ndef`r`nabc`r`ndef"
;remove lines that contain b
MsgBox, % RegExReplace(vText, "m)^.*b.*(\R|$)")
;only keep lines that contain b (remove lines that don't contain b) (note: uses negative look ahead)
MsgBox, % RegExReplace(vText, "m)^((?!b).)*(\R|$)")

;==================================================

;CROP CHARACTERS
;crop first/last n characters from each line

;crop first 5 characters from each line
vText := RegExReplace(vText, "m)^.....")

;crop last 5 characters from each line
vText := RegExReplace(vText, "m).....$")

;crop first 5 and last 5 characters from each line
vText := RegExReplace(vText, "m)^.....|.....$")

;==============================

;delete text relative to needle

;delete everything after the first m
vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % RegExReplace(vText, "^.*?m\K.*")

;delete from the first n onwards
vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % RegExReplace(vText, "^.*?\Kn.*")

;delete everything up to the first m
vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % RegExReplace(vText, "^.*?m")

;delete everything before the last n (for the first n, use .*? instead of .*)
vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % RegExReplace(vText, "^.*(?=n)")

;==============================

;get text before first i
vText := "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz"
RegExMatch(vText, ".*?(?=i)", v)
MsgBox, % v

;get text before last i
vText := "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz"
RegExMatch(vText, ".*(?=i)", v)
MsgBox, % v

;==================================================

;GET NTH ITEM

;get first line
vText := SubStr(vText, 1, RegExMatch(vText "`r", "`r|`n") - 1)

;get nth item (useful link)
;NCHITTEST Full Throttled - basic Window manipulations - Scripts and Functions - AutoHotkey Community
;https://autohotkey.com/board/topic/31032-nchittest-full-throttled-basic-window-manipulations/#entry197623

vNum := 3
RegExMatch("ERROR TRANSPARENT NOWHERE CLIENT CAPTION SYSMENU SIZE MENU HSCROLL VSCROLL MINBUTTON MAXBUTTON LEFT RIGHT TOP TOPLEFT TOPRIGHT BOTTOM BOTTOMLEFT BOTTOMRIGHT BORDER OBJECT CLOSE HELP", "(?:\w+\s+){" . ErrorLevel+2&0xFFFFFFFF . "}(?<AREA>\w+\b)", HT)
MsgBox, % HTAREA

;get nth item
vNum := 3
RegExMatch("red yellow green blue", "(?:\w+\s+){" (vNum-1) "}(?<Item>\w+\b)", vMatch)
MsgBox, % vMatchItem

;initial attempt at using a comma-separated list instead
vNum := 3
RegExMatch("red,yellow,green,blue", "(?:[^,]+,+){" (vNum-1) "}(?<Item>[^,]+)", vMatch)
MsgBox, % vMatchItem

;==================================================

;LOOK UP ITEMS IN A TABLE

;lookup list:
;e.g. RGB values

;uses look-ahead (?=...) and look-behind (?<=...) assertions, see:
;https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#UCP

vNeedles := "red,yellow,green,blue,orange"
vList := "red:FF0000,yellow:FFFF00,green:00FF00,blue:0000FF"
Loop, Parse, vNeedles, `,
{
	vNeedle := A_LoopField
	;RegExMatch("," vList ",", "(?<=" vNeedle ":).*?(?=,)", vMatch) ;easier to understand version
	RegExMatch(vList, "(^|,)" vNeedle ":\K.*?(?=($|,))", vMatch)
	MsgBox, % vMatch
}

vNeedles := "red,yellow,green,blue,orange"
vList := "red=FF0000`nyellow=FFFF00`ngreen=00FF00`nblue=0000FF"
Loop, Parse, vNeedles, `,
{
	vNeedle := A_LoopField
	;RegExMatch("`n" vList "`n", "(?<=" vNeedle "=).*?(?=`n)", vMatch) ;easier to understand version
	RegExMatch(vList, "(^|`n)" vNeedle "=\K.*?(?=($|`n))", vMatch)
	MsgBox, % vMatch
}

;==================================================

;IF VAR IN LIST
;IF VAR CONTAINS(/STARTS/ENDS) LIST

;if string is in list (if var in)
MsgBox, % RegExMatch(vText, "^(red|yellow|green|blue)$")
MsgBox, % RegExMatch(vText, "^(Sun|Mon|Tues|Wednes|Thurs|Fri|Satur)day$")
MsgBox, % RegExMatch(vMonth, "^(4|6|9|11)$")

;if string starts with string (if var starts)
MsgBox, % RegExMatch(vText, "^http")
;if string starts with one of items in list
MsgBox, % RegExMatch(vText, "^(re|un)")
MsgBox, % RegExMatch(vText, "^(http|www)")
;if string starts with one of characters in list
MsgBox, % RegExMatch(vText, "^[aeiou]")

;if string ends with string (if var ends)
MsgBox, % RegExMatch(vText, "day$")
;if string ends with one of items in list
MsgBox, % RegExMatch(vText, "(s|ing|ed)$")
MsgBox, % RegExMatch(vText, "(bmp|gif|jpg|png)$")
;if string ends with one of characters in list
MsgBox, % RegExMatch(vText, "[aeiou]$")

;if string contains (at least) one of items in list (if var contains)
MsgBox, % RegExMatch(vText, "aa|bb|cc") ;case sensitive
MsgBox, % RegExMatch(vText, "i)aa|bb|cc") ;case insensitive

;if string contains every item in list (multiple if var contains)
vText := "zzAAzzBBzzCC"
MsgBox, % RegExMatch(vText, "(?=.*aa)(?=.*bb)(?=.*cc)") ;case sensitive
MsgBox, % RegExMatch(vText, "i)(?=.*aa)(?=.*bb)(?=.*cc)") ;case insensitive

;==================================================

;TRIM LEADING/MULTIPLE/TRAILING CHARACTERS

;trim leading/multiple/trailing whitespace
vText := RegExReplace(vText, "m)^[ `t]+")
vText := RegExReplace(vText, "[ `t]{2,}", " ") ;all multiple whitespace to spaces
vText := RegExReplace(vText, "[ `t]{2,}", "`t") ;all multiple whitespace to tabs
vText := RegExReplace(vText, " [ `t]+", " ") ;whitespace starting with space to spaces
vText := RegExReplace(vText, "`t[ `t]+", "`t") ;whitespace starting with tabs to tabs
vText := RegExReplace(vText, "m)[ `t]+$")

;trim leading/multiple/trailing spaces
vText := RegExReplace(vText, "m)^ +")
vText := RegExReplace(vText, " {2,}", " ")
vText := RegExReplace(vText, "m) +$")

;trim leading/multiple/trailing tabs
vText := RegExReplace(vText, "m)^`t+")
vText := RegExReplace(vText, "`t{2,}", "`t")
vText := RegExReplace(vText, "m)`t+$")

;trim leading/multiple/trailing CRLFs (trim enters)
;(note: multiple will remove blank lines,
;although it will leave leading/trailing blank lines)
vText := RegExReplace(vText, "^(`r`n)+")
vText := RegExReplace(vText, "(`r`n){2,}", "`r`n")
vText := RegExReplace(vText, "(`r`n)+$")

;multiple blank lines to single blank lines (trim blank lines)
;(note: this will replace multiple blank lines with single blank lines,
;although it will leave leading/trailing double blank lines)
vText := RegExReplace(vText, "(`r`n){3,}", "`r`n`r`n")
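
;a small check of the needle above (the sample text is an assumption):
vText := "a`r`n`r`n`r`n`r`nb"
MsgBox, % RegExReplace(vText, "(`r`n){3,}", "`r`n`r`n") ;a, one blank line, b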

;==================================================

;TRIM: RECREATING AUTOHOTKEY'S TRIM/LTRIM/RTRIM FUNCTIONS

;Trim/LTrim/RTrim via RegExReplace. Note: like the original functions, these are case sensitive.

Trim(vText, vOmitChars=" `t")
{
	vOmitChars := RegExReplace(vOmitChars, "[\Q^-]\\E]", "\$0") ;make 4 chars literal: ^-]\
	return RegExReplace(vText, "^[" vOmitChars "]*|[" vOmitChars "]*$")
}
LTrim(vText, vOmitChars=" `t")
{
	vOmitChars := RegExReplace(vOmitChars, "[\Q^-]\\E]", "\$0") ;make 4 chars literal: ^-]\
	return RegExReplace(vText, "^[" vOmitChars "]*")
}
RTrim(vText, vOmitChars=" `t")
{
	vOmitChars := RegExReplace(vOmitChars, "[\Q^-]\\E]", "\$0") ;make 4 chars literal: ^-]\
	return RegExReplace(vText, "[" vOmitChars "]*$")
}
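
;usage sketch for the functions above (the sample strings are made up for illustration)
;the first line shows what the escaping step does to the 4 special characters
MsgBox, % RegExReplace("a^b-c]d\e", "[\Q^-]\\E]", "\$0") ;a\^b\-c\]d\\e
MsgBox, % "[" Trim("  abc  ") "]" ;[abc]
MsgBox, % "[" LTrim("--abc--", "-") "]" ;[abc--]
MsgBox, % "[" RTrim("^]abc]^", "^]") "]" ;[^]abc]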

;==================================================

;SEPARATE LEADING WHITESPACE / CODE / COMMENTS

;LEADING WHITESPACE / CODE
;if there are 5 spaces I want the number 6
;if there are 0 spaces I want the number 1 (even if the line is blank)
;in order to do this:
;vWhitespace := SubStr(vText, 1, vPos-1)
;vCode := SubStr(vText, vPos)

vText := A_Space A_Space A_Space A_Space A_Space
MsgBox, % vPos := RegExMatch(vText, "[ `t]*\K")
vText := ""
MsgBox, % vPos := RegExMatch(vText, "[ `t]*\K")

;CODE / COMMENTS
;AutoHotkey sees comments as a semicolon preceded by a space or tab
;get start of comments on a line
;vPos := RegExMatch(vText, "[ `t]+;")
;remove comments from a line
;vText := RegExReplace(vText, "[ `t]+;.*")
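
;worked example (vLine is a made-up sample line,
;built by concatenation so that the string contains no space followed by a semicolon):
vLine := A_Tab A_Tab "MsgBox, hello" A_Space ";say hello"
vPos := RegExMatch(vLine, "[ `t]*\K") ;3 (2 tabs, so the code starts at character 3)
vWhitespace := SubStr(vLine, 1, vPos-1) ;the 2 tabs
vCode := RegExReplace(SubStr(vLine, vPos), "[ `t]+;.*") ;MsgBox, hello
MsgBox, % vPos "`r`n[" vCode "]"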

;==================================================

;PUT STRINGS BEFORE/AFTER OCCURRENCES OF NEEDLE

;put strings before/after datestamps (datestamp is anywhere)
vText := RegExReplace(vText, "\d\d:\d\d \d\d/\d\d/\d\d\d\d", vPrefix "$0" vSuffix)

;put strings before/after datestamps (datestamp is the entire line)
vText := RegExReplace(vText, "m)^\d\d:\d\d \d\d/\d\d/\d\d\d\d$", vPrefix "$0" vSuffix)

;enclose pagestamps (that occupy an entire line) in square brackets e.g. 'p.1' to '[p.1]'
vText := RegExReplace(vText, "m)^p\.\d+$", "[$0]")

;put parentheses (round brackets) around first item
vText1 := "abc,def,ghi"
vText2 := "abc,def"
vText3 := "abc"
MsgBox, % "(" RegExReplace(vText1, "^[^,]*(?=,|$)", "$0)")
MsgBox, % "(" RegExReplace(vText2, "^[^,]*(?=,|$)", "$0)")
MsgBox, % "(" RegExReplace(vText3, "^[^,]*(?=,|$)", "$0)")

;put parentheses (round brackets) around last item
vText1 := "abc,def,ghi"
vText2 := "abc,def"
vText3 := "abc"
MsgBox, % RegExReplace(vText1, "(?<=^|,).(?!.*,)", "($0") ")"
MsgBox, % RegExReplace(vText2, "(?<=^|,).(?!.*,)", "($0") ")"
MsgBox, % RegExReplace(vText3, "(?<=^|,).(?!.*,)", "($0") ")"
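
;worked example (vSample, vPrefix and vSuffix hold made-up values):
vSample := "01:00 01/01/2001`r`np.1`r`nsome text"
vPrefix := "{", vSuffix := "}"
vStamped := RegExReplace(vSample, "m)^\d\d:\d\d \d\d/\d\d/\d\d\d\d$", vPrefix "$0" vSuffix)
MsgBox, % vStamped ;1st line becomes {01:00 01/01/2001}
vPaged := RegExReplace(vSample, "m)^p\.\d+$", "[$0]")
MsgBox, % vPaged ;2nd line becomes [p.1]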

;==================================================

;SLICE STRING / PAD STRING

vText := "abcdefghij"
MsgBox, % RegExReplace(vText, "", "_") ;left/middle/right ;_a_b_c_d_e_f_g_h_i_j_
MsgBox, % RegExReplace(vText, "(?<=.)(?=.)", "_") ;middle ;a_b_c_d_e_f_g_h_i_j
MsgBox, % RegExReplace(vText, "(?=.)", "_") ;left/middle ;_a_b_c_d_e_f_g_h_i_j
MsgBox, % RegExReplace(vText, "(?<=.)", "_") ;middle/right ;a_b_c_d_e_f_g_h_i_j_

vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % Trim(RegExReplace(vText, ".{3}", "$0_"), "_") ;abc_def_ghi_jkl_mno_pqr_stu_vwx_yz

vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % Trim(RegExReplace("_" vText, ".{3}", "$0_"), "_") ;ab_cde_fgh_ijk_lmn_opq_rst_uvw_xyz

vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % Trim(RegExReplace("__" vText, ".{3}", "$0_"), "_") ;a_bcd_efg_hij_klm_nop_qrs_tuv_wxy_z

vText := "abcdefghijklmnopqrstuvwxyz"
vOutput := ""
Loop, 26
	vOutput .= Trim(RegExReplace(vText, ".{" A_Index "}", "$0 ")) "`r`n"
MsgBox, % SubStr(vOutput, 1, -2)
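
;group from the right without padding the haystack first
;(a sketch using look-arounds: insert '_' wherever a positive multiple of 3 characters remains,
;'(?<=.)' prevents a leading '_' when the length is an exact multiple of 3)
vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % RegExReplace(vText, "(?<=.)(?=(.{3})+$)", "_") ;ab_cde_fgh_ijk_lmn_opq_rst_uvw_xyz
;the same idea applied to digits (thousands separators):
MsgBox, % RegExReplace("1234567", "\d(?=(\d{3})+$)", "$0,") ;1,234,567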

;==================================================

;WHOLE WORD MATCH/REPLACE

;whole word match
if RegExMatch(vText, "\b\Q" A_LoopField "\E\b")

;whole word replace (e.g. change variable names)
vText := RegExReplace(vText, "\bname_no_ext\b", "vNameNoExt", 0, -1, 1)
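
;worked example (made-up haystack): only the standalone 'name_no_ext' is replaced,
;because '_' and digits count as word characters for \b
vSample := "name name_no_ext my_name_no_ext name_no_ext2"
MsgBox, % RegExReplace(vSample, "\bname_no_ext\b", "vNameNoExt") ;name vNameNoExt my_name_no_ext name_no_ext2
;\Q...\E keeps RegEx characters in the word literal (here the '+')
MsgBox, % RegExMatch("a+b c", "\b\Qa+b\E\b") ;1
MsgBox, % RegExMatch("aab c", "\b\Qa+b\E\b") ;0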

;==================================================

;UPPERCASE / LOWERCASE / TITLE CASE

vText := "The quick brown fox jumps over the lazy dog."
MsgBox, % RegExReplace(vText, ".*", "$U0")

vText := "The quick brown fox jumps over the lazy dog."
MsgBox, % RegExReplace(vText, ".*", "$L0")

vText := "The quick brown fox jumps over the lazy dog."
MsgBox, % RegExReplace(vText, ".*", "$T0")

vText := "HeLLO HeLLo HeLLo"
MsgBox, % RegExReplace(vText, "([^ ]*) ([^ ]*) ([^ ]*)", "$U1 $T2 $L3")

;there are many different variants of 'title case',
;even MS Excel and MS Word differ on how they convert to title case,
;here is some code for converting to title case which you may want to tweak:

vText := "The quick brown fox jumps over the lazy dog."
MsgBox, % RegExReplace(vText, "(\b[a-z])", "$U1")

vText := "The-quick-brown-fox-jumps-over-the-lazy-dog."
MsgBox, % RegExReplace(vText, "(\b[a-z])", "$U1")

;invert case:
vText := "Hello World"
MsgBox, % RegExReplace(vText, "([A-Z])|([a-z])", "$L1$U2")

;get string but only when it's *not* lowercase
vText := "_aaa_Aaa_AAA_aaa_Aaa_AAA_"
vPos := 1
vOutput := ""
;note: uses look behind
while vPos := RegExMatch(vText, "i)aaa(?-i)(?<!aaa)", "", vPos)
	vOutput .= SubStr(vText, vPos, 3) "`r`n", vPos += 3
MsgBox, % SubStr(vOutput, 1, -2)

;get string but only when it's *not* lower/title/upper case
vText := "_aaa_Aaa_aAA_AAA_aaa_Aaa_AAA_aAA_"
vPos := 1
vOutput := ""
;note: uses look behind
while vPos := RegExMatch(vText, "i)aaa(?-i)(?<!(aaa|Aaa|AAA))", "", vPos)
	vOutput .= SubStr(vText, vPos, 3) "`r`n", vPos += 3
MsgBox, % SubStr(vOutput, 1, -2)
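
;sentence case (a sketch, assuming sentences are separated by '. '):
;uppercase the first letter of the string and the first letter after each '. '
vText := "hello world. goodbye world."
MsgBox, % RegExReplace(vText, "(^|\. )([a-z])", "$1$U2") ;Hello world. Goodbye world.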

;==================================================

;REMOVE URLS FROM STRING

;remove urls (e.g. prior to spellcheck)
;(there may be more advanced ways of doing this)
;(note: a url is taken to run up to the next whitespace character,
;so any trailing punctuation is removed with it)
vListUrl := ""
VarSetCapacity(vListUrl, StrLen(vText)*2)
Loop
{
	if !RegExMatch(vText, "\bhttps?://\S+", vUrl, 1)
		break
	vText := RegExReplace(vText, "\bhttps?://\S+", "", 0, 1, 1)
	vListUrl .= vUrl "`r`n"
}
Loop
{
	if !RegExMatch(vText, "\bwww\.\S+", vUrl, 1)
		break
	vText := RegExReplace(vText, "\bwww\.\S+", "", 0, 1, 1)
	vListUrl .= vUrl "`r`n"
}

;==================================================

;SPLIT PATH

;there are many techniques for splitting a path into dir/name/name no ext/ext etc
;here are some of them:

vPath := "C:\Program Files\AutoHotkey\AutoHotkey.exe"
;vPath := "C:\Program Files\AutoHotkey\MyNoExtFile"
SplitPath, vPath, vName, vDir, vExt, vNameNoExt, vDrive
MsgBox, % vName "`r`n" vDir "`r`n" vExt "`r`n" vNameNoExt "`r`n" vDrive

;path/name to name
vOutput := "name:`r`n" vName
vOutput .= "`r`n" RegExReplace(vPath, ".*\\")
;vOutput .= "`r`n" SubStr(vPath, InStr(vPath, "\", 0, 0)+1) ;AHK v1
vOutput .= "`r`n" SubStr(vPath, InStr(vPath, "\", 0, -1)+1) ;AHK v2
MsgBox, % vOutput

;path to directory
vOutput := "dir:`r`n" vDir
vOutput .= "`r`n" RegExReplace(vPath, "\\(?!.*\\).*$")
vOutput .= "`r`n" RegExReplace(vPath, "\\[^\\]*$")
vOutput .= "`r`n" RegExReplace(vPath, ".*\K\\.*")
;vOutput .= "`r`n" SubStr(vPath, 1, InStr(vPath, "\", 0, 0)-1) ;AHK v1
vOutput .= "`r`n" SubStr(vPath, 1, InStr(vPath, "\", 0, -1)-1) ;AHK v2
MsgBox, % vOutput

;path to directory (with trailing backslash)
vOutput := "dir\:`r`n" vDir "\"
vOutput .= "`r`n" RegExReplace(vPath, "(?!.*\\).*$")
vOutput .= "`r`n" RegExReplace(vPath, "[^\\]*$")
vOutput .= "`r`n" RegExReplace(vPath, ".*\\\K.*")
;vOutput .= "`r`n" SubStr(vPath, 1, InStr(vPath, "\", 0, 0)) ;AHK v1
vOutput .= "`r`n" SubStr(vPath, 1, InStr(vPath, "\", 0, -1)) ;AHK v2
MsgBox, % vOutput

;path/name to extension
vOutput := "ext:`r`n" vExt
;vOutput .= "`r`n" RegExReplace(vPath, ".*(\.|$)") ;doesn't work
vOutput .= "`r`n" RegExReplace(vPath, "^.*?((\.(?!.*\\)(?!.*\.))|$)")
MsgBox, % vOutput

;path/name to name no extension
vOutput := "name no ext:`r`n" vNameNoExt
vOutput .= "`r`n" RegExReplace(vPath, ".*\\|\.[^.]*$")
MsgBox, % vOutput

;path to drive
vOutput := "drive:`r`n" vDrive
vOutput .= "`r`n" RegExReplace(vPath, ".*?:\K.*")
MsgBox, % vOutput

;also:
vPath := "C:\abc.def\ghi.jkl"
;vPath := "C:\abc.def\ghi"
;vPath := "ghi.jkl"
;vPath := "ghi"
vName := vPath

;remove extension (path to path no extension) (name to name no extension)
MsgBox, % RegExReplace(vPath, "\.[^.\\]*$")

;remove extension (name to name no extension)
;(note: not reliable for 'path to path no extension', e.g. if the directory contains '.' but the name doesn't)
MsgBox, % RegExReplace(vName, "\.[^.]*$")
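
;split a path in one RegExMatch call using named subpatterns
;(a sketch: the subpattern names Dir/File/Stem/Ext are arbitrary,
;'Name' is avoided because the match object already has a Name method,
;and a path containing at least one backslash is assumed)
vPath := "C:\Program Files\AutoHotkey\AutoHotkey.exe"
RegExMatch(vPath, "O)^(?<Dir>.*)\\(?<File>(?<Stem>[^\\]*?)(?:\.(?<Ext>[^.\\]*))?)$", o)
MsgBox, % o.Dir "`r`n" o.File "`r`n" o.Stem "`r`n" o.Ext
;C:\Program Files\AutoHotkey
;AutoHotkey.exe
;AutoHotkey
;exe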

;==================================================

;DATES

;alternatives to AutoHotkey's FormatTime function
;where A_Now is of the form 'yyyyMMddHHmmss'
MsgBox, % vDate := RegExReplace(A_Now, "(....)(..)(..).{6}", "$1-$2-$3")
MsgBox, % vDate := RegExReplace(A_Now, ".{8}(..)(..)(..)", "$1-$2-$3")
MsgBox, % vDate := RegExReplace(A_Now, "(....)(..)(..)(..)(..)(..)", "$1-$2-$3 $4-$5-$6")
MsgBox, % vDate := RegExReplace(A_Now, "(?<=..)..(?=.)", "$0|") ;e.g. 20060504030201 -> 2006|05|04|03|02|01
MsgBox, % vDate := RegExReplace(A_Now, "(?<=..)..(?=.)", "$0 ") ;e.g. 20060504030201 -> 2006 05 04 03 02 01

;from:
;[handy FormatTime one-liners]
;combining date variables is unreliable - AutoHotkey Community
;https://autohotkey.com/boards/viewtopic.php?f=5&t=36338
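
;worked examples with a fixed datestamp
;(vStamp is a made-up value in the same 'yyyyMMddHHmmss' format as A_Now):
vStamp := "20060504030201"
MsgBox, % RegExReplace(vStamp, "(....)(..)(..)(..)(..)(..)", "$1-$2-$3 $4:$5:$6") ;2006-05-04 03:02:01
;the reverse direction: 'dd/MM/yyyy' to 'yyyyMMdd'
MsgBox, % RegExReplace("04/05/2006", "(\d\d)/(\d\d)/(\d{4})", "$3$2$1") ;20060504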

;==================================================

;BACKREFERENCES

;RegExMatch, brackets within brackets

vText := "abc123xyz"
RegExMatch(vText, "(abc)123(xyz)", v)
MsgBox, % v1 " " v2
RegExMatch(vText, "(abc(123)xyz)", v)
MsgBox, % v1 " " v2

RegExMatch(A_Now, "^(?P<Year>\d{4})(?P<Month>\d{2})(?P<Day>\d{2})", v)
MsgBox, % Format("{} {} {}", vYear, vMonth, vDay)

RegExMatch(A_Now, "O)^(?P<Year>\d{4})(?P<Month>\d{2})(?P<Day>\d{2})", o)
MsgBox, % Format("{} {} {}", o.Year, o.Month, o.Day)
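
;backreference inside the needle: \1 matches the same text that subpattern 1 captured
;(a sketch with a made-up haystack: remove a repeated word)
vText := "the the quick brown brown fox"
MsgBox, % RegExReplace(vText, "\b(\w+) \1\b", "$1") ;the quick brown fox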

;==================================================

;BINARY SEARCH
;binary search (get offset of binary needle relative to start of variable)

;https://autohotkey.com/docs/misc/RegEx-QuickRef.htm
;Escape sequences in the form \xhh are also supported,
;in which hh is the hex code of any ANSI character between 00 and FF.

;e.g. the string 'abcd' as a haystack (UTF-16 LE) (4 characters, 8 bytes)
;bytes in order (in hex): 61,00,62,00,63,00,64,00
vText := "abcd"
MsgBox, % (RegExMatch(vText, "\x{0061}\x{0062}") - 1) * 2
MsgBox, % (RegExMatch(vText, "\x{0063}\x{0064}") - 1) * 2

;e.g. the square root sign as a needle (Chr(8730)) (UTF-16 LE) (1 character, 2 bytes)
;bytes in order (in hex): 1A,22
vText := "aaa" Chr(8730)
MsgBox, % (RegExMatch(vText, "\x{221A}") - 1) * 2

;e.g. the number 0x1020304000000000 as an Int64 in an 8-byte variable as a haystack
;bytes in order (in hex): 00,00,00,00,40,30,20,10
VarSetCapacity(vData, 8, 1)
NumPut(0x1020304000000000, vData, 0, "Int64")
MsgBox, % (RegExMatch(vData, "\x{3040}\x{1020}") - 1) * 2

;I recommend wOxxOm's InBuf function for binary searching:
;Machine code binary buffer searching regardless of NULL - Scripts and Functions - AutoHotkey Community
;https://autohotkey.com/board/topic/23627-machine-code-binary-buffer-searching-regardless-of-null/

;==================================================

;REGULAR EXPRESSION CALLOUTS (GET ALL MATCHES/OCCURRENCES)
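;(A single RegExMatch call: the callout function is called each time the pattern matches up to the callout,
;and by returning 1 it rejects that match so that the search continues with the next candidate.)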

;https://autohotkey.com/docs/misc/RegExCallout.htm
;Callouts provide a means of temporarily passing control to the script in the middle of regular expression pattern matching.

;e.g. datestamps and text, get list of datestamps and first line
vText = ;continuation section
(Join`r`n
01:00 01/01/2001
S1 L1
S1 L2

02:00 01/01/2001
S2 L1
S2 L2
)

vOutput := ""
RegExMatch(vText, "(?:^|`r`n)\K(\d\d:\d\d \d\d/\d\d/\d\d\d\d)`r`n(.*?)(?=`r`n|$)(?CCallout)")
MsgBox, % vOutput
return

Callout(m)
{
	global vOutput
	MsgBox, m=%m%`r`nm1=%m1%`r`nm2=%m2%
	vOutput .= m1 "`t" m2 "`r`n"
	return 1
}

;Note: if we replaced '(?CCallout)' with '(?C10:Callout)',
;and replaced 'Callout(m)' with 'Callout(m, n)',
;then n would contain the number '10'.
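
;a second sketch that uses the callout number
;(the needle, the variable vOutput2 and the function name CalloutNum are made up for illustration)
;'\b\d+\b' is used so that backtracking cannot reach the callout with a partial number
vOutput2 := ""
RegExMatch("abc 123 def 456", "\b\d+\b(?C5:CalloutNum)")
MsgBox, % vOutput2 ;expected: '5:123' and '5:456' on separate lines

CalloutNum(m, n)
{
	global vOutput2
	vOutput2 .= n ":" m "`r`n"
	return 1 ;reject this match so that the search continues
}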

;==================================================

;GET ALL MATCHES

;to count occurrences:
vText := "ABCabcd12345"
MsgBox, % RegExReplace(vText, "[A-Z]", "", vCountU)
MsgBox, % RegExReplace(vText, "[a-z]", "", vCountL)
MsgBox, % RegExReplace(vText, "\d", "", vCountN)
MsgBox, % Format("{} {} {}", vCountU, vCountL, vCountN)

;note: RegExMatch does not have a way to get all matches and store them in an object

;get multiple matches via a while loop
vText := "abc 123 def 456 ghi 789"
vOutput := "", vPos := 1
while vPos := RegExMatch(vText, "O)[A-Za-z]+", o, vPos)
	vOutput .= o.0 "`r`n", vPos += StrLen(o.0)
MsgBox, % vOutput

;get multiple matches by repeating the needle
;you can use RegExReplace to work out how many times to repeat a needle
;note: if an item appears more times than the repeated needle, the approach will still work
;note: if an item appears fewer times than the repeated needle, the approach will fail
;note: you need to pick appropriate RegEx needles and padding
vText := "abc 123 def 456 ghi 789"
vNeedle := "[A-Za-z]+"
vNeedlePad := ".*?"
RegExReplace(vText, vNeedle, "", vCount)
;MsgBox, % vCount
vNeedleMult := ""
Loop, % vCount-1
	vNeedleMult .= "(" vNeedle ")" vNeedlePad
vNeedleMult .= "(" vNeedle ")"
MsgBox, % vNeedleMult
RegExMatch(vText, "O)" vNeedleMult, o)
vOutput := ""
Loop, % o.Count()
	vOutput .= o[A_Index] "`r`n"
MsgBox, % vOutput
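
;the while loop approach wrapped in a function that returns the matches as an array
;(a sketch: 'RegExMatchAll' is a hypothetical helper name, not a built-in function,
;and the needle is assumed to have no options prefix of its own, since 'O)' is prepended)
vOutput := ""
for vIndex, vMatch in RegExMatchAll("abc 123 def 456 ghi 789", "\d+")
	vOutput .= vIndex "=" vMatch "`r`n"
MsgBox, % vOutput ;1=123, 2=456, 3=789 (one per line)

RegExMatchAll(vText, vNeedle)
{
	oArray := []
	vPos := 1
	;the position check stops the loop once an empty match at the end of the string has been handled
	while (vPos <= StrLen(vText)+1) && (vPos := RegExMatch(vText, "O)" vNeedle, o, vPos))
	{
		oArray.Push(o.Value(0))
		vPos += o.Len(0) ? o.Len(0) : 1 ;always advance, to avoid looping forever on an empty match
	}
	return oArray
}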

;see also: the section on callouts above

;see also:
;extracting items from a list using RegEx (various methods) (get all matches) - AutoHotkey Community
;https://autohotkey.com/boards/viewtopic.php?f=5&t=30448
;RegEx issue. Of course.... - AutoHotkey Community
;https://autohotkey.com/boards/viewtopic.php?f=5&t=43975&p=199745#p199745

;==================================================

;SYNTAX SPECIFIC TO AUTOHOTKEY

;AutoHotkey accepts various options,
;placed at the start of the needle, before a closing parenthesis
;e.g. object mode: 'O)'

;AHK v1:
;O: object mode, P: position mode
;"" literal double quote

;AHK v2 alpha:
;`" literal double quote
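
;examples (made-up haystacks, AHK v1 syntax):
MsgBox, % RegExMatch("xyzABC", "i)abc") ;4 (i: case-insensitive)
vPos := RegExMatch("abc123", "P)\d+", vLen) ;P: position mode, the output variable receives the length
MsgBox, % vPos " " vLen ;4 3
RegExMatch("abc123", "O)\d+", o) ;O: object mode
MsgBox, % o.Pos(0) " " o.Len(0) " " o.Value(0) ;4 3 123
RegExMatch("say ""hi""", """(.*?)""", v) ;each "" inside a quoted string is one literal double quote
MsgBox, % v1 ;hi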

;==================================================

;QUERIES

;is '[a-zA-Z0-9]' preferable to '[A-Za-z0-9]'?
;(in some tests it seemed to make no difference to the speed)
;(presumably lowercase letters appear more often than uppercase letters,
;so better to check for them first when looking for
;a character class that includes/excludes letters,
;so that on average, fewer characters are searched for)

;possible to add numbers to each line?
;e.g. add number n and a tab to the start of line n
vText := Clipboard
vOutput := ""
VarSetCapacity(vOutput, StrLen(vText)*2*2)
vText := StrReplace(vText, "`r`n", "`n")
Loop, Parse, vText, `n
	vOutput .= A_Index "`t" A_LoopField "`r`n"
Clipboard := vOutput

;look up list item, RegEx not working (that in theory should work):
vList := "red:FF0000,yellow:FFFF00,green:00FF00,blue:0000FF"
;vNeedle := "red"
vNeedle := "yellow"
MsgBox, % RegExMatch(vList, "(?<=," vNeedle ":).*?(?=($|,))", vMatch) ;works except for first word in the list
MsgBox, % RegExMatch(vList, "(?<=(^|,)" vNeedle ":).*?(?=($|,))", vMatch) ;didn't work (but looks like it should)
MsgBox, % RegExMatch(vList, "(^|,)" vNeedle ":\K.*?(?=($|,))", vMatch) ;works
MsgBox, % vMatch
;https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#UCP
;is this the explanation?
;Look-behinds are more limited than look-aheads because they do not support quantifiers of varying size such as *, ?, and +.
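;a possible explanation (based on the PCRE documentation, not on the AutoHotkey help):
;each top-level alternative of a look-behind must have a fixed length,
;although different top-level alternatives may have different lengths.
;'(^|,)' nested inside the look-behind makes a single branch of varying length (0 or 1),
;so the needle fails to compile (in AHK v1 RegExMatch then returns a blank value and sets ErrorLevel).
;a sketch of a workaround: move the alternation to the top level of the look-behind
vList := "red:FF0000,yellow:FFFF00,green:00FF00,blue:0000FF"
vOutput := ""
for _, vNeedle in ["red", "yellow"]
{
	RegExMatch(vList, "(?<=^" vNeedle ":|," vNeedle ":).*?(?=$|,)", vMatch)
	vOutput .= vNeedle "=" vMatch "`r`n"
}
MsgBox, % vOutput ;red=FF0000 and yellow=FFFF00 (one per line)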

;\Q and \E are not guaranteed to work if the string in-between also contains \Q and \E, right?
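;a sketch of one way round this (vLiteral is a made-up string that contains '\E'):
;end the quoting before each '\E', add the '\E' as an escaped backslash plus 'E', then resume quoting
;(RegExReplace is used for the substitution because it is case sensitive by default)
vLiteral := "a\Eb.c"
vNeedle := "\Q" RegExReplace(vLiteral, "\\E", "\E\\E\Q") "\E" ;\Qa\E\\E\Qb.c\E
MsgBox, % RegExMatch("xxa\Eb.c", vNeedle) ;3
MsgBox, % RegExMatch("xxa\Ebxc", vNeedle) ;0 (the '.' is literal)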

;Is it possible to achieve the AutoHotkey RegEx 'options',
;by putting text in the needle proper? (i.e. if you use RegEx outside of AutoHotkey)
;e.g. i m s U (case-insensitive matching, multiline, DotAll, ungreedy) [see: Change options on-the-fly][https://autohotkey.com/docs/misc/RegEx-QuickRef.htm]
;e.g. `n `r `a [unresolved]
;see: https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#Options

;I would be interested in RegExMatch versions
;of AutoHotkey's 'if var is type',
;if anyone has already prepared such RegEx needles.
;See script.cpp, for the source code:
;search for 'case ACT_IFIS:', the second occurrence.
;See also:
;If var is [not] type
;https://autohotkey.com/docs/commands/IfIs.htm

;is it possible to repeat a string n times using RegEx?

;unresolved:
;key things in RegEx I've not been able to do
;repeat a string n times
;reverse a string
;replace A1 with A2, B1 with B2 etc
;output 1 if matches A, 2 if matches B, etc

;==================================================
LINKS (DOCUMENTATION):

Regular Expressions (RegEx) - Quick Reference
RegExMatch
RegExReplace
Regular Expression Callouts
SetTitleMatchMode

LINKS (SPECIFIC EXAMPLES):
[remove items from a list if they start with a particular character]
Help with RegExReplace - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=33768

[trim numbers]
ZTrim() : Remove redundant leading/trailing zeroes from a number - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=33960&p=159193#p159193

[using RegEx to repeat a string]
Replicate() : Repeats a string N times - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=33977&p=157428#p157428

[backreferencing within needles]
RegExReplace More than One Needle in a Script? - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=35229

[only keep lines that contain string (replace lines that don't contain string)]
TF library TF_RegExReplaceInLines - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=36575

[remove consecutive duplicate lines]
Put here requests of problems with regular expressions - Page 9 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-9#entry94923

[find a 3-digit number that is not '008']["\d{3}(?<!008)"]
Put here requests of problems with regular expressions - Page 20 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-20

[need a regex to extract the $100 if the string contains HELLO and WORLD, otherwise, extract last word]
Put here requests of problems with regular expressions - Page 29 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-29

Put here requests of problems with regular expressions - Page 63 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-63

Code: Select all

msgbox % RegExReplace("123 456 789", "A)(\d)\d\d ?", "$1") ; outputs "147"
msgbox % RegExReplace("123 456 789", "^(\d)\d\d ?", "$1") ; outputs "1456 789"
msgbox % RegExReplace("123`n456`n789", "`nm)^(\d)\d\d\R?", "$1") ; outputs "147"
RegExReplace Transpose Chords - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=50706

[remove first character in pattern (handle repeated subpatterns) (subpatterns as 'variables') (backreferences) (capturing groups) (word count)]
Simple script to count characters and words in selection - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=63302&p=270939#p270939

LINKS (SYNTAX NOT MENTIONED IN https://autohotkey.com/docs/misc/RegEx-QuickRef.htm)
[the \G anchor]
Add Thousands Separator - Scripts and Functions - AutoHotkey Community
https://autohotkey.com/board/topic/50019-add-thousands-separator/

[*ACCEPT: quit if error]
Default/Portable installation StdLib - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=13&t=10434&p=74978#p74978
Default/Portable installation StdLib - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=13&t=10434&p=75262#p75262

[*SKIP]
Put here requests of problems with regular expressions - Page 44 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-44

[general][click Show Sidebar, View Regex Quick Reference]
Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
https://regex101.com/

LINKS:
[get all matches]
extracting items from a list using RegEx (various methods) (get all matches) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=30448
prototype 'RegEx match all' function - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=50012

[\Q, \E and escaping characters with backslashes]
simplest way to make a RegEx needle literal? - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=30420

[PCRE REGULAR EXPRESSION SYNTAX SUMMARY]
[for syntax not mentioned in https://autohotkey.com/docs/misc/RegEx-QuickRef.htm]
pcresyntax specification
http://www.pcre.org/original/doc/html/pcresyntax.html

pcre.txt
http://www.pcre.org/pcre.txt

[TIPS] A collection/library of regular expressions - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12374-tips-a-collectionlibrary-of-regular-expressions/

Regular Expressions: a simple, easy tutorial
http://phi.lho.free.fr/programming/RETutorial.en.html

Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
https://regex101.com/

Tutorial: An AHK Introduction to RegEx - Tutorials - AutoHotkey Community
https://autohotkey.com/board/topic/39733-tutorial-an-ahk-introduction-to-regex/

AutoHotkey Expression Examples: "" %% () and all that
http://www.daviddeley.com/autohotkey/xprxmp/autohotkey_expression_examples.htm#N

[look-behind v. '\K']
Put here requests of problems with regular expressions - Page 28 - Ask for Help - AutoHotkey Community
https://autohotkey.com/board/topic/12375-put-here-requests-of-problems-with-regular-expressions/page-28
\K is a different method of lookbehind. The traditional method (?<=...) cannot support quantifiers of varying size (i.e. *, ?, and +) whereas \K can. That's aside from it just being simpler to use, IMO.

Best way to learn RegEx for AHK? - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=22&t=13030

[includes some RegEx benchmark tests]
jeeswg's benchmark tests - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=37876&p=174191#p174191

RegEx: add/remove characters periodically - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=59641

LINKS (HISTORY):
[creator of PCRE, the regular expression library used by AutoHotkey]
Philip Hazel - Wikipedia
https://en.wikipedia.org/wiki/Philip_Hazel
[From Punched Cards To Flat Screens - A Technical Autobiography By Philip Hazel]
[pp. 85-90: regular expressions]
CIHK.pdf
http://people.ds.cam.ac.uk/ph10/CIHK.pdf

==================================================

NEW SECTIONS:

[2017-07-07] SPLIT PATH
[2017-07-07] SLICE STRING / PAD STRING
[2017-07-07] UPPERCASE / LOWERCASE / TITLE CASE
[2017-07-07] NOTES: CHARACTER TYPES
[2017-07-07] NOTES: CHARACTER TYPES (SCRIPT NAMES)
[2017-10-05] NOTES: CHARACTERS
[2017-10-05] COLUMNS: INCREASE WHITESPACE BETWEEN COLUMNS
[2017-10-05] TRIM: RECREATING AUTOHOTKEY'S TRIM/LTRIM/RTRIM FUNCTIONS
[2017-10-05] SEPARATE LEADING WHITESPACE / CODE / COMMENTS
[2017-10-05] DATES
[2017-10-05] BACKREFERENCES
[2017-10-05] GET ALL MATCHES
[2017-10-05] SYNTAX SPECIFIC TO AUTOHOTKEY
Last edited by jeeswg on 17 Sep 2019, 18:22, edited 63 times in total.
homepage | tutorials | wish list | fun threads | donate
WARNING: copy your posts/messages before hitting Submit as you may lose them due to CAPTCHA
noname
Posts: 516
Joined: 19 Nov 2013, 09:15

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 11:09

Nice collection, I already found some I will use. Thanks for posting :)
kon
Posts: 1756
Joined: 29 Sep 2013, 17:11

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 11:37

This looks like a useful collection.

Just some suggestions, do with them what you will:

";Greed: By default..." -- This line is the only one that is very long. Perhaps some linefeeds are in order?

vList is used throughout. I suggest giving it a value of some sample text. Then add comments to each regular expression showing the expected result. With a few modifications, this could be an actual working script that runs and shows the user the results of each regular expression.

The lack of indenting is painful for me to read, but I realize this is just personal preference. I don't think I'm in the minority though.

Sites like https://regex101.com/ are worth mentioning IMO. There are many, but that's the one I use. They are especially helpful when answering "Ask for Help" questions because you can save a link to your regular expression and post it for others to read. They explain what each symbol does, which saves a lot of typing-out explanations.

Edit:

Code: Select all

;Is it possible to achieve the AutoHotkey RegEx 'options',
;by putting text in the needle proper?
;e.g. i m s U (case-insensitive matching, multiline, DotAll, ungreedy)
;e.g. `n `r `a
;see: https://autohotkey.com/docs/misc/RegEx-QuickRef.htm#Options
See https://www.autohotkey.com/docs/misc/Re ... htm#subpat specifically "Change options on-the-fly..."
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 12:35

@noname
Thanks so much for your comments.
At the moment AHK v2 doesn't have if var in/contains/is type,
(I believe,) which had prompted me to seek various RegEx methods
in the meantime, leading to this collection.

@kon
[UPDATE:]
I split the line ';Greed: By default...'
I added indentation.
I added https://regex101.com/.
() I'm still to add in vText in more places.

@kon
'Greed: By default'
That line is long, it's a quote from the help,
I might split it up.

'vList is used throughout'
I used examples with vList as it happens,
I think you mean vText.
That's a tricky one, because that would double/triple
the lines used, and really bulk everything out.
To begin with at least, there are some that would definitely benefit
from example text, like the 'remove columns' ones,
so I'll add example text to those.

'The lack of indenting'
I counted 15 lines starting with 'if ',
with only 7 followed by lines (which would thereby
require indentation), in basic 2-line/4-line blocks.
[EDIT: and 7 lines beginning with 'Loop']
I'm sympathetic to people wanting indentation for
larger scripts.
It wouldn't be too hard for me to add that in,
and I might do, I'm curious though, is:

Code: Select all

if a
b
if c
d

if a
if b
if c
d
so much worse than:

Code: Select all

if a
	b
if c
	d

if a
	if b
		if c
			d
People seem to be really allergic to non-indentation.
I think there should be a poll on this forum,
as to whether I should use indentation,
I've heard it a fair amount.
I don't mind when people don't use indentation,
and I use Notepad, so I get zero assistance when reading scripts.
Btw these comments on indentation are aimed at everyone,
not you specifically, just that I hadn't got round to posting them.

To be honest I see indentation as a silly coding fad,
that often makes code less readable.
I do use it sometimes, when I have 2 or 3 pairs of curly brackets.
It definitely helps in those situations.
It seems to be pretty widespread, the demand for indentation,
I'm working on functions to add it in for my big scripts,
and library functions before I share them (together with manual checking),
even though personally I think over-indentation is unnecessary and undesirable,
a mark of indoctrination, with blank lines, comments and good variable names
being far more important.
I also like to use barriers (e.g. equal signs), uppercase comments e.g. ';STAGE 1 -',
and notations like ';;;SECTION' in some instances (3 semicolons being easy to search for).
Haha I'm going hard on indentation, because I've seen what looks like
some scary groupthink on the matter, I'm not particularly ideological re. programming.

Thanks so much for your comments, it means a lot coming
from you, you've done some clever things on this forum.
Last edited by jeeswg on 19 Feb 2017, 12:47, edited 2 times in total.
kon
Posts: 1756
Joined: 29 Sep 2013, 17:11

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 12:58

Yes, I meant vText.

I edited my post right before you posted. Not sure if you saw my edit so I'll just point it out here.

Regarding the sample text and expected results.
I've answered a lot of "Ask for Help" questions on regex. They usually play-out in two ways.
1. The OP provides sample text for both the haystack and the expected results.
2. OP tries to explain the problem in plain English.

#1 is usually answered with one reply.
#2 usually becomes a "moving target" style question as people post solutions that appear to conform to OP's request, but due to OP's lack of understanding of regex the solutions need to be adjusted to satisfy new requirements. These threads tend to drag on for quite a while.

I think this is a good argument for any regex tutorial to have sample text and show the expected results. It doesn't have to increase the length of the tutorial by 2-3x. Maybe just add a comment to the same line?
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: RegEx handy examples (RegExMatch, RegExReplace)

16 Feb 2017, 13:14

Hmm, classic insight.
Yes I've thought for a while that the antidote to manuals being difficult
to understand in a foreign language,
is lots of before/after examples.
Actually even if the manual is in your language.

Wow, one magical haystack, will think of what I can do.

It was exhausting producing this tutorial, although I needed it for my own use anyway,
yeah, the way I like to do things, is get something quality *finished*,
and then eventually have a rethink and come back to it,
when you can bear to re-explore the material, and have possibly come up against some new ideas.

[EDIT:]
Thanks for the link re. 'Change options on-the-fly', I had noticed that but not understood it at the time. I found it mentioned here under Options (PhiLho was quite good with this RegEx stuff):
Regular Expressions: a simple, easy tutorial
http://phi.lho.free.fr/programming/RETutorial.en.html

I've updated the queries section to reflect this.
guest3456
Posts: 3478
Joined: 09 Oct 2013, 10:31

Re: RegEx handy examples (RegExMatch, RegExReplace)

18 Feb 2017, 09:04

jeeswg wrote: People seem to be really allergic to non-indentation.
I think there should be a poll on this forum,
as to whether I should use indentation,
I've heard it a fair amount.
I don't mind when people don't use indentation,
and I use Notepad, so I get zero assistance when reading scripts.
Btw these comments on indentation are aimed at everyone,
not you specifically, just that I hadn't got round to posting them.

To be honest I see indentation as a silly coding fad,
that often makes code less readable.
lol. yet another example of multiple people telling you that you're wrong, and you continuing to be stubborn and hardheaded. are you ever going to come down from your high horse? you would probably get the same responses on stackoverflow, and then complain that "those arrogant stackoverflow people deleted my post because i didn't indent my code" or some other nonsense. no poll is necessary. this is one of the most basic style aspects of programming, regardless of the language. god forbid you ever use Python, where indentation is REQUIRED

but, if you like the fact that no one reads your posts or your code, then continue with what you're doing

if however you actually want input from others, you should probably present your code in ways that they want to read it.

User
Posts: 407
Joined: 26 Jun 2017, 08:12

Re: RegEx handy examples (RegExMatch, RegExReplace)

17 Oct 2017, 12:01

guest3456 wrote: but, if you like the fact that no one reads your posts or your code, then continue with what you're doing
if however you actually want input from others, you should probably present your code in ways that they want to read it.
I think you are jealous, poor @guest3456

Actually, @jeeswg is proving to be really helpful here in this forum, but unfortunately, I can't say the same of you!

Keep up the good and hard work @jeeswg, I myself have already learned a lot from you! Thanks!
nnnik
Posts: 4500
Joined: 30 Sep 2013, 01:01
Location: Germany

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

17 Oct 2017, 15:01

To be honest I mostly don't bother reading unindented code.
Recommends AHK Studio
DRocks
Posts: 565
Joined: 08 May 2018, 10:20

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

05 Oct 2018, 07:15

Thank you Jee

When I was newer to AHK it was hard for me to understand or follow anything at all here, but now that I am getting it more and more, these tutorials are very handy. They cover blank areas of the docs most of the time and it's great. Thanks a lot, man.
Rohwedder
Posts: 7904
Joined: 04 Jun 2014, 08:33
Location: Germany

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

21 Apr 2020, 09:25

Hello,
this is in the RegEx tutorial proper [updated: 2018-02-25]:

Code: Select all

;delete everything after the first n
vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % RegExReplace(vText, "^.*(?=n)")
This seems to be wrong.
Here it deletes everything before the first n.
hasantr
Posts: 933
Joined: 05 Apr 2016, 14:18
Location: İstanbul

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

29 Apr 2020, 06:55

These are amazing. Thanks.
Helgef
Posts: 4709
Joined: 17 Jul 2016, 01:02

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

29 Apr 2020, 11:47

Rohwedder wrote:
21 Apr 2020, 09:25
Hello,
this is in the RegEx tutorial proper [updated: 2018-02-25]:

Code: Select all

;delete everything after the first n
vText := "abcdefghijklmnopqrstuvwxyz"
MsgBox, % RegExReplace(vText, "^.*(?=n)")
This seems to be wrong.
Here it deletes everything before the first n.
"abcdefghijklmnopqrstuvwxyz"
It is a pretty poor haystack for testing "after the first n", as it contains only one n. You can do it like this:

Code: Select all

; delete everything after the first n
text := "anbcdefghijklmnopqrstuvnwxyz"
MsgBox % RegExReplace(text, "(^.*?n).*", "$1")
I recommend the regex quick reference:
Greed wrote: Greed: By default, *, ?, +, and {min,max} are greedy because they consume all characters up through the last possible one that still satisfies the entire pattern. To instead have them stop at the first possible character, follow them with a question mark.
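The quoted rule can be seen directly with RegExMatch. This is a minimal sketch for AutoHotkey v1; the haystack and the variable names vGreedy and vLazy are made up for illustration and do not come from the tutorial.

```autohotkey
; greedy vs lazy quantifiers on a haystack with several n's (illustrative values)
vText := "a1n2n3n4"
RegExMatch(vText, "^.*n", vGreedy) ; greedy: .* consumes up to the last n
RegExMatch(vText, "^.*?n", vLazy) ; lazy: .*? stops at the first n
MsgBox, % vGreedy "`r`n" vLazy ; shows a1n2n3n then a1n
```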
Cheers.
Last edited by Helgef on 30 Apr 2020, 00:31, edited 1 time in total.
Rohwedder
Posts: 7904
Joined: 04 Jun 2014, 08:33
Location: Germany

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

29 Apr 2020, 12:33

You've misunderstood me!
I was not looking for a solution, but wanted to point out a supposed mistake in your tutorial at the beginning of this article.
boiler
Posts: 17706
Joined: 21 Dec 2014, 02:44

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

29 Apr 2020, 19:32

Rohwedder wrote:
29 Apr 2020, 12:33
You've misunderstood me!
I was not looking for a solution, but wanted to point out a supposed mistake in your tutorial at the beginning of this article.
I agree with you that the mistake in the tutorial is that it says after the first n when it should say before. I would also point out that it’s not Helgef’s tutorial you are referring to, it’s jeeswg’s (although Helgef did reply to your post).
Helgef
Posts: 4709
Joined: 17 Jul 2016, 01:02

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

30 Apr 2020, 00:34

it should say before
@boiler, that still wouldn't be right, as it would remove everything before the last n.

Code: Select all

; delete everything before the first n
text := "abcdefghijklmnopqrstuvnnnnwxyz"
MsgBox % RegExReplace(text, "^.*?n", "n")
@Rohwedder, you are the one who has misunderstood. We do not post for your benefit only; anyone could benefit from any solutions provided, even if you are not looking for one.

I've fixed my previous post, it seems I mixed up before/after, it didn't make sense.

Cheers.
boiler
Posts: 17706
Joined: 21 Dec 2014, 02:44

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

30 Apr 2020, 00:50

Helgef wrote:
30 Apr 2020, 00:34
@boiler, that still wouldn't be right, as it would remove everything before the last n.
Good point. There was more wrong with it than just the word before/after, with errors in either the description or execution (or both) depending on the intent.
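To make the four possible intents concrete, here is a small sketch for AutoHotkey v1 that puts them side by side. The haystack is made up for illustration; the "after" patterns follow the capture-group form Helgef posted above, and the "before" patterns keep the lookahead used in the tutorial line, differing only in greediness.

```autohotkey
; before/after x first/last n, on a haystack with two n's (illustrative)
vText := "ab-n-cd-n-ef"
vOutput := ""
; delete everything before the first n (lazy .*? stops at the first n)
vOutput .= RegExReplace(vText, "^.*?(?=n)") "`r`n" ; n-cd-n-ef
; delete everything before the last n (greedy .* runs to the last n)
; this is the pattern from the tutorial line under discussion
vOutput .= RegExReplace(vText, "^.*(?=n)") "`r`n" ; n-ef
; delete everything after the first n
vOutput .= RegExReplace(vText, "(^.*?n).*", "$1") "`r`n" ; ab-n
; delete everything after the last n
vOutput .= RegExReplace(vText, "(^.*n).*", "$1") ; ab-n-cd-n
MsgBox, % vOutput
```

With a haystack that contains only one n, the first/last distinction disappears, which is why the alphabet string hides the problem.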
Vusami
Posts: 2
Joined: 12 Aug 2020, 01:40

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

20 Sep 2020, 13:50

Thank you for this post.
jsong55
Posts: 323
Joined: 30 Mar 2021, 22:02

Re: jeeswg's RegEx tutorial (RegExMatch, RegExReplace)

10 May 2021, 04:17

Update: I found a solution here on the forum by @jeeswg and tweaked it a little, though the RegEx portions are a little too complex for me and I don't fully understand them.

I created a function with it:

Code: Select all

	Format_OL_date(vDate,format) {
		; vDate can have the following formats - 
		/* 

		======== Variable Inputs ==========		
		vDateEg2 := "4/5/2006 12:00:00 AM" ;24-hour: 00
		vDateEg3 := "4/5/2006 06:00:00 AM" ;24-hour: 06
		vDateEg4 := "4/5/2006 12:00:00 PM" ;24-hour: 12
		vDateEg5 := "4/5/2006 06:00:00 PM" ;24-hour: 18
		vDateEg6 := "14/5/2006 6:00:00 PM"
		vDateEg7 := "4/12/2006 6:00:00 AM" 

		======== Function Usage ===========
		mail := ComObjActive("Outlook.Application").ActiveInspector.CurrentItem
		Msgbox, % this.Format_OL_date(mail.start,"ddMMyyyy") "`n" this.Format_OL_date(mail.start,"hh:mmtt") ; remove "this." if function NOT in class
		*/
		if !RegExMatch(vDate, "^\d+/\d+/\d+ \d+:\d+:\d+ [AP]M$")
		{
			MsgBox, % "error: nonstandard date:`r`n" vDate
			return
		}
		vDate2 := RegExReplace(vDate, "^(\d+)/(\d+)/(\d+) (\d+:\d+:\d+).*$", "$3:$2:$1:$4")
		oDate := StrSplit(vDate2, ":")
		if RegExMatch(vDate, "AM$") && (oDate.4 = 12)
			oDate.4 := 0
		else if RegExMatch(vDate, "PM$") && !(oDate.4 = 12)
			oDate.4 += 12
		vDate2 := Format("{:04}{:02}{:02}{:02}{:02}{:02}", oDate*)
		FormatTime, easy, % vDate2, % format
		; MsgBox, % "Example`n" vDate "`r`n" vDate2 "`nEasy to Read:" easy
		Return easy
	}
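For anyone else who finds the RegEx steps hard to follow, here is a trace of what the function does to one of the sample inputs, written as a standalone AutoHotkey v1 sketch outside any class. It assumes, as the replacement string "$3:$2:$1:$4" implies, that the input is in day/month/year order; vOut and the output format are chosen only for illustration.

```autohotkey
; standalone trace of the steps inside Format_OL_date (sketch, AHK v1)
vDate := "14/5/2006 6:00:00 PM"
; reorder d/M/yyyy h:mm:ss into yyyy:M:d:h:mm:ss so that one StrSplit yields all six fields
vDate2 := RegExReplace(vDate, "^(\d+)/(\d+)/(\d+) (\d+:\d+:\d+).*$", "$3:$2:$1:$4")
MsgBox, % vDate2 ; 2006:5:14:6:00:00
oDate := StrSplit(vDate2, ":")
; 12-hour to 24-hour: 12 AM becomes 0, 1 to 11 PM gain 12, everything else is unchanged
if RegExMatch(vDate, "AM$") && (oDate.4 = 12)
	oDate.4 := 0
else if RegExMatch(vDate, "PM$") && !(oDate.4 = 12)
	oDate.4 += 12
; zero-pad each field and join into YYYYMMDDHH24MISS
vDate2 := Format("{:04}{:02}{:02}{:02}{:02}{:02}", oDate*)
MsgBox, % vDate2 ; 20060514180000
FormatTime, vOut, % vDate2, yyyy-MM-dd HH:mm
MsgBox, % vOut ; 2006-05-14 18:00
```

If Outlook returns month/day/year on a given system, $2 and $1 in the replacement string would need to be swapped.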

How to get Outlook AppointmentItem.Start date into AHK's YYYYMMDDHH24MISS format to use FormatTime

A few possible outputs of the .Start date:

1. 5/10/2021 4:45:00 PM (or AM)
2. 15/10/2021 4:45:00 PM
3. 5/1/2021 12:45:00 PM

Different scenarios produce different numbers of digits.
I can't think of a good solution. Anyone?

Open an Outlook appointment and try the code below:

Code: Select all

oL := ComObjActive("Outlook.Application")
mail := oL.ActiveInspector.CurrentItem
MsgBox, % mail.start
Clipboard := mail.start
