RegExReplace() [v1.0.45+]

Replaces occurrences of a pattern (regular expression) inside a string.

NewStr := RegExReplace(Haystack, NeedleRegEx, Replacement := "", OutputVarCount := "", Limit := -1, StartingPos := 1)

Parameters

Haystack

The string whose content is searched and replaced.

NeedleRegEx

The pattern to search for, which is a Perl-compatible regular expression (PCRE). The pattern's options (if any) must be included at the beginning of the string followed by a close-parenthesis. For example, the pattern i)abc.*123 would turn on the case-insensitive option and search for "abc", followed by zero or more occurrences of any character, followed by "123". If there are no options, the ")" is optional; for example, )abc is equivalent to abc.
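As a minimal sketch of the option syntax described above (the variable names are illustrative only), the following compares the same pattern with and without the i) prefix, and shows that a leading ")" with no options changes nothing:

```autohotkey
Haystack := "ABC-x-123"

; With i), "abc" matches "ABC", so the whole string is replaced.
WithOption := RegExReplace(Haystack, "i)abc.*123", "found")

; Without i), matching is case-sensitive, so nothing is replaced.
WithoutOption := RegExReplace(Haystack, "abc.*123", "found")

; A leading ")" with no options before it is equivalent to no prefix at all.
EmptyOptions := RegExReplace("abc", ")abc", "found")

MsgBox % WithOption "`n" WithoutOption "`n" EmptyOptions
```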

Replacement

If blank or omitted, each match of NeedleRegEx is replaced with an empty string, meaning the matched text is omitted from the return value. Otherwise, specify the string to be substituted for each match, which is plain text (not a regular expression).

This parameter may include backreferences like $1, which brings in the substring from Haystack that matched the first subpattern. The simplest backreferences are $0 through $9, where $0 is the substring that matched the entire pattern, $1 is the substring that matched the first subpattern, $2 is the second, and so on. For backreferences greater than 9 (and optionally those less than or equal to 9), enclose the number in braces; e.g. ${10}, ${11}, and so on. For named subpatterns, enclose the name in braces; e.g. ${SubpatternName}. To specify a literal $, use $$ (this is the only character that needs such special treatment; backslashes are never needed to escape anything).
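The backreference forms above can be sketched as follows. The sample strings, subpattern names (First, Last), and variable names are assumptions made for illustration:

```autohotkey
; Numbered backreferences: swap the two words.
Swapped := RegExReplace("John Smith", "(\w+) (\w+)", "$2, $1")

; Named subpatterns are referenced by name inside braces.
Named := RegExReplace("John Smith", "(?P<First>\w+) (?P<Last>\w+)", "${Last}, ${First}")

; Braces separate a backreference from a digit that follows it.
Braced := RegExReplace("ab", "(a)(b)", "${1}0$2")

; $$ produces a literal dollar sign; the $1 after it is a normal backreference.
Price := RegExReplace("cost: 15", "(\d+)", "$$$1")

MsgBox % Swapped "`n" Named "`n" Braced "`n" Price
```

The four results should be "Smith, John", "Smith, John", "a0b", and "cost: $15".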

To convert the case of a subpattern, follow the $ with one of the following characters: U or u (uppercase), L or l (lowercase), T or t (title case, in which the first letter of each word is capitalized but all others are made lowercase). For example, both $U1 and $U{1} transcribe an uppercase version of the first subpattern.
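A short sketch of the case-conversion prefixes, using made-up input text and variable names:

```autohotkey
; $U1 uppercases the first subpattern; $T2 title-cases the second.
Mixed := RegExReplace("hello world", "(\w+) (\w+)", "$U1 $T2")

; The braced form works the same way; here $L{1} lowercases the subpattern.
Lower := RegExReplace("HELLO", "(.+)", "$L{1}")

MsgBox % Mixed "`n" Lower
```

This should display "HELLO World" and "hello".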

Nonexistent backreferences and those that did not match anything in Haystack -- such as one of the subpatterns in (abc)|(xyz) -- are transcribed as empty strings.
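To illustrate, the sketch below uses the alternation from the paragraph above; the brackets in the replacement text are only there to make the empty substitutions visible:

```autohotkey
; Only the second alternative matches, so $1 is empty and $2 is "xyz".
Unmatched := RegExReplace("xyz", "(abc)|(xyz)", "[$1][$2]")

; The pattern has no third subpattern, so $3 is transcribed as an empty string.
Missing := RegExReplace("xyz", "(abc)|(xyz)", "<$3>")

MsgBox % Unmatched "`n" Missing
```

This should display "[][xyz]" and "<>".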

OutputVarCount

If omitted, the number of replacements is not stored. Otherwise, specify an output variable in which to store the number of replacements that occurred (0 if none).

Limit

If omitted, it defaults to -1, which replaces all occurrences of the pattern found in Haystack. Otherwise, specify the maximum number of replacements to allow. The part of Haystack to the right of the last replacement is left unchanged.
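For instance, the following sketch (with an arbitrary sample string and an illustrative output variable named Count) allows at most two replacements, so the third digit is left as-is:

```autohotkey
; Replace digits with "#", but stop after two replacements.
Limited := RegExReplace("a1b2c3", "\d", "#", Count, 2)

MsgBox % Limited " (" Count " replacements)"
```

Limited should be "a#b#c3" and Count should be 2.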

StartingPos

If omitted, it defaults to 1 (the beginning of Haystack). Otherwise, specify 2 to start at the second character, 3 to start at the third, and so on. If StartingPos is beyond the length of Haystack, the search starts at the empty string that lies at the end of Haystack (which typically results in no replacements).

If StartingPos is less than 1, it is considered to be an offset from the end of Haystack. For example, 0 starts at the last character and -1 starts at the next-to-last character. If StartingPos tries to go beyond the left end of Haystack, all of Haystack is searched.

Regardless of the value of StartingPos, the return value is always a complete copy of Haystack -- the only difference is that more of its left side might be unaltered compared to what would have happened with a StartingPos of 1.
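The StartingPos rules above can be sketched with a small sample string. The variable names are illustrative, and the negative-offset cases assume the v1 behavior described in this section:

```autohotkey
; Start at the fifth character: the first three a's are left unaltered.
FromFifth := RegExReplace("aaa-aaa", "a", "X", Count, -1, 5)

; 0 starts at the last character.
FromLast := RegExReplace("aaa-aaa", "a", "X", Count, -1, 0)

; -1 starts at the next-to-last character.
FromNextToLast := RegExReplace("aaa-aaa", "a", "X", Count, -1, -1)

MsgBox % FromFifth "`n" FromLast "`n" FromNextToLast
```

In each case the return value is still a complete copy of Haystack: "aaa-XXX", "aaa-aaX", and "aaa-aXX" respectively.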

Return Value

This function returns a version of Haystack whose contents have been replaced by the operation. If no replacements are needed, Haystack is returned unaltered. If an error occurs (such as a syntax error inside NeedleRegEx), Haystack is returned unaltered (except in versions prior to 1.0.46.06, which return "") and ErrorLevel is set to one of the values below instead of 0.

Error Handling

[v1.1.04+]: This function is able to throw an exception on failure (not to be confused with "no match found"). For more information, see Runtime Errors.

ErrorLevel is set to 0 if no error occurred. On failure, it is set to a nonzero value that describes the error.
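One way to check for failure is to test ErrorLevel after the call. The sketch below deliberately passes an invalid pattern (an unbalanced parenthesis); the exact ErrorLevel text is not assumed, only that it differs from 0:

```autohotkey
; "(" is not a valid pattern because the parenthesis is never closed.
Result := RegExReplace("abc", "(", "x")
Failed := (ErrorLevel != 0)

if Failed
    MsgBox % "RegEx error: " ErrorLevel "`nReturned: " Result
```

In v1.0.46.06 and later, Result should still be "abc" because Haystack is returned unaltered on error.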

Options

See RegEx Quick Reference for options such as i)abc, which turns off case-sensitivity.

Performance

To replace simple substrings, use StrReplace() or StringReplace because they are faster than RegExReplace().

If you know what the maximum number of replacements will be, specifying that for the Limit parameter improves performance because the search can be stopped early (this might also reduce the memory load on the system during the operation). For example, if you know there can be only one match near the beginning of a large string, specify a limit of 1.

To improve performance, the 100 most recently used regular expressions are kept cached in memory (in compiled form).

The study option (S) can sometimes improve the performance of a regular expression that is used many times (such as in a loop).

Remarks

Most characters like abc123 can be used literally inside a regular expression. However, any of the characters in the set \.*?+[{|()^$ must be preceded by a backslash to be seen as literal. For example, \. is a literal period and \\ is a literal backslash. Escaping can be avoided by using \Q...\E. For example: \QLiteral Text\E.
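As a brief sketch of the escaping rules (the sample strings and variable names are illustrative only), compare an unescaped period with the two ways of making it literal:

```autohotkey
; Unescaped, the period matches any character, so both "a.b" and "axb" match.
AnyChar := RegExReplace("a.b|axb", "a.b", "X")

; Escaped with a backslash, only the literal period matches.
Escaped := RegExReplace("a.b|axb", "a\.b", "X")

; \Q...\E treats everything between the markers as literal text.
Quoted := RegExReplace("a.b|axb", "\Qa.b\E", "X")

MsgBox % AnyChar "`n" Escaped "`n" Quoted
```

This should display "X|X", "X|axb", and "X|axb".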

Within a regular expression, special characters such as tab and newline can be escaped with either an accent (`) or a backslash (\). For example, `t is the same as \t except when the x option is used.
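A small sketch of the two escape styles, assuming the x option is not in use. With `t the script inserts a real tab character into the pattern, whereas \t is interpreted by the regular expression engine:

```autohotkey
Haystack := "a" . A_Tab . "b"

; The accent escape is resolved by AutoHotkey before the pattern is compiled.
ViaAccent := RegExReplace(Haystack, "`t", " ")

; The backslash escape is resolved by the regular expression engine.
ViaBackslash := RegExReplace(Haystack, "\t", " ")

MsgBox % ViaAccent "`n" ViaBackslash
```

Both calls should return "a b".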

To learn the basics of regular expressions (or refresh your memory of pattern syntax), see the RegEx Quick Reference.

Related

RegExMatch(), RegEx Quick Reference, Regular Expression Callouts, StrReplace(), InStr(), StringReplace

Common sources of text data: FileRead, UrlDownloadToFile, Clipboard, GUI Edit controls

Examples

For general RegEx examples, see the RegEx Quick Reference.

Reports "abc123xyz" because the $ allows a match only at the end.

MsgBox % RegExReplace("abc123123", "123$", "xyz")

Reports "123" because a match was achieved via the case-insensitive option.

MsgBox % RegExReplace("abc123", "i)^ABC")

Reports "aaaXYZzzz" by means of the $1 backreference.

MsgBox % RegExReplace("abcXYZ123", "abc(.*)123", "aaa$1zzz")

Reports an empty string and stores 2 in ReplacementCount.

MsgBox % RegExReplace("abc123abc456", "abc\d+", "", ReplacementCount)