VBScript.Regex and highlighting text Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
JustAnotherAHKUser
Posts: 13
Joined: 26 Mar 2015, 14:38

VBScript.Regex and highlighting text

05 Apr 2019, 09:49

Hi guys,

I'm trying to use vbscript.regex to highlight text in Word documents. It seems to work fine, though it doesn't highlight the text properly if there's a table in the file. Then the highlighting shifts.

Does anyone know how to fix that? Thanks for the help!

Code: Select all

needleArray := []
oWord := ComObjCreate("Word.Application")
regex := ComObjCreate("VBScript.RegExp")
regex.IgnoreCase := true

; Read one regex pattern per line from test.txt
loop, read, %A_ScriptDir%\test.txt
{
	needleArray.Push(A_LoopReadLine)
}

sourceFile := "test with table.docx"
filePath = %A_ScriptDir%\
sourceFullPath :=  filePath . sourceFile
SplitPath, sourceFullPath, OutFileName, OutDir, OutExtension, OutNameNoExt, OutDrive
oWord.Documents.Open(sourceFullPath)
oWord.Visible := 1
oWord.Activate
haystack := oWord.ActiveDocument.Range.Text

for index, needle in needleArray {
	regex.Pattern := ""
	regex.Pattern := needle

	regexMatch := regex.Execute(haystack)
	for item in regexMatch {
		oWord.ActiveDocument.Range(item.FirstIndex, item.FirstIndex + item.Length).HighlightColorIndex := 4
	}
}
oWord.Application.ActiveDocument.SaveAs(outDir . "\" . OutNameNoExt . "_xxx." . OutExtension)
oWord.Application.ActiveDocument.Close()
oWord.Application.Quit
sleep 100
oWord := ""
sleep 100
regex := ""
MsgBox,, Oh..., Done, done!
Exitapp

Regex patterns:
Aen(.*?)n nec l(.*?)m
Suspendisse dui purus, (.*?), nunc


Result w/o table is here:
https://imgur.com/a/4bgrLSz

Result w table:
https://imgur.com/a/01Upkwf
Klarion
Posts: 176
Joined: 26 Mar 2019, 10:02

Re: VBScript.Regex and highlighting text

05 Apr 2019, 12:05

Interesting, I have never seen this 'error'.
It looks like the shift adds up with each cell ahead of the match: the first highlight is off by 4 and the next by 5.
Each cell has Chr(7) at the end of it. I guess this is why, but I am not sure about it.

If you are really stuck and nobody can help you, how about trying Word's native search, i.e. the Range.Find method?
It works almost the same as common RegExp, though a little bit differently.

Good luck to you!
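
A minimal sketch of the native-search route mentioned above, assuming Word is already running with the document open. The pattern "t?st" is only a placeholder; it uses Word's wildcard syntax, where ? stands for any single character, rather than regex syntax. Because Word does the searching itself, no string offsets are involved, so tables cannot shift the result.

```autohotkey
; Sketch only: assumes Word is already running with the document open.
wdApp := ComObjActive("Word.Application")
wdApp.Options.DefaultHighlightColorIndex := 4 ; Bright Green

wdFind := wdApp.ActiveDocument.Content.Find
wdFind.ClearFormatting()
wdFind.Replacement.ClearFormatting()
wdFind.Replacement.Highlight := true
; Args by position: FindText, MatchWildcards (4th) = true,
; Wrap (8th) = 1 (wdFindContinue), Replace (11th) = 2 (wdReplaceAll).
wdFind.Execute("t?st",,, true,,,, 1,,, 2)
```

Word's wildcard syntax is more limited than VBScript.RegExp, so patterns such as the lazy (.*?) groups in the question would need to be rewritten before they could be used this way.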
sinkfaze
Posts: 616
Joined: 01 Oct 2013, 08:01

Re: VBScript.Regex and highlighting text

05 Apr 2019, 14:51

Tables have a lot of "junk" that you encounter using VBA that you don't see on your screen. You could iterate the table, pull each value out of each cell and evaluate/highlight that way, but as far as using Range itself there's not much that you can do to get around the issues.
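
A rough sketch of the per-cell approach described above, assuming Word is already running with the document open and that the text inside each cell maps one-to-one onto document positions (no fields, images or nested tables). The pattern is a placeholder. It only covers text inside tables; text outside them would still need to be handled separately.

```autohotkey
; Sketch only: assumes Word is running with the document open.
wdDoc := ComObjActive("Word.Application").ActiveDocument
regex := ComObjCreate("VBScript.RegExp")
regex.Pattern := "t.st" ; placeholder pattern
regex.Global := true
regex.IgnoreCase := true

for tbl in wdDoc.Tables
{
	for cell in tbl.Range.Cells
	{
		cellRange := cell.Range
		; Drop the trailing end-of-cell marks (Chr(13)/Chr(7)) before matching.
		cellText := RTrim(cellRange.Text, Chr(13) . Chr(7))
		; Offsets are counted from the start of this cell, so the hidden
		; characters of earlier cells cannot shift them.
		for item in regex.Execute(cellText)
		{
			first := cellRange.Start + item.FirstIndex
			wdDoc.Range(first, first + item.Length).HighlightColorIndex := 4
		}
	}
}
```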
FanaticGuru
Posts: 1906
Joined: 30 Sep 2013, 22:25

Re: VBScript.Regex and highlighting text  Topic is solved

05 Apr 2019, 15:13

JustAnotherAHKUser wrote:
05 Apr 2019, 09:49
I'm trying to use vbscript.regex to highlight text in Word documents. It seems to work fine, though it doesn't highlight the text properly if there's a table in the file. Then the highlighting shifts.

Like sinkfaze said. You cannot simply take all the text of a Word document, put it in a string, run RegEx on the string, and then expect the positions found in the string to match up with the positions in the actual Word document. There is a lot of formatting that Word sees that is lost in a plain string. Pretty much any structural formatting of the kind you would express with HTML tags will mess it up, and tables are one example of that.

One way this could be done is to use the RegEx to get the matches in the string, then use the actual text of the matches to use Find in Word to modify those exact text strings in the document.

Here is an example.

Code: Select all

wdApp := ComObjActive("Word.Application")
wdRegEx := ComObjCreate("VBScript.RegExp")

wdRegEx.Pattern := "t.st"
wdRegEx.Global := true
wdRegEx_Matches := wdRegEx.Execute(wdApp.ActiveDocument.Range.Text)

wdApp.Options.DefaultHighlightColorIndex := 4 ; Bright Green
wdFind := wdApp.ActiveDocument.Content.Find
for wdRegEx_Match in wdRegEx_Matches
{
	wdFind.ClearFormatting
	wdFind.Replacement.ClearFormatting
	wdFind.Replacement.Highlight := true
	; Args by position: FindText, Wrap (8th) = 1 (wdFindContinue), Replace (11th) = 2 (wdReplaceAll)
	wdFind.Execute(wdRegEx_Match.Value,,,,,,,1,,,2)
}

This will find t.st in the Word document and highlight it. So it will find things like "test", "tast", "tost", etc.

This code could be made more efficient by weeding out duplicate matches from the RegEx, as ReplaceAll will get all occurrences of a given string in one go.
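
A minimal sketch of that de-duplication, under the same assumptions as the example above (Word already running with a document open). The seen object and the "k:" key prefix are hypothetical helpers, not part of the original code; the prefix only keeps match text from colliding with built-in object method names or being treated as a numeric key.

```autohotkey
; Sketch only: same setup as the example above.
wdApp := ComObjActive("Word.Application")
wdRegEx := ComObjCreate("VBScript.RegExp")
wdRegEx.Pattern := "t.st"
wdRegEx.Global := true
wdRegEx_Matches := wdRegEx.Execute(wdApp.ActiveDocument.Range.Text)

wdApp.Options.DefaultHighlightColorIndex := 4 ; Bright Green
wdFind := wdApp.ActiveDocument.Content.Find
seen := {} ; hypothetical helper: match texts already handled
for wdRegEx_Match in wdRegEx_Matches
{
	key := "k:" . wdRegEx_Match.Value
	if (seen.HasKey(key))
		continue ; ReplaceAll already highlighted every copy of this text
	seen[key] := true
	wdFind.ClearFormatting()
	wdFind.Replacement.ClearFormatting()
	wdFind.Replacement.Highlight := true
	wdFind.Execute(wdRegEx_Match.Value,,,,,,,1,,,2)
}
```

String keys in AutoHotkey v1 objects are case-insensitive, so "Test" and "test" count as one entry here. That lines up with the Find call, which does not request MatchCase.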

FG
JustAnotherAHKUser
Posts: 13
Joined: 26 Mar 2015, 14:38

Re: VBScript.Regex and highlighting text

10 Apr 2019, 09:15

Hi guys,

Thank you very much for your comments and suggestions - really appreciate it!
FanaticGuru wrote:
One way this could be done is to use the RegEx to get the matches in the string, then use the actual text of the matches to use Find in Word to modify those exact text strings in the document.

@FanaticGuru - this is an excellent idea! I have been using Find.Execute for normal text searches, so it is all clear how to use it. Thanks again!
