Help with Grep issue Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
maitresin
Posts: 45
Joined: 20 Mar 2018, 19:33

Help with Grep issue

22 Oct 2022, 19:35

Hi,

What is wrong with my grep? Shouldn't the MsgBox show "ABC Group"?


This is a line in the download.htm file
<div class="geodir-post-meta-container bsui sdel-298db276" ><div class="geodir_post_meta text-left d-block geodir-field-banniere"><span class="geodir_post_meta_icon geodir-i-select" style=""><i class="fas fa-flag fa-fw" aria-hidden="true"></i> <span class="geodir_post_meta_title gv-secondary" >Banner: </span></span>ABC Group</div></div>

numpad2::
	FileRead, FileRead1, DOWNLOAD.HTM
	g := grep(FileRead1, "sU)Banner: </span></span>(.*)</div></div>")
	for i, v in g
	{
		result1 := v.1
		MsgBox, %result1%
	}
return


; Returns an array with one object per match: the full match and capture group 1.
grep(haystack, needle)
{
    a := [], match := "", pos := 1
    while pos := RegExMatch(haystack, needle, match, pos + StrLen(match))
        a[A_Index] := {"match": match, 1: match1}
    return a
}
boiler
Posts: 17206
Joined: 21 Dec 2014, 02:44

Re: Help with Grep issue

22 Oct 2022, 22:17

Are you sure your file is right? Run the following:

FileRead1 = <div class="geodir-post-meta-container bsui sdel-298db276" ><div class="geodir_post_meta text-left d-block geodir-field-banniere"><span class="geodir_post_meta_icon geodir-i-select" style=""><i class="fas fa-flag fa-fw" aria-hidden="true"></i> <span class="geodir_post_meta_title gv-secondary" >Banner: </span></span>ABC Group</div></div>
g:=grep(fileread1, "sU)Banner: </span></span>(.*)</div></div>")
for i, v in g
{
	result1 := v.1
	msgbox, %result1%
}
return


grep(haystack, needle)
{
    a:=[], match := "", pos := 1
    while pos:=RegExMatch(haystack, needle, match, pos+StrLen(match))
        a[A_Index]:= {"match": match, 1: match1}
	Return a
}
maitresin
Posts: 45
Joined: 20 Mar 2018, 19:33

Re: Help with Grep issue

23 Oct 2022, 17:35

Like this it works, but reading from the file does not.

I copied the line exactly as it appears in the file, and it works if I put it manually into the variable FileRead1, but when read from the file it refuses to work.

Could a linefeed or carriage return in the file be blocking the match?
boiler
Posts: 17206
Joined: 21 Dec 2014, 02:44

Re: Help with Grep issue  Topic is solved

23 Oct 2022, 17:51

Maybe instead of sU), try `aU) as the RegEx pattern’s options.

Are you sure there are no other characters within what you’ve shown? Can you attach a sample file that doesn’t work?
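One way to check for unexpected characters is to dump the character codes of the text that follows "Banner:" in the file. This is only a rough sketch: it assumes the line was saved as test.txt next to the script, and the 30-character window is an arbitrary choice.

; Diagnostic sketch (test.txt and the window size are assumptions).
FileRead, FileRead1, test.txt
pos := InStr(FileRead1, "Banner:")
if !pos
{
	MsgBox, "Banner:" was not found in the file.
	return
}
snippet := SubStr(FileRead1, pos, 30)
dump := ""
Loop, Parse, snippet
	dump .= A_LoopField . " = U+" . Format("{:04X}", Ord(A_LoopField)) . "`n"
MsgBox, %dump%
return

Any code you did not expect (for example 000D for a carriage return, or a pair of odd characters where a single accented letter should be) points to where the file differs from the pattern.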
maitresin
Posts: 45
Joined: 20 Mar 2018, 19:33

Re: Help with Grep issue

23 Oct 2022, 18:01

OK, I found it. Sorry, my mistake: there was a special character with an accent.

Regex does not recognize accent characters like "é,à,è..."
boiler
Posts: 17206
Joined: 21 Dec 2014, 02:44

Re: Help with Grep issue

23 Oct 2022, 18:29

maitresin wrote:
23 Oct 2022, 18:01
Regex does not recognize accent characters like "é,à,è..."

Yes, it does. Run the following:

FileRead1 = <div class="geodir-post-meta-container bsui sdel-298db276" ><div class="geodir_post_meta text-left d-block geodir-field-banniere"><span class="geodir_post_meta_icon geodir-i-select" style=""><i class="fas fa-flag fa-fw" aria-hidden="true"></i> <span class="geodir_post_meta_title gv-secondary" >Banner: </span></span>àé Group</div></div>
g:=grep(fileread1, "sU)Banner: </span></span>(.*)</div></div>")
for i, v in g
{
	result1 := v.1
	msgbox, %result1%
}
return


grep(haystack, needle)
{
    a:=[], match := "", pos := 1
    while pos:=RegExMatch(haystack, needle, match, pos+StrLen(match))
        a[A_Index]:= {"match": match, 1: match1}
	Return a
}
maitresin
Posts: 45
Joined: 20 Mar 2018, 19:33

Re: Help with Grep issue

23 Oct 2022, 18:59

It is FileRead that does not read the characters correctly when there is an accent.

I changed only this part of the line: </span></spanàé>ABC Group</div></div>


Test 1 works because the variable stores the input correctly:

FileRead1 = <div class="geodir-post-meta-container bsui sdel-298db276" ><div class="geodir_post_meta text-left d-block geodir-field-banniere"><span class="geodir_post_meta_icon geodir-i-select" style=""><i class="fas fa-flag fa-fw" aria-hidden="true"></i> <span class="geodir_post_meta_title gv-secondary" >Banner: </span></spanàé>ABC Group</div></div>
msgbox, %FileRead1%
g:=grep(fileread1, "sU)Banner: </span></spanàé>(.*)</div></div>")
for i, v in g
{
	result1 := v.1
	msgbox, %result1%
}
return


grep(haystack, needle)
{
    a:=[], match := "", pos := 1
    while pos:=RegExMatch(haystack, needle, match, pos+StrLen(match))
        a[A_Index]:= {"match": match, 1: match1}
	Return a
}

Test 2: save the line into test.txt. The MsgBox shows that the accented characters are wrong, and the code does not work:

Fileread, FileRead1, test.txt
msgbox, %FileRead1%
g:=grep(fileread1, "sU)Banner: </span></spanàé>(.*)</div></div>")
for i, v in g
{
	result1 := v.1
	msgbox, %result1%
}
return


grep(haystack, needle)
{
    a:=[], match := "", pos := 1
    while pos:=RegExMatch(haystack, needle, match, pos+StrLen(match))
        a[A_Index]:= {"match": match, 1: match1}
	Return a
}

boiler
Posts: 17206
Joined: 21 Dec 2014, 02:44

Re: Help with Grep issue

23 Oct 2022, 19:53

That’s likely because you haven’t saved the files (both the file containing those characters and your script file) with the right encoding: “UTF-8 with BOM”. It’s not because of how RegEx works. What does it show when you put the msgbox, %FileRead1% line after reading the text from your file? If the characters are wrong there, then it can’t be the RegEx’s problem since that hasn’t even occurred yet.
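If re-saving the data file is not convenient, a possible alternative is to tell FileRead which code page to use. The sketch below assumes test.txt is UTF-8 (with or without BOM) and uses the *P65001 option of FileRead; the script file itself still needs to be saved as UTF-8 with BOM if the pattern contains accented characters.

; Sketch only: assumes test.txt is encoded as UTF-8.
FileRead, FileRead1, *P65001 test.txt
if ErrorLevel
{
	MsgBox, Could not read test.txt
	return
}
MsgBox, %FileRead1%   ; the accented characters should now display correctly
if RegExMatch(FileRead1, "sU)Banner: </span></span>(.*)</div></div>", m)
	MsgBox, %m1%
return

Placing FileEncoding, UTF-8 near the top of the script should have a similar effect for files that lack a BOM.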
