ahcahc,
(1) Great! I decided not to use « and » but went with µ for OR (micro sign-181) and ¡ for NOT (inverted exclamation mark-161). It's working fine.
(2 & 3) Thanks for the case insensitive and word boundary code - and for the heads-up about word boundary working only if the search text is alnum.
(4) Interesting idea! Hadn't thought of that angle. I'll have to decide which approach I prefer. My first thought was that I like the idea of allowing mixed case on the Boolean operators (users will probably like being able to enter all lower case), but the idea of being able to search for them is nice, too.
Will continue testing. Thanks again, Joe
Boolean Search as the needle searching a haystack
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
ahcahc,
I just noticed this commented out line:
;Search3 :=RegExReplace(Search3,"(?<=\d)\s+?(?=\d)","*") ;In case regexmatch failed to match the entire sentence, search for space sorrounded by numbers and replace it with *
Should I be doing anything with this line? Should it be un-commented? Thanks, Joe
I just noticed this commented out line:
;Search3 :=RegExReplace(Search3,"(?<=\d)\s+?(?=\d)","*") ;In case regexmatch failed to match the entire sentence, search for space sorrounded by numbers and replace it with *
Should I be doing anything with this line? Should it be un-commented? Thanks, Joe
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
guest3456,
Thanks for the suggestion. Regards, Joe
Thanks for the suggestion. Regards, Joe
Re: Boolean Search as the needle searching a haystack
Previously the final expression would have a space between two numbers because regex only partially matched the sentence and the resulting expression will be like (1 * 1 0), eval (1 * 1 0) would result to 10 but it should be zero so (1 * 1*0). If you get and expression like that, you can use that code fix to it. I think this wont likely happen now.JoeWinograd wrote:ahcahc,
I just noticed this commented out line:
;Search3 :=RegExReplace(Search3,"(?<=\d)\s+?(?=\d)","*") ;In case regexmatch failed to match the entire sentence, search for space sorrounded by numbers and replace it with *
Should I be doing anything with this line? Should it be un-commented? Thanks, Joe
I've made it easier to replace the operators with different special characters and made the match-whole-word also include non-alphanumeric characters(if this fails you can always revert back to \b).
Code: Select all
string=
(
@utoHo:tkey? is a
better
language``s's
than
VBScr|pt
)
search = (@utoHo:tkey? OR Auto Hotkey OR Auto-Hotkey) AND language``s's AND NOT (Microsoft OR VBScr|pt)
op_and=¤ ;AND
op_or=µ ;OR
op_not=¡ ;NOT
case_sense=1 ;case insensitive
whole_word=1 ;match whole word/sentence
option_start1=
option_start2=
option_end1=
option_end2=
if case_sense=1
{
option_start1 .= "i)"
option_start2 .= "i)"
}
if whole_word=1
{
;option_start1 .= "\b" ;alphanumeric only
;option_end1 .= "\b" ;alphanumeric only
;option_start2 .= "\b" ;alphanumeric only
;option_end2 .= "\b" ;alphanumeric only
option_start1 .= "(?:(?<=[\s" op_and op_or op_not "()])|^)" ;alphanumeric + special characters
option_end1 .= "(?:(?=[\s" op_and op_or op_not "()])|$)" ;alphanumeric + special characters
option_start2 .= "(?:(?<=\s)|^)" ;alphanumeric + special characters
option_end2 .= "(?:(?=\s)|$)" ;alphanumeric + special characters
}
search1 := RegExReplace(RegExReplace(RegExReplace(search,"i)\bAND\b",op_and),"i)\bOR\b",op_or),"i)\bNOT\b",op_not)
while pos := RegExMatch(search1,"((?:(?=\S)[^\s" op_and op_or op_not "()]+\s*)+)",m,A_Index=1?1:pos+StrLen(m))
{
if a_index=1
search2 := RegExReplace(search1,option_start1 "\Q" trim(m) "\E" option_end1,RegExMatch(string,option_start2 "\Q" trim(m) "\E" option_end2)>0?1:0)
else
search2 := RegExReplace(search2,option_start1 "\Q" trim(m) "\E" option_end1,RegExMatch(string,option_start2 "\Q" trim(m) "\E" option_end2)>0?1:0)
}
search3 := RegExReplace(search2,op_and,"*") ;AND
search3 := RegExReplace(search3,op_or,"+") ;OR
search3 := RegExReplace(search3,op_not,"!") ;NOT
;Search3 :=RegExReplace(Search3,"(?<=\d)\s+?(?=\d)","*") ;In case regexmatch failed to match the entire sentence, search for space sorrounded by numbers and replace it with *
MsgBox % "String: " string "`nExpression0: " search "`nExpression1: " search1 "`nExpression2: " search2 "`nExpression3: " search3 "`nResult: " eval(search3)
if (eval(search3)>0)
MsgBox do something here
Last edited by ahcahc on 09 Sep 2014, 08:16, edited 1 time in total.
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
Hi ahcahc,
Thank you for the new code - everything I've tested has worked perfectly so far! It's great to have code that makes it so easy to alter the operators with different special characters and to control the case-sense and whole-word options. The ability to include non-alphanumeric characters in a search is a nice bonus, especially valuable for the hyphen, which I'm sure users will need in search terms.
Question: Based on experimentation, it seems that the operator precedence is NOT, AND, OR, but I wanted to confirm that with you. I realize it depends on Eval, which is being fed:
! for NOT
* for AND
+ for OR
At first I thought it was being evaluated left-to-right, but this test proved otherwise:
string = AutoHotkey is a better language than VBScript
search = AutoHotkey OR VBScript AND Microsoft
Left-to-right on that would result in 0, while AND before OR results in 1, which is the actual result. I'm going to recommend that users take advantage of parentheses to control precedence, but I still need to explain what the default precedence is. Thanks, Joe
Thank you for the new code - everything I've tested has worked perfectly so far! It's great to have code that makes it so easy to alter the operators with different special characters and to control the case-sense and whole-word options. The ability to include non-alphanumeric characters in a search is a nice bonus, especially valuable for the hyphen, which I'm sure users will need in search terms.
Question: Based on experimentation, it seems that the operator precedence is NOT, AND, OR, but I wanted to confirm that with you. I realize it depends on Eval, which is being fed:
! for NOT
* for AND
+ for OR
At first I thought it was being evaluated left-to-right, but this test proved otherwise:
string = AutoHotkey is a better language than VBScript
search = AutoHotkey OR VBScript AND Microsoft
Left-to-right on that would result in 0, while AND before OR results in 1, which is the actual result. I'm going to recommend that users take advantage of parentheses to control precedence, but I still need to explain what the default precedence is. Thanks, Joe
Re: Boolean Search as the needle searching a haystack
eval() treats the final expression just like any mathematical expression. The final expression should only contain 1, 0, +, *, ! or ().
search = 1 OR 1 AND 0
search = 1 + 1 * 0
search = 1 + 0
search = 1
Code: Select all
string = AutoHotkey is a better language than VBScript
search = AutoHotkey OR VBScript AND Microsoft
search = 1 + 1 * 0
search = 1 + 0
search = 1
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
The thing is, this evaluates to 0 left-to-right, but to 1 if the * operator is evaluated before the + operator.search = 1 + 1 * 0
Re: Boolean Search as the needle searching a haystack
Code: Select all
algebra:
x=1
y=1
z=0
a= x+yz ;1 OR 1 AND 0
a= 1+1*0
a= 1+0
a=1
Code: Select all
while pos := RegExMatch(search1,"((?:(?=\S)[^\s" op_and op_or op_not "()]+\s*)+)",m,A_Index=1?1:pos+StrLen(m))
search2 := RegExReplace(A_Index=1?search1:search2,option_start1 "\Q" trim(m) "\E" option_end1,RegExMatch(string,option_start2 "\Q" trim(m) "\E" option_end2)>0?1:0)
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
Yes, it's true that standard algebraic precedence is multiplication before addition, but in this case, it depends on how Eval was programmed, and I didn't study it well enough to figure that out. Also, there are different ways that unary operators (such as !) are handled, so that also depends on how Eval was programmed. For example, the calculator built into Windows does left-to-right precedence in Standard mode, so entering 1+1*0 results in 0, not 1; however, in Scientific mode, it does standard math precedence, so entering 1+1*0 results in 1. But it's good to know that Eval was programmed to perform multiplication before addition - thanks for explaining that (and it handles the unary operator ! first - also good).
Regarding the new, simplified code, so far it has worked perfectly! Thank you for your ongoing efforts to improve the solution. Regards, Joe
Regarding the new, simplified code, so far it has worked perfectly! Thank you for your ongoing efforts to improve the solution. Regards, Joe
Re: Boolean Search as the needle searching a haystack
It seems I missed this part in the eval() ; Logic operators: !, ||, &&; ternary operator: (_?_:_);
You can replace those search3 with:
You can replace those search3 with:
Code: Select all
search3 := RegExReplace(search2,op_and,"&&") ;AND
search3 := RegExReplace(search3,op_or,"||") ;OR
search3 := RegExReplace(search3,op_not,"!") ;NOT
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
Very nice! That eliminates results greater than 1. For example:
string = AutoHotkey is a better language than VBScript
search = AutoHotkey or language or VBScript
The Eval result with the previous code was 3. With the new code, it is 1. Excellent! Regards, Joe
string = AutoHotkey is a better language than VBScript
search = AutoHotkey or language or VBScript
The Eval result with the previous code was 3. With the new code, it is 1. Excellent! Regards, Joe
- FanaticGuru
- Posts: 1906
- Joined: 30 Sep 2013, 22:25
Re: Boolean Search as the needle searching a haystack
I know this has pretty much been solved to the satisfaction of the original post but it caught my eye and looked like an interesting parsing exercise.
Here is my more concise approach with everything in one function without the dependency of a library function which seems overkill.
It should accept any characters other than () AND OR NOT ¥ as those have special meaning. I toyed with allowing wildcards and such. And also with allowing searches that include () AND OR NOT as words but then you have to force the user to put things in quotes or some other escaping to differentiate between things to actually search for and conditional commands. Then I figured it is probably more important to keep things simple for the user.
I have tested this practically none but it seems to work ok. If you want to take this approach and find a Needle that it does not handle properly let me know.
EDIT: Added a way for power users to have some control without burdening the normal user. Any search item that begin and ends with %% are treated as RegEx expressions.
FG
Here is my more concise approach with everything in one function without the dependency of a library function which seems overkill.
Code: Select all
Haystack := "Hello world, this is a beta test" ; TRUE
Needle := "(hello AND world) AND (is a beta AND (test AND (NOT production)))"
;~ Haystack := "Hello world, this is a beta test" ; TRUE
;~ Needle := "(hello AND world) OR (is a alpha AND (test AND (NOT production)))"
;~ Haystack := "Hello world, this i% a beta test" ; TRUE
;~ Haystack := "Hello worlds, this i% a beta test" ; TRUE
;~ Haystack := "Helloworlds, this i% a beta test" ; FALSE
;~ Haystack := "Hello world , this i% a beta test" ; FALSE
;~ Haystack := "Hello world, this i% a beta test end" ; FALSE
;~ Haystack := " World, Hello, this i% a beta test" ; FALSE
;~ Needle := "(%%hello(?=.*world)%% AND %% worlds?,%%) AND (i% a beta AND (%%test$%% AND (NOT production)))"
;~ Haystack := "(Hello) worlds, this i% a beta test" ; TRUE
;~ Haystack := "Hello worlds, this i% a beta test" ; FALSE
;~ Needle := "(%%\(hello\)%% AND %% worlds?,%%) AND (i% a beta AND (%%test$%% AND (NOT production)))"
if SearchBoolean(Haystack, Needle, false)
MsgBox TRUE
else
MsgBox FALSE
SearchBoolean(H, N, C:=false) ; Haystack, Needle, Case-sensitive
{
NT := RegExReplace(N, "AND|OR|NOT", "¥")
Loop, Parse, NT, ¥
if T := Trim(A_LoopField, " ()")
if RegExMatch(T,"^%%(.*)%%$", M)
N := RegExReplace(N, (C ? "" : "i)")"\Q" T "\E", RegExMatch(H, (C ? "" : "i)") M1) ? 1 : 0,, 1)
else
N := RegExReplace(N, (C ? "" : "i)")"\Q" T "\E", RegExMatch(H, (C ? "" : "i)")"\Q" T "\E") ? 1 : 0,, 1)
Loop
{
NN := N, NNP := 1
Loop
if (P := RegExMatch(NN, "U)\(\s*([^\(]*)\s*\)", M))
NN := M1, NNP := P
else
break
NNN := NN, X := 1
Loop
{
if RegExMatch(NNN, "(NOT)\s*(\d)", M)
NNN := RegExReplace(NNN, "NOT\s*\d", X := ! M2,, 1)
if !RegExMatch(NNN, "(\d)\s*(AND|OR)\s*(\d)", M)
break
else if (M2 = "AND")
X := M1 & M3 & X
else if (M2 = "OR")
X := M1 | M3 & X
NNN := RegExReplace(NNN, "\d\s*(AND|OR)\s*\d", X,, 1)
}
N := RegExReplace(N, "\(?\s*" NN "\s*\)?", X,, 1, NNP)
if (N = 1 or N = 0)
return N
}
}
I have tested this practically none but it seems to work ok. If you want to take this approach and find a Needle that it does not handle properly let me know.
EDIT: Added a way for power users to have some control without burdening the normal user. Any search item that begin and ends with %% are treated as RegEx expressions.
FG
Hotkey Help - Help Dialog for Currently Running AHK Scripts
AHK Startup - Consolidate Multiply AHK Scripts with one Tray Icon
Hotstring Manager - Create and Manage Hotstrings
[Class] WinHook - Create Window Shell Hooks and Window Event Hooks
AHK Startup - Consolidate Multiply AHK Scripts with one Tray Icon
Hotstring Manager - Create and Manage Hotstrings
[Class] WinHook - Create Window Shell Hooks and Window Event Hooks
Re: Boolean Search as the needle searching a haystack
FanaticGuru wrote:I know this has pretty much been solved to the satisfaction of the original post but it caught my eye and looked like an interesting parsing exercise.
Here is my more concise approach with everything in one function without the dependency of a library function which seems overkill.
It should accept any characters other than () AND OR NOT ¥ as those have special meaning. I toyed with allowing wildcards and such. And also with allowing searches that include () AND OR NOT as words but then you have to force the user to put things in quotes or some other escaping to differentiate between things to actually search for and conditional commands. Then I figured it is probably more important to keep things simple for the user.Code: Select all
Haystack := "Hello world, this is a beta test" ; TRUE Needle := "(hello AND world) AND (is a beta AND (test AND (NOT production)))" ;~ Haystack := "Hello world, this is a beta test" ; TRUE ;~ Needle := "(hello AND world) OR (is a alpha AND (test AND (NOT production)))" ;~ Haystack := "Hello world, this i% a beta test" ; TRUE ;~ Haystack := "Hello worlds, this i% a beta test" ; TRUE ;~ Haystack := "Helloworlds, this i% a beta test" ; FALSE ;~ Haystack := "Hello world , this i% a beta test" ; FALSE ;~ Haystack := "Hello world, this i% a beta test end" ; FALSE ;~ Haystack := " World, Hello, this i% a beta test" ; FALSE ;~ Needle := "(%%hello(?=.*world)%% AND %% worlds?,%%) AND (i% a beta AND (%%test$%% AND (NOT production)))" ;~ Haystack := "(Hello) worlds, this i% a beta test" ; TRUE ;~ Haystack := "Hello worlds, this i% a beta test" ; FALSE ;~ Needle := "(%%\(hello\)%% AND %% worlds?,%%) AND (i% a beta AND (%%test$%% AND (NOT production)))" if SearchBoolean(Haystack, Needle, false) MsgBox TRUE else MsgBox FALSE SearchBoolean(H, N, C:=false) ; Haystack, Needle, Case-sensitive { NT := RegExReplace(N, "AND|OR|NOT", "¥") Loop, Parse, NT, ¥ if T := Trim(A_LoopField, " ()") if RegExMatch(T,"^%%(.*)%%$", M) N := RegExReplace(N, (C ? "" : "i)")"\Q" T "\E", RegExMatch(H, (C ? "" : "i)") M1) ? 1 : 0,, 1) else N := RegExReplace(N, (C ? "" : "i)")"\Q" T "\E", RegExMatch(H, (C ? "" : "i)")"\Q" T "\E") ? 1 : 0,, 1) Loop { NN := N, NNP := 1 Loop if (P := RegExMatch(NN, "U)\(\s*([^\(]*)\s*\)", M)) NN := M1, NNP := P else break NNN := NN, X := 1 Loop { if RegExMatch(NNN, "(NOT)\s*(\d)", M) NNN := RegExReplace(NNN, "NOT\s*\d", X := ! M2,, 1) if !RegExMatch(NNN, "(\d)\s*(AND|OR)\s*(\d)", M) break else if (M2 = "AND") X := M1 & M3 & X else if (M2 = "OR") X := M1 | M3 & X NNN := RegExReplace(NNN, "\d\s*(AND|OR)\s*\d", X,, 1) } N := RegExReplace(N, "\(?\s*" NN "\s*\)?", X,, 1, NNP) if (N = 1 or N = 0) return N } }
I have tested this practically none but it seems to work ok. If you want to take this approach and find a Needle that it does not handle properly let me know.
EDIT: Added a way for power users to have some control without burdening the normal user. Any search item that begin and ends with %% are treated as RegEx expressions.
FG
Code: Select all
Haystack:="hello word Microsoft"
Needle := "NOT (Microsoft word)"
- LinearSpoon
- Posts: 156
- Joined: 29 Sep 2013, 22:55
Re: Boolean Search as the needle searching a haystack
I had wrote this a few days ago, but I didn't post it because it seemed you had a solution. For those versed in CS, this is a recursive descent parser for the grammar:
I'm leaving for classes in a few minutes, so it's not as well tested as I would like. I'll be back in a few hours if you have questions.
Code: Select all
O -> A (OR A)* | A
A -> N (AND N)* | N
N -> NOT* T | T
T -> {text} | (O)
Code: Select all
searchText := "AutoHotkey is a better language than VBScript"
testcases =
(
(((((not hello)))))
(AutoHotkey or xyz) and not zzz
VBScript and AutoHotkey
x OR VBScript AND x
(x OR VBScript) AND VBScript
AutoHotkey is a better language AND VBscript
NOT a big long string of text AND VBscript
VBScript and not (language and xxx)
VBscript and AutoHotkey and VBscript and AutoHotkey and VBscript
Not Not not x
AutoHotkey and (VBscript and AutoHotkey and VBscript) and AutoHotkey
VBScript OR x OR language OR is OR z
)
Loop, Parse, testcases, `n
output .= A_LoopField " = " parse(A_LoopField, searchText) "`n"
Msgbox % searchText "`n" output
ExitApp
parse(search, byref haystack)
{
if (!search)
return 1
x := new symbols
search := RegexReplace(search, "i)(^|[\(\s])\K(and|or|not)(?=\s)", "|$2|")
search := RegexReplace(search, "(\)|\()", "|$1|")
Loop, Parse, search, |, %A_Space%%A_Tab%
{
if (A_LoopField = "")
continue
if (A_LoopField = "(")
x.Insert({token:"LPAREN", value:"("})
else if (A_LoopField = ")")
x.Insert({token:"RPAREN", value:")"})
else if RegexMatch(A_LoopField, "i)^(and|or|not)$")
x.Insert({token:A_LoopField, value:A_LoopField})
else
x.Insert({token:"TEXT", value:A_LoopField})
}
return O(x, haystack)
}
T(tokens, byref haystack)
{
if (tokens.current().token = "LPAREN")
{
tokens.advance()
r := O(tokens, haystack)
if (tokens.current().token != "RPAREN")
throw ("Expected "")"" but got " tokens.current().value "`n" tokens.pos)
tokens.advance()
return r
}
if (tokens.current().token != "TEXT")
throw ("Expected text but got " tokens.current().token "`n" tokens.pos)
r := Instr(haystack, tokens.current().value)
tokens.advance()
return r
}
N(tokens, byref haystack)
{
n := 0
Loop
{
if (tokens.current().token != "NOT")
{
v := T(tokens, haystack)
return n ? !v : v
}
n := !n
tokens.advance()
} until (tokens.end() || tokens.current().token = "RPAREN")
}
A(tokens, byref haystack)
{
r := N(tokens, haystack)
Loop
{
if (tokens.current().token != "AND")
return r
tokens.advance()
r := N(tokens, haystack) && r
} until (tokens.end() || tokens.current().token = "RPAREN")
return r
}
O(tokens, byref haystack)
{
r := A(tokens, haystack)
Loop
{
if (tokens.current().token != "OR")
return r
tokens.advance()
r := A(tokens, haystack) || r
} until (tokens.end() || tokens.current().token = "RPAREN")
return r
}
class symbols
{
__New()
{
this.pos := 1
}
advance()
{
return this[++this.pos]
}
current()
{
return this[this.pos]
}
end()
{
return this[this.pos] = ""
}
}
Re: Boolean Search as the needle searching a haystack
Yes, MONSTER's Evaluate Arithmetic Expression function is overkill. So I made a simple boolean logic evaluator. So far it has not failed (yet).
Edit: Updated the functionSearch with boolean logic evaluator
Edit: Updated the function
Code: Select all
expression = 1 && 0 || !(0|| 0) ;1
MsgBox % eval_logic(expression)
expression = (1 || 0|| 0 ) && 1 && !(0|| 0) ;1
MsgBox % eval_logic(expression)
expression = 1 OR 1 AND 0 ; 1
MsgBox % eval_logic(expression,"AND","OR","NOT")
expression = (1 + 1 ) * 1 * 0 ;0
MsgBox % eval_logic(expression,"*","+","!")
eval_logic(expr, op_and:="&&", op_or:="||", op_not:="!") {
tmp:=expr
expr:=RegExReplace(expr,"\s+")
expr:=RegExReplace(expr,"11+","1")
expr:=RegExReplace(expr,"00+","0")
if RegExMatch(expr,"[^10()" op_and op_or op_not "]",m)
return expr "`n" "Invalid Character: " m "`nAllowed Characters: 1 0 ( ) " op_and " " op_or " " op_not
Loop
{
;AND
x:=expr
while pos := RegExMatch(expr,"\Q" op_and "\E",m,A_Index=1?1:pos+StrLen(m))
{
StringReplace, x, x, 1%op_and%1, 1, All
StringReplace, x, x, 0%op_and%0, 0, All
StringReplace, x, x, 0%op_and%1, 0, All
StringReplace, x, x, 1%op_and%0, 0, All
StringReplace, x, x, 01, 0, All ; 0 AND 1 = 0
StringReplace, x, x, 10, 0, All ;1 AND 0 = 0
}
expr:=x
;OR
StringReplace, expr, expr, 1%op_or%1, 1, All
StringReplace, expr, expr, 0%op_or%0, 0, All
StringReplace, expr, expr, 0%op_or%1, 1, All
StringReplace, expr, expr, 1%op_or%0, 1, All
StringReplace, expr, expr, (1), 1, All
StringReplace, expr, expr, (0), 0, All
;NOT
StringReplace, expr, expr, %op_not%1, 0, All
StringReplace, expr, expr, %op_not%0, 1, All
if (expr="1" or expr="0")
return expr
if (A_Index>500)
return "Invalid or Incomplete Expression`n" tmp "`nAllowed Characters: 1 0 ( ) " op_and " " op_or " " op_not
}
}
Code: Select all
string=
(
@utoHo:tkey? is a
better
language``s's
than
hello VBScript is always a scr1ipt
Microsoft word
)
;string = word hello Microsoft
;search = (@utoHo:tkey? OR Auto Hotkey OR Auto-Hotkey) AND language``s's AND NOT (Microsoft OR VBScr|pt)
;search = NOT (Microsoft word)
search = (@utoHo:tkey? OR Auto Hotkey OR Auto-Hotkey) AND language``s's AND NOT (Microsoft word OR VBScript is always a script)
op_and=¤ ;AND
op_or=µ ;OR
op_not=¡ ;NOT
case_sense=1 ;case insensitive
whole_word=0 ;match whole word/sentence
option_start1=
option_start2=
option_end1=
option_end2=
if case_sense=1
{
option_start1 .= "i)"
option_start2 .= "i)"
}
if whole_word=1
{
;option_start1 .= "\b" ;alphanumeric only
;option_end1 .= "\b" ;alphanumeric only
;option_start2 .= "\b" ;alphanumeric only
;option_end2 .= "\b" ;alphanumeric only
option_start1 .= "(?:(?<=[\s" op_and op_or op_not "()])|^)" ;alphanumeric + special characters
option_end1 .= "(?:(?=[\s" op_and op_or op_not "()])|$)" ;alphanumeric + special characters
option_start2 .= "(?:(?<=\s)|^)" ;alphanumeric + special characters
option_end2 .= "(?:(?=\s)|$)" ;alphanumeric + special characters
}
search1 := RegExReplace(RegExReplace(RegExReplace(search,"i)\bAND\b",op_and),"i)\bOR\b",op_or),"i)\bNOT\b",op_not)
while pos := RegExMatch(search1,"((?:(?=\S)[^\s" op_and op_or op_not "()]+\s*)+)",m,A_Index=1?1:pos+StrLen(m))
search2 := RegExReplace(A_Index=1?search1:search2,option_start1 "\Q" trim(m) "\E" option_end1,RegExMatch(string,option_start2 "\Q" trim(m) "\E" option_end2)>0?1:0)
search3 := RegExReplace(search2,op_and,"&&") ;AND
search3 := RegExReplace(search3,op_or,"||") ;OR
search3 := RegExReplace(search3,op_not,"!") ;NOT
MsgBox % "String: " string "`nExpression0: " search "`nExpression1: " search1 "`nExpression2: " search2 "`nExpression3: " search3 "`neval_logic Result: " eval_logic(search3)
if (eval_logic(search3)>0)
MsgBox do something here
ExitApp
eval_logic(expr, op_and:="&&", op_or:="||", op_not:="!") {
tmp:=expr
expr:=RegExReplace(expr,"\s+")
expr:=RegExReplace(expr,"11+","1")
expr:=RegExReplace(expr,"00+","0")
if RegExMatch(expr,"[^10()" op_and op_or op_not "]",m)
return expr "`n" "Invalid Character: " m "`nAllowed Characters: 1 0 ( ) " op_and " " op_or " " op_not
Loop
{
;AND
x:=expr
while pos := RegExMatch(expr,"\Q" op_and "\E",m,A_Index=1?1:pos+StrLen(m))
{
StringReplace, x, x, 1%op_and%1, 1, All
StringReplace, x, x, 0%op_and%0, 0, All
StringReplace, x, x, 0%op_and%1, 0, All
StringReplace, x, x, 1%op_and%0, 0, All
StringReplace, x, x, 01, 0, All ; 0 AND 1 = 0
StringReplace, x, x, 10, 0, All ;1 AND 0 = 0
}
expr:=x
;OR
StringReplace, expr, expr, 1%op_or%1, 1, All
StringReplace, expr, expr, 0%op_or%0, 0, All
StringReplace, expr, expr, 0%op_or%1, 1, All
StringReplace, expr, expr, 1%op_or%0, 1, All
StringReplace, expr, expr, (1), 1, All
StringReplace, expr, expr, (0), 0, All
;NOT
StringReplace, expr, expr, %op_not%1, 0, All
StringReplace, expr, expr, %op_not%0, 1, All
if (expr="1" or expr="0")
return expr
if (A_Index>500)
return "Invalid or Incomplete Expression`n" tmp "`nAllowed Characters: 1 0 ( ) " op_and " " op_or " " op_not
}
}
Last edited by ahcahc on 10 Sep 2014, 11:16, edited 2 times in total.
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
Hi ahcahc, FanaticGuru, and LinearSpoon,
Sorry for the delayed response, but I am in the US Central Time zone and all of this occurred while I was sleeping. Thanks to all of you for your efforts in improving the solution. I'll try to do some testing later today and Friday (I'm tied up this morning and all day tomorrow). Regards, Joe
Sorry for the delayed response, but I am in the US Central Time zone and all of this occurred while I was sleeping. Thanks to all of you for your efforts in improving the solution. I'll try to do some testing later today and Friday (I'm tied up this morning and all day tomorrow). Regards, Joe
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
Hi ahcahc,
The new eval_logic code works great when the search term is valid (at least, none of my tests has failed yet). However, if the search term contains unpaired parentheses or one of the three invalid search characters, it hangs. For example, a search for
AutoHotkey)
or
AutoHotkey%op_and%
hangs. I don't care about this, but wanted to pass it along to you. I'm going to handle it by checking the search term right after the user inputs it to make sure that it (1) contains an equal number of left and right parentheses and (2) does not contain one of the three invalid characters. Not being a RegEx expert, I'll probably do something like this right after inputting the search term from the user:
I'm sure you could do it much more elegantly with RegEx, but the above is in my comfort zone.
I've defined the three "_out" variables to contain some text describing each character (e.g., "upside down exclamation mark - ASCII code 161"), so it's easier for the user to understand which characters are invalid. Regards, Joe
The new eval_logic code works great when the search term is valid (at least, none of my tests has failed yet). However, if the search term contains unpaired parentheses or one of the three invalid search characters, it hangs. For example, a search for
AutoHotkey)
or
AutoHotkey%op_and%
hangs. I don't care about this, but wanted to pass it along to you. I'm going to handle it by checking the search term right after the user inputs it to make sure that it (1) contains an equal number of left and right parentheses and (2) does not contain one of the three invalid characters. Not being a RegEx expert, I'll probably do something like this right after inputting the search term from the user:
Code: Select all
If (InStr(search,op_and) or InStr(search,op_or) or InStr(search,op_not))
{
MsgBox,4112,Fatal Error,These three characters are not allowed in a search`n%op_and_out%`n%op_or_out%`n%op_not_out%
ExitApp
}
StringReplace,search,search,(,(,UseErrorLevel
NumLeft:=ErrorLevel
StringReplace,search,search,),),UseErrorLevel
NumRight:=ErrorLevel
If (NumLeft<>NumRight)
{
MsgBox,4112,Fatal Error,There are %NumLeft% left parentheses and %NumRight% right parentheses in`n%search%
ExitApp
}
I've defined the three "_out" variables to contain some text describing each character (e.g., "upside down exclamation mark - ASCII code 161"), so it's easier for the user to understand which characters are invalid. Regards, Joe
- FanaticGuru
- Posts: 1906
- Joined: 30 Sep 2013, 22:25
Re: Boolean Search as the needle searching a haystack
Superfluous () were causing problems. NOT Microsoft word worked fine but the unneeded NOT (Microsoft word) did not.ahcahc wrote:Should be true. Need more tweaking.Code: Select all
Haystack:="hello word Microsoft" Needle := "NOT (Microsoft word)"
Add a check to remove unneeded ().
Code: Select all
;~ Haystack := "Hello world, this is a beta test" ; TRUE
;~ Needle := "(hello AND world) AND (is a beta AND (test AND (NOT production)))"
;~ Haystack:="hello word Microsoft" ; TRUE
;~ Needle := "NOT (Microsoft word)"
Haystack:="hello word Microsoft" ; TRUE
Needle := "word"
;~ Haystack:="hello word Microsoft" ; FALSE
;~ Needle := "NOT word"
;~ Haystack:="hello word Microsoft" ; TRUE
;~ Needle := "NOT Microsoft word"
;~ Haystack:="hello word Microsoft" ; FALSE
;~ Needle := "NOT ( ( (word Microsoft)) )"
;~ Haystack:="hello word Microsoft" ; FALSE
;~ Needle := "NOT (Microsoft word) AND NOT ((hello))"
;~ Haystack:="hello ((Microsoft word Microsoft))" ; FALSE
;~ Needle := "NOT Microsoft word"
;~ Haystack := "Hello world, this is a beta test" ; TRUE
;~ Needle := "(hello AND world) OR (is a alpha AND (test AND (NOT production)))"
;~ Haystack := "Hello world, this i% a beta test" ; TRUE
;~ Haystack := "Hello worlds, this i% a beta test" ; TRUE
;~ Haystack := "Helloworlds, this i% a beta test" ; FALSE
;~ Haystack := "Hello world , this i% a beta test" ; FALSE
;~ Haystack := "Hello world, this i% a beta test end" ; FALSE
;~ Haystack := " World, Hello, this i% a beta test" ; FALSE
;~ Needle := "(%%hello(?=.*world)%% AND %% worlds?,%%) AND (i% a beta AND (%%test$%% AND (NOT production)))"
;~ Haystack := "(Hello) worlds, this i% a beta test" ; TRUE
;~ Haystack := "Hello worlds, this i% a beta test" ; FALSE
;~ Needle := "(%%\(hello\)%% AND %% worlds?,%%) AND (i% a beta AND (%%test$%% AND (NOT production)))"
if SearchBoolean(Haystack, Needle, false)
MsgBox TRUE
else
MsgBox FALSE
SearchBoolean(H, N, C:=false) ; Haystack, Needle, Case-sensitive
{
NT := RegExReplace(N, "AND|OR|NOT", "¥")
Loop, Parse, NT, ¥
if T := Trim(A_LoopField, " ()")
if RegExMatch(T,"^%%(.*)%%$", M)
N := RegExReplace(N, (C ? "" : "i)")"\Q" T "\E", RegExMatch(H, (C ? "" : "i)") M1) ? 1 : 0,, 1)
else
N := RegExReplace(N, (C ? "" : "i)")"\Q" T "\E", RegExMatch(H, (C ? "" : "i)")"\Q" T "\E") ? 1 : 0,, 1)
while !(N =1 or N = 0)
{
N := RegExReplace(N, "\([\(\s]*(\d)[\)\s]*\)", " $1 ")
NN := N, NNP := 1
Loop
if (P := RegExMatch(NN, "U)\(\s*([^\(]*)\s*\)", M))
NN := M1, NNP := P
else
break
NNN := NN, X := 1
Loop
{
while RegExMatch(NNN, "NOT\s*(\d)", M)
NNN := RegExReplace(NNN, "NOT\s*\d", X := ! M1,, 1)
if !RegExMatch(NNN, "(\d)\s*(AND|OR)\s*(\d)", M)
break
else if (M2 = "AND")
X := M1 & M3 & X
else if (M2 = "OR")
X := M1 | M3 & X
NNN := RegExReplace(NNN, "\d\s*(AND|OR)\s*\d", X,, 1)
}
N := RegExReplace(N, "\(?\s*" NN "\s*\)?", X,, 1, NNP)
}
return N
}
FG
Hotkey Help - Help Dialog for Currently Running AHK Scripts
AHK Startup - Consolidate Multiply AHK Scripts with one Tray Icon
Hotstring Manager - Create and Manage Hotstrings
[Class] WinHook - Create Window Shell Hooks and Window Event Hooks
AHK Startup - Consolidate Multiply AHK Scripts with one Tray Icon
Hotstring Manager - Create and Manage Hotstrings
[Class] WinHook - Create Window Shell Hooks and Window Event Hooks
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
Hi FG,
I've been testing your updated code that removes the superfluous parens. It has worked on many complex expressions, but amazingly fails on this simple one:
Haystack:="hello world"
Needle:="goodbye"
Should be FALSE, but comes up TRUE.
You're right about keeping things simple for the users, but I do want a case sensitive/insensitive option and a whole-word option. I'll get those from the users via GUI radio buttons, so it will be very easy for them to understand and use. I don't think that the ability to search for AND, OR, NOT is important, but users are sometimes hard to predict. Wildcards is an interesting idea, but my thinking right now is that a non-whole-word search is a good enough wildcard subset, without any of the complexity of wildcard characters.
Nice addition on treating a search that begins and ends with a % as a RegEx, although I already have a radio button for the type of search (Simple, Boolean, RegEx). I think that's an easier interface for the users, although anyone who can do a RegEx can certainly handle the %...% syntax. Regards, Joe
UPDATE: I just saw your change to handle single items that have no AND OR NOT in them. That fixed the problem shown above, although I haven't regression tested it. Thanks for the fix!
I've been testing your updated code that removes the superfluous parens. It has worked on many complex expressions, but amazingly fails on this simple one:
Haystack:="hello world"
Needle:="goodbye"
Should be FALSE, but comes up TRUE.
You're right about keeping things simple for the users, but I do want a case sensitive/insensitive option and a whole-word option. I'll get those from the users via GUI radio buttons, so it will be very easy for them to understand and use. I don't think that the ability to search for AND, OR, NOT is important, but users are sometimes hard to predict. Wildcards is an interesting idea, but my thinking right now is that a non-whole-word search is a good enough wildcard subset, without any of the complexity of wildcard characters.
Nice addition on treating a search that begins and ends with a % as a RegEx, although I already have a radio button for the type of search (Simple, Boolean, RegEx). I think that's an easier interface for the users, although anyone who can do a RegEx can certainly handle the %...% syntax. Regards, Joe
UPDATE: I just saw your change to handle single items that have no AND OR NOT in them. That fixed the problem shown above, although I haven't regression tested it. Thanks for the fix!
- JoeWinograd
- Posts: 2198
- Joined: 10 Feb 2014, 20:00
- Location: U.S. Central Time Zone
Re: Boolean Search as the needle searching a haystack
Hi LinearSpoon,
Your code has worked on everything I've tried. Nice approach on the test cases - great idea being able to process all of them in a single run via the loop/parse/`n technique.
It throws an error on un-paired parentheses, as you anticipated with the Throw command, and that's fine - as I mentioned to ahcahc, I'll make sure there are an equal number of lefts and rights before calling the search.
I don't understand the "class symbols" statement. It's not a label and not a function and I couldn't find it described in the AHK doc. What is it?
Thanks, Joe
Your code has worked on everything I've tried. Nice approach on the test cases - great idea being able to process all of them in a single run via the loop/parse/`n technique.
It throws an error on un-paired parentheses, as you anticipated with the Throw command, and that's fine - as I mentioned to ahcahc, I'll make sure there are an equal number of lefts and rights before calling the search.
I don't understand the "class symbols" statement. It's not a label and not a function and I couldn't find it described in the AHK doc. What is it?
Computer Science, I presume?I'm leaving for classes in a few minutes,
Thanks, Joe
Who is online
Users browsing this forum: doodles333 and 311 guests