Email validity check - isValidEmail() [function]

 
daonlyfreez



Joined: 16 Mar 2005
Posts: 949
Location: Berlin

Posted: Thu Jul 30, 2009 11:29 am    Post subject: Email validity check - isValidEmail() [function]

I came across this article on validating email addresses, and decided to convert the code into AHK.

Here it is. Please test.

Code:
#SingleInstance force

/*

Valid Email RegEx?

http://www.pgregg.com/projects/php/code/showvalidemail.php
http://www.pgregg.com/projects/php/code/validate_email.inc.phps

*/

emailTest =
(LTrim % Join`n
name.lastname@domain.com|true
.@|false
a@b|false
@bar.com|false
@@bar.com|false
a@bar.com|true
aaa.com|false
aaa@.com|false
aaa@.123|false
aaa@[123.123.123.123]|true
aaa@[123.123.123.123]a|false
aaa@[123.123.123.333]|false
a@bar.com.|false
a@bar|false
a-b@bar.com|true
+@b.c|false
+@b.com|true
a@-b.com|false
a@b-.com|false
-@..com|false
-@a..com|false
a@b.co-foo.uk|true
"hello my name is"@stutter.com|true
"Test \"Fail\" Ing"@example.com|true
valid@special.museum|true
invalid@special.museum-|false
shaitan@my-domain.thisisminekthx|false
test@...........com|false
foobar@192.168.0.1|false
"Abc\@def"@example.com|true
"Fred Bloggs"@example.com|true
"Joe\\Blow"@example.com|true
"Abc@def"@example.com|true
customer/department=shipping@example.com|true
$A12345@example.com|true
!def!xyz%abc@example.com|true
_somename@example.com|true
Test \\'.chr(10).' Folding \\'.chr(10).' Whitespace@example.com|true
HM2Kinsists@(that comments are allowed)this.is.ok|true
user%uucp!path@somehost.edu|true
)


Loop, Parse, emailTest, `n
{
  ; Each test line has the form "address|expected"
  StringSplit, emailTestArray, A_LoopField, |
  ; isValidEmail() returns 1 or 0; map it to the words used in the test data
  isit := isValidEmail(emailTestArray1) ? "true" : "false"
  MsgBox,, Testing,
  (LTrim
   Email: %emailTestArray1%
   
   Should be: %emailTestArray2%
   
   Is reported as: %isit%
  )
}

MsgBox Done testing

Return

isValidEmail(emailstr)
{
  ; Remove surrounding whitespace (AutoTrim)
  emailstr = %emailstr%
  ; Get length (after trimming, so the position arithmetic below stays correct)
  emailstr_len := StrLen(emailstr)
  ; Make lowercase
  StringLower, emailstr, emailstr
  ; Split it up into before and after the @ symbol
  StringGetPos, atPos, emailstr, @, R
  If ErrorLevel
    Return false ; no @
  StringLeft, local_part, emailstr, %atPos%
  StringRight, domain_part, emailstr, % emailstr_len - atPos - 1
  ; Sanitize quoted parts: first neutralize backslash-escaped characters (e.g. \" or \@)
  local_part := RegExReplace(local_part, "\\.", "_")
  local_part := RegExReplace(local_part, """[^""]+""", ".")
  ; Comments ( this is a comment ) are permitted in domain parts
  domain_part := RegExReplace(domain_part, "\([^()]*\)", "")
  ; Make sure there are no more @ (we sanitized valid ones above)
  If InStr(local_part, "@")
    Return false ; too many @
  ; Check that the username is >= 1 char
  If StrLen(local_part) = 0
    Return false ; username missing
  ; Split the domain part into the dotted parts
  StringSplit, domain_components, domain_part, `.
  ; Check there are at least 2
  If domain_components0 < 2
    Return false ; not enough domain components
  ; Check each domain part to ensure it doesn't start or end with a bad char
  Loop %domain_components0%
  {
    domain_component := domain_components%A_Index%
    If (StrLen(domain_component) > 0)
    {
      StringLeft, firstChar, domain_component, 1
      StringRight, lastChar, domain_component, 1
      If RegExMatch(firstChar, "[\.-]") Or RegExMatch(lastChar, "[\.-]")
        Return false ; wrong start/end character in domain component
    }
    Else
      Return false ; domain component missing
  }
  ; Check the last domain component has 2-6 chars (.uk to .museum)
  domain_last := domain_components%domain_components0%
  If (StrLen(domain_last) < 2 Or StrLen(domain_last) > 6)
    Return false ; TLD too large or small
  ; Check for valid chars - Domains can only have A-Z, 0-9, ., and the - chars,
  ; or be in the form [123.123.123.123]
  If RegExMatch(domain_part, "^\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]$")
  {
    If ip2long(domain_part) != 0
      Return true ; ip
    Else
      Return false ; ip error
  }
  If RegExMatch(domain_part, "^[a-z0-9\.-]+$")
    Return true ; domain
  ; If we get here then it didn't pass
  Return false ; end of function
}

ip2long(ip)
{  ; http://www.cflib.org/udf/ip2long
  ip := RegExReplace(ip, "\[|\]", "") ; remove xtra chars
  StringSplit, iparr, ip, `.
  If (iparr0 != 4)
    Return False
  If (iparr1 > 255 Or iparr2 > 255 Or iparr3 > 255 Or iparr4 > 255)
    Return False
  Else
    Return (iparr1*256**3) + (iparr2*256**2) + (iparr3*256) + iparr4 ; ** is power; ^ would be bitwise XOR
}


Edit: Clarified the demo a bit more (I hope), and made some changes.
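
For readers wondering what ip2long() is meant to compute: it packs the four octets of a dotted IPv4 address into one number (first octet times 256 cubed, plus second octet times 256 squared, plus third octet times 256, plus the fourth octet). Below is a minimal standalone sketch of that arithmetic, assuming AutoHotkey v1 syntax. The name ipToNumber and the -1 error value are made up for this illustration and are not part of the script above. In AutoHotkey expressions the power operator is **, while ^ means bitwise XOR.

```autohotkey
MsgBox % ipToNumber("192.168.0.1") ; 192*16777216 + 168*65536 + 0*256 + 1 = 3232235521

; Hypothetical helper (illustration only): dotted IPv4 string -> number, -1 on error
ipToNumber(ip)
{
  StringSplit, octet, ip, .
  If (octet0 != 4)
    Return -1 ; not exactly four parts
  Loop 4
  {
    part := octet%A_Index%
    ; each part must be 1-3 digits and no larger than 255
    If (!RegExMatch(part, "^\d{1,3}$") Or part > 255)
      Return -1
  }
  ; ** is the power operator; ^ would be bitwise XOR
  Return octet1 * 256**3 + octet2 * 256**2 + octet3 * 256 + octet4
}
```

Returning -1 on error, rather than 0, keeps the valid address 0.0.0.0 distinguishable from a failure. That is a design choice of this sketch only.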


Last edited by daonlyfreez on Thu Jul 30, 2009 6:55 pm; edited 2 times in total
SoggyDog



Joined: 02 May 2006
Posts: 783
Location: Greeley, CO

Posted: Thu Jul 30, 2009 3:46 pm

Sample Usage?

All I get is "Done testing".
n-l-i-d
Guest





Posted: Thu Jul 30, 2009 6:29 pm

That means the testing went well, provided no other message box shows up before that one. The idea is to change the input.

daonlyfreez




Posted: Thu Jul 30, 2009 6:55 pm

I changed the demo a bit. You'll now see a messagebox on every check.
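
For anyone who prefers the quieter style, where a message box appears only when a result differs from the expected value, here is a minimal sketch of such a test loop, assuming AutoHotkey v1 syntax. It uses the same "address|expected" data format as the first post. To keep it runnable by itself it calls stubValidator(), a made-up stand-in that only checks for text on both sides of a single "@". Replace that call with isValidEmail() to test the real function.

```autohotkey
; Test data in the same "address|expected" format as the first post
tests =
(LTrim
a@bar.com|true
aaa.com|false
@bar.com|false
)

failures := 0
Loop, Parse, tests, `n
{
  StringSplit, field, A_LoopField, |
  ; map the 1/0 return value to the words used in the test data
  result := stubValidator(field1) ? "true" : "false"
  If (result != field2)
  {
    failures += 1
    MsgBox, Mismatch for %field1%: expected %field2% but got %result%
  }
}
MsgBox, Done testing. Mismatches: %failures%
Return

; Hypothetical stand-in so the sketch runs by itself; swap in isValidEmail()
stubValidator(addr)
{
  ; accepts anything with at least one character on each side of a single "@"
  Return RegExMatch(addr, "^[^@]+@[^@]+$") > 0
}
```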
SoggyDog




Posted: Thu Jul 30, 2009 7:45 pm

I get it now;
Just didn't spend enough time with it earlier.

Thanks.
n-l-i-d
Guest





Posted: Sat Oct 10, 2009 9:14 pm

I just found out that the RegEx by arpad3, named on the site, also works.

So, here is an alternative function.

The regex line should be one line (wordwrap!):

Quote:
static regex := "is) ... "


Code:
isValidEmail(emailstr)
{
  ; The pattern must stay on ONE line (watch out for word wrap when copying)
  static regex := "is)^(?:""(?:\\\\.|[^""])*""|[^@]+)@(?=[^()]*(?:\([^)]*\)[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|2[0-4]\d|25[0-4])\]) *\z"
  If RegExMatch(emailstr, regex)
    Return true
  Else
    Return false
}
fincs



Joined: 05 May 2007
Posts: 1162
Location: Seville, Spain

Posted: Sat Oct 10, 2009 9:35 pm

n-l-i-d wrote:
(code snip)


It can be shortened to this working snippet of code:
Code:
isValidEmail(emailstr){
    static regex := "is)^(?:""(?:\\\\.|[^""])*""|[^@]+)@(?=[^()]*(?:\([^)]*\)"
    . "[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-"
    . "z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|"
    . "2[0-4]\d|25[0-4])\]) *\z"
    return RegExMatch(emailstr, regex) != 0
}

n-l-i-d
Guest





Posted: Sat Oct 10, 2009 9:41 pm

Duh. OK, I thought you couldn't concatenate that way when initializing variables in functions, but I'm wrong again.

I found another source of even more thoughts on perfecting this: RFC-compliant email address validator. There are also more test cases there.

So if you really need to be sure about your email addresses and don't want any false positives or negatives, the function should be tested and refined further.
n-l-i-d
Guest





Posted: Sat Oct 10, 2009 9:56 pm

Your version didn't work.

I needed to separately init the regex. I don't know if this still has the advantage of loading the variable only once.

Code:

isValidEmail(emailstr){
  static regex
  regex := "is)^(?:""(?:\\\\.|[^""])*""|[^@]+)@(?=[^()]*(?:\([^)]*\)"
    . "[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-"
    . "z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|"
    . "2[0-4]\d|25[0-4])\]) *\z"
  return RegExMatch(emailstr, regex) != 0
}
fincs




Posted: Sun Oct 11, 2009 3:30 pm

n-l-i-d wrote:
Your version didn't work.

I needed to separately init the regex. I don't know if this still has the advantage of loading the variable only once.


No, it's a typo (quotes need to be escaped via "" inside strings). Corrected version:

Code:

isValidEmail(emailstr){
  static regex := "is)^(?:""""(?:\\\\.|[^""""])*""""|[^@]+)@(?=[^()]*(?:\([^)]*\)"
    . "[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-"
    . "z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|"
    . "2[0-4]\d|25[0-4])\]) *\z"
  return RegExMatch(emailstr, regex) != 0
}

n-l-i-d
Guest





Posted: Sun Oct 11, 2009 4:19 pm

Sorry, but that is not the case. I already escaped the quotes, you escape them again...

If I add a MsgBox to show me the regex, I get this:

Code:
is)^(?:""(?:\\\\.|[^""])*""|[^@]+)@(?=[^()]*(?:\([^)]*\)" . "[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-" . "z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|" . "2[0-4]\d|25[0-4])\]) *\z


Code:
isValidEmail("someone@somewhere.com")

isValidEmail(emailstr){
  static regex := "is)^(?:""""(?:\\\\.|[^""""])*""""|[^@]+)@(?=[^()]*(?:\([^)]*\)"
    . "[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-"
    . "z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|"
    . "2[0-4]\d|25[0-4])\]) *\z"
    msgbox % regex
  return RegExMatch(emailstr, regex) != 0
}


It only works if I don't use concatenation when initializing the variable:

Code:
is)^(?:"(?:\\\\.|[^"])*"|[^@]+)@(?=[^()]*(?:\([^)]*\)[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|2[0-4]\d|25[0-4])\]) *\z


Code:
isValidEmail("someone@somewhere.com")

isValidEmail(emailstr){
  static regex
  regex := "is)^(?:""(?:\\\\.|[^""])*""|[^@]+)@(?=[^()]*(?:\([^)]*\)"
    . "[^()]*)*\z)(?![^ ]* (?=[^)]+(?:\(|\z)))(?:(?:[a-z\d() ]+(?:[a-z\d() -]*[()a-"
    . "z\d])?\.)+[a-z\d]{2,6}|\[(?:(?:1?\d\d?|2[0-4]\d|25[0-4])\.){3}(?:1?\d\d?|"
    . "2[0-4]\d|25[0-4])\]) *\z"
    msgbox % regex
  return RegExMatch(emailstr, regex) != 0
}


So I'm now quite positive that using concatenation while initializing static variables in functions does not work.
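
One way to keep the "assign only once" benefit while still splitting a long pattern across lines is to declare the static variable without an initializer and fill it on the first call. The sketch below assumes AutoHotkey v1 syntax. The function name looksLikeEmail and its deliberately simplified pattern are made up for this illustration; this is not the arpad3 regex and not a full validator.

```autohotkey
MsgBox % looksLikeEmail("someone@somewhere.com") ; shows 1
MsgBox % looksLikeEmail("no-at-sign")            ; shows 0

; Hypothetical demo function with a simplified pattern (illustration only)
looksLikeEmail(str)
{
  static pattern ; starts out empty and keeps its value between calls
  If (pattern = "")
  {
    ; this is an ordinary assignment that runs on the first call only,
    ; so concatenation across lines behaves as usual here
    pattern := "i)^[^@\s]+@"
      . "[a-z\d.-]+\.[a-z]{2,6}$"
  }
  Return RegExMatch(str, pattern) != 0
}
```

The extra emptiness check costs one comparison per call, which should be negligible next to the regex match itself.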
PaulGregg
Guest





Posted: Mon Jul 26, 2010 12:27 am    Post subject: Re: Email validity check - isValidEmail() [function]

daonlyfreez wrote:
I came across this article on validating email addresses, and decided to convert the code into AHK.
...


Cool - I just came across this.

For some of the commenters, you may have missed the point of my *article
(* not really an article - I just wanted a methodical way of testing regexes that people insisted were great for email address validation).

Anyway, my point is not that you should strive for that holy grail of a regex, but that they all have failings, and to even get close to a working regex you end up with something so complex that you can never maintain it going forward.
Point of note: the arpad3 regex was written by arpad specifically to pass my tests - not to be an email address validator.

My conclusion is that, for this specific purpose, you should use a clearly documented step-by-step function to validate the email "rules". You won't thank me today, but you will in 5 years' time when ICANN allows people to register their own TLDs for $500,000.

Regards,
PG
daonlyfreez




Posted: Mon Jul 26, 2010 4:11 pm    Post subject: Re: Email validity check - isValidEmail() [function]

Hi Paul,

Cool that you responded.

Good to know that the arpad3 regex is not meant to be used that way. I thought I had found the 'shortest' regex version, but I guess that doesn't count.

I almost literally transcoded your original into AutoHotkey. Are there any specific parts that might need to be altered (apart from the future "private" TLDs)?

Greetings,

daonlyfreez
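
On the point about future TLDs: the step-by-step function hard-codes a 2-6 character limit for the last domain component, so longer TLDs would be rejected. One possible way to make that rule easy to change is to move it into a small function of its own with the limit as a parameter. The sketch below assumes AutoHotkey v1 syntax. The name isPlausibleTld and its parameters are made up for this illustration. The default of 63 is the maximum length of a single DNS label.

```autohotkey
MsgBox % isPlausibleTld("museum")            ; 1
MsgBox % isPlausibleTld("thisisminekthx")    ; 1 with the default limit of 63
MsgBox % isPlausibleTld("thisisminekthx", 6) ; 0 with the original limit of 6
MsgBox % isPlausibleTld("c")                 ; 0, too short

; Hypothetical helper (name and parameters are illustrative)
isPlausibleTld(tld, maxLen = 63)
{
  len := StrLen(tld)
  If (len < 2 Or len > maxLen)
    Return false
  ; letters, digits and hyphens only, with no hyphen at either end
  Return RegExMatch(tld, "i)^[a-z\d]([a-z\d-]*[a-z\d])?$") != 0
}
```

Note that relaxing the limit changes the expected result of the shaitan@my-domain.thisisminekthx test case in the first post, which is listed there as false.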