Friday, June 18, 2010

PHP: Email Sanitization code.

So the principal thought behind this is to have a function that determines whether an email address is crap or not crap based on the function's return value. Return 1 and your email is bad juju; return anything else and you're fine.

So an example of usage:

You're going to import email leads from a database and you need to check whether each email address is valid.

Query DB -> Grab $info + $email -> sanitizer($email)
    -> return(1) -> Notify of Deletion -> Query(delete($email))
    -> return(x) -> Notify of Completion -> Query(insert($info + $email))
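
The flow above could be sketched roughly like this. Treat it as a sketch under a few assumptions: the leads sit in a plain array instead of a real database, check_email() and sort_leads() are hypothetical names, and check_email() is only a stand-in built on PHP's filter_var() so the snippet runs on its own. It is not the sanitizer() function from this post, it just follows the same "1 means bad" convention.

```php
<?php
// Hypothetical stand-in for sanitizer(): returns 1 for a bad address,
// 0 otherwise. It uses PHP's built-in filter, not the checks from this post.
function check_email($email) {
    return filter_var($email, FILTER_VALIDATE_EMAIL) === false ? 1 : 0;
}

// Split the leads into rows to insert and rows to delete.
function sort_leads(array $leads) {
    $insert = array();
    $delete = array();
    foreach ($leads as $lead) {
        if (check_email($lead['email']) === 1) {
            $delete[] = $lead;   // return(1): bad address
        } else {
            $insert[] = $lead;   // return(x): keep it
        }
    }
    return array('insert' => $insert, 'delete' => $delete);
}

// Made-up sample data standing in for the database query.
$leads = array(
    array('info' => 'Alice', 'email' => 'alice@example.com'),
    array('info' => 'Bob',   'email' => 'bob at example.com'),
);

$result = sort_leads($leads);
echo count($result['insert']), " to insert, ",
     count($result['delete']), " to delete\n";
```

In a real import the two arrays would feed the delete and insert queries, with the notifications sent in between.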



PHP code starts here:

<?php
/* Character omission:
! ? @ # $ % ^ & * ( ) ` ' / \ { } ~ , < > " : ; [ ] and the space character
(plus non-ASCII characters like §, ™, ®, ⚛, ⚡, ö, π, and the like)

(1) 'a' is appended to each character due to a bug in PHP's mb_detect_encoding(). Found:
http://www.php.net/manual/en/function.mb-detect-encoding.php#81936
*/

function sanitizer($email){
    // A usable address has exactly one '@', so explode() must give two parts.
    $parts = explode('@', $email);
    $integrity = count($parts);
    if($integrity == 2){
        // Join the local part and the domain, then split into characters.
        // 'u' splits on UTF-8 characters instead of bytes; PREG_SPLIT_NO_EMPTY
        // drops the empty strings preg_split() would return at both ends.
        $chars = preg_split('//u', $parts[0].$parts[1], -1, PREG_SPLIT_NO_EMPTY);
        if($chars === false){ return 1; } // not valid UTF-8
        foreach($chars as $char){
            // (1) see the note above about the appended 'a'
            if( mb_detect_encoding($char.'a', "auto") == "ASCII")
            {switch($char){
                case '!': case '?': case '#': case '$': case '%':
                case '^': case '&': case '*': case '(': case ')':
                case '<': case '>': case '`': case "'": case '"':
                case '/': case '\\': case ' ': case '{': case '}':
                case '~': case ',': case ':': case ';': case '[':
                case ']':
                    return 1; // forbidden character
                default:
                    break;
            }
            }else{return 1;} // non-ASCII character
        }
    }else{return 1;} // zero or several '@' signs
    return 0; // nothing objectionable found
}

if(sanitizer('test@test.com') !== 1)
{ echo "it works"; }else{ echo "nope, crap";}
?>



So it's basically a few "if" statements to check the logic of an email address: one checks the number of "@" signs, another is combined with an encoding function, and a switch statement rounds it out. Obviously this leaves plenty of room for you to add your own status codes and such with the returns.
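
To give an idea of what "your own status codes" might look like, here is a minimal sketch. The constant names, their values, and the sanitizer_status() function are all made up for illustration. For brevity it also swaps the switch for strpbrk() and the encoding check for a byte-range regex, so take it as one possible variation and not as the function above.

```php
<?php
// Hypothetical status codes; names and values are illustrative only.
define('EMAIL_OK', 0);
define('EMAIL_BAD_STRUCTURE', 1);  // not exactly one '@'
define('EMAIL_NON_ASCII', 2);      // contains a non-ASCII byte
define('EMAIL_BAD_CHARACTER', 3);  // contains a forbidden ASCII character

function sanitizer_status($email) {
    $parts = explode('@', $email);
    if (count($parts) !== 2) {
        return EMAIL_BAD_STRUCTURE;
    }
    $joined = $parts[0] . $parts[1];

    // Any byte above 0x7F means the string is not plain ASCII.
    if (preg_match('/[^\x00-\x7F]/', $joined)) {
        return EMAIL_NON_ASCII;
    }

    // strpbrk() returns false when none of the listed characters occur.
    $forbidden = '!?#$%^&*()`\'/\\{}~ ,<>":;[]';
    if (strpbrk($joined, $forbidden) !== false) {
        return EMAIL_BAD_CHARACTER;
    }
    return EMAIL_OK;
}

// Example call: report why an address was rejected.
$status = sanitizer_status('bad!name@test.com');
if ($status !== EMAIL_OK) {
    echo "rejected with status ", $status, "\n";
}
```

The caller can then decide per code whether to delete the lead, flag it for review, or just log it.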

2 comments:

Stephen Fluin said...

You should just store the invalid characters in a string or an array, and use strpos to identify if it is in the list of invalid chars, rather than a huge switch statement. A lazy programmer is a good programmer.

Mel said...

I actually have a few variations on this type of setup. An alternative is to use something like in_array($characters), and the other version involves a regular expression :)
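
For reference, the alternatives raised in the comments could be sketched like this. It is only a sketch: INVALID_EMAIL_CHARS and both function names are hypothetical, and the functions replace just the switch statement, so the "@" count check from the post would still be done separately. Both follow the post's convention of returning 1 for a bad string.

```php
<?php
// The characters the switch rejects, kept in one string (hypothetical name).
define('INVALID_EMAIL_CHARS', '!?#$%^&*()`\'/\\{}~ ,<>":;[]');

// strpos() variant: look each byte up in the list of invalid characters.
function has_invalid_char_strpos($email) {
    $length = strlen($email);
    for ($i = 0; $i < $length; $i++) {
        // Bytes above 127 are non-ASCII; the rest are checked against the list.
        if (ord($email[$i]) > 127
            || strpos(INVALID_EMAIL_CHARS, $email[$i]) !== false) {
            return 1;
        }
    }
    return 0;
}

// Regular-expression variant: one character class built from the same list,
// plus a second class that catches any non-ASCII byte.
function has_invalid_char_regex($email) {
    $pattern = '/[' . preg_quote(INVALID_EMAIL_CHARS, '/') . ']|[^\x00-\x7F]/';
    return preg_match($pattern, $email) === 1 ? 1 : 0;
}

// Example calls.
echo has_invalid_char_strpos('test@test.com'), "\n";
echo has_invalid_char_regex('te st@test.com'), "\n";
```

Either way the list of characters lives in one place, which makes it easier to adjust than a long run of case lines.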