Sunday, January 30, 2011

Protect your email from spam/junk by encoding it against crawlers

If you write a blog and put your email on it, there is a chance you will get spam/junk mail.
This can happen for many reasons, but here I'm focusing on the crawler method.

How does this spam work?
Basically, this is the procedure:
1. They find your email
2. They write an email to you

A spam run usually goes out to 1,000 email addresses or more, and of course it is done by a bot (a system), not by people typing it in manually.

To find your email
They use a web crawler, spam crawler or anything-crawler. These crawlers browse slowly through the internet (from blog to blog, website to website, forever). If your blog has a hyperlink to your friend's blog (a blogroll, as it is called), the crawler will follow it.

So, in programming terms, these crawlers detect 2 things

1st detection
Detect the "http://" text. If they find a word, URL or string that matches, they retrieve the link and put it on the "next to be crawled" list.

2nd detection
Since they want to send spam to your mailbox, they simply look for any string or word containing the "@" symbol. Your email gets captured in this process.
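
The two detections above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular crawler is written: the patterns are simplified, and the names scan_page, LINK_RE and EMAIL_RE and the example addresses are made up for this post.

```python
import re

# Simplified patterns (assumptions for illustration); real crawlers
# are likely to be more thorough than this.
LINK_RE = re.compile(r'http://[^\s"\'<>]+')
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')

def scan_page(source):
    """Return (links, emails) found in the raw page source."""
    links = LINK_RE.findall(source)    # 1st detection: "http://" text
    emails = EMAIL_RE.findall(source)  # 2nd detection: text around "@"
    return links, emails

page = '<a href="http://friend.example/blog">My friend</a> Mail me: someone@example.com'
links, emails = scan_page(page)
to_crawl = list(links)  # the "next to be crawled" list
print(to_crawl)
print(emails)
```

The important point is that the crawler works on the raw text of the page, so anything that looks like a plain address in the source ends up on the list.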

Write email to you (spam)
This is the process of producing mass email. Of course they take the list they got from the first step and use a bot (system) to mass-mail it.


Prevention

There are 2 types of prevention
1. Email filter (that's why you have spam/junk mail categories)
2. Prevent your email from getting onto the spammer's list

1st prevention - Filter your mailbox
This is already a common, built-in feature of almost all email systems like Gmail and others, so there is no need to discuss it much.

2nd prevention - Prevent from being listed
This is what I want to discuss.
Usually people either write their email in a format that is not easy to detect, or encode it.

Using a different email format, like these (let's say your email is ajskdhqjeakjbdia@gmail.com):

Email: ajskdhqjeakjbdia (@gmail)

Email: ajskdhqjeakjbdia at gmail dot com
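
If you want to generate these two formats instead of typing them by hand, here is a small sketch. The function name obfuscate and the EMAIL_RE pattern are my own assumptions for illustration; the check at the end only shows that one simple "@" pattern no longer matches, not that every crawler is fooled.

```python
import re

# A simple address pattern, standing in for what a crawler might use.
EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')

def obfuscate(email):
    """Return the two human-readable disguises shown above."""
    user, domain = email.split("@", 1)
    name = domain.split(".", 1)[0]  # "gmail" from "gmail.com"
    bracket_form = f"{user} (@{name})"
    spelled_form = f"{user} at {domain.replace('.', ' dot ')}"
    return bracket_form, spelled_form

for text in obfuscate("ajskdhqjeakjbdia@gmail.com"):
    # The second value printed is True when the simple pattern finds nothing.
    print("Email:", text, "|", EMAIL_RE.search(text) is None)
```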

Or they use an encoding, like this

ajskdhqjeakjbdia@gmail.com
(It looks normal, but it is actually encoded)

Encoders like this can easily be found with Google. I'm suggesting the simplest one, taken from this website: http://www.web-designz.com/tools/email_encoder.shtml
Note: The simplest method might also mean it is easily detected by the crawler
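
To see why the note matters, here is a sketch of a slightly smarter crawler. It assumes the address was written as HTML character references (one common way such encoders work) and that the crawler decodes the page before searching. The address and the EMAIL_RE pattern are made up for the example.

```python
import html
import re

EMAIL_RE = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')

# Page source with every character written as a decimal character reference.
source = "".join(f"&#{ord(ch)};" for ch in "someone@example.com")

naive = EMAIL_RE.findall(source)                 # searches the raw source
smart = EMAIL_RE.findall(html.unescape(source))  # decodes references first
print(naive)
print(smart)
```

The naive search finds nothing, but the one that decodes first gets the address back, so this kind of encoding only stops crawlers that do not bother to decode.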

If you are a programmer and wondering how the website does it, the answer is that it uses a convertToUnicode(...) function

If you view the page source, it won't show an "@" or your email at all, because it is encoded. And of course the crawler is a system: it reads the page source, not the rendered page the way you do (using a web browser)
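
For the curious, here is a rough Python sketch of this kind of encoding. It is not the website's actual convertToUnicode(...) code, which I have not reproduced here; encode_email is a hypothetical name, and I am assuming the output is plain decimal HTML character references.

```python
import html

def encode_email(email):
    """Write every character as a decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in email)

encoded = encode_email("ajskdhqjeakjbdia@gmail.com")
print(encoded)                 # what appears in the page source
print("@" in encoded)          # False: no "@" for a simple crawler to find
print(html.unescape(encoded))  # what the browser displays to a reader
```

You would paste the encoded text into your blog's HTML in place of the plain address; the browser turns it back into the readable email when it shows the page.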

-End-
