Although less common nowadays with the pervasion of social media, you still see people posting their email addresses online and obfuscating them in some way. For example,
[email protected] may be displayed as
[email protected]. The purpose of this obfuscation is to allow regular visitors to learn your email address, but prevent automated bots from harvesting it to send you spam. There are many such techniques — both effective and ineffective — for obfuscating email addresses, and since I couldn’t find any quantifiable evidence of a comparison between the various methods I decided to compile some evidence myself.
In 2010, I set up a web page containing 9
davidlyness.com email addresses — one in plaintext and each other obfuscated using a different method. I then directed each of these email addresses to separate mail accounts, and ensured that no received messages were marked and deleted as spam by the email software. I ensured it was indexed by Google and other search engines by including a hidden link on this website’s homepage and including the page in the sitemap. Then I let two years pass, allowing bots to harvest the email addresses and send spam.
This email address was displayed in plaintext to act as a baseline for the other addresses.
Use [at] and [dot]
Nothing too special going on here — we simply replace the
[at] and the
[dot] as described above.
Also shown above, add a portion of the email address that the sender is expected to know to remove before sending.
Intersperse HTML comments
In HTML, a comment is included using the following syntax:
<!-- this is a comment -->
HTML comments are not parsed by browsers, meaning an email with an embedded comment should display as normal, but may fool a bot attempting to harvest email addresses — it is likely looking at the source code of the page, not the final rendered version. This email address looks like:
<!-- blah -->user<!-- blah -->@<!-- blah -->example<!-- blah -->.<!-- blah -->com
Using a method similar to PHP’s
urlencode function, we can encode each character of the email address as a percentage sign followed by two hexadecimal characters, which is automatically parsed by the browser when the page is rendered. Again, this is obfuscated in the source code but not in the final rendered output. For example,
[email protected] will be encoded as
ROT13 on alphabetic characters
The ROT13 function encodes alphabetic characters by shifting them 13 places. (Forwards or backwards makes no difference as there are 26 characters in total.) So,
[email protected] becomes
var string1, string2, string3; string1 = "user"; string2 = "example"; string3 = "com"; document.write(string1 + "@" + string2 + "." + string3);
Insert element with
display:none CSS property
A span element is added to the email string, similar to the HTML comment method above, but we also add a rule for this element to have the
display:none CSS property. This allows the user to copy the email address without obfuscation, but obfuscates the address at both a source code and non-stylesheet level.
Encoding the email address as a picture
The last method is also a common one, and it is to encode the email address as a picture so that the email address does not exist at all in text form on the page. So, [email protected] would be encoded as follows:
The ROT13 and picture methods of obfuscation were the only two methods that yielded a perfect score of zero spam messages received, which I believe means that the addresses themselves were not harvested. The CSS
display:none method was close behind, with only 36 messages received after a 2 year period. (What may be interesting to note is that all 36 messages came in the last 4 months of the experiment — meaning that spammers likely updated their harvesting software to decode this method.)
If you have to make your email address publicly available, from this experiment it looks like the ROT13 and picture methods of obfuscation are currently the most effective. However, they each have their own disadvantages:
- By placing your email address in a picture, the user can neither copy and paste the email address into their mail client nor click a mailto link for the address, discouraging them from sending an email in the first place. An image also takes up significantly more bandwidth than text, but as the email image above is less than 4KB in size this is not a major concern.
As mentioned at the beginning of the post, the prevalence of social media websites has mitigated the severity of this problem. Sites like Facebook and Twitter take on the responsibility of protecting your mailbox from spam, and you don’t have to deal with obfuscating a link to your profile. Alternatively, you can handle the email delivery yourself using a web-based contact form.