Out of the corner of my eye

Non Sibi, Sed Omnibus


Email obfuscation does not work, and is easily broken

December 11th, 2007 · 2 Comments

I’ve read plenty of articles about email obfuscation over the years. Until I got an email address with decent spam protection, I occasionally made a half-arsed effort at it too: the old “davedx at gmail dot com”. That was before I knew much about how powerful web scripting had become.

Email obfuscation is rubbish. Given time, I expect any obfuscation technique to be added to the scripts that harvester bots use to build the email databases sold to spammers. The only methods I believe prevent most spam are:

1. A decent spam filter

2. CAPTCHA (for web form/comment spam)

Here’s a script I wrote. It took me about 30 minutes of googling to find the most popular obfuscation techniques (the ones most people link to, which pushes them to the top of the search results, and which those people are therefore probably using themselves), then 1 hour 15 minutes to write the actual code.

Email obfuscation breaker (rename to .php)

Here’s a testbed, proving it works.

The hardest part was plugging in the JavaScript interpreter, though none of it was rocket science.
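To give a feel for how little code the simplest cases need, here is a minimal sketch in Python. It is not the PHP script linked above; it only assumes two text-level techniques (spelled-out "at"/"dot" and HTML character entities), and the names `deobfuscate` and `harvest` are made up for illustration. JavaScript-based obfuscation is not handled here.

```python
import html
import re

# Hypothetical patterns for "at"/"dot" written as words or in brackets.
AT = r"(?:\s+at\s+|\s*[\[(]\s*at\s*[\])]\s*)"
DOT = r"(?:\s+dot\s+|\s*[\[(]\s*dot\s*[\])]\s*)"

# local part, an "at" marker, then a domain containing at least one "dot" marker
SPELLED_OUT = re.compile(
    rf"([\w.+-]+){AT}([\w-]+(?:{DOT}[\w-]+)+)", re.IGNORECASE
)
DOT_RE = re.compile(DOT, re.IGNORECASE)

# Deliberately loose address pattern; good enough for a demonstration.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def deobfuscate(text):
    """Undo entity encoding and spelled-out 'at'/'dot' markers."""
    # html.unescape decodes decimal, hexadecimal, and named entities.
    text = html.unescape(text)

    def join(match):
        domain = DOT_RE.sub(".", match.group(2))
        return f"{match.group(1)}@{domain}"

    return SPELLED_OUT.sub(join, text)


def harvest(text):
    """Return every address found after de-obfuscation."""
    return EMAIL.findall(deobfuscate(text))


print(harvest("someone at example dot com"))  # ['someone@example.com']
```

The pattern insists on at least one "dot" marker after the "at", which keeps ordinary sentences such as "look at this" from being rewritten. A real harvester would presumably carry many more patterns than this.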

If your email doesn’t have decent spam filtering, don’t post it on public web sites.

Tags: Web dev

2 responses so far ↓

  • 1 tekkie // Sep 16, 2008 at 5:06 pm

    Considering the load of JavaScript on the web, no harvester can be made to scan all of it. It’s plain madness and no bot is up to it. It would then be much more effective to use an OCR-based bot.

    And the sample that your script breaks is a rather trivial form of obfuscation.

    Finally, for those of you not familiar with the statistics, please take a look at the chart here.

    For Mac OS X users there’s a convenient Dashboard widget called obfuscatr. It provides JavaScript or just plain hexadecimal encoding (not so effective, as the above obfuscation breaker also demonstrates) of your email addy.
    See the details at flash tekkie.

    obfuscatr was also featured in MacWorld Italy of March 2008.

  • 2 admin // Oct 3, 2008 at 3:25 pm

    Obviously your bot wouldn’t execute every line of JavaScript it found. If you were going to write a harvester, you’d do what I did and target specific obfuscation methods and only run your VM when you find a match. Considering people are most likely to use obfuscation methods that are listed high in the SERPs and/or from respected sources, I bet a determined bot writer could cover the majority that are in actual use.

    My point is basically about economics. Unlike with a CAPTCHA, you don’t need to spend months of R&D to break simple email address obfuscation. It takes a couple of hours to code the solution, while blog authors, etc. are spending almost as long implementing these things.

    Finally, the chart is interesting, but those stats are from one single page on the web.
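
The targeted approach described in the reply above (match known obfuscation snippets first, and only hand those to an interpreter) could be sketched roughly as follows. This is an illustration in Python, not the original script: the signature list and the name `scripts_worth_running` are hypothetical, and no JavaScript engine is actually invoked here.

```python
import re

# Pulls the body out of each inline <script> element.
SCRIPT_BODY = re.compile(
    r"<script\b[^>]*>(.*?)</script>", re.IGNORECASE | re.DOTALL
)

# Hypothetical signatures of calls that popular obfuscation snippets
# tend to rely on; a real harvester would tune this list.
SIGNATURES = [
    re.compile(r"document\.write\s*\("),
    re.compile(r"String\.fromCharCode\s*\("),
    re.compile(r"\bunescape\s*\("),
]


def scripts_worth_running(page):
    """Return only the inline scripts that match a known signature."""
    return [
        body
        for body in SCRIPT_BODY.findall(page)
        if any(sig.search(body) for sig in SIGNATURES)
    ]
```

Only the scripts this filter returns would be passed to a JavaScript interpreter, so the expensive step runs on a small fraction of what the bot crawls.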
