Forum Moderators: Robert Charlton & goodroi
They don't link to us, and if they weren't showing up in SERPs all over the place, looking like we had something to do with it, I wouldn't care. But they are, and it doesn't make us look good to have our site at the top and maybe 5 or 6 spammy clones of the same text below us in the SERP.
Does anyone have any opinions / experience on what these people might be trying to achieve?
And if it were you, what would you do about it, if anything?
The rule set below is a bit out of date, but it blocks a number of known scrapers with Mod_Rewrite... There are also a number of 'bad robot' and 'honey pot' blocks posted here @ WebmasterWorld that you can learn how to install on a site...
If you don't want it to happen when it's important, I would take the time to learn how to ban and block access now when it's not, because it's a very good skill to have if you're going to 'live' off the Internet...
I have found it to be well worth the time invested, because I can make it very difficult for you to scrape one of my sites and reproduce it. (Is it possible? Sure, if someone is determined enough they can, but I do my best to give you a headache if you bother to try!)
Here's the 'out of date' Mod_Rewrite I have on one... Use at your own risk!
RewriteEngine on
### SCRAPER BANS & BLOCKS ###
RewriteCond %{HTTP_USER_AGENT} a((ip)?bot|lexfDownload|mzn_assoc|SPSeek) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} c(herry|on(tentSmartz|veras)|rescent) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} d(um|II|ataCha) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} e(asyDL|-?mail|x(abot|tractorPro)) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} foobot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} g(i(gabaz|joel)|rub) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} h(atena|tt(pdown|rack)) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} i(EAuto|ndy.?Library) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} l(arbin|exiBot|ink(.?walker)?|mcrawler|ocator) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} m(-crawl|j12bot|i(crosoft\.URL|ssigua)|ogren|SProxy|orpheus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} n(etMechanic|ICErsPRO) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} o(penfind|ffline|omni[-]?Explorer) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} p(hpcrawl|ingALink|sbot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} r(obot|ufus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} s(chmozilla|earchIt|eek(bot|er)|ogou|proose|imple|l(eipnir|ySearch)|weeper|zukacz) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} t(eleport|ScholarsBot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} urlSpiderPro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} voyager [NC,OR]
RewriteCond %{HTTP_USER_AGENT} w(eb(Account|Capt|Copier|rank|Whack|Strip|Zip|ster|bandit)|get) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^User-Agent [NC]
RewriteCond %{HTTP_USER_AGENT} !(Giga(blast|bot)|Walhello|inktomi|teoma) [NC]
RewriteRule .? - [G]
### ADDED: The alternation bars above must be plain | characters. If a broken bar (¦) creeps in when copying, replace it with a Bar or your site will break!
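Before putting rules like these on a live site, it can help to check which user-agent strings they would actually catch. Here is a minimal Python sketch of that idea, not part of the original rules. It copies a few of the alternations from the listing above (written with a plain | bar) and mimics how the conditions combine: any one of the [OR] patterns has to match, and the final "not" condition has to hold as well. The names BLOCK_PATTERNS, ALLOW_PATTERN and is_blocked are made up for illustration, and Python's re module is only assumed to behave like Apache's regex engine for simple patterns like these.

```python
import re

# A few alternations copied from the rules above; add the rest as needed.
BLOCK_PATTERNS = [
    r"a((ip)?bot|lexfDownload|mzn_assoc|SPSeek)",
    r"r(obot|ufus)",
    r"w(eb(Account|Capt|Copier|rank|Whack|Strip|Zip|ster|bandit)|get)",
    r"^User-Agent",
]

# The final negated condition: these agents are let through even if a
# block pattern matches (for example "Gigabot" contains "abot").
ALLOW_PATTERN = r"(Giga(blast|bot)|Walhello|inktomi|teoma)"


def is_blocked(user_agent: str) -> bool:
    """Return True if the rule set would refuse this user agent."""
    # [NC] means case-insensitive, hence re.IGNORECASE.
    hit = any(re.search(p, user_agent, re.IGNORECASE) for p in BLOCK_PATTERNS)
    allowed = re.search(ALLOW_PATTERN, user_agent, re.IGNORECASE) is not None
    # The [OR] chain is evaluated as a group, then ANDed with the negated condition.
    return hit and not allowed


if __name__ == "__main__":
    for ua in ("Wget/1.21", "Gigabot/3.0", "User-Agent: Mozilla/4.0"):
        print(ua, "->", "blocked" if is_blocked(ua) else "allowed")
```

In the Apache version, a blocked request gets the [G] flag, which answers with 410 Gone. Running your own access-log user agents through a check like this first is a cheap way to spot a pattern that is broader than you meant it to be.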