Spam sites & forums using homepage text

         

kevsta

7:19 am on Sep 4, 2009 (gmt 0)

10+ Year Member



Over the last few weeks I've started noticing many (like 20 or 30) different spammy-looking Chinese sites, mostly forums, popping up in the SERPs for various non-important queries. They have all taken the full homepage text of one of our sites, including all nav links etc., and put it up on an inner page surrounded by Chinese nonsense.

They don't link to us, and if they weren't showing up in SERPs all over the place, looking like we had something to do with it, I wouldn't care. But they are, and it doesn't make us look good to have our site at the top and maybe 5 or 6 spammy clones of the same text below us in the SERP.

Does anyone have any opinions / experience on what these people might be trying to achieve?

And if it were you, what would you do about it, if anything?

tedster

9:02 pm on Sep 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"various non-important queries"

As long as the situation stays like that, non-important, I would spend my efforts in other areas and ignore this.

kevsta

11:28 pm on Sep 4, 2009 (gmt 0)

10+ Year Member



Actually, I just looked, and there's one of them at #9 for the company name on Google (old) and at #11, 12, 14 & 15 on Caffeine.

signor_john

11:32 pm on Sep 4, 2009 (gmt 0)



You can always file a DMCA complaint about their use of your home page's text without permission. But that's a whole different issue from their rankings, which are well behind yours.

jd01

11:39 pm on Sep 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would start off simple and spend a few minutes (hours) learning how to block known scrapers with an .htaccess file and PHP...

The rule set further down is a bit out of date, but it blocks a number of known scrapers with mod_rewrite... There are also a number of 'bad robot' and 'honey pot' blocks posted here @ WebmasterWorld you can learn how to install on a site...

If you don't want it to happen when it's important, I would take the time to learn how to ban and block access now when it's not, because it's a very good skill to have if you're going to 'live' off the Internet...

I have found it to be well worth the time invested, because I can make it very difficult for you to scrape one of my sites and reproduce it. (Is it possible? Sure, if someone is determined enough they can, but I do my best to give you a headache if you bother to try!)

Here's the 'out of date' Mod_Rewrite I have on one... Use at your own risk!

RewriteEngine on

### SCRAPER BANS & BLOCKS ###
### Each condition matches a fragment anywhere in the User-Agent string.
### [NC] = case-insensitive, [OR] = chain with the next condition.
RewriteCond %{HTTP_USER_AGENT} a((ip)?bot|lexfDownload|mzn_assoc|SPSeek) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} c(herry|on(tentSmartz|veras)|rescent) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} d(um|II|ataCha) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} e(asyDL|-?mail|x(abot|tractorPro)) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} foobot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} g(i(gabaz|joel)|rub) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} h(atena|tt(pdown|rack)) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} i(EAuto|ndy.?Library) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} l(arbin|exiBot|ink(.?walker)?|mcrawler|ocator) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} m(-crawl|j12bot|i(crosoft\.URL|ssigua)|ogren|SProxy|orpheus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} n(etMechanic|ICErsPRO) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} o(penfind|ffline|mni-?Explorer) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} p(hpcrawl|ingALink|sbot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} r(obot|ufus) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} s(chmozilla|earchIt|eek(bot|er)|ogou|proose|imple|l(eipnir|ySearch)|weeper|zukacz) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} t(eleport|ScholarsBot) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} urlSpiderPro [NC,OR]
RewriteCond %{HTTP_USER_AGENT} voyager [NC,OR]
RewriteCond %{HTTP_USER_AGENT} w(eb(Account|Capt|Copier|rank|Whack|Strip|Zip|ster|bandit)|get) [NC,OR]
### Last condition in the OR chain: a User-Agent value that itself starts with "User-Agent"
RewriteCond %{HTTP_USER_AGENT} ^User-Agent [NC]
### Exempt these crawlers even if one of the fragments above matches
RewriteCond %{HTTP_USER_AGENT} !(Giga(blast|bot)|Walhello|inktomi|teoma) [NC]
### [G] answers every matching request with 410 Gone
RewriteRule .? - [G]

### ADDED: The alternation character must be the plain vertical bar (|). If your copy shows a broken bar (¦) instead, replace it or the rules will not work!
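
User-agent rules only catch scrapers that identify themselves honestly. As a rough sketch of the same approach applied to addresses instead, the rules below refuse requests by IP. The addresses are placeholders taken from the reserved documentation ranges (192.0.2.0/24 and 203.0.113.0/24), not real scraper addresses; they are assumed purely for illustration, and you would substitute whatever turns up in your own access logs.

### IP BLOCKS (placeholder addresses, assumed for illustration) ###
RewriteEngine on
### Every address from 192.0.2.0 to 192.0.2.255
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\. [OR]
### One single address
RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.45$
### [F] answers every matching request with 403 Forbidden
RewriteRule .? - [F]

Blocking by address is blunt: shared and reassigned IPs can lock out legitimate visitors, so a list like this probably needs reviewing from time to time.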