TIA for info
The technique is commonly called "referral log spamming". Someone is hired by a website to get links to that site into the weblogs and web stats of other sites (like mine). This generates hits (curious webmasters like myself will wonder who that referrer is and click the link) or inbound links in blogs, which in turn will, e.g., increase search engine rankings.
This "someone", i.e. the spammer, uses a spider that is often disguised as a search engine spider. AFAIK there's no way to grab the IP the spider is really coming from (the spammer's), and there's no point in blocking the domains of the spammer's clients.
The above is what I understand. I might be wrong here and there, but basically that's it.
You can Google "referral log spamming" and you'll get a few hits.
So, I still wonder how to configure robots.txt to allow certain spiders but block the rest.
The way to actually block certain types of visitors from your site is to use the .htaccess file and identify them by User Agent or IP address (or various other parameters). There are literally thousands of references to this technique in the Apache Web Server forum.
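For what it's worth, here is a minimal sketch of what such a block might look like, assuming mod_rewrite is enabled and your host allows it in .htaccess. The referrer domain `spam-example.invalid` and the User Agent string `BadBotExample` are made-up placeholders, not real offenders; substitute whatever shows up in your own logs.

```apache
# .htaccess in the site root (sketch; both patterns below are hypothetical)
RewriteEngine On

# Condition 1: the Referer header points at the placeholder spam domain
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?spam-example\.invalid [NC,OR]

# Condition 2: the User-Agent header contains the placeholder bot name
RewriteCond %{HTTP_USER_AGENT} BadBotExample [NC]

# If either condition matches, answer 403 Forbidden for any URL
RewriteRule ^ - [F]
```

Keep in mind that a 403 only refuses to serve the page. The request itself is still written to the access log, referrer included.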
The purpose of log spamming is not to lure curious webmasters into clicking on a link, but to get the spammer's link into your log stats and then indexed by Google, which gives their site another inbound link and a higher PR.
The spammer's URL will get into your log stats whether you block them (403) or not. So, as I said above, the only way to stop them is to take away the incentive and make your log stats inaccessible to the general public (which can also be done using .htaccess).
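One way this might be done is with HTTP Basic authentication on the directory that holds the stats pages. This is only a sketch under assumptions: the password file path `/home/example/.htpasswds/stats.passwd` is hypothetical, the file has to be created first with Apache's `htpasswd` utility, and your host must allow authentication directives in .htaccess.

```apache
# .htaccess placed inside the stats directory (sketch)
# Ask visitors for a username and password before serving anything here
AuthType Basic
AuthName "Site statistics"

# Hypothetical path; keep the password file outside the public web root
AuthUserFile /home/example/.htpasswds/stats.passwd

# Any user listed in the password file may log in
Require valid-user
```

Once the stats pages sit behind a login, search engine spiders can no longer fetch them, so the spammed referrer links never get indexed.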
I haven't a clue how to make them inaccessible to the public. As a matter of fact, I took for granted that they were.
I'm on a *nix box with Apache and cPanel. Can you tell from that how I would do it, and what unpleasant consequences it might have, if any?
Appreciate your help!
I would imagine that most logfile analysis programs are configurable in this way.
I presume that this is what Brett was describing.