Forum Moderators: phranque

How to ban search engine robots from following links?

         

hostlead

3:00 am on May 31, 2006 (gmt 0)

10+ Year Member



Hello,

I would like to make my statistics more accurate by stopping search engine robots from clicking banners. I already know which IP ranges to block.
Is there an effective way to do this?
I think it may be possible with .htaccess, but I am not sure how, because the link is not in a folder such as
http://www.example.com/ads/?click&id=1&pl=1
but like this:
http://www.example.com/?click&id=1&pl=1
http://www.example.com/?click&id=2&pl=1
http://www.example.com/?click&id=3&pl=1

Is it possible to block robots from following any link that looks like this?
http://www.example.com/?click

Obviously I don't want to ban the robots from the entire site.

HL

[edited by: engine at 3:16 pm (utc) on June 1, 2006]
[edit reason] examplified and de-linked [/edit]

jdMorgan

6:56 pm on May 31, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is the "domain.com" in your examples above *your* domain name, or the ad provider's domain name?

If it's yours, then you can use robots.txt for Google; they support including "?" in Disallow directives. For others, you could rewrite the request to a 'harmless' internal path -- one that won't affect your stats.
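As a rough illustration of the second idea, here is a minimal Python sketch of the decision involved: if a banner-click URL is requested from a known robot IP range, hand back a harmless path instead. This is an application-level stand-in, not the server rewrite rule itself. The names `ROBOT_NETWORKS`, `HARMLESS_PATH` and `route_request` are made up for the example, and the IP ranges are documentation placeholders, not real crawler ranges.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks); substitute the
# robot IP ranges you have already identified.
ROBOT_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical page that the statistics script does not count.
HARMLESS_PATH = "/robots-click.html"


def route_request(path_and_query: str, remote_ip: str) -> str:
    """Return the path to serve for this request.

    Banner clicks (anything starting with "/?click") coming from a
    listed robot range are sent to HARMLESS_PATH; everything else is
    returned unchanged, so robots can still crawl the rest of the site.
    """
    if not path_and_query.startswith("/?click"):
        return path_and_query
    address = ipaddress.ip_address(remote_ip)
    if any(address in network for network in ROBOT_NETWORKS):
        return HARMLESS_PATH
    return path_and_query


print(route_request("/?click&id=1&pl=1", "192.0.2.15"))
print(route_request("/?click&id=1&pl=1", "203.0.113.9"))
```

The first call comes from a listed range and is diverted; the second is an ordinary visitor and keeps the original click URL, so only robot clicks drop out of the stats.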

Jim

hostlead

8:13 pm on May 31, 2006 (gmt 0)

10+ Year Member



Domain.com is "my" domain. Ads are served on the same domain.

How would I do this? Can you maybe point me to a tutorial or guide?
Thanks.

HL

hostlead

2:47 pm on Jun 1, 2006 (gmt 0)

10+ Year Member



Would this work? Or would it only block Google's access to one or more folders?

User-agent: Googlebot
Disallow: /*click

jdMorgan

2:58 pm on Jun 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This should help:

Google robots.txt info [google.com]

Jim

hostlead

3:14 pm on Jun 2, 2006 (gmt 0)

10+ Year Member



I tried the following, but no luck:

User-agent: Googlebot
Disallow: /?click*

Can anybody point me in the right direction?

BananaFish

4:12 am on Jun 3, 2006 (gmt 0)

10+ Year Member



I've found that with a robots.txt file, you'll get more than a handful of bots hitting the disallowed pages, looking for vulnerabilities and whatnot.

hostlead

9:58 am on Jun 3, 2006 (gmt 0)

10+ Year Member



What do you recommend doing instead?

HL

hostlead

1:56 am on Jun 4, 2006 (gmt 0)

10+ Year Member



I figured it out.

User-agent: Googlebot
Disallow: /?click

The best way I found to test your robots.txt is to use the Google Sitemaps tool.
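For a quick offline check, the matching can also be approximated in a few lines of Python. This is only a sketch under assumed rules: a Disallow value is a prefix match against the path plus query string, "*" matches any run of characters, a trailing "$" anchors the end, and "?" is a literal character. The function name `rule_matches` is made up for the example, and real crawlers may differ in the details.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a Disallow pattern matches a URL path + query.

    Assumed semantics: prefix match from the start of the path,
    '*' is a wildcard, a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece ('?' is not a wildcard here) and
    # join the pieces with '.*' where the pattern had a '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start only, which gives the prefix behaviour.
    return re.match(regex, path) is not None


banner = "/?click&id=1&pl=1"
other_page = "/articles/click-rates.html"
for pattern in ("/?click", "/?click*", "/*click"):
    print(pattern, rule_matches(pattern, banner), rule_matches(pattern, other_page))
```

Under these assumed rules all three patterns from this thread match the banner link, but "/*click" also matches unrelated URLs that merely contain "click", so the plain prefix "/?click" is the narrowest choice. None of them matches "/" or "/index.html", so the rest of the site stays crawlable.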

HL