Robots.txt Question

Will this cover it?

rover

12:21 am on Mar 25, 2005 (gmt 0)

10+ Year Member



I have the file abc.php in our main directory. I want to keep the spiders away from abc.php even when it is requested with query-string variables. For example:

example.com/abc.php?ID=1
example.com/abc.php?ID=2
example.com/abc.php?ID=3
etc.

Will simply specifying abc.php as below cover all of the URLs above as well?

--------------

User-agent: *
Disallow: /abc.php

---------------

Lord Majestic

12:28 am on Mar 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



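Yes, it will be fine. Disallow matches by prefix, so /abc.php also covers any URL that begins with /abc.php, including those with a query string.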
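
To illustrate why, here is a minimal sketch of the prefix check that the original robots.txt convention describes. The helper name isDisallowed is hypothetical and not part of any library, and the sketch ignores the wildcard and Allow extensions that some crawlers support.

```javascript
// Simplified sketch of robots.txt prefix matching (hypothetical helper, not a full parser).
// A URL path is treated as blocked when it starts with any Disallow value.
function isDisallowed(pathWithQuery, disallowRules) {
  return disallowRules.some(function (rule) {
    // An empty Disallow value means "allow everything", so it never blocks.
    return rule !== "" && pathWithQuery.startsWith(rule);
  });
}

var rules = ["/abc.php"];
console.log(isDisallowed("/abc.php?ID=1", rules)); // true
console.log(isDisallowed("/abc.php", rules));      // true
console.log(isDisallowed("/index.php", rules));    // false
```

Because the match is a plain prefix test, the same rule would also block a path such as /abc.php5, which is worth keeping in mind when choosing file names.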

rover

5:16 pm on Mar 25, 2005 (gmt 0)

10+ Year Member



Thanks, I thought it would be fine, too.

I use the PHP file abc.php to track clickthroughs for links, but I want to avoid counting any 'clickthroughs' from search engine spiders.

So I created JavaScript and links as follows (much like Google does):

<script>
function clk(n) { if(document.images){ (new Image()).src="/abc.php?ID="+n; } return true;}
</script>

Sample link:

<a href="http://www.example.com" onmousedown="return clk(123)">

The abc.php file increments a hit counter when it is activated with an ID (e.g. abc.php?ID=123). It also tracks the host.
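
The counting logic described above can be sketched roughly as follows. This is an in-memory JavaScript stand-in, not the actual abc.php (which is PHP and not shown here); the names hitCounts, hitLog, and recordHit are assumptions made for illustration, and the host values are placeholder addresses.

```javascript
// Hypothetical in-memory stand-in for the abc.php counter (the real script is not shown).
var hitCounts = {}; // ID -> number of hits recorded so far
var hitLog = [];    // one entry per hit, recording the requesting host

function recordHit(id, host) {
  // Increment the counter for this ID, starting from zero on first use.
  hitCounts[id] = (hitCounts[id] || 0) + 1;
  // Keep the host so hits can later be attributed to a visitor or a crawler.
  hitLog.push({ id: id, host: host });
  return hitCounts[id];
}

recordHit(123, "198.51.100.7");
recordHit(123, "203.0.113.9");
console.log(hitCounts[123]); // 2
```

Recording the host alongside each hit is what makes it possible to see which increments came from a crawler's address.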

I was crawled by msn.bot and Yahoo Slurp 12 hours after I added the new robots.txt file, but I can see that abc.php has been incrementing the hit counter even when msn.bot or Slurp is following the links.

I thought that excluding them from abc.php would avoid this situation.

Is it possible that these spiders actually do trigger the JavaScript 'onmousedown' event?

Lord Majestic

5:40 pm on Mar 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I was crawled by msn.bot and Yahoo Slurp 12 hours after I added the new robots.txt file

It's possible that it takes some time for your robots.txt changes to take effect, possibly over 24 hours.

I don't think their bots would follow JavaScript events.