
Forum Moderators: goodroi


Google follows disallowed links

Have I done something wrong?



4:25 pm on Jan 11, 2005 (gmt 0)

10+ Year Member

I am running a forum and do not want search engines (SEs) to follow outbound links.

In an attempt to stop this, I have replaced all outbound links as follows:


I have created the following robots.txt:

User-agent: *
Disallow: /go.php

After all this, Google is still passing PR and listing my site as a backlink for the recipients.

Have I done something wrong or do I need to encrypt the URLs?

Best regards


11:03 am on Jan 12, 2005 (gmt 0)

10+ Year Member

I have noticed that Google is ignoring my robots.txt as well.

A few months ago I noticed that Google was spidering many instances of a form on my site, e.g. Google had form.asp?id=2, form.asp?id=3, etc.

My robots.txt now reads:
User-agent: *
Disallow: form.asp

However, more than six months later, and after Google has come back and re-cached the page several times, Google is still including the page.

A clarification from Google would be nice.


12:41 am on Jan 15, 2005 (gmt 0)

10+ Year Member

"After all this, Google is passing PR and listing my site as backlinks for the recipients."

The purpose of a robots.txt file is to ask robots to not download files.

Was your "go.php" page actually downloaded and indexed by Google? If not, Google has followed your robots.txt directives.

"Disallow: form.asp"

That should be "Disallow: /form.asp".

Try to analyze your robots.txt files with a robots.txt validator.
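If no online validator is handy, a rough local check is possible. The sketch below uses Python's standard urllib.robotparser module to compare the two rules discussed above. The helper name is_blocked and the "Googlebot" user-agent string are only illustrative assumptions, and Python's parser is not Google's, so treat the result as a sanity check rather than proof of how any real crawler behaves.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, path, user_agent="Googlebot"):
    """Return True if this parser considers `path` disallowed for `user_agent`.

    Hypothetical helper for illustration; it parses the rules from a string
    instead of fetching a live robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


# The rule as originally posted, without the leading slash.
WITHOUT_SLASH = "User-agent: *\nDisallow: form.asp\n"

# The rule as corrected in the reply above.
WITH_SLASH = "User-agent: *\nDisallow: /form.asp\n"

for label, rules in (("without slash", WITHOUT_SLASH), ("with slash", WITH_SLASH)):
    print(label, "-> blocked:", is_blocked(rules, "/form.asp?id=2"))
```

This parser matches rules as simple prefixes of the URL path, and every URL path starts with "/", so a rule written as "form.asp" should never match anything, while "/form.asp" should also cover form.asp?id=2, form.asp?id=3 and so on.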


1:59 pm on Jan 17, 2005 (gmt 0)

10+ Year Member

Hi LowLevel

Thanks for your advice regarding the use of "/" before the filename.

Actually, many of the "authority" sites on robots.txt do not mention the need for a preceding forward slash, and my robots.txt file seems to validate okay without it.

Anyway, I have updated my robots.txt with your suggestion, so hopefully the "/" will do the trick.


4:54 pm on Jan 18, 2005 (gmt 0)

10+ Year Member

Hello, I am having the same problem, but with a cgi-bin folder: Google is indexing its query results. I did a site check on Google and am certain that it has indexed these, which I do not want it to do.

Here is my current robots.txt file:

User-agent: *
Disallow: /cgi-bin/

Should I specify the Googlebot as well? Thanks!
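On the question of naming Googlebot explicitly: under the robots exclusion convention, a "User-agent: *" group applies to any crawler that has no more specific group of its own. The sketch below checks this with Python's standard urllib.robotparser. The helper is_allowed, the bot name "SomeOtherBot" and the path /cgi-bin/search.cgi are made-up examples, and this shows the Python parser's reading of the file, not a guarantee about Google's.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, path):
    """Hypothetical helper: True if this parser lets `user_agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The file as posted: a single catch-all group.
ROBOTS = """\
User-agent: *
Disallow: /cgi-bin/
"""

# Variant with a separate Googlebot group that disallows nothing.
ROBOTS_WITH_GOOGLEBOT_GROUP = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /cgi-bin/
"""

query_url = "/cgi-bin/search.cgi?q=test"  # made-up example path

for agent in ("Googlebot", "SomeOtherBot"):
    print(agent, "catch-all only:", is_allowed(ROBOTS, agent, query_url))
    print(agent, "own group added:", is_allowed(ROBOTS_WITH_GOOGLEBOT_GROUP, agent, query_url))
```

In this parser the catch-all group already covers Googlebot. Adding a separate Googlebot group only changes things if its rules differ, because a crawler that finds its own group uses that group instead of the "*" one.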

