Forum Moderators: goodroi
--- robots.txt ---
User-agent: googlebot
Disallow: CCS/PFCatalog.php
User-agent: googlebot
Disallow: PCR/Catalog/PFCatalog.php
--- from log ---
XX.68.82.144 - - [15/Jan/2004:06:17:28] "GET /http://www.******.com/CCS/PFCatalog.php?Cat=90 HTTP/1.1 " 200 200 "None" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
XX.68.82.181 - - [15/Jan/2004:06:25:57] "GET /http://www.******.com/CCS/PFCatalog.php?Cat=91 HTTP/1.1 " 200 200 "None" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
[edited by: DaveAtIFG at 4:58 pm (utc) on Jan. 15, 2004]
[edit reason] "Neutered" IPs [/edit]
1. Your robots.txt is unusual in that it has more than one record (group of lines separated by a blank line) for the one user-agent googlebot. It might well be that some robots interpret the file in a way that is not what was intended, e.g. only using the last record, or the first record, for Googlebot. Does your robots.txt include still more records for Googlebot?
2. The robots.txt specification [robotstxt.org] says:
Disallow: The value of this field specifies a partial URL that is not to be visited. This can be a full path, or a partial path; any URL that starts with this value will not be retrieved. For example, Disallow: /help disallows both /help.html and /help/index.html, whereas Disallow: /help/ would disallow /help/index.html but allow /help.html.
The requested URL path /CCS/PFCatalog.php?Cat=90 does not begin with CCS/PFCatalog.php; it begins with /CCS/PFCatalog.php. Since matching is a simple prefix comparison, your Disallow values (which lack the leading slash) never match anything, and Googlebot is free to fetch those pages.
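To make the prefix rule concrete, here is a minimal sketch (not how any particular robot is actually implemented) of the matching the spec describes — a path is blocked only when it literally starts with a Disallow value:

```python
def is_disallowed(path, disallow_values):
    """Return True if the request path starts with any Disallow value."""
    return any(path.startswith(value) for value in disallow_values if value)

# Without the leading slash the prefix never matches the request path:
print(is_disallowed("/CCS/PFCatalog.php?Cat=90", ["CCS/PFCatalog.php"]))   # → False
# With the leading slash it does:
print(is_disallowed("/CCS/PFCatalog.php?Cat=90", ["/CCS/PFCatalog.php"]))  # → True
```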
You might try whether the following works:
User-agent: *
# insert paths to be disallowed for all spiders here

User-agent: googlebot
Disallow: /CCS/PFCatalog.php
Disallow: /PCR/Catalog/PFCatalog.php
# insert other paths to be disallowed for Googlebot here
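As a sanity check, you can run the corrected Disallow lines (with the leading slashes) through Python's standard urllib.robotparser and confirm the logged URL would now be blocked. The /CCS/other.php URL below is just a hypothetical path with no matching rule:

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: googlebot
Disallow: /CCS/PFCatalog.php
Disallow: /PCR/Catalog/PFCatalog.php
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# The URL Googlebot fetched in the log is now disallowed:
print(rp.can_fetch("googlebot", "/CCS/PFCatalog.php?Cat=90"))  # → False
# A path with no matching Disallow rule remains fetchable:
print(rp.can_fetch("googlebot", "/CCS/other.php"))             # → True
```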