problem with robots.txt file

raybo75

5:16 pm on Aug 27, 2005 (gmt 0)

10+ Year Member



I appear to have an error in my robots.txt file, but I can't seem to locate it.
Here is my robots.txt file:

User-agent: Googlebot
Disallow: /prodView.asp
Disallow: /prodList.asp?idCategory

Yet I see Googlebot hitting the following pages:

/prodView.asp?idproduct=15
/prodList.asp?idCategory=14

The robots file has been in place for over 1 week and Googlebot hit it for the first time about 1 week ago. Any thoughts?

jdMorgan

5:23 pm on Aug 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There's nothing wrong with what you've posted. Those files should not have been fetched. Is there anything else in your robots.txt file? When Googlebot fetches it, does the number of bytes loaded match the size of your robots.txt file? Is a 200-OK status returned?
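
As a rough sanity check, the posted rules can be tested against the two URLs with plain prefix matching, which is how the basic robots.txt convention compares a Disallow value to a requested path. This is only a sketch: the helper name is_disallowed is made up for illustration, and it does not claim to reproduce Googlebot's actual parser.

```python
# Disallow values copied from the robots.txt posted above.
RULES = ["/prodView.asp", "/prodList.asp?idCategory"]


def is_disallowed(url_path, disallow_rules):
    """Hypothetical helper: True if url_path starts with any Disallow value.

    An empty Disallow value means "allow everything", so it is skipped.
    """
    return any(rule != "" and url_path.startswith(rule) for rule in disallow_rules)


# The two URLs seen in the logs.
for path in ["/prodView.asp?idproduct=15", "/prodList.asp?idCategory=14"]:
    verdict = "blocked" if is_disallowed(path, RULES) else "allowed"
    print(path, "->", verdict)
```

Under that assumption both URLs come out as blocked, which agrees with the point that the rules themselves look fine.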

Jim

raybo75

5:28 pm on Aug 27, 2005 (gmt 0)

10+ Year Member



Status = 200. All looks well on the fetch.

Is robots.txt case sensitive? Does
prodview.asp = ProdView.asp?

Thanks!

jbgilbert

5:30 pm on Aug 27, 2005 (gmt 0)

10+ Year Member



You may want to check this page at Google. [google.com...]

It appears that Googlebot may be the ONLY spider that actually honors wildcards in a robots.txt file.
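
To get a feel for what wildcard support means, here is a minimal sketch that turns a wildcard pattern into a regular expression. It assumes the two extensions Google describes: "*" matches any run of characters and a trailing "$" anchors the end of the URL. The function names and the sample patterns are hypothetical and are not taken from the linked pages.

```python
import re


def pattern_to_regex(pattern):
    """Hypothetical helper: compile a wildcard Disallow pattern to a regex.

    Assumes '*' means "any characters" and a trailing '$' means "end of URL".
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def wildcard_match(url_path, pattern):
    """True if url_path is matched by the wildcard pattern."""
    return pattern_to_regex(pattern).match(url_path) is not None


print(wildcard_match("/prodList.asp?idCategory=14", "/*?idCategory="))
print(wildcard_match("/prodView.asp?idproduct=15", "/*.asp$"))
```

A spider that does not understand wildcards would treat "*" as a literal character, so such a rule would simply never match for it.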

You will also find a thread where somebody has found success using them at [webmasterworld.com...]

Hope this helps.

encyclo

5:36 pm on Aug 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is robots.txt case sensitive?

You should always use the same case in robots.txt as in the links to the pages themselves. I can't see anything particularly wrong with your current syntax.
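
A small sketch of why the case matters, assuming the path in a Disallow line is compared case-sensitively (which is how Google documents it). The helper matches_rule is hypothetical and uses a plain prefix comparison.

```python
def matches_rule(url_path, rule):
    """Hypothetical helper: case-sensitive prefix comparison."""
    return url_path.startswith(rule)


rule = "/prodView.asp"

# Same case as the rule: matched.
print(matches_rule("/prodView.asp?idproduct=15", rule))  # True

# Lower-case "v": not matched, so this URL would not be blocked.
print(matches_rule("/prodview.asp?idproduct=15", rule))  # False
```

If the site links to both spellings, one cautious option under this assumption is to list a Disallow line for each.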

One thing you can do, seeing as you're talking about dynamic pages, is to use a belt and braces approach and include a meta noindex tag on those pages too.

<meta name="robots" content="noindex">
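
To confirm the tag is actually being served on the dynamic pages, something like the following sketch could be run against the generated HTML. The class and function names are made up for illustration, and it uses only Python's standard html.parser module. Keep in mind that a spider can only see the tag on pages it is allowed to fetch.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Hypothetical helper: collects directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Attribute values may be None when written without a value.
        if (attr_map.get("name") or "").lower() == "robots":
            content = attr_map.get("content") or ""
            for directive in content.split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.append(directive)


def has_noindex(html):
    """True if the HTML contains a robots meta tag with a noindex directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


sample = '<html><head><meta name="robots" content="noindex"></head></html>'
print(has_noindex(sample))
```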