phranque

msg:3817203 | 1:14 am on Jan 1, 2009 (gmt 0) |
if you are trying to exclude only google and ask from those 4 pages, you need this:

    User-agent: Googlebot
    User-agent: Ask Jeeves/Teoma
    Disallow: /my_page_1.htm
    Disallow: /my_page_2.htm
    Disallow: /my_page_3.htm
    Disallow: /my_page_4.htm

in your example, the blank line after the User-agent: list stops that set of exclusions and the wildcard User-agent specification applies to all robots.
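If you want to check rules like these locally, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com host and the "SomeOtherBot" name are made up for illustration, and this parser is only one reading of the rules, so a real crawler may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The Googlebot/Ask rules from the post, one directive per line.
RULES = """\
User-agent: Googlebot
User-agent: Ask Jeeves/Teoma
Disallow: /my_page_1.htm
Disallow: /my_page_2.htm
Disallow: /my_page_3.htm
Disallow: /my_page_4.htm
"""


def can_fetch(rules: str, agent: str, path: str) -> bool:
    """Parse robots.txt text and ask whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # example.com is a placeholder host; only the path is compared.
    return parser.can_fetch(agent, "http://example.com" + path)


# Googlebot is named in the block, so the four pages are off limits to it.
googlebot_blocked = not can_fetch(RULES, "Googlebot", "/my_page_1.htm")
# A robot that is not named (hypothetical name) has no rules applied to it.
other_bot_allowed = can_fetch(RULES, "SomeOtherBot", "/my_page_1.htm")
print(googlebot_blocked, other_bot_allowed)
```

With no User-agent: * section in the file, this parser treats any robot that is not named as unrestricted.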
|
Durnovaria

msg:3817298 | 12:08 pm on Jan 1, 2009 (gmt 0) |
Thanks for your reply :-)

I was trying to follow the other example given and modify it for my needs. The bit at the top with Google and Ask Jeeves in it was a separate section just for their robots. What it was supposed to do was direct Google and Ask Jeeves robots to go to the pages where they would find the "noindex" meta tag, due to the way they apparently log or index pages. The remaining part of the file was for all other robots who apparently wouldn't have a problem excluding the pages in the list.

So what I think I need is an instruction under the Googlebot and Ask Jeeves section (but before the User-agent: * section) to send Google and Ask Jeeves to my pages. Apparently if I don't do that they will just use the normal Disallow list and still log the page URLs.

Mike
|
phranque

msg:3817308 | 12:49 pm on Jan 1, 2009 (gmt 0) |
yes that is slightly different from what you had:

    User-agent: Googlebot
    User-agent: Ask Jeeves/Teoma
    Disallow:

    User-agent: *
    Disallow: /my_page_1.htm
    Disallow: /my_page_2.htm
    Disallow: /my_page_3.htm
    Disallow: /my_page_4.htm

(note the "blank" disallow)
|
Durnovaria

msg:3817315 | 1:26 pm on Jan 1, 2009 (gmt 0) |
Thank you :-) It's all new to me, so I didn't really know how it worked! I have updated my file now, so hopefully that should work. Thanks again, Mike
|
phranque

msg:3817317 | 1:47 pm on Jan 1, 2009 (gmt 0) |
you can validate your rules with GWT: Checking robots.txt - Webmaster Help Center [google.com]
|
Durnovaria

msg:3817333 | 2:52 pm on Jan 1, 2009 (gmt 0) |
I tried my robots.txt file in the Google Webmaster Tools and it said the following:

    Allowed by line 3: Disallow:
    Detected as a directory; specific files may have different restrictions

Also, out of curiosity I tested it on this site as well: [searchenginepromotionhelp.com...]

On that site it said:

    Line/Contents
    1/User-agent: Googlebot
    The line below must be an allow, disallow or comment statement
    2/User-agent: Ask Jeeves/Teoma
    3/Disallow:
    Missing / at start of file or folder name

So it appears not to like Line 1, saying that there should be a comment after it, and doesn't like Line 2, saying that there's a forward slash missing. I've no idea what it's going on about, so I thought I would mention it!

Mike :-)
|
g1smd

msg:3817335 | 2:58 pm on Jan 1, 2009 (gmt 0) |
Some parsers cannot cope with multiple User-agent: lines preceding the Disallow: statement(s).

There must be one or more Disallow: statements after the User-agent: line(s). There must be a blank line after the last Disallow: statement of each block (i.e. before the next User-agent: line).

If there is a specific section for Google then it reads only that section of the file. That is, it does NOT read the User-agent: * section at all.

This is the correct syntax if everything is allowed:

    Disallow:

If a checker says otherwise, then it is the checker that is faulty.
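To illustrate the point about a robot reading only its own section, here is a rough sketch with Python's standard urllib.robotparser. It feeds in the empty-Disallow version of the file from this thread; the example.com URL and the "SomeOtherBot" name are assumptions for the demo, and this is just one parser's behaviour, not a statement about how Google itself works.

```python
from urllib.robotparser import RobotFileParser

# Empty Disallow for the named robots, page exclusions for everyone else.
RULES = """\
User-agent: Googlebot
User-agent: Ask Jeeves/Teoma
Disallow:

User-agent: *
Disallow: /my_page_1.htm
Disallow: /my_page_2.htm
Disallow: /my_page_3.htm
Disallow: /my_page_4.htm
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Placeholder URL; only the path part is matched against the rules.
page = "http://example.com/my_page_1.htm"

# Googlebot has its own section, so the * exclusions are not applied to it.
googlebot_ok = parser.can_fetch("Googlebot", page)
# A robot with no section of its own (hypothetical name) falls back to *.
other_ok = parser.can_fetch("SomeOtherBot", page)
print(googlebot_ok, other_ok)
```

In this parser the empty Disallow: is read as "allow everything", which matches the syntax described above.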
|
Durnovaria

msg:3817344 | 3:17 pm on Jan 1, 2009 (gmt 0) |
Okay, thanks. :-) I didn't doubt that what I was told here was correct, but I did wonder why that checker came up with those comments! Mike
|
phranque

msg:3817702 | 7:24 am on Jan 2, 2009 (gmt 0) |
so try this then:

    User-agent: Googlebot
    Disallow:

    User-agent: Ask Jeeves/Teoma
    Disallow:

    User-agent: *
    Disallow: /my_page_1.htm
    Disallow: /my_page_2.htm
    Disallow: /my_page_3.htm
    Disallow: /my_page_4.htm
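As a sanity check that splitting the User-agent: lines into separate blocks should not change the outcome, here is a small sketch that runs both layouts through Python's standard urllib.robotparser and compares the answers. The example.com host, "SomeOtherBot" and /index.htm are made up for the test. Only Googlebot and a generic robot are compared, since agent names containing a slash may be matched differently from one parser to another.

```python
from urllib.robotparser import RobotFileParser

PAGES = ["/my_page_%d.htm" % n for n in range(1, 5)]
DISALLOWS = "\n".join("Disallow: " + page for page in PAGES)

# Earlier layout: both robots share one block with an empty Disallow.
GROUPED = (
    "User-agent: Googlebot\n"
    "User-agent: Ask Jeeves/Teoma\n"
    "Disallow:\n"
    "\n"
    "User-agent: *\n" + DISALLOWS
)

# Layout from this post: one User-agent line per block.
SPLIT = (
    "User-agent: Googlebot\n"
    "Disallow:\n"
    "\n"
    "User-agent: Ask Jeeves/Teoma\n"
    "Disallow:\n"
    "\n"
    "User-agent: *\n" + DISALLOWS
)


def verdicts(rules: str) -> dict:
    """Map (agent, path) to the parser's allow/deny answer."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {
        (agent, path): parser.can_fetch(agent, "http://example.com" + path)
        for agent in ("Googlebot", "SomeOtherBot")
        for path in PAGES + ["/index.htm"]
    }


same = verdicts(GROUPED) == verdicts(SPLIT)
print(same)
```

If the two layouts ever disagreed in a given parser, that would point to the parser having trouble with stacked User-agent: lines.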
|
Durnovaria

msg:3817818 | 1:43 pm on Jan 2, 2009 (gmt 0) |
I'm happy with the one you did for me before, Phranque. Again, out of curiosity I ran that latest one through the checking program and it didn't like that either! For lines 2 and 5 it said 'Missing / at start of file or folder name' and for line 11 (the final line) it said 'The line below must be an allow, disallow, comment or a blank line statement.' I don't think I'll be using that checking program again! Mike :-)
|