Forum Moderators: goodroi


Wildcard disallow for pages with variables

Would this work for pages ending with ?keep= : Disallow: *keep=* ?

         

The_Rabbit

5:10 am on Dec 3, 2003 (gmt 0)



Hi folks!

This is the first post I dare to make at this awesome forum ;)

Short problem description:
In the past my site used a timestamp as a URL variable. I changed the system, and those URLs now all redirect (301) to a static page.
However, Google is still going amok and is spidering hundreds of "clones":

"GET /publications/publications_body.html?keep=20:10 HTTP/1.0"
"GET /publications/publications_body.html?keep=22:14 HTTP/1.0"
"GET /publications/publications_body.html?keep=04:28 HTTP/1.0"

I did read someplace else that this should work:

Disallow: *keep=*

Now my robots.txt looks like this (in short):

User-agent: *
Disallow: /pics/
Disallow: /images/
Disallow: /php/
Disallow:

User-agent: googlebot
Disallow: /pics/
Disallow: /images/
Disallow: /php/
Disallow: *keep=*
Disallow:

I am wondering if I missed something, and whether Google will be able to handle that wildcard construct.
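For anyone wanting to sanity-check a pattern before deploying it: Google's documented extension treats `*` as matching any sequence of characters and `$` as anchoring the end of the URL. Below is a minimal sketch (not Googlebot's actual implementation) of that matching logic, tested against the URLs from the access log above. Note that Google's examples write patterns starting with `/`, so `/*keep=` is the conventional form; a trailing `*` is redundant because rules already match as prefixes.

```python
import re

def robots_pattern_matches(pattern: str, path: str) -> bool:
    """Approximate Google-style robots.txt rule matching:
    '*' matches any run of characters, '$' anchors the end,
    and rules otherwise match as prefixes of the URL path."""
    regex = ""
    for ch in pattern:
        if ch == "*":
            regex += ".*"
        elif ch == "$":
            regex += "$"
        else:
            regex += re.escape(ch)
    return re.match(regex, path) is not None

# URLs taken from the access-log lines above, plus the clean page
urls = [
    "/publications/publications_body.html?keep=20:10",
    "/publications/publications_body.html?keep=22:14",
    "/publications/publications_body.html",
]

for u in urls:
    print(u, robots_pattern_matches("/*keep=", u))
```

The clone URLs match while the static page does not, which is the behavior you want from the Disallow rule.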

Would be happy if you could help me out!

Thanks

DaveAtIFG

5:25 pm on Dec 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I did read someplace else that this should work:
Quit reading there! ;) I've never encountered anything even remotely similar...

Google details its proprietary extensions to the robots exclusion protocol here [google.com], in the first question of the Googlebot Technology Questions section.