This is the first post I dare to make at this awesome forum ;)
Short problem description:
In the past my site used a timestamp as a variable in its page URLs. I changed the system, and all of those URLs now redirect to a static page (301 status).
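For context, the redirect is set up with rules along these lines (a simplified sketch, assuming Apache mod_rewrite in .htaccess; the real setup may differ):

RewriteEngine On
# Any request for the old URL with a keep= timestamp in the query string...
RewriteCond %{QUERY_STRING} (^|&)keep= [NC]
# ...gets a 301 to the static page; the trailing "?" strips the query string
RewriteRule ^publications/publications_body\.html$ /publications/publications_body.html? [R=301,L]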
However, Google is still running amok and is spidering hundreds of "clones":
"GET /publications/publications_body.html?keep=20:10 HTTP/1.0"
"GET /publications/publications_body.html?keep=22:14 HTTP/1.0"
"GET /publications/publications_body.html?keep=04:28 HTTP/1.0"
I read somewhere else that this should work:
Disallow: *keep=*
Now my robots.txt looks like this (in short):
User-agent: *
Disallow: /pics/
Disallow: /images/
Disallow: /php/
Disallow:
User-agent: googlebot
Disallow: /pics/
Disallow: /images/
Disallow: /php/
Disallow: *keep=*
Disallow:
I am wondering whether I missed something, and whether Google will be able to "handle" that special wildcard construct.
Would be happy if you could help me out!
Thanks
"I read somewhere else that this should work:"

Quit reading there! ;) I've never encountered anything even remotely similar in the standard robots exclusion protocol...
Google details their proprietary extensions to the robots exclusion protocol here [google.com] in the first question of the Googlebot Technology Questions section.
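Based on how Google describes its wildcard support there, the pattern needs to start with a slash, and the trailing * is redundant. Something along these lines should do it (a sketch based on those docs; test it against Googlebot before relying on it):

User-agent: Googlebot
Disallow: /pics/
Disallow: /images/
Disallow: /php/
# Googlebot-only extension: * matches any sequence of characters,
# so this blocks any URL containing "keep=" in its path or query string
Disallow: /*keep=

Note that the wildcard line only belongs in the Googlebot section; crawlers that follow the standard protocol will read it as a literal path and ignore it.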