|robots.txt and wildcards|
| 10:22 am on Jul 3, 2010 (gmt 0)|
I want to block URLs like /search/?t=blahblah
I have two options, but I don't know which is correct:
| 1:32 pm on Jul 3, 2010 (gmt 0)|
Wildcards (aka pattern matching) are not officially part of the robots.txt protocol. In practice, most of the big search engines support them, but most of the smaller ones won't.
According to Google's page [google.com...]
|To match a sequence of characters, use an asterisk (*). For instance, to block access to all subdirectories that begin with private: |
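As a rough illustration of the asterisk matching described in that quote, here is a minimal Python sketch. The function name `wildcard_match` and the rule `/private*/` are assumptions made for illustration only; this is not Google's implementation, and it covers only the `*` character.

```python
import re

def wildcard_match(rule, path):
    """Return True if `path` matches `rule`, where '*' stands for any
    sequence of characters. Matching is anchored at the start of the path
    only, so the rule still behaves like a prefix."""
    # Escape every literal piece, then join the pieces with '.*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.match(pattern, path) is not None

# Hypothetical rule: any top-level directory whose name begins with "private".
rule = "/private*/"
print(wildcard_match(rule, "/private1/page.html"))
print(wildcard_match(rule, "/public/private1/"))
```

Crawlers that do not support wildcards will most likely read the `*` as a literal character, so a rule like this would not match anything useful for them.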
| 10:45 am on Jul 4, 2010 (gmt 0)|
According to the robots exclusion protocol (which doesn't include the wildcard extensions supported by Google), matching occurs left to right, and the correct option would be:
| 3:14 pm on Jul 4, 2010 (gmt 0)|
Without * at the end?
| 9:20 pm on Jul 4, 2010 (gmt 0)|
Yes, without the * at the end.
The standard is that any URL which starts with the string you specified will be matched.
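A minimal Python sketch of that prefix rule, for anyone who wants to check a URL by hand. The function name `is_disallowed` and the rule value `/search/?t=` are assumptions made for illustration; a real crawler also handles User-agent groups, Allow lines and URL encoding, which this ignores.

```python
def is_disallowed(path, disallow_rules):
    """Return True if `path` starts with any non-empty Disallow value.
    An empty Disallow value blocks nothing, so it is skipped."""
    return any(rule and path.startswith(rule) for rule in disallow_rules)

# Hypothetical rule set: one Disallow value with no trailing '*'.
rules = ["/search/?t="]
print(is_disallowed("/search/?t=blahblah", rules))  # True: the path starts with the rule
print(is_disallowed("/search/", rules))             # False: the path is shorter than the rule
```

Because the comparison is a plain "starts with" test, a trailing `*` adds nothing for crawlers that follow the original standard.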
| 9:36 pm on Jul 4, 2010 (gmt 0)|
Ok, thank you
| 2:08 pm on Jul 7, 2010 (gmt 0)|
In the end, Google accepted the wildcards.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved