|robots.txt and wildcards|
| 10:22 am on Jul 3, 2010 (gmt 0)|
I want to block URLs like /search/?t=blahblah
I have two options, but I don't know which one is correct:
| 1:32 pm on Jul 3, 2010 (gmt 0)|
Wildcards, aka pattern matching, are not officially part of the robots.txt protocol. Most of the big search engines support them anyway, but most of the smaller ones won't.
According to Google's page [google.com...]
|To match a sequence of characters, use an asterisk (*). For instance, to block access to all subdirectories that begin with private: |
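As a rough illustration of that asterisk rule (not Google's actual implementation), here is a small Python sketch that turns a wildcard path pattern into a regular expression. The function name `wildcard_match` and the sample patterns are made up for this example; only the /search/?t=blahblah URL comes from the question above.

```python
import re


def wildcard_match(pattern: str, path: str) -> bool:
    """Return True if `path` matches a robots.txt-style pattern in which
    `*` stands for any sequence of characters (assumed semantics)."""
    # Escape each literal piece, then join the pieces with `.*`
    # so that every `*` in the pattern matches any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # re.match anchors at the start of the path; anything may follow.
    return re.match(regex, path) is not None


print(wildcard_match("/search/?t=*", "/search/?t=blahblah"))  # True
print(wildcard_match("/search/?t=", "/search/?t=blahblah"))   # True
print(wildcard_match("/*?t=", "/search/?t=blahblah"))         # True
print(wildcard_match("/search/?t=*", "/search/"))             # False
```

Under these assumed semantics a trailing * changes nothing, because the pattern is already matched as a prefix. The wildcard only earns its keep in the middle of a pattern.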
| 10:45 am on Jul 4, 2010 (gmt 0)|
According to the robots exclusion protocol (which doesn't include the wildcard extensions supported by Google), matching occurs left-to-right, and the correct option would be:
| 3:14 pm on Jul 4, 2010 (gmt 0)|
Without * at the end?
| 9:20 pm on Jul 4, 2010 (gmt 0)|
Yes, without the * at the end.
The standard is that any URL which starts with the string you specified will be matched.
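To make the "starts with" rule concrete, here is a minimal Python sketch of prefix matching as the original standard describes it. The function name `is_disallowed` is hypothetical, and the assumption that a crawler without wildcard support reads * as an ordinary character is mine, not something a particular crawler documents.

```python
def is_disallowed(rule_path: str, url_path: str) -> bool:
    """Original-standard matching: a Disallow rule applies to any
    URL path that starts with the rule's path."""
    # An empty Disallow value blocks nothing.
    if not rule_path:
        return False
    # Plain left-to-right prefix comparison, no pattern characters.
    return url_path.startswith(rule_path)


# The prefix alone already covers every value of t.
print(is_disallowed("/search/?t=", "/search/?t=blahblah"))   # True
# Assumption: a literal-minded crawler treats "*" as a plain character,
# so the rule with a trailing "*" no longer matches this URL.
print(is_disallowed("/search/?t=*", "/search/?t=blahblah"))  # False
```

That is why dropping the trailing * is the safer choice: the rule then means the same thing to crawlers with and without wildcard support.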
| 9:36 pm on Jul 4, 2010 (gmt 0)|
Ok, thank you
| 2:08 pm on Jul 7, 2010 (gmt 0)|
In the end, Google accepted the wildcards.
© Pubcon Inc. 1996-2012 all rights reserved