

Sitemaps, Meta Data, and robots.txt Forum

    
robots.txt and wildcards
robots.txt wildcards
chms

5+ Year Member



 
Msg#: 4163906 posted 10:22 am on Jul 3, 2010 (gmt 0)

Hello,

I want to block URLs like /search/?t=blahblah

I have two options, but I don't know which one is correct:

/search/*t*
/search/?t*

Thank you

 

goodroi

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4163906 posted 1:32 pm on Jul 3, 2010 (gmt 0)

Wildcards (aka pattern matching) are not officially part of the robots.txt protocol. This means most of the big search engines support them, but most of the smaller ones won't.


According to Google's page [google.com...]
To match a sequence of characters, use an asterisk (*). For instance, to block access to all subdirectories that begin with private:

User-agent: Googlebot
Disallow: /private*/
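
To make the wildcard behaviour concrete, here is a minimal Python sketch of how a wildcard-aware crawler might evaluate a Disallow pattern. It assumes that `*` matches any run of characters, that everything else (including `?`) is literal, and that the rule only has to match from the start of the URL. The function name `wildcard_rule_matches` is made up for illustration; this is not Googlebot's actual code.

```python
import re

def wildcard_rule_matches(rule: str, url_path: str) -> bool:
    """Return True if url_path is covered by a Disallow rule that may contain '*'.

    Assumption: '*' stands for any run of characters and every other
    character, including '?', is matched literally.
    """
    # Escape each literal piece, then join the pieces with '.*'.
    pattern = ".*".join(re.escape(piece) for piece in rule.split("*"))
    # re.match anchors only at the start, so the rule acts as a prefix pattern.
    return re.match(pattern, url_path) is not None

# The example from the quoted Google text.
print(wildcard_rule_matches("/private*/", "/private-files/index.html"))

# The two candidate rules from the original question.
print(wildcard_rule_matches("/search/?t*", "/search/?t=blahblah"))
print(wildcard_rule_matches("/search/*t*", "/search/?t=blahblah"))

# The broader rule also catches unrelated URLs that merely contain a 't'.
print(wildcard_rule_matches("/search/*t*", "/search/results"))
```

Under these assumptions both candidate rules block the target URL, but /search/*t* is much broader than intended.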

phranque

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



 
Msg#: 4163906 posted 10:45 am on Jul 4, 2010 (gmt 0)

according to the robots exclusion protocol (which doesn't include any of the wildcard extensions supported by Google), matching occurs left-to-right, so the correct option would be:
/search/?t
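
As a rough illustration of that left-to-right matching, the sketch below treats a Disallow value as a plain string prefix, which is how a crawler without wildcard support would presumably read it. The helper name `prefix_rule_matches` is hypothetical, and the sketch ignores details such as percent-encoding.

```python
def prefix_rule_matches(rule: str, url_path: str) -> bool:
    """Original-protocol style check: the rule matches if the URL starts with it."""
    return url_path.startswith(rule)

url = "/search/?t=blahblah"

# The plain prefix blocks the URL.
print(prefix_rule_matches("/search/?t", url))

# With a trailing '*', a crawler that does not understand wildcards looks
# for a literal asterisk, which the URL does not contain.
print(prefix_rule_matches("/search/?t*", url))
```

This is why the version without the asterisk is the safer choice: it works under plain prefix matching and is still a valid rule for wildcard-aware crawlers.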

chms

5+ Year Member



 
Msg#: 4163906 posted 3:14 pm on Jul 4, 2010 (gmt 0)

Without * at the end?

Dijkgraaf

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4163906 posted 9:20 pm on Jul 4, 2010 (gmt 0)

Yes, without the * at the end.
The standard is that any URL which starts with the string you specified will be matched.
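
One way to convince yourself that the trailing * adds nothing, even for a crawler that does support wildcards, is to compare which URLs each rule selects. The sketch below assumes `*` means "any run of characters" and uses a made-up helper, `to_regex`, plus a few invented sample paths.

```python
import re

def to_regex(rule: str) -> re.Pattern:
    # Hypothetical helper: '*' becomes '.*', every other character is literal.
    return re.compile(".*".join(map(re.escape, rule.split("*"))))

# Invented sample paths, for illustration only.
sample_paths = ["/search/?t=blahblah", "/search/?t", "/search/?q=x", "/searching"]

without_star = [p for p in sample_paths if to_regex("/search/?t").match(p)]
with_star = [p for p in sample_paths if to_regex("/search/?t*").match(p)]

print(without_star)
print(without_star == with_star)
```

Because matching already starts at the beginning of the URL and does not have to reach the end, both rules select the same paths here.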

chms

5+ Year Member



 
Msg#: 4163906 posted 9:36 pm on Jul 4, 2010 (gmt 0)

Ok, thank you

chms

5+ Year Member



 
Msg#: 4163906 posted 2:08 pm on Jul 7, 2010 (gmt 0)

Hello,

In the end, Google accepted the wildcards.

Thank you


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved