
Forum Moderators: goodroi


robots.txt and wildcards

10:22 am on July 3, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Dec 19, 2009
posts: 41
votes: 1


Hello,

I want to block URLs like /search/?t=blahblah

I have two options, but I don't know which one is correct:

/search/*t*
/search/?t*

Thank you
1:32 pm on July 3, 2010 (gmt 0)

Administrator from US 

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:June 21, 2004
posts:3080
votes: 67


Wildcards, also known as pattern matching, are not officially part of the robots.txt protocol. Most of the big search engines support them, but most of the smaller ones won't.


According to Google's page [google.com...]
To match a sequence of characters, use an asterisk (*). For instance, to block access to all subdirectories that begin with private:

User-agent: Googlebot
Disallow: /private*/
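
To see what the asterisk does to the two options from the first post, here is a rough Python sketch of wildcard matching. It is only an approximation of the behaviour described in the quote above, not Google's actual matcher, and the function name `wildcard_rule_matches` is made up for illustration. It assumes the rule is anchored at the start of the path and that each `*` stands for any run of characters.

```python
import re


def wildcard_rule_matches(rule: str, path: str) -> bool:
    """Approximate wildcard matching for a Disallow rule (illustrative only)."""
    # Escape each literal piece, then join the pieces with '.*' so that
    # every '*' in the rule matches any sequence of characters.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors at the start of the path, like a robots.txt rule.
    return re.match(pattern, path) is not None


# The example rule from the quoted Google text
print(wildcard_rule_matches("/private*/", "/private-files/index.html"))  # True

# The two options from the first post
print(wildcard_rule_matches("/search/?t*", "/search/?t=blahblah"))  # True
print(wildcard_rule_matches("/search/*t*", "/search/?t=blahblah"))  # True
print(wildcard_rule_matches("/search/*t*", "/search/?q=cats"))      # True
```

Under these assumptions, /search/*t* also catches any URL under /search/ that merely contains a "t" somewhere, which is probably broader than intended.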
10:45 am on July 4, 2010 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


According to the robots exclusion protocol (which doesn't include the wildcard extensions supported by Google), matching occurs left to right, so the correct option would be:
/search/?t
3:14 pm on July 4, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Dec 19, 2009
posts: 41
votes: 1


Without * at the end?
9:20 pm on July 4, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 31, 2005
posts:1108
votes: 0


Yes, without the * at the end.
The standard is that any URL which starts with the string you specified will be matched.
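
A minimal Python sketch of that prefix rule, in case it helps. The function name `is_disallowed` is hypothetical, and this only models the plain "starts with" comparison described above, not any particular crawler's implementation.

```python
def is_disallowed(path: str, rule: str) -> bool:
    """Plain prefix matching: the rule matches any path that starts with it."""
    return path.startswith(rule)


rule = "/search/?t"

print(is_disallowed("/search/?t=blahblah", rule))  # True
print(is_disallowed("/search/?q=blahblah", rule))  # False
print(is_disallowed("/search/", rule))             # False
```

Because the comparison is already a prefix test, a trailing * adds nothing for crawlers that understand wildcards, and crawlers that don't may treat it as a literal character.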
9:36 pm on July 4, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Dec 19, 2009
posts:41
votes: 1


Ok, thank you
2:08 pm on July 7, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Dec 19, 2009
posts:41
votes: 1


Hello,

In the end, Google accepted the wildcards.

Thank you