
Forum Moderators: goodroi

Wildcard blocking of dynamic - robots.txt

To: GoogleGuy

   
3:44 am on Jul 31, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I need some verification that I can stop Googlebot from crawling ALL dynamic content.

User-agent: Googlebot
Disallow: /*?

Will this stop Googlebot from requesting all dynamic URLs? The only way to test this is on a site with high PR, and I don't feel like testing it in a production environment while I have the high PR, because who knows, it could drop the site from the index for one or two months.

4:00 am on Jul 31, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ick. If I were a spider I would assume you're disallowing everything.

I can't find the original spec on webcrawler, but I think the glob there only applies to the User-agent line.

4:09 am on Jul 31, 2002 (gmt 0)

WebmasterWorld Senior Member pageoneresults is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Hi Lisa, wasn't there another topic on this yesterday? I'm having Senior Moments these days.

User-Agent: Googlebot
Disallow: /*.asp$
Disallow: /*.cgi$
Disallow: /*.php$

If I'm reading this statement correctly from Google's website, then the above method will prevent Google from indexing dynamic content.

> In addition, Googlebot understands some extensions to the robots.txt standard: Disallow patterns may include * to match any sequence of characters, and patterns may end in $ to indicate that the $ must match the end of a name. For example, to prevent Googlebot from crawling files that end in gif, you may use the following robots.txt entry:

User-Agent: Googlebot
Disallow: /*.gif$

<edit> Ah, never mind. Now that I review this again, I see what you are trying to do...

User-agent: Googlebot
Disallow: /*?$

I wonder?

5:49 am on Jul 31, 2002 (gmt 0)

10+ Year Member



User-agent: Googlebot
Disallow: /*?$

That would only disallow URLs that end with a question mark.
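To check the difference offline, here is a minimal Python sketch of the matching rule as the text quoted from Google's site describes it: `*` matches any sequence of characters, and a trailing `$` anchors the pattern to the end of the URL. The function names `robots_pattern_to_regex` and `is_blocked` are made up for illustration, and the prefix-match behavior for patterns without `$` is an assumption based on how plain Disallow lines work. It shows what the quoted rule implies, not how Googlebot actually behaves.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a Disallow pattern using the '*' and '$' extensions into a regex.

    Hypothetical helper: '*' becomes "any sequence of characters" and a
    trailing '$' anchors the match to the end of the URL path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece (so '?' and '.' are not regex operators),
    # then join the pieces with '.*' where the original pattern had '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(pattern: str, url_path: str) -> bool:
    """Return True if url_path would be disallowed by the given pattern."""
    # Without '$', the pattern is treated as a prefix match (assumption).
    return robots_pattern_to_regex(pattern).match(url_path) is not None


if __name__ == "__main__":
    for pattern in ("/*?", "/*?$", "/*.gif$"):
        for path in ("/page.php?id=1", "/page.php?", "/page.html", "/img/logo.gif"):
            print(f"{pattern!r:10} {path!r:18} blocked={is_blocked(pattern, path)}")
```

Under these assumptions, `/*?` blocks any path containing a question mark, while `/*?$` blocks only paths whose last character is a question mark.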

6:13 am on Jul 31, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am looking to block Google from any URL that contains a "?". When your PR is high, Google will crawl anything. Ick.
 
