| Excluding dynamic pages in all search engines: excluding dynamic pages in robots.txt |
toddmc04

msg:1526787 | 8:31 pm on Jun 30, 2004 (gmt 0) | I have a site with one set of HTML pages and a duplicate set of .php pages, and I don't want the .php pages indexed. How do I keep all search engines from indexing my dynamic pages? Google says to use this:
User-agent: Googlebot
Disallow: /*?
It's my understanding that not all search engines support the wildcard. How can I keep my .php files out of the search engines?
|
djtaverner

msg:1526788 | 8:58 am on Jul 12, 2004 (gmt 0) | Hello, I'm also waiting on an answer to this question. Could the moderator post a response? Cheers
|
DaveAtIFG

msg:1526789 | 2:49 pm on Jul 13, 2004 (gmt 0) | The code you posted will only be recognized and respected by Googlebot; it is non-standard and other spiders will not recognize it. In addition, most other spiders do not recognize wildcards in the "Disallow" line, since these are also non-standard. RE: [searchengineworld.com...]
To ensure none of your PHP pages are spidered, you will need to do two things:
1. Move all of them into a unique folder/directory and include the following in robots.txt:
User-agent: *
Disallow: /PHP Folder Name
2. Add <meta name="robots" content="noindex"> in the head section of each page to ensure they aren't indexed as a result of spiders following external links to them.
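If you want to sanity-check the folder rule from step 1 before uploading it, here is a minimal sketch using Python's standard urllib.robotparser, which does plain prefix matching on the Disallow path. The folder name /php/, the agent name ExampleBot, and the example.com URLs are assumptions made up for illustration; substitute your own. It only shows how one standards-style parser reads the rule, not how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical folder name standing in for "PHP Folder Name" from the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /php/
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# "ExampleBot" and the example.com URLs are placeholders, not real crawlers or pages.
static_ok = is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/index.html")
php_ok = is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/php/page.php?id=1")

print("static page allowed:", static_ok)
print("php page allowed:", php_ok)
```

Because the rule is a simple path prefix with no wildcard, any spider that follows the basic robots.txt convention should read it the same way, which is the reason for moving the .php files into their own folder.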
|
|
|