Forum Moderators: goodroi
I've added the exclusion to my robots.txt.
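For reference, a Disallow rule of that kind looks like the fragment below. The path is a placeholder — the poster's actual page wasn't shown.

```
# Hypothetical example — "/private-page.html" stands in for the
# actual page, which wasn't posted.
User-agent: *
Disallow: /private-page.html
```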
However, I haven't uploaded the file yet because I know it takes some time for Googlebot to get word that it should not follow links to that page.
Does anyone have an estimate on how long I should wait before uploading the page?
Use the WebmasterWorld server headers checker [webmasterworld.com] to determine the Expires and Cache-Control: max-age settings for your existing robots.txt. Google adheres to these settings reliably (if you provide them).
HTTP/1.1 200 OK
Date: Wed, 06 Aug 2003 16:14:01 GMT
Server: Rapidsite/Apa/1.3.27 (Unix) FrontPage/220.127.116.110 mod_ssl/2.8.12 OpenSSL/0.9.7a
Cache-Control: must-revalidate, max-age=7200
Expires: Wed, 06 Aug 2003 18:17:30 GMT
Last-Modified: Sat, 02 Aug 2003 05:40:41 GMT
This shows that my robots.txt is to be considered valid for only two hours, and must be re-fetched if the user-agent has an older copy.
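To make the arithmetic concrete, here is a small sketch that pulls the max-age directive out of a Cache-Control header value like the one above (a simplified parser, not a full HTTP implementation):

```python
# Sketch: extract the max-age value (in seconds) from a
# Cache-Control header string.
def max_age_seconds(cache_control):
    """Return the max-age directive as an int, or None if absent."""
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return int(directive.split("=", 1)[1])
    return None

# The header from the example above: valid for 7200 s, i.e. two hours.
print(max_age_seconds("must-revalidate, max-age=7200"))
```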
I can't add the meta tags to the page because it's a dynamic page that essentially performs some database work and redirects to another page.
I haven't previously set Expires or Cache-Control: max-age, but I will now.
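One way to set those headers — a hypothetical .htaccess fragment, assuming an Apache host (as the Server header above suggests) with mod_expires and mod_headers enabled:

```
# Hypothetical .htaccess fragment — assumes Apache with mod_expires
# and mod_headers available. mod_expires emits both the Expires and
# Cache-Control: max-age headers from the rule below.
<Files "robots.txt">
    ExpiresActive On
    ExpiresDefault "access plus 2 hours"
    Header append Cache-Control "must-revalidate"
</Files>
```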
Here is my scenario that didn't work:
I updated my robots.txt to Disallow a page.
Within one day I saw at least one Googlebot access my new robots.txt.
One week later I uploaded the new page I didn't want spidered.
One more week later I found it in the index.
Another option is to remove a single page using meta tags.
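For pages where the HTML can be edited, that means a robots meta tag like the one below — though, as noted above, this won't help for a page that only does database work and redirects:

```html
<!-- Robots meta tag, placed in the page's <head>; tells compliant
     crawlers not to index this page even if they follow a link to it. -->
<meta name="robots" content="noindex">
```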