| 10:40 pm on May 23, 2005 (gmt 0)|
If you can, use the noindex meta tag (with follow or nofollow as appropriate).
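For reference, the noindex tag goes in the page's <head>; the follow/nofollow part only controls whether links on the page are still crawled:

```html
<!-- Keep this page out of the index, but still follow its links -->
<meta name="robots" content="noindex,follow">
```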
In my experience, using Disallow via robots.txt will eventually remove your page as well, but it takes a long, long time. First the cache and snippet are removed, but the page is still listed in the index without a description, and it can still come up in the SERPs....
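For comparison, the robots.txt approach being discussed looks like this (the path is just an example):

```
User-agent: *
Disallow: /old-page.html
```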
Therefore... use metatags...
URL removal tool will also work but I have heard of problems using this tool.
| 2:36 am on May 24, 2005 (gmt 0)|
I agree with Dawg about how search engines behave in practice.
But I disagree that it is the best method ... robots should just clear those pages; it would be a much better world.
To speed up the process, it may help to feed a 404 response, via a cloaking script, to the search engine whose index you want the page dropped from.
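The cloaking idea above could be sketched like this, assuming you key off the crawler's User-Agent string (the bot names and function are illustrative, not any particular script):

```python
# Sketch: serve a 404 only to search-engine bots (cloaking by User-Agent),
# so the crawler drops the page while normal visitors still see it.
def status_for(user_agent: str) -> int:
    bots = ("googlebot", "slurp", "msnbot")
    if any(b in user_agent.lower() for b in bots):
        return 404  # tell the crawler the page doesn't exist
    return 200      # regular visitors get the page as usual
```

Worth noting that serving different responses to bots and visitors is exactly what search engines consider cloaking, so this carries its own risk.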
| 4:14 pm on May 24, 2005 (gmt 0)|
Google will not automatically remove pages mentioned in the robots.txt file.
To remove the pages from their index you need to submit the URL of the robots.txt file to the Google URL console. The pages will then be removed within days, and will stay out of the index for 6 months. They will continue to stay out only if the pages are still mentioned in the robots.txt file after that time.
Alternatively, the robots meta tag will see pages dropped from the index within a matter of a week or so.
| 8:56 am on May 28, 2005 (gmt 0)|
The reason that robots will continue to request a URL that returns a 404 is because servers do go down.
They don't want to drop the cache or have to re-index the site at the drop of a hat. That takes precious bandwidth. So they will continue to request it - waiting for the server to come back online.
A 410 Gone response should cause it to be removed, though.
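On Apache, one way to send a 410 for a removed page is mod_alias's Redirect directive with the gone status (the path is just an example):

```apache
# Return "410 Gone" for the removed page
Redirect gone /old-page.html
```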
Thanks for all the input guys - Personally I would use the robots.txt submission method but I was just wondering...
| 9:30 am on May 28, 2005 (gmt 0)|
A server that has gone down should not be returning 404s. Either it wouldn't respond (timeouts/connection refused) or it would return one of the 5xx server errors. It shouldn't return a 4xx client error.
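The distinction being drawn here, 5xx/timeout means "come back later" while 404 and especially 410 mean the page itself is the problem, could be sketched as a crawler's decision table (this is illustrative, not how any particular engine actually works):

```python
from typing import Optional

def crawler_action(status: Optional[int]) -> str:
    # None models a timeout or refused connection (server down entirely)
    if status is None or 500 <= status < 600:
        return "retry later"                 # transient server trouble
    if status == 410:
        return "drop from index"             # page is explicitly gone
    if status == 404:
        return "retry a few times, then drop"  # might be temporary
    return "keep indexed"                    # e.g. 200 OK
```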
| 11:20 am on May 29, 2005 (gmt 0)|
Let me rephrase that: "pages go down".
How many times have I found a 404 on some important outbound link, waited 24 hours, and seen it come back?
| 4:35 pm on May 29, 2005 (gmt 0)|
|Alternatively, the robots meta tag will see pages dropped from the index within a matter of a week or so. |
I have a website I took down, leaving the pages up there with a META robots noindex.
The pages went supplemental within a week, and have stayed supplemental for 6 months now...
Since it's a free ISP, I have no access to ROBOTS.TXT or 404s on that particular site, so I've modified the pages to point to my new site, and am still waiting for them to go!
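On a free host with no server config, "pointing the pages at the new site" is usually done with a meta refresh in each page's <head>, something like this (example.com stands in for the real new site):

```html
<!-- Send visitors straight to the new site -->
<meta http-equiv="refresh" content="0;url=http://example.com/">
```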