Pages from old host still in index

even though it was always disallowed

         

nancyb

11:09 pm on Jan 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I moved my site at the beginning of October '02 to a new host. A few minutes ago I found a link (no cached page) in Google to the shopping cart that was in my cgi-bin at the old host. The cgi directory at the old host was always disallowed via robots.txt.

Googlebot completely crawled my site at the new IP within two days of the DNS change and within a week all deep/fresh bots were going to the new IP.

I can't believe that, after almost four months and thousands of page "gets" by both fresh and deep googlebots, the old, now non-existent page is still in their index, especially since they weren't even supposed to "get" it in the first place.

Even the deepest page on my site has been both deep and fresh crawled between 4 and 20 times since the move. And since early December the index page has been deep crawled 17 times and fresh crawled 42 times. Yet the cache for almost all of my pages is dated late November.

Any ideas why that old shopping cart page is still in the index and/or why the cache is so old?

ciml

1:29 pm on Jan 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Apart from Everflux effects, it should take one or two Google updates [webmasterworld.com] for the cache to show new content. Currently, anything from November 27 to January 1 is normal.

Is the cgi-bin listing just a 'URL only' listing, or does it have proper title and snippet?

If the former, then it can be listed from a link found some time between November 27 and January 1. If the URL wasn't fetched (because of /robots.txt or just not enough PageRank), then it can be listed regardless of what headers would have been returned.
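
To illustrate the point about unfetched URLs, here is a rough sketch in Python. It is only a simplified model of the behaviour described in this post, not Google's actual logic. The robots.txt contents, the example.com URLs and the `listing_kind` helper are all made up for illustration; only `urllib.robotparser` is a real standard-library module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, similar to what the old host is described as having.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def listing_kind(robots_txt: str, url: str, link_seen: bool,
                 agent: str = "Googlebot") -> str:
    """Simplified model: what kind of listing a URL could get.

    - "full":     the crawler may fetch the page, so title/snippet/cache are possible
    - "url-only": the page is never fetched, but a link to it was found somewhere
    - "none":     not fetchable and no known link
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    if parser.can_fetch(agent, url):
        return "full"
    # Disallowed: the crawler never requests the page, so it never sees
    # a 404 or any other header. A link alone is enough to list the URL.
    return "url-only" if link_seen else "none"


if __name__ == "__main__":
    cart = "http://www.example.com/cgi-bin/cart.cgi"   # assumed URL
    home = "http://www.example.com/index.html"         # assumed URL
    print(listing_kind(ROBOTS_TXT, cart, link_seen=True))
    print(listing_kind(ROBOTS_TXT, home, link_seen=True))
```

In this model the disallow rule stops the fetch, so the crawler never learns that the cart page is gone, and a stale link elsewhere is enough to keep the bare URL listed.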

nancyb

5:22 pm on Jan 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The site is, and has been, only a PR2, but the cache date has always been from g'bot's last deep crawl of my site. I didn't realize a cache date from her previous deep crawl would be normal.

Yes, only the URL: no title, description or cache. There must still be a page out there on some engine with the old shopping cart link, and googlebot found it.

Thanks ciml.