| 7:31 pm on Oct 7, 2003 (gmt 0)|
It might have found a link on your site or another site that had the typo.
If Googlebot is actively looking for the page, then instead of returning an error message, at least send it to your site map, or just add the page to your site.
| 7:40 pm on Oct 7, 2003 (gmt 0)|
I have spent the past 2 hours looking for the bad link. It just doesn't exist.
It's not just looking for one page, it's looking for hundreds of pages in a nonexistent directory. It's like it's stuttering on the directory name.
I have a /widgets/page.html and about 300 more pages in /widgets/. It's trying to find each page that exists in /widgets/ but looking in /widgets/widgets/ instead.
I have MANY 404s in my log (my custom 404 page includes a sitemap).
| 10:19 pm on Oct 7, 2003 (gmt 0)|
Well, my problem has gotten worse and I'm going to start crying soon. Googlebot is starting to go to actual files now but every one it hits gets redirected (302) to my 404 page.
I don't get it! I have no redirects in my .htaccess.
Anyone have any ideas?
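One way to narrow this down is to pull Googlebot's requests out of the access log and group them by status code, so the 302s and 404s can be compared side by side. Below is a rough sketch, assuming the log is in Apache's "combined" format. The function names and the sample lines are made up for illustration; real input would be the lines of the server's own access log.

```python
import re
from collections import Counter

# Matches the request, status, size, referer and user-agent fields
# of an Apache "combined" log line (assumed format).
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Yield (status, path) for each request whose user agent mentions Googlebot."""
    for line in lines:
        m = LOG_RE.search(line)
        if m and "googlebot" in m.group("agent").lower():
            yield int(m.group("status")), m.group("path")


def summarize(lines):
    """Count Googlebot requests per status and collect the 302/404 paths."""
    counts = Counter()
    problem_paths = []
    for status, path in googlebot_hits(lines):
        counts[status] += 1
        if status in (302, 404):
            problem_paths.append((status, path))
    return counts, problem_paths


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [07/Oct/2003:19:40:00 +0000] '
    '"GET /widgets/widgets/page.html HTTP/1.0" 404 512 "-" "Googlebot/2.1"',
    '192.0.2.1 - - [07/Oct/2003:22:19:00 +0000] '
    '"GET /widgets/page.html HTTP/1.0" 302 0 "-" "Googlebot/2.1"',
    '192.0.2.2 - - [07/Oct/2003:22:20:00 +0000] '
    '"GET /widgets/page.html HTTP/1.0" 200 2048 "-" "OtherBot/1.0"',
]

counts, problem_paths = summarize(sample)
print(counts)
for status, path in problem_paths:
    print(status, path)
```

If the 302s show up only for Googlebot while other spiders get 200s for the same paths, that points at something conditional on the server side rather than at the links themselves.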
| 11:17 pm on Oct 7, 2003 (gmt 0)|
As a wild guess, I'd start out by suspecting you have semi-relative links in your site (that is, rather than href="http://mysite.com/widgets/foo.htm" or href="widgets/foo.htm", you have href="/widgets/foo.htm").
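To test the link theory, it can help to see how each href form resolves against a page that lives inside /widgets/. Here is a minimal sketch using urljoin from Python's standard library. mysite.com and the paths are just the placeholders used in this thread, and page.html is assumed to be the page that contains the link.

```python
from urllib.parse import urljoin

# Assumed location of the page that contains the links (placeholder URL).
base = "http://mysite.com/widgets/page.html"

hrefs = [
    "http://mysite.com/widgets/foo.htm",  # absolute
    "/widgets/foo.htm",                   # root-relative
    "widgets/foo.htm",                    # document-relative, with directory
    "foo.htm",                            # document-relative, bare filename
]

# Resolve each href the way a browser or spider would, relative to the base page.
resolved = {href: urljoin(base, href) for href in hrefs}

for href, url in resolved.items():
    print(f"{href!r:40} -> {url}")
```

Under standard resolution, only the document-relative form with the directory in it (widgets/foo.htm) lands in /widgets/widgets/ when the linking page is already in /widgets/. The absolute, root-relative, and bare-filename forms all resolve to /widgets/foo.htm.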
And did you just move some sites up or down a level (that is, from "/" to "/widgets" or from "/widgets/shopping" to "/widgets") or perhaps just around? Or perhaps the next theory is by itself enough to account for it.
And/or you're using FrontPage or .NET or some M$ tool -- my experience is M$ tools are all VERY bad about portability of categories within projects, and of course while nobody else on earth could aspire to duplicate ALL of M$'s program bugs, many other programmers are capable of recreating specific ones; so it could even be a tool from some other source -- and perhaps the root of your website on your machine is something like c:\widgets\ .
| 11:41 pm on Oct 7, 2003 (gmt 0)|
No to all of the above. I haven't changed the file structure at all. My pages are all hosted on Apache. No M$ here.
I just don't get this redirect thing. It had been grabbing pages for days, then sometime yesterday morning it took my robots.txt (for the umpteenth time) and now it's acting bizarre. No other spiders seem affected; Slurp has been busy all day without problems.
I don't know, I just don't know.