Googlebot looking for Non-Existent pages?

     
6:48 pm on Oct 7, 2003 (gmt 0)

New User

10+ Year Member

joined:Jan 24, 2003
posts:11
votes: 0


For the past 24 hours Googlebot has been acting very oddly.

Instead of it going to:
"http://mydomain.com/redwidget/page.html"

It's looking for
"http://mydomain.com/redwidget/redwidget/page.html"

The /redwidget/redwidget/ directory has never existed.

Has anyone else experienced this problem?

TIA,
Ellen

7:31 pm on Oct 7, 2003 (gmt 0)

Junior Member

joined:Jan 20, 2003
posts:105
votes: 0


It might have found a link on your site or another site that had the typo.

If Googlebot is actively requesting the page, then rather than returning an error message, at least send it to your site map, or just add the page to your site.

7:40 pm on Oct 7, 2003 (gmt 0)

New User

10+ Year Member

joined:Jan 24, 2003
posts:11
votes: 0


I have spent the past 2 hours looking for the bad link. It just doesn't exist.

It's not just looking for one page, it's looking for hundreds of pages in a nonexistent directory. It's like it's stuttering on the directory name.

I have a /widgets/page.html and about 300 more pages in /widgets/. It's trying to find each page that exists in /widgets/ but looking in /widgets/widgets/ instead.

I have MANY 404s in my log (my custom 404 page includes a sitemap).
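
One way to gauge how widespread the doubled-directory requests are is to tally them straight from the access log. The sketch below assumes Apache's combined log format; the function names and the sample log lines are made up for illustration and would need adjusting to the real log.

```python
import re
from collections import Counter

# Matches the request, status, and user-agent fields of an Apache
# combined-format log line (assumed format; adjust if your LogFormat differs).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def has_doubled_segment(path):
    """Return True if two consecutive path segments are identical,
    e.g. /widgets/widgets/page.html."""
    segments = [s for s in path.split("?")[0].split("/") if s]
    return any(a == b for a, b in zip(segments, segments[1:]))


def doubled_directory_hits(lines, agent_substring="Googlebot"):
    """Count (path, status) pairs for requests from the given agent
    whose path repeats a directory name."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if not match or agent_substring not in match.group("agent"):
            continue
        if has_doubled_segment(match.group("path")):
            hits[(match.group("path"), match.group("status"))] += 1
    return hits


# Hypothetical sample lines, invented for this example.
sample = [
    '192.0.2.1 - - [07/Oct/2003:18:00:00 +0000] "GET /widgets/widgets/page.html HTTP/1.0" 404 512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.1 - - [07/Oct/2003:18:00:05 +0000] "GET /widgets/page.html HTTP/1.0" 200 2048 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]
for (path, status), count in doubled_directory_hits(sample).items():
    print(status, count, path)
```

To run it against a real log, pass an open file object instead of `sample`; the referrer field of each matching line may also hint at where the bad URL was picked up.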

10:19 pm on Oct 7, 2003 (gmt 0)

New User

10+ Year Member

joined:Jan 24, 2003
posts:11
votes: 0


Well, my problem has gotten worse and I'm going to start crying soon. Googlebot is starting to go to actual files now but every one it hits gets redirected (302) to my 404 page.

I don't get it! I have no redirects in my .htaccess.

Anyone have any ideas?

11:17 pm on Oct 7, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 30, 2001
posts:1739
votes: 0


As a wild guess, I'd start out by suspecting you have semi-relative links in your site (that is, rather than href="http://mysite.com/widgets/foo.htm" or href="widgets/foo.htm" you have href="/widgets/foo.htm").

And did you just move some sites up or down a level (that is, from "/" to "/widgets" or from "/widgets/shopping" to "/widgets") or perhaps just around? Or perhaps the next theory is by itself enough to account for it.

And/or you're using FrontPage or .NET or some M$ tool -- my experience is M$ tools are all VERY bad about portability of categories within projects, and of course while nobody else on earth could aspire to duplicate ALL of M$'s program bugs, many other programmers are capable of recreating specific ones; so it could even be a tool from some other source -- and perhaps the root of your website on your machine is something like c:\widgets\ .
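
To see how the form of a link changes where a crawler ends up, here is a small sketch using `urljoin` from Python's standard library, which applies the usual relative-URL resolution rules. The base URL is hypothetical, modeled on the paths in this thread. Under those rules it is the document-relative form (no leading slash) that picks up a second /widgets/ when the linking page already lives in /widgets/, so that form may be worth checking too.

```python
from urllib.parse import urljoin

# Hypothetical page URL, modeled on the example paths in this thread.
base = "http://mydomain.com/widgets/index.html"

# Three ways of writing the same link, resolved against the page's URL.
links = {
    "absolute": "http://mydomain.com/widgets/foo.htm",
    "root-relative": "/widgets/foo.htm",
    "document-relative": "widgets/foo.htm",
}

resolved = {kind: urljoin(base, href) for kind, href in links.items()}
for kind, url in resolved.items():
    print(f"{kind:18} -> {url}")
```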

11:41 pm on Oct 7, 2003 (gmt 0)

New User

10+ Year Member

joined:Jan 24, 2003
posts:11
votes: 0


No to all of the above. I haven't changed the file structure at all. My pages are all hosted on Apache. No M$ here.

I just don't get this redirect thing. It had been grabbing pages for days, then sometime yesterday morning it took my robots.txt (for the umpteenth time) and now it's acting bizarre. No other spiders seem affected; Slurp has been busy all day without problems.

I don't know, I just don't know.