I haven't changed anything in the directories
Google-bot is "getting confused" in.
(I've seen other bots in the past generate
wrong file names -- and yes, I've read
about how bot algorithms sometimes
do this to "check for cheating" of some kind,
but I've never seen the Google-bot
go "this wild" before.)
Of course, I suspect if anyone else was
seeing this behavior, there'd be a comment
already ... but anyway. Just checking. {smile}
e.g.:
mydomain.com/exec/obidos/tg/detail/-/B0002NY8GW
I can't figure out where Google is getting these links. I tried doing a search for the links in Google via the "link:" prefix and they're not in the index. Also tried with a plain text search, and still nothing.
Google replied to my email and said Googlebot crawls links it finds on other pages and to search Google to find the page(s) that are linking to me. They also mentioned that there's no way for them to provide the referrer. :(
Since there's no one else talking about
"Google bot gone wild" but us {smile}
that's probably what happened.
Thanks again.
What Mr. Googlebot is doing is looking for a file from one directory in another - very odd. He's been doing this for three weeks now, but all the errors don't seem to be affecting my site's SERPs. I was going nuts for a while though, trying to find the 'mistake' in my coding that was causing Googlebot to do this.
<a href="javascript:window.open('a'+'b')">text</a>
Googlebot then tries to crawl mydomain.com/a and mydomain.com/b.
I was using JavaScript precisely to prevent bots from crawling the URL 'ab'.
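For what it's worth, here is a rough Python sketch of the kind of heuristic that would produce exactly those two requests. It is only a guess at the behavior, not Googlebot's actual code: it assumes the crawler does not run the script at all and simply treats every quoted string inside a javascript: href as a candidate relative URL. The names QUOTED and candidate_urls are made up for illustration.

```python
import re
from urllib.parse import urljoin

# Hypothetical heuristic (an assumption, not Googlebot's real logic):
# grab every quoted string inside a javascript: href and treat each
# one as a relative URL, without evaluating the '+' concatenation.
QUOTED = re.compile(r"""['"]([^'"]+)['"]""")

def candidate_urls(page_url, href):
    """Return the URLs a naive string-scanning crawler might queue."""
    if not href.lower().startswith("javascript:"):
        # Ordinary link: resolve it against the page URL as usual.
        return [urljoin(page_url, href)]
    # javascript: link: each quoted fragment becomes its own candidate.
    return [urljoin(page_url, s) for s in QUOTED.findall(href)]

found = candidate_urls("http://mydomain.com/",
                       "javascript:window.open('a'+'b')")
print(found)
```

If something like this is going on, splitting a URL into string pieces hides 'ab' from the bot but hands it 'a' and 'b' instead, which is where the extra 404s would come from.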
Another wave of "file not found" errors tonight.
No JavaScript linking on my site.
Nothing "sneaky" about the directory structure
(normal use of Unix structure organization)
WHERE STRUCTURE IS
.../domain/subdirectory/file1.html
.../domain/subdirectory/file2.html
GOOGLE-BOT IS INSTEAD LOOKING IN
.../domain/file1.html
.../domain/file2.html
While I do have a handful of domains on the
same server, Google-bot seems to be
not just looking for "something wrong"
with that ... but even under just one domain
Google-bot is leaving out subdirectory names
to generate the bad file paths. Surely Google
doesn't care how I organize a Unix file structure.
This baffles me. {smile} Not the first time,
but this is NEW -- getting a log full of junk
from Google-bot generating path errors.
Google is looking for files like products/widgets/gizmos/widgets/widgets/widgets/gizmos/gizmos/gizmos/widget1.htm! and
products/widgets/products/widgets/products/widgets/products/widgets/gizmos/widget1.htm!
It ran up more not-found pages in one day than I get traffic all month. And it's done this several times in the past year.
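One possible cause of paths that keep repeating like this (a guess, not something confirmed in this thread) is a relative link written without a leading slash, on a server that still serves a page with the same link at the longer, wrong URL. Each hop then stacks another copy of the directory names onto the path. A small Python sketch, where the function name follow and the href "products/widgets/" are assumptions for illustration:

```python
from urllib.parse import urljoin

def follow(start_url, relative_href, hops):
    """Resolve the same relative href repeatedly, as a crawler would if
    every resulting URL served a page containing that link again."""
    url = start_url
    seen = []
    for _ in range(hops):
        # Standard relative-URL resolution against the current page.
        url = urljoin(url, relative_href)
        seen.append(url)
    return seen

# Assumed example: a nav link "products/widgets/" (no leading slash)
# on a page that already lives under /products/widgets/.
paths = follow("http://mydomain.com/products/widgets/",
               "products/widgets/", 3)
for p in paths:
    print(p)
```

Writing the link root-relative ("/products/widgets/") would make every hop resolve to the same URL, so the path could not grow.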
Is your internal link structure consistent? Or do you link with .../default.aspx in some places?
What's up with external backlinks? Are you sure they all use the plain directory link?
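To show why the form of the directory link matters, here is a minimal Python sketch of standard relative-URL resolution. It assumes a page in a subdirectory links to "file1.html" with a relative href; the URLs are illustrative. If the page is reached through a link that lacks the trailing slash, and the server does not redirect to the slashed form, the subdirectory name drops out of the resolved path.

```python
from urllib.parse import urljoin

href = "file1.html"  # relative link found on the directory's index page

# Directory link with a trailing slash: the subdirectory is kept.
with_slash = urljoin("http://mydomain.com/subdirectory/", href)

# Same link without the trailing slash: "subdirectory" is treated as
# a file name, so the relative href replaces it.
without_slash = urljoin("http://mydomain.com/subdirectory", href)

# Explicit default document: the subdirectory is kept as well.
with_default = urljoin("http://mydomain.com/subdirectory/default.aspx", href)

print(with_slash)
print(without_slash)
print(with_default)
```

Only the middle case loses the subdirectory, which would match requests for .../domain/file1.html instead of .../domain/subdirectory/file1.html.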