
Forum Moderators: Ocean10000 & incrediBILL


Yahoo crawler(s) requesting mangled URLs



11:36 pm on May 25, 2011 (gmt 0)

5+ Year Member

I have not seen this exact 'weirdness' discussed here before...

For some time now I've been seeing wildly bogus URLs requested by
b3090809.crawl.yahoo.net and b3090944.crawl.yahoo.net
(and maybe others...), all in 67.195.115.___.
I'm seeing this across several different domains (all shown below as /BASE_URL/).
Most, if not all, of the requested file names are badly mangled versions
of the names of existing html files.

/home/ftp/BASE_URL/public_html/iaciovties html
but, exists in lower dir


/home/ftp/BASE_URL/public_html/W3DHJ/><img 4eght:=
? total crap

"index.html" of course

/home/ftp/BASE_URL/public_html/irted html

but, exists in lower dir




"index.html" of course

but, exists in lower dir

? index.html ?

/home/ftp/BASE_URL/public_html/index.html The FerryHom Pg s
? appended, bogus,
blank-delimited garbage
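
One rough way to test the "mangled versions of existing files" theory is to fuzzy-match each requested name against the real file names on the site. The sketch below uses Python's standard difflib for that. The list of real files is hypothetical (made up for illustration, not taken from the sites above), and the 0.6 cutoff is just difflib's default, not a tuned value.

```python
import difflib

def closest_real_file(requested, real_files, cutoff=0.6):
    """Return the real file name most similar to `requested`, or None.

    Similarity is difflib's SequenceMatcher ratio; names scoring below
    `cutoff` are treated as having no plausible original.
    """
    matches = difflib.get_close_matches(requested, real_files, n=1, cutoff=cutoff)
    return matches[0] if matches else None

if __name__ == "__main__":
    # Hypothetical file names; substitute the actual files on the site.
    real_files = ["index.html", "activities.html", "started.html"]
    # Requested names as they appear in the examples above.
    for requested in ["iaciovties html", "irted html", "><img 4eght:="]:
        print(repr(requested), "->", closest_real_file(requested, real_files))
```

A request that matches nothing at all points to plain junk rather than a mangled file name.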

Additionally, I see the same crawler(s) requesting, in the doc root, valid
html filenames that exist (and have ONLY EVER existed) in lower
directories -- and getting 404s as a result.
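
For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch that counts the 404s served to clients in a given address range. It assumes an Apache-style common/combined access log with the client address in the first field. If the server logs resolved hostnames instead, match on a hostname suffix rather than an IP prefix. The function name and regex are illustrative, not taken from any tool mentioned in this thread.

```python
import re
from collections import Counter

# Assumed layout (Apache common/combined log format):
#   host ident user [time] "METHOD path HTTP/x.y" status bytes ...
# The path is matched lazily so that mangled requests containing
# spaces are still captured up to the trailing " HTTP/x.y".
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>[A-Z]+) (?P<path>.*?) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) '
)

def crawler_404s(lines, ip_prefix):
    """Count 404 responses per requested path for clients whose
    address starts with `ip_prefix` (e.g. "67.195.115.")."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        if m.group("host").startswith(ip_prefix) and m.group("status") == "404":
            counts[m.group("path")] += 1
    return counts
```

Passing an open log file as `lines` and calling `.most_common()` on the result lists the most frequently requested bogus paths first.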

Anybody else seeing this?


8:54 pm on May 28, 2011 (gmt 0)

incrediBILL: WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month

Just a guess, but I've seen similar stuff before, and it's typically a bad scraper page that links back to your site. The SEs will crawl all the malformed links, and those requests just 404 on your site.

However, I often wonder whether feeding bad 404s to a site is some black hat attempt to make your site look bad in the eyes of the SEs. Just a thought.
