I have not seen this exact 'weirdness' discussed here before...
For some time now I've been seeing wildly bogus URLs requested by the
crawlers b3090809.crawl.yahoo.net and b3090944.crawl.yahoo.net
(and maybe others...) in 67.195.115.___.
I'm seeing this across several different domains (all mingled below as /BASE_URL/).
Most, if not all, of the requested file names are badly mangled versions
of the names of existing html files. Each request below is followed by
my best guess at the real file name.
/home/ftp/BASE_URL/public_html/iaciovties html
    "activities.html"
    but, exists in lower dir
/home/ftp/BASE_URL/public_html/PuebloMasters/Du
    "/Dues.html"
/home/ftp/BASE_URL/public_html/W3DHJ/><img 4eght:=
    ? total crap
/home/ftp/BASE_URL/public_html/UBSC/inde_x.html"
    "index.html" of course
/home/ftp/BASE_URL/public_html/irted html
    ?
/home/ftp/BASE_URL/public_html/iooms,html
    "rooms.html"
    but, exists in lower dir
/home/ftp/BASE_URL/public_html/imaphtml
    ?
/home/ftp/BASE_URL/public_html/PuebloMasters/Dues.htmk
    "Dues.html"
/home/ftp/BASE_URL/public_html/UBSC/fubsc_mmorial
    "ubsc_memorial.html"
/home/ftp/BASE_URL/public_html/UBSC/inde_x.html
    "index.html" of course
/home/ftp/BASE_URL/public_html/itanvelhtml
    "travel.html"
    but, exists in lower dir
/home/ftp/BASE_URL/public_html/UBSC/fid-exhtml
    ? index.html ?
/home/ftp/BASE_URL/public_html/index.html The FerryHom Pg s
    ? appended, bogus, blank-delimited garbage
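
For anyone who wants to run the same comparison on their own logs, a guess
like the quoted ones above can also be made mechanically. This is only a
rough sketch using Python's standard difflib module; the function name
guess_original, the list called known, and the 0.6 cutoff are my own
assumptions for illustration, not anything taken from the server.

```python
import difflib

def guess_original(mangled, known_files, cutoff=0.6):
    """Return the known file name most similar to a mangled request, or None.

    cutoff is an assumed similarity threshold (0.0 to 1.0); tune it to taste.
    """
    matches = difflib.get_close_matches(mangled, known_files, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Hypothetical list of file names that really exist on the site.
known = [
    "activities.html",
    "rooms.html",
    "travel.html",
    "index.html",
    "Dues.html",
    "ubsc_memorial.html",
]

# A few of the mangled names from the requests above.
for bad in ["iaciovties html", "iooms,html", "Dues.htmk", "inde_x.html"]:
    print(repr(bad), "->", guess_original(bad, known))
```

Requests that resemble nothing on the site (the "total crap" kind) fall
below the cutoff and come back as None, which at least separates the
mangled-but-recognizable names from the pure garbage.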
Additionally, I see the same crawler(s) requesting valid html filenames
in the doc root that only exist and have ONLY EVER existed in lower
directories -- and getting 404's as a result.
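
To confirm that a name requested in the doc root exists only in lower
directories, something like the following could be run against a copy of
the site. It is a minimal sketch; find_in_lower_dirs is a made-up helper
name and the doc root path is whatever applies on your own server.

```python
import os

def find_in_lower_dirs(doc_root, filename):
    """List paths (relative to doc_root) of filename found below doc_root.

    The doc root itself is skipped, so an empty result means the file is
    not present in any subdirectory.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        if dirpath == doc_root:
            continue  # only subdirectories are of interest here
        if filename in filenames:
            full_path = os.path.join(dirpath, filename)
            hits.append(os.path.relpath(full_path, doc_root))
    return sorted(hits)
```

A name that turns up here but is absent from the doc root itself is one
of the cases where the crawler appears to have dropped the directory part
of a valid URL.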
Anybody else seeing this?
Jonesy