Pfui - 9:38 pm on Jul 16, 2006 (gmt 0)
Naughty Yahoo User Agents
Addressed in his response are:
Yahoo! China
>>
rlx-1-2-1.labs.corp.yahoo.com
-----
203.141.52.47
ont211014008240.yahoo.co.jp
-----
proxy2.search.scd.yahoo.net
proxy3.search.scd.yahoo.net
-----
mmcrm4070.search.mud.yahoo.com
opnprc1.search.mud.yahoo.com
-----
morgue2.corp.yahoo.com
And what's this new referer? "http://yq.search.yahoo.com/"
www.io.com
phad.cc.umanitoba.ca
Where's the Help page with the code to prevent a thumbnail grab? Because apparently this list missed the little sucker:
[Note: Not for copy-pasting! This board's program alters crucial bits.]
Month after month, I simply don't get enough traffic from Yahoo to justify the ever-increasing work, guesswork, and bandwidth.
From July 1 to date, for Yahoo --
.inktomisearch.com HITS: 4499
And Google -- 3,406 less hits and 2,436 more ID'd referers:
.googlebot.com HITS: 1093
'Nuff said. Sorry, Yahoo.
1.) I'm pleased and appreciative that Yahoo_Mike took the time to respond to a number of concerns raised in the original post:
[webmasterworld.com...]
Yahoo! Mindset
proxyn.search.dcn.yahoo.net
proxyn.search.acd.yahoo.net
Mozilla/4.0 (Overture).
2.) Alas, still unaddressed are the majority of entries I listed in what is now message "#:400187" including the following (with still more new ones, below; plus cut-to-the-chase stats in #6):
-----
dp131.data.yahoo.com
Mozilla/4.0
Mozilla/4.0
r17.mk.cnb.yahoo.com
m23.mk.cnb.yahoo.com
(multi)
Gaisbot/3.0+(robot05@gais.cs.ccu.edu.tw;+http://gais.cs.ccu.edu.tw/robot.php)
Gaisbot/3.0+(robot06@gais.cs.ccu.edu.tw;+http://gais.cs.ccu.edu.tw/robot.php)
urlc1.mail.mud.yahoo.com
urlc2.mail.mud.yahoo.com
urlc3.mail.mud.yahoo.com
urlc4.mail.mud.yahoo.com
Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
ts2.test.mail.mud.yahoo.com
(68.142.203.133)
Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
203.141.52.37
203.141.52.39
203.141.52.44
(multi)
Y!J-BSC/1.0 (http://help.yahoo.co.jp/help/jp/blog-search/)
Y!J-BSC/1.0 (http://help.yahoo.co.jp/help/jp/search/indexing/indexing-15.html)
Y!J-BSC/1.0 (http://help.yahoo.co.jp/help/jp/search/indexing/indexing-15.html)
mmcrm4070.search.mud.yahoo.com
Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)
proxy1.search.scd.yahoo.net
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0), Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1)
proxy1.search.dcn.yahoo.net
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.1) Gecko/20060111 Firefox/1.5.0.1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0), Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Alexa Toolbar; mxie)
proxy2.search.scd.yahoo.net
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0), Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0), Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Alexa Toolbar; mxie)
proxy3.search.scd.yahoo.net
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0), Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
msfp01.search.mud.yahoo.com
(side-scroll edited)
Nokia6682/2.0 (3.01.1) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 configuration/CLDC-1.1 UP.Link/6.3.0.0.0
(compatible; Windows CE; Blazer/4.0; PalmSource; MOT-V300; SEC-SGHE315;
YahooSeeker/MA-R2D2;mobile-search-customer-care AT yahoo-inc dot com)
Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)
Yahoo-Blogs/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; [help.yahoo.com...] )
oc4.my.dcn.yahoo.com
YahooFeedSeeker/1.0 (compatible; Mozilla 4.0; MSIE 5.5; [publisher.yahoo.com...]
All referers beginning: "http://rds.yahoo.com/"
<<
3.) And here's another Yahoo Host/UA that never asks for robots.txt during its weekly visit:
Mozilla/4.05 [en]
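For anyone wanting to check their own logs for visitors like this, here is a minimal sketch. It assumes the log has already been parsed into (host, path) pairs; that input shape, the function name, and the example.com hosts are all hypothetical and not taken from any real log.

```python
from collections import defaultdict

def hosts_without_robots(records):
    """Return hosts that made requests but never fetched /robots.txt.

    `records` is an iterable of (host, path) pairs; parsing them out of
    the access log is left to the reader (hypothetical input shape).
    """
    paths_by_host = defaultdict(set)
    for host, path in records:
        paths_by_host[host].add(path)
    return sorted(host for host, paths in paths_by_host.items()
                  if "/robots.txt" not in paths)

# Made-up sample records, for illustration only.
sample = [
    ("crawler-a.example.com", "/robots.txt"),
    ("crawler-a.example.com", "/index.html"),
    ("crawler-b.example.com", "/"),
]
print(hosts_without_robots(sample))  # ['crawler-b.example.com']
```

A host that only shows up once a week would need a log window of at least that length before "never asked" means much.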
4.) And then there are these new -- licensees? Fakes? And again, no robots.txt by either:
Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
07/06 09:36:11 /
Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
07/15 12:19:20 /
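A rough way to flag "licensee or fake?" candidates like these is to compare the claimed UA against the resolved hostname. The sketch below is only that: the suffix list is an assumption pulled from hostnames quoted in this thread, not an official Yahoo list, and the function name, sample hosts, and shortened UA string are hypothetical.

```python
# Suffixes taken from hostnames quoted in this thread; NOT an official list.
ASSUMED_YAHOO_SUFFIXES = (
    ".inktomisearch.com",
    ".inktomi.com",
    ".yahoo.com",
    ".yahoo.net",
)

def is_suspect_slurp(host, user_agent):
    """True when the UA claims to be Slurp but the host is outside the assumed suffixes."""
    claims_slurp = "slurp" in user_agent.lower()
    from_assumed_host = host.lower().endswith(ASSUMED_YAHOO_SUFFIXES)
    return claims_slurp and not from_assumed_host

# Shortened, illustrative UA string (the real one is truncated in this post).
ua = "Mozilla/5.0 (compatible; Yahoo! Slurp)"
print(is_suspect_slurp("crawler.example.org", ua))
print(is_suspect_slurp("node1.inktomisearch.com", ua))
```

A hostname match on its own is only a hint, since reverse DNS can be set to anything by whoever controls the IP block; confirming with a forward lookup of the returned name would be a further step not shown here.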
5.) I try to cooperate with Yahoo but I'm repeatedly abused by them, by their scores of obvious and covert bots and UAs and IPs, by their retrieving info and licensing it -- e.g., thumbnails to Viewpoint -- info that they're not supposed to retrieve in the first place.
RewriteCond %{REMOTE_HOST} \.inktomi\.com$ [NC,OR]
RewriteCond %{REMOTE_HOST} \.inktomisearch\.com$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Slurp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Slurp [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Yahoo [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Yahoo-Robot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Yahoo-MMCrawler [NC,OR]
RewriteCond %{REMOTE_HOST} \.yahoo\.com$ [NC,OR]
RewriteCond %{REMOTE_HOST} \.search\.mud\.yahoo\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule \.(cgi|pl|mid|wav|hqx|ZIP|xml|ico|jpg|gif|txt)$ - [F,L]
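To sanity-check what a rule set like this would and wouldn't catch, here is a rough Python model of it. It is an approximation, not Apache itself: it reads the extension list in the final rule as a `|` alternation, treats [NC] as case-insensitive matching, and assumes the URI carries no query string. The function name and sample requests are hypothetical.

```python
import re

# Host and UA patterns copied from the RewriteCond lines; [NC] -> IGNORECASE.
HOST_PATTERNS = [r"\.inktomi\.com$", r"\.inktomisearch\.com$",
                 r"\.yahoo\.com$", r"\.search\.mud\.yahoo\.com$"]
UA_PATTERNS = [r"^Mozilla.*Slurp", r"^Slurp", r"^Mozilla.*Yahoo",
               r"^Yahoo-Robot", r"^Yahoo-MMCrawler"]
# Extension list from the RewriteRule (no [NC], so it stays case-sensitive).
BLOCKED_EXT = re.compile(r"\.(cgi|pl|mid|wav|hqx|ZIP|xml|ico|jpg|gif|txt)$")

def would_forbid(host, user_agent, uri):
    """Approximate whether the rule set would answer this request with 403."""
    matches_client = (
        any(re.search(p, host, re.IGNORECASE) for p in HOST_PATTERNS)
        or any(re.search(p, user_agent, re.IGNORECASE) for p in UA_PATTERNS)
    )
    # The last RewriteCond is ANDed in: robots.txt always stays reachable.
    not_robots = re.search(r"^/robots\.txt$", uri) is None
    return matches_client and not_robots and BLOCKED_EXT.search(uri) is not None

print(would_forbid("mmcrm4070.search.mud.yahoo.com",
                   "Yahoo-MMCrawler/3.x", "/images/photo.jpg"))
print(would_forbid("mmcrm4070.search.mud.yahoo.com",
                   "Yahoo-MMCrawler/3.x", "/robots.txt"))
print(would_forbid("proxy1.search.scd.yahoo.net",
                   "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0)",
                   "/images/photo.jpg"))
```

By this reading, the .yahoo.net proxies that send browser-style UAs match none of the host or UA conditions, so the third request would not be forbidden.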
6.) Bottom Line (July 1 to date):
Yahoo --
.inktomisearch.com HITS: 4499
(incl. 1425 robots.txt)
search.yahoo.com REFERERS: 419
Google --
.googlebot.com HITS: 1093
(incl. 261 robots.txt)
google... /search? REFERERS: 2855
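The arithmetic behind those figures can be checked in a few lines. The numbers are the ones quoted in this post; the referers-per-hit ratio is just one illustrative way to summarize them and is not a metric used in the post itself.

```python
# Figures quoted in this post (July 1 to date).
yahoo = {"hits": 4499, "robots_txt": 1425, "referers": 419}
google = {"hits": 1093, "robots_txt": 261, "referers": 2855}

def referers_per_hit(stats):
    """Referrals received per crawler hit (illustrative metric, not from the post)."""
    return stats["referers"] / stats["hits"]

hit_gap = yahoo["hits"] - google["hits"]              # crawler hits: Yahoo minus Google
referer_gap = google["referers"] - yahoo["referers"]  # referrals: Google minus Yahoo

print(hit_gap, referer_gap)  # 3406 2436
print(round(referers_per_hit(yahoo), 3), round(referers_per_hit(google), 3))
```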