| 6:01 pm on Jun 13, 2006 (gmt 0)|
Gary, not to fret. You're trying to help Yahoo, and us, solve problems. That's a breath of fresh air because it's always easier to sit around and complain. And if it turns out Yahoo's folks dismiss detailed, debugging-oriented data from Web professionals? C'est la vie.
But hey, kick back and give things time. I know you're eager but they've got channels upon channels. (You were the first respondent in this, your own thread, because you didn't think we'd reply or that we weren't replying quickly enough. Heck, with mod-approval time and work skeds and such, I hadn't even seen your initial post until after you'd replied to it!)
Regardless of outcome, thank you for stepping up to the plate. Now get back to work:)
| 9:23 pm on Jun 13, 2006 (gmt 0)|
Thanks for your support and understanding. One of these days I promise to grow up. :)
| 4:12 am on Jun 14, 2006 (gmt 0)|
Sorry to double post. I just wanted to let you all know I heard from Warren. He got the last batch of messages I sent him, and he's working on the robots.txt user agent problem with Slurp China.
Bill, if you see this I've been trying to get in touch with you but your mailbox here always says it's full.
| 4:45 am on Jun 14, 2006 (gmt 0)|
|Bill, if you see this I've been trying to get in touch with you but your mailbox here always says it's full. |
LOL - sorry, I dumped a bunch of stickies the other night and will try killing more. Let me know ;)
Back to Yahoo...
| 12:48 pm on Jun 16, 2006 (gmt 0)|
(Hope this belongs in this thread)
I can't figure out why we've been seeing this in our logs:
220.127.116.11 "GET /mod_ssl:error:HTTP-request HTTP/1.0" 404 316 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
Also from 18.104.22.168 and 22.214.171.124
| 11:48 pm on Jun 29, 2006 (gmt 0)|
Anyone else seeing this Slurpy sloppiness?
access_log (re the last entry, below)
wj500040.inktomisearch.com - - [29/Jun/2006:12:48:11 -0700]
"GET /SlurpConfirm404/letters/magasin/BasicTabbedPaneUI.TabSelectionHandler.htm HTTP/1.0" 404 2336 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
[Thu Jun 29 12:41:08 2006] [error] [client 126.96.36.199] File does not exist:
[Thu Jun 29 12:41:38 2006] [error] [client 188.8.131.52] File does not exist:
[Thu Jun 29 12:42:44 2006] [error] [client 184.108.40.206] File does not exist:
[Thu Jun 29 12:43:14 2006] [error] [client 220.127.116.11] File does not exist:
[Thu Jun 29 12:43:44 2006] [error] [client 18.104.22.168] File does not exist:
[Thu Jun 29 12:44:14 2006] [error] [client 22.214.171.124] File does not exist:
[Thu Jun 29 12:44:47 2006] [error] [client 126.96.36.199] File does not exist:
[Thu Jun 29 12:45:41 2006] [error] [client 188.8.131.52] File does not exist:
[Thu Jun 29 12:47:42 2006] [error] [client 184.108.40.206] File does not exist:
[Thu Jun 29 12:48:11 2006] [error] [client 220.127.116.11] File does not exist:
I thought it was a new set of exploits until I verified one of the IPs as Inktomi's:
IP address: 18.104.22.168
Reverse DNS: wj500040.inktomisearch.com
Reverse DNS authenticity: [Verified]
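For anyone who'd rather script that check than run it by hand, here is a minimal Python sketch of a forward-confirmed reverse DNS test: look up the PTR name for the IP, confirm it ends in a crawler domain, then resolve that name forward and confirm it maps back to the same IP. The function name, the `allowed_suffixes` default, and the two injectable lookup parameters are my own assumptions for illustration, not anything Yahoo documents.

```python
import socket


def is_verified_crawler(ip, allowed_suffixes=(".inktomisearch.com",),
                        reverse_lookup=None, forward_lookup=None):
    """Return True if ip passes a forward-confirmed reverse DNS check.

    allowed_suffixes is an assumption: list whatever crawler domains you
    trust. The lookup parameters exist only so the check can be exercised
    without touching live DNS.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    host = host.rstrip(".").lower()
    if not host.endswith(tuple(allowed_suffixes)):
        return False

    try:
        addresses = forward_lookup(host)
    except OSError:
        return False

    # The PTR name only counts if it resolves back to the original IP.
    return ip in addresses
```

Called with just an IP, it does live DNS lookups. The forward step is the part that matters, since anyone who controls their own reverse zone can make a PTR record claim any hostname.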
I can see doing one 404 test (well, not really, but I know some SEs do a one-file check). But 10? And from 10 IPs in under 10 minutes? Gimme a break. Besides, Inktomi already asks for robots.txt about 50 times a day. So wow, why the sudden 404 assault?
| 4:38 am on Jun 30, 2006 (gmt 0)|
Oh, man. At it again as I type --
[Thu Jun 29 21:19:35 2006] [error] [client 22.214.171.124] File does not exist:
[Thu Jun 29 21:20:47 2006] [error] [client 126.96.36.199] File does not exist:
[Thu Jun 29 21:21:17 2006] [error] [client 188.8.131.52] File does not exist:
[Thu Jun 29 21:23:30 2006] [error] [client 184.108.40.206] File does not exist:
[Thu Jun 29 21:24:00 2006] [error] [client 220.127.116.11] File does not exist:
[Thu Jun 29 21:24:30 2006] [error] [client 18.104.22.168] File does not exist:
[Thu Jun 29 21:25:00 2006] [error] [client 22.214.171.124] File does not exist:
[Thu Jun 29 21:26:53 2006] [error] [client 126.96.36.199] File does not exist:
[Thu Jun 29 21:28:05 2006] [error] [client 188.8.131.52] File does not exist:
No one else is seeing this?
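If you want to check your own error_log for bursts like the ones above, here is a rough Python sketch that counts the "File does not exist" entries, the distinct client IPs, and the time span they cover. It assumes the Apache error-log layout shown in this thread and an English locale for the day and month names. The function name and the returned keys are made up for the example.

```python
import re
from datetime import datetime

# Matches lines shaped like:
# [Thu Jun 29 21:19:35 2006] [error] [client <ip>] File does not exist:
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[error\] \[client (?P<ip>[^\]]+)\] "
    r"File does not exist:"
)


def summarize_missing_file_errors(lines):
    """Count matching entries, distinct clients, and the seconds they span."""
    times = []
    clients = set()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that is not a missing-file error
        # %a and %b parse English day/month names in the default C locale
        times.append(datetime.strptime(match.group("ts"),
                                       "%a %b %d %H:%M:%S %Y"))
        clients.add(match.group("ip"))

    if not times:
        return {"hits": 0, "clients": 0, "span_seconds": 0.0}

    return {
        "hits": len(times),
        "clients": len(clients),
        "span_seconds": (max(times) - min(times)).total_seconds(),
    }
```

Feed it the lines of your error_log (for example, `summarize_missing_file_errors(open(path))` with your own path) and compare the hit count against the span to see how tightly the requests are packed.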
| 7:05 am on Jun 30, 2006 (gmt 0)|
It's on one of my sites right now:
...and the list goes on and on. None of these files has ever existed on any of my websites.
...and the list goes on and on. They all belong to Inktomi.
I'll take a chance and forward this to Warren at Inktomi when I wake up.
| 1:46 pm on Jun 30, 2006 (gmt 0)|
Slurp just checks for 404 response.
Official FAQ may help:
| 2:39 pm on Jun 30, 2006 (gmt 0)|
Thanks for the link, thetrasher. People have talked about deliberate 404s but I didn't know Slurp might request up to 10 URLs at once. Usually people ask about one or maybe two oddities.
Apparently the testing is not as "rare" as stated by that page -- unless yesterday was my lucky day. Shoot. Now I find out! :)
| 12:19 am on Jul 13, 2006 (gmt 0)|
After a long absence, "Yahoo! Slurp China;" has returned to my sites, and now seems to heed robots.txt in this format:
|User-agent: Slurp China |
I can't vouch for whether it will obey specific directory or file Disallows, or whether it will obey
or any other variants.
However, it does seem to recognize that it should go away when it sees the code above, rather than accepting the
record and subsequently hitting my user-agent blocking code in .htaccess.
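As a quick sanity check on how a record like that gets matched, here is a small Python sketch using the standard library's `urllib.robotparser`. Two loud caveats: the `Disallow: /` line is my assumption for the example (a whole-site block), and this shows how Python's parser reads the record, which says nothing definite about how Slurp China itself parses it. The helper name `allowed` is made up.

```python
from urllib.robotparser import RobotFileParser

# Assumed record for illustration: block the whole site for "Slurp China".
ROBOTS_TXT = """\
User-agent: Slurp China
Disallow: /
"""


def allowed(robots_txt, agent_token, path="/"):
    """Ask Python's stdlib parser whether agent_token may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, path)


# The stdlib parser matches the record against the token you pass in,
# so "Slurp China" is blocked while plain "Slurp" is not.
print(allowed(ROBOTS_TXT, "Slurp China"))
print(allowed(ROBOTS_TXT, "Slurp"))
```

Note that the stdlib parser only looks at the part of the agent string before the first "/", so pass the bare robot token rather than the full "Mozilla/5.0 (compatible; ...)" string from your logs.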
Now if I can just get "Yahoo! Slurp;" to quit listing my .css files in SERPs... Grumble, grumble... I've never seen this done by any other search engine before, but I had to add the Disallows for my .css files so that they wouldn't show up when search terms coincided with the terms I used in my .css file comments...
[edited by: jdMorgan at 12:20 am (utc) on July 13, 2006]
| 4:57 pm on Jul 13, 2006 (gmt 0)|
Don't think I've been crawled by this strain of Slurp before, or if I was, it was a long time ago, but it's baaaaack:
184.108.40.206 "Mozilla/5.0 (compatible; Yahoo! DE Slurp; [help.yahoo.com...]
Why in the heck can't Yahoo just crawl pages from one place and let everyone share the pages?
I already block Yahoo China, don't make me block more...
| 5:23 pm on Jul 13, 2006 (gmt 0)|
Slurp DE is their Yahoo Directory engine, according to GaryK's earlier post (# 400194 above). I certainly don't think I'd want to block it, since I've got several 'grandfathered' free listings in their directory, and you have to pay to get in (and pay again annually to stay in) now... Blocking Slurp DE could cost me thousands!
| 3:02 pm on Jul 14, 2006 (gmt 0)|
Thanks for all your comments and suggestions on this thread.
I have posted a response from Yahoo! Search on a new thread started on this forum.
Please check the information in the thread entitled Yahoo! Crawlers - A response from Yahoo! Search at
| 5:04 pm on Jul 14, 2006 (gmt 0)|
Thanks for your reply Mike. Thanks also to Warren and of course Mason. Without Mason our concerns never would have made it this far because he was initially my only contact at Yahoo!