Strange Inktomi Corporation IP

   
5:49 pm on May 5, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This has been going on for a few days:

74.6.13.111 - - [05/May/2012:13:34:37 -0400] "GET /....html HTTP/1.1" 200 6297 "-" "Mozilla/5.0 (X11; Linux i686 on x86_64; rv:7.0.1) Gecko/ /7.0.1"

The IP belongs to "Inktomi Corporation, Sunnyvale". I think this is Yahoo.

It loads the whole page, including all JavaScript, every few minutes.
10:03 pm on May 5, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



I think this is Yahoo.

I sure hope so, because if it isn't, I've been blocking innocent humans at 74.6 without cause ;)
1:24 am on May 6, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If their persistence is your only concern, the following will stop them in their tracks:

RewriteEngine On
# Both conditions must match (consecutive RewriteCond lines are ANDed)
RewriteCond %{HTTP_USER_AGENT} Linux
RewriteCond %{REMOTE_ADDR} ^74\.6\.
RewriteRule .* - [F]
2:14 pm on May 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for this. Today Inktomi started to bomb all my sites with thousands of page loads and I had to block them.

They even load the Adsense code with each request. This could lead to an Adsense ban.
6:04 pm on May 9, 2012 (gmt 0)

10+ Year Member



Thanks. Found this one in my logs as well, and now blocked.
6:26 pm on May 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



An IP range block was really necessary:

SetEnvIf Remote_Addr "^72\.30\." get_out
SetEnvIf Remote_Addr "^74\.6\." get_out
SetEnvIf Remote_Addr "^98\.137\.72\." get_out


<FILES *>
Order Allow,Deny
Allow from all
Deny from env=get_out
</FILES>
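
For anyone on Apache 2.4, where Order/Allow/Deny only works through mod_access_compat, a rough equivalent of the block above might look like the following. This is a sketch, assuming mod_setenvif and mod_authz_core are loaded and that .htaccess overrides permit authorization directives; it has not been tested against this bot.

```apache
# Flag requests from the address ranges listed above
SetEnvIf Remote_Addr "^72\.30\." get_out
SetEnvIf Remote_Addr "^74\.6\." get_out
SetEnvIf Remote_Addr "^98\.137\.72\." get_out

# Allow everyone except requests carrying the get_out flag
<RequireAll>
    Require all granted
    Require not env get_out
</RequireAll>
```

The negated Require has to sit inside RequireAll alongside a positive rule, because a "Require not" line can only veto access, never grant it.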
8:19 pm on May 9, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I have the range 74.6.13.87 - 74.6.13.151 listed as a Yahoo bot range. It has hit the server a few dozen times this month, but with a non-Slurp UA (as noted in the OP), so it's been rejected.

I have a lot of slurp bot ranges listed at 74.6/16 but the rest is "allowed". Does anyone have a real reason to block the rest of the /16 or is part of it sometimes used by humans?

Same as above for 72.30/16 and 98.137/16, much of which is banned, but the "parent" range 98.136.0.0 - 98.139.255.255 is "allowed".

My current annoyance is 98.139.241.224 - 98.139.241.252, which is used by yahoomobile. I get loads of bad hits on those IPs, plus a few that may be valid. Anyone have anything on those?
9:06 pm on May 9, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This doesn't look like a crawler to me. A crawler will NOT execute client-side JavaScript. What would be the reason for them to do this and even load the Adsense ad code thousands of times a day?

What is a Yahoo stealth crawl good for? They have Bingbot.
11:16 pm on May 9, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



A while back, I threw in the towel and went to

BrowserMatch Yahoo keep_out


I'm now trying to figure out why I'm suddenly showing up in Yahoo image search-- with results from roboted-out directories that I keep a close watch on.
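
Note that the BrowserMatch line on its own only sets the keep_out variable; something else has to act on it. A minimal sketch of a complete block, assuming the Apache 2.2-style access control used earlier in the thread (the Order/Allow/Deny lines are an assumption, not quoted from the post):

```apache
# Flag any request whose User-Agent contains "Yahoo" (match is case-sensitive)
BrowserMatch Yahoo keep_out

# Let everyone in except flagged requests
Order Allow,Deny
Allow from all
Deny from env=keep_out
```

BrowserMatchNoCase can be used instead if the match should ignore case.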
9:16 pm on May 22, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This bot really loves eating 403s, and it never learns.
1:44 pm on Nov 6, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



In another thread (unknown to me), I had mentioned that I had not seen the Inktomi activity that I was used to pre-2010.

Since re-activation in Feb 2012 the only Slurp requests I've seen are occasional full-page-requests (with supporting images and CSS), and from the 74.6. range.

This morning the following:
72.30.142.221 - - [06/Nov/2012:12:27:30 +0000] "GET /robots.txt HTTP/1.0" 200 2719 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

There were two additional requests (same IP and UA) for a sub-sub-directory page and CSS.

There's some interesting reading on Inktomi:
Inktomi Traffic Server source
Apache Traffic Server (TS)
 
