| 9:31 pm on Nov 6, 2012 (gmt 0)|
I block all known AMAZON-EC2 ranges, including this one:
184.108.40.206 - 220.127.116.11
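For anyone keeping ranges in start-end form, here is a minimal sketch of collapsing a range into CIDR blocks and testing an address against them, using Python's standard `ipaddress` module. The helper names (`range_to_cidrs`, `is_blocked`) and the 10.x addresses are placeholders chosen for illustration, not the EC2 range quoted above.

```python
import ipaddress

def range_to_cidrs(first, last):
    """Collapse an inclusive start-end range into the CIDR blocks that cover it."""
    return list(ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)))

def is_blocked(ip, networks):
    """Return True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Placeholder range in private address space (an assumption for the example).
BLOCKED = range_to_cidrs("10.0.0.0", "10.3.255.255")

print(BLOCKED)                          # the CIDR blocks covering the range
print(is_blocked("10.2.7.9", BLOCKED))  # address inside the range
print(is_blocked("10.4.0.1", BLOCKED))  # address just past the end of the range
```

A range that does not line up on a single CIDR boundary comes back as several blocks, so the result is always treated as a list.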
| 10:20 pm on Nov 6, 2012 (gmt 0)|
:: quick detour to whois ::
Yup, still listed as 18.104.22.168/14
As long as you're there, I've also got the adjoining 23.19 blocked. Server farm, apparently.
And, in the other direction: if 22.214.171.124/14 is still Comcast Business, they're probably expendable as well.
Come to think of it, I don't see much of anything in 23 except for a couple of Canadian IPs. Most of the range is-- or was until recently-- unassigned, so I'd expect a lot of filling-in over the next couple of years.
| 1:00 am on Nov 7, 2012 (gmt 0)|
Thanks for the Ubiquity range.
Just an FYI - Comcast Biz also includes all those employees who surf from their desks. We had a lengthy discussion about this last year. I spot-checked a year's logs and found thousands of human hits from inside that range (I had also considered blocking it).
| 2:13 am on Nov 7, 2012 (gmt 0)|
|Comcast Biz also includes all those employees who surf from their desks. |
Ah ha. I don't think I've formally blocked any of their (many, many, many) ranges for that very reason: Just when you think it's completely useless, you get a bona fide human. And in my case it can be very hard to tell if they're goofing off or looking up something work-related ;)
| 3:52 am on Dec 25, 2012 (gmt 0)|
IP Address: 126.96.36.199
User Agent: NING/1.0
Seems Yahoo is using this user agent, as well?
| 4:12 am on Dec 25, 2012 (gmt 0)|
|IP Address: 188.8.131.52 |
I have this noted (and denied) as Inktomi Cache.
Please see the WayBack thread and reference to NOARCHIVE in the thread immediately below this thread.
| 4:26 am on Dec 25, 2012 (gmt 0)|
184.108.40.206 - 220.127.116.11 18.104.22.168/16
Listed as INKTOMI-BLK-5 maintained by Yahoo.
| 4:59 am on Dec 25, 2012 (gmt 0)|
wilderness, this topic: [webmasterworld.com...] - I didn't see anything mentioned, specifically?
Pardon my ignorance if it's plain as day. I searched the NING/1.0 user-agent and this topic came up. Nothing about Wayback machine/noarchive did.
What's their relation?
| 5:21 am on Dec 25, 2012 (gmt 0)|
My apologies brokaddr.
My reference was to the perils of allowing SE's and others to "cache" pages as per the NOARCHIVE link provided by Bill and repeated by myself.
Slurp/Inktomi and Yahoo all cache pages and even send their bots on solitary page requests (with full supporting files) for their cache.
It's a good idea to separate the valid SE bots from all the SE's auxiliary tools, and then only allow the valid bots.
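A rough sketch of that separation, done on the user-agent string alone. The token lists and the `classify` function are hypothetical examples, not anyone's actual rule set; the only entry taken from this thread is NING/1.0. User-agent strings can be forged, so in practice this would be paired with an IP range check.

```python
# Hypothetical lists for illustration; adjust to what your own logs show.
ALLOWED_CRAWLER_TOKENS = ("bingbot", "googlebot")  # example crawler tokens
DENIED_TOOL_TOKENS = ("ning/1.0",)                 # the agent reported in this thread

def classify(user_agent):
    """Sort a request into 'deny', 'allow-crawler', or 'other' by user agent."""
    ua = user_agent.lower()
    # Check the deny list first so an auxiliary tool never slips through.
    if any(token in ua for token in DENIED_TOOL_TOKENS):
        return "deny"
    if any(token in ua for token in ALLOWED_CRAWLER_TOKENS):
        return "allow-crawler"
    return "other"

print(classify("NING/1.0"))
print(classify("Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"))
```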
| 7:21 am on Dec 25, 2012 (gmt 0)|
|It's a good idea to separate the valid SE bots from all the SE's auxiliary tools, and then only allow the valid bots. |
I wasn't even aware Yahoo did that. Is there an easy way to decipher which is which?
| 11:03 am on Dec 26, 2012 (gmt 0)|
I'm not aware of any Yahoo crawlers; rather, AFAIK, Yahoo uses the crawls by MSN/Bing for their SERPs.
Somebody else may be able to provide IP's.
All I have documented since reactivation in February are Yahoo utilities, which I do not allow.
I haven't had any full crawls from anything that has identified itself as Yahoo, Slurp, or Inktomi since that same reactivation.
| 8:35 pm on Dec 26, 2012 (gmt 0)|
|I'm not aware of any Yahoo crawlers |
There are actually several different bots still run by Yahoo, mostly in Europe and Asia, but Inktomi and Slurp still hit my US site, possibly because I have a lot of inbound from Europe.
| 9:22 pm on Dec 26, 2012 (gmt 0)|
I'm not getting any full crawls, just solitary page requests for a few select pages (with complete accompanying files), and the same pages keep repeating.
| 2:51 am on Dec 27, 2012 (gmt 0)|
I'm not getting full crawls either. Never did from the 2nd-level bots, although Slurp and Inktomi did full site crawls a couple of years ago. They may have been re-assigned for other purposes.