Forum Moderators: Ocean10000 & incrediBILL

Jakarta Commons-HttpClient

from Google App Engine

     
12:54 pm on Oct 2, 2009 (gmt 0)

Full Member

10+ Year Member

joined:Dec 20, 2002
posts:234
votes: 0


74.125.46.* and 216.239.50.*
yw-out-*.google.com and kc-out-*.google.com
Jakarta Commons-HttpClient/3.1
GET /googlehostedservice.html

I found Jakarta under "Included Software and Licenses for the Java Language Version of App Engine" here:
[code.google.com...]

Why are they looking for googlehostedservice.html? And who is "they"? Is this Google using their own App Engine, or a 3rd party hosted on Google's cloud computing?

With all the crap coming from Google now, how can we differentiate and verify Googlebot, Google Adsbot, Google stealth checks, Google manual site reviews, Google employees just browsing, Google Wireless Transcoder, translate.google.com, Google Keyword Tool and Google-Sitemaps, some of which share the same IP addresses?
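One partial answer to the "how do we verify" question is a forward-confirmed reverse DNS check: look up the PTR record for the IP, then resolve that hostname back and make sure it returns the same IP. Below is a rough sketch of that idea, not a definitive implementation. The function names and the three labels are made up for illustration, and treating only *.googlebot.com as the crawler is an assumption on my part. Hosts like the yw-out-*.google.com ones above would come back as Google, but not as Googlebot.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(host):
    """Return the set of IPv4 addresses host resolves to (empty on failure)."""
    try:
        return set(socket.gethostbyname_ex(host)[2])
    except OSError:
        return set()


def classify_google_ip(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Label an IP using forward-confirmed reverse DNS.

    The labels are illustrative only:
      "googlebot"    - PTR ends in .googlebot.com and resolves back to ip
      "google-other" - PTR ends in .google.com and resolves back to ip
      "not-google"   - PTR confirmed, but in some other domain
      "unverified"   - no PTR, or the forward lookup does not match
    """
    host = reverse(ip)
    if not host:
        return "unverified"
    host = host.rstrip(".").lower()
    # A PTR record can claim anything; only trust it if the forward
    # lookup of that hostname includes the original IP.
    if ip not in forward(host):
        return "unverified"
    if host.endswith(".googlebot.com"):
        return "googlebot"
    if host.endswith(".google.com"):
        return "google-other"
    return "not-google"
```

This only tells you whose network the request came from. It says nothing about which Google service sent it, so the shared-IP problem is still there for everything that resolves to plain google.com.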

6:00 am on Oct 26, 2009 (gmt 0)

Administrator from US 

incrediBILL

joined:Jan 25, 2005
posts:14624
votes: 88


Looks like it's related to the Google Apps engine, and here's a specific document about the googlehostedservice.html file:
[google.com...]

However, it's really sloppy programming on Google's part not to identify the user agent so we can make some actual sense of what it's supposed to be doing.

That would get blocked on my server, and I would simply stop using Google Apps rather than let all the default Jakarta user agents run amok on my server.

3:43 am on Oct 27, 2009 (gmt 0)

Senior Member from US 

tangor

joined:Nov 29, 2005
posts:6160
votes: 284


For several years I've had Jakarta* in 403. Will I have to rethink this?

4:44 am on Oct 27, 2009 (gmt 0)

Senior Member from US 

keyplyr

joined:Sept 26, 2001
posts:5815
votes: 64


I 403 Jakarta, with 12 IPs currently whitelisted.
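For anyone who wants to see that rule spelled out, here is a minimal sketch of "403 Jakarta unless the IP is whitelisted". Everything in it is an assumption for illustration: the function name, the prefix match on the user agent, and especially the whitelist, which uses reserved documentation ranges rather than anyone's real list.

```python
import ipaddress

# Hypothetical whitelist using reserved documentation ranges.
# Replace with the addresses you actually trust.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def status_for(user_agent, remote_ip, allowed=ALLOWED_NETWORKS):
    """Return the HTTP status to send for this request.

    Any user agent starting with "Jakarta" (case-insensitive) gets a 403
    unless the client address falls inside a whitelisted network.
    """
    if not user_agent.lower().startswith("jakarta"):
        return 200
    addr = ipaddress.ip_address(remote_ip)
    if any(addr in net for net in allowed):
        return 200
    return 403
```

With the placeholder whitelist above, a "Jakarta Commons-HttpClient/3.1" request from 74.125.46.* would get a 403 until someone adds that range on purpose.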
1:55 am on Nov 17, 2009 (gmt 0)

Senior Member

joined:Nov 5, 2005
posts:2038
votes: 1


Speaking of 74.125.46.* a.k.a. Google: I've been seeing these since yesterday. Related?

74.125.46.81
Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3
robots.txt? NO
referer: None

74.125.46.82
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
robots.txt? NO
referer: http://www.google.com/search?hl=en&q=www.mysitename.com+filename

(The filename in the referer was incomplete: both the title and the suffix were cut short.)