

Search Engine Spider and User Agent Identification Forum

    
Google Translate
wilderness




msg:4628906
 5:25 am on Dec 9, 2013 (gmt 0)

Google Translate appears to have changed its procedures. Blocking it may now require a header check.


66.249.81.241 - - [08/Dec/2013:20:45:01 -0700] "GET /MyFolder/MySub/mypage.html HTTP/1.1" 200 8483 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; MAMD; BOIE9;ENUS; rv:11.0) like Gecko,gzip(gfe)"
212.61.239.zz - - [08/Dec/2013:20:45:02 -0700] "GET /ImageFolder/MyImage.gif HTTP/1.1" 403 644 "http://translate.googleusercontent.com/translate_c?anno=2&depth=1&hl=nl&rurl=translate.google.nl&sl=en&tl=nl&u=http://www.example.com/MyFolder/MySub/mypage.html&usg=ALkJrhi2qIaC7VPNIL-UsNLSx2LzNpitqA" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; MAMD; BOIE9;ENUS; rv:11.0) like Gecko"

 

wilderness




msg:4629012
 4:39 pm on Dec 9, 2013 (gmt 0)

This will suffice:

# Return 403 when the UA contains "gzip" and the request comes from 66.249.81.*
RewriteCond %{HTTP_USER_AGENT} gzip
RewriteCond %{REMOTE_ADDR} ^66\.249\.81\.
RewriteRule .* - [F]

Angonasec




msg:4629224
 9:44 am on Dec 10, 2013 (gmt 0)

Splendid, thank you!

Will Bing translate follow suit?

lucy24




msg:4629412
 8:54 pm on Dec 10, 2013 (gmt 0)

Only if it also has "gzip" in the UA. And, ahem, if you add lines or pipe-separated alternatives for all Bing Translate IPs.
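
A minimal sketch of the pipe-separated form, extending wilderness's rule above. The two extra prefixes are documentation-range placeholders (an assumption, not real Bing Translate addresses); substitute whatever ranges your own logs show. It also assumes mod_rewrite is enabled and that the rule lives in .htaccess or a directory context.

```apache
RewriteEngine On

# Condition 1: UA contains the "gzip" token.
RewriteCond %{HTTP_USER_AGENT} gzip
# Condition 2: IP starts with any one of the listed prefixes.
# 203.0.113. and 198.51.100. are HYPOTHETICAL placeholders, not Bing ranges.
RewriteCond %{REMOTE_ADDR} ^(66\.249\.81\.|203\.0\.113\.|198\.51\.100\.)
# Both conditions must match (implicit AND) before the request is forbidden.
RewriteRule ^ - [F]
```

One grouped alternation keeps everything in a single RewriteCond; separate lines joined with [OR] would behave the same and may be easier to maintain as the list grows.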

keyplyr




msg:4629458
 12:46 am on Dec 11, 2013 (gmt 0)



Can't think of any legit human browser that carries "gzip" in the UA string, so the simplest method IMO would be just to block that attribute as a general rule. YMMV.
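
As a rough sketch, the general rule could look like the following, assuming mod_rewrite is enabled. It drops the IP condition entirely, so it will refuse every request that carries the token, whatever service sent it.

```apache
RewriteEngine On

# Forbid any request whose User-Agent contains "gzip", regardless of source IP.
# [NC] makes the match case-insensitive (an assumption; the logged token is lowercase).
RewriteCond %{HTTP_USER_AGENT} gzip [NC]
RewriteRule ^ - [F]
```

The trade-off is breadth: there is no IP list to maintain, but anything else that happens to put "gzip" in its UA gets a 403 as well.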

Angonasec




msg:4629478
 1:32 am on Dec 11, 2013 (gmt 0)

Yes, we block gzip UA as default.

Anyone noticed Bing alter its translate bot characteristics yet?

lucy24




msg:4629505
 2:43 am on Dec 11, 2013 (gmt 0)

:: detour to raw logs ::

Occurrences of "gzip" anywhere in logs within this calendar year:

6 January
74.125.189.17 - - <snip> "http://translate.google.com/translate_p<snip>" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A403 Safari/8536.25,gzip(gfe)"

19 February (unedited) *
74.125.189.20 - - [19/Feb/2013:23:44:53 -0800] "GET /games/LucysDownloads.html HTTP/1.1" 200 5522 "-" "Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0,gzip(gfe)"
(Surrounding requests point to Translate, though referer didn't say so.)

14 March:
two occurrences of 74.125.189.nnn

... Well, this is getting boring ;) In June for variety's sake we move to a 66.249.aa.bb IP. Further detour to 2012 confirms impression that they changed their default IP from 74.125 to 66.249 about six months ago. Now, obviously I've got a tiny little site-- but I don't see any occurrence of the element "gzip" in UAs other than Google Translate.

Some people also look at the X-Forwarded-For header-- either its content, or whether it exists at all. I know Google Preview sends one while Bing Preview (which is probably not a preview) doesn't. My second-to-last Translate request came with an X-Forwarded-For. Also a "Via:" header, if anyone cares.
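
For anyone who wants to try the header approach, here is a hedged sketch that tests only whether the headers exist. Pairing it with the "gzip" UA check is an assumption on my part: ordinary proxies also send X-Forwarded-For and Via, so testing the headers alone would likely catch legitimate visitors.

```apache
RewriteEngine On

# Condition 1: UA contains "gzip".
RewriteCond %{HTTP_USER_AGENT} gzip
# Conditions 2 and 3: at least one of the two proxy headers is present and non-empty.
# [OR] binds tighter than the implicit AND, so this reads as
#   UA-has-gzip AND (X-Forwarded-For present OR Via present).
RewriteCond %{HTTP:X-Forwarded-For} . [OR]
RewriteCond %{HTTP:Via} .
RewriteRule ^ - [F]
```

The %{HTTP:name} form reads any request header by name; a pattern of "." simply requires one character, which is a compact way to say "header exists and is not empty".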

In the specific case of Google, you'll also see the element "translate" in the referer for non-page files. But that's after-the-fact information, unless you've also got hotlinking issues.
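
If hotlinking is a concern, a sketch along these lines would refuse supporting files requested with a Translate referer. The hostnames are taken from the log lines quoted in this thread; the extension list is an assumption to adjust for your own site. It does nothing about the page itself, which has already been fetched by then.

```apache
RewriteEngine On

# Referer begins with translate.google.<tld> or translate.googleusercontent.com
RewriteCond %{HTTP_REFERER} ^https?://translate\.google(usercontent)?\. [NC]
# Apply only to non-page files (HYPOTHETICAL extension list).
RewriteRule \.(gif|jpe?g|png|css|js)$ - [F,NC]
```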


* Looking at this again, I really hope that google translate successfully rendered "for Macintosh only". I seriously doubt your average Windows system comes with a Mac Classic OS emulator.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved