STOPPING people from translating using Google

         

mrlumpy

7:39 am on Feb 14, 2003 (gmt 0)

10+ Year Member



Hello,

I'm happy with my placement in the Google search engine.

However, I would like to stop people from translating my pages from one language to another. This is made difficult because Google fetches the page for the user and then on-the-fly translates.

Is there a specific block of IPs I can block to prevent translating? Or will that also block the Google spider and indexing (which I don't want to do)?

The reason I ask is because I am prohibited from offering some of my content in languages other than English. Even though it is the user who ultimately translates the content using Google, I would like to stop them from doing so.

Thanks for your help in advance!

chiyo

8:15 am on Feb 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Interesting question. I'm interested in the answer too, but it may be more useful to take it up with your supplier and explain the problem. They may accept that it is impossible to do. As far as I know, all the web translation services, at least the free ones, are fairly rough, and I doubt whether a document translated this way would in any way compete with one professionally translated.

piskie

8:44 am on Feb 14, 2003 (gmt 0)

10+ Year Member



There are many other translation tools, both online and desktop. Even if you find a way to bar the Google translator, you cannot bar them all.

IMHO you are only offering the Text Content in its intended primary language, as is your agreement. If the visitor uses a translation tool that is "Outside your Site" then it is equally outside your control.

You should make your content supplier aware that the visitor will be able to language process the text content if they wish using whatever is their chosen tool and you have no assured way to prohibit this.

Good Luck with this one.

mrlumpy

9:12 am on Feb 14, 2003 (gmt 0)

10+ Year Member



chiyo, piskie,

Thanks for your responses!

Unfortunately, I am a bit stuck with this one.

Suffice to say the content provider has the ability to remove my license if he's unhappy, and the on-the-fly Google translations vex him (he's the one who brought it to my attention). He doesn't really see the subtle difference between Google doing the translating and me doing the translating. To him, clicking on a Google link to my site, and seeing it in Spanish, is not acceptable.

I'm going out of my mind trying to find a solution.

I really don't know what to do. I need this content, and I certainly can't ban Google. I wish I knew if Google just used a few IPs for translation. I really wish they offered a robots.txt or similar solution. sigh...

Thanks for your help guys. I will soldier on...

stevedob

9:57 am on Feb 14, 2003 (gmt 0)

10+ Year Member



You could always provide that content in the form of one or more images, which would render it non-translatable. To get around G's ability to index PDF files you could convert something like Text -> PDF -> jpg/gif/whatever.

seindal

11:32 am on Feb 14, 2003 (gmt 0)

10+ Year Member



You can look at the referrer for 'translate_c', which is something of a Google marker. Block those requests and see if that disables it for your content provider. Some users who don't send referrer information will still get through, but as long as your content provider doesn't, you're safe :-)

René.
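
A minimal sketch of the referrer check described above, assuming the marker string 'translate_c' appears in the Referer header of requests made from a translated page. The names `isGoogleTranslateReferrer` and `statusForRequest` are hypothetical, and how the header value reaches the function depends on your server. It can only catch requests that actually carry such a referrer.

```javascript
// Returns true when a Referer header value looks like it came from a
// Google-translated page. The 'translate_c' marker is taken from the post
// above; the function name is hypothetical.
function isGoogleTranslateReferrer(referer) {
  if (typeof referer !== "string" || referer.length === 0) {
    return false; // no referrer sent, so there is nothing to match
  }
  return referer.toLowerCase().indexOf("translate_c") !== -1;
}

// Example policy: refuse matching requests (the status codes are illustrative).
function statusForRequest(referer) {
  return isGoogleTranslateReferrer(referer) ? 403 : 200;
}
```

Instead of refusing the request, the same test could select a redirect to an explanatory page.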

yetanotheruser

11:56 am on Feb 14, 2003 (gmt 0)

10+ Year Member



Mr Lumpy, seindal, et al..

The problem is that while I too thought you could spot the referrer, which should have 'translate_c' in it somewhere.. this'll only allow you to stop the images being fetched..

The actual page is grabbed as normal, and not that easy to spot.

What I would suggest (apart from getting a new content provider! ;) ) is this..

The page that google displays is identical to your original page (including all your script), just with any text content translated. It does however appear to the browser as the product of their translation program.

I guess therefore the easiest thing is to use JavaScript. This way, when the page loads you can get it to check its own URL. If the URL contains 'translate_c' (or in fact, checking that it doesn't start with your web address would work for all translators, not just Google) you can redirect the person to the proper page...

Since we're just talking google, you can then look for the url of your page in the href. "...u=[myPageHere]&..." and re-direct the person to your page..
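
A rough sketch of that idea, assuming the translated view's address contains 'translate_c' and carries the original page in a `u` query parameter, as described above. The function name `originalUrlFromTranslator` is hypothetical, and the `URL` class it relies on is only available in current browsers and Node.

```javascript
// Given the address the browser shows, return the original page URL if the
// page is being served through a translate_c-style wrapper, otherwise null.
// The 'translate_c' marker and the 'u' parameter come from the posts above;
// the function name is hypothetical.
function originalUrlFromTranslator(href) {
  if (href.indexOf("translate_c") === -1) {
    return null; // not a translated view
  }
  var parsed;
  try {
    parsed = new URL(href);
  } catch (e) {
    return null; // not a parseable absolute URL
  }
  var original = parsed.searchParams.get("u");
  return original ? original : null;
}

// In a browser, run the check as the page loads and go to the original page.
if (typeof window !== "undefined" && window.location) {
  var target = originalUrlFromTranslator(window.location.href);
  if (target) {
    window.location.replace(target);
  }
}
```

It may be safer to compare the extracted value against your own web address before redirecting, so the page never sends a visitor somewhere unexpected.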

Not sure if it's a good idea to post the url of the test page I just made, so I'll sticky it to you.. but feel free to use it..

<aside>
I still don't agree with it, and I don't think you should be responsible for resolving this.. your content provider should be if he's so worried about it! I personally think it's anti-user and a bad idea.. but I appreciate you may be stuck between the rock and the proverbial
</aside>

Anyhow.. good luck..

ATB :)

Brett_Tabke

12:20 pm on Feb 14, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It generally comes from proxy.google.com 216.239.35.5
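
A small sketch of an exact-match address check, assuming the server can see the requesting address as a string. The list holds only the address quoted above; the names `BLOCKED_IPS` and `isBlockedIp` are hypothetical. Whether the crawler ever uses the same address is not established here, so check your own logs before relying on it.

```javascript
// Addresses to refuse. 216.239.35.5 is the one mentioned in the post above;
// treat the list as an assumption to verify against your own logs.
var BLOCKED_IPS = ["216.239.35.5"];

// Returns true when the remote address exactly matches a blocked entry.
function isBlockedIp(remoteAddress, blockedList) {
  return blockedList.indexOf(remoteAddress) !== -1;
}
```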

duggelz

11:24 pm on Feb 14, 2003 (gmt 0)



Send the URL to webmaster@google.com and explain the situation. Translation can be disabled without affecting crawling and indexing.

amznVibe

12:47 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have about 20 alternative translation sites in my translation favorites, besides Google. I can also translate the Google cache page of your site or any copy on the Wayback Machine. I could also use a proxy to your site and just translate the proxy. Your page is either on the internet or it's not.

Your only option is to convert the entire page into Flash (though AllTheWeb can read Flash now, so Google isn't far behind) or into pure graphics and let dial-up visitors wait a few minutes while the page loads. I hope that content is really incredible for this much restriction and effort.

Key_Master

1:04 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



amznVibe,

Translators can be blocked or redirected, the Google cache can be blocked, and the Wayback machine can be blocked. All with little effort.

amznVibe

2:01 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



you cannot realistically stop over 2000 open proxies [openproxies.com] and I can just translate the proxy instead of your direct site
- anyone with a web host and a cgi-bin can also setup their own free personal proxy [jmarshall.com] as well

if you don't want people to view your source, copy, save, print, or translate your page, don't put it on the web

Key_Master

2:31 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Proxies are not difficult to beat [banbots.com].

Woz

2:35 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



mrlumpy, perhaps you could give your client an analogous example. Such as a person reading an English book, published for sale only in the USA, over the phone to a friend in Mexico and translating into Spanish on the fly. No way you can feasibly stop it.

Onya
Woz

seindal

4:10 am on Feb 15, 2003 (gmt 0)

10+ Year Member



As I have understood mrlumpy, the problem is not really a technical one but a problem of relations with a content provider. I think I would just block what the content provider has brought to my attention, in casu Google's translation service, and leave the rest alone.

Just satisfy the content provider in a minimal way, by redirecting Google's translate_c referrers to an explanatory page. If the content provider then comes back with another complaint, fix that then. After all, mrlumpy doesn't make the translation services, so mrlumpy can't possibly know them all, right?

There is no way translation can be blocked 100% effectively, so just accommodate the content provider on the specific issue and leave the others be.

René

Key_Master

4:39 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A proxy fetches a page, parses the page for hyperlinks, and delivers the modified page to the client under the proxy server's own domain name.

All you need is a frame-buster JavaScript in the header of your pages, where the URL is unrecognizable to proxy server software. The code would also prevent other sites from showing your pages in a frame on their site.
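
For reference, a frame buster can be sketched roughly as below. This is an illustration under assumptions, not the exact script meant above: `isFramed` and `bustFrame` are hypothetical names, and the address passed in is a placeholder for your own page.

```javascript
// Decide whether the page is being displayed inside another frame.
// 'win' is a window-like object; in a browser you would pass 'window'.
function isFramed(win) {
  return win.top !== win.self;
}

// If framed, load the page's own address in the top-level window.
// 'ownUrl' is supplied by the caller so it does not depend on the address
// the proxy or framing site is showing.
function bustFrame(win, ownUrl) {
  if (isFramed(win)) {
    win.top.location.replace(ownUrl);
    return true;
  }
  return false;
}

// Browser usage; the URL is a placeholder for your own page.
if (typeof window !== "undefined") {
  bustFrame(window, "http://www.example.com/");
}
```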

Simple example (redirects to WebmasterWorld):

<script type="text/javascript">
// The hex-escaped string decodes to "http://www.webmasterworld.com/".
window.location.replace("\x68\x74\x74\x70\x3a\x2f\x2f\x77\x77\x77\x2e\x77\x65\x62\x6d\x61\x73\x74\x65\x72\x77\x6f\x72\x6c\x64\x2e\x63\x6f\x6d\x2f");
</script>
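
An escaped string like the one above does not have to be typed by hand. A small helper along these lines, with the hypothetical name `hexEscape` and a placeholder URL, would produce it:

```javascript
// Convert each character of a string to a \xNN escape, producing text that
// can be pasted between the quotes of a JavaScript string literal.
function hexEscape(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code > 0xff) {
      throw new RangeError("hexEscape only handles characters up to 0xFF");
    }
    var hex = code.toString(16);
    out += "\\x" + (hex.length < 2 ? "0" + hex : hex);
  }
  return out;
}

// Print the escaped form of a placeholder address.
console.log(hexEscape("http://www.example.com/"));
```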

NickCoons

4:47 am on Feb 15, 2003 (gmt 0)

10+ Year Member



Key_Master,

<Proxies are not difficult to beat.>

And your code to beat proxies is not difficult to beat.

Key_Master

5:13 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It does not need to be difficult to beat. No proxy server is going to waste time interpreting and decoding JavaScript URLs.

Simple is better.

NickCoons

6:46 am on Feb 15, 2003 (gmt 0)

10+ Year Member



Key_Master,

<It does not need to be difficult to beat. No proxy server is going to waste time interpreting and decoding JavaScript URLs.>

They always do when I write them. Too many sites have <script src="script.js"> in them, and not parsing the JavaScript means that you miss everything in this file.

Key_Master

6:59 am on Feb 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Unless you are interpreting the JavaScript, it's easy to work around a parse.

var page = 'ht' + 'tp:/' + '/www' + '.' + 'example' + '.' + 'com/';

seindal

7:35 am on Feb 15, 2003 (gmt 0)

10+ Year Member



For the persistent, several open source JavaScript engines exist, like SpiderMonkey and Rhino from the Mozilla project. It is fairly easy for a good programmer to embed a JavaScript interpreter in a program.

René.

i18nguy

6:11 pm on Feb 24, 2003 (gmt 0)



Perhaps instead of focusing on IP addresses, write a JavaScript function to verify that a line of text in the file is unchanged. Either compare it with a string inside the JavaScript, which won't be changed, or run a small checksum.
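
A minimal sketch of the checksum variant, assuming the translator rewrites visible text but leaves numbers in the script alone. The additive checksum and the names `checksum`, `isTextUnchanged`, and `SENTINEL_CHECKSUM` are illustrative choices; the sentinel text "Hello" is a made-up example.

```javascript
// Simple additive checksum: sum of UTF-16 code units, modulo 65536.
function checksum(text) {
  var sum = 0;
  for (var i = 0; i < text.length; i++) {
    sum = (sum + text.charCodeAt(i)) % 65536;
  }
  return sum;
}

// Returns true when the text found in the page still has the expected checksum.
function isTextUnchanged(pageText, expectedChecksum) {
  return checksum(pageText) === expectedChecksum;
}

// Checksum of the hypothetical sentinel line "Hello", computed ahead of time
// (72 + 101 + 108 + 108 + 111 = 500) and stored as a number in the script.
var SENTINEL_CHECKSUM = 500;
```

In the page, the sentinel line would be read from the document when it loads and passed to `isTextUnchanged`. An additive checksum is weak (reordered letters give the same sum), but it only has to notice that the wording changed.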

mfagan

7:41 pm on Feb 24, 2003 (gmt 0)

10+ Year Member



Seems to me the simplest thing would be to add this in between your <head> tags.

<script type="text/javascript">
// Placeholder address: replace it with this page's own URL.
if (location.href.toLowerCase().indexOf("translate.google")>-1) {location.href="http://www.example.com/page.html"}
</script>

Of course, it doesn't work if they don't have javascript or it is disabled, but chances are that a browser that supports frames will also support javascript.

And this still doesn't prevent someone from using another translation service, or saving your code and removing the javascript, or just copying your text into a translation service. Not to mention getting a human to translate it...

david323

8:02 am on Mar 3, 2003 (gmt 0)

10+ Year Member



You can prevent your pages from being translated by using encryption. HTML Guardian is the de facto standard for web page encryption. It will prevent your pages from being cached, translated, ripped, filtered out, etc. MANY benefits. Go to [protware.com...]

Also, Google may have proprietary meta tags for preventing their machine translation. Here is their contact info:

Contact Information
Google Inc.
2400 Bayshore Parkway; Mountain View, Calif. 94043
Telephone: 650.318.0200
Fax: 650.618.1499
Email: info@google.com
Web: www.google.com

hakre

8:13 am on Mar 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi pplz,

javascript, encryption etc. pp.. i don't think any of this is an acceptable approach. the referrer blocking and ip-ban should do the job. why don't you set up a page for testing purposes and log the ip addresses of each request. then translate this page a few times with google and ban those ip addresses. maybe the tool even has a unique user-agent string you can block, too. this way you can prevent google from translating your pages.

nevertheless as stated here, it's a hard job to block any translation tools for your website. maybe it's better your boss does not know, but i think this can be as hard as blocking any spider.

so if you block google translation and your boss is only using this, fine. prob solved. with less discussion.
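
The test-page idea can be sketched as a small log filter. The entry shape `{ ip, path }` and the name `collectIpsForPath` are hypothetical, since how requests are logged depends on the server.

```javascript
// Collect the distinct client addresses that requested a given test page.
// Each log entry is assumed to be an object like { ip: "...", path: "..." };
// that shape is hypothetical and depends on how your server logs requests.
function collectIpsForPath(logEntries, testPath) {
  var result = [];
  for (var i = 0; i < logEntries.length; i++) {
    var entry = logEntries[i];
    if (entry.path === testPath && result.indexOf(entry.ip) === -1) {
      result.push(entry.ip); // first time this address hit the test page
    }
  }
  return result;
}
```

If only the translator is ever pointed at the test page, the addresses collected this way are candidates for the ban list.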

david323

6:21 pm on Mar 4, 2003 (gmt 0)

10+ Year Member



I sent an e-mail to googlebot@google.com and received the response below. This settles the matter. They are willing to stop it for you!

Hi David,

Thanks for contacting Google.

If you do not want your page translated by Google, please send us the url of your site and we can forward your request on to the appropriate department.

Regards,
The Google Team

Original Message Follows:
------------------------
From: David Walker <david323@pacbell.net>
Subject: Meta Tag for Preventing Machine Translation
Date: Mon, 03 Mar 2003 09:17:08 -0800

Is there a meta tag to prevent Google's machine translation from translating my web page? I need one to meet a no-translation requirement of my employer.

Thank You,
David Walker

-------------------------

My first suggestion to use HTML Guardian, [protware.com...] would be needed to prevent ALL translating. And you would have to use the Disable Text Selection feature of that program to prevent people from selecting text and pasting it into things like babelfish.