A script is mirroring my client's site with Google Translate?

Problem with a Russian programmer mirroring my site with Google Translate

         

maximus12

4:57 pm on Feb 24, 2009 (gmt 0)

10+ Year Member



First off, hi from a newbie! I have been a reader here for a few months but just joined today. I am looking for advice and help, and I hope I am posting this in the correct forum.

For the last few weeks I have noticed a lot of traffic coming from one IP address every time I made changes to a client's Joomla CMS site. I have a forum and a blog attached in subfolders of the site (3 separate databases in phpMyAdmin / MySQL).

The IP addresses belonged to Google Translate. I did some digging and saw that another site (a Russian guy) copied the entire Joomla CMS website and is using this script:


<script>_infowindowVersion=1;
_intlStrings._originalText = "&#nnnn; ... &#nnnn;:";
_intlStrings._interfaceDirection="ltr";_intlStrings._interfaceAlign="left";
_intlStrings._langpair="en|it";
_parentUrl="http://translate.google.com";
_intlStrings._feedbackUrl=_parentUrl+"/translate_suggestion";
_intlStrings._suggestTranslation="&#nnnn; ... &#nnnn;";
_intlStrings._submit="&#nnnn; ... &#nnnn;";
_intlStrings._suggestThanks="&#nnnn; ... &#nnnn; Google &#nnnn; ... &#nnnn;.";
_intlStrings._reverse=false;</script>

(The code is not displaying correctly because of the Russian text.)

What it is doing is retrieving the site's content in real time and sending it, translated, to a mirror site (identical to mine but in Italian).

He is inserting this after the <head> and ending it before the <title>. Every time someone makes a forum post, or even if I make a change on the site, Google Translate automatically comes to the site and mirrors it back in Italian to the mirror site. And it updates in seconds.

I cannot block his IP address because he is not accessing the site; Google Translate is. I do not want to block Google, of course. Plus, the IP address comes from different Google datacenters depending on which country the user accessing the mirror site is in.

Now if I turn the site off for maintenance, the Italian mirror site goes down as well. So he is dependent on my databases? Does anyone have any ideas on what I can do? Has this happened to you as well? Is this an SQL injection problem or cross-site scripting?

Thanks in advance for your help and I hope I am posting this in the right forum.

Maximus

[edited by: phranque at 4:44 am (utc) on Feb. 25, 2009]

[edited by: tedster at 7:29 am (utc) on Feb. 25, 2009]
[edit reason] removed encoded text, added line breaks [/edit]

ogletree

1:29 am on Feb 27, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Boy, you could have a lot of fun with that. You could detect the Google Translate IP and send it something different. Let your imagination go wild. You could put in code that would redirect visitors back to your site.

maximus12

4:27 pm on Feb 27, 2009 (gmt 0)

Hmmm, I do not fully understand you, Ogle. Can you be more specific? How would I redirect Google's IP? If I completely block the Google Translate IP, will that affect SEO and keep the Google spider from coming to my site? I am hesitant to block Google's IP because of this. But the mirror site is using two main Google Translate IPs, which account for 95% of the mirroring.

Thanks for your response..

londrum

4:56 pm on Feb 27, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



you can use this tiny little php script at the top of all your pages, which blocks most translation sites, including google's, without blocking their IP.
but you'll probably lose a few innocent visitors as well.

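<?php
// Requests relayed by a proxy or translation service usually carry an
// X-Forwarded-For header; direct visitors normally do not.
if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
    // Answer with an error status and a stub page, then stop rendering.
    header('HTTP/1.1 503 Service Unavailable');
    print("<html><head>\n");
    print("<title>Error</title>\n");
    print("</head><body>\n");
    print("<p>This page is blank</p>\n");
    print("</body></html>");
    exit;
}
?>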

[edit]... actually, instead of serving up a blank page, it would probably make more sense to redirect them to your homepage or something
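
A minimal sketch of that redirect idea, under a few assumptions not stated in the thread: the homepage lives at "/", the server runs PHP 7 or later, and the helper name forwarded_request_action is made up for illustration. One catch is that if the same check runs on the homepage too, a client that follows redirects could be bounced back to the homepage forever, so this sketch redirects inner pages only and falls back to the stub page on the homepage itself. Whether a given translation service follows the redirect at all is not something the thread confirms.

```php
<?php
// Hypothetical helper: decide how to answer a request.
// Returns 'allow' (direct visitor), 'redirect' (forwarded request for an
// inner page) or 'block' (forwarded request for the homepage itself).
function forwarded_request_action(array $server, string $homePath = '/'): string
{
    if (!isset($server['HTTP_X_FORWARDED_FOR'])) {
        return 'allow';
    }
    // Compare only the path part of the URL, ignoring any query string.
    $path = parse_url($server['REQUEST_URI'] ?? '/', PHP_URL_PATH);
    return ($path === $homePath) ? 'block' : 'redirect';
}

$action = forwarded_request_action($_SERVER);
if ($action === 'redirect') {
    header('Location: /', true, 302); // '/' is the assumed homepage path
    exit;
}
if ($action === 'block') {
    header('HTTP/1.1 503 Service Unavailable');
    print("<p>This page is blank</p>");
    exit;
}
```

Like the original snippet, this has to run before the page sends any output, otherwise the header() calls will fail.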

maximus12

5:08 pm on Feb 27, 2009 (gmt 0)

Wow, cool, thanks londrum. Let me see if that works. But will this affect my SEO ranking or scare Google away from crawling my site?

londrum

5:11 pm on Feb 27, 2009 (gmt 0)

nope, it won't affect normal google stuff. i've been using it for years.
but remember that people sometimes use proxies innocently... so you might lose a few visitors.

maximus12

5:21 pm on Feb 27, 2009 (gmt 0)

Londrum, I did a subdirectory test (test/index.php) and it worked. Cool, he can't mirror it! Now I have to figure out how to apply this code to the entire site. I use Joomla with a template and have a /forum and a /wordpress blog in subdirectories. Any idea how I can add this check to the whole site?

Thanks for all your help, you are saving my bum here.
THANKS

maximus12

5:22 pm on Feb 27, 2009 (gmt 0)

Oh, and I'm not worried about losing a few visitors. I will take the code out once this guy sees he can't do it anymore and gives up... :)

ogletree

5:52 pm on Feb 27, 2009 (gmt 0)

You are going to have to put that at the top of your pages. All the things you mention use templates, so just find the header file for each template and change it.
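
One way to avoid pasting the same lines into every template is to keep a single copy in a shared file and pull it in from each entry point. The sketch below assumes PHP 7 or later; the file name block_forwarded.php, the function name, and the include path are placeholders, not anything Joomla, WordPress, or the forum software provides.

```php
<?php
// block_forwarded.php (hypothetical file name): one shared copy of the check.
// Returns true when the request was answered with the stub page.
function block_forwarded_requests(array $server): bool
{
    if (!isset($server['HTTP_X_FORWARDED_FOR'])) {
        return false; // direct visitor: let the page render normally
    }
    header('HTTP/1.1 503 Service Unavailable');
    print("<html><head><title>Error</title></head><body><p>This page is blank</p></body></html>");
    return true;
}

if (block_forwarded_requests($_SERVER)) {
    exit;
}

// In each entry point (the Joomla, blog, and forum index.php files),
// before any output is sent:
// require_once '/path/to/block_forwarded.php'; // placeholder path
```

PHP's auto_prepend_file setting can load such a file before every script without touching the templates, if the hosting setup allows changing it.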

maximus12

6:02 pm on Feb 27, 2009 (gmt 0)

I just did that and it worked for the main site. Now I am working on finding the main index.php file for the WordPress template and the vBulletin one as well. But it works, guys. Thank you so much, he cannot mirror the site any more. I am so happy. I am just worried that it will impact Google crawlability and rankings. Or is this completely different, and is this PHP script harmless for SEO?

I guess now no one legit can translate the site, right? You also mentioned proxies. What does this mean and how can it affect the site? My plan is to leave the script in for a few weeks until this guy moves on and mirrors another site, and then I will remove it.

londrum

6:16 pm on Feb 27, 2009 (gmt 0)

you don't have to worry about normal search engines and normal visitors. it's only going to affect stuff that takes your page and forwards it on to somewhere else, because they are the only ones that send the X-Forwarded-For header (which php exposes as $_SERVER['HTTP_X_FORWARDED_FOR']).

legit sites (like google's translation service) do provide it, so the script blocks them. proxies are supposed to provide it as well, but most don't (because they are trying to keep their users' details private). so it will work with some proxies, but not others.
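
To see who would be caught before actually blocking anyone, the header can be logged instead. The value is a comma-separated list with the original client first, followed by any proxies the request passed through. This is only a sketch, assuming PHP 7 or later; the function name parse_forwarded_for and the log message are made up for illustration.

```php
<?php
// Split an X-Forwarded-For value such as "client, proxy1, proxy2"
// into a clean list of entries.
function parse_forwarded_for(string $headerValue): array
{
    $entries = array_map('trim', explode(',', $headerValue));
    // Drop empty entries left behind by stray commas or an empty header.
    return array_values(array_filter($entries, function ($entry) {
        return $entry !== '';
    }));
}

// Hypothetical dry run: record forwarded requests rather than blocking them.
if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
    $chain = parse_forwarded_for($_SERVER['HTTP_X_FORWARDED_FOR']);
    error_log('Forwarded request via: ' . implode(' > ', $chain));
}
```

A few days of log entries should show how many ordinary proxy users the block would turn away. Note that if the site itself sits behind a load balancer or CDN, every request may carry this header.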

maximus12

7:41 pm on Feb 27, 2009 (gmt 0)

Londrum,

Ok, so I installed the script in all the main index.php files (WordPress, vBulletin and Joomla), and sure enough it works for most of the pages (100% on the homepage). But for some internal links it still does not work, and the mirror displays that internal page. It's like 60% of the mirror site has been blocked but 40% is still getting through.

For example:
www.mirrorsite.com (is not displaying)
www.mirrorsite.com/concert_times.html (is not displaying)
www.mirrorsite.com/upcoming_events.html (IS Displaying)

www.mysite.com/wordpress (is not displaying)
www.mysite.com/wordpress/2009/01/12/townsed-concert-live/ (is not displaying)
www.mysite.com/wordpress/2009/01/18/weekend-retreat-for/ (IS DISPLAYING)

Same for forum....

Is this a cache issue that will resolve over time? Or are some internal pages not feeding the php script of the index.php file?

Thanks again for all the help..

maximus12

8:07 pm on Feb 27, 2009 (gmt 0)

I meant to say www.mirrorsite.com in both examples, not www.mysite.com in the second example. :)

londrum

8:21 pm on Feb 27, 2009 (gmt 0)

that sounds like a caching issue. are you sure that you're not just seeing the page that's cached in your browser?

[edit]... just thought... have you got one of those caching plugins installed on wordpress? you might have to clear that cache as well.

[edited by: phranque at 10:40 pm (utc) on Feb. 27, 2009]

maximus12

8:59 pm on Feb 27, 2009 (gmt 0)

Londrum,

You were right, it is a caching issue. To test this I made a change on one of the pages it was still mirroring (put "test" in an h2 tag) and it did not show up on the mirror. 10 seconds later the page it was still mirroring was blank and could not be mirrored any more. So the rest of the pages it is still mirroring are just cached and should clear up over the next few days :) THANK YOU SO MUCH, THIS HAS HELPED SO MUCH!

As it stands now, Google Translate cannot access my site, so any legit user in Argentina (for example) who wants to translate the site into Spanish cannot, and would see a blank page (I tested this). I do not mind that, because I can remove the script once nobody is trying to mirror the site and put it back if someone tries again.

So you are sure this will have no NEGATIVE impact on SEO, rankings, or Google accessing my pages and content for indexing, correct? The only negative impact would be on legit users trying to translate?

Thanks...

wilderness

10:00 pm on May 12, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hmmm, I do not fully understand you, Ogle. Can you be more specific? How would I redirect Google's IP? If I completely block the Google Translate IP, will that affect SEO and keep the Google spider from coming to my site?

NO it will NOT.
I've had portions of the Google translator UA (as well as other translators) and Google's translator IP range denied for longer than I'm able to recall.

Don
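
A user-agent check along those lines could look roughly like the sketch below. It is written in PHP to match the earlier snippet, although server-level deny rules may have been used instead. The substring "translate.google.com" is an assumption about what the translator puts in its User-Agent string, so check your own access logs before relying on it; the function name is made up for illustration, and the code assumes PHP 7 or later.

```php
<?php
// Return true when the User-Agent contains any of the given substrings.
// The default list is an assumption, not a documented Google value.
function is_translator_user_agent(string $userAgent, array $needles = ['translate.google.com']): bool
{
    foreach ($needles as $needle) {
        // Case-insensitive substring match.
        if (stripos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

if (is_translator_user_agent($_SERVER['HTTP_USER_AGENT'] ?? '')) {
    header('HTTP/1.1 403 Forbidden');
    exit;
}
```

Compared with testing for X-Forwarded-For, this is narrower: ordinary proxy users are let through, but any translator whose user agent is not on the list is let through as well.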