
Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Site Hijacking Theft - Can't Stop Site Stealing
Another site showing our web site content.
friendlyseo

10+ Year Member



 
Msg#: 3033 posted 8:14 pm on Mar 6, 2005 (gmt 0)

A web site in China is getting listings on Google for MY pages' content, AND all MY linked URLs on the pages are changed to their site, AND they add a link to their other site at the bottom of the page - powered by 1bu.net

My link to WebmasterWorld now is seen in their source as:

webmasterworld.com.theirdomain.com

All other links too - internal and external are hijacked.

I tried the anti-hotlink .htaccess file from [webmasterworld.com...] but it doesn't stop this new kind of SITE hijacking, which I haven't seen before.

How can they use my HTML pages like that, intact except for the changed URLs?

Shouldn't an anti hotlink rewrite rule stop the images from showing on their URL? It doesn't...

Any suggestions on how to help? There must be a dupe content penalty that is gonna hurt us bad. They already rank on the first page of G where I should be.

<snip>

Right now Google has indexed 175,000 of their HIJACKING pages so far.

How is it they get away with this and Google hasn't banned them? They're banning the wrong people, too.

Looking at the source code for one of their pages, it is identical except that it has <their own domain name> appended to all URLs.

Hope someone can help us bust these <guys> or at least find a way to stop the un-authorized use of our content.

Thanks!

[edited by: jdMorgan at 5:31 pm (utc) on Mar. 8, 2005]
[edit reason] Obscured specifics, etc. per TOS. [/edit]

 

sitz

5+ Year Member



 
Msg#: 3033 posted 3:37 am on Mar 7, 2005 (gmt 0)

Looking at the source of that page, it looks like the images are still being sourced from the origin server. This means you could use mod_rewrite to do Referer-based blocking of the images, for whatever that's worth (details on how to accomplish this are in the forum archives). If you were feeling bored, you could even re-work the RewriteRule to return images you made "just for them", which could be anything you want:

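RewriteEngine on
# Skip requests that arrive without a Referer header
RewriteCond %{HTTP_REFERER} !^$
# Match referrals from any host under theirdomain.com
RewriteCond %{HTTP_REFERER} ^(https?://)?([^.]+\.)+theirdomain\.com
# Do not rewrite the replacement image itself, which would loop
RewriteCond %{REQUEST_URI} !-stopstealing\.gif$
RewriteRule ^(.*)\.gif$ $1-stopstealing.gif [L]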

Then create (for instance) /images/nav/c-webmaster-stopstealing.gif, which could be any image you want, really. For instance, a large GIF file (in physical size, not byte size; you want to inform the user, not punish them with a large download) with a textual warning that the contents of the site had been stolen. If you go this route, I'd suggest an 800x600 image, just black text on a white background, with maximum GIF compression. It would be keen if you could redirect the browser completely, but since the browser is going to take whatever you send it for the request and try to use it like an image, your (technical) options are a bit limited. It would also be keen if one could code up some JavaScript to handle things like this; however, since the JavaScript script src URLs appear to get rewritten to copies on the 1bu.com servers, they presumably have the ability to edit the JavaScript to filter out things like that. Of course, you could always run your JavaScript through an obfuscator ([javascript-source.com ]), although I'm not really sure how much that buys you; that obfuscation isn't really all that impressive. =)

That said, this may not be as malicious as you think. A quick google search turned up the following:

[threadwatch.org ]

Also note that they *appear* to have an opt-out form, but you have no way of knowing what they'll actually *do* with the data you send them. ;)

[edited by: jdMorgan at 5:33 pm (utc) on Mar. 8, 2005]
[edit reason] Obscured specifics per TOS. [/edit]

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month



 
Msg#: 3033 posted 3:43 am on Mar 7, 2005 (gmt 0)

Two things....

1) File a DMCA notice with Google ASAP:
[google.com...]

2) Check your web logs for volume of downloading and block their IPs.
Just be careful and don't block a search engine!

Heck, I just blocked a european site that was downloading my entire site today, and two others from asia last week. Happens all the time and you have to be vigilant to keep it under control.
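For the log check in point 2, a rough sketch in Python of how one might tally requests per client address. It assumes the access log is in Common or Combined Log Format, where the client address is the first whitespace-separated field; the function name `top_clients` and the sample entries are made up for illustration.

```python
from collections import Counter

def top_clients(log_lines, limit=10):
    """Count requests per client address; the first field is the remote host."""
    counts = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)

# Hypothetical sample entries; with a real log, pass the open file instead,
# e.g. `with open("access_log") as fh: print(top_clients(fh))`.
sample = [
    '192.0.2.10 - - [07/Mar/2005:03:40:01 +0000] "GET / HTTP/1.1" 200 5120',
    '192.0.2.10 - - [07/Mar/2005:03:40:02 +0000] "GET /a.html HTTP/1.1" 200 2048',
    '198.51.100.7 - - [07/Mar/2005:03:41:10 +0000] "GET / HTTP/1.1" 200 5120',
]
print(top_clients(sample))
```

A high count alone doesn't say who the client is, so it is worth checking who owns an address before blocking it, to avoid shutting out a search engine crawler.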

[edited by: jdMorgan at 5:35 pm (utc) on Mar. 8, 2005]
[edit reason] Internationalized. [/edit]

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3033 posted 4:10 pm on Mar 7, 2005 (gmt 0)

Previous discussion of this particular "hotlinker" has indicated that they are simply acting as a proxy -- A user request to "their copy" of your page results in their server loading the page from your site, and passing it on to the requestor. It is therefore a simple matter of blocking their servers' IP address range to stop this.

And because they do not actually keep a copy of your site on their server, a DMCA or copyright infringement claim will not apply. As a good indication of this, note that they do the same thing with Google's pages, and Google has more and better internet-savvy lawyers than most.

Block their IP address range, and that should take care of the problem.

Jim

privacyfanatic

5+ Year Member



 
Msg#: 3033 posted 7:03 pm on Mar 7, 2005 (gmt 0)

They hijacked Microsoft, too!

And, I just let Microsoft know about it too. If anybody can squash this guy it's MS.

I did a SmartWhois on www.microsoft.com.theirdomain.com

It reported:

<ip address/hostname>
202.96.140.x
www.microsoft.com.theirdomain.com
Host reachable, 383 ms. average

<owner>
CHINANET Guangdong province network
Data Communication Division
China Telecom

==================================================

[edited by: jdMorgan at 5:38 pm (utc) on Mar. 8, 2005]
[edit reason] Removed specifics per TOS. [/edit]

pendanticist

WebmasterWorld Senior Member, 10+ Year Member



 
Msg#: 3033 posted 7:37 pm on Mar 7, 2005 (gmt 0)

Seems they've got mine in there too, and they've even added the tagline theirdomain.com onto all the logo links on my front page. Like, safesearch and surfnetkids and all the others.

Well, well, well. Got my Vietnam Veterans Memorial page too. Every pic.

I've banned them via IP number. How long should I, or we, expect it to take before they can no longer serve it up?

Days, weeks, what?

[edited by: jdMorgan at 5:39 pm (utc) on Mar. 8, 2005]
[edit reason] Obscured specifics per TOS. [/edit]

bobothecat



 
Msg#: 3033 posted 7:57 pm on Mar 7, 2005 (gmt 0)

simply ban this range: 202.96.128.0 - 202.96.191.255 - seems to stop them in their tracks :)

BTW - you can do any domain/address and it will appear... they seem to send out a spider or something that takes the page requested.
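For anyone who wants that range as a single CIDR block, here is a small sketch using Python's standard `ipaddress` module. It only does the arithmetic on the two endpoints quoted above; it says nothing about who actually owns the addresses.

```python
import ipaddress

first = ipaddress.IPv4Address("202.96.128.0")
last = ipaddress.IPv4Address("202.96.191.255")

# summarize_address_range yields the CIDR blocks that exactly cover the range
blocks = list(ipaddress.summarize_address_range(first, last))
print(blocks)
```

The range collapses to one block, 202.96.128.0/18, which covers 16,384 addresses.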

[edited by: bobothecat at 8:21 pm (utc) on Mar. 7, 2005]

[edited by: jdMorgan at 5:40 pm (utc) on Mar. 8, 2005]
[edit reason] Obscured specific IP per TOS. [/edit]

Livenomadic

10+ Year Member



 
Msg#: 3033 posted 8:12 pm on Mar 7, 2005 (gmt 0)

How do I ban an IP range?

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month



 
Msg#: 3033 posted 8:20 pm on Mar 7, 2005 (gmt 0)

AH HA! I found they are using my site but it's the old (still active) domain name.

This is not the same bunch I previously blocked, as I first thought. This is just a straight pass-thru proxy: it's showing data updated on my site 10 minutes ago, and it's running the scripts directly from my server.

Let the IP blocking commence....

[edited by: jdMorgan at 5:41 pm (utc) on Mar. 8, 2005]
[edit reason] Internationalized. [/edit]

Livenomadic

10+ Year Member



 
Msg#: 3033 posted 8:21 pm on Mar 7, 2005 (gmt 0)

Lol, I sent an email to their abuse email address and it was returned because the mailbox was full...

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3033 posted 9:49 pm on Mar 7, 2005 (gmt 0)

As I posted above in msg 4, this is not a "copying" issue. This is simply a proxy server at work. Blocking the IP range that bobothecat posted in msg 8 above will stop it immediately.

Here's a sample of the regular expression for use in mod_setenvif, mod_access, or mod_rewrite code:

SetEnvIf Remote_Addr ^202\.96\.1(2[89]|[3-8][0-9]|9[01])\. ban
Deny from env=ban

-or-
Deny from 202.96.128.0/18

-or-
RewriteCond %{REMOTE_ADDR} ^202\.96\.1(2[89]|[3-8][0-9]|9[01])\.
RewriteRule .* - [F]
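
As a sanity check, the regular expression and the /18 block can be compared in a few lines of Python. This is only a sketch: it assumes the three alternatives inside the parentheses are separated by `|` (128-129, 130-189, 190-191), and it tests one sample address per third octet rather than every address. The helper name `regex_matches` is made up for illustration.

```python
import ipaddress
import re

# Assumed pattern: the three third-octet alternatives joined with "|"
pattern = re.compile(r"^202\.96\.1(2[89]|[3-8][0-9]|9[01])\.")
network = ipaddress.ip_network("202.96.128.0/18")

def regex_matches(addr):
    """True if the address string matches the ban pattern."""
    return pattern.match(addr) is not None

# Collect every third octet where the regex and the CIDR block disagree
mismatches = [
    octet
    for octet in range(256)
    if regex_matches(f"202.96.{octet}.1")
    != (ipaddress.ip_address(f"202.96.{octet}.1") in network)
]
print(mismatches)  # an empty list means the two forms agree
```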

Jim


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved