Forum Moderators: not2easy

Copyright problem?

redirecting the entire site

     
10:55 pm on May 2, 2005 (gmt 0)

New User

10+ Year Member

joined:Aug 3, 2004
posts:5
votes: 0


I have run across a problem, and I thought maybe someone could tell me whether it is something to worry about.

I was searching for mysite.com using <snip>. I found the following address, <snip>

I clicked on the site to see where it would go. It took me to a site just like mine, but all the urls were rewritten to include the <snip> in them.

Then I did a search on Google using the words redirect and <snip>. Up came a whole bunch of sites with the same problem, <snip> in the addresses. One was <snip>

Check out your sites using this address: <snip> and see what you get.

<snip> is a company in France. <snip> <snip>

[edited by: Brett_Tabke at 5:43 pm (utc) on May 3, 2005]
[edit reason] please no specifics [/edit]

2:38 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


KeeeeRIPES! You are absolutely right!

<snip> copied my entire site it seems, though I only checked the index page. I checked the HTML source for that page, and EVERY SINGLE LINK, internal and outgoing, has the <snip> tacked onto the end.
Even my <Base HREF= statement was altered the same way.

Can somebody explain all this? Are the major search engines notified yet? We should note <snip> contact info before that vanishes. -Larry

[edited by: Brett_Tabke at 5:43 pm (utc) on May 3, 2005]

[edited by: engine at 8:21 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

3:11 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


You all gotta see this. <snip> has jacked Webmasterworld, Google, Yahoo and gawd knows who else. <snip> takes you to Google France instead of Google.com (USA). The Yahoo jacking goes to regular Yahoo.com

Amazingly, a search in my niche even shows Yahoo search results pages jacked, instead of referring back to Yahoo itself.

Does THIS explain what some heavy crawls from mystery robots were all about?

How on earth does this work? How long has it been going on?
What's the purpose? <snip> is a jobs placement agency in France -Larry

[edited by: Brett_Tabke at 5:44 pm (utc) on May 3, 2005]

[edited by: engine at 8:21 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

5:18 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Isn't anyone interested in this?

Type in <snip> and see what happens.

So far it looks like <snip> reads your page, rewrites it on the fly with <snip> modified URLs, and presents that to the browsers.

Check all internal links in source code. Now THAT's a <snip>. - Larry

[edited by: Brett_Tabke at 5:45 pm (utc) on May 3, 2005]

10:20 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 27, 2003
posts:1648
votes: 2


Yep, my site is there too.
I will be interested to see my logs when they roll over - I am willing to bet they are doing a live grab.
You see, I have just redesigned, and they have the redesign. The odds of them grabbing it in the last 4 hours are pretty darn low.
They aren't copying the site, I reckon, they are just (what would it be called?) leeching it.
Can't imagine why!

10:37 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Hi Leadegroot: It's a live grab for sure. I just made a minor change on a page and called it up with the <snip>.com attached. It showed up immediately. This requires high speed.
Look what they have to do:
1) Parse the real URL, and fetch that page.
2) Deconstruct, parse and reconstruct the page with their buggered URLs. (They screwed up one image URL, but the rest came out as they intended.)
3) Serve the buggered page to whoever was browsing for it.
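
Step 2 above can be sketched roughly as follows. This is only an illustration of the general technique, not the company's actual code: the suffix `redirect.example.invalid` and the function names are made up, and the sketch assumes the proxy simply appends its own domain to the host of every absolute `href` and `src` URL, which matches what posters saw in their page source.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical proxy domain; the real one was snipped from the thread.
PROXY_SUFFIX = "redirect.example.invalid"

# Matches href="..." or src="..." (any case, single or double quotes).
ATTR_RE = re.compile(
    r'(?P<attr>\b(?:href|src)\s*=\s*)(?P<q>["\'])(?P<url>.*?)(?P=q)',
    re.IGNORECASE,
)

def proxied_url(url, suffix=PROXY_SUFFIX):
    """Append the proxy suffix to the host of an absolute http(s) URL."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return url  # relative links, mailto:, etc. are left alone
    host = parts.hostname + "." + suffix
    if parts.port:
        host += ":%d" % parts.port
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

def rewrite_links(html, suffix=PROXY_SUFFIX):
    """Rewrite every absolute href/src URL in the page to go via the proxy."""
    def repl(match):
        quote = match.group("q")
        return match.group("attr") + quote + proxied_url(match.group("url"), suffix) + quote
    return ATTR_RE.sub(repl, html)
```

A regular expression is enough for a sketch, but it will miss or mangle some markup (an unquoted attribute, for example), which may be how one image URL got broken.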

There is no other explanation how they can offer Google and Yahoo results, page after page of stuff they couldn't do alone. Still, this requires some electronic horsepower.

<snip> is a major player in the European online job-search category. Given the proper French search words, they are at or near the top of the SERPs.

Here's some fun. I did a Google Advanced Search restricting to the <snip> domain. I added the keyword for my niche. I did NOT find my pages listed in G, but I did find one other site I know. ALL results were "supplemental", i.e. G knows they are crap. Then I clicked on the site I knew. NOTHING came up. Disallowed, I presume.

I did the same with the keyword "news". Many, many <snip> sites, and again all "supplemental results".

Now I will see how Yahoo likes them. If Y likes them too much, I might send a nice message. - Larry

[edited by: Brett_Tabke at 5:45 pm (utc) on May 3, 2005]

[edited by: engine at 8:23 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

10:48 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


OK. Now I did the same test with Yahoo; i.e. I restricted domain to <snip> and tested simple keywords looking for .redirect.<snip>

NOTHING! Yahoo has totally delisted all the .redirect.kjs pages.
If I use the keyword 'emploi' (employment is kj's main biz), then many thousands of pages show. Yahoo cleaned out all the crap. -Larry

[edited by: engine at 8:24 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

11:13 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 27, 2003
posts:1648
votes: 2


Hmmm... I think I shall check my logs and write an .htaccess line to block them :)
Oh goody - I have a nice timestamp from my previous post :)

11:18 am on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Two things really puzzle me here.

1) <snip> is obviously a very valuable domain. Why would anyone risk having it banned entirely from the major SEs with such a stunt?

2) G and Y are obviously on to this. Why indeed were they NOT banned entirely, instead of just penalizing the phony redirect pages?

- Larry

[edited by: Brett_Tabke at 5:48 pm (utc) on May 3, 2005]

[edited by: engine at 8:24 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

12:50 pm on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 27, 2003
posts:1648
votes: 2


Subdomains are seen as separate from the main domain (by the engines), so the penalty isn't passed on?
Unlikely...

1:03 pm on May 3, 2005 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9074
votes: 6


It's a real-time proxy: in other words, they are grabbing your site each time your domain name is requested. Check your stats and ban the IP. They are not scraping, copying or anything like that: put any domain name in and it will fetch it.

I suspect that it is an unintentional side-effect of their redirect mechanism, and I think all accusatory statements should be taken with a pinch of salt until the full facts are known. Has anyone actually written to them to ask?

1:07 pm on May 3, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:June 11, 2003
posts:146
votes: 0


Someone have the ip number so we can all ban them?

EVO

5:37 pm on May 3, 2005 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9074
votes: 6


I've got back some information from the company regarding the problem. As I said earlier, this is a quite unintentional side-effect of their setup, and they are already in the process of rectifying the situation.

The site is a job search engine, and the proxy servers (2 redundant machines) were put in place to bring authorized third-party content under their domain for the benefit of their users. However, the *.redirect.<snip> servers were initially set up in such a way that any site was visible through their proxy. The service was abused and subsequently was indexed in error by Google, Yahoo, MSN and others.

The company was already in the process of having the redirect pages removed from the search engines, and they are implementing a white-list of partner sites to close the proxy loophole.
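
For readers wondering what such a white-list amounts to, here is a minimal sketch. The partner names and the function are purely hypothetical (the company's real implementation was not described), and it assumes the proxy decides by target host name alone.

```python
# Hypothetical partner domains; the real list was never published.
ALLOWED_PARTNERS = {"partner-one.example", "partner-two.example"}

def is_allowed_target(host, allowed=ALLOWED_PARTNERS):
    """Return True only if host is a listed partner or one of its subdomains."""
    host = host.lower().rstrip(".")
    return any(host == partner or host.endswith("." + partner) for partner in allowed)
```

The proxy would refuse to fetch anything for which `is_allowed_target` returns False, so arbitrary sites could no longer be pulled through it. Matching on `"." + partner` rather than a bare suffix keeps look-alike names such as `evilpartner-one.example` out.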

This is the big problem with "outing" offenders in public forums: an incorrect assumption of guilt without any semblance of investigation, thus hurting innocent parties with disinformation.

[edited by: encyclo at 5:55 pm (utc) on May 3, 2005]

5:40 pm on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 20, 2002
posts:735
votes: 1


Someone have the ip number so we can all ban them?

I think their IP address is "213.41.66.245". What sort of thing should be added to the .htaccess file to prevent this unauthorized "mirroring"?

Eliz.
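
One possible answer, as a sketch only: the helper below prints the lines one might paste into .htaccess. It assumes an Apache 1.3/2.0/2.2 server using the classic `Order`/`Allow`/`Deny` directives (Apache 2.4 uses `Require` instead), and the function name is made up for illustration. The address is the one guessed at above, so check it against your own logs first.

```python
import ipaddress

def htaccess_deny_lines(addresses):
    """Build classic Apache access-control lines that block the given IPs or ranges."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for addr in addresses:
        # Raises ValueError on anything that is not a valid address or CIDR range.
        net = ipaddress.ip_network(addr, strict=False)
        # Single hosts are written as a bare address, ranges in CIDR form.
        target = str(net.network_address) if net.num_addresses == 1 else str(net)
        lines.append("Deny from " + target)
    return lines

if __name__ == "__main__":
    print("\n".join(htaccess_deny_lines(["213.41.66.245"])))
```

With `Order Allow,Deny`, the `Allow` rules are evaluated first and a matching `Deny` wins, so everyone except the listed addresses still gets in.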

5:50 pm on May 3, 2005 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38257
votes: 115


As encyclo said, it is just an open proxy server. There are about a dozen found in Google every month, mostly from China.

8:54 pm on May 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Let me get this straight.

An 'open proxy server' that rewrites any and all of your pages and mine, substituting their URLs for yours on the fly, and presenting those altered pages to anyone passing by, is innocent.

I understand the part about finding a lot of them from China. -Larry

[edited by: engine at 8:25 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

11:47 pm on May 3, 2005 (gmt 0)

New User

10+ Year Member

joined:Aug 3, 2004
posts:5
votes: 0


I should have said in my first post that I have emailed Yahoo, Google, MSN, CNN, Amazon.com, and others telling them of the problem. I don't know if they have fixed it or not.

They never returned my email, except to acknowledge that they had got it.

11:54 pm on May 3, 2005 (gmt 0)

New User

10+ Year Member

joined:Aug 3, 2004
posts:5
votes: 0


"This is the big problem with "outing" offenders in public forums: an incorrect assumption of guilt without any semblance of investigation, thus hurting innocent parties with disinformation."

That may be their "excuse" for what happened. I would like to think that it is true. I don't care what they do as long as they fix it. I emailed them several times and they never emailed me back. If they were going to fix it, shouldn't they have emailed me and told me? I was the first to find out about this mess. I have heard nothing from them.

My guess one of the "big boys" got after them.

[edited by: rogerd at 2:06 am (utc) on May 4, 2005]
[edit reason] No specifics, please. [/edit]

12:03 am on May 4, 2005 (gmt 0)

New User

10+ Year Member

joined:Aug 3, 2004
posts:5
votes: 0


I should have also added that I ran a search on my web site name and it is still there, <snip>.

[edited by: rogerd at 2:04 am (utc) on May 4, 2005]
[edit reason] No specifics, please. [/edit]

2:04 am on May 4, 2005 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9074
votes: 6


When I contacted the site in question this morning using the contact details on their site, I got a comprehensive and detailed explanation within the hour. My earlier paraphrased account left out much of the precise detail. I do happen to speak French, though, so the language barrier may have been your problem.

An 'open proxy server' that rewrites any and all of your pages and mine, substituting their URLs for yours on the fly, and presenting those altered pages to anyone passing by, is innocent.

Yes, I am convinced that they are "innocent" - it was never their intention to have an open proxy for all sites, just for certain partner sites. A proxy server such as you describe it does have a perfectly legitimate use.

My guess one of the "big boys" got after them.

Their servers got absolutely hammered due to the open proxy, mostly by bots. Would you like to pay the bandwidth bill for the entire AOL site plus dozens of others, all fetched by your server? They were onto this problem well before this thread was started, and they didn't need any "getting after".

It's like I said, outing innocent sites without knowing all the facts is grossly unfair - it's not surprising that the thread has been edited. Cases like these are far more often cock-up rather than conspiracy.

2:57 am on May 4, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


OK. Let's presume innocence.

What am I or anyone supposed to think when I find my entire site done up like this? Every single link on every single page surgically altered back to <snip>?

What are we supposed to think when Google has so many pages indexed with the surgically implanted redirect.<snip>.com, and then downgrades them all to supplemental results? ... when Yahoo sh**cans all such pages entirely?

Are we supposed to presume white-hat innocence in the face of all that? If anyone ELSE here pulled something like this, what would be the reaction?

So <snip> was aware of all this before this thread came up.
How long does it take to put up a white-list? My site is still <snip>-jobbed. You should see their source code for my sitemap. Not fixed yet.

Checking again, it appears G has removed many or most of the "supplemental" results I noted yesterday. Would this have happened without this thread coming up?
-Larry

[edited by: engine at 8:26 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

3:26 am on May 4, 2005 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38257
votes: 115


There are a couple dozen threads on the issue of open proxy servers going back 6 years here.

> was aware of all this before this thread came up.

I don't think so. The company in question has been contacted and the admin has made changes.

> If anyone ELSE here pulled something like this,
> what would be the reaction?

Hang around - it will happen again next month.

Remember when it happened to AOL itself in 97?

5:07 am on May 4, 2005 (gmt 0)

New User

10+ Year Member

joined:Aug 3, 2004
posts:5
votes: 0


May 4, 2005

My site is still listed with "the unmentioned site."

Nothing has yet been done. I had been trying to get them to respond to me for over a week before I started this thread here.

I was not accusing anyone of a crime. Reread my first post on this thread. I am asking for help.

5:35 am on May 4, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Hello Ang0ie:

I'm not accusing you of anything. You just pointed out some facts, and I think you deserve credit for this. Just think of all the newbies, or persons like myself who were blindsided entirely.

Either I missed all the old threads, am getting forgetful in my old age, or there's something new in the method and format of the redirects with this latest episode. I couldn't have forgotten something like this if my site were at all affected, let alone G, Y, AOL ...

Brett: Your input is very well taken here.

I think that it is a great disservice to NOT out problem sites like this. IF the webmaster is innocent, he should be the first to want to know about it, so he can fix it before he gets penalized. What do you make of the fact that the webmaster(s) don't reply to Ang0ie?

Perhaps more to the point:

I vote we should out ANY site pulling stunts like this. Sure, notify the webmaster in advance, in case it is innocent. Consider the alternatives. I cannot imagine the damage this could do to honest sites and the web in general, should the same methods be used for highly black-hat purposes. -Larry

[edited by: engine at 8:20 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]

10:43 am on May 4, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


" Someone have the ip number so we can all ban them?
" I think their IP address is "213.41.66.245".

I hit several little-used pages of mine as a test using the kind services of .redirect.<deleted>. Now I have IP numbers, two of them:
213.41.67.XX and ~.XX. yaddayadda.XX never came in but it's a possibility.
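
The same test can be scripted. This sketch assumes an access log in Apache Common or Combined Log Format; the function name and any paths you pass in are placeholders. It counts which client IPs requested a set of rarely visited test pages.

```python
import re

# Client IP, identd, user, [timestamp], "METHOD path protocol", status, size.
CLF_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+'
)

def clients_for_paths(log_lines, test_paths):
    """Count requests per client IP for the given paths."""
    wanted = set(test_paths)
    hits = {}
    for line in log_lines:
        match = CLF_RE.match(line)
        if match and match.group("path") in wanted:
            ip = match.group("ip")
            hits[ip] = hits.get(ip, 0) + 1
    return hits
```

Request a little-used page through the proxy, then run the function over the log lines from around that time. Any address that shows up only for those requests is a likely candidate for the proxy.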

Brett: I Googled for "open proxy servers". Found loads of stuff, all about hackers, spammers, security measures etc etc. but nothing that really zeroed in on this specific problem.

I want to call it <deleted>-jobbing but that's clumsy. If nobody minds, I would like to start a separate thread on sites that use yoursite.redirect.badguy type tricks to mirror valid sites with surgically cloned urls back to themselves.

Issues would naturally be:
a) What are the methods used?
b) What harm, if any, could this cause?
c) How can we find / identify such sites?
d) What can / should we do about it? - Larry

PS: Just now checked again. <deleted> is still 'mirroring' my entire site.

[edited by: engine at 5:19 pm (utc) on May 4, 2005]
[edit reason] formatting [/edit]

1:25 pm on May 4, 2005 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9074
votes: 6


I received some more information from the company today: they are in the process of banning known bots (Googlebot, Slurp and MSNBot in particular), but they can't take down the proxy entirely without affecting their main site's operations. Expect to see the proxy in operation for normal users (not bots) for a month or two more until their whitelist system is fully tested and in place. As the bots are banned, the listings should fade away pretty quickly. If you are banning them, the company uses two load-balanced servers for their proxy, so expect two IP addresses.
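
Banning the known bots presumably comes down to a User-Agent check along these lines. This is a sketch under that assumption, not the company's code; the token list simply mirrors the three crawlers named above, and the function name is hypothetical.

```python
# Lower-case substrings found in the User-Agent strings of the three crawlers.
BOT_TOKENS = ("googlebot", "slurp", "msnbot")

def is_known_bot(user_agent):
    """Return True if the User-Agent header looks like one of the listed crawlers."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BOT_TOKENS)
```

A User-Agent header is trivial to fake, so a check like this only keeps out well-behaved crawlers. That fits the stated goal of getting the listings to drop out of the indexes.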

The company was hit by a huge amount of unwanted traffic due to the open proxy, and it was this enormous bandwidth usage which alerted them to the problem. I am guessing that the open proxy was discovered by others who then used it to suit their needs.

I vote we should out ANY site pulling stunts like this.

I get it, you're actually Brett's lawyer and you fancy a new ocean-going yacht! The vast majority of open proxies are accidental on the part of the owner, but they can quickly be abused by others. Also, how would you like it if one of your sites was "outed" for some "black-hat" activity or some such? Think you're squeaky clean? If I contact my local link farm and add 100,000 doorway pages pointing to your site before denouncing you, you won't be.

9:00 pm on May 4, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Encyclo: I can't make any sense of this:

"I get it, you're actually Brett's lawyer and you fancy a new ocean-going yacht! The vast majority of open proxies are accidental on the part of the owner, but they can quickly be abused by others. Also, how would you like it if one of your sites was "outed" for some "black-hat" activity or some such? Think you're squeaky clean? If I contact my local link farm and add 100,000 doorway pages pointing to your site before denouncing you, you won't be. "

Are you saying we should NOT out the dirty sites, or those misused by third parties?

Are you threatening to tar the clean ones in retaliation? - Larry

1:14 am on May 5, 2005 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member encyclo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 31, 2003
posts:9074
votes: 6


Larry, the site first outed in this thread was a completely innocent victim who had committed no more than making a mistake in their server configuration - yet they were denounced by you and others as being involved in massive copyright infringement. If the accusations had stood, it would have been libellous in the extreme.

Are you saying we should NOT out the dirty sites, or those misused by third parties?

Yes, that is exactly what I'm saying, not because I'm a supporter of "dirty sites", just that it is impossible to tell whether a site is bad or not. It would merely be mob justice at work, where facts would take a back seat.

Are you threatening to tar the clean ones in retaliation?

Of course not, but imagine the following scenario: one of your competitors adds a ton of doorway pages pointing to your main site (which is above his in the SERPs). He then "outs" you publicly on a forum for having used those doorway pages. A Google representative sees the thread, and gets your site banned. You lose your site and a truckload of money. Does that kind of vigilante justice sound fair to you?

If you want to do a name-and-shame campaign, you should post a message in the Community forum. I think you'll find that you're in a minority.

6:26 am on May 5, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2004
posts:1425
votes: 0


Encyclo: " they were denounced by you and others as being involved in massive copyright infringement. "

The only time copyright was mentioned at all, was in the thread title "copyright problem?" I never used the word, not knowing what to call it. No mention of "infringement" that I can see reading back thru the entire thread, yourself excepted.

As I recall, libel is defined as spreading untruths about somebody/something. The outing above is largely factual, with parties and motives left up in the air.

I see the problem with vigilante actions. I don't want to start a witch hunt, no telling where that could lead.

What I WOULD like is some simple way to find out if other sites have very similar "open server" problems. IF SO, the webmaster should be notified first, hopefully resulting in an explanation, a promised fix etc. etc. to the benefit of all.

What I DON'T want is to sweep all this under the rug and pretend it ain't there.

While I'm at it, who are the bad-guys / abusers you keep referring to? If somebody were abusing the <snip> site's open server, why do they surgically implant the <snip> URL instead of their own?

Just curious. -Larry

3:33 am on May 6, 2005 (gmt 0)

New User

10+ Year Member

joined:Aug 3, 2004
posts:5
votes: 0


I did a search on Google for the words: the <unmentionable> site and lawsuit.

You will get lawsuits against them from 5 years ago!

Very interesting.
