
Google SEO News and Discussion Forum

This 31 message thread spans 2 pages.
Page-jacked in Google by a proxy scheme?
Lost in Google
Vienix




msg:3102966
 2:57 am on Sep 30, 2006 (gmt 0)

Dear People,

I have a site with (currently) more than 6000 backlinks. A year or so ago it had a PR7 (for a month or so), but it went down to PR6.

It has a picture gallery, a forum and a blog.

I didn't pay much attention to the site, and it built up a lot of spam comments (in the gallery). I also had DP Co-op links on it...

After my brother (his pic is in the gallery) kindly reported the spam I removed those comments and looked at rankings and pr of the site. I also removed the DP Co-op.

I then saw that the visible pr on (only) some pages had gone to zero.

And the search term I used to rank high for was taken by a proxy site that lists my content. Even if one searches for my family name, which is the domain name, the proxy site shows up.

This proxy site lists about 7000 pages from other sites in Google.

The proxy site doesn't seem to do any 302's at the moment.

I blocked the IP addresses belonging to the proxy site, filed spam reports, added a sitemap to my site and filed a reinclusion request explaining that I had removed the porn spam and the DP Co-op.

Until now no luck.

The current pr update seems to zero my pr for most pages (but not all) on my site.

I wonder if this proxy site is doing this on purpose (it looks like they had some form of 302 in effect at a certain point).

Or maybe there is a nasty "competitor" using a hole in both the proxy and Google (other sites of mine are also listed in the proxy / Google, but with little effect).

I am lost....

[edited by: tedster at 3:23 am (utc) on Sep. 30, 2006]

 

CainIV




msg:3103008
 3:44 am on Sep 30, 2006 (gmt 0)

This recently happened to me, and I got some results at least from the site point of view. A proxy site had my content as well. This is much easier than most think with the php proxy type programs.

In my case, what made it very odd is that the search history in the proxy program, generated by real user searches, created a string that was indexed by Google. I have no idea how (maybe some members can pitch in here).

Effectively, the cache date, design and content of this website was my content.

I contacted Google, and I noticed that a search for my content two weeks later showed 0 results in Google, so I am assuming it was done.

Regards,
Todd

Vienix




msg:3103019
 4:02 am on Sep 30, 2006 (gmt 0)

It looks like there is more to it, at least in this case.

The proxy has effectively taken my place in the SERPs.

Another strange thing is that I used one of the sites that checks PR over various datacenters, and this PR-DC site also returns an url with its reported PR, I assume a reply from the datacenter.

For all my sites, the url I enter in the PR-Checker is also returned, but for the particular site that has its content listed in the proxy the PR checker returns the page "caught" by the proxy...

I asked the webmaster how the PR checker works, but have had no reply yet...

regards,

Bert Vierstra

[edited by: Vienix at 4:03 am (utc) on Sep. 30, 2006]

SuddenlySara




msg:3103024
 4:08 am on Sep 30, 2006 (gmt 0)

Are you guys Adsense only sites?

Vienix




msg:3103031
 4:09 am on Sep 30, 2006 (gmt 0)

Me Guy, not adsense only site.

regards,

Bert

CainIV




msg:3103055
 4:29 am on Sep 30, 2006 (gmt 0)

The proxy has effectively taken my place in the SERPs.

Yes, I understand - this is what happened to me.

My cache, code, design and content.

optimist




msg:3103072
 4:46 am on Sep 30, 2006 (gmt 0)

Did you try to ban the IP address of the proxy site? Sometimes this works, and the proxy's requests will get a 403. Use a Limit GET block with a deny rule in .htaccess on a Unix (Apache) server and deny from their class C IP range (matching through the third set of numbers). I'm not sure how to do it on Windows.
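A rough sketch of the class C matching logic, in Python just to make the idea concrete: an address inside the proxy's /24 range gets a 403, everything else goes through. The 203.0.113.0/24 range and the `status_for` name are made up for illustration; use whatever range actually shows up in your own logs.

```python
import ipaddress

# Hypothetical proxy range (203.0.113.0/24 is reserved for documentation).
# A "class C" block means the first three numbers are fixed: a /24 network.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def status_for(remote_addr):
    """Return 403 if the requesting address is in a blocked range, else 200."""
    ip = ipaddress.ip_address(remote_addr)
    if any(ip in network for network in BLOCKED_NETWORKS):
        return 403  # Forbidden: the proxy cannot fetch the page
    return 200
```

Blocking the whole /24 rather than one address helps when the proxy fetches from several neighbouring IPs, at the cost of possibly blocking unrelated visitors in the same range.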

It has worked for me on many proxy sites. You need to do this before the listing goes supplemental or you will have a duplicate page indexed in Google for a year or so.

Otherwise there is nothing that can be done if the server is using a URL based application. Something that can look like this:

pico/cache.php?domain=

Google cannot protect you from some of these sites unless you get lucky and they respond to a request. This is something they will hopefully fix, but seeing as how we have waited since Google's inception for fixes for things like 302 redirects and canonical issues... You can figure the rest out for yourself. We call this Legacy Code.

Vienix




msg:3103077
 4:57 am on Sep 30, 2006 (gmt 0)

I have banned all IPs from the proxy.

But my content (and 7000 other pages) is still listed in Google through the proxy.

ALL my pages have gone supplemental.

And that while there are more than 1 million pages in Google that mention my domain name :(

optimist




msg:3103095
 5:28 am on Sep 30, 2006 (gmt 0)

You should contact Google and explain this. It is possible you are one of hundreds of webmasters affected by this proxy. Sometimes they actually read their mail.
:)

Personally I hate proxies for this hijack reason. I am curious to see if the proxy passes a 302 redirect or not.
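One way to check for a 302 is to request the proxied URL once and look at the raw status code without following the redirect. Below is a minimal sketch in Python; the function names are my own, and any URL you pass in would be whatever address Google lists for your page on the proxy.

```python
import http.client
from urllib.parse import urlsplit

# Labels for the common redirect status codes.
REDIRECT_CODES = {
    301: "permanent",
    302: "temporary (Found)",
    303: "see other",
    307: "temporary",
    308: "permanent",
}


def describe_status(status):
    """Label an HTTP status code as a redirect type, or 'not a redirect'."""
    return REDIRECT_CODES.get(status, "not a redirect")


def fetch_without_following(url, timeout=10.0):
    """Request url once; return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client never follows redirects, so we see the first response.
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `fetch_without_following` on the listed URL and passing the status to `describe_status` tells you whether the proxy answers with a redirect, and the Location value shows where it points.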

Another possibility is to use the URL removal tool, but with 7000 pages this would be painful. I have found that it will remove individual pages if they return 403 Forbidden.

[edited by: tedster at 3:04 pm (utc) on Sep. 30, 2006]

Vienix




msg:3103101
 5:34 am on Sep 30, 2006 (gmt 0)

It doesn't do a 302 redirect at the moment, but at a certain stage they must have been doing that, since the proxy shows up if you search for my name.

And, you cannot use the removal tool if you are not the owner of the site... Which is a good idea :)

Solution1




msg:3103114
 5:59 am on Sep 30, 2006 (gmt 0)

Would it be possible to contact the owner of the proxy and explain the situation? Perhaps he's willing to remove the cached content.

optimist




msg:3103118
 6:02 am on Sep 30, 2006 (gmt 0)

Sometimes proxy servers actually show only the results of the URL. This server may be timing out URLs, so they become invalid in time. If you remove the IP block, does the link in Google still go to an Invalid.jsp page?

Your banning the proxy may actually cause you to lose traffic from your search results in this case, because the site would get your content even though it is on their server. Cutting it off also cuts you off. It's a difficult decision.

You need to contact Google. I would try, and you may get lucky; you have no other choice now. If that fails, the only way to recover is by blocking them and re-submitting your site.

You are definitely 302 hijacked! I tested this and it passed a 302 redirect.

This is one of those unfortunate cases where you have been hurt by a 302. Time may resolve this if the URLs time out and lead to the Invalid.jsp page, but the Google cache still has this page, so you need to try to get in touch with them.

It may also have been intentionally done if someone submitted the URL to Google via a text link or their Add URL submission form. I do not see any way for Google to find it from their home page. So how did it get there?

NOTE: I have also had luck submitting sites with issues to Google's Add URL and have been able to add comments that are actually read.

vincevincevince




msg:3103120
 6:04 am on Sep 30, 2006 (gmt 0)

I presume they are forwarding all files from your domain, including robots.txt.

Use PHP to output robots.txt depending upon the requesting IP - use your normal robots.txt for everyone apart from the offending proxy - use a disallow everything robots.txt when the file is requested from the proxy.
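The post suggests PHP; the same logic is sketched here in Python instead, only to show the decision being made. The proxy range, the two robots.txt bodies and the `robots_txt_for` name are all assumptions for illustration.

```python
import ipaddress

# Hypothetical range the proxy fetches from (documentation-only addresses).
PROXY_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

# What every normal visitor and crawler sees: nothing is disallowed.
NORMAL_ROBOTS = "User-agent: *\nDisallow:\n"

# What is served when the request comes through the proxy: disallow everything.
BLOCK_ALL_ROBOTS = "User-agent: *\nDisallow: /\n"


def robots_txt_for(remote_addr):
    """Pick the robots.txt body based on the requesting IP address."""
    ip = ipaddress.ip_address(remote_addr)
    if any(ip in network for network in PROXY_NETWORKS):
        return BLOCK_ALL_ROBOTS
    return NORMAL_ROBOTS
```

The point is that the disallow-everything file is only ever seen at the proxy's URL, so your own domain's crawling rules stay unchanged.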

You can then use the Google removal tool without problem or risk and remove all those proxy pages.

optimist




msg:3103127
 6:10 am on Sep 30, 2006 (gmt 0)

Solution1

The cached content is only in Google now, his pages now go to an Invalid.jsp page on the proxy server. It actually gets a session that will timeout, so it is a temporary replacement. What effect this will have on the site in the future is unknown.

Vienix




msg:3103130
 6:18 am on Sep 30, 2006 (gmt 0)

I don't think that contacting the owner of the proxy site will have any effect. It has more than 7000 pages listed in Google with content from other sites, all redirecting to its homepage. It must be intentional, or users of the proxy have submitted 7000 pages with malicious intent.

It's a matter of time (short, I hope) before Google bans the proxy domain.

The domain lists a lot of "stolen" content, most of it listed as supplemental (like the Adbrite page), but some of their "stolen" stuff, like my site, not...

optimist




msg:3103134
 6:39 am on Sep 30, 2006 (gmt 0)

Vienix,

This is considered by some a black hat approach to delisting a URL: first replace the site (duplicate it and overpower it) and get it supplemental, then dump the listings and point them to an error page, rather than a 404, using another redirect.

It does not redirect to the home page; it goes to Invalid.jsp. Google may respider the URL and then see Invalid URL. But here's the kicker: it gets to the Invalid URL from the indexed URL, which is now passing a 302 to the Invalid.jsp page.

The 302 first says Found, so Google thinks the URL is still there. You are already punished and may no longer be considered the original owner of the content, and it's hard to get out of supplemental hell.

So the circle continues via the 302 directive that Google has never resolved, most likely due to Legacy Code in the programming, or so people still have something to talk about on WebmasterWorld.
:)

Your best bet is to simply contact Google. It may clear up on its own, but with the session IDs timing out and causing additional redirects, you may be hurt even more: if Google disregards your URL and does not properly reassess or find it, your site might never return.

IMHO, it's better safe than sorry. If your site is clean, just send in an email. I think they will side with you on this. It's definitely a 302 hijack, and seeing as how there is no way to find any proxied searches when we go to the proxy site, it may be an intentional move to try and hurt you since you rank #1 for your terms.

Good Luck, please keep us informed.

kanowins




msg:3104655
 9:59 pm on Oct 1, 2006 (gmt 0)

Hello guys.

I'm the kproxy administrator.

There was no intention to hurt you or your sites. I had no idea that you had this problem until someone wrote to me this weekend. If you have problems, you can write to me in the kproxy forum or email me, support[AT]kproxy.com

All direct kproxy requests to your pages will redirect to invalid.jsp. I made this change some weeks ago. People have to go to the KProxy main page to surf. I had some problems with some sites because I had KProxy open to any request.

That means that Google will now never find your sites through kproxy requests.

Sorry if you had some problems.

Best regards.

Vienix




msg:3104772
 1:48 am on Oct 2, 2006 (gmt 0)

Good....

It looks like Google has responded to my requests in the meantime...

Pirates




msg:3104780
 2:13 am on Oct 2, 2006 (gmt 0)

Or maybe there is a nasty "competitor" using a hole in both the proxy and Google (other sites of mine are also listed in the proxy / Google, but with little effect).

They are pretty arrogant about it. Try searching your site name and see who is using your site or company name in the title of their advert.

Vienix




msg:3104788
 2:35 am on Oct 2, 2006 (gmt 0)

The proxy still has some other sites of mine "listed"; Google just solved the problem for one of my sites. (Thank You, Thank You)

Also they "solved" my problem with the spammers.

While the spam sites showed up when you did a :related, they have also disappeared there....

I think they sort of did a "reset" for my site.... Hope I don't get sandboxed because of that :)

theBear




msg:3104808
 3:23 am on Oct 2, 2006 (gmt 0)

I don't think the proxy code itself issues any 302s. That may be coming from the admin's fix for the issue.

This looks like a straight duplicate content "problem", one that shouldn't cause as much trouble as it appears to.

The base code is available online and looks like a simple pass-through operation with URL substitution. It is late for me to be parsing Java and maybe I missed something.

I'm cruising for examples of current duplicate content issues. Now I'll butt out of your thread.

[edited by: theBear at 3:26 am (utc) on Oct. 2, 2006]

coosblues




msg:3104889
 5:12 am on Oct 2, 2006 (gmt 0)

Vienix, I'm not sure I understood half of what was written, but it's great to know that when you have a problem as great as yours, there are some very wise webmasters here to help you (myself not included). Google seems to have really stepped up to the plate on this one, and for that it's nice to be able to say thanks to Google and Brett for listening to those of us just trying to make an honest living. :)

jomaxx




msg:3104917
 6:01 am on Oct 2, 2006 (gmt 0)

I don't know the specifics of what kproxy did, but NO proxy site should allow Google to spider any proxied page. Period. This is the easiest thing in the world to do with a robots.txt file.
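For what it's worth, here is a small sketch of what such a robots.txt could look like on the proxy's side, with a check using Python's standard `urllib.robotparser`. The `/proxy/` path and the `proxy.example` host are invented for the example; a real proxy would disallow whatever path its proxied pages live under.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a proxy whose proxied pages live under /proxy/.
PROXY_ROBOTS_TXT = """\
User-agent: *
Disallow: /proxy/
"""


def is_crawlable(robots_txt, user_agent, url):
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this file in place, a well-behaved crawler can still fetch the proxy's own home page but not anything served through the proxied path.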

canuck




msg:3105930
 10:07 pm on Oct 2, 2006 (gmt 0)

One of my sites just got hijacked by a similar proxy... now the entire site is listed under the proxy in G.

Any ideas on the .htaccess code to ban this proxy site? Our site is currently listed in G as: https://www.proxywidgets.com/...

theBear




msg:3106049
 12:00 am on Oct 3, 2006 (gmt 0)

canuck,

First you have to identify how they are accessing your site by using the proxy to access your site and then locating those accesses in your raw server logs.

Then a variation of:

RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^WW\.XX\.YY\.ZZ
RewriteRule .* - [G,L]

would do the trick, where WW.XX.YY.ZZ is the IP address that the proxy used to access your site (note that this could be a range of addresses).

My favorite would be to reflect the proxy's home page back at them using the RewriteRule.

[edited by: theBear at 12:01 am (utc) on Oct. 3, 2006]

jomaxx




msg:3106213
 3:28 am on Oct 3, 2006 (gmt 0)

Go get a few pages of your site via the proxy (or force a few 404 errors, easier to find). Look at your logs and see what IP address is retrieving those pages from your server. Ban that IP address using a "deny from" statement in your .htaccess file. Then try again and make sure the proxy site is truly banned. Using RewriteRule seems unnecessarily complicated.
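If the logs are large, a short script can pull out the addresses for you. This is only a sketch: it assumes the usual Apache common/combined log layout, and the marker path is a made-up page name that you would request through the proxy to force a 404.

```python
import re
from collections import Counter

# Matches the start of an Apache common/combined log line:
# client address, two ident fields, [timestamp], "METHOD path ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')


def addresses_requesting(log_lines, marker_path):
    """Count requests per client address for paths starting with marker_path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(marker_path):
            hits[match.group(1)] += 1
    return hits
```

Request a page such as /no-such-page-proxy-test through the proxy, then run the function over your access log lines with that path as the marker. Every address it reports is one the proxy fetched from, which also catches the case where the proxy uses more than one IP.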

theBear




msg:3106218
 3:34 am on Oct 3, 2006 (gmt 0)

jomaxx,

Trust me on this, the deny isn't anywhere near as much fun as providing the proxy its home page to feed the bot army with ;).

Bewenched




msg:3106339
 5:23 am on Oct 3, 2006 (gmt 0)

theBear,
I like the way you think ;)

jomaxx




msg:3106369
 6:02 am on Oct 3, 2006 (gmt 0)

Frankly I'd be afraid of getting the two servers into an infinite loop. But I'm no guru; it takes me hours to get any rewrite to work.

canuck




msg:3112068
 1:36 pm on Oct 7, 2006 (gmt 0)

Thanks... they had a couple IPs which is where I was having problems.

Anyways, sent a report to Google, got in contact with the Proxy (they did a 404)... plus I put up a 403. End result back in Google 4-5 days later. ;)
