Forum Moderators: Robert Charlton & goodroi
In some cases both the original and proxy versions are present.
I can find no links to the proxy site, and the "copied" pages have no toolbar PageRank; some of my pages that have gone from the index are well linked to.
I have blocked the proxy's IP (which seems to be working), and sent Google a spam report (from WMT).
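For anyone wanting to do the same, an IP block can be done in .htaccess on an Apache server. This is a minimal sketch; the IP address below is a placeholder, not the proxy's actual address:

```apache
# Block all requests from the proxy's IP address
# (203.0.113.45 is a placeholder - substitute the real proxy IP)
Order Allow,Deny
Allow from all
Deny from 203.0.113.45
```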
What more can I do?
The url for my best keyword has also been replaced by an obsolete url with a query string that used to switch the page style. Instead of example.com/page, Google has indexed example.com/page?querystring. There is only one link to this from an old blog post, whereas the correct version has lots of external and internal links. I do not know if this is connected with the other problem but it did happen at the same time.
I think the proxy is malicious, but that I am not the victim. Their front page redirects to a proxied copy of another site, which I think is the main target, and I think my pages have been indexed to add relevant content to the site (as seen by Google).
The general implication of the proxy problem is that your site is not stable enough to withstand fairly common issues - most likely due to a lack of links from the right sites.
The site has fairly good backlinks. Some of the affected pages are first or second in the SERPs - not only are the proxied pages beating my originals, they are beating Wikipedia and my long-established competitors (sites in the top few thousand by traffic, if you believe Alexa). The proxied pages usually appear in the SERPs exactly where mine did (my rankings have been fairly stable recently).
Incidentally, the page with the query string on the url is first in the SERPS for its keyword. I have added a 301 anyway.
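For reference, a 301 of this sort can be done with mod_rewrite in .htaccess, assuming an Apache server. The path and query string below are placeholders standing in for the real ones:

```apache
# 301 example.com/page?querystring to example.com/page
# ("page" and "querystring" are placeholders for the real URL parts)
RewriteEngine On
RewriteCond %{QUERY_STRING} ^querystring$
RewriteRule ^page$ /page? [R=301,L]
```

The trailing `?` in the rewrite target strips the query string from the redirected URL, so Google is pointed back at the clean version.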
Why should a three year old site that has never suffered this sort of problem before suddenly be vulnerable to this?
Remember that for your site's ranking to change (or for others to improve relative to yours) it only takes one of the following to occur:
- Your site/page changes or the external references to it changes
- A competitor's site/page changes or the references change
- Google changes
In most cases, all three are happening all the time.
From your description, it sounds like the proxied site/pages may have attracted greater numbers of external links, your site has lost links, and/or Google has re-evaluated some of the external links to one or both sites. It might be that the site was precarious anyway, but there was little significant change until now.
Incidentally, in case you haven't seen it, there is a Hot Topic [webmasterworld.com] thread on the subject: Proxy Server Hijacks - and how to defend [webmasterworld.com].
As far as I can see the proxy site has hardly any incoming links, and none to its copy of my site. It seems that Google is ranking it simply on the volume of material there: over 20,000 text-heavy pages indexed, many from well-known sites like ivillage (although Google ought to see that those, at least, are duplicates). On the other hand, my links look fine in Google Webmaster Tools, but fewer than usual are shown by a link: search.
My real question is whether there is anything I can do to reverse the damage faster, or whether I can now just wait for Google to re-index the proxy.
This approach is mentioned in the other thread quite a few times, but I'll quote it again here:
jdMorgan:
Given an understanding of what a proxy *is* and how it works, the only step really needed is to verify that user-agents claiming to be Googlebot are in fact coming from Google IP addresses, and to deny access to requests that fail this test. If the purported-Googlebot requests are not coming from Google IP addresses, then one of two things is likely happening:
1) It is a spoofed user-agent, and not really Googlebot.
2) It *is* Googlebot, but it is crawling your site through a proxy. The latter is how sites get 'proxy hijacked' in the Google SERPs -- Googlebot will see your content on the proxy's domain.
Here's a reference on the "double reverse DNS" method for verifying googlebot:
How to verify Googlebot is Googlebot [webmasterworld.com]
Again, from my imperfect understanding... a shared server can be set up to do a double-reverse verification for all requests. This runs the risk, though, of greatly slowing down the overall performance of your site.
To limit double-reverse DNS checking only to requests claiming to be Googlebot, etc, I've been told you need to use something beyond .htaccess-level code, and that's not available on shared virtual servers.
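As a sketch of the check jdMorgan describes, here is what the double-reverse DNS test looks like at the application level (Python, purely as an illustration - on a shared server you would have to call something like this from your CMS or a script, since .htaccess alone can't do it):

```python
import socket

def is_real_googlebot(ip):
    """Double-reverse DNS check for a purported-Googlebot IP.

    Step 1: reverse-resolve the IP to a hostname; genuine Googlebot
    hosts resolve under googlebot.com or google.com.
    Step 2: forward-resolve that hostname and confirm it maps back
    to the original IP, so a spoofed PTR record alone can't pass.
    """
    try:
        host = socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return False
    if not host.endswith(('.googlebot.com', '.google.com')):
        return False
    try:
        addrs = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    # getaddrinfo returns tuples whose 5th element holds (ip, port, ...)
    return ip in {a[4][0] for a in addrs}
```

You would only run this on requests whose user-agent claims to be Googlebot, and cache the result per IP, since doing a DNS round-trip on every hit is what causes the slowdown mentioned above.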
I think this needs to be said, as I spent many hours trying to get this sorted out, only to discover it couldn't be done.
My problem is that the proxy isn't working anymore; it no longer shows my pages. I think it was an attack intended to get my pages removed from the index and then stop. The proxy listed about 80,000 domains.
I also filed a spam report.