Hello. I'd like to warn the webmaster community about proxy/TOR hacking. It's probably not new, but it appears to be ongoing, and it can have a very negative impact on the affected websites. This comes from the experience of my competitor (who has since become my friend and partner). I've seen the same hack on at least two different websites.
The hack is effective against well-established, niche-leading websites. I'm talking about an 18-year-old website here: several thousand pages, white-hat, quality content.
These are the steps the hackers follow (not all of them may be 100% accurate, but this is what I concluded after researching it):
1. Register a domain (fake and anonymized WHOIS).
2*. Use a script (many are available for free) to crawl and copy the victim's website, keeping the same URL structure (except for the domain). Replace the victim's domain with the hacker's domain so that all internal links point to the hacker's domain. The result: the hacker has a full copy of the victim's website hosted on the hacker's domain.
* I don't even know if the hacker needs to follow this step; it's likely that Google indexes these pages without the hacker having to copy them in the first place.
3. Add a condition to the script so that it serves different content to Google than to the public, something like:
$rDNS = gethostbyaddr($_SERVER['REMOTE_ADDR']); // reverse DNS name of the requesting IP
if (is_string($rDNS) && preg_match('/\.googlebot\.com$/', $rDNS)) $showPublicContent = TRUE;
else { http_response_code(403); exit('<h1>Sorry, but this website is temporarily unavailable! Please check back later!</h1>'); }
This condition lets Google index the full content while the public sees the 'site unavailable' notice.
4. Set up a proxy/TOR connection and allow or direct Googlebot to crawl the hacker's site via this proxy/TOR connection. This is likely the crucial point: why would Google index pages via a proxy/TOR server?
5. Googlebot follows all the pages and indexes the hacker's site under the hacker's domain in Google search results.
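The hostname test in step 3 depends on a reverse-DNS lookup, and the same lookup is useful on the defensive side when you want to know whether a request in your own logs really came from Google. Google's documented way to verify its crawler is a reverse lookup followed by a forward lookup that must return the original IP. Below is a minimal sketch of that check, not a definitive implementation. The function name is_verified_googlebot and the two injectable resolver parameters are my own assumptions for illustration, and the accepted domains (googlebot.com and google.com) are assumed to be the relevant ones.

```php
<?php
/**
 * Forward-confirmed reverse DNS check for a crawler IP (illustrative sketch).
 * $reverse and $forward are hypothetical injection points so the logic can
 * be tried without real DNS; they default to PHP's own resolver functions.
 */
function is_verified_googlebot(string $ip, ?callable $reverse = null, ?callable $forward = null): bool
{
    $reverse = $reverse ?? 'gethostbyaddr';   // IP -> hostname
    $forward = $forward ?? 'gethostbynamel';  // hostname -> list of IPv4 addresses

    // Reject anything that is not a syntactically valid IP address
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        return false;
    }

    $host = $reverse($ip);
    // gethostbyaddr() returns the IP unchanged when there is no PTR record
    if (!is_string($host) || $host === $ip) {
        return false;
    }

    // Assumption: genuine crawler hostnames end in one of these domains
    if (!preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }

    // The forward lookup must map the hostname back to the same IP;
    // gethostbynamel() only returns IPv4 addresses, so IPv6 is not covered here
    $addresses = $forward($host);
    return is_array($addresses) && in_array($ip, $addresses, true);
}
```

Called as is_verified_googlebot($_SERVER['REMOTE_ADDR']), it uses live DNS. A request relayed through a proxy or TOR exit fails the check, because the relay's IP does not reverse-resolve to a Google hostname.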
THE RESULT: Thousands of pages created on the hacker's domain are indexed in Google. Some of them rank higher than the original pages or replace them altogether. When you click on ANY of those indexed links as a regular visitor, you see the 'site unavailable' page. But when you use Google's mobile test tool (which connects from a Google hostname) OR view the Google cache, you see the real page.
How do I know Google followed and indexed the proxy/TOR pages? My friend printed the requesting IP address ($_SERVER['REMOTE_ADDR']) on every page of his site, and all the cached pages showed a TOR/proxy IP. In other words, at the moment Google was indexing the pages, they were being served via these TOR/proxy IPs.
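For anyone who wants to repeat this test on their own site, here is a rough sketch of how such an IP stamp could be added to every page. I don't know exactly how my friend coded it; the helper name ip_stamp, the "served-to" class, and the added timestamp are my own assumptions.

```php
<?php
/**
 * Build a visible line recording which IP the page was served to.
 * Hypothetical helper: $server stands in for $_SERVER, and $now is an
 * optional Unix timestamp (defaults to the current time).
 */
function ip_stamp(array $server, ?int $now = null): string
{
    $ip = $server['REMOTE_ADDR'] ?? '';
    // Only echo a syntactically valid address back into the page
    if (filter_var($ip, FILTER_VALIDATE_IP) === false) {
        $ip = 'unknown';
    }
    $time = gmdate('Y-m-d H:i:s', $now ?? time());
    return '<p class="served-to">Served to '
        . htmlspecialchars($ip, ENT_QUOTES)
        . ' at ' . $time . ' UTC</p>';
}

// Usage: print the stamp in a shared footer template
echo ip_stamp($_SERVER), "\n";
```

If a cached copy of a page on the copycat domain shows an IP that does not belong to Google, the page was fetched from your server through that intermediate address.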
---
My first reaction was: "Go ahead and submit a DMCA." But then I realized that the hackers have that covered. When you send a DMCA notice with a link to the "stolen or copyrighted content", the DMCA agent clicks the link and sees only the 'site is unavailable' message. When you explain that the real page is visible only in the Google cache, they won't do anything about it. Some pages don't even have a cached copy because the hackers disabled caching.
---
I hope this information will help Google and other webmasters solve the problem. Bing, Yahoo, and Yandex have resolved it successfully: they don't index such hacked pages at all.