|Site being scraped by Proxy|
Hi guys, I'm in need of some help.
Similar to this thread - http://www.webmasterworld.com/google/3378200.htm - I discovered my site is being scraped by a proxy.
Let's say my site is xyz.com.
The proxy is loading a live version of my site at asdf.org,
and it is also replacing every instance of my brand name with their domain on the page itself.
Currently, Google has over 12,000 pages from this proxy domain indexed!
I read through that old post and I'm having trouble finding a proper solution to apply. I keep seeing mentions of doing a double reverse-DNS lookup.
Can anyone tell me whether this solution is still valid today? And if so, could you provide code I could easily apply to my site?
[edited by: tedster at 2:58 am (utc) on May 8, 2013]
[edit reason] make link clickable [/edit]
Yes, the double-reverse Googlebot IP lookup is still very much valid today (see [webmasterworld.com...]), but only if you have a true proxy server going on. I don't think you do.
A proxy server just serves a copy of your files, passively. That is, the proxy server does not actually hold a copy of your files. Since you say that your files are being changed, the server that is outranking you sounds like a scraper server. The solution you need probably falls more in the area of a DMCA take-down request, either to their ISP or to Google.
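For reference, the double reverse-DNS lookup mentioned above can be sketched roughly as follows in Python. This is a minimal sketch, not a drop-in fix: the function name `is_verified_googlebot` and the injectable lookup parameters are hypothetical, the hostname suffixes are an assumption that should be checked against Google's current crawler documentation, and the forward lookup shown here only handles IPv4. The idea is that a request claiming to be Googlebot in its User-Agent is trusted only if its IP reverse-resolves to a Google hostname and that hostname resolves back to the same IP.

```python
import socket

# Assumption: hostname suffixes used by Google's crawlers; verify against
# Google's current documentation before relying on this list.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True only if `ip` passes a double reverse-DNS check.

    The lookup functions are parameters (hypothetical design choice) so the
    check can be exercised without touching the network.
    """
    # Step 1: reverse lookup, IP -> hostname.
    try:
        hostname = reverse_lookup(ip)[0]
    except (socket.herror, socket.gaierror):
        return False

    # Step 2: the hostname must belong to one of the expected domains.
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(GOOGLEBOT_SUFFIXES):
        return False

    # Step 3: forward lookup, hostname -> IPs, must include the original IP.
    try:
        addresses = forward_lookup(hostname)[2]
    except (socket.herror, socket.gaierror):
        return False
    return ip in addresses
```

In practice you would call this only for requests whose User-Agent claims to be Googlebot, cache the result per IP, and answer failures with a 403. A proxy that relays Googlebot's requests arrives from its own IP, so it fails the check.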
|The solution you need probably falls more in the area of a DMCA take-down request, either to their ISP or to Google. |
I agree, but would add: if it's not a proxy according to W3 standards and you file a DMCA, make sure you get the wording correct. If you say it's a proxy, even a non-transparent proxy, neither Google nor the host will legally be able to uphold the DMCA take-down request, to the best of my knowledge.
Non-transparent proxies are a PITA, but if it's really a proxy according to W3 standards, then they are requesting the pages dynamically on behalf of their users, and you can block them from doing so via .htaccess. In that case there's no need to file a DMCA complaint; just block them, because a DMCA against a non-transparent proxy will in all likelihood (again, to the best of my knowledge) not be honored by the host or Google.
Google spidering, caching and showing pages from non-transparent proxies in the results, and how I feel about it, is another story for another thread. The short version is: I don't like or agree with it. That's not the point right now, though; getting your site back where it should be is.
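The post above suggests blocking via .htaccess. As an illustration of the same idea at the application level instead (a swapped-in technique, not the .htaccess rule itself), here is a minimal Python sketch that refuses requests from listed networks. The addresses are placeholder documentation ranges, not the real proxy's, and `is_blocked` and `BLOCKED_NETWORKS` are hypothetical names; you would fill the list from the offending IPs in your own server logs.

```python
import ipaddress

# Placeholder ranges (reserved documentation addresses). Replace them with
# the addresses the proxy actually uses, taken from your own access logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_blocked(remote_addr, networks=BLOCKED_NETWORKS):
    """Return True if the requesting address falls inside a blocked network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparsable address: this sketch lets it through rather than guess.
        return False
    return any(addr in network for network in networks)
```

A request handler would call `is_blocked()` with the client address and return a 403 when it is True. Since the proxy fetches pages dynamically, it then has nothing to serve.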
I only say proxy because it's quite similar to how web proxies work, except that in this case he has dedicated his entire domain to mirroring mine and altering the brand. There is no actual proxy feature available for visitors.
He's also using CloudFlare, which I've sent a DMCA to. Their response was that, because of the service they provide, no content is hosted with them, so they can only pass my claim along to the owner.
|He's also using CloudFlare, which I've sent a DMCA to. Their response was that, because of the service they provide, no content is hosted with them, so they can only pass my claim along to the owner. |
CloudFlare should tell you where the website is hosted and then you can send a DMCA to the hosting provider.
Also, this is a cheap answer from CloudFlare. They are a CDN, so the content IS on their network/servers. I had a similar situation a couple of days ago. When I checked the IP address of a content thief, I ended up at CloudFlare because the content was on one of their servers. CloudFlare did nothing except let me know where the violating website was hosted.
Not all CloudFlare plans actually host content. I gave them a try a month or so ago, and all I had to do was point the DNS to their servers. No content was moved over.
CloudFlare should be blocked in its entirety.
I had one site that was hotlinking to our images through one of their servers and claiming that it was not them. CloudFlare said it wasn't coming from their servers, but according to my server logs it was. Blocked them all.