Forum Moderators: Robert Charlton & goodroi
When I started searching Google for some of my site's prominent keywords, I was seeing one of two things:
1. This particular 'proxy' site was indexed with a proxified copy of my URL, and there was no sign of my original site in the index; or
2. The same proxy was indexed with a copy of my site above my own content in the results.
I decided to block the proxy server's IP address, returning a customized page refusing access to my site, thinking that this would gradually get the proxy de-indexed and my site re-indexed. But after about 10 days things have continued to get worse from my point of view: less and less of my original content is being indexed, and these 'refused access' pages are being indexed under the proxy's URLs.
I am not in the USA, so can I file a DMCA complaint with Google? Will this resolve my problem?
There's also a thread within the forum hot topics [webmasterworld.com] which is linked from the homepage of the Google forum: Proxy Server URLs Can Hijack Your Google Ranking - how to defend? [webmasterworld.com].
More recently, there have been reports that Google may have fixed the problem, in October '07 [webmasterworld.com] and again in April this year [webmasterworld.com]. From your report it looks like this problem may not be resolved yet, unfortunately.
Initially I just blocked the IP of the proxy, but as they add their own header to the output, the '403 Forbidden' status is not detected by Googlebot, so the pages remain indexed.
I then thought OK, I will redirect the proxy to a specific page with a message telling users why they can't access the site through it. Again, that made no difference: this one page was indexed numerous times in the Google SERPs, each time under a different proxified URL.
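One way to confirm what the proxy is actually handing back to Googlebot is to look at the status line of the response fetched through the proxified URL (with curl -I, for example). A small sketch of that check — the logic is generic; the status lines fed to it would come from your own capture:

```shell
#!/bin/sh
# Given the first line of an HTTP response, report whether the origin's
# 403 survived the proxy, or was rewritten on the way through.
# The proxy rewrites responses, so a 403 sent by the origin can come
# back as 200 -- which is why Googlebot keeps indexing the pages.
saw_forbidden() {
  case "$1" in
    *" 403 "*|*" 403") echo "blocked" ;;   # origin 403 reached the client
    *) echo "rewritten" ;;                 # proxy replaced the status
  esac
}

saw_forbidden 'HTTP/1.1 403 Forbidden'   # -> blocked
saw_forbidden 'HTTP/1.1 200 OK'          # -> rewritten
```

If the proxified URL reports "rewritten" even while your server is sending a 403, that confirms the block never reaches Googlebot at the HTTP level, and the refusal has to happen lower down (e.g. at the firewall).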
I am at a loss as to what to do.
I was thinking of something along the lines of
/sbin/iptables -I INPUT 1 -s IP.GOES.HERE/32 -p tcp --syn --dport 80 -j DROP
but you might want to verify that; my skills with iptables etc. aren't that great.
Read the threads in the hot topics, particularly the 'How to Verify Googlebot' one; this will stop Googlebot from indexing your site through the proxy.
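The verification technique that thread describes is a two-step DNS check: reverse-resolve the visiting IP, require the name to fall under googlebot.com (or google.com), then forward-resolve that name and require it to match the original IP. A sketch of it, assuming the `host` utility is available — the DNS lookups themselves need network access, so only the hostname check runs standalone here:

```shell
#!/bin/sh
# Accept only hostnames under googlebot.com or google.com.
is_google_host() {
  case "$1" in
    *.googlebot.com|*.google.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Full check (illustrative; needs DNS):
# verify_googlebot() {
#   ip="$1"
#   name=$(host "$ip" | awk '/pointer/ { print $NF }' | sed 's/\.$//')
#   is_google_host "$name" || return 1
#   host "$name" | awk '/has address/ { print $NF }' | grep -qx "$ip"
# }
```

A spoofer can set any reverse-DNS name it likes, which is why the forward lookup back to the original IP is the step that actually proves the visitor is Google's crawler.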
The same User-Agent is passed to my server by the proxy script no matter what the real User-Agent is, so when Joe Bloggs or Googlebot visits the site via the proxy I see a User-Agent of 'Mozilla/5.0 Compatible' every time.
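Since the proxy always forwards that one fixed string, the string itself works as a signature for proxy traffic at the origin — useful for logging or flagging hits even though, as noted above, an HTTP-level 403 gets rewritten before Googlebot sees it. A minimal sketch ('Mozilla/5.0 Compatible' is the exact UA reported above; the CGI usage is hypothetical):

```shell
#!/bin/sh
# Flag requests carrying the proxy script's fixed User-Agent string.
is_proxy_ua() {
  [ "$1" = "Mozilla/5.0 Compatible" ]
}

# Hypothetical use in a CGI wrapper:
# if is_proxy_ua "$HTTP_USER_AGENT"; then
#   printf 'Status: 403 Forbidden\r\n\r\n'; exit 0
# fi
```

Real Googlebot identifies itself with its own UA (e.g. 'Mozilla/5.0 (compatible; Googlebot/2.1; ...)'), so nothing matching this signature can be the crawler arriving directly — which supports dropping that traffic at the firewall rather than answering it.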
janharders
You could be right, hadn't thought about using iptables.
Thanks
Greg
Now I just need to hope and pray that their URLs are dropped from the Google SERPs, that mine reappear, and that it doesn't take too long.
Contacting the host and owner, asking that the URLs be 404'ed.
Contacting Google via Webmaster Tools at the same time, explaining the situation.
Blocking requests from those IPs.
In general, it took about two weeks on a couple of websites of mine before the issue was resolved, but that was over two years ago.