I checked my results and noticed that Google has spidered some proxy servers.
Nearly all of my 200+ pages can be found under
www.MyDomain.de (as always)
and
belediye.nameltd.com/cgi-bin/nph-proxy.cgi/111110A/http/www.MyDomain.de
www.wakedogg.com/cgi-bin/nph-proxy.cgi/000110A/http/www.MyDomain.de
[proxy.citizenlab.org...]
www.vnphys.org/cgi-bin/nph-surfweb.cgi/11110/http/www.MyDomain.de
Now I am really worried that Google will check for duplicate pages and find both my own site and the copies on the proxy servers.
GoogleGuy – Protector of the ignorant and fearful.
Please help
Welcome to WebmasterWorld [webmasterworld.com]!
While I've never had to deal with this kind of problem before, it seems to me that you could take action to stop it.
Set up your server to refuse connections from these proxies, or refuse referrals from them, whichever is appropriate. Without knowing how Google "sees" your site through the proxies, and what kind of log entry a request through them leaves, it's impossible to say which approach fits. But you should be able to do something on your server to prevent problems with search engines and to avoid other possible problems.
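If your site runs on Apache, something along these lines in an .htaccess file might be a starting point. This is only a sketch using the proxy hostnames from your list; whether a hostname rule actually catches them depends on the reverse DNS of the proxies' IP addresses, so you may need to list the IPs instead.

# Deny requests from the known proxy hosts (mod_access).
# Hostname rules force a reverse DNS lookup on each request;
# if the proxies' reverse DNS does not match these names, use their IP addresses here instead.
Order Allow,Deny
Allow from all
Deny from belediye.nameltd.com
Deny from www.wakedogg.com
Deny from www.vnphys.org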
Jim
But I don't think this is the solution.
Reason 1:
Even if I find a way to refuse connections from these proxies, there are thousands of such proxies on the net.
Reason 2:
It's not quick enough. The proxy copies will not disappear before DeepBot spiders those pages again.
Reason 3:
My pages are already in Google's database, together with the duplicate pages from the proxies.
The moment Google starts to look for duplicate pages, I am lost!
Last month I changed the URLs of 12 pages.
I marked the old pages with NOINDEX,FOLLOW and set a link from each old page to its new page.
I thought Google would find the old page before the new page and everything would be OK.
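To illustrate the setup, each old page looked roughly like this (the file name is only a placeholder, not one of my real URLs):

<!-- Head of the old page: keep it out of the index, but let robots follow the link to the replacement. -->
<meta name="robots" content="NOINDEX,FOLLOW">
<!-- Body of the old page: a plain link pointing to the new URL. -->
<a href="http://www.MyDomain.de/new-page.html">This page has moved to a new address.</a>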
During the last 4 weeks the FreshBot visited each of these pages at least 10 times. What happened?
- 4 pages got killed because of duplicate content (when Google compares pages, it apparently ignores NOINDEX or sometimes works from old copies in its database).
- 4 pages are still in the index under the old URL.
- 4 pages have changed correctly.
My conclusion: Google only compares the pages in its database. If it finds two similar pages, it reacts. I don't think it checks the current status of the pages before deleting the duplicates.
Help! What can I do? I am really anxious.