I was hit by Panda back on April 12th.
Anyway, I realize duplicate content might be one of the triggers based on some of the research I've done here and on other sites.
Back in March I moved from a shared provider to a dedicated server. I only recently realized my old pages were still being crawled on the old server, even though I had already changed the DNS to point to the new server. The duplicate doesn't appear as www.mysite.com; it shows the IP address of the old server followed by the page name.
Example: ipaddress_of_old_provider/sample_webpage.html.
I discovered this in the links section of Google Webmaster Tools. So basically I had an exact duplicate of my site on another server that Google was still crawling for some reason. I have since deleted the directories on the old server, and those pages now produce a 404.
So my question is, what steps should one take when switching servers to prevent something like this from happening? I assumed that once I changed the DNS entries Google would simply begin crawling the new site and ignore the old one. Is there anything I could have put on the old site to tell Googlebot to ignore it and give priority to the new site? Something in the .htaccess or robots.txt file, like the examples below?
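For instance, would something along these lines on the old server have been the right move? (These are just guesses on my part, and www.mysite.com stands in for my actual domain.)

In the old server's .htaccess, a blanket 301 redirect of every request to the same path on the new domain:

    # Send all requests for the old copy to the new domain permanently
    RewriteEngine On
    RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]

Or, in the old server's robots.txt, blocking crawling of the old copy entirely:

    User-agent: *
    Disallow: /

I'm not sure whether the redirect, the robots.txt block, or both together is the recommended approach, so I'd appreciate hearing which one people actually use when migrating.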
Again, the old copy of the site has been deleted, but I wonder if it had any bearing on why my site was targeted by Panda. Thanks for any insight.