Old pages being crawled after changing server & relation to Panda?

         

SEO2Go

9:16 pm on Aug 30, 2011 (gmt 0)

10+ Year Member



I was hit by Panda back on April 12th.

Anyway, I realize duplicate content might be one of the triggers based on some of the research I've done here and on other sites.

Back in March I went from a shared provider to a dedicated server. I just recently realized my old pages were still being crawled on the old server, despite me having changed the DNS information to the new server. The site doesn't appear as www.mysite.com--it shows the IP address of the old server and then a webpage.

Example: ipaddress_of_old_provider/sample_webpage.html.

I discovered this in the links section of Google Webmaster Tools. So basically I had an exact duplicate of my site on another server that Google was still crawling for some reason. I have since deleted the directories on the old server, and those pages now produce a 404.

So my question is, what steps should one take when switching servers to prevent something like this from happening? I thought that since I changed the DNS entries, Google would simply begin crawling the new site and ignore the old one. Is there anything I can put on the old site to instruct Googlebot to ignore the old site and give priority to the new site? Something in the .htaccess or robots.txt file?

Again, the old site has been deleted, but I wonder if this had any bearing on why my site was targeted by Panda. Thanks for any insight.

levo

12:32 am on Aug 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I use the following code to redirect anything but the main domain name. It takes care of dedicated IPs or parked domains etc.

RewriteCond %{HTTP_HOST} !^www\.domain\.com 
RewriteRule (.*) http://www.domain.com$1 [R=301,L]

g1smd

1:10 am on Aug 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That code does not take care of all non-canonical requests. It misses a few.

Use this alternative:
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


Every code change is important. There are five.
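For anyone comparing the two snippets, here is an annotated sketch of the second rule. The comments are one reading of why each change matters, not a statement from the poster. The RewriteEngine line is an assumption: it is only needed if the .htaccess file does not already enable mod_rewrite.

```apache
# Sketch only: assumes mod_rewrite is available and this file sits in the site root.
RewriteEngine On

# Redirect unless the Host header is exactly www.example.com, or is empty.
#   "$" anchors the end of the pattern, so variants such as
#   www.example.com:80 or www.example.com. (trailing dot) are also redirected.
#   "(...)?" lets a request with no Host header (HTTP/1.0) through,
#   presumably to avoid a redirect loop for clients that never send one.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$

# In .htaccess the matched path has no leading slash, so the target
# needs its own "/" between the hostname and $1.
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

Without the extra slash in the target, a request for sample_webpage.html from .htaccess would redirect to www.example.comsample_webpage.html, which is not a valid hostname.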

SEO2Go

11:31 am on Aug 31, 2011 (gmt 0)

10+ Year Member



Thanks guys. I'll add the code to the old site's .htaccess file. I always thought simply changing the DNS pointers would be enough, but now I've learned my lesson.

Marshall

12:46 pm on Aug 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For what it's worth, I moved a site to a new server 8 months ago and deleted the pages off the old server. However, the old pages still come up in Google despite the fact they don't even exist.

g1smd

12:58 pm on Aug 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What HTTP status code do the URLs return?

Change it to "410 Gone" and they'll get the message.
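A minimal sketch of how the old account could return 410 for everything, assuming it still honors .htaccess files and has mod_alias or mod_rewrite enabled. Use one variant or the other, not both.

```apache
# Hypothetical .htaccess in the OLD server's document root.

# Variant A (mod_alias): answer every path under / with "410 Gone".
Redirect gone /

# Variant B (mod_rewrite): same effect using the G flag.
# RewriteEngine On
# RewriteRule ^ - [G]
```

Either variant only makes sense if the old copy is being retired outright. If the old host should pass visitors and crawlers on to the new site instead, the 301 redirect discussed earlier in the thread is the alternative.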

SEO2Go

1:31 pm on Aug 31, 2011 (gmt 0)

10+ Year Member



Since I deleted the pages, they now produce a 404. I'm not sure how to implement the 410. Also, modifying the .htaccess with the suggestions above did not work--it still went to the old page (which is now deleted and producing the 404). I will post part of the URL to give you an example of what the old site is doing:

http://29-223.bluehost.com/example_page.htm

In the above example, you can see how G was still getting to my page even though my website pointers had been changed. When I clicked the link in Webmaster Tools, it went to the old site and displayed the old content--which is now on the new server.

Maybe just deleting them will fix the problem over time. I regret not deleting them sooner. Thanks all.

[edited by: tedster at 2:43 pm (utc) on Aug 31, 2011]
[edit reason] make example URL display in full [/edit]

levo

4:35 pm on Aug 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would remove the old/wrong URLs from Google using robots.txt and Webmaster Tools (WMT).

aristotle

6:05 pm on Aug 31, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Could someone have linked to one of the old pages using the IP address as the URL? This might cause Google to keep crawling them.

Also, in Webmaster Tools does it show you as the verified owner of the old site?

Also, do any of the old pages appear in the Google SERPs?