
Forum Moderators: Robert Charlton & andy langton & goodroi


Index shows old urls in spite of 301 redirects

     
7:37 am on Nov 3, 2012 (gmt 0)

Full Member

10+ Year Member

joined:Aug 5, 2003
posts: 246
votes: 6


The hand-rolled CMS I use for entering news on my site has a re-edit page that previously pointed to an old format for the URLs (which I have just corrected).

However, I just noticed today that I have about 98 pages in the index that are using the old news URL.

Here's the issue:

1) The old URL gets redirected to the new one by a 301. Live HTTP Headers shows the redirect.
2) The edit page is in an area of the site that bots should not be crawling, because of a robots.txt directive.
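
For anyone who wants to confirm point 1 outside the browser, here is a rough Python sketch that looks at the first response only, without following the redirect. The URL shapes (`/news.php?id=42` for the old style, `/news/42/` for the new one) and the little local server are made up for illustration; they stand in for the real site, whose URL formats aren't given here.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


class RedirectingHandler(BaseHTTPRequestHandler):
    """Stand-in for the site: 301s a hypothetical old news URL to a new one."""

    def do_HEAD(self):
        parts = urlsplit(self.path)
        story_id = parse_qs(parts.query).get("id", [""])[0]
        if parts.path == "/news.php" and story_id.isdigit():
            # Old-style URL: permanent redirect to the new-style URL.
            self.send_response(301)
            self.send_header("Location", f"/news/{story_id}/")
        else:
            self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def first_hop(host, port, path):
    """Return (status, Location header) for one request, not following redirects."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def demo():
    """Start the stand-in server on a free local port and probe one old-style URL."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return first_hop("127.0.0.1", server.server_address[1], "/news.php?id=42")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Pointing `first_hop` at the real host and an old-style path should show the same thing Live HTTP Headers does: a 301 status plus the new URL in `Location`.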

How is Google storing the old URL style? Why would it be ignoring the redirect and keeping the old version?
8:16 am on Nov 3, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13923
votes: 496


Is it bringing up both? Two versions of each page? Or did you mean that the new correct URL is in a roboted-out area?

If Google can't get to the new version, it will keep serving up cached copies of the old version forever. No-crawl does not mean no-index.
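
To illustrate the no-crawl half of that, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt rules and the URLs are invented for the example (an `/admin/` block is only a guess at what the real file looks like). All it shows is that a compliant crawler will not fetch a blocked URL, so it never sees a redirect or a meta tag served there; it says nothing about whether the URL gets listed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block everything under /admin/ for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def can_crawl(url, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/news/42/",
                "https://example.com/admin/edit.php?id=42"):
        print(url, can_crawl(url))
```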

If I've understood the situation correctly, you need to do two things. First, remove all those old URLs manually in GWT (Google Webmaster Tools). Fortunately, you can do whole directories at once. Then get rid of the robots.txt block and replace it with a meta noindex on the individual pages.
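
A quick way to check that a page really carries the meta noindex once it has been added is sketched below. It uses only Python's standard `html.parser`; the sample markup is invented, and the check only looks at `<meta name="robots">`, not at an `X-Robots-Tag` response header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the markup contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(sample))
```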
8:43 am on Nov 3, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10932
votes: 80


Are you sure Googlebot has crawled the old-style URLs, and seen the 301?
9:27 am on Nov 3, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Google will request every URL it has ever seen, forever. Once they see the redirect, the old URL will be delisted from the SERPs. Google will still occasionally request the old URL to check the current status, forever. Make sure that all links on the site point to the new URLs.
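
One way to audit the last point is to scan each page's markup for links that still match the old pattern. The sketch below assumes, purely for illustration, that old-style URLs look like `/news.php?id=123`; swap in whatever the real old format is.

```python
import re
from html.parser import HTMLParser

# Hypothetical old-style URL pattern; adjust to the site's real old format.
OLD_STYLE = re.compile(r"/news\.php\?id=\d+")


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in the markup."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def old_style_links(html):
    """Return the hrefs in html that still use the old URL format."""
    collector = HrefCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if OLD_STYLE.search(href)]


if __name__ == "__main__":
    page = '<a href="/news/42/">new</a> <a href="/news.php?id=41">more info</a>'
    print(old_style_links(page))
```

Run over the templates or rendered pages, including any admin pages, an empty result for every page suggests no internal link is still advertising the old format.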
9:50 am on Nov 3, 2012 (gmt 0)

Full Member

10+ Year Member

joined:Aug 5, 2003
posts: 246
votes: 6


Sorry I wasn't clear in my post. Late night and a lack of coffee.

The edit page was the only place where the old style URL was showing up (it was a more info link for each news story that would bring up the full article.)

That edit page is in a subdirectory blocked by robots.txt and protected by HTTP auth. That's the only place those links showed up.

I haven't used those old-style URLs in the public areas in years. But Google was somehow indexing stories from even a few days ago via the incorrect "more info" link in the robots.txt-blocked area. The public-facing pages have a redirect that catches any of the old-style URLs with a 301. That's been in place for at least 10 years.

I'm thinking the only way Google could have been aware of those links is via the Google Analytics code that is in the main template and therefore even gets included on the admin pages.