Should robots.txt be used to block redirected URLs?

     
3:29 pm on Jun 20, 2008 (gmt 0)

New User

5+ Year Member

joined:Sept 24, 2007
posts: 35
votes: 0


We have just migrated a site to a completely new architecture. Google still shows hundreds of old, obsolete URLs in the SERPs several weeks after the launch of the new site. Many of the old URLs now 301 redirect to the new URL format, while many others simply 404.

Is it wise to use robots.txt to disallow the old URLs, or will that prevent Google from following the 301 redirects to the new URLs? Or would disallowing them help the defunct URLs drop out and be replaced more quickly by the new URLs?
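
For illustration, the kind of Disallow rule being considered might look something like this (the paths are placeholders, not the actual site's URL structure):

    User-agent: *
    # hypothetical old URL patterns from before the migration (placeholders)
    Disallow: /old-section/
    Disallow: /old-page.html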

3:37 pm on June 20, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member caveman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 17, 2003
posts:3744
votes: 0


Two weeks is no time at all. The 404 URIs will eventually drop out of the index.

3:43 pm on June 20, 2008 (gmt 0)

New User

5+ Year Member

joined:Sept 24, 2007
posts: 35
votes: 0


Hi Caveman,

Thanks for your reply. I'm not really concerned with the 404s at all - my concern is whether using a Disallow rule in robots.txt for URLs that 301 redirect will have a negative or positive effect...

4:04 pm on June 20, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member caveman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 17, 2003
posts:3744
votes: 0


The whole point of the 301 is to tell the engines the content has moved from one location to another. If you disallow the original URIs, you prevent the 301s from working.

<added>If there are significant inbound links to the old URIs, you'll want to keep the 301s in place long term. If not, you can remove the 301s after 3-6 months, at which point the old URIs will 404 out of existence.</added>
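
As a rough sketch, assuming an Apache server with mod_alias, a 301 of that kind might look like this (the paths and domain are placeholders):

    # Permanently redirect one old URL to its new location
    Redirect permanent /old-page.html http://www.example.com/new-page/

    # Or map an old URL pattern onto the new format with a regex
    RedirectMatch 301 ^/old-articles/(\d+)\.html$ http://www.example.com/article/$1/

Left in place and not blocked by robots.txt, these let the engines see that the content has moved and carry the old URLs' listings (and link credit) over to the new ones.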

4:09 pm on June 20, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


If you disallow a URL, then Googlebot will not fetch it. If it does not fetch the old URLs, it will never see the redirects to the new URLs. Do not Disallow your old URLs.

Depending on the PageRank of each old URL, it will take anywhere between two weeks and a year for all these redirects to be resolved. For the URLs that now return 404, it may take even longer for them to drop out.

Note that since you removed those old 404 URLs intentionally, you should be returning a 410-Gone response if they are requested. Search engines currently seem to treat 404 and 410 identically, but 410 is the correct response because it says "We removed this resource and it won't be back," rather than "The requested resource cannot be found, the reason is unknown, and the resource may or may not return."

Jim
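
As a minimal sketch of that 410 approach, assuming Apache with mod_alias or mod_rewrite available (the paths are placeholders):

    # mod_alias: answer "410 Gone" for a single intentionally removed page
    Redirect gone /removed-page.html

    # mod_rewrite (.htaccess context): answer 410 for a whole retired section
    RewriteEngine On
    RewriteRule ^retired-section/ - [G,L]

Unlike a robots.txt Disallow, this keeps the old URLs crawlable; they simply answer 410 so the engines know the removal is deliberate and permanent.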

5:00 pm on June 20, 2008 (gmt 0)

New User

5+ Year Member

joined:Sept 24, 2007
posts:35
votes: 0


Great, thanks for the replies. It's nice to see a consensus from at least two people that the old URLs should still be accessible to the bots...

One exception I thought of: brand-new URLs that are not related to any old ones appear to be ranking better, so I wondered whether letting Google index the new URLs from scratch might work out better - but that was just a theory.

I will take your advice and remove the Disallow rules for the URLs that 301 redirect...

9:53 pm on June 23, 2008 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14624
votes: 88


I have to chime in on this because, in some instances, Google never forgets.

I've had an old path blocked via robots.txt, and Google Webmaster Tools still complains that I have over 30K "URLs restricted by robots.txt".

Doesn't seem to hurt my site any but they won't let those URLs go.

[edited by: incrediBILL at 10:00 pm (utc) on June 23, 2008]

10:04 pm on June 23, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Oct 26, 2002
posts:3292
votes: 6


Feed 'em 410s, Bill. It will still take a while, but eventually G will forget about them. Well, not really forget; it might still come back to test them every so often just to be sure they're really gone.

10:09 pm on June 23, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


The URLs that now redirect are quite likely to be flagged as Supplemental Results fairly soon. In that case they may very well hang around in the SERPs for many months to come.

That is NOT a problem: if anyone clicks such an entry in the SERPs, your redirect will deliver them to the correct URL and the correct content on your site anyway.

That is what the redirect is for.

 
