Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
robots and redirecting
I am wondering how a robot would act on a page with a redirect
melanger




 
Msg#: 3593663 posted 2:55 am on Mar 7, 2008 (gmt 0)

Hi all,

My goal is to learn how robots, crawlers and spiders deal with content on pages that have a redirect or meta refresh. If you have any advice about this, and about how it is affected with and without exceptions in the robots.txt file, I would greatly appreciate it!

As a noob working with PHP, I'm guessing the ways of redirecting from one page to another are:
- a simple meta refresh
- a PHP header change
- some kind of JavaScript redirect

When a crawler comes across these kinds of pages with redirects included, does it:

a) scan the page and then jump to the redirected page
* result - both original and redirected page are indexed

b) scan the page and then stop
* result - only original page gets indexed

c) ignore original page, jump to redirect page and scan
* result - only redirect page gets indexed

I am curious as I'm using redirects a bit this week and wondering how relevant content (for indexing) would be treated on these 'original' pre-redirect pages.

Thanks again

 

Receptional Andy



 
Msg#: 3593663 posted 8:56 pm on Mar 8, 2008 (gmt 0)

Hi and welcome [webmasterworld.com], melanger :)

It would be worthwhile creating some pages with the types of redirects you have in mind, and seeing how spiders react to them. I'd also recommend a read of the HTTP/1.1 Redirection Status Codes [w3.org].
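
If you want a quick way to set up such test pages locally, here is a minimal sketch in Python using only the standard library. The paths (/meta, /moved, /js, /destination) and the port are made up for illustration and are not anything a spider expects. In PHP, the /moved branch would correspond to a header() call that sends a Location header with a 301 status.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed destination path, chosen only for this example.
TARGET = "/destination"

# Pages that return 200 and carry their own content.
PAGES = {
    # Client-side redirect via meta refresh: the page still has a body.
    "/meta": (
        '<html><head><meta http-equiv="refresh" content="0; url=/destination">'
        "</head><body>Original page content (meta refresh)</body></html>"
    ),
    # Client-side redirect via JavaScript: only takes effect if the script runs.
    "/js": (
        "<html><body>Original page content (JavaScript)"
        '<script>window.location.href = "/destination";</script></body></html>'
    ),
    "/destination": "<html><body>Destination page</body></html>",
}


class RedirectTestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/moved":
            # Server-side redirect: status code plus Location header, no page body.
            self.send_response(301)
            self.send_header("Location", TARGET)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path in PAGES:
            body = PAGES[self.path].encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def make_server(port=8000):
    """Build the test server; call serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), RedirectTestHandler)
```

Running `make_server().serve_forever()` starts it on port 8000. The handler logs each request to stderr, so on a publicly reachable copy of these pages you could watch which URLs a spider actually requests.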

My expectation would be:

- A 'meta refresh' redirect would be interpreted as an 'unknown' or temporary redirect (as if it delivered a 302 status code). Both the original and destination URLs can be indexed.
- Server-side redirection would be interpreted according to the status code. A 301 means only one URL is indexed.
- It depends on your JavaScript, but JavaScript may not trigger a redirect at all. Spiders parse JavaScript, but don't usually execute it.
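
To make those three cases concrete, here is a rough Python sketch of how a simple crawler might classify a response. The function name `classify` and the labels it returns are my own invention for illustration, and real search engines will certainly be more involved. Note that it deliberately ignores script content, mirroring the "parse but don't execute" expectation.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects the target URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "5; url=/new-page".
        _, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.target = rest[4:].strip().strip("'\"")


def classify(status, headers, body):
    """Return (kind, target) for one HTTP response.

    kind is one of "permanent", "temporary", "meta-refresh" or "none".
    """
    location = headers.get("Location")
    if status in (301, 308) and location:
        return ("permanent", location)
    if status in (302, 303, 307) and location:
        return ("temporary", location)
    finder = MetaRefreshFinder()
    finder.feed(body)
    if finder.target:
        return ("meta-refresh", finder.target)
    # A JavaScript redirect ends up here: the script is never executed.
    return ("none", None)
```

For example, `classify(200, {}, '<script>window.location="/new"</script>')` falls through to `("none", None)`, so a crawler working this way would simply index the original page.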

Note that server-side redirects (with an appropriate status code) don't deliver any page for search engines to spider. If a URL always redirects, there's no content to 'scan'. But that doesn't mean that search engines won't index that URL.
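
You can see this for yourself by requesting a URL once without following the redirect. A minimal sketch, assuming plain Python with the standard library; `fetch_once` and the User-Agent string are hypothetical names picked for this example:

```python
import http.client
from urllib.parse import urlsplit


def fetch_once(url):
    """Request a URL a single time, without following redirects.

    Returns (status, location, body_length).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "redirect-check-sketch"})
        resp = conn.getresponse()
        body = resp.read()
        return resp.status, resp.getheader("Location"), len(body)
    finally:
        conn.close()
```

For a URL that always answers with a 301, you would typically get the status, the Location header, and an empty or near-empty body, which is all a spider has to work with for that URL.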

