

robots and redirecting

I am wondering how a robot would act on a page with a redirect

   
2:55 am on Mar 7, 2008 (gmt 0)




Hi all,

My goal is to learn how robots, crawlers and spiders deal with content on pages that have a redirect or meta refresh. If you have any advice about this, and about how it is affected with and without exceptions in the robots.txt file, I would greatly appreciate it!

As a noob working with PHP, I'm guessing the ways of redirecting from one page to another are:
- a simple meta refresh
- a PHP header change
- some kind of JavaScript redirect
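
For anyone comparing the three approaches, here is a rough sketch of what each one actually sends to the client. It is written in Python rather than PHP purely for illustration, and the function names and the "/new.html" target are made up for the example. The main difference is that only the header change produces a 3xx status code; the other two are ordinary 200 pages that happen to contain a redirect instruction.

```python
import json
from html import escape
from typing import Dict, Tuple

# (status code, response headers, body) -- a simplified stand-in for a real response
Response = Tuple[int, Dict[str, str], str]


def meta_refresh_page(target: str, delay: int = 0) -> Response:
    """Ordinary 200 page whose HTML asks the client to move on after `delay` seconds."""
    body = (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{delay}; url={escape(target, quote=True)}">'
        "</head><body>Moving...</body></html>"
    )
    return 200, {"Content-Type": "text/html"}, body


def header_redirect(target: str, permanent: bool = True) -> Response:
    """Server-side redirect: a 3xx status plus a Location header, and no page body."""
    status = 301 if permanent else 302
    return status, {"Location": target}, ""


def javascript_redirect_page(target: str) -> Response:
    """Ordinary 200 page; the redirect only happens if the client runs the script."""
    body = (
        "<html><head><script>"
        f"window.location.replace({json.dumps(target)});"
        "</script></head><body>Moving...</body></html>"
    )
    return 200, {"Content-Type": "text/html"}, body
```

In PHP the header change would normally be a call to header() with a Location value. Note that PHP sends a 302 for that unless you set the status code yourself, so a permanent redirect has to be asked for explicitly.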

When a crawler comes across these kinds of pages with redirects included, does it:

a) scan the page and then jump to the redirected page
* result - both original and redirected pages are indexed

b) scan the page and then stop
* result - only the original page gets indexed

c) ignore the original page, jump to the redirect page and scan
* result - only the redirect page gets indexed

I am curious as I'm using redirects a bit this week and wondering how relevant content (for indexing) would be treated on these 'original' pre-redirect pages.

Thanks again

8:56 pm on Mar 8, 2008 (gmt 0)



Hi and welcome [webmasterworld.com], melanger :)

It would be worthwhile creating some pages with the types of redirects you have in mind, and seeing how spiders react to them. I'd also recommend a read of the HTTP/1.1 Redirection Status Codes [w3.org].

My expectation would be:

- 'Meta refresh' redirects would be interpreted as an 'unknown' or temporary redirect (as if the page delivered a 302 status code). Both the original and destination URLs can be indexed.
- Server-side redirection would be interpreted according to the status code. A 301 means only one URL is indexed.
- It depends on your JavaScript, but JavaScript may not trigger a redirect at all. Spiders parse JavaScript, but don't usually execute it.
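
To make those expectations concrete, here is a small Python sketch that encodes them as a lookup. It only restates the guesses above and is not a description of what any particular search engine really does. The names MetaRefreshFinder and expected_handling are invented for the example, and it assumes the header key is spelled "Location" exactly.

```python
from html.parser import HTMLParser
from typing import Dict


class MetaRefreshFinder(HTMLParser):
    """Sets .found when the page contains a <meta http-equiv="refresh"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            values = dict(attrs)
            if (values.get("http-equiv") or "").lower() == "refresh":
                self.found = True


def expected_handling(status: int, headers: Dict[str, str], body: str) -> str:
    """Return the handling expected in this thread (an assumption, not a spec)."""
    if 300 <= status < 400 and "Location" in headers:
        if status == 301:
            return "permanent redirect: only one URL indexed"
        if status == 302:
            return "temporary redirect: both URLs can be indexed"
        return "other 3xx: interpreted according to its status code"

    finder = MetaRefreshFinder()
    finder.feed(body)
    if finder.found:
        return "meta refresh: treated like a temporary redirect"

    # A JavaScript redirect lands here: the script is parsed but not executed,
    # so the page looks like any other page.
    return "no redirect seen: page indexed as it stands"
```

The point of the last branch is that a page relying on JavaScript to redirect looks, to a spider that does not run scripts, exactly like a page with no redirect at all.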

Note that server-side redirects (with an appropriate status code) don't deliver any page for search engines to spider. If a URL always redirects, there's no content to 'scan'. But that doesn't mean that search engines won't index that URL.
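
If you want to see this for yourself, a quick local test along these lines should do it. This is only a sketch in Python: the "/old.html" and "/new.html" paths and the handler are made up for the test, and the client deliberately does not follow the redirect, which is roughly what a spider's first request looks like.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test site: /old.html always redirects, anything else is a page."""

    def do_GET(self):
        if self.path == "/old.html":
            self.send_response(301)
            self.send_header("Location", "/new.html")
            self.send_header("Content-Length", "0")
            self.end_headers()  # no body is written for the redirect
        else:
            body = b"<html><body>New page</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test quiet


def fetch_without_following(host: str, port: int, path: str):
    """http.client never follows redirects, so we see the raw first response."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), resp.read()
    finally:
        conn.close()


def demo():
    """Start the test server on a free local port and fetch both URLs once."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        old = fetch_without_following("127.0.0.1", port, "/old.html")
        new = fetch_without_following("127.0.0.1", port, "/new.html")
        return old, new
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the status, Location header and body for each URL. The redirecting URL comes back with an empty body, so any content you put on that page is never delivered to the client.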

 
