
Using redirects

What is the best solution to this problem?

         

xbase234

7:22 pm on Nov 17, 2003 (gmt 0)

10+ Year Member



My site currently uses 302 redirects (as opposed to 301s) for the purpose of load balancing, but I'm concerned that our pages may get penalized for this. Is there any evidence that 302s may cause the bots to crawl the other way?

GoogleGuy - any thoughts or suggestions?

Most posts on this topic seem inconclusive - other than the fact that just about every type of redirect except a 301 will cause problems. Can anyone also shed more light on the worst-case scenario for using a 302 or other type of redirect for otherwise legitimate purposes?

caveman

7:40 pm on Nov 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I was advised in here about six months ago - by some senior members - that 302s were very safe. However, since then I've read others say "maybe not," since while 302s are often legit, they are also a commonly used tool of spammers, and with Google on the warpath these days, who knows....

Would love to hear the latest wisdom on this from WW's resident gurus. :-)

caveman

Brett_Tabke

7:46 pm on Nov 17, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



302's used to be safe. Then there was a weird google - ummm, we say bug, but they say - ummm, they said nothing ;-)

Ya, 302's _should_ be safe.

xbase234

10:40 pm on Nov 17, 2003 (gmt 0)

10+ Year Member



So far, this is helpful. Can anyone else add to this? Thanks.

caveman

3:53 am on Nov 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Normally after a post from Brett, I'd stop asking questions...but do you think we can take "_should_" to the bank, in light of Florida, bugs, etc.? (Sorry, but I had to ask...we're planning a new site right now and it could make a significant difference to our success if G blasts us for using 302s...don't want to become collateral damage...)

caveman

xbase234

4:37 pm on Nov 24, 2003 (gmt 0)

10+ Year Member



Brett -

Can you give any examples of the pitfalls of 302s? Pages disappearing, loss of rank, loss of backlinks, etc.?

Also, maybe we should run 301s instead of 302s for load balancing?

Thanks

Sharper

6:09 pm on Nov 24, 2003 (gmt 0)

10+ Year Member



The relevant question for me would be: HOW are you using 302s to load balance?

I'm trying to think of how you would do that, and I can't think of a way that would make more sense than several other much easier load-balancing methods, so I'd love to understand how this is set up.

Do you have one web server that 302's requests round-robin to other web servers using an SSI, or what?

xbase234

6:45 pm on Nov 24, 2003 (gmt 0)

10+ Year Member



My understanding from the tech admin is that we are running the round-robin approach, as you described (from one to multiple), using dynamically served pages.
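For illustration, the round-robin 302 setup described above could be sketched roughly like this (hostnames, port, and the `next_backend` helper are all hypothetical, not the poster's actual setup):

```python
# Minimal sketch of a round-robin 302 redirector (hostnames are
# hypothetical). Every request to the primary server is answered
# with a "302 Found" pointing at the next backend in rotation;
# because 302 is temporary, clients keep the original URL.
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKENDS = itertools.cycle([
    "http://www1.example.com",
    "http://www2.example.com",
    "http://www3.example.com",
])

def next_backend(path):
    """Return the full redirect target for a given request path."""
    return next(BACKENDS) + path

class RoundRobinRedirect(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(302)  # temporary, as opposed to 301
        self.send_header("Location", next_backend(self.path))
        self.end_headers()

# To run the redirector on port 8080:
# HTTPServer(("", 8080), RoundRobinRedirect).serve_forever()
```

Note that with this scheme every visitor (and every spider) hits the primary server first and sees an extra round trip, which is part of what the discussion below is about.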

Sharper

3:30 pm on Nov 25, 2003 (gmt 0)

10+ Year Member



In that case you don't want to use a 301 instead, because that'd tell each visitor (and Google) that the site has moved permanently to the temporary load-balance URL you are sending them to.

Since, from what you describe, each visitor hits a dynamic page to start with and is then redirected to the "home" page on another server, everyone has to get at least one answer from the primary server. In that case, perhaps you should think about just serving up the home page from the primary server, but rewriting the links dynamically to lead to the "other" servers when they are followed. That may save a little overhead, but you'd have to consider the effect on spidering of having the links constantly change. Can't see how it'd be worse than it is now, though.

A better solution would be to use DNS round-robin with a low TTL so that you can drop a server that goes down. That would have every server show up as the same domain name/URL, but with a different IP address behind it. That way web clients don't all hit the same server for the first page, and they don't get a 302 or 301 or anything - they just get the pages from one of your servers.
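As an illustration, the DNS round-robin idea might look like this in a BIND-style zone file (name, TTL, and addresses are placeholders):

```
; Three A records for the same name; resolvers rotate among them.
; The 60-second TTL lets a dead server's record expire quickly.
www   60   IN   A   192.0.2.10
www   60   IN   A   192.0.2.11
www   60   IN   A   192.0.2.12
```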

An even better solution would be to get a port on a real load balancer. If you are spending the money to host multiple servers and have so much traffic that you need to load balance it, then you can probably afford a port on a "shared" load balancer provided by your ISP. If it's a really big site, you could also buy your own; I'd suggest a Foundry ServerIronXL in that case. It can also double as a nice switch.

If you need to load balance due to traffic levels, you should have better options available to you than the method you describe.

SevanB2

3:02 am on Nov 26, 2003 (gmt 0)

10+ Year Member



Why don't you use JavaScript redirections? I don't know of any search engine that does not index pages with JavaScript redirection.

oodlum

8:04 am on Nov 26, 2003 (gmt 0)

10+ Year Member



Why don't you use JavaScript redirections? I don't know of any search engine that does not index pages with JavaScript redirection.

Yeah, but a lot of people will report you for spamming as soon as they see one of those.

jim_w

8:32 am on Nov 26, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I went through my site and renamed almost all the pages not too long ago. I gave them 301s, and when Gbot went through, it, ummm, didn't get all of the redirects. It missed the link page that had the renamed link in it. (red flag #1)

So I put in 302s to see what would happen. Gbot - maybe, ummm, a different version - found the 302s and fixed its database, so all was fine and well up until FL. Then I noticed that once again Gbot wasn't getting the link page even though it already had the correct page for the redirect. (red flag #2) So I was right back in the same boat, but I had not changed anything. Now I have dropped out of all of the top (term used loosely) 100 positions. Not only that, but after Gbot read robots.txt - which it, and no other SE, has ever had a problem with - it touched one of the pages I disallow. It did not read any bytes, but it touched it.

I'm now going back and changing the 302s back to 301s, except where I had a large number of redirects, because this is the only reason I can think of that my rankings may have dropped so dramatically. Where I have the large number of redirects, silly me had article01, article02, article03, etc. I changed their names to something I could manage, but after that I changed the directory where they live. This is such a cluster-funk that I am removing all the redirects for all the articles until I figure out which SEs have which dir/article in their databases. I'll add 301s back in as I see they are needed. Looksmart just came through and found the article01, article02 stuff, so I know some legacy stuff exists in some of the SEs' databases. But I bet I get tagged for 404s then; it's the way my luck runs. I did a whole bunch of 301s and 302s just when there were discrepancies in the way at least one SE bot handled them.

As far as I can tell, to add to Brett's rules: above all else, really think hard about the structure of your web site - including the file names - so you won't have to go through a bunch of work two or three times. Then, if there are any, umm, bugs with redirects in any of the SEs, you will not have to worry about it.

For example, if I had the courage to redo the site again, I would put most of the web site inside the public_html folder in a folder called public, so that I could control access better via .htaccess.
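For instance (folder names are hypothetical, not jim_w's actual layout), a one-line .htaccess dropped into any folder you want to keep out of direct web reach:

```
# Deny all direct web access to this folder and everything under
# it; only folders without this file remain publicly reachable.
Deny from all
```

With the browsable content gathered under one "public" folder, everything else can be locked down this way in a single place.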

Well that’s my experience with it for what it is worth.

caveman

1:34 pm on Nov 26, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jim_w, I second that emotion! We had a similar experience, since for too long we worried more about content than structure. What a killer.

We eventually 301-redirected all the internal pages to new file names that were more logical, etc. The 301s worked...kind of. It took G *six months* to get it all properly updated. That was not a fun six months...and I've read a lot of similar posts in here. Some get lucky and see it all resolve sooner, but not always.

wanna_learn

6:57 am on Nov 27, 2003 (gmt 0)

10+ Year Member



Just noticed the entry below in my log - should I worry about this?

This site has literally vanished from the Google index!

202.156.2.xx - - [02/Nov/2003:19:54:42 +0530] "GET /vtsamodehaveli HTTP/1.1" 301 340 "http://www.myxyzsite.com/hotels-tour.htm" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"

PS - I replaced the exact URL and exact Ip due to TOS of Forum.

nakulgoyal

1:27 am on Dec 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Do you feel this happened after Florida? Your dates say yes; just want to confirm. If yes, sticky me your URL.

claus

4:19 pm on Dec 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Brett's right, there are some issues that might be considered Google bugs, but imho they aren't really bugs - it's just a very strict interpretation of server status codes. Redirects are really powerful tools and should be handled with great care.

A 301 is for pages that are permanently moved to another location

A 302 is for pages that are temporarily moved to another location

So, if the purpose is load balancing, a 302 would be the right one to use, given these two alternatives. I'd strongly suggest, though, that you work on sending status codes in the 200 range (page found, etc.) instead. This can be done, e.g., by using an internal redirect instead of an external one (just omit the [R] flag in the rewrite rule).

There's additional info here: Engelschall's Apache URL rewrite guide [engelschall.com] (the guy who invented mod_rewrite)
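A sketch of that difference in mod_rewrite terms (filenames are hypothetical): with the [R] flag Apache sends the client an external 3xx redirect, while without it the rewrite happens internally and the client simply gets a 200 with the content.

```
RewriteEngine On

# External redirect: the client receives "302 Found" plus a
# Location header, and makes a second request for the new URL.
RewriteRule ^old-page\.html$ /new-page.html [R=302,L]

# Internal redirect (alternative - enable one or the other):
# Apache serves /new-page.html under the original URL, so the
# client sees a plain 200.
# RewriteRule ^old-page\.html$ /new-page.html [L]
```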



Also, if you have a redirect script for links, you should make sure the links return 302s (or something in the 200 range) - that is, if you want the links to transfer PR and count as backlinks for the page you forward to. The site search should produce quite a few threads about this.

The normal and intended use for 30X status codes is pages or websites that move from one URL/URI to another:

a) If you redirect "x.com" to "y.com" using a 301 redirect, Google will eventually merge the two domains in the SERPs so that only "y.com" remains. It will take at least a couple of weeks for these changes to propagate to the SERPs.

b) OTOH, if you use a 302 redirect, both domains will be kept alive in the SERPs with separate listings, but Google will bury "x.com" so deep in the SERPs that it will only show up when you search for it - "y.com" will still be treated as the proper domain (given that you have only a few incoming links to "x.com" relative to links to "y.com"; if not, "x.com" might be interpreted as the right domain instead of "y.com"). If you have no incoming links to "x.com", then only "y.com" will remain in the SERPs.

It's the same thing for pages.

Conventional wisdom suggests always using 301's (and Google does so on their webmaster pages as well), but this is not right, as those two codes do mean different things and should be used for different purposes. These are my personal rules-of-thumb for Google:

  1. A 301 is primarily for "one-to-one" relationships: That exact page is now found here (and it will remain that way) (Use: one specific page moved to one specific other location)
  2. A 302 is primarily for "many-to-one" relationships: These various pages are now found here (and that location might change at some point) (Use: everything that is not 1: links, vanity domains, merging of two pages into one, fusion of websites, etc.)
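As an illustration of those two rules of thumb (URLs are made up), in Apache's mod_alias syntax:

```
# Rule 1 -- 301, one-to-one: this exact page lives permanently
# at its new address.
Redirect permanent /articles/old-name.html /articles/new-name.html

# Rule 2 -- 302, many-to-one: a family of legacy URLs temporarily
# points at one current page, which may change again later.
RedirectMatch temp ^/archive/ /articles/index.html
```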

If used properly, neither of the two status codes should get you into duplicate trouble by themselves. They simply tell user agents that this or that page has moved. So what the user agent (e.g. googlebot) will see is not a copy of the same page at another location, but the exact same page, only transferred to another URL. It's not a copy, it's the real thing ("whatever it was that used to be here is now found there").

So, essentially, a 30X is a "placeholder" or "shortcut", not a real page. That's also the reason that Google can merge these kinds of URLs in the SERPs. If you use them right, you will avoid creating duplicates, but the .htaccess setup is not the only thing to consider - you have to take incoming links into account too.



Worst-case scenarios are:

301: You redirect more than one page to the same new URL. If these pages all have incoming links, Google will have a hard time figuring out what the real URL is, as your website will appear to have a split personality. Effects are described in this thread: redesigns, redirects, & google -- oh my! [webmasterworld.com] - or in any of the "missing index page" threads (imho).

302: You redirect one or more page(s) to the same new URL. If these pages all have incoming links, Google will not be able to merge them (it is a temporary redirect). You risk having it considered duplicate content even though it isn't. Most likely the page you are redirecting from will be buried deep down in the SERPs. Theoretically, if it had a lot of backlinks it could even outrank the real page, and the real page would be buried instead - but that's not realistic: as backlinks are "inherited" or transferred, this will not happen.


All of the above is of course FWIW, IMHO, AFAIK, etc. If it's right, it's also subject to change (as is everything that involves Google), and some or all of it might have been, be, or become totally wrong at some point in time.

I would not expect Google to post "official" comments on this, although I would really welcome it. These matters might have been open to abuse at some points in time, but I personally feel that the worst you can do with them at this moment is to harm yourself.

/claus