
Apache Web Server Forum

    
Get the referer URL from a 301 redirect
glimbeek



 
Msg#: 4151172 posted 7:51 am on Jun 7, 2010 (gmt 0)

I have a website:

http://www.example.com with several language versions, namely:
example.com/us/
example.com/uk/
example.com/de/
example.com/fr/
etc...

Based on your IP, a GeoIP script points you to the right language version. The exception is visitors from EU countries other than the UK: for those, the language is chosen from the browser language instead.

Recently we "removed" /us/, and using .htaccess we 301 it to http://www.example.com/, which works fine.
But after the 301, the GeoIP script still checks your browser language (because the script hasn't redirected you yet). If, for instance, you use a browser with the language set to German and you browse to http://www.example.com/us/, the following happens:
http://www.example.com/us/
301 to http://www.example.com/
302 to http://www.example.com/de/ (based upon the browser language).

Obviously this isn't something you want so I'm trying to "fix" it.

However, I'm lost on how to fix it. I looked into using HTTP_REFERER, but that isn't to be trusted (according to the PHP manual), and it doesn't seem to work in this instance: the HTTP_REFERER stays empty.

Two questions:

1) What's the cleanest rewrite to rewrite /us/?
RewriteRule ^us$ [NC,OR]
RewriteRule ^us/$ [NC,R=301,L]

Does that come close? Or is there a better way?

2) Does anybody know a way to check where the visitor came from (the 301 redirect) so I can make sure my script doesn't 302 the visitor after the 301?

Thanks in advance!

With kind regards,
George

[edited by: dreamcatcher at 12:54 pm (utc) on Jun 7, 2010]
[edit reason] Fixed typo with example.com [/edit]

 

glimbeek



 
Msg#: 4151172 posted 8:42 am on Jun 11, 2010 (gmt 0)

Maybe I posted this in the wrong forum? Would the Apache section have been a better place?

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4151172 posted 2:58 pm on Jun 11, 2010 (gmt 0)

A redirect terminates the current HTTP request, and asks (not tells) the client to re-request what it requested the first time, but using a new URL provided in the server's redirect response.

Therefore, the server has no 'memory' of the first HTTP request (to which it responded with the 301) when the second HTTP request arrives as a result of that 301.

So, you need something (an indicator) that you can "pass through" the redirect to signal that the second request is the result of a previous 301 redirect and should not be 302-redirected. This would require either appending a query string to the 301 target URL or setting a client-side cookie which would be returned to the server with the second request and could be tested.

An alternative is to drop auto-language selection, which is always problematic (think Switzerland, Belgium, Spain, Canada, etc.), and simply let the user choose his/her language and set a cookie to remember it.

GeoIP doesn't work well for multi-language countries or for travelers or for expatriates. Browser language doesn't work with shared computers in multi-language countries or in countries with high rates of tourism (think internet cafes). So, the best way to pick the display language for international sites is to let the user tell you. The "row of little flags across the top" method seems to be the least offensive and takes the least space.

A compromise might be to auto-select the language only if GeoIP and the browser Accept-Language settings are both present and in total agreement -- GeoIP indicates single-language country, only one language preference is present in the HTTP Accept-Language header, and both agree. If not, show the flags and ask.

Expanding on that just a little: If GeoIP says "Spain" and you see only "es" or "es-ES" in the Accept-Language header, then you can be reasonably confident that the Spanish content should be shown. And even if the Accept-Language header shows "es-ES,es-PT" or "es-ES;q=0.3,es-PT;q=0.1", you can still be reasonably confident. But if it shows "es-ES,ca-ES", indicating that some users speak Catalan, then you cannot really be sure and you should show the flags.
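The "consistent indicators" test for the Spanish case could be roughed out in mod_rewrite along the following lines. This is only a sketch under stated assumptions, not a tested configuration: it assumes a module such as mod_geoip is loaded and exposes the visitor's country in the GEOIP_COUNTRY_CODE environment variable, and it assumes a hypothetical /es/ section on example.com. The second condition passes only when every entry in Accept-Language is "es" or "es-XX" (with or without a q-value), so "es-ES,ca-ES" falls through and the visitor gets the flags.

```apache
RewriteEngine On

# Assumption: mod_geoip (or similar) sets GEOIP_COUNTRY_CODE for each request.
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^ES$

# Every comma-separated Accept-Language entry must be es or es-XX,
# optionally followed by ;q=<weight>. Any other language makes this fail.
RewriteCond %{HTTP:Accept-Language} ^es(-[a-z]{2})?(;q=[0-9.]+)?(\s*,\s*es(-[a-z]{2})?(;q=[0-9.]+)?)*$ [NC]

# Both indicators agree: send the front page to the (hypothetical) /es/ section.
RewriteRule ^$ http://www.example.com/es/ [R=302,L]
```

If either condition fails, no redirect happens and the request is served normally, which is where the row of flags would come in.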

Even with this method, a "Change language" link should appear on every page of your site, preferably as a language-neutral graphic that strongly hints at language selection -- like the flags. :)

Perhaps you could expand and narrow the selection of flags based on the GeoIP and Accept-Language consistency, but again, that leaves out tourists and expatriates.

This is one of the many "Just because you can do something doesn't mean you should" things. As you can tell, I'm strongly against forcing language selection, simply because it is so utterly annoying (some might say insulting) when it is wrong. On the other hand, I concede that sites where the user must click through a huge global map entry page to get to the content are also annoying. That's why I prefer the flags...

The bottom line is that people, ISPs, and languages cross geopolitical boundaries, and an ISP location or a browser or computer language setting is not a person...

Your mileage may vary... Despite my dislike for automatic language-selection, perhaps the 'consistent indicators' suggestions will be useful to you. :)

Jim

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4151172 posted 3:40 pm on Jun 11, 2010 (gmt 0)

Additionally, where the language is auto-forced, Google is probably forced to read only US English pages, even when they want to index other stuff on the site.

In many cases, they will never be able to get to any of the content in other languages.

glimbeek



 
Msg#: 4151172 posted 6:31 am on Jun 14, 2010 (gmt 0)

Thanks for the replies!

We provide flags for every language we support on every page. So a user can always switch if need be.

However, on some English pages of our website we also show different content (banners and affiliate links) depending on where the visitor comes from: one set for people from the US, one for the UK, and an international set for everyone else.

"This would require either appending a query string to the 301 target URL or setting a client-side cookie which would be returned to the server with the second request and could be tested. "

How can I achieve this with .htaccess?
Can the first option be achieved simply by creating a redirect that points to a URL with a query string? That would mean we redirect someone to the front page of our website at a URL with a query string appended (example.com/?source=us), which shows the same content as example.com. Doesn't that create duplicate content?

How can I achieve the second option using the .htaccess file?

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 5+ Year Member, Top Contributors Of The Month



 
Msg#: 4151172 posted 6:41 am on Jun 14, 2010 (gmt 0)

"query string to the 301 target URL"


That's problematic because if you're running AdSense or Google Analytics on the page, the query string makes it a whole new URL.

These 2 are not the same to the search engine, 2 different URLs:

http://example.com/index.html
http://example.com/index.html?q=referer

So using AdSense would cause the google-mediapartners bot to crawl the new URL with the query string and then you'll end up with a huge canonicalization problem on your hands.

You can attempt to fix it with the canonical tag, but not all SEs honor it.

To make sure AdSense doesn't crawl the page, your script would have to immediately redirect (again) to the page without the query string to avoid confusion in the SE but now you have 2 redirects - ICK.

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4151172 posted 12:53 am on Jun 15, 2010 (gmt 0)

Set the cookie in the script that generates the first page.

Check the cookie using a RewriteCond %{HTTP_COOKIE} !^pattern-to-match-cookie-here$

That is a negative-match pattern (with the leading "!"), so this will prevent the redirect if the cookie is set.
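As a minimal sketch of such a check, assuming the language redirect itself is done in mod_rewrite and using a hypothetical cookie named already_redirected with the value yes: note that %{HTTP_COOKIE} holds the entire Cookie request header, so a pattern anchored with ^ and $ only matches when that is the only cookie the browser sends. The pattern below instead looks for the cookie at the start of the header or after a ";" separator.

```apache
RewriteEngine On

# Skip the rule when the (hypothetical) marker cookie is present.
# The leading "!" negates the match: the condition is true only if
# "already_redirected=yes" is NOT found anywhere in the Cookie header.
RewriteCond %{HTTP_COOKIE} !(^|;\s*)already_redirected=yes(;|$)

# Illustrative language test: browser's first preference is German.
RewriteCond %{HTTP:Accept-Language} ^de [NC]

# Only requests for the front page are redirected.
RewriteRule ^$ http://www.example.com/de/ [R=302,L]
```

If the language redirect stays in a server-side script rather than in .htaccess, the same test would be made there against the cookie value before issuing the 302.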

Jim

glimbeek



 
Msg#: 4151172 posted 6:19 am on Jun 15, 2010 (gmt 0)

jdMorgan,

"Set the cookie in the script that generates the first page."

But doesn't the ^us/ rewrite happen first because of the .htaccess even before the first page is accessed?

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4151172 posted 1:05 am on Jun 16, 2010 (gmt 0)

Whatever script runs first must set the cookie. Whether that is the redirected-to script or not is up to you. This problem cannot be solved all-at-once with two lines of code, and may require you to adjust your scripts.

The proposed 'meaning' of this cookie is "Yes, we have already redirected. Don't do it again."

Jim

glimbeek



 
Msg#: 4151172 posted 6:45 am on Jun 16, 2010 (gmt 0)

I think I misunderstood you. I thought that by "script" you meant the PHP script. Do I understand correctly from your reply that I can also use the .htaccess file to set a cookie?

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4151172 posted 12:32 pm on Jun 16, 2010 (gmt 0)

I did mean your script -- Whatever does the redirect or rewrite should test the cookie, then if it was not set, set it and do the redirect.

I am avoiding discussing 'implementation' because we're not done talking about the design specifications here... You can put most of this logic in your script or put most of it in mod_rewrite code, or something in-between.

You can set cookies in mod_rewrite on Apache 2.x, but not on Apache 1.x. See the [CO=] flag for mod_rewrite on Apache 2.x. Although Apache 1.3 seems to be 'fading fast' now, consider backwards compatibility in your planning.

Jim

glimbeek



 
Msg#: 4151172 posted 12:49 pm on Jun 16, 2010 (gmt 0)

Using the .htaccess file to set the cookie, so I can check if the cookie is set with PHP sounds like a plan. Depending on the state of the cookie I then run the rest of my script or not.

From [httpd.apache.org...] I got the following example.
RewriteRule ^/index.html - [CO=frontdoor:yes:.apache.org:1440:/]

Reading through the explanation I'm a bit lost as to what it all means...

Accessing index.html will set the cookie?

frontdoor - Is the name of the cookie?
yes - What does this do?
.apache.org - Is the domain for which the cookie is valid? If so, what does that mean?
1440 - Is the cookie time? Can I leave it blank? What do you recommend?

In my situation should I use something like this?
RewriteRule ^us/$ - [CO=usfrontpage:yes:.example.com:1440:/]

George
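For reference, the fields of the [CO=] flag can be read as annotated below, going by the Apache 2.x mod_rewrite documentation. The rule is a sketch, not a tested configuration: it keeps the cookie name proposed above, assumes Apache 2.x, and assumes a one-minute lifetime is enough for the marker to survive the redirect. With "-" as the substitution a rule sets the cookie but leaves the URL alone, so /us/ would never be redirected; giving a target together with R=301 sends the cookie on the redirect response itself.

```apache
RewriteEngine On

# CO=name:value:domain:lifetime:path
#   usfrontpage   cookie name
#   yes           cookie value; any marker string a script can test for
#   .example.com  domain the browser will send the cookie back to
#                 (the leading dot also covers www.example.com)
#   1             lifetime in minutes (assumption; 1440 would be one day)
#   /             path; "/" sends the cookie with every URL on the site
#
# Matches /us and /us/ in any letter case, redirects to the front page,
# and sets the marker cookie in the same 301 response.
RewriteRule ^us/?$ http://www.example.com/ [NC,R=301,L,CO=usfrontpage:yes:.example.com:1:/]
```

The PHP language script could then look for this cookie (for example in $_COOKIE['usfrontpage']) and skip its 302 when the cookie is present.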
