
Apache Web Server Forum

    
Redirecting a simpler URL to a real URL
GotToGetItRight




msg:4486729
 11:24 am on Aug 21, 2012 (gmt 0)

Hi, I want to be able to advertise (in print) a URL such as www.example.com/webpage, which would be redirected to the page www.example.com/webpage.html.

I've searched for suggestions and tried a good many of them, but with only limited success.

For example, I've put this code into my .htaccess file:

Redirect 301 page page.html

and it works fine in Firefox and Chrome, but it does nothing in Safari (the address bar shows "www.example.com/webpage" and I get a 404 error), and in Internet Explorer a trailing forward slash is always appended after the .html, so the address bar says "www.example.com/webpage.html/", which also gives an error.

I suppose that I could rename my webpage.html file to index.html and place it in a folder named webpage, but I don't want to go down that route.

Any suggestions would be most gratefully received!

 

MinosTheNinth




msg:4486734
 11:31 am on Aug 21, 2012 (gmt 0)

The right syntax for the .htaccess file should be

RewriteEngine On
RewriteRule ^webpage$ /webpage.html [L]

That will work as an internal redirect, so the user will still see example.com/webpage in the address bar.

You can also use [R=301,L] instead of [L], but that will change the address shown in the browser's address bar. The 301 status code means that the page has moved permanently to another location, which is why I suggest not using it.

GotToGetItRight




msg:4486735
 11:43 am on Aug 21, 2012 (gmt 0)

Thanks, but I'm still getting the same behaviour in the different browsers - works fine in Firefox & Chrome, appended / in IE (error), no change in Safari (error).

Could anything else be getting in the way?

BTW, the

RewriteEngine On

line is already there at the top of my file, and there's other stuff before the

RewriteRule ^webpage$ /webpage.html [L]

line - could this matter?

MinosTheNinth




msg:4486746
 12:16 pm on Aug 21, 2012 (gmt 0)

That behaviour is strange to me, because mod_rewrite is a server-side thing, so it should work the same way in any browser.

Are you using Apache on a *nix or a Windows-based machine?

Can you please post all the rules that come before the rule we are discussing here?

GotToGetItRight




msg:4486773
 1:50 pm on Aug 21, 2012 (gmt 0)

1. It's a Linux webserver

2. Here's my .htaccess code (and in case it's relevant, the redirect from example.com to www.example.com isn't working either!):

RewriteEngine On

# Redirect from example.com to www.example.com
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

# Enforce a custom 404 page file
Order deny,allow
ErrorDocument 404 /404.html

# Force a redirect of "example.com/webpage" to "www.example.com/webpage.html"
RewriteRule ^webpage$ /webpage.html [L]

[edited by: incrediBILL at 10:10 pm (utc) on Aug 21, 2012]
[edit reason] fixed code display issue [/edit]

MinosTheNinth




msg:4486836
 2:54 pm on Aug 21, 2012 (gmt 0)

I also have a Linux webserver.

I set up a localhost domain for testing purposes.

The redirection from the non-www to the www variant needed a change, because it ended up in an infinite loop for me. Here is what works for me:
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

Now everything works for me (including the custom 404 page).

I'm afraid that I'm out of ideas about what could possibly be wrong.

Is the custom 404 page working for you? Maybe data from the RewriteLog can help us solve this problem.

GotToGetItRight




msg:4486873
 3:58 pm on Aug 21, 2012 (gmt 0)

Thanks very much for your suggestions. I've changed my non-www to www redirection code to what you said, but it's still not working.

My custom 404 page redirect is working ok (it has been all along), but I'm a bit confused about what can be wrong with the www redirect and the /webpage redirect. I'll try it all again on another PC and report back, in case something odd is happening at this end.

You mentioned data from RewriteLog. This is new to me, but I've done some research on it. Is it right that adding a line into my .htaccess file like:

RewriteLog "rewrite.log"

would write a text file with all the redirect activity? This might be useful, as you say.

phranque




msg:4487018
 10:05 pm on Aug 21, 2012 (gmt 0)

# Force a redirect of "example.com/webpage" to "www.example.com/webpage.html"
RewriteRule ^webpage$ /webpage.html [L]

that's not a redirect that's an internal rewrite.

RewriteCond %{HTTP_HOST} ^example\.com [NC]

your test should look more like "if it's not exactly the canonical hostname" instead of "if it begins with example.com".
for example, if your configuration allowed wildcard subdomains then a request for example.com.example.com would not get redirected.
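One common way to express the "not exactly the canonical hostname" test is a negated pattern. This is only a sketch: it assumes www.example.com is the canonical hostname, that the site is served over plain http, and that the rules live in the .htaccess file at the site root.

```apache
RewriteEngine On

# Redirect any request whose Host header is not exactly www.example.com.
# The "?" after the group lets an empty Host header (old HTTP/1.0 clients)
# pass through without being redirected.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Because the test is "anything other than the canonical name", it catches example.com, stray subdomains, and any other hostname that reaches this site, and sends all of them to the same target.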

is it right that adding a line into my .htaccess file like:
RewriteLog "rewrite.log"

the RewriteLog is only legal in server config and virtual host contexts.
more importantly this directive went away with apache 2.4 so if you are on the latest version of apache you will need to use the LogLevel directive instead.
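For reference, a rough sketch of what rewrite logging looks like on either side of that change. These lines belong in the server config or a virtual host, not in .htaccess, so on shared hosting you may not be able to use them at all. The log path is a placeholder and the trace level is just a reasonable starting point.

```apache
# Apache 2.4 and later: mod_rewrite writes to the main error log,
# and the verbosity is set per module with LogLevel.
LogLevel alert rewrite:trace3

# Apache 2.2 and earlier used dedicated directives instead.
# Uncomment only on those versions; the path is a placeholder.
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3
```

Turn the logging back down once the problem is found; the higher trace levels slow the server noticeably.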

lucy24




msg:4487033
 11:07 pm on Aug 21, 2012 (gmt 0)

Before anything else: Have you ever used an htaccess file with this host before? Check the fine print and make sure you're allowed to redirect. (If not, change hosts ;)) It ought to be possible in your case, because a redirect is in the same override category as an ErrorDocument directive, and you said those work. But you never know what kinds of hanky-panky a host might get up to.

#1 If you use mod_rewrite for anything, use it for everything. That is, ahem, all redirecting: do not use mod_alias (Redirect by that name). Unless it is your own server and your own config file and you know what you're doing.

#2 Within the mod_rewrite section of your htaccess go from most severe to least severe (F, G, redirect, rewrite in that order). Within each of those groups, go from most specific to least specific.

This means that the mopping-up redirect from with/without www to your preferred form should be the very last redirect. Otherwise you risk redirecting some requests twice. Do as I say. Not as I do. I cop out and let the host do it.

#3 Has everyone already explained that when you say "redirect" in the post title, you really mean "rewrite"? At least, that had better be what you mean:
I want to be able to advertise (in print) a URL such as www.example.com/webpage, which would be redirected to the page www.example.com/webpage.html

advertise and link to the URL you want people to see. Do not not not send people to an URL that you already know will be redirected. If the real content lives somewhere else, REWRITE to that location. Just to confuse you, Apache may call this an "internal redirect". Ignore them. It's a rewrite.

Extensionless URLs are the Fashion of the Hour. Do a Forums search for "rewrite" + "redirect" + "boilerplate" and you will find several specimens of a slab of text that explains the whole thing. (I don't repost it every time, because it's boring for Forums regulars to read it over and over again.)

g1smd




msg:4487035
 11:13 pm on Aug 21, 2012 (gmt 0)

The redirects for individual pages must be listed before the non-www/www redirect otherwise you introduce an unwanted multiple step redirection chain for some requests.
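A minimal sketch of that ordering for the .htaccess file discussed in this thread, assuming www.example.com is the canonical hostname, plain http, and that the vanity URL is meant to be a redirect:

```apache
RewriteEngine On

# 1. Page-specific redirect first. The target names the canonical
#    hostname in full, so a request for example.com/webpage reaches
#    www.example.com/webpage.html in a single hop.
RewriteRule ^webpage$ http://www.example.com/webpage.html [R=301,L]

# 2. General hostname redirect last among the redirects.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# ErrorDocument is not a mod_rewrite directive, so its position
# relative to the rules above does not matter.
ErrorDocument 404 /404.html
```

With the two rules the other way round, a request for example.com/webpage would first be redirected to www.example.com/webpage and only then to www.example.com/webpage.html, which is the two-step chain to avoid.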


It's unclear whether you want a redirect or a rewrite for your vanity URL requests. Several people have suggested using a rewrite. Unless the new URL is going to be the only URL that will directly access the content and immediately see it with status 200 OK, using a rewrite introduces a duplicate content problem. If you want requests for the vanity URL to be redirected to the real URL and see that fact reflected in the browser address bar, then (unusually) a redirect is the way to go.

I often program a set of redirects for common URLs that people might guess, redirecting the request to the real page. This includes stuff like /about /contact /location /map /sale /offers and various keywords. These are common "hackable URLs".

One example of this type of redirect is the one for bt.com/phonebook - is that sort of what you want?

GotToGetItRight




msg:4487127
 8:48 am on Aug 22, 2012 (gmt 0)

Lots to think about here, thanks all for your comments.

lucy24, thanks for making the point about the importance of the order that I put things, I'll sort that out. Also, I'm a bit confused about the exact meanings and differences between redirects and rewrites - I'll do some research on this. I've read your "boilerplate" text, but I think what I'm looking for is much simpler than the scenario you're addressing there. To explain:

I want to be able to advertise a single page (whose real URL is example.com/webpage.html) as www.example.com/webpage, simply dropping the extension. I'm quite happy for the address bar to show the proper, full URL to the visitor when they get there.

g1smd, the bt.com/phonebook example you give is also more complex than I need.

I know the obvious thing is to advertise the URL including the extension, and I could do that, but
a) it doesn't look so good, and
b) this is a really good learning process for me, thanks to all for your help!

g1smd




msg:4487134
 9:04 am on Aug 22, 2012 (gmt 0)

The long term goal must be to move the entire site to extensionless URLs. In the meantime, adding redirects from short URLs is OK, but don't go overboard with a large number of these.

I'm quite happy for the address bar to show the proper, full URL to the visitor when they get there.

A redirect is what you want then.

the bt.com/phonebook example you give is also more complex than I need.

In what way? You ask for the URL bt.com/phonebook and are redirected to the real page with a much longer URL that you don't need to remember because you have a "short" URL that redirects to it.




The difference between rewrites and redirects is simple.

URLs are used "out there" on the web. Files are used "here" inside the server.

You have a file at /page.html

Ordinarily you would use the URL example.com/page.html to access it.

With a rewrite you can link to href="/page" and when example.com/page is requested, the request is rewritten to fetch the content from the file at /page.html without the browser address bar changing.

You can now access the page at two different URLs, both of which return "200 OK" status.

With a redirect, you ask for example.com/page and the browser is told to make a new request for a different URL. The browser address bar changes and the URL example.com/page.html is requested. The content of the file is then served with "200" OK status.

With the redirect there is no duplicate content because only one URL returns the content with "200 OK" status. The other URL returns "301 Moved".

So, a redirect is a URL to URL translation and the browser address bar changes; and a rewrite is a URL to internal filepath translation where the address bar shows the originally requested URL.


The confusion comes because RewriteRule can be configured to deliver a redirect or can implement a rewrite. The code syntax differences between the two functions are very small but VERY important.
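To make that syntax difference concrete, here is a sketch of both forms for the /page example, assuming the rules sit in the root .htaccess and www.example.com is the canonical hostname. You would use one or the other for a given URL, so the second is commented out.

```apache
RewriteEngine On

# Rewrite: the target is a local path and there is no R flag.
# The address bar keeps showing /page and the response is 200 OK.
RewriteRule ^page$ /page.html [L]

# Redirect: the target is a full URL and the R=301 flag is present.
# The browser is sent a 301, then makes a second request for /page.html.
# RewriteRule ^page$ http://www.example.com/page.html [R=301,L]
```

The pattern on the left is identical in both cases; only the target and the flags decide which of the two behaviours you get.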

GotToGetItRight




msg:4487136
 9:16 am on Aug 22, 2012 (gmt 0)

Thanks g1smd. I've probably misunderstood what you meant, but by saying the bt example was probably more complex than I needed, I meant that when I enter bt.com/phonebook, I get www.thephonebook.bt.com/publisha.content/en/index.publisha which seemed to me to be quite a lot different, while all I needed was to drop the extension - I understand that the process may be the same though, so thanks for pointing that out.

Your explanation of rewrite and redirect is interesting... but confusing (I'm new to this!). So in my situation, I'm looking to do a redirect? Have I understood you properly?

g1smd




msg:4487142
 10:44 am on Aug 22, 2012 (gmt 0)

The important point about the example was that having requested URL "A" the address bar changed and you were redirected to URL "B", irrespective of what the URL "looked" like.

This is going to sound odd, but try reading the redirect/rewrite explanation out loud.

I surmise you need a redirect in your situation, BUT you should publish links to the "memorable URL" only on other sites. You should not link to it from within your own site.

lucy24




msg:4487163
 11:34 am on Aug 22, 2012 (gmt 0)

Gulp. Do youse Brits realize that some of us had no idea the phonebook thing was anything other than a made-up example?

You have a file at /page.html

Ordinarily you would use the URL example.com/page.html to access it.

With a rewrite you can link to href="/page" and when example.com/page is requested, the request is rewritten to fetch the content from the file at /page.html without the browser address bar changing.

You can now access the page at two different URLs, both of which return "200 OK" status.

... and that's why you need the "redirect" half of the two-step. If a user has the brazen audacity to ask for "page.html", you forcibly redirect 'em to the bare "page" before rewriting to bring up the contents of the selfsame "page.html" that they asked for in the first place :) Same as when someone asks for /index.html and is unceremoniously sent over to plain / --except that then you don't have to spell out the Rewrite part.
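A sketch of that two-step for a single page, assuming a root .htaccess, plain http, and www.example.com as the canonical hostname. The condition on THE_REQUEST is the important part: it looks at the request line the browser originally sent, so the redirect does not fire a second time after the rewrite has quietly swapped in page.html.

```apache
RewriteEngine On

# Step 1: a client that asks for /page.html directly is redirected
# to the extensionless URL. THE_REQUEST holds the original request
# line, e.g. "GET /page.html HTTP/1.1".
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/page\.html[?\s]
RewriteRule ^page\.html$ http://www.example.com/page [R=301,L]

# Step 2: a request for /page is served from page.html internally,
# with no change in the address bar.
RewriteRule ^page$ /page.html [L]
```

Without the condition, step 2's rewritten path would match step 1 on the next pass through the file and the two rules would chase each other in a loop.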

try reading the redirect/rewrite explanation out loud

Also useful if you've tried everything and your children simply refuse to go to sleep.

g1smd




msg:4487209
 12:46 pm on Aug 22, 2012 (gmt 0)

We don't talk much about "hackable URLs" here, but it's something I have done for years. Pick a range of URLs that users might guess and type in when attempting to find particular pages and set redirects to the right place for those requests. You'd be surprised how often they are actually used even though there are no external clues that this has been set up.
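As a rough illustration, a handful of such redirects might look like this. Every target filename below is made up for the example, and the sketch assumes there are no real directories called about, contact or map on the site.

```apache
RewriteEngine On

# Guessable short URLs, each redirected to the real page.
# The optional trailing slash catches both /about and /about/.
RewriteRule ^about/?$   http://www.example.com/about-us.html       [R=301,L]
RewriteRule ^contact/?$ http://www.example.com/contact-us.html     [R=301,L]
RewriteRule ^map/?$     http://www.example.com/how-to-find-us.html [R=301,L]
```

These go with the other page-specific redirects, ahead of the general hostname redirect.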

GotToGetItRight




msg:4487224
 1:44 pm on Aug 22, 2012 (gmt 0)

Great replies, and very entertaining, thank you!

g1smd, you're right, reading it aloud did help, although I'm not sure what my work colleagues thought. Anyway, once I've cracked this issue, I'll think a bit more about the "hackable URLs" thing that you mentioned; I can see that it could be useful.

lucy24, thanks for making me laugh about the phonebook, also for pointing out how I could use your boilerplate text - perhaps it just looked a bit too daunting for me earlier, but I'll dive in now!

lucy24




msg:4487345
 8:47 pm on Aug 22, 2012 (gmt 0)

We don't talk much about "hackable URLs" here, but it's something I have done for years. Pick a range of URLs that users might guess and type in when attempting to find particular pages and set redirects to the right place for those requests.

Heh. I do believe I found several directories in {never mind what site} that way. All auto-indexed, too. And, conversely, that's why I always make sure to have a "webmaster@" e-mail address. Someone might try it.

phranque




msg:4487357
 9:08 pm on Aug 22, 2012 (gmt 0)

that's why I always make sure to have a "webmaster@" e-mail address. Someone might try it.


getting off topic, but...

MAILBOX NAMES FOR COMMON SERVICES, ROLES AND FUNCTIONS:
http://www.ietf.org/rfc/rfc2142.txt
