
Apache Web Server Forum

    
Considering taking the extensionless plunge
ChanandlerBong




msg:4465805
 9:15 am on Jun 15, 2012 (gmt 0)

A short history. My site has been online since 2001. For a year, it used the .html extension. Then I discovered the sheer glee of includes and it's been .shtml ever since. I have long wanted to switch to .php, but have avoided it because I suspect that in another few years I'll be doing the same thing again, and that "there has to be a better way".

So I'm now thinking of going extensionless. Would the following be more or less what I have to do:

1. change all extensions to .php because that's the technology I want to use.

2. change all internal links (on my menu, for example) to something like example.com/faq, without extensions

3. add mod_rewrite rules to my .htaccess file to redirect all requests for faq.html, faq.shtml and faq.php to /faq

A couple of noob questions about this:

1. with extensionless URLs, does the browser know what content is being served up? There won't be any problems with file types?

2. If I 301 redirect, will there be any search engine fallout? The site is performing strongly in Google/Bing and I am terrified of throwing a spanner into all of that.

I've found this thread: [webmasterworld.com...]

which seems a good technical place to start. I just wanted a few of these questions cleared up first.

many thanks.

 

lucy24




msg:4465819
 10:03 am on Jun 15, 2012 (gmt 0)

You've missed one piece.

Going extensionless is functionally no different from letting a short pretty URL cover up a long messy "real" path. First you decide what you want the user to see in their address bar. That's your redirect-- only necessary for people following outdated bookmarks or links. The important part is the rewrite, where you secretly serve up content from wherever it really lives, under whatever name it really has.

Do a Forums search for "rewrite redirect boilerplate". Or just eyeball the recent threads within Apache; it's been posted within the last few days.

The browser doesn't care about the filename. Well, maybe MSIE 2 did. But as long as the browser can open up the page and find html inside, it will be happy.

Oh, and Psst! You didn't have to change to .shtml for your SSIs. There are two other ways; I currently use both. But too late now, I guess.
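
For illustration only: the post does not say which two ways are meant. The two approaches documented for Apache's mod_include are sketched below, on the assumption that the host allows these directives in .htaccess (the original poster's host apparently does not).

```apache
# Assumption: these are the two documented mod_include approaches,
# not necessarily the two the poster has in mind. Pick one, not both.

# Approach 1: parse every .html file for SSI directives
Options +Includes
AddType text/html .html
AddOutputFilter INCLUDES .html

# Approach 2: parse only files whose execute bit is set (chmod +x page.html)
# Options +Includes
# XBitHack on
```

The second approach avoids parsing every page, which speaks to the usual "burden on the server" objection, but it depends on file permissions being kept correct.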

enigma1




msg:4465829
 11:00 am on Jun 15, 2012 (gmt 0)

The technology you use is not related to whether your URLs have extensions or not. A web application can dynamically generate any type of URL with any extension you want.

And theoretically you don't even need a server level rewrite to process the requests of friendly urls. The application can handle those too and be independent of environment.

ChanandlerBong




msg:4465843
 11:44 am on Jun 15, 2012 (gmt 0)

Thanks. I cannot parse straight html files for SSI or php, my host has always been strict on that (saying it causes burden on a server), so it's a non-starter if I want to stay with the same host, which I do, as they're generally fantastic in everything else. As far as I know, that means I have to use .php extensions if I plan on using php.

So 98% of my pages are currently .shtml. There are a few .doc and .txt files too, and some PDFs, so clearly I don't want a rewrite that says "change everything to php" but rather "if .shtml is requested, serve the file with .php extension". If I go completely extensionless, will that cause problems for txt, pdf, etc.? Also, what about image files? Do I have to remove their extensions in internal links too?

enigma1




msg:4465860
 1:32 pm on Jun 15, 2012 (gmt 0)

Dynamic friendly urls do not represent physical file locations. It's just how you want the links to be exposed. It's independent of hosting.

lucy24




msg:4466035
 9:33 pm on Jun 15, 2012 (gmt 0)

There will be three kinds of URL:

blahblah/ with trailing slash = directory. Your server already rewrites this to serve content from your specified index file, so this doesn't change. Unless, ahem, you've got explicit links to blahblah/index.php or similar: those should be redirected in any case.

blahblah/name.jpg with extension = subsidiary file such as image, style sheet, script etc. Also files that the user never sees, like SSIs or code for auto-indexing.

blahblah/name without extension = page. This is the one that requires a rewrite.

Any redirect of .php or .shtml will include a look at %{THE_REQUEST}, the request line exactly as the client sent it. In your case it does two jobs: it avoids infinite redirects if mod_rewrite is already looking at the result of an internal rewrite, and it bypasses files the User Agent didn't explicitly ask for, such as SSIs.

Note "User Agent": it's not always the same thing as User. Look in your logs and you will see that a redirect (301 or 302) comes in as a fresh request even though the human didn't have to re-type the name; their browser did it for them. But an SSI or an auto-generated index isn't logged as a request.
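
A minimal sketch of a redirect guarded by %{THE_REQUEST}, using the explicit index-file case mentioned above. It assumes the rules live in the .htaccess file in the document root, that the index files are named index.php, index.html or index.shtml, and that the site answers at http://example.com (scheme and hostname are placeholders).

```apache
RewriteEngine On

# THE_REQUEST is the original request line, e.g. "GET /blahblah/index.php HTTP/1.1".
# It is not altered by internal rewrites or by the server's own index-file lookup,
# so the condition only matches when the client itself asked for the index file
# and the redirect cannot loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/([^/\s?]+/)*index\.(php|s?html)[?\s]
RewriteRule ^(([^/]+/)*)index\.(php|s?html)$ http://example.com/$1 [R=301,L]
```

With this in place a request for /blahblah/index.php is sent to /blahblah/, while a request for /blahblah/ that the server resolves internally to index.php is left alone.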

The exact wording of your redirects and rewrites will depend on the files. You are probably looking at some combination of all-around matching like

^easydirectory/([^/.]+/)

and file-specific matching like

(thisfile|thatfile|totherfile)
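
As a rough illustration, the two styles could be used in internal rewrites like the ones below. The first pattern is a variant of the fragment above, adapted (an assumption, not the poster's exact rule) to match an extensionless page name inside one directory; the .php target follows the original poster's plan, and the rules are assumed to sit in the document-root .htaccess.

```apache
RewriteEngine On

# All-around matching: any extensionless name directly inside easydirectory/.
# [^/.]+ means "one or more characters that are neither a slash nor a dot",
# so names with an extension never match and the rule cannot re-trigger itself.
RewriteRule ^easydirectory/([^/.]+)$ /easydirectory/$1.php [L]

# File-specific matching: only the three named pages in the site root.
RewriteRule ^(thisfile|thatfile|totherfile)$ /$1.php [L]
```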

Redirects come before Rewrites. Use mod_rewrite for everything.

g1smd




msg:4466040
 9:48 pm on Jun 15, 2012 (gmt 0)

Link to extensionless URLs from your pages.

Redirect requests for the old URLs with extensions to the new extensionless URLs. A redirect is a URL to URL translation.

Rewrite requests for extensionless URLs to the internal filepath that will serve the content. A rewrite is a URL to filepath translation.

Use a RewriteRule for the redirect and another RewriteRule for the rewrite. The syntax is only slightly different for these two different functions.

List all redirects before the rewrites start.

While pages might have extensionless URLs the files on the hard drive do still need to have an extension.
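
Put together, the steps above might look something like this sketch. It assumes a document-root .htaccess, .php files on disk, and http://example.com as a placeholder for the canonical scheme and hostname. Index files would need their own redirect ahead of the first rule; that is not shown here.

```apache
RewriteEngine On

# --- Redirects first: old URL with extension -> extensionless URL (URL to URL) ---
# The THE_REQUEST check restricts this to what the client actually asked for,
# so the internal rewrite further down cannot trigger it.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/([^/\s?]+/)*[^/.\s?]+\.(php|s?html)[?\s]
RewriteRule ^(([^/]+/)*[^/.]+)\.(php|s?html)$ http://example.com/$1 [R=301,L]

# --- Rewrites after: extensionless URL -> file that serves it (URL to filepath) ---
# Only rewrite when the matching .php file really exists on disk.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(([^/]+/)*[^/.]+)$ /$1.php [L]
```

The only syntactic difference between the two rules is the target and the flags: the redirect names a full URL and carries R=301, while the rewrite names an internal path and carries only L.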

incrediBILL




msg:4466089
 11:26 pm on Jun 15, 2012 (gmt 0)

I cannot parse straight html files for SSI or php, my host has always been strict on that (saying it causes burden on a server), so it's a non-starter


The non-starter is your host.

Find someone that isn't hostile to their customers' needs, because their reasons are a crock for a real server.

g1smd




msg:4466094
 11:45 pm on Jun 15, 2012 (gmt 0)

If I go completely extensionless, will that cause problems for txt, pdf, etc.? Also, what about image files? Do I have to remove their extensions in internal links too?

Only HTML pages are accessed with extensionless URLs. URLs for image files and for css and js files still have extensions. All files inside the server will still also have extensions.
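
One hedged way to make that separation explicit is a pass-through rule placed ahead of any other rules, so requests for files that keep their extensions are never examined further. The extension list here is an assumption based on the file types mentioned in this thread; adjust it to the site.

```apache
RewriteEngine On

# Leave images, stylesheets, scripts and documents alone.
# "-" means "no substitution"; [L] stops rule processing for these requests,
# and [NC] makes the extension match case-insensitive.
RewriteRule \.(gif|jpe?g|png|css|js|pdf|txt|doc)$ - [NC,L]
```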

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved