
Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
mod rewrite basics shared hosting
maniac

5+ Year Member



 
Msg#: 3587561 posted 10:22 pm on Feb 28, 2008 (gmt 0)

I've been deploying a technique for using search-engine-friendly URLs where a 404 error page handles requests and includes the appropriate files. However, I've been finding that some people (seemingly at random) get sent 404 errors. I'm guessing this might be to do with PHP sending 404 headers before sending the page content.

I've also learned that Google is not indexing all of the pages. My URLs look like:
domain.com/products/My+Product+Name

I don't know if it's something to do with the URL-encoding of spaces.

At any rate, I'm wondering if I actually need to use genuine mod_rewrite rules, but I can't find where to start. I am on shared hosting, so I cannot edit the php.ini or httpd.conf files. Is this still possible?

Where do I start? Does anybody know some simple starter tutorials for mod_rewrite?

 

wilderness

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 3587561 posted 12:12 am on Feb 29, 2008 (gmt 0)

At the top of this page, just above APACHE WEB SERVER, is a little word that says LIBRARY.

Two of the links at that page:
Beginning Mod Rewrite
[webmasterworld.com...]

Mod_Rewrite & Regular Expression
[webmasterworld.com...]

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3587561 posted 1:47 am on Feb 29, 2008 (gmt 0)

maniac,

Welcome to WebmasterWorld!

I also have to comment that
> I've been deploying a technique ... where a 404 error page handles requests and includes the appropriate files.
is one sure way to destroy your site's ranking in search engines.

This is a 1990s technique that was used before mod_rewrite was available on shared hosting (think GeoCities free hosting). But this was also in the days before search engines were important players in the Web world -- we had two popular ones, Lycos and AltaVista, as I recall. So it didn't much matter that, with a 404-Not Found response, you told everyone who requested any URL from your server that the URL didn't exist.

Things are a bit different now.

To duplicate the function of the old 404 page redirector method, you should use mod_rewrite [httpd.apache.org]. I will refer you to the resources that wilderness cited above for further study, but the basics of it would be:

Options +FollowSymLinks
RewriteEngine on
#
# If requested URL does not resolve to an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# and does not resolve to an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# then rewrite the request to the page-generation script.
RewriteRule .* /page-generator-script.php [L]

Where "page-generator-script.php" contains the same code as your current "404-redirector" page, and replaces it -- you can call it anything you like, as long as you change the name in this code and rename the script to match.

Pay very close attention to the HTTP server response codes [w3.org] you return in the new Web world of today. Each one has a very specific meaning, and if you want to play in search-engine land, you need to say exactly what you mean. Check your server response codes using the "Live HTTP Headers" add-on for Firefox/Mozilla browsers.

The second-most common error (after returning a 404-Not Found for pages that do exist) is returning a 302-Found when a 301-Moved Permanently more accurately describes the situation. If a 302 is returned, search engines will keep the old URL and assign the content of the redirected-to page to that old URL. This rather defeats the purpose of the redirect. If a 301 is returned, the search engines understand that the new URL replaces the old one, and that they should discard that old URL and use the new one from now on. All PageRank/link-popularity credit for the old URL should be assigned to the new URL. Don't expect this to work perfectly or instantly, but in general, this is how it should go.
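
As a minimal sketch of the 301-versus-302 point, assume a root .htaccess file and a page that has moved from /old-page.html to /new-page.html on the host www.example.com. All three names are hypothetical and do not come from this thread. In mod_rewrite a bare [R] flag sends a 302, so the 301 has to be requested explicitly:

```apache
RewriteEngine on
# Hypothetical URLs, for illustration only.
# [R] on its own would send 302 Found; R=301 sends 301 Moved Permanently.
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]
```

Checking the response headers afterwards should show which status code the server actually sent.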

After that comes the mistake of returning *any* redirection response, when what is really needed is an internal URL-path rewrite, as implemented by the code above.
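
To illustrate the distinction, here is a sketch that again assumes a root .htaccess file. The products/ prefix is borrowed from the URL example in the first post, and the script name is the same placeholder used in the code above; treat both as assumptions about the actual site. Because the substitution is a local path and there is no [R] flag, the rewrite happens entirely inside the server, and the client sees the script's output under the URL it originally requested:

```apache
RewriteEngine on
# Internal rewrite: local path as the target, no [R] flag.
# No redirect response is sent; the requested URL is unchanged for the client.
RewriteRule ^products/ /page-generator-script.php [L]
```

Adding [R=301] to the same rule would instead send the client to the script's own URL, which is the mistake described above.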

I would caution you not to copy any "good ideas" from the Web until you understand them completely, and have investigated their possible side-effects (such as telling search engines that none of the URLs on your site exist). This Web thing really isn't a game where "copy-and-paste" is good enough.

Jim

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved