Forum Moderators: phranque

mod rewrite leads to 404 response

         

tiger11

6:55 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



I've tried various fixes and venues to track down this problem and am hoping someone here can help. I have a database-driven site and, after many suggestions, have rewritten a test section to see if it
works. It doesn't.

Here's what I'm doing: I have a directory called /umb in which I have
the file index.php as well as an .htaccess file. The .htaccess file
reads:

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule ^(.*)$ index.php?loc=$1

Basically, this says that for any request to this directory, take the
part after umb/ and make it the variable loc. I then use PHP in index.php
like this:

// Parses the URL to help determine the requested content
$url = preg_replace('/[^[:alnum:]\-\/]/', '', $_GET['loc']);
// Get rid of any trailing /
if ($url[strlen($url)-1]=="/"){
$url = substr($url, 0, -1);
}
$expl = explode("/",$url);
$page_id = $expl[count($expl)-1];

To get the variable page_id and then check the database for matching items:

// DB query gets the scheme details and gets the 404 page if request is invalid
$resultUmbrella = mysql_query("SELECT * FROM tablename WHERE column='$page_id'", $dbConn);
$rowUmbrella = mysql_fetch_assoc($resultUmbrella);
// If a page was requested (not the index) and there is no result from the db
if ((!($page_id==""))&&(!($rowUmbrella["UmbrellaName"]))){
// Include the 404 page
include($include_path.'/404.htm');
exit;
}

However, http://example.com/umb/index.php returns 200,

while http://example.com/umb/ returns 404,

as does http://example.com/umb/FLTA, which is an example of an actual page.

These pages all display correctly in browsers, but the server responds with a 404 status. Any thoughts are appreciated.

[edited by: jdMorgan at 7:27 pm (utc) on Nov. 12, 2007]
[edit reason] example.com [/edit]

jdMorgan

7:34 pm on Nov 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The /umb/ request is likely to not invoke your rule, since that URL exists as a directory. However, unless index.php is defined using DirectoryIndex as the index page for /umb, you should get a 404 as a result of the index page request.

Then we have the added complication of the script also producing 404 responses, so it's hard to tell where the error is coming from. One way to find out is to temporarily change your code to produce an external redirect, so that you can see (in your browser address bar) whether the code is invoked to pass the request to the script. Then you can determine whether the server itself or your script is producing the 404 for each URL that you test:


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule (.*) http://example.com/index.php?loc=$1 [R=302,L]

Jim

tiger11

7:55 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



Thanks for your reply - and we're definitely getting somewhere.

Just for clarity, the .htaccess file outlined is actually in the /umb directory. I've modified the .htaccess file in the /umb directory as you suggested so it reads:

DirectoryIndex index.php
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule (.*) http://example.com/umb/index.php?loc=$1 [R=302,L]

and now the page http://example.com/umb/FLTA does indeed return as 200 although it appears in the browser bar as:

http://example.com/umb/index.php?loc=FLTA rather than http://example.com/umb/FLTA

http://example.com/umb/ does still return 404 even with the DirectoryIndex index.php

What am I missing?

Thanks.

jdMorgan

8:10 pm on Nov 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> http://example.com/umb/ does still return 404 even with the DirectoryIndex index.php

> What am I missing?

When testing with this URL, what do you see in the browser address bar?

If you see the originally-requested URL, then the rule was not invoked, and the 404 was returned by Apache itself. If you see the script URL, then the rule was invoked, and the 404 was returned by your script.
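
If the address bar alone is inconclusive, mod_rewrite's own logging is another way to see whether the rule ran. The following is only a sketch, assuming you can edit the main server or virtual host configuration (these directives are not allowed in .htaccess). The log path is a placeholder, not something taken from this thread:

```apache
# Apache 2.2 and earlier: dedicated rewrite log (server config or <VirtualHost> only).
# The path below is a placeholder; use any location Apache can write to.
RewriteLog "/var/log/apache2/rewrite.log"
# 0 disables logging; higher values are more verbose and slower.
RewriteLogLevel 3

# Apache 2.4 and later dropped the two directives above in favor of per-module LogLevel:
# LogLevel alert rewrite:trace3
```

Turn the logging off again once the problem is found, since it slows the server down noticeably.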

Jim

tiger11

8:21 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



I see the url itself - so http://example.com/umb/ rather than http://example.com/umb/index.php?loc=

tiger11

9:03 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



Further investigation seems to show two challenges:

First, the conditions are not picking up example.com/umb so that's not executing the mod_rewrite for the index.

Second, unless the redirect is specifically called the header sends 404, so using:
RewriteRule ^(.*)$ index.php?loc=$1

creates a browser reachable page when entering http://example.com/umb/GEN and the browser bar shows http://example.com/umb/GEN but the header response is 404

As opposed to using:
RewriteRule ^(.*)$ index.php?loc=$1 [R=301]

which creates a browser-reachable page when entering http://example.com/umb/GEN, but the browser bar shows http://example.com/umb/index.php?loc=GEN with a header response of 301 then 200.

With this situation I'd be left choosing between having pages with clean URLs and having pages that are indexable by search engines.

Is there any way to overcome this?

Thanks again.

jdMorgan

9:28 pm on Nov 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Is there any way to overcome this?

Of course there is... but panic won't help.

I'd like to point out that I suggested a 302 redirect above, so that search engines won't pick up your dynamic URLs while you are working on this. So the first order of the day is to fix that.

Jim

tiger11

9:43 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



Sorry if I sounded like I was panicking - I didn't mean to come across that way. What I meant was that I figured there must be a way around this just that I was missing it. I continue to appreciate the help.

As for the R=302 - I do indeed have it set that way right now, just a typo when I put 301.

jdMorgan

9:51 pm on Nov 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Why do you feel it necessary to check for "file exists" and "directory exists" in this code? I'm asking this as an open-ended kind of question to determine your actual requirements.

Jim

tiger11

10:20 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



I had included the check for "file exists" and "directory exists" because the end of the URL looked like a directory (GEN in http://example.com/umb/GEN), and under those circumstances I wanted index.php to run so that the db could be queried.

I suppose that because I always want index.php?loc=something to be run, the conditions are not really needed. With that in mind, and out of curiosity, I tried deleting them so the .htaccess in /umb read:

DirectoryIndex index.php
Options +FollowSymLinks
RewriteEngine on
RewriteRule (.*) http://example.com/umb/index.php?loc=$1 [R=302,L]

and then I got the browser message:

Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

jdMorgan

11:21 pm on Nov 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, it'll loop since you removed the only thing that stopped it from looping. It'll redirect the original request to /umb/index.php?loc=<original request>, then redirect that to /umb/index.php?loc=index.php, which will redirect to itself until the browser or server gives up.

So, unless you actually have real directories and files under /umb, you can/should use the much more efficient construct:


RewriteCond $1 !^index\.php$
RewriteRule (.*) http://example.com/umb/index.php?loc=$1 [R=302,L]

to stop the loop.
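
To make the loop concrete, here are the same two lines annotated with what happens to a request for /umb/GEN. This assumes the .htaccess file lives in /umb, so the rule sees the path with the /umb/ prefix already stripped and never sees the query string:

```apache
# Without the condition:
#   /umb/GEN                      rule sees "GEN"       -> 302 to /umb/index.php?loc=GEN
#   /umb/index.php?loc=GEN        rule sees "index.php" -> 302 to /umb/index.php?loc=index.php
#   /umb/index.php?loc=index.php  rule sees "index.php" -> 302 to itself, and so on
#
# With the condition, the second request is left alone because $1 is "index.php":
RewriteCond $1 !^index\.php$
RewriteRule (.*) http://example.com/umb/index.php?loc=$1 [R=302,L]
```

The condition can refer to $1 because the RewriteRule pattern is matched first and its conditions are evaluated afterwards.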

Jim

tiger11

11:32 pm on Nov 12, 2007 (gmt 0)

10+ Year Member



That's excellent - thanks. It completely takes care of the first challenge.

So now using:

RewriteRule (.*) http://example.com/umb/index.php?loc=$1

When I go to [ecolabelling.org...] the page displays perfectly, the server header returns 302 then 200 and the browser bar displays http://example.com/umb/index.php?loc=GEN

In the hopes of getting it to read [ecolabelling.org...] I changed it to:

RewriteRule (.*) index.php?loc=$1

When I go to [ecolabelling.org...] the page displays perfectly (so index.php is being called correctly) and the browser bar shows [ecolabelling.org...] but the server header returns 404.

I'm so close, no? :)

jdMorgan

12:03 am on Nov 13, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Remember to always use the [L] flag unless you want the output of this rule to be processed by subsequent rules:

RewriteRule (.*) index.php?loc=$1 [L]
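
Putting the pieces from this thread together, the complete .htaccess file in /umb might look like the sketch below. It assumes that nothing under /umb other than index.php needs to be served directly (no real images, stylesheets or subdirectories), since every other request is handed to the script:

```apache
# Names index.php as the index document for /umb/
DirectoryIndex index.php
# mod_rewrite in .htaccess requires FollowSymLinks (or SymLinksIfOwnerMatch)
Options +FollowSymLinks
RewriteEngine on
# Leave requests for index.php alone so the rewritten URL is not rewritten again
RewriteCond $1 !^index\.php$
# Internal rewrite: /umb/GEN is handled by index.php with loc=GEN,
# and the browser address bar keeps showing /umb/GEN
RewriteRule ^(.*)$ index.php?loc=$1 [L]
```

Because the substitution carries its own query string, any query string on the original request is dropped; use [QSA,L] instead of [L] if it needs to be passed through.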

So, now it's time to debug your PHP script... :)

For that, I'll refer you to our PHP Server-Side Scripting forum for competent help.

Jim

phranque

12:20 am on Nov 13, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], tiger11!

in case you haven't found the WebmasterWorld php forum [webmasterworld.com]...