Numbers being stripped from rewrite, why?

         

brokaddr

9:27 pm on Aug 26, 2011 (gmt 0)

10+ Year Member



I use an htaccess rule to rewrite page URLs ending in _XX.html, where XX is the value I use for a wildcard search in my database. It works perfectly for letters, but numbers are not read.

Can anybody let me know what I may have done incorrectly?

htaccess rule:
RewriteRule ^page/([^.]+)_([^.]+)\.html$ page.php?cPath=$1&alphanumeric=$2 [NC,L] 


The alphanumeric string corresponds to a URL like this:
http://www.example.com/page/page-name_A.html

*A* is the value I wildcard-search my database with. With numbers, it turns up as a blank wildcard.

g1smd

9:56 pm on Aug 26, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The first
([^.]+)
says "read until you find a period, then stop", so on its first pass it runs straight past the underscore and only finds it by backtracking. The first RegEx "eats" it.

Try
([^._]+)_
instead.
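
A rough way to compare the two captures, sketched in Python rather than Apache config. Python's re module is an assumption standing in for mod_rewrite's regex engine, and the helper name captures is made up for illustration. Because the engine backtracks, both patterns should match a name with a single underscore; they differ once the name contains more than one.

```python
import re

# Python's re stands in for Apache's regex engine here; re.IGNORECASE
# plays the role of the [NC] flag on the RewriteRule.
ORIGINAL = re.compile(r"^page/([^.]+)_([^.]+)\.html$", re.IGNORECASE)
SUGGESTED = re.compile(r"^page/([^._]+)_([^.]+)\.html$", re.IGNORECASE)


def captures(pattern, path):
    """Return the ($1, $2) pair the rule would capture, or None if no match."""
    match = pattern.match(path)
    return match.groups() if match else None


if __name__ == "__main__":
    for path in ("page/page-name_A.html", "page/page-name_2.html", "page/a_b_c.html"):
        print(path, captures(ORIGINAL, path), captures(SUGGESTED, path))
```

For page/a_b_c.html the original pattern splits at the last underscore and the suggested one at the first, so the choice matters only if page names can contain underscores themselves.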

brokaddr

10:59 pm on Aug 26, 2011 (gmt 0)



Is this correct? I'm still getting a blank response for numbers (but letters work):
RewriteRule ^page/([^._]+)_([^.]+)\.html$ page.php?cPath=$1&alphanumeric=$2 [NC,L] 

lucy24

1:09 am on Aug 27, 2011 (gmt 0)



Which of these is happening?

--pages with names in the form "page-name_2" aren't recognized at all (and therefore don't arrive at the php page)

--pages with names in the form "page-name_2" do arrive at the php page, but the second part of the query string is empty

Does anything behave differently if you change the second capture to

([A-Z0-9]+)

?

Incidentally, is the "page-name_" part variable? That is, it could be "foobar_" or "buynow_" or anything? If there's only a small range of possible names, you might do better putting them into the rule as pipe-separated options instead of using wildcards.
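
A minimal sketch of the pipe-separated idea, again in Python as a stand-in for the rewrite rule. The names foobar and buynow are only the placeholders from the paragraph above, and the route helper is hypothetical; the real list of page names is not given in this thread.

```python
import re

# "foobar" and "buynow" are placeholder names, not real pages from the site.
# The second capture is the ([A-Z0-9]+) form suggested above.
FIXED_NAMES = re.compile(r"^page/(foobar|buynow)_([A-Z0-9]+)\.html$", re.IGNORECASE)


def route(path):
    """Return the query parameters the rule would pass on, or None if no match."""
    match = FIXED_NAMES.match(path)
    if match is None:
        return None
    return {"cPath": match.group(1), "alphanumeric": match.group(2)}


if __name__ == "__main__":
    print(route("page/foobar_2.html"))
    print(route("page/unknown_2.html"))
```

With a fixed list, any page name outside the list fails to match at the rewrite stage instead of reaching the php page.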

brokaddr

7:53 pm on Aug 27, 2011 (gmt 0)



This is what's happening:
--pages with names in the form "page-name_2" do arrive at the php page, but the second part of the query string is empty

Re: "Does anything behave differently if you change the second capture to ([A-Z0-9]+)?"

Same result, nothing changes.

After some further digging in my code, this appears to work for a portion of the code:
$getletter = isset($_GET['letter']) ? preg_replace("/[^A-Z0-9\/]/", "", $_GET['letter']) : '';


That part is operational, until we get to here (with numbers):

$categories_query = tep_db_query("select c.categories_id, cd.band_name, cd.categories_name, c.categories_image, c.parent_id from " . TABLE_CATEGORIES . " c, " . TABLE_CATEGORIES_DESCRIPTION . " cd where c.parent_id = '" . (int)$current_category_id . "' and c.categories_id = cd.categories_id and cd.categories_name LIKE '$getletter%' order by sort_order, cd.categories_name");


Now I am wondering whether my method of wildcard search is breaking this, rather than the htaccess.
What's extra confusing is that letters work in the query but numbers do not... BUT if I hard-code a number, like:
$categories_query = tep_db_query("select c.categories_id, cd.band_name, cd.categories_name, c.categories_image, c.parent_id from " . TABLE_CATEGORIES . " c, " . TABLE_CATEGORIES_DESCRIPTION . " cd where c.parent_id = '" . (int)$current_category_id . "' and c.categories_id = cd.categories_id and cd.categories_name LIKE '4%' order by sort_order, cd.categories_name");

... it works.

brokaddr

5:20 am on Aug 30, 2011 (gmt 0)



Is this something I should continue in the php forum? I only know the bare basics of htaccess, so I initially suspected htaccess as the culprit of my woes.

Could this be more of a php problem?

brokaddr

6:48 am on Aug 30, 2011 (gmt 0)



I could not edit my previous post.

Sure enough, it was a php issue. There were two culprits:
1) My forced 404 for nonexistent pages was backwards. Instead of > I had <

2) The other one slipped past my many rounds of trial and error before posting here:
$getletter = isset($_GET['letter']) ? preg_replace("/[^A-Z0-9\/]/", "", $_GET['letter']) : '';

Once I replaced that with:
$getletter = isset($_GET['letter']) ? preg_replace("/[^A-Z0-9\/]/", "-", $_GET['letter']) : '';


the added "-" forces a 404, because any irregular character or submission puts an extra - into the query, which results in mysql_num_rows = 0. Replacing with "" (an empty string) only produced a compounded wildcard search, which was pointless for what I was trying to achieve!
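
To make the difference between the two replacements concrete, here is a small sketch in Python. It assumes Python's re.sub behaves like the preg_replace call above for this character class, and the like_pattern helper is hypothetical; it only builds the string that would sit inside LIKE '...' in the query.

```python
import re

# Same character class as the preg_replace call: anything that is not
# A-Z, 0-9 or "/" counts as irregular. The class is case-sensitive.
IRREGULAR = re.compile(r"[^A-Z0-9/]")


def like_pattern(raw, replacement):
    """Build the LIKE pattern the query would end up with for this input."""
    return IRREGULAR.sub(replacement, raw) + "%"


if __name__ == "__main__":
    for raw in ("4", "A", "?", "a"):
        print(repr(raw), like_pattern(raw, ""), like_pattern(raw, "-"))
```

With the empty replacement, an irregular input collapses to a bare % that matches every category name. With "-", the pattern becomes -%, which returns no rows unless a name really starts with a hyphen. A lowercase letter counts as irregular too, since the class only lists A-Z.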


I have edited my htaccess rule based on the suggestions g1smd and lucy24 offered, and they will be put to use! :) I appreciate the help!

g1smd

7:35 am on Aug 30, 2011 (gmt 0)



Often the problem is quite a distance away from the apparent symptom.

Glad you fixed it.

Test thoroughly with a wide variety of both expected and unexpected (malformed in as many ways as you can think of) URLs.
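
One possible shape for that kind of testing, as a Python sketch. The final rule is not shown in this thread, so the pattern below is an assumption that combines the two captures suggested earlier, and the matches helper and the URL lists are made up for illustration.

```python
import re

# Assumed rule: ([^._]+) for the page name and ([A-Z0-9]+) for the value.
# The actual rule in use is not shown in the thread.
RULE = re.compile(r"^page/([^._]+)_([A-Z0-9]+)\.html$", re.IGNORECASE)

EXPECTED = ["page/page-name_A.html", "page/page-name_42.html", "PAGE/Page-Name_a.HTML"]
MALFORMED = [
    "page/page-name_.html",       # empty value
    "page/_A.html",               # empty page name
    "page/page-name.html",        # no underscore at all
    "page/page.name_A.html",      # period inside the page name
    "page/page-name_A%.html",     # irregular character in the value
    "page/page-name_A.html.bak",  # trailing junk after .html
    "page/a_b_c.html",            # more than one underscore
]


def matches(path):
    """True if the assumed rule would rewrite this path."""
    return RULE.match(path) is not None


if __name__ == "__main__":
    for path in EXPECTED + MALFORMED:
        print(path, "rewritten" if matches(path) else "not rewritten")
```

Anything that is not rewritten never reaches page.php through this rule, so those URLs also need checking against whatever else the server does with them.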