Friendly URLs - htaccess - mod rewrite problems

mod rewrite problems return original url

         

zorro

3:47 pm on Apr 18, 2010 (gmt 0)

10+ Year Member



I would very much appreciate help from anyone with the following problems I am having with my .htaccess mod_rewrite rules:

Our site runs on HTML, PHP and MySQL on a shared server with:
Linux operating system
MySQL 5.1.30
Apache version 2.2.15
PHP version 5.2.13

We are running a real estate site where all properties are referenced from the database as follows (using 555 as property id example):
http://www.example.com/accommodation.php?id=555

I am trying to write friendly URLs so that the property type followed by the id number shows instead.
For example: http://www.example.com/Apartment-555
Or http://www.example.com/Villa-555

We have property types:
Apartment
Villa
Bungalow
Finca
Townhouse
House
Studio
Cottage
Duplex

I have achieved this, but with problems. Here is the .htaccess file:

SetEnv TZ Europe/London

Options +FollowSymLinks

RewriteEngine On
RewriteCond %{HTTP_HOST} ^example.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]

THE ABOVE IS TO MAKE ALL REQUESTS GO TO WWW

RewriteRule ^Apartment-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^Villa-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^Bungalow-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^Finca-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^House-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^Studio-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^Cottage-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^Townhouse-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]
RewriteRule ^Duplex-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [P]

THE ABOVE ARE WORKING BUT SEE PROBLEMS BELOW

ErrorDocument 404 /404.shtml

THE ABOVE IS A CUSTOM 404 PAGE

All relevant links in all pages have been changed to accommodate the change. For example, the old links which used to say:
<a href='accommodation.php?id=".$row["id"]."'> now read...
<a href='".$row["accommodation_type"]."-".$row["id"]."'>
And the links work great - showing the friendly URL in the address bar of the browser as: http://www.example.com/Apartment-555

PROBLEM:

The issue I have is this: if I change the word 'Apartment' to 'Bungalow' or any other property type in the address bar so it reads
http://www.example.com/Bungalow-555
it still shows the Apartment-555 property. There isn't a Bungalow-555 in the database, so I was expecting an error page!
And I certainly don't want Google to think there is an Apartment-555 and a Bungalow-555 and a Villa-555, etc.

I only want whatever is stored in the database to show as a friendly URL

On another note, I was looking around at other property sites to see if they did the same and came across a site called <snip>.

Ideally we would like our URLs to behave the same as on this site. I have spent the past 7 days reading dozens of sites about Apache, mod_rewrite, friendly URLs, .htaccess and more, and I have tried dozens of different combinations, but still can't seem to get it right.

Here is what happens with the competitor site:

If you take a look at any one of the properties they have, for example:
http://www.competitor-site.co.uk/rentals/mijas-costa/22122

If you enter the above URL into your browser, remove one or more characters from either the word rentals or the word mijas-costa, and hit the enter/return key, the misspelled URL jumps back to the original URL above automatically.

Example:
http://www.competitor-site.co.uk/reals/mijas-costa/22122 (a couple of letters taken out of the word rentals)
http://www.competitor-site.co.uk/rentals/mijsta/22122 (a couple of letters taken out of mijas-costa)

After hitting the enter key, both incorrect URLs jump back to the original:
http://www.competitor-site.co.uk/rentals/mijas-costa/22122

I presume this is done with mod_rewrite and .htaccess?
I would be so grateful if someone could help me with these issues and explain how to achieve the desired results.

[edited by: jdMorgan at 5:12 pm (utc) on Apr 18, 2010]
[edit reason] example.com, obscured URLs [/edit]

jdMorgan

5:27 pm on Apr 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're asking your server to do the impossible -- to declare "Not Found" based on database entries that only your script can access.

Put another way, the only way you will get a 404 for any URL matched by your rules is if the "accommodation.php" script itself is not found. The server has no way to know --in advance-- whether a database entry exists for Bungalow-555, Apartment-555, or Hovel-555. The server only knows whether a URL resolves to an existing *file* -- not whether a database entry exists.

The likely solution to all of your issues here is to simplify your code down to a single rule such as
 RewriteRule ^[A-Z][a-z]+-([0-9]+)$ http://www.example.com/accommodation.php?id=$1 [L] 

and then handle everything else in your script.
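To see what that single pattern does and does not catch, here is a quick sketch using Python's re module. Python is used only for illustration; for a pattern this simple it behaves the same way as in mod_rewrite. The function name extract_id is made up for this example, and the path is tested without a leading slash, as it would be in .htaccess.

```python
import re

# Same pattern as the RewriteRule above: one capital letter, one or more
# lowercase letters, a hyphen, then the numeric id (captured as group 1).
PATTERN = re.compile(r"^[A-Z][a-z]+-([0-9]+)$")

def extract_id(path):
    """Return the captured id if the rule would fire for this path, else None."""
    match = PATTERN.match(path)
    return match.group(1) if match else None

# Try a few requests, including a type that is not in the database at all.
for path in ["Apartment-555", "Bungalow-555", "Hovel-555", "apartment-555", "Villa-"]:
    print(path, "->", extract_id(path))
```

Note that "Hovel-555" matches just as well as "Apartment-555": the pattern only checks the shape of the URL, which is why the remaining checks have to happen in the script.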

The script can take the "id number," look it up in your database, and if that id exists and the "words" describing the accommodation type are correct, generate and serve a "page" for that property.

If the id is valid, but the "words" are not *exactly* right (to include any/all casing and spacing errors), then generate a 301-Moved Permanently server response, providing the "corrected words" from the database.

Finally, if the id is invalid, then return either a 404-Not Found response, or in the case where a database entry exists but is outdated, produce a 410-Gone response or a redirect to an appropriate "current" listing or category page (this would require the necessary information to be added to your database).

So, only your script has access to the information required to implement the solutions and features you want here -- it's all in the database.

Note that if you cannot modify "accommodation.php" because it's an off-the-shelf script which may be frequently updated (thus reverting all of your changes), the solution is to put a 'wrapper' script around it to do the functions described above: the mod_rewrite code calls the wrapper script, the wrapper does the checking described above, and if the request is valid, it simply "includes" the accommodation.php script and runs it.
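The wrapper idea can be sketched roughly as follows. This is Python rather than PHP and is only an outline of the logic, not a drop-in script. The names make_wrapper, lookup and render_page are hypothetical, the "expired" flag is an assumed extra database column marking outdated listings, and "accommodation_type" is the column name used in the link code earlier in this thread.

```python
def make_wrapper(lookup, render_page):
    """Build a handler that validates a request before running the page script.

    lookup(prop_id) returns the database row as a dict, or None for an unknown id.
    render_page(prop_id) stands in for the unmodified accommodation.php.
    """
    def handle(requested_type, prop_id):
        row = lookup(prop_id)
        if row is None:
            return 404, None                  # id is not in the database
        if row.get("expired"):
            return 410, None                  # entry exists but is outdated
        stored_type = row["accommodation_type"]
        if stored_type != requested_type:     # exact, case-sensitive comparison
            return 301, "http://www.example.com/%s-%s" % (stored_type, prop_id)
        return 200, render_page(prop_id)      # valid request: run the real page
    return handle

# Hypothetical data standing in for the real table.
rows = {
    "555": {"accommodation_type": "Apartment"},
    "600": {"accommodation_type": "Villa", "expired": True},
}
handle = make_wrapper(rows.get, lambda prop_id: "<html>listing %s</html>" % prop_id)
print(handle("Bungalow", "555"))
```

The point of the design is that the page script itself is never touched; it only runs once the wrapper has ruled out the 404, 410 and 301 cases.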

Jim

zorro

7:41 pm on Apr 24, 2010 (gmt 0)

10+ Year Member



Thank you for the advice jd, much appreciated. The script is not off the shelf and was created from the ground up for us by some Bulgarian programmers.
I am by no means a programmer myself, but over the last 12 months I have been changing a few things and have a basic understanding of how some of the PHP code works - I'm more of an HTML-only person.

I'll try the new rewrite rule as you suggest, but I'm not sure where to start with linking the id number to the correct accommodation type.
I appreciate what you're saying; I just can't get my head around the bit where you say "if id exists and accommodation type are correct" or "id is valid but accommodation type is not".

If the id is valid, how do I tell whether the accommodation type is right or not, given that it could be one of several depending on what the user chooses?
One id could be an Apartment.
Another id could be a Bungalow.

Any help greatly appreciated!

g1smd

8:06 pm on Apr 24, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I assume that each database entry contains data about the accommodation type stored against each ID number.

Pass both the requested type and the requested number to the script in two variables.

RewriteRule ^([A-Z][a-z]+)-([0-9]+)$ /accommodation.php?type=$1&id=$2 [L]


Get the script to look in the database and check that the requested type matches that number.

If the ID number is not found at all in the database, then immediately send the 404 header and an error message.

If the type retrieved from the database doesn't match the requested type, then use the type you got from the database for that ID to build an immediate 301 redirect header to the correct URL.

If the requested ID does match the requested type then retrieve all the content for the page from the database and then send the content as an HTML page.
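For anyone who wants to see what those three outcomes look like as actual response headers, here is a minimal sketch written as a Python WSGI handler. The real script in this thread is PHP, so treat this purely as an outline. LISTINGS is a hypothetical stand-in for the database lookup, and the handler assumes the type and id arrive in the query string exactly as the RewriteRule above passes them.

```python
from urllib.parse import parse_qs

# Stand-in for the real database table: id -> accommodation type (made-up data).
LISTINGS = {"555": "Apartment", "556": "Villa"}

def app(environ, start_response):
    """Minimal WSGI handler following the three checks described above."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    req_type = params.get("type", [""])[0]
    req_id = params.get("id", [""])[0]

    stored_type = LISTINGS.get(req_id)
    if stored_type is None:
        # ID not found at all: send the 404 header and an error message.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"No such property."]

    if stored_type != req_type:
        # Wrong type for this ID: redirect to the URL built from the stored type.
        location = "http://www.example.com/%s-%s" % (stored_type, req_id)
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    # Type and ID agree: serve the page content.
    start_response("200 OK", [("Content-Type", "text/html")])
    return [("<h1>%s %s</h1>" % (stored_type, req_id)).encode("utf-8")]
```

Only one lookup is needed per request: the stored type both answers "does this ID exist?" and supplies the corrected word for the redirect.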

zorro

10:17 pm on Apr 24, 2010 (gmt 0)

10+ Year Member



Thanks g1 for your speedy response, and please forgive my ignorance, but would it still be OK to add the extra type=$1 to the original URL, which was just:

accommodation.php?id=15

(15 is an example)?

Many thanks!

g1smd

10:44 pm on Apr 24, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, add the extra variable, then add a chunk of code at the beginning of your script to do the various checks I detailed above and send the correct responses depending on the outcome.

Your custom-built script should have had these features included from the beginning. Whoever wrote the requirements specification for your script missed a major part of what it needs to do.

Too often, designers focus only on what happens when a valid request arrives at the server and fail to design in proper features to cope with when a non-valid request is received.

The good news is that it should only be a few dozen lines of PHP code and one call to the database.

zorro

2:03 pm on Apr 28, 2010 (gmt 0)

10+ Year Member



Thanks again g1.

Afraid I have to hold my hand up here as I was the major contributor to the design.
Again thanks a lot!

g1smd

2:32 pm on Apr 28, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



LOL. Maybe I'm a bit blunt sometimes. However, once you have designed it so that a request for
www.example.com/something
returns some content, always ask yourself: what happens if I ask for...
- non-www:
example.com/something

- an appended port number:
example.com:80/something

- an added trailing slash:
example.com/something/

- an unwanted parameter:
example.com/something?junk-on-the-end

and so on.

If any of those return a duplicate page with "200 OK" status, then you have a problem to fix. If any return a blank page with "200 OK" status, that also needs fixing. If they return either 404, or a single-step 301 redirect to the canonical URL then you're doing OK.
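As a rough illustration of the "single-step 301" idea, the sketch below (Python, for illustration only) takes the Host header, path and query string of a request and returns the one canonical URL to redirect to, or None if the request is already canonical. The function name is hypothetical, and it makes two assumptions: that the host is one of your own, and that the URLs in question take no parameters, so any query string is treated as junk. Returning a 404 for such requests instead would be equally acceptable.

```python
CANONICAL_HOST = "www.example.com"

def canonical_redirect(host, path, query=""):
    """Return the canonical URL to 301 to, or None if already canonical."""
    # Strip any added trailing slash, but keep the bare root path as "/".
    clean_path = path.rstrip("/") or "/"
    # Canonical means: exact www host (no port), no trailing slash, no query.
    if host == CANONICAL_HOST and clean_path == path and not query:
        return None
    # Fix every problem at once so the client is redirected only one time.
    return "http://%s%s" % (CANONICAL_HOST, clean_path)

# A request with several problems at once still gets a single redirect.
print(canonical_redirect("example.com:80", "/something/", "junk-on-the-end"))
```

Building the target from scratch, rather than fixing one defect per redirect, is what avoids chains such as non-www to www followed by a second hop that removes the slash.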

You now need to add one more part to your site. If anyone directly asks for
(www.)example.com/accommodation.php?type=<something>&id=<number>
or for
(www.)example.com/accommodation.php?id=<number>&type=<something>
with or without extra parameters, they need to be redirected to
http://www.example.com/<type>-<number>
This is the final part in getting search engines to update their list of valid URLs on your site, and in preventing duplicate content issues.
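A sketch of that last redirect, again in Python purely for illustration: the hypothetical function below takes the URI exactly as the client requested it and returns the friendly URL to redirect to, or None. It must be applied to the original client request and not to the internally rewritten URL, otherwise the friendly URL would be rewritten to accommodation.php and redirected straight back again in a loop.

```python
import re
from urllib.parse import urlsplit, parse_qs

def friendly_redirect(request_uri):
    """Return the friendly URL for a direct accommodation.php request, else None."""
    parts = urlsplit(request_uri)
    if parts.path != "/accommodation.php":
        return None
    # parse_qs does not care about parameter order or extra parameters.
    params = parse_qs(parts.query)
    acc_type = params.get("type", [""])[0]
    acc_id = params.get("id", [""])[0]
    # Only redirect when both values look sane; leave the rest to the
    # script's own 404 handling.
    if not re.fullmatch(r"[A-Za-z]+", acc_type) or not re.fullmatch(r"[0-9]+", acc_id):
        return None
    return "http://www.example.com/%s-%s" % (acc_type, acc_id)

print(friendly_redirect("/accommodation.php?id=555&type=Villa&junk=1"))
```

If the type in the old-style URL is wrong for that id, the friendly URL it lands on is then corrected by the type check already in the script.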

zorro

4:58 pm on Apr 28, 2010 (gmt 0)

10+ Year Member



Brilliant, Cheers!

g1smd

6:08 pm on Apr 28, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Similar topic: [webmasterworld.com...]