Creating Search Engine Friendly URLs

         

username

11:31 pm on Apr 15, 2009 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi, I am very new to SEF URLs with .htaccess, so excuse my lack of progress thus far.

What I want is for a URL such as http://www.example.com/members/ to include another folder that does not exist on the server, i.e. a virtual folder. What I have now is http://www.example.com/members?user=bob, where bob would become the virtual folder (http://www.example.com/members/bob/). I am trying to clean things up instead of having ugly query strings for member pages.

The issue I am having is grabbing the virtual folder, and storing it in a $username variable in the http://www.example.com/members/index.php file. Is this possible?

If so, some of the usernames in the string may contain a full stop or dash. Will this cause issues?

Thanks.

[edited by: tedster at 7:14 pm (utc) on April 18, 2009]
[edit reason] switch to example.com - it cannot be owned [/edit]

g1smd

12:54 am on Apr 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What code have you used so far?

You'll be using a RewriteRule I guess, coded as a rewrite, NOT as a redirect.

The URL will be www.example.com/members/bob and the internal filepath inside the server will be /members/index.php?name=bob I guess.

That's quite easy to code; there's an example like that almost every day here.

What code have you tried so far?

Do be aware that it is links that 'define' URLs, so you need to link to the www.example.com/members/bob URL format in the pages of your site.

Hyphens or dots will be OK in your URLs. Note that spaces and/or underscores would be a disaster.

.

Once the rewrite is coded and working, we will need to talk about adding additional redirects for canonicalisation and for avoiding Duplicate Content caused by direct access to the old format URLs.

username

1:58 am on Apr 16, 2009 (gmt 0)

Hi g1smd, so far I have been trialling

RewriteEngine On
Directory Index members/
RewriteRule ^/([a-z][A-Z+][0-9]+)/ index.php

...but this is not working?

I was told by my host providers that the Directory index can be put in the members folder or the root directory. Which do you suggest?

jdMorgan

2:43 am on Apr 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In www.example.com/.htaccess :

RewriteEngine on
RewriteRule ^members/([a-z0-9][a-z0-9\-]{0,22}[a-z0-9])/$ /members/index.php?name=$1 [NC,L]

Membernames consisting of a minimum of two and a maximum of 24 letters or digits (upper- or lowercase, because of the [NC] flag) are allowed. Hyphens are allowed anywhere in the middle, but not at the beginning or end.
See the resources cited in our Forum Charter, and the threads in our Forum Library. Links to both are at the top of every page in this forum.
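
To see which membernames that pattern accepts, here is a rough sketch in Python. It only approximates what mod_rewrite does: Python's re module stands in for Apache's regex engine, re.IGNORECASE stands in for the [NC] flag, and the function name captured_name is made up for illustration.

```python
import re

# Same pattern as the RewriteRule above; re.IGNORECASE plays the role of [NC].
MEMBER_PATTERN = re.compile(
    r"^members/([a-z0-9][a-z0-9\-]{0,22}[a-z0-9])/$", re.IGNORECASE
)

def captured_name(path):
    """Return what $1 would hold for this path, or None if the pattern fails.

    path is given without the leading slash, the way a rule in the
    root .htaccess sees it (an assumption of this sketch).
    """
    match = MEMBER_PATTERN.match(path)
    return match.group(1) if match else None

for sample in ["members/bob/", "members/Mary-Ann/", "members/-bob/",
               "members/b/", "members/bob.smith/", "members/bob"]:
    print(sample, "->", captured_name(sample))
```

In this sketch bob and Mary-Ann are captured, while -bob (leading hyphen), b (too short), bob.smith (period not in the character class) and the slashless members/bob all come back as None.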

Jim

jdMorgan

2:45 am on Apr 16, 2009 (gmt 0)

If the code goes into www.example.com/members/.htaccess, then it changes to:

RewriteEngine on
RewriteRule ^([a-z0-9][a-z0-9\-]{0,22}[a-z0-9])/$ /members/index.php?name=$1 [NC,L]
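
The pattern loses its "members/" prefix because a rule in .htaccess is matched against the path relative to the directory holding that file. A small Python sketch of that idea follows; the function path_seen_by_rule is hypothetical and ignores everything else Apache does while mapping a request.

```python
def path_seen_by_rule(url_path, htaccess_dir):
    """Approximate the string a RewriteRule pattern is tested against.

    url_path:     requested URL-path, e.g. "/members/bob/"
    htaccess_dir: URL-path of the directory containing the .htaccess,
                  e.g. "/" for the site root or "/members/"
    """
    if not url_path.startswith(htaccess_dir):
        raise ValueError("request is outside this directory")
    # The directory prefix (including its slash) is stripped before matching.
    return url_path[len(htaccess_dir):]

print(path_seen_by_rule("/members/bob/", "/"))          # members/bob/
print(path_seen_by_rule("/members/bob/", "/members/"))  # bob/
```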

Jim

g1smd

10:54 pm on Apr 16, 2009 (gmt 0)

That should get you started with part of what you need.

How are you getting on?

username

4:01 am on Apr 18, 2009 (gmt 0)

Hi guys, definitely making progress, however there are a couple of minor bugs. For example:

1. If the user enters http://www.example.com/members/name with no slash (/) on the end, the page gets a 404. With the slash it works.

2. When I include non alphanumeric characters such as . or - it breaks and causes a 404 also. Other than that, it works fine.

Note: I am using the .htaccess file from within the members dir, not the root dir.

Your help would be appreciated.

g1smd

6:08 pm on Apr 18, 2009 (gmt 0)

The 404 is correct. One URL should deliver the content, the other should be 'the wrong URL'. The rule can be adjusted to reverse the actions, but only one URL should deliver content.

The rule can also be adjusted to allow other characters to appear in the URL. You didn't mention any parameter value other than 'bob' in your problem description, so the solution provided accepts only plain alphanumeric characters (plus hyphens).

jdMorgan

12:18 am on Apr 19, 2009 (gmt 0)

To fix the slash problem, you need an additional rule. And to allow periods in usernames, you need to modify the regular-expression patterns in the original rule to accept them:

RewriteEngine on
#
# Externally redirect to forcibly remove trailing slash (canonical URL has no trailing slash)
RewriteRule ^([a-z0-9][a-z0-9.\-]{0,22}[a-z0-9])/$ http://www.example.com/members/$1 [NC,R=301,L]
#
# Internally rewrite slashless membername requests to /members/index.php script.
# Membernames may be up to 24 characters long, must start with a letter or number, may
# contain letters, numbers, periods, or hyphens, and must end with a letter or number.
RewriteRule ^([a-z0-9][a-z0-9.\-]{0,22}[a-z0-9])$ /members/index.php?name=$1 [NC,L]

You can easily accept *any* membername by changing the pattern to "^(.+)$". Do NOT do this unless your script enforces strict rules about membernames and accepts only a very limited character set, or you risk creating security vulnerabilities. I suggest the restrictions implemented in the patterns here. There's no reason to risk vulnerabilities just because someone 'wants' to use 'cool characters' in their username; the ones allowed here are sufficient.
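
For anyone who wants to trace the two rules by hand, here is a simplified Python model of them. It assumes the rules live in /members/.htaccess (so paths are relative to /members/), uses Python's re module in place of Apache's regex engine, and the function name apply_rules is invented for the sketch. It models a single pass only, not what the server does with the rewritten request afterwards.

```python
import re

# Character rules shared by both RewriteRules above.
NAME = r"[a-z0-9][a-z0-9.\-]{0,22}[a-z0-9]"
REDIRECT_RULE = re.compile("^(" + NAME + ")/$", re.IGNORECASE)  # rule 1
REWRITE_RULE = re.compile("^(" + NAME + ")$", re.IGNORECASE)    # rule 2

def apply_rules(path):
    """Return (action, target) for a path relative to /members/."""
    match = REDIRECT_RULE.match(path)
    if match:
        # Trailing slash present: external 301 to the slashless URL.
        return ("301", "http://www.example.com/members/" + match.group(1))
    match = REWRITE_RULE.match(path)
    if match:
        # Slashless URL: internal rewrite to the script.
        return ("rewrite", "/members/index.php?name=" + match.group(1))
    return ("no match", path)

for sample in ["bob/", "bob", "bob.smith", "bob_smith", ".bob"]:
    print(sample, "->", apply_rules(sample))
```

In this model bob/ is redirected to the slashless URL, bob and bob.smith are rewritten to the script, and bob_smith and .bob match neither rule.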

Jim

username

11:25 pm on Apr 19, 2009 (gmt 0)

Hi, so I am still having an issue with this. Currently I am using:

RewriteEngine on
RewriteRule ^(.+)/$ /members/index.php?name=$1 [NC,L]

This works for URLs with / on the end, but not for those without the trailing slash. I tried the rules from jdMorgan, but they caused Internal Server Errors. The "^(.+)/$" pattern has allowed my hyphens and dots to be included, and I can control the other variables elsewhere, but I still need control over the trailing-slash issue. Any ideas?

Thanks in advance.

g1smd

6:55 pm on Apr 20, 2009 (gmt 0)

Extensionless URLs should not end with a slash. So jd's code redirected to remove the slash, and then rewrote slashless requests to the internal filepath.

You have changed the rewrite to accept URLs that do include a slash. That's not a good idea if the redirect is still in place to take it off. That's a conflict.

I recommend that you use the slashless URL as the canonical form, and redirect to that.

username

10:54 pm on Apr 22, 2009 (gmt 0)

OK, so after testing a few versions, I can still only get a URL with a / on the end to work without throwing a 404. I have:

RewriteEngine on
RewriteRule ^(.+)/$ http://www.example.com/members/$1 [NC,R=301,L]
RewriteRule ^(.+)$ /members/index.php?name=$1 [NC,L]

Also, this rule does not like query string items such as: http://www.example.com/members/test/?page=1

It redirects back to http://www.example.com/members/test?page=1, which is also causing issues, but I am assuming this is just because the rewrite is incorrect.

Any ideas?

g1smd

11:18 pm on Apr 22, 2009 (gmt 0)

The query string will be automatically re-appended to the target URL of a redirect unless you add a question mark to the end of the target URL in the rewrite rule. The question mark says to 'clear the query string'.

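Note that the page=1 parameter is 'lost' in the rewrite, because the rewrite target supplies its own query string (?name=...), which replaces the original one. You'll need to use the [QSA] flag if you need to get it re-appended on the end of the path.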
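
The three cases (query string passed through, cleared, or appended) can be summarised in a short Python model. This is only a sketch of the behaviour described above, not Apache's implementation, and the function name resulting_query and its parameters are made up for illustration.

```python
def resulting_query(original_query, substitution, qsa=False):
    """Simplified model of the query string left after a RewriteRule.

    original_query: query string of the request without '?', e.g. "page=1"
    substitution:   the rule's target, e.g. "/members/index.php?name=test"
    qsa:            True if the rule carries the [QSA] flag
    """
    if "?" not in substitution:
        # No '?' in the target: the original query string is kept as-is.
        return original_query
    new_query = substitution.split("?", 1)[1]
    if not qsa:
        # Target has its own query string (possibly empty): original is dropped.
        return new_query
    # [QSA]: the original query string is appended after the new one.
    if new_query and original_query:
        return new_query + "&" + original_query
    return new_query or original_query

print(resulting_query("page=1", "http://www.example.com/members/test"))
print(resulting_query("page=1", "http://www.example.com/members/test?"))
print(resulting_query("page=1", "/members/index.php?name=test"))
print(resulting_query("page=1", "/members/index.php?name=test", qsa=True))
```

The four calls correspond to: a redirect that keeps page=1, a redirect with a trailing question mark that clears it, a rewrite that loses it, and a rewrite with [QSA] that ends up with name=test&page=1.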

username

12:39 am on Apr 23, 2009 (gmt 0)

OK, so the QSA flag worked well, and the redirect is also working well. However, no matter what RewriteRule I enter, URLs such as http://www.example.com/members/test (with no slash) will not run properly and cause an Internal Server Error (500). The redirect points to this URL, but the rule is failing. If I enter:

RewriteEngine on
RewriteRule ^(.+)$ /members/index.php?name=$1 [NC,L,QSA]

This does not work.

Only the rule for trailing slashes works:

RewriteEngine on
RewriteRule ^(.+)/$ /members/index.php?name=$1 [NC,L,QSA]

If I can understand why I cannot write a rule that accepts URLs with no slash on the end, this should be fine, i.e. http://www.example.com/test?page=1

Any ideas?