adding slash at the end of url

slash url aliases

         

thosecars82

1:43 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



Hello
I am trying to add a slash at the end of a particular URL in case the user has not entered it. To achieve this, I just placed this simple code:

RewriteCond %{REQUEST_URI} ^/home/language/en$
RewriteRule .* /home/language/en/?
I placed it right after the RewriteEngine on directive.

Could you please suggest any reason why these two lines are not doing this redirection?

http://www.example.com/home/language/en -> http://www.example.com/language/en/

What I see is this: when I browse http://www.example.com/home/language/en, the browser cannot open the page.

However, if I remove the two lines mentioned above and browse the same URL without the slash at the end, then thanks to the other code in my .htaccess the browser can open the corresponding page, that is to say, a page with the same content as if http://www.example.com/language/en/ had been browsed.

I do not want to have aliases, for SEO purposes, and therefore I would like to add this slash at the end of the URL.
Thanks in advance

thosecars82

1:45 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



Sorry, I have to write the post again because I made a mistake that I need to correct, and I do not know how to edit posts already submitted in this forum. So here is the corrected post:

Hello
I am trying to add a slash at the end of a particular URL in case the user has not entered it. To achieve this, I just placed this simple code:

RewriteCond %{REQUEST_URI} ^/home/language/en$
RewriteRule .* /home/language/en/?
I placed it right after the RewriteEngine on directive.

Could you please suggest any reason why these two lines are not doing this redirection?

http://www.example.com/home/language/en -> http://www.example.com/home/language/en/

What I see is this: when I browse http://www.example.com/home/language/en, the browser cannot open the page.

However, if I remove the two lines mentioned above and browse the same URL without the slash at the end, then thanks to the other code in my .htaccess the browser can open the corresponding page, that is to say, a page with the same content as if http://www.example.com/home/language/en/ had been browsed.

I do not want to have aliases, for SEO purposes, and therefore I would like to add this slash at the end of the URL.
Thanks in advance

jdMorgan

2:23 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That code does an internal rewrite, and you apparently need an external redirect. Also, the RewriteCond is not needed at all:

RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L]

This rule, placed in www.example.com/.htaccess, will redirect requests for example.com/home/language/en or www.example.com/home/language/en to www.example.com/home/language/en/ and remove any query strings appended to the requested URL.

Jim

thosecars82

2:36 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



Hello jdMorgan
Thank you very much. It worked once I added R=301 to my code, which now behaves the same as the simpler code you wrote. However, I do not understand why adding the R=301 flag fixes the problem in my code. With this flag, the redirection works the way I wanted. But why does it stop working if I change the external redirect into an internal rewrite? I should say that I am not using the L flag and it works for me. So why does adding R=301 to my code make such a difference in behavior? I thought this flag was only for telling other sites that link to you to change their links because they were old.
Thanks

jdMorgan

2:45 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You need an external redirect to avoid duplicate content "penalties" -- Problems related to allowing the same content to appear at more than one unique URL.

Always use the [L] flag, unless you have a very good reason that you do not want to. It can make mod_rewrite processing *far* more efficient.

Jim

thosecars82

3:08 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



Thank you
I think I understand it better now. It is just that I had to mull it over.
I understood:

->without R=301: the request is rewritten internally to a path that exists on the server, as you told me, and the user sees no change in the browser address bar.

->with R=301: the URL is redirected externally, that is to say, the user sees a change in the browser address bar. The new address might point to a physical path that exists on the server, or it might still be another virtual path, in which case another rule would have to rewrite it (without R=301) or redirect it (with R=301), since that path does not physically exist on the server.

Question.
Imagine this case: more than one rule with R=301 applies to a virtual URL entered by the user, because the rules change that virtual URL into other virtual URLs, one after another, until the physical path is reached. I know this example might not make much sense, because if all the rules applied in sequence have the R=301 flag, the result is the same as if only the last rule had it and the previous ones omitted it. However, if we did not omit this flag in any of these rules, would we see all the different URLs changing in the browser address bar, starting from the virtual URL entered by the user and passing through each intermediate virtual URL in order, until the last rule shows the final URL, that is to say, the physical (non-virtual) path on the server?

Thanks

jdMorgan

3:32 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It would be a mistake to have more than one redirect [R=30x] applied to any URL. Do not let this happen, as it will cost you in the search engines. Redirect from any 'bad' URL straight to the 'correct' URL in one step. Then you can apply any internal URL-to-filepath rewrite needed to 'connect' that final URL to the internal script or object filepath.

Also, never let an internal rewrite occur before an external redirect. If you do this, then the internal filepath will be 'exposed' to the client --browser or search engine robot-- and that is almost never desirable.
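
As a rough sketch of that ordering (the script name home.php and its language parameter are assumptions for illustration, not taken from the poster's actual .htaccess), the redirect is listed first and the internal rewrite second:

```apache
RewriteEngine On

# 1. External redirect first: add the missing trailing slash and force the
#    canonical hostname in one 301. The trailing "?" clears any query string.
RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L]

# 2. Internal rewrite second: connect the canonical URL-path to the script.
#    home.php and its "language" parameter are hypothetical.
RewriteRule ^home/language/([^/]+)/$ /home.php?language=$1 [L]
```

With the rules in this order, a client only ever sees the redirect to the final URL, while the script filepath stays internal.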

NOT using the [L] flag is useful in only one situation; When multiple internal rewrite steps are needed to convert a URL into an existing internal filepath. However, due to a bug which is still present in Apache/2.x, multiple rewrite steps don't always work properly anyway. And, if a bit of skill is applied, multiple rewrite steps are rarely ever needed (the URL Rewriting Guide in the Apache documentation itself contains some examples of unnecessary multi-step rewrites). So in almost all cases, mod_rewrite rules should end with [L].

You should know a couple of things. First, an external redirect ([R=30x]) ends the current HTTP transaction and requests that the client start a new one using the redirected-to URL; No information is preserved by the server from the redirect transaction into the resulting new HTTP transaction -- HTTP is a 'stateless' protocol.

Second, .htaccess is recursive; If any rule matches and is applied, then .htaccess processing re-starts from the beginning. This is necessary because some rewriterules (and mod_access directives) are used for access control, and the code must be re-run to be sure that the new internal filepath is not subject to access restrictions.

For these reasons, there is usually no reason to continue processing subsequent rewriterules after one has been applied; Either a new HTTP transaction will be invoked (and mod_rewrite will run again), or if the rule is applied, then .htaccess will be restarted from the top (and mod_rewrite will run again). Therefore, there are only a very few circumstances where a good reason exists not to use [L].

The .htaccess 'restart' is the reason that rewrite recursion (looping) must be explicitly prevented in .htaccess, usually by using a RewriteCond to prevent the rewritten URL-path from being rewritten again.
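
A minimal sketch of such a guard, assuming a hypothetical front-controller script index.php that receives the requested path in a "path" parameter:

```apache
RewriteEngine On

# Skip requests that already point at the (hypothetical) index.php, so the
# restarted .htaccess pass does not rewrite the rewritten URL-path again.
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule ^(.*)$ /index.php?path=$1 [L]
```

On the first pass the condition is true and the rule fires. On the restarted pass the URL-path is /index.php, the condition fails, and processing ends without a loop.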

Jim

g1smd

3:43 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



How much simpler is your code?

If you don't have http://www.example.com/ prefixed on the URL on the "right side" of the redirect rule, then you get this:

example.com/home/language/en --> 301 --> example.com/home/language/en/

www.example.com/home/language/en --> 301 --> www.example.com/home/language/en/

With the domain prefixed, both calls are correctly redirected to the www version. You do want that to happen at the same time as the other fix. This avoids duplicate content and/or a redirection chain.

.

To reiterate, a rewrite connects the path in the requested URL to a server filepath, and silently fetches the content.

A redirect forces the browser to make a new URL request. You should explicitly state the full URL - domain - path - filename - query string - because assumptions are going to be made for the "bits that are missing" - and those assumptions might not be right and may cause a problem.
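
To make the difference concrete, here is a sketch of the two ways the redirect target could be written. Only one of them would be used, so the first is commented out. How the missing host gets filled in depends on server settings such as UseCanonicalName, so treat the comments as the typical case rather than a guarantee.

```apache
# Path-only target: the server has to fill in the scheme and hostname itself,
# typically from the request, so example.com stays example.com and
# www.example.com stays www.example.com.
# RewriteRule ^home/language/en$ /home/language/en/? [R=301,L]

# Full-URL target: requests on either hostname land on www.example.com
# in a single hop.
RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L]
```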

thosecars82

4:12 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



Thank you very much for the information, g1smd.
1. I understand, as you say, that there will not be a redirection chain in the example you gave.

2. jdMorgan told me yesterday that the full URL should be placed on the right side of a RewriteRule with R=301, in contrast to a RewriteRule without R=301, which is the same thing that you are telling me now. I did not understand this point very well then, but I understand it better now bearing in mind what you said: "assumptions are going to be made..." So, as you said, my previous example could be written in the simplified way jdMorgan showed, or in two lines as I wrote it but changing the right side of the RewriteRule, that is to say:

RewriteCond %{REQUEST_URI} ^/home/language/en$
RewriteRule .* http://www.example.com/home/language/en/? [R=301,L]
Thanks

g1smd

4:12 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



For reference, under your user name on the left, the "owner edit" button allows you to edit a post for up to 60 minutes after initial posting.


thosecars82

4:13 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



Thanks for the information. Now I see it. I guess there are no signatures here.

g1smd

4:15 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Nope. No sigs. No ads. No link drops. Only information.

I like it that way, and so must a lot of other people. This forum has had 2.5 million posts since it started.

jdMorgan

4:59 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



or in two lines like I wrote

RewriteCond %{REQUEST_URI} ^/home/language/en$
RewriteRule .* http://www.example.com/home/language/en/? [R=301,L]

Why would you want to use two lines, which run twice as slow as one line? It is a waste of time, and there is no functional difference.


RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L]

is processed almost twice as fast.

Jim

thosecars82

5:04 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



OK, thanks. I guess I had not put much emphasis on performance yet, but I will consider these points. Thanks for the good advice, because it will help keep users from leaving the site just because pages do not load fast enough.


g1smd

5:04 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I used to believe that a RewriteCond was needed first, until I found out that the first thing that is processed is the left side of the RewriteRule.

It is therefore the left side of the RewriteRule that you want to make as specific as possible. If what you put there is still too wide (or you need to look at other things such as the query string, original full HTTP request, and so on) then, and only then, do you need to add a RewriteCond too.
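
As an illustration of a case where the pattern alone is not enough, here is a sketch that inspects the original HTTP request line. The script name home.php and its language parameter are hypothetical, chosen only to show the shape of such a rule:

```apache
# The RewriteRule pattern never sees the query string, so a RewriteCond is
# justified here. THE_REQUEST holds the original request line, for example
# "GET /home.php?language=en HTTP/1.1", and is unchanged by internal rewrites,
# so this rule only fires when a client asks for the script URL directly.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/home\.php\?language=([^&\s]+)\sHTTP/
RewriteRule ^home\.php$ http://www.example.com/home/language/%1/? [R=301,L]
```

The pattern ^home\.php$ is still as specific as it can be, and the condition is only evaluated for requests that already match it.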

thosecars82

5:12 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



I have this question:
Could I use
RewriteRule ^home/language/en$ http://%{HTTP_HOST}/home/language/en/? [R=301,L] 
or something like that instead of
RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L] 

to let this work not only in my remote server but also on my local server or in case I change my domain?
Thanks


g1smd

5:13 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think so, but do be sure that it doesn't cause the reappearance of the "www and non-www" problem that I documented above.

jdMorgan

6:30 pm on Oct 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Bad idea -- You end up with the same problem of not forcing a domain canonicalization at the same time as correcting the URL.

Better to define your own user-variable that contains either the canonical 'real' hostname or your local server name, and use that variable instead of HTTP_HOST:


# Set localhost as default hostname
RewriteRule .* - [E=MyHost:localhost]
# If any variant of "example.com" is requested
RewriteCond %{HTTP_HOST} example\.com [NC]
# Set default hostname to "www.example.com"
RewriteRule .* - [E=MyHost:www.example.com]
...
RewriteRule ^home/language/en$ http://%{ENV:MyHost}/home/language/en/? [R=301,L]
...
RewriteCond %{HTTP_HOST} example\.com [NC]
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

As shown, it's mostly useful for canonicalizing URL-paths, and not so useful for canonicalizing the hostname itself, since you have to check hostnames explicitly.

And why do you call your test server something different anyway? You can use the 'hosts' file to set up 'private DNS' to be used only on your testing computers, and point your 'real' domain name to 'localhost' or 127.0.0.1, or to a host on your LAN. When finished testing, comment out that line in the hosts file. Your test server should be an exact copy of your production server... Well, unless you enjoy frustration, grief, and woe. ;)

Jim

thosecars82

7:50 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



That seems a good reply and a good suggestion.
I like your suggestion, so I am going to try it. I included this line
127.0.0.1 www.example.com # For browser access
in my C:\WINDOWS\system32\drivers\etc\hosts
file, and if I browse http://www.example.com I only get the local XAMPP page. So, how do I get www.example.com to point to my site locally?
Thanks

thosecars82

8:06 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



OK, I think I got it working the way you say. It is kind of cool. Besides inserting the line I mentioned in the previous post, I had to change this file:
C:\xampp\apache\conf\httpd.conf
What I changed was this line:
DocumentRoot "C:/xampp/htdocs" to DocumentRoot "C:/xampp/htdocs/example.com"

I placed the example.com folder of my site example.com in the folder C:/xampp/htdocs and it is working nicely.
Thanks

thosecars82

8:39 pm on Oct 10, 2008 (gmt 0)

10+ Year Member



As for this
RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L] 

I have another question. It was great because it worked. But why does it stop working when I use this line instead?
RewriteRule ^/home/language/en$ http://www.example.com/home/language/en/? [R=301,L] 

If I am not wrong, the URI that the rewrite works on includes the leading /, but if this second line does not match the pattern I want, then I will have to change my mind. Nonetheless, when I print this variable with PHP, $_SERVER['REQUEST_URI'], I get /home/language/en
So why is it not working with my suggestion?
Thanks

g1smd

12:26 am on Oct 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The leading / is not present in the data presented to a RewriteRule when used in .htaccess - but it is there when you use a RewriteRule in httpd.conf.

The path information that is present is also localised such that if the .htaccess is in a folder then information about folders closer to the root is not included.

The REQUEST_URI is a different variable, and (I believe) it will always begin with the / wherever it is used.
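
A short sketch of what that means for the pattern in each context. Only the first rule is active, and the other two are commented-out variants for comparison:

```apache
# In .htaccess at the document root, the pattern is matched against
# "home/language/en" (no leading slash).
RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L]

# In httpd.conf (server or <VirtualHost> context), the pattern is matched
# against "/home/language/en".
# RewriteRule ^/home/language/en$ http://www.example.com/home/language/en/? [R=301,L]

# An optional leading slash lets the same pattern match in either context.
# RewriteRule ^/?home/language/en$ http://www.example.com/home/language/en/? [R=301,L]
```

If the .htaccess file lived in the /home/ folder instead, the pattern would only see "language/en".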

thosecars82

1:42 am on Oct 11, 2008 (gmt 0)

10+ Year Member



Question:
As for this rule:
RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L]
I thought I understood the meaning of the L flag. However, I have been mulling over this idea: if L stops the rewriting process, then after converting a URI like
/home/language/en into
/home/language/en/
no more rules would be applied. Furthermore, this last URI, /home/language/en/, is not a physical path to a file on the server, because the rule that rewrites this virtual path into the physical path with its query string would not have been applied yet. So I wonder why this works, showing me the physical page that the virtual address /home/language/en/ should point to, when no more rules are applied.

I have to say that I have a rule like this in my htaccess file

RewriteRule ^(intro|home|services|advice|webdesign|clases|contact|writeEmail|aboutme|notfound)/language/([^/]+)/ $1.php?language=$2 [L,NC]

which actually would rewrite the virtual address to the physical address. Nonetheless, this last rule would not have been applied because the first one in the order of rules of my htaccess file, that is to say,
RewriteRule ^home/language/en$ http://www.example.com/home/language/en/? [R=301,L]
with its L flag enabled would have stopped the execution of the process.

Well, I am just trying to understand this better.
Thanks a lot in advance for the information.

g1smd

1:52 am on Oct 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The [R=301] makes the browser issue/send a new HTTP request to the server for the new URL.

Once the new request is received, everything in .htaccess is then evaluated based only on this new request.

So, this time, the rewrite comes in to play.

.

You really do need to install Mozilla Firefox and the Live HTTP Headers extension and watch the "conversation" that your browser and the server have at the HTTP level.

thosecars82

10:05 am on Oct 11, 2008 (gmt 0)

10+ Year Member



Thank you g1smd. I think I am going to install that extension for Firefox. Thank you also for the explanation, because you have made it really clear for me. Now I understand how this .htaccess is processed with these two flags, L and R=301.

thosecars82

10:18 am on Oct 11, 2008 (gmt 0)

10+ Year Member



Oh, yeah, I just tested that extension and it is really cool. Thanks.

g1smd

8:14 pm on Oct 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Whenever you write code for .htaccess, it is imperative to test it with Live HTTP Headers, using a range of both expected and unexpected URL requests.

thosecars82

9:50 pm on Oct 11, 2008 (gmt 0)

10+ Year Member



Hello
That was another thing I was thinking about, and it is good that you mentioned it, because I was wondering exactly that: when putting a site on the Internet, or when adjusting it for SEO purposes, should it be considered good practice to check every possible weird (non-canonical or unexpected) string that a user might pass as the URL, just to return the corresponding error page or to avoid aliases? I see two points of view:

On one hand, I thought it might be enough to check for possible aliases with the Yahoo Site Explorer tool and with Google Webmaster Tools, and to remove those aliases by placing the proper code in the .htaccess file.
On the other hand, there might always be some aliases for some URLs of a site that have never been used in a link from any other site. Yahoo Site Explorer or Google Webmaster Tools might not discover these aliases of the canonical URLs, precisely because no other site has ever linked to them. For that reason, I might need to think not only about the errors detected by these tools, but also about every possible alias, non-canonical URL, or unexpected URL that someone might enter when addressing the site. This would be an extra effort. But would it be worth it? Would it be needed when placing a site on the Internet? Would it be needed for SEO purposes? Would it make a difference to handle these unexpected strings that the Yahoo and Google tools have not yet recognized?

What suggestions do you have to solve my dilemma?
Thanks
PS: If you use any other tool besides Google Webmaster Tools and Yahoo Site Explorer to deal with unexpected strings in the URL via code in the .htaccess, I would appreciate it if you let me know.

g1smd

12:02 am on Oct 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I make a list of all of these various test URLs and put them in a text file. The list includes all formats that could possibly work - whether or not any search engines have found them.

I then get Xenu's Link Sleuth to run through the whole list and generate a report. I am looking for all unwanted URL formats to either redirect or to fail with a 404 Error.

The current site I am testing has several hundred variations for each of five sample URLs, and a further list of several hundred more URLs of various types. The whole lot is duplicated for both www and non-www. For indexes, there are versions both with and without the index filename in the URL.

So, this list runs to over 1500 URLs. It takes Xenu's Link Sleuth under two minutes to check them all.