Mod Rewrite and the slash at the end

How will spiders see this?

         

punisa

9:40 am on Apr 23, 2008 (gmt 0)

10+ Year Member



Good day people,

My links on the site go like this:
www.mysite.com/category

But when you click it, it adds the slash so it becomes:
www.mysite.com/category/

This works ok, but how will google treat this?
I have this in htaccess:

RewriteCond %{HTTP_HOST} !^www.example.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !.php
RewriteCond %{REQUEST_URI} !.jpg
RewriteCond %{REQUEST_URI} !.flv
RewriteCond %{REQUEST_URI} !.html
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://www.example.com/$1/ [L,R=301]

Can I sleep without worrying ? : )
Thanks a lot

[edited by: jdMorgan at 2:12 pm (utc) on April 23, 2008]
[edit reason] example.com [/edit]

g1smd

11:24 am on Apr 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If it really is a folder URL, then the canonical form is that with the slash included so this is all OK.

However, DO check that the redirect really does return an HTTP status code of 301 in the HTTP response header.

A 302 redirect would be a disaster for this. Use Live HTTP Headers or somesuch Mozilla extension to check it.
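
For anyone who prefers a script to a browser extension, here is a minimal sketch of the same check in Python. It sends a single HEAD request and reports the raw status code, relying on the fact that the standard-library http.client module does not follow redirects by itself. The function name head_status is made up for this example, and the host and path in the comment are just the placeholders used in this thread.

```python
import http.client


def head_status(host, path, port=80, timeout=10):
    """Send one HEAD request and return (status, Location header or None).

    http.client does not follow redirects, so a 301 shows up as 301
    instead of being hidden behind the final 200.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example call (placeholders from this thread, not a real site):
#   head_status("www.example.com", "/category")
# A working slash redirect should report 301 plus the slash-terminated URL
# in Location; a 302 here is the case to worry about.
```

Since there is no browser cache involved, a stale cached response cannot affect the result.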

punisa

1:00 pm on Apr 23, 2008 (gmt 0)

10+ Year Member



Hi g1smd,

Thank you for your reply.
No, it is just "made" to look like a folder by using mod_rewrite, but I believe that is OK.

Unfortunately I'm not an expert on HTTP headers. Using the Firefox SEO tools, I get this for "view response headers":

Keep-Alive: timeout=5, max=96
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html

200 OK

This means my response header is 200? Is that bad ? : (
Once again I apologize for not understanding the subject quite clearly.

jdMorgan

2:11 pm on Apr 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In the interest of efficiency and robustness, I'd suggest the following:

RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [L,R=301]
#
RewriteCond $1 !/$
RewriteCond $1 !\.(jpg|flv|php|html)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) http://www.example.com/$1/ [L,R=301]

"RewriteBase /" is the default Apache behaviour; Unless you have preceding non-default RewriteBase directives, including the "RewriteBase /" directive doesn't change anything.

Note that the "file exists" check in the second rule-set is now done last, and only if the other two RewriteConds are true. This can make a significant improvement in your server's performance, since 'file exists' checks involve querying the OS and filesystem and possibly reading the disk each time; They are therefore very demanding on the server.
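
To make the ordering point concrete, here is a rough Python model of the second rule-set. It is only an illustration of the logic, not how Apache evaluates it internally: the function name needs_trailing_slash and the file_exists callable are invented for this sketch, with the callable standing in for the "!-f" filesystem test.

```python
import re

# Extensions that the rule leaves alone (same list as the RewriteCond).
SKIP_EXT = re.compile(r"\.(jpg|flv|php|html)$")


def needs_trailing_slash(path, file_exists):
    """Return True if the rule would redirect `path` to `path + "/"`.

    `file_exists` is a callable standing in for the "!-f" check. It is
    only called once both cheap string tests have already passed.
    """
    if path.endswith("/"):        # RewriteCond $1 !/$
        return False
    if SKIP_EXT.search(path):     # RewriteCond $1 !\.(jpg|flv|php|html)$
        return False
    return not file_exists(path)  # RewriteCond %{REQUEST_FILENAME} !-f
```

With this ordering, requests such as "category/" or "clip.flv" are rejected by string comparison alone, and the expensive lookup only runs for the remaining paths.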

As for your 200-OK response, the likely cause is that you did not completely flush your browser cache before testing. Therefore, the browser already had a cached copy of the redirect response, and did not request it from your server. Always completely flush your cache before testing any new code on your server, and if the server response depends on client-side request variations (e.g. the user-agent string), you will have to flush it before each test request.

Jim

punisa

9:37 am on Apr 25, 2008 (gmt 0)

10+ Year Member



Hi again,
unfortunately, after clearing every cache and temporary file possible, my response header is still 200 instead of 301.

I'm not sure why...
I have a custom 404 error page and the headers are ok there.

punisa

10:14 am on Apr 25, 2008 (gmt 0)

10+ Year Member



Me once more,
I did some testing on my localhost server.
Apache is configured ok, just as on my hosting. So I tried a small test like this:

RewriteBase /
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ [localhost...] [L,R=301]

It adds the slash at the end, but again the response header is 200, and not 301.

Imagine my confusion now: )

Every time I make an error I try to break my code down like this and find the dirty spot. Now there are three options:
1. The above code is missing something, a spelling mistake perhaps?
2. The tool I use to check the response header, the "Web Developer" add-on for Firefox, is not showing the correct response header.
(not likely, because it shows 404 without any problems)
3. I'm a noob and have a lot of recurring "duplicate content" nightmares

Thanks for reading

jdMorgan

4:55 pm on Apr 25, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> It adds the slash at the end, but again the response header is 200, and not 301.
> Tool I use to check the response header, add on for firefox "web developer" is not showing correct response header.

Please be specific: Do you see only one HTTP transaction, and a 200-OK response to a request for the URL with no slash, or do you see a 200-OK on a request for the URL with the slash added?

If you are only seeing one response, and it is the 200-OK for the URL with the slash added, then that indicates that the Web developer add-on is not showing all HTTP transactions, but only the final one. In that case, use the "Live HTTP Headers" add-on instead -- It will show all HTTP transactions unless you set filters to prevent it.

If you are only seeing one response, and it is the 200-OK for the URL without the slash added, then that indicates that the redirect is not being invoked due to an error in the mod_rewrite code, or that mod_rewrite is not being invoked at all (because it's not enabled).

Anyway, it would be quite helpful to know which URL the 200-OK response is being returned for.
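
If no add-on that lists every transaction is at hand, the same information can be collected with a short script. The following is a sketch only, assuming Python's standard library: trace_redirects is a made-up helper name, and it follows each Location header by hand so that every hop and its status code is recorded rather than just the last one.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, max_hops=5):
    """Return [(url, status), ...], one entry per HTTP transaction."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.netloc, timeout=10)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("HEAD", path)
            resp = conn.getresponse()
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Reading the result follows the cases above: two entries, a 301 for the no-slash URL and then a 200 for the slash URL, mean the redirect is working; a single 200 for the no-slash URL means the redirect was never invoked.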

Jim

punisa

9:57 pm on Apr 25, 2008 (gmt 0)

10+ Year Member



Hi again,
The Web Developer add-on apparently shows only the last transaction. I got "Live HTTP Headers" and it gives me much more info, thanks for the tip Jd ! : )

Checked my response and it gives me 200-OK slash or no slash, actually it gives me 200-OK all the time.
Anyway I kept investigating further.

I erased everything in htaccess on my test site, I've put only this:

RewriteEngine On
RewriteRule ^(.*)$ [mysite.com...] [R=301,L]

This should 301 redirect to my new domain, so I go ahead and punch this into address:

[localhost...]
It goes to:
[mysite.com...]

Just like it should, so everything works fine. Except, again, there is no 301, just 200-OK.

I have posted what Live HTTP headers add-on puts out below.

mod_rewrite is definitely enabled, because my whole new site, 2000+ pages, is built upon rewritten URLs and everything works great. In fact I believed everything was OK until I learnt about HTTP headers : )

Here are the results of the above simple experiment. I don't have much of a clue what this info means, but unfortunately there is no 301:

[mysite.com...]

GET /news/ HTTP/1.1
Host: www.mysite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 200 OK
Date: Fri, 25 Apr 2008 21:43:04 GMT
Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
----------------------------------------------------------

Host: www.mysite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 200 OK
Date: Fri, 25 Apr 2008 21:43:05 GMT
Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
----------------------------------------------------------

Host: www.mysite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 200 OK
Date: Fri, 25 Apr 2008 21:43:05 GMT
Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
----------------------------------------------------------

[edited by: jdMorgan at 10:26 pm (utc) on April 25, 2008]
[edit reason] Removed extraneous data [/edit]

punisa

9:28 pm on Apr 26, 2008 (gmt 0)

10+ Year Member



I don't know if anyone is following my problem, but I believe others may be having it. So in order to shed more light on the topic, I found this:
my "awstats" shows under HTTP errors: 301 Moved permanently (redirect) percent: 8.1 %

This looks very reasonable. I guess the 8.1 % is about the share of people who enter my pages without the slash at the end, so it redirects them to the page with the slash.
The fact that awstats shows this bit of info under the "HTTP errors" section doesn't mean it's actually an error, correct?

I still have no clue why all of my add-ons show only 200-OK, but the awstats log shows that my 301 is actually working... somewhere.

Anyway, if you have similar questions/problems tune in.
I'll keep on learning : |

mehh

1:20 pm on Apr 27, 2008 (gmt 0)

10+ Year Member



As jdMorgan already said, the 200 response may be for the page with the slash added. To check, you can use the WebmasterWorld tool: in the control panel, under plugins, click Server Headers, then enter the URL without the slash.

jdMorgan

1:40 pm on Apr 27, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Be sure to completely-flush your browser cache (and/or clear "Temporary Internet Files" in Internet Explorer) before each test.

Jim