Redirect htaccess

dolcevita

11:18 am on Jul 8, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I discovered that some parts of my site are indexed under URLs such as:

http://www.example.com/test.shtml/blog/blog/index.php
http://www.example.com/test.shtml/images/test.php

etc. I was really surprised, because I thought the server would respond with a 403 error. test.shtml exists only as a single file, not as a directory, yet the server still loads and shows the test.shtml page for the URLs above, only broken (without images, CSS, etc.), even though those paths do not exist.

I have no idea why the server shows the test.shtml page for such requests, but it must be stopped (I also have no idea how these URLs came to be indexed).
I would like to prevent this and redirect any bogus request of the form test.shtml/whatever-directory/ to the plain test.shtml page.

I am still having trouble setting up this redirection in .htaccess.
I can do it for a single page:

Redirect 301 /test.shtml/whatever/ http://www.example.com/test.shtml
Redirect 301 /test.shtml/index.shtml http://www.example.com/test.shtml

but I do not know how to cover all such requests with one directive.

Or maybe it is a good idea to block them through robots.txt:
User-agent: *
Disallow: /test.shtml/blog/blog/
Disallow: /test.shtml/images/

I'm just not sure whether that would also block the file test.shtml itself, which I do not want.

Thanks

[edited by: jdMorgan at 8:21 pm (utc) on July 8, 2007]
[edit reason] example.com [/edit]

dolcevita

6:20 pm on Jul 8, 2007 (gmt 0)


Hmm... Is it possible that nobody knows?

jdMorgan

6:37 pm on Jul 8, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I have no idea why the server shows the test.shtml page for such requests

Apache maps the URL-path onto the filesystem one segment at a time. As soon as it reaches a segment that is a real file (here, test.shtml), it stops there and passes everything after it to that file's handler as extra path information (PATH_INFO). So to Apache,
http://www.example.com/test.shtml/blog/blog/index.php
and
http://www.example.com/test.shtml
refer to the same file.

Apache gives a precise meaning to the "/" character in URLs and uses it to parse URL-paths. Whether the extra path information is accepted is up to the handler for the file and, on Apache 2.x, the AcceptPathInfo directive. As we see in this case, the handler for test.shtml accepts the request and simply ignores the extra path information after the filename.
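
As a rough sketch of the server-side alternative: assuming Apache 2.0.30 or later and that FileInfo overrides are allowed in .htaccess (both are assumptions, not confirmed for this server), the extra path could be refused outright instead of being served:

```apache
# Sketch only. Assumes Apache 2.0.30+ and "AllowOverride FileInfo" for this directory.
# With AcceptPathInfo Off, a request is accepted only if it maps to a path that
# really exists, so /test.shtml/blog/blog/index.php returns 404 Not Found
# instead of the contents of test.shtml.
<Files "test.shtml">
    AcceptPathInfo Off
</Files>
```

This answers with a 404 rather than a 301, so it stops the bogus URLs from being served but does not point visitors or search engines to the correct page. The redirects in this thread do that.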

To remove these incorrect URLs from search engine listings, you can redirect them using the mod_alias RedirectMatch directive or mod_rewrite. Since RedirectMatch is easier to use, here is an example:


RedirectMatch 301 ^/test\.shtml(/.*)$ http://www.example.com/test.shtml

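It is also possible to generalize this function: If you wish to redirect *any* URL that has something after its first file extension (extra path segments or a second "."), then try something like this:

RedirectMatch 301 ^/([^.]*\.[0-9A-Za-z]+)([^0-9A-Za-z].*)$ http://www.example.com/$1

The second group must start with a non-alphanumeric character. Without that restriction the pattern can backtrack and match an ordinary URL such as /index.html as "index.htm" plus "l", redirecting it to a truncated name. Note that this general rule also catches directory names that contain a ".".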

Jim

dolcevita

7:23 pm on Jul 8, 2007 (gmt 0)


Many thanks.
RedirectMatch 301 ^/test\.shtml(/.*)$ http://www.example.com/test.shtml

works great.

There are only a few cases where something extra is left behind.
For example:

http://www.example.com/test.shtml/viewtopic.php?p=19190

becomes

http://www.example.com/test.shtml?p=19190

or

http://www.example.com/test.shtml/protect/phpBB2/viewtopic.php?p=17336

becomes

http://www.example.com/test.shtml?p=17336

Is it possible to get rid of ?p=17336 or ?p=19190 or whatever else follows test.shtml?

Thanks

g1smd

7:42 pm on Jul 8, 2007 (gmt 0)


You need to suppress parameters from being added back to the final URL.

dolcevita

7:46 pm on Jul 8, 2007 (gmt 0)


I know that, but can you give me a practical example based on Jim's already excellent example:

RedirectMatch 301 ^/test\.shtml(/.*)$ http://www.example.com/test.shtml

Thanks

jdMorgan

8:20 pm on Jul 8, 2007 (gmt 0)


You'll need to use mod_rewrite instead of mod_alias to remove the query strings:

Options +FollowSymLinks
RewriteEngine on
#
RewriteRule ^test\.shtml/ http://www.example.com/test.shtml? [R=301,L]

The first line (the Options directive) will be required on some servers, and not needed and not allowed on other servers. The only way to find out is to test. If you have multiple rules or add more rules later, the first two lines are only required once in your .htaccess file.
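
For readers on newer servers, here is a hedged variant of the same rule. It assumes Apache 2.4 or later, where the QSD ("query string discard") flag exists. The second rule is a hypothetical extra, not something requested above in this exact form: it also strips a query string from a plain test.shtml request, and is only appropriate if test.shtml never needs query parameters.

```apache
# Sketch only. Assumes Apache 2.4+ (the QSD flag is not available in earlier versions).
RewriteEngine on

# Same effect as ending the target with "?": redirect anything under
# test.shtml/ to test.shtml and drop the original query string.
RewriteRule ^test\.shtml/ http://www.example.com/test.shtml [R=301,L,QSD]

# Hypothetical addition: if test.shtml is requested directly with a
# non-empty query string, redirect to the bare URL. After the redirect the
# query string is empty, so the condition fails and no loop occurs.
RewriteCond %{QUERY_STRING} .
RewriteRule ^test\.shtml$ http://www.example.com/test.shtml [R=301,L,QSD]
```

On older servers the trailing "?" form shown above remains the way to clear the query string.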

Jim

dolcevita

8:39 pm on Jul 8, 2007 (gmt 0)


It works. Thanks for your time and all help.