
Apache Web Server Forum

This 37 message thread spans 2 pages; this is page 2.
index.php redirect breaking 404 pages.
boasting_j




msg:4520916
 11:47 pm on Nov 20, 2012 (gmt 0)

Howdy,

I have recently set up an index.php redirect through .htaccess. The idea here is to negate the duplicate content issue that crops up when a site has both index.php and / (the homepage) getting indexed.

I used the technique listed here.

[askapache.com ]

It works great too. The one issue is, it breaks the 404 pages.

So if a user types in or goes to www.example.com/dafjkadbfda, instead of serving the 404 page, what happens is the URL stays the same (in this case the broken one) and the server serves the index.php page.

This in turn is opening another can of worms, in that all those broken pages are coming up as duplicate content and meta. So while this is somewhat SEO related, it does have to do with the .htaccess. :) This has been an issue on many sites where I thought the .htaccess redirect was working. Thanks in advance.
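
For readers following along, here is a minimal sketch of the kind of index.php redirect being discussed. It is not the askapache.com code, which is not reproduced in this thread; the hostname www.example.com and the THE_REQUEST guard are assumptions. The guard limits the redirect to URLs the client literally asked for, so an internal rewrite or error handler that ends up at index.php is not redirected again. Whether that cures the 404 symptom depends on the rest of the .htaccess, which is not shown here.

```apache
RewriteEngine On

# Redirect only when the client itself requested .../index.php.
# THE_REQUEST holds the raw request line, e.g. "GET /index.php HTTP/1.1".
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/([^/]+/)*index\.php[?\s]
# $1 is the folder path before index.php (empty for the site root).
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1 [R=301,L]
```

With this sketch, a request for /index.php would be sent to http://www.example.com/ and a request for /blog/index.php to http://www.example.com/blog/.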

 

lucy24




msg:4526544
 8:44 am on Dec 10, 2012 (gmt 0)

msg:4526235 [webmasterworld.com] mentioned the path part of the 3 URLs, but didn't clarify the requested hostname for the middle step.

It's the yellow one halfway down the page ;)

Isn't the point that it's supposed to work cleanly with any hostname?

Oh, and I just realized:

RewriteRule ^index\.php(/(.*))?$ http://www.example.com/$1 [R=301,L]

means

index.php(/blahblahhere) >> http://www.example.com//blahblahhere

index.php(/) >> http://www.example.com//

Get rid of that slash in the target.

g1smd




msg:4526545
 8:56 am on Dec 10, 2012 (gmt 0)

I see
rewriterule ^index\.php(/(.*))?$ http://www.example.com$1 [R=301,L]
in the posted code, with the target ending in .com$1

I would use
RewriteRule ^index\.php(/(.*))?$ http://www.example.com/$2 [R=301,L]
here for clarity. This also means the target always correctly ends with a trailing slash when $2 is empty.

Additionally, there's a typo in the non-www/www redirect:
rewriterule (.*) http://www.example.com/$2 [R=301,L]
should be
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
as $2 will always be empty.
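
The two corrections above can be sketched together as follows. The group numbering in the comments follows directly from the patterns quoted in this thread; the RewriteCond on the hostname is an assumption, since only the rule line of the non-www/www redirect is shown here.

```apache
# (/(.*))? contains two groups:
#   $1 = "/rest" (outer group, includes the slash)
#   $2 = "rest"  (inner group, no slash)
# A target of .com/$1 would give ".com//rest"; .com/$2 gives ".com/rest",
# and ".com/" when nothing follows index.php.
RewriteRule ^index\.php(/(.*))?$ http://www.example.com/$2 [R=301,L]

# Hostname canonicalisation. The pattern (.*) has a single group,
# so only $1 is ever populated. The condition below is assumed.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```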

Sgt_Kickaxe




msg:4526593
 1:29 pm on Dec 10, 2012 (gmt 0)

I fixed the typo, thanks g1smd, and I've tried the changes you mentioned, Lucy, but nothing changes. For simplicity I removed both the index.php/ and www rules completely, leaving ONLY the one-line rewriterule, and that rule fails. So the problem is with that rule, somehow. e.g.

RewriteRule ^some-old-url$ http://www.example.com/my-new-url [R=301,L]

visiting www.example.com/index.php/some-old-url results in a 404 error.

visiting www.example.com/anything-here/some-old-url ALSO fails to redirect.

visiting www.example.com/some-old-url results in a 301 redirect to the right page.


I thought that the rewriterule above was supposed to capture any url ending in some-old-url but when there is a directory in the url it doesn't? So right now my site has to redirect once to remove the index.php/, and only then does the first rule match, so it immediately redirects a second time.

RewriteRule ^index.php/some-old-url$ http://www.example.com/my-new-url [R=301,L]
works, so which catch-all should I use to handle the index.php/ without opening other cans of worms?

I ordered an .htaccess book for myself from Amazon btw - I'm not sure if that will be a present or punishment :)

edit: the following works but can it be improved?
RewriteRule ^(.*/)?some-old-url$ http://www.example.com/my-new-url [R=301,L]

would it be more efficient to use
RewriteRule ^(index.php\/)?some-old-url$ http://www.example.com/my-new-url [R=301,L]

lucy24




msg:4526662
 6:09 pm on Dec 10, 2012 (gmt 0)

I thought that the rewriterule above was supposed to capture any url ending in some-old-url

Ah ha! You've got a beginning anchor. If you want to look only at the end of the URL, regardless of what comes before it, you have to leave off the anchor.

And don't bother about ^(.*/) because g1 will tear your head off and it isn't worth it :) Luckily there are approximately ten thousand earlier posts in this forum showing the correct way to capture the beginning of an URL if you need to save the part before your target text. Here maybe you don't, so just omitting the anchor is enough.

g1smd




msg:4526667
 6:19 pm on Dec 10, 2012 (gmt 0)

Never use (.*) at the beginning of the pattern. The ^(.*) means read the entire URL all the way to the very end. Replace with ^(([^/]+/)*)index\.php to match and capture optional folder levels.

RewriteRule ^thispage$ -- matches a request for example.com/thispage or www.example.com/thispage

If you want to match a request for example.com/index.php/thispage you will need
RewriteRule ^index\.php/thispage$

The ^ means "begins with". It's good to include it, otherwise the rule might match other URL requests that it should not.
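
To make the anchoring behaviour concrete, here is a hedged side-by-side sketch using the some-old-url example from earlier in the thread. These are alternatives, not a rule set: pick one. It assumes the rules live in the site root .htaccess, where the pattern is tested against the path without its leading slash.

```apache
# A: matches exactly "some-old-url" and nothing else.
RewriteRule ^some-old-url$ http://www.example.com/my-new-url [R=301,L]

# B: matches "some-old-url" or "index.php/some-old-url" only.
RewriteRule ^(index\.php/)?some-old-url$ http://www.example.com/my-new-url [R=301,L]

# C: matches any number of leading folder levels without a leading (.*),
#    e.g. "anything-here/some-old-url" and "index.php/some-old-url".
RewriteRule ^([^/]+/)*some-old-url$ http://www.example.com/my-new-url [R=301,L]

# D: no start anchor. Matches every path that ends in "some-old-url",
#    including unintended ones such as "not-some-old-url".
RewriteRule some-old-url$ http://www.example.com/my-new-url [R=301,L]
```

Variant D is the simplest, but it is also the one most likely to catch URLs it should not, which is the reason for keeping the ^ anchor where possible.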

Is AcceptPathInfo enabled or disabled on this site?

Sgt_Kickaxe




msg:4526710
 9:44 pm on Dec 10, 2012 (gmt 0)

Thanks guys (and gals).

Sgt_Kickaxe




msg:4526857
 8:08 am on Dec 11, 2012 (gmt 0)

Oh and AcceptPathInfo is disabled.
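
For context, a hedged note on why that setting matters. With path info refused, a request such as /index.php/some-old-url that no RewriteRule catches is answered with a 404 instead of being handed to index.php, which fits the result reported earlier in the thread. AcceptPathInfo is a standard Apache core directive; placing it in .htaccess assumes AllowOverride FileInfo is in effect.

```apache
# Refuse trailing path info after a real file:
# /index.php/some-old-url returns 404 unless a RewriteRule
# has already matched and redirected the request.
AcceptPathInfo Off
```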
