
Sitemaps, Meta Data, and robots.txt Forum

    
Help: htaccess rewrite rule
bramley




msg:4291256
 6:10 pm on Apr 2, 2011 (gmt 0)

Need help with a Rewrite rule

I need to change urls like this :

domain.com/beijing/index.php/component/messaging/message/components/com_kunena/template/default/images/index.php?option=com_content&view=article&id=603:luge-slide-at-mutianyu-great-wall-video&catid=45:outings&Itemid=66

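So that the middle part (everything from the first index.php/ through images/, originally shown in bold) is omitted.

In other words, find index.php? and append everything from there on to domain.com/beijing/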

RewriteRule ^index.php?([^/]+) domain.com/beijing/index.php?$1

This doesn't work; I think the ? is being treated as a special character. But I need it, because I don't want to match from the first /index.php/, only from index.php?...

I thought about something like this

RewriteRule %{QUERY_STRING} /beijing/index.php?%{QUERY_STRING}

but it seems this is not the correct way to write the rule.

Any help will be much appreciated.

 

g1smd




msg:4291258
 6:16 pm on Apr 2, 2011 (gmt 0)

You need a RewriteRule to do the rewrite, noting that the RewriteRule RegEx pattern can see only the path part of the URL request.

You also need a preceding RewriteCond looking at the QUERY_STRING, and there are several thousand such examples in the WebmasterWorld Apache forum.
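
As a rough sketch of that shape (not a tested rule): the RewriteCond looks at the query string, and the RewriteRule pattern looks only at the path. This assumes the .htaccess file sits in the /beijing/ folder, so the pattern sees the path with that prefix already stripped.

```apache
RewriteEngine On

# Condition: proceed only when the request has a non-empty query string.
RewriteCond %{QUERY_STRING} .

# Rule: the pattern is tested against the path only, never the "?..." part.
# Assumption: this file lives in /beijing/, so the path starts at "index.php/".
# The substitution contains no "?", so the original query string is
# passed through unchanged.
RewriteRule ^index\.php/.+/index\.php$ /beijing/index.php [R,L]
```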

bramley




msg:4291260
 6:27 pm on Apr 2, 2011 (gmt 0)

Thanks g1smd.

Need to check there is a query string - I get that.

Not sure how to match the index.php? with the ?

Does ? have a special significance in rewrite rules ?

bramley




msg:4291266
 6:31 pm on Apr 2, 2011 (gmt 0)

Yes, ? does have a special use.

Can I use \? ?

When you say 'can only see the path part of the URL', do you mean up to but not including the query string?

g1smd




msg:4291278
 7:02 pm on Apr 2, 2011 (gmt 0)

Yes, RewriteRule sees only the path, not the hostname or query string.

A RewriteCond must be used to detect protocol, domain name, port number, or query string data (one RewriteCond for each).
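
For illustration only, here is how one condition per item might look. The hostname is the thread's placeholder, the option=com_content test is taken from the example URL, and the final rule is a deliberate no-op so that only the conditions matter.

```apache
RewriteEngine On

# Conditions are ANDed by default; each one tests a single server variable.
RewriteCond %{HTTPS} off
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteCond %{SERVER_PORT} ^80$
RewriteCond %{QUERY_STRING} (^|&)option=com_content(&|$)

# Placeholder action: "-" leaves the URL untouched.
RewriteRule ^index\.php$ - [L]
```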

bramley




msg:4291286
 7:25 pm on Apr 2, 2011 (gmt 0)

Almost there :)

Now have :

RewriteRule ^index.php/(.*)index.php?(.*) /beijing/index.php?$2 [R]

This detects the strange URLs fine but redirects to

/beijing/index.php

The last part : ?$2 is not showing up :(

bramley




msg:4291289
 7:35 pm on Apr 2, 2011 (gmt 0)

Got it :)

RewriteRule ^index.php/(.*)index.php /beijing/index.php [R]

g1smd




msg:4291290
 7:41 pm on Apr 2, 2011 (gmt 0)

Swap (.*) for ([^/]+/)+ to quickly recurse folder levels. Escape literal periods in patterns using \. instead of . here.

The [R] produces a 302 redirect. Change to [R=301,L] and add the protocol and domain name to the rule target.
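
Put together, those changes might look something like the sketch below. The http scheme and the domain.com hostname are assumptions (swap in the real ones), and the file is assumed to live in /beijing/.

```apache
RewriteEngine On

# Escaped periods, and ([^/]+/)+ to step through one folder at a time.
# No "?" in the target, so the original query string is passed through
# unchanged. A target ending in "?" plus an empty back-reference would
# drop it instead.
RewriteRule ^index\.php/([^/]+/)+index\.php$ http://domain.com/beijing/index.php [R=301,L]
```

The pattern never needs a ? at all: the query string is not part of what RewriteRule matches, and an unescaped ? in a pattern only makes the preceding character optional.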

bramley




msg:4291357
 12:47 am on Apr 3, 2011 (gmt 0)

Good ideas g1smd.

But in my case the pattern is always index.php/[various number of directories]/index.php?a=b So what I have works fine

I did try escaping the .s but it didn't work like that. Don't know why, what a . means or why what I have does work, but it does :)

I'll check on the 302/301 and what L means.

Thanks again!

bramley




msg:4291359
 12:57 am on Apr 3, 2011 (gmt 0)

"Note: if you add an "L" flag to the mix; meaning "Last Rule", e.g. [R=302,NC,L]; Apache will stop processing rules for this request at that point, which may or may not be what you want. Either way, it's useful to know."

- [corz.org...]

I already made it the last rule - it is last in the rewrite rules list. I am hoping I have fixed the reason why my Joomla is causing the bot to find/create these URLs, so the rule will activate only as a last resort.

I'll let the 302 run for a while first until I'm sure all is well (by checking WMT).

g1smd




msg:4291362
 1:22 am on Apr 3, 2011 (gmt 0)

The . matches ANY character and \. matches only a literal period.

The .* forces thousands of "back off and retry" trial matches. Use ([^/]+/)+ to parse the URL once from left to right, which is very much faster.

Add the [L] flag to EVERY rule, otherwise you can trigger a nasty Apache bug.
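
A side-by-side sketch of the difference, using the rule from earlier in this thread (the loose version is commented out; paths assume the .htaccess is in /beijing/):

```apache
# Loose: each "." matches ANY character, so a path such as
# "indexaphp/x/indexZphp" would also match, and (.*) grabs the whole
# rest of the path before backing off character by character.
# RewriteRule ^index.php/(.*)index.php$ /beijing/index.php [R,L]

# Tight: "\." matches only a literal period, and ([^/]+/)+ consumes one
# folder at a time, left to right.
RewriteRule ^index\.php/([^/]+/)+index\.php$ /beijing/index.php [R,L]
```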

bramley




msg:4291364
 1:27 am on Apr 3, 2011 (gmt 0)

It's curious about the .

I'll leave it as is because it's been quite a headache to reach this point and there will not be any urls with indexaphp in (or such).

I'll change the other two straight away.

bramley




msg:4291365
 1:34 am on Apr 3, 2011 (gmt 0)

Changed to use ([^/]+/)+ and [R,L]
and tried some urls straight from WMT and it's fine :)

Thanks g1smd !

g1smd




msg:4291415
 7:27 am on Apr 3, 2011 (gmt 0)

[R,L] gives a 302 redirect.

The . vs. \. is important. Ignore it at your peril. Don't leave loopholes that can be exploited.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved