
Apache Web Server Forum

    
Stripping out a particular string from URLs
Question about my logic
Patrick Taylor




msg:3977586
 8:29 am on Aug 24, 2009 (gmt 0)

My personal website uses WordPress and I've noticed it generates a type of incorrect URL.

GENERATED (INCORRECT) URL EXAMPLES:
h*tp://www.example.com/alphanumeric-string/comment-page-1#comment-4357
h*tp://www.example.com/alphanumeric-string/comment-page-2#comment-4357
h*tp://www.example.com/alphanumeric-string/comment-page-3#comment-4357
CORRECT URL EXAMPLE:
h*tp://www.example.com/alphanumeric-string#comment-4357

GENERATED (INCORRECT) URL EXAMPLES:
h*tp://www.example.com/alphanumeric-string/comment-page-1
h*tp://www.example.com/alphanumeric-string/comment-page-2
h*tp://www.example.com/alphanumeric-string/comment-page-3
CORRECT URL EXAMPLE:
h*tp://www.example.com/alphanumeric-string

So I think I need my .htaccess to strip out "/comment-page-[any number]".

My logic is:
h*tp://www.example.com/[anything] /comment-page-[any number] [anything]
... to become:
h*tp://www.example.com/[anything][anything]

My proposed rule is:

RewriteRule ^(.*)(/comment-page-)([0-9])(.*)$ /$1/$4 [R=301,L]

Is my logic correct, and do I need a RewriteCond before the RewriteRule?

Patrick

 

jdMorgan




msg:3977770
 3:33 pm on Aug 24, 2009 (gmt 0)

Get rid of unnecessary back-references and make your rule as specific as possible for the sake of efficiency:

RewriteRule ^([0-9A-Za-z]+)/comment-page-[0-9]+(.*)$ http://www.example.com/$1/$2 [R=301,L]

However, be aware that this is not the 'solution' to your problem. Adding this rule is only useful after you fix the root cause, which is that your WP installation (or a plug-in) is *defining* these incorrect URLs on the Web by publishing them on your pages. Once the URLs appear on your pages where users and search engines can see them, no amount of mod_rewrite code is going to fix that. The act of publishing a URL in a link is what 'defines' that URL for the world. The only use for this mod_rewrite rule is to speed up the cleanup of search engine listings once the root cause is fixed and your site starts publishing correct URLs.

There are a couple of tutorials about what mod_rewrite can and can't do for you in our Apache Forum Library here at WebmasterWorld if you'd like to read more details.

Jim

[edited by: jdMorgan at 4:25 pm (utc) on Aug. 24, 2009]
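
The two patterns quoted above can be compared with a small sketch. It is only an approximation: it uses Python's `re` module rather than Apache's regex engine (the two agree for patterns this simple), it assumes the rules run in a per-directory (.htaccess) context where the leading slash is stripped before matching, and the helper `redirect_target` and the sample paths `post` and `my-post` are hypothetical names introduced here for illustration. The `#comment-4357` fragment is left out because browsers never send fragments to the server, so neither rule ever sees it.

```python
import re

# Patterns and substitution strings copied from the two RewriteRule lines above.
PROPOSED = re.compile(r"^(.*)(/comment-page-)([0-9])(.*)$")
PROPOSED_TARGET = "/$1/$4"

REVISED = re.compile(r"^([0-9A-Za-z]+)/comment-page-[0-9]+(.*)$")
REVISED_TARGET = "http://www.example.com/$1/$2"


def redirect_target(pattern, template, path):
    """Hypothetical helper: emulate the substitution step of a RewriteRule.

    `path` is the URL-path without its leading slash (the assumed .htaccess
    view). Returns the substituted target, or None if the pattern does not
    match and the rule would therefore not fire.
    """
    match = pattern.match(path)
    if match is None:
        return None
    # Replace each $N back-reference with the text captured by group N.
    return re.sub(r"\$(\d)", lambda ref: match.group(int(ref.group(1))) or "", template)


# [0-9] matches a single digit, so the second digit leaks into $4.
print(redirect_target(PROPOSED, PROPOSED_TARGET, "post/comment-page-12"))
# /post/2

# [0-9]+ consumes the whole page number; $2 is empty, leaving a trailing slash.
print(redirect_target(REVISED, REVISED_TARGET, "post/comment-page-12"))
# http://www.example.com/post/

# [0-9A-Za-z]+ does not include "-", so a hyphenated first segment never matches.
print(redirect_target(REVISED, REVISED_TARGET, "my-post/comment-page-1"))
# None
```

If the real post slugs contain hyphens, as the placeholder "alphanumeric-string" suggests they might, the character class in the revised rule would probably need to allow "-" as well. Whether the trailing slash left by the empty `$2` is wanted depends on the site's permalink format, so both points are worth checking against a few live URLs before relying on the rule.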

Patrick Taylor




msg:3977793
 3:58 pm on Aug 24, 2009 (gmt 0)

Jim, thanks for the clean-up.

Point taken regarding fixing the source of the problem. In this instance there's not much I can do about it without editing a WordPress core file, which will be overwritten with each upgrade. Fortunately I have very few 'problem' links, and those I do have appeared only yesterday.

And yes, over the years I've received a lot of assistance with .htaccess both from you and from the Apache Forum Library. The trouble is, I dip into mod_rewrite only every now and then, and my understanding of what I've done fades with time.

Many thanks again.

Patrick


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved