
Apache Web Server Forum

    
mod rewrite help
floid78



 
Msg#: 4616479 posted 9:04 pm on Oct 13, 2013 (gmt 0)

I'm just preparing to upgrade my old vBulletin 3.8 board with Zoints SEO addon to vBulletin 4.2.

The Zoints SEO rewrite rules basically did this:

/forumname-f1/index.html -> forumdisplay.php?f=1

/topicname-t123/index.html -> /showthread.php?t=123

/topicname-t123/index2.html -> /showthread.php?t=123&page=2

RewriteRule ^([a-z0-9_\-]*-(f|all)[0-9]+(p[0-9]+|/index[0-9]*)?\.html)$ forumdisplay.php/$1 [QSA,L]
RewriteRule ^([a-z0-9_\-]*-(t|p)[0-9]+(p[0-9]+|/index[0-9]*)?\.html)$ showthread.php/$1 [QSA,L]


Putting those into the .htaccess file of vB4 doesn't redirect correctly. Some research and trial and error led me to these rules:


RewriteRule ^[a-z0-9_\-]*-f([0-9]+)(p[0-9]+|/index[0-9]*)?\.html$ forumdisplay.php?f=$1 [QSA,L,R=302]
RewriteRule ^[a-z0-9_\-]*-(t|p)([0-9]+)(p[0-9]+|/index[0-9]*)?\.html$ showthread.php?$1=$2 [QSA,L,R=302]


First rule for displaying sub-forums works, second one for displaying threads works as well, except for pagination (example 3). So the second page of a thread is still being redirected to the first page. So what's missing to make it work?

Any help would be appreciated!

 

lucy24




 
Msg#: 4616479 posted 10:54 pm on Oct 13, 2013 (gmt 0)

Wait, wait.

Your first package of rules involves rewrites. The second package involves redirects. (302, at that. Is this temporary, for testing?) So in each case we're only seeing half of the question.
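
To make the rewrite/redirect distinction concrete, here is a minimal sketch of one pattern written both ways. The names old.html and new.php are placeholders, not files from this thread, and only one of the two rules would be used at a time.

```
# Internal rewrite: the server quietly serves new.php,
# and the browser's address bar keeps showing old.html.
RewriteRule ^old\.html$ /new.php [L]

# External redirect: the server answers with a 302 and a Location header,
# and the browser makes a second request for /new.php.
RewriteRule ^old\.html$ /new.php [R=302,L]
```

The only difference is the R flag: with it the visitor sees the new URL, without it the visitor never learns that new.php exists.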

It's good the old rules are being retired, because there are some issues:
a. Target of a rewrite should have a leading slash
b. "QSA" implies that the original request might already have a query string-- but why would it, if the URL is in HTML?
c. Rule as written allows for requests for explicit "index.html", which should have been forcibly redirected.
d. What's with [a-z0-9_\-]*- ? Seems like that would allow for requests with leading slash-- to say nothing of the massive backtracking involved in the rest of the rule, since [a-z0-9_\-] matches almost everything.
e. Don't quite care for URLs in .php/more-stuff-here
f ...well, there's probably more stuff lurking among those pipes.

Most RegEx engines are perfectly happy with \w meaning [A-Za-z0-9_] -- in other words, almost everything you'll ever see in an URL, except hyphens and ::shudder:: literal periods. Similarly \d for [0-9] at a savings of three bytes. And, of course, [tp] instead of (t|p), or ([tp]) if you need a separate capture.
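
As a sketch of what those shorthands could look like, here is the old Zoints thread rule with only the character classes changed. It assumes Apache's PCRE-based regex engine, and one difference is worth noting: \w also matches uppercase letters, which the original [a-z0-9_\-] class did not.

```
# Same rule as the original showthread.php line, with
#   [a-z0-9_\-]  ->  [\w-]
#   [0-9]        ->  \d
#   (t|p)        ->  [tp]
# The inner alternation is made non-capturing because only $1 is used.
RewriteRule ^([\w-]*-[tp]\d+(?:p\d+|/index\d*)?\.html)$ showthread.php/$1 [QSA,L]
```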

The quoted rules don't show how pagination was handled in the old pattern. I'd expect something involving
topicname-(\d+)/index(\d+)\.html >> t=$1&p=$2

But the most important question is...

Why are you changing URLs at all? Why not keep the old URLs and rewrite to the files' new location? If you do want to change the URL, why not take the opportunity to change them to something pretty and user-friendly?

g1smd




 
Msg#: 4616479 posted 11:07 pm on Oct 13, 2013 (gmt 0)

You've said what the URLs used to look like.

What do the URLs that users should see in the browser address bar now look like?

You'll need a redirect from old to new URL only if the URL format has changed.

You'll need to amend the current rewrites (but still keeping them as rewrites) if either the URLs have changed or the internal script names have changed.

floid78



 
Msg#: 4616479 posted 11:26 pm on Oct 13, 2013 (gmt 0)

> Why are you changing URLs at all? Why not keep the old URLs and rewrite to the files' new location? If you do want to change the URL, why not take the opportunity to change them to something pretty and user-friendly?


I guess I should have explained better. Unlike vBulletin 3.8, vBulletin 4.2 has a built-in SE-friendly URL function. With vB 3.8 I was using the add-on I mentioned, "Zoints SEO". I really can't say much about the problems you described with those old rules. I only know they came with the add-on, and they actually did work: they rewrote the standard vBulletin URLs to the format that I posted.

I would now like to use the built-in SEO function, which has a different URL format, and still keep the old URLs working by redirecting them to the standard vB URLs with a query string, which should then be rewritten to the new SE-friendly vB URL format.

floid78



 
Msg#: 4616479 posted 11:38 pm on Oct 13, 2013 (gmt 0)

Here's the full set of rules that came with the Zoints SEO addon:

RewriteRule ^([a-z0-9_\-]*-(f|all)[0-9]+(p[0-9]+|/index[0-9]*)?\.html)$ forumdisplay.php/$1 [QSA,L]
RewriteRule ^([a-z0-9_\-]*-(t|p)[0-9]+(p[0-9]+|/index[0-9]*)?\.html)$ showthread.php/$1 [QSA,L]
RewriteCond %{REQUEST_URI} !(index\.php|\.css) [NC]
RewriteRule ^(archive|sitemap)/(.*)$ $1/index.php/$2 [QSA,L]

floid78



 
Msg#: 4616479 posted 12:07 am on Oct 14, 2013 (gmt 0)

> You've said what the URLs used to look like.

> What do the URLs that users should see in the browser address bar now look like?

> You'll need a redirect from old to new URL only if the URL format has changed.

> You'll need to amend the current rewrites (but still keeping them as rewrites) if either the URLs have changed or the internal script names have changed.


Old and new script names are the same. Still, for some reason, the old rules don't work in vB4: they don't rewrite correctly.

JD_Toims




 
Msg#: 4616479 posted 12:09 am on Oct 14, 2013 (gmt 0)

It's going to be a bit more complex than you think if the new version of vBulletin uses the post titles in the URLs, especially if you're not familiar with PHP, MySQL and mod_rewrite. If the new version uses the title of the post as part of the URL, the easiest way to redirect the old URLs to the new structure is to rewrite requests for the old URLs to a PHP script, look up the title in the database, redirect via PHP to the new-style URL, and then use mod_rewrite to rewrite those new URLs back to PHP again.

BTW: Welcome to WebmasterWorld!

JD_Toims




 
Msg#: 4616479 posted 12:20 am on Oct 14, 2013 (gmt 0)

These rules aren't setting a page and have some "odd" flags for rewriting.

RewriteRule ^[a-z0-9_\-]*-f([0-9]+)(p[0-9]+|/index[0-9]*)?\.html$ forumdisplay.php?f=$1 [QSA,L,R=302]
RewriteRule ^[a-z0-9_\-]*-(t|p)([0-9]+)(p[0-9]+|/index[0-9]*)?\.html$ showthread.php?$1=$2 [QSA,L,R=302]

They should be something more like this:
[Likely not exactly and could likely be more efficient, but idk vBulletin well enough to say for sure off the top of my head]

RewriteRule ^[a-z0-9_\-]*-f([0-9]+)(p([0-9]+)|/index([0-9]*))?\.html$ forumdisplay.php?f=$1&page=$3$4 [L]
RewriteRule ^[a-z0-9_\-]*-(t|p)([0-9]+)(p([0-9]+)|/index([0-9]*))?\.html$ showthread.php?$1=$2&page=$3$4 [L]

floid78



 
Msg#: 4616479 posted 12:59 am on Oct 14, 2013 (gmt 0)

RewriteRule ^[a-z0-9_\-]*-f([0-9]+)(p([0-9]+)|/index([0-9]*))?\.html$ forumdisplay.php?f=$1&page=$3$4 [L]
RewriteRule ^[a-z0-9_\-]*-(t|p)([0-9]+)(p([0-9]+)|/index([0-9]*))?\.html$ showthread.php?$1=$2&page=$3$4 [L]


Thanks! :) But same result as before: everything works except for pagination for threads. It's working for forums though.

[edited by: floid78 at 1:07 am (utc) on Oct 14, 2013]
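
One possible explanation, offered as a sketch rather than a tested fix: the thread pattern has one more capturing group than the forum pattern, because of the leading (t|p), so the page digits no longer sit in $3 and $4. With $3$4, a request for /topicname-t123/index2.html would produce page=/index2, which vBulletin presumably ignores and so falls back to page 1. Shifting the backreferences by one would account for that:

```
# Capture groups in the thread pattern, counted by opening parenthesis:
#   $1 = "t" or "p"
#   $2 = thread or post id
#   $3 = the whole optional suffix, e.g. "/index2" or "p2"
#   $4 = digits after "p"
#   $5 = digits after "/index"
# The forum pattern has no (t|p) group, which is why $3$4 is right there.
RewriteRule ^[a-z0-9_\-]*-(t|p)([0-9]+)(p([0-9]+)|/index([0-9]*))?\.html$ showthread.php?$1=$2&page=$4$5 [L]
```

For a URL without a page number, such as /topicname-t123/index.html, this still sends an empty page= parameter. Whether vB4 treats that as page 1 is an assumption that would need checking.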

g1smd




 
Msg#: 4616479 posted 12:59 am on Oct 14, 2013 (gmt 0)

What do the old URLs look like and what do the new URLs look like?

You need a rule that matches requests for old URLs and then does something with those requests.

If everything in the new URL can be derived from what is in the old URL request then you can build the new URL and issue the redirect entirely within the rule.

If there are elements in the new URL that are not in the old URL you will need a different approach. Rewrite the request to a new PHP script. This new PHP script will be only a few lines long. It will grab the extra elements for the URL from the database, build the new URL and issue the redirect.
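
The mod_rewrite half of that approach might look like the sketch below. The script name legacy-redirect.php is hypothetical, and the database lookup and the redirect itself would live in that script, which is not shown here.

```
# Hand old-format thread URLs to a small lookup script (hypothetical name).
#   $1 = thread id
#   $2 = page number, empty when the URL has none
RewriteRule ^[a-z0-9_-]*-t([0-9]+)(?:/index([0-9]*))?\.html$ legacy-redirect.php?t=$1&page=$2 [L]
```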

I don't understand why you're currently trying rules that redirect old URL requests to a URL format that uses parameters. At worst, you're creating duplicate content problems. At best, you're creating a redirect chain. // EDIT // OK. Removing the R=302 stops the redirect and turns it into a rewrite. This is likely closer to what you need. If you ask for URLs with parameters, does the new PHP script then redirect the request to the new URL format that the new add-on uses?

floid78



 
Msg#: 4616479 posted 8:31 am on Oct 14, 2013 (gmt 0)

> If you ask for URLs with parameters, does the new PHP script then redirect the request to the new URL format that the new add-on uses?


Yes, redirect works perfectly, the old URL format of the vB addon is being redirected to the new native, vB4 URL format via the detour of the script names.

Like I said, only thing that doesn't work is pagination for threads.

So /topicname-t123/index2.html is not being redirected to showthread.php?t=123&page=2 but to showthread.php?t=123.

But I guess you are right, a direct redirect from the old SE-friendly URL format to the new one would be a much better approach. All elements of the old format are also part of the new format.

The new URL format for forums is

/forums/123-forumname/page2

and for threads

/threads/123-topicname/page2
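
Since every element of the new format is present in the old one, a direct redirect could be sketched along these lines. Several things are assumptions: that first-page URLs in the new format simply omit the /page part, that vB4 accepts (or corrects) the old title slug, and that the relative targets are resolved against a RewriteBase set in the same .htaccess.

```
# Paged thread:  topicname-t123/index2.html -> threads/123-topicname/page2
RewriteRule ^([a-z0-9_-]*)-t([0-9]+)/index([0-9]+)\.html$ threads/$2-$1/page$3 [R=301,L]

# First page of a thread:  topicname-t123/index.html -> threads/123-topicname
RewriteRule ^([a-z0-9_-]*)-t([0-9]+)/index\.html$ threads/$2-$1 [R=301,L]

# Paged forum:  forumname-f1/index2.html -> forums/1-forumname/page2
RewriteRule ^([a-z0-9_-]*)-f([0-9]+)/index([0-9]+)\.html$ forums/$2-$1/page$3 [R=301,L]

# First page of a forum:  forumname-f1/index.html -> forums/1-forumname
RewriteRule ^([a-z0-9_-]*)-f([0-9]+)/index\.html$ forums/$2-$1 [R=301,L]
```

While testing, R=302 is the safer flag, because browsers cache a 301 and keep following it after the rule has been changed.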

Here is my current, complete .htaccess:

Options +FollowSymlinks
RewriteEngine on

RewriteBase /vb4/vb/

RewriteRule ^[a-z0-9_\-]*-f([0-9]+)(p([0-9]+)|/index([0-9]*))?\.html$ forumdisplay.php?f=$1&page=$3$4 [L]
RewriteRule ^[a-z0-9_\-]*-(t|p)([0-9]+)(p([0-9]+)|/index([0-9]*))?\.html$ showthread.php?$1=$2&page=$3$4 [L,R=302]
RewriteRule ^archive/[a-z0-9_\-]*t([0-9]+)\.html$ archive/index.php/t-$1.html [QSA,L]

RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.com(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.com(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.org(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.org(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?google.com(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?google.com(/)?.*$ [NC]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ - [F]

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d

RewriteRule ^.*$ - [NC,L]

# Forum
RewriteRule ^threads/.* showthread.php [QSA]
RewriteRule ^forums/.* forumdisplay.php [QSA]
RewriteRule ^members/.* member.php [QSA]
RewriteRule ^blogs/.* blog.php [QSA]
RewriteRule ^entries/.* entry.php [QSA]

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d

RewriteRule ^.*$ - [NC,L]

# MVC
RewriteRule ^(?:(.*?)(?:/|$))(.*|$)$ $1.php?r=$2 [QSA]

# Check MVC result
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ - [NC,L]
RewriteRule ^(.*)$ - [R=404,L]

g1smd




 
Msg#: 4616479 posted 3:42 pm on Oct 14, 2013 (gmt 0)

There are a large number of issues with that code. It's going to need a significant amount of tweaking to fix.

floid78



 
Msg#: 4616479 posted 3:46 pm on Oct 14, 2013 (gmt 0)

Uhm, except for the upper part, it's the "official" vB 4 rewrite code...

phranque




 
Msg#: 4616479 posted 7:09 pm on Oct 14, 2013 (gmt 0)

most "official" CMS rewrite code examples have issues.

lucy24




 
Msg#: 4616479 posted 9:58 pm on Oct 14, 2013 (gmt 0)

> Uhm, except for the upper part, it's the "official" vB 4 rewrite code...

What phranque said, assuming that "have issues" is what the Word Censor puts out when you put in "are ### ###". Equally important: RewriteRules can't be listed in the order you create them ("the upper part"). They must be listed in conceptual order.

First tier: group rules by severity of result. First access-control rules in [F], then [G] if any, then redirects [R=301], then internal rewrites [L].

Second tier: within each of these areas, list rules from most specific (anything that applies to an individual page or directory) to most general.
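
As a rough skeleton of that ordering (the individual rules are illustrative, and the redirect's pattern and target are placeholders rather than anything from this site):

```
# 1. Access control first: [F]
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://www\.example\.com/
RewriteRule \.(?:jpe?g|gif|bmp|png)$ - [F]

# 2. External redirects next: [R=301,L], most specific rule first
RewriteRule ^old-section/special-page\.html$ /new-section/special [R=301,L]

RewriteRule ^old-section/(.+)\.html$ /new-section/$1 [R=301,L]

# 3. Internal rewrites last: [L]
RewriteRule ^members/ member.php [QSA,L]
```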

RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.com(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.com(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.org(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?revleft.org(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?google.com(/)?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?google.com(/)?.*$ [NC]

Why is this six separate conditions? Among other things, www. should never be optional in a referer test, unless you know for a fact that the site you're testing for doesn't canonicalize. (In other words, never your own site!) Even google redirects without-www requests; I just checked.

JD_Toims




 
Msg#: 4616479 posted 11:06 pm on Oct 14, 2013 (gmt 0)

> What phranque said, assuming that "have issues" is what the Word Censor puts out when you put in "are ### ###".

LMAO! I thought the same thing earlier and almost posted it, but didn't have time to look at the code too much right then.



There are definitely serious issues with some of it, and one of the first things that jumps out, besides the file-system walk by f, d is the set of conditions you're pointing out -- They're all missing an escape and doubled!

I think 2 are all that are necessary, not 6.

RewriteCond %{HTTP_REFERER} !^http://www\.example\.(?:com|org)/
RewriteCond %{HTTP_REFERER} !^http://www\.google\.com/

g1smd




 
Msg#: 4616479 posted 9:40 am on Oct 15, 2013 (gmt 0)

Make sure there's a blank line after each Rule.

Add a numbered comment to each ruleset so it's easier to tell you which ones to reorder or modify.

List rules that block access first, redirects next and rewrites last. Make sure all the rules use RewriteRule and NOT Redirect or RedirectMatch.

Every rule needs the L flag.

Where a leading or trailing .* is both uncaptured and unanchored it can be deleted,
i.e. ^this/.* simplifies to ^this/ and .*/that$ simplifies to /that$

^.*$ simplifies to .*

^(.*)$ simplifies to (.*)
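
As a sketch, applying those points to a few of the stock vB4 rules quoted earlier might give something like the following. It has not been tested against vBulletin, and the numbering is only an example of the comment style.

```
# 1. Leave existing files, symlinks and directories alone
#    (^.*$ reduced to .* and the NC flag dropped, since the pattern has no letters)
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .* - [L]

# 2. Threads (trailing .* removed, L flag added)
RewriteRule ^threads/ showthread.php [QSA,L]

# 3. Forums (trailing .* removed, L flag added)
RewriteRule ^forums/ forumdisplay.php [QSA,L]
```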

Fix up what you can and post again.
