Forum Moderators: phranque

301 redirect help

Brutal Dreamer

4:42 pm on Nov 12, 2019 (gmt 0)

5+ Year Member



Hello,

Thank you for taking the time to read my request and offer advice if you will. I'm in the process of upgrading my forum to a new version of the software. My current setup is vBulletin 4 and I will be moving to vBulletin 5. With VB4 I have used VBSEO to rewrite my URLs in the forums and CMS. Since VBSEO is now out of business, there isn't a solution for VB5, so I plan to go back to the default URL structure. I'm hoping not to lose all the links posted on my site and around the net that direct traffic to my site, so I want to use a permanent redirect. Can anyone offer some advice on the following two items?

Here is an example of the forum URL now using VBSEO
http://www.mysite.com/forum/subform-name/48975-trichopilia-tortilis-leaf-tips.html

Here is what I need to redirect to when I remove VBSEO (this is the url when I disable the product)
http://www.mysite.com/forum/showthread.php?t=48975


What would be the rewrite rule I would use to do this?

Additionally, the CMS (content management system) has been altered by VBSEO. The URL structure is a little different, so I am posting it here to see if someone can also help me with the 301 rewrite for this as well.

Here is an example of the CMS URL now using VBSEO
http://www.mysite.com/forum/content/178-raft-cork-deteriorated-replace.html


Here is what I need to redirect to when I remove VBSEO (this is the url when I disable the product)
http://www.mysite.com/forum/content.php?r=178-Raft-cork-deteriorated-to-replace


What would be the rewrite rule I would use to do this?
Anything else I should consider?

Thank you for your advice/help.

Kind regards,
Bruce

lucy24

6:21 pm on Nov 12, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The recurring references to “rewrite rule” imply that this is an Apache server, so the question is best addressed in the Apache subforum.

:: wandering off in search of a moderator ::

Meanwhile, the stock answer to both questions is “What have you tried so far, and how did it work?”

Are you absolutely certain you want to redirect from URLs with no query string, to URLs with a query? That seems backward. (And is the only part of the question that potentially has anything to do with SEO.)

Brutal Dreamer

10:43 pm on Nov 12, 2019 (gmt 0)

5+ Year Member



Hello Lucy24,

I apologize for posting my message in the wrong forum section. I did several searches regarding VBSEO and the SEO section seemed to be the place people had discussions about this topic. Regardless, it is because of SEO that I want to make sure I do not lose all of the links that my website has garnered over the past 15 years. VB5 has no SEO plugins and my URLs will be reverted to their standard URLs when I upgrade. I want to get ahead of the game, turn off the VBSEO plugin, and revert my community URLs back to the standard ones as posted in my first message, as the rewritten ones will no longer be available to me after the upgrade. With a 301 redirect, I should be able to keep my current traffic. That is why I am here and seeking assistance. You are correct, I do have an Apache server.

To answer your questions: No, I absolutely would rather not change my URL structure, but at this point my current server php is at end of life and I must upgrade to move to the new server.
--What have I done so far? - I've researched and read as much as I can so that I could ask the precise questions I needed help with from people who understand what I am attempting. I've supplied both of the two URL types that I need help with (forum URL and CMS URL), and what the final result should be after the 301 redirect.
--How did it work? - Well, I joined this community after reading several forums with related topics. This place seemed to be the best place to find help.

Thank you for taking time to respond to my original post. I hope now that the moderator has moved this topic to the correct forum someone will be able to offer advice.

cheers,
Bruce

lucy24

1:56 am on Nov 13, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've researched and read as much as I can so that I could ask the precise questions I needed help with from people who understand what I am attempting. I've supplied both of the two URL types that I need help with (forum URL and CMS URL), and what the final result should be after the 301 redirect.
Well, that's certainly a start, and is a heck of a lot more than some folks have done by way of preparation. But unfortunately, what I meant was, “What attempts at a RewriteRule have you formulated? Which parts work as intended, and which parts don’t?” There exist forums that will simply write your code for you--generally at a cost of several snarky preliminary comments pointing out that the same question was asked and answered back in 2015, with a link that nobody but a Forums regular would know how to find. But I digress.

Now, in the second pattern:
http://www.example.com/forum/content/178-raft-cork-deteriorated-replace.html
...
http://www.example.com/forum/content.php?r=178-Raft-cork-deteriorated-to-replace
This is where it really seems as if everything could be done with an internal rewrite, letting you hold on to your established URLs. You're moving from
/forum/content/blahblah.html
to
/forum/content.php?r=blahblah
(I did see a casing issue between “raft” and “Raft”, but php should be able to deal with that.) Human users don't need to see the new URL; only your forums software does.

The first example is actually trickier because I'm not sure which parts are variable. It looks like
/forum/subforumname/stringofnumbers-moreblahblah.html
/forum/showthread.php?t=stringofnumbers
while discarding the subforumname and the part after the numbers. Is that all that's happening?

:: wandering off to GBIF Species Search to learn what in the world trichopilia tortilis is ::

Oh. It's an orchid. Somehow it sounded like an unwanted parasite.

phranque

5:40 am on Nov 13, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], Bruce!

as best as i can understand from your description, this is what you need/want:
- some of your legacy urls are in the form of http://www.example.com/forum/subforum/12345-foo-and-more-foo.html
- the internal url used by VB4 for these legacy urls is in the form of http://www.example.com/forum/showthread.php?t=12345
- some of your legacy urls are in the form of http://www.example.com/forum/content/123-foo-and-other-foo.html
- the internal url used by VB4 for these legacy urls is in the form of http://www.example.com/forum/content.php?r=123-foo-and-other-foo

is this the problem as you understand it?
is there any difference between the internal urls used by vb4 vs vb5?

assuming i understand your problem correctly, the necessary internal rewrites are relatively trivial using mod_rewrite.

however the other half of the problem is externally redirecting requests for the internal form of the url to the canonical external urls.
the first class of urls cannot be solved solely by using mod_rewrite since you don't have enough information in the requested path to externally redirect the VB internal url to the external url.

Brutal Dreamer

1:49 am on Nov 14, 2019 (gmt 0)

5+ Year Member



Hello and thank you for the welcome and for the timely responses. You have both given me much to consider.
After reading your questions, I realized that perhaps VB5 URL structure might not keep with the legacy format. Today I got a test site from VB and started digging through the admin control panel to see if there were any URL structure controls. (I don't see any.)

I did notice that the URL structure at vBulletin's site is similar to mine (they do not have the .html at the end of their URLs). Anyway, while digging some more, using the ideas you two shared, I found some code that should redirect my modified URLs back to the original structure and keep me from losing my linked/indexed pages. I'm still learning what the different parts of the code do, but since my URL structure is basically
 forum-name/threadid-threadname.html 

Then to redirect the forum posts I should be able to use something like the following.
RewriteEngine on RewriteRule [^/]+/([0-9]+)-[^/]+\.html http://www.mysite.com/forums/showthread.php?t=$1 [L,R=301]


I'm not so sure what to do about the CMS urls, but it should be similar, right?

And Lucy, yes, the capital R in one line and the lowercase r in the next would be an issue. I read today that I can use 'NC' to keep the capitalization of words from making a difference, but I'm not certain where it needs to go. And, the "L" before the redirect=301 ends that particular rule, so I'm guessing that it would only go in the final bit of code once I learn how to redirect the CMS urls. I put it in the example above so that I could get some feedback. Still learning... Oh, and yes, trichopilia is an orchid. :)

Phranque, thank you. Yes, I believe you understand what I was trying to ask. Your question about the difference between internal vb4 and internal vb5 urls is what led me to the discoveries I made today. (asking the right questions - helps!) As for your last two statements, for a novice at this, I'm not sure what the next step should be. I'm not sure what additional information is needed to 'externally redirect{ing} requests for the internal form of the url to the canonical external urls'

I'll keep working. I appreciate any help/advice and questions.
Sincerely,
Bruce

lucy24

6:49 pm on Nov 14, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteEngine on RewriteRule
I hope this was just an artifact of posting. You say “RewriteEngine on” just once--logically before all RewriteRules, but the server doesn’t care--on a line by itself. For your own sanity, leave a blank line between each ruleset, though again the server doesn’t care. (That is, it's not like robots.txt where a blank line has syntactic meaning.)

I read today that I can use 'NC' to keep the capitalization of words from making a difference, but I'm not certain where it needs to go.
The NC flag applies only to patterns--either the left-hand side in the body of the rule, or to any individual RewriteCond. Unfortunately there is no mod_rewrite flag that means “match the original casing” or “capitalize the first word” or what-have-you. This may or may not turn out to be a problem, depending on the specific circumstances. It is even possible that the Forums software itself includes a case-leveling function, so casing of the URL doesn't matter.

And, the "L" before the redirect=301 ends that particular rule, so I'm guessing that it would only go in the final bit of code once I learn how to redirect the CMS urls.
Think of [R] and [L] as an inseparable couple; normally anything with an R flag (or R=301) also takes the L flag. Most of the time, any given request will only match one pattern, so you need to apply the L right away. Otherwise the server will keep checking all the other RewriteRules, which is just a waste of server resources.
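
As a rough sketch of that layout, assuming the htaccess file sits at the site root and the thread URLs look like the example in the first post (the pattern is illustrative and untested against the real site):

```apache
# Stated once, on its own line, before any RewriteRule.
RewriteEngine on

# Ruleset 1: old thread URLs. R=301 and L travel together, so the server
# stops checking further rules once this one has matched.
RewriteRule ^forum/[^/]+/([0-9]+)-[^/]+\.html$ /forum/showthread.php?t=$1 [R=301,L]

# Ruleset 2 (for example the CMS URLs) would go here, after a blank line.
```

The blank lines and comments are only for the human reader; the server ignores them.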

Edit:
externally redirect requests for the internal form of the url to the canonical external urls
Do these requests even exist? I got the impression there were two different URL formats, and the ones with the query string have never been visible as external URLs, so nobody is requesting them. (I hope so, because it makes things a lot easier.)

Option B is to rewrite all requests for old-format URLs to a simple php script that does all the transformations, potentially including targeted case-changing. It winds up by issuing the needed redirect, or a 404 if the request isn't valid. If so, all your RewriteRules collapse to a single
RewriteRule ^(request-matching-old-url-pattern)$ /specialpage.php?oldurl=$1 [L]
In spite of the [L] flag, this rule would be located at the beginning of the RewriteRules that have the [R] flag. If this turns out to be necessary we can hammer out more details, but with luck it won't be needed. (This format is a little worrying because your server access logs will show a 200 response for all requests, and you have to take it on faith that the redirects were all issued, unless you have your php page create a supplementary log of its own.)
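
A minimal sketch of how Option B might look in a root-level htaccess. The pattern is an assumption based on the two example URLs in the first post, and /specialpage.php is only the placeholder name used above, not an existing script:

```apache
RewriteEngine on

# Assumed pattern: any /forum/<directory>/<id>-<slug>.html, which covers
# both the thread URLs and the /forum/content/ CMS URLs.
# The whole old path is handed to the (hypothetical) script as "oldurl".
RewriteRule ^(forum/[^/]+/[0-9]+-[^/]+\.html)$ /specialpage.php?oldurl=$1 [L]

# Rules carrying [R=301,L] would follow here.
```

The script itself would then have to work out the new URL and send the 301 status with a Location header, or a 404 for anything it cannot resolve.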

Brutal Dreamer

8:29 pm on Nov 14, 2019 (gmt 0)

5+ Year Member



Wow, okay, thank you for that information. I'm not so sure about the next step here. The php page with only one redirect seems like a good idea, but, unless I'm still really lost, I believe I would only need two redirects in an htaccess file at the root of my site (with the hope that the forum "software itself includes a case-leveling function"). I hope so! haha..

Yes, the Rewrite engine on is at the top line of the htaccess file I'm working on. It is not repeated each time. I should have placed a blank line in the code box when I copied it here. It doesn't seem to matter in the htaccess file (that I can tell).

Thank you for helping me better understand the NC, L, and R codes. My eyes have crossed with all the information that is contradictory on the net regarding all of this. So, " [L,R=301]" at the end of both redirect rules since the "L" keeps from wasting resources checking for something that it will never need. That makes it much easier to understand. :)

I'm still struggling with the CMS redirect code (if in fact the Forum redirect code I supplied in MSG4972356 above is actually correct and will do what I'm hoping it will do). Any thoughts on the CMS code?

Namaste,
Bruce

lucy24

8:44 pm on Nov 14, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Any thoughts on the CMS code?
Not from me, as I don't speak the language. (I'm just saying this so you know I'm not ignoring you.)

:: looking vaguely around for phranque or someone like him ::

Brutal Dreamer

8:50 pm on Nov 14, 2019 (gmt 0)

5+ Year Member



haha... thank you, Lucy!

phranque

12:08 am on Nov 15, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I'm still struggling with the CMS redirect code (if in fact the Forum redirect code I supplied in MSG4972356 above is actually correct and will do what I'm hoping it will do). Any thoughts on the CMS code?

i'll assume you are referring to this:
RewriteRule [^/]+/([0-9]+)-[^/]+\.html http://www.mysite.com/forums/showthread.php?t=$1 [L,R=301]

which i will assume is intended to address requested urls in this form:
http://www.mysite.com/forum/subform-name/48975-trichopilia-tortilis-leaf-tips.html

about which you stated:
Here is what I need to redirect to when I remove VBSEO (this is the url when I disable the product)
http://www.mysite.com/forum/showthread.php?t=48975


1 - there is an inconsistency between your problem description (/forum/) and your code sample (/forums/).
i'll assume "forum" is correct.
2 - the regular expression in your code sample won't match the requested path because there are 2 slashes before the numeric string.
i would start the pattern with the slash before the captured numeric string and add an end anchor.
i.e. /([0-9]+)-[^/]+\.html$
3 - if you want to retain the old urls, why not use an internal rewrite instead of an external redirect?
RewriteRule /([0-9]+)-[^/]+\.html$ /forum/showthread.php?t=$1 [L]

Brutal Dreamer

12:59 am on Nov 15, 2019 (gmt 0)

5+ Year Member



Hi Phranque,

1 - there is an inconsistency between your problem description (/forum/) and your code sample (/forums/).
i'll assume "forum" is correct.

Yes, that 's' is a typo. Thank you.

2 - the regular expression in your code sample won't match the requested path because there are 2 slashes before the numeric string.
i would start the pattern with the slash before the captured numeric string and add an end anchor.
i.e. /([0-9]+)-[^/]+\.html$


I think the two forward slashes before the numeric string were meant to get to the correct location, root/forum/. (I don't have much experience with this, as is pretty obvious from my questions.) I'm betting that I only need that one forward slash as you state. Thank you. Sorry for the obvious error.

3 - if you want to retain the old urls, why not use an internal rewrite instead of an external redirect?


I did not know that was an option. I've only read about external redirects and thought that was what I needed to do. I like the internal rewrite better if it will preserve the linked/indexed content URLs. That was my intention.

RewriteRule /([0-9]+)-[^/]+\.html$ /forum/showthread.php?t=$1 [L]

Last thing, the above should work for the Forum URLs, but my second question was about the Content Management System (CMS) URLs. The structure of the URLs in the CMS is a bit different and I wasn't sure how to do that particular redirect (now internal rewrite - if that will work).

So, here is an example of the forum URL in the CMS now using VBSEO

 http://www.mysite.com/forum/content/178-raft-cork-deteriorated-replace.html


Here is what I need to redirect (or rewrite) to when I remove VBSEO (this is the url when I disable the product)

http://www.mysite.com/forum/content.php?r=178-Raft-cork-deteriorated-to-replace


Based on what you did with the forum URL rewrite, would the below then work for the CMS? They are the same with the exception of the php file name and the title text. I feel like there is something missing since I'm not sure what to put in to get the text that appears after the content ID number (i.e. "Raft-cork-deteriorated-to-replace"). Is that covered when the t=$1 calls for the thread id? Does it need another call? I'm not sure if I'm even asking the question correctly. I hope you can understand what I mean.
RewriteRule /([0-9]+)-[^/]+\.html$ /forums/content.php?t=$1 [L]


I sincerely appreciate your help and apologize for my obvious errors. Thank you.

Bruce

phranque

2:59 am on Nov 15, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



So, here is an example of the forum URL in the CMS now using VBSEO

 http://www.mysite.com/forum/content/178-raft-cork-deteriorated-replace.html


Here is what I need to redirect (or rewrite) to when I remove VBSEO (this is the url when I disable the product)

http://www.mysite.com/forum/content.php?r=178-Raft-cork-deteriorated-to-replace

assuming the "raft" vs "Raft" is correct and intentional, there's no way to arbitrarily transform to correct case using mod_rewrite.
Based on what you did with the forum URL rewrite, would the below then work for the CMS? They are the same with the exception of the php file name and the title text. I feel like there is something missing since I'm not sure what to put in to get the text that appears after the content ID number (i.e. "Raft-cork-deteriorated-to-replace"). Is that covered when the t=$1 calls for the thread id? Does it need another call? I'm not sure if I'm even asking the question correctly. I hope you can understand what I mean.
RewriteRule /([0-9]+)-[^/]+\.html$ /forums/content.php?t=$1 [L]

you would need to capture the entire alphanumeric string.
and /forum/, not /forums/.
and use the "r" parameter in the target.
more like:
RewriteRule /([0-9]+-[^/]+)\.html$ /forum/content.php?r=$1 [L]

but that still doesn't solve the case issue.
is it only and always the first alphabetic character that is upper case?

lucy24

4:13 am on Nov 15, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteRule /
This bit makes me uneasy. A pattern with leading / slash will only match* if there is something before the slash. In this case I guess you’re aiming for the element /forum/ but as written there could be anything-and-everything. It is better to start with an opening anchor and then the exact text-to-match. This is partly to avoid Unintended Consequences** but more to save the server work:
RewriteRule ^forum/blahblah
where “blahblah” is the part you and phranque have been discussing.


* Exception if the RewriteRule is lying loose in the config file, most likely in a VirtualHost envelope, but that doesn’t seem to be the case here.
** Apache-speak for “The world as we know it will come to an end and it won’t be pretty.”

phranque

4:33 am on Nov 15, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



RewriteRule ^forum/blahblah

actually:
Here is an example of the forum URL now using VBSEO
http://www.mysite.com/forum/subform-name/48975-trichopilia-tortilis-leaf-tips.html

so you need a pattern that matches whatever "/subforum-name/" might be.

lucy24

7:40 pm on Nov 15, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If it involves /forum/subforum-name/ then the one thing you can be sure of is that it will always start with
^forum/
excluding any and all other directories that might happen to exist on the site. The second element /subforum-name/ may not even matter, unless there are armies of malign forum robots asking for
^forum/some-garbage-name
just to clutter up the Forums software.

Brutal Dreamer

9:19 pm on Nov 15, 2019 (gmt 0)

5+ Year Member



=phranque ...is it only and always the first alphabetic character that is upper case?


No, users may, and occasionally do, use upper case in random places in thread creation.

=phranque and /forum/, not /forums/.


Darn it; I wanted to get it right so I copied it from my original post. I understand, and in my text doc that will become the htaccess file, it is /forum/ not /forums/ Sorry. I really appreciate your accuracy.

Regarding Case (upper - lower) of words in the url:

=lucy24 The NC flag applies only to patterns--either the left-hand side in the body of the rule, or to any individual RewriteCond. Unfortunately there is no mod_rewrite flag that means “match the original casing” or “capitalize the first word” or what-have-you. This may or may not turn out to be a problem, depending on the specific circumstances. It is even possible that the Forums software itself includes a case-leveling function, so casing of the URL doesn't matter.


I'm still not sure if I understand this completely since on the Apache RewriteRule Flags page on the net it says this about NC:


NC|nocase

Use of the [NC] flag causes the RewriteRule to be matched in a case-insensitive manner. That is, it doesn't care whether letters appear as upper-case or lower-case in the matched URI.

In the example below, any request for an image file will be proxied to your dedicated image server. The match is case-insensitive, so that .jpg and .JPG files are both acceptable, for example.
RewriteRule "(.*\.(jpg|gif|png))$" "http://images.example.com$1" [P,NC]




Will this only work for images as in the example or is it more broad as their definition implies? I wonder if adding in the [NC] might solve the case issue. (Lucy?)

=phranque you would need to capture the entire alphanumeric string.
... use the "r" parameter in the target.
more like:

RewriteRule /([0-9]+-[^/]+)\.html$ /forum/content.php?r=$1 [L] 


That seems really easy; much easier than I expected. Thank you. I'll try it! What about, before the end of the string, if I did: [NC,L] instead of just the [L]?
so...
RewriteRule /([0-9]+-[^/]+)\.html$ /forum/content.php?r=$1 [NC,L]


I'm hoping the forum software will simply take care of the case issue, but I've been unable to find out if that is true or not.

As for the discussion about the
RewriteRule /
vs. the
RewriteRule ^
I'm at a loss. The forum folder is in the root of the site, so nothing should come before it in sequence. I do not have, for example: something.example.com/forum/123.html (Is that what you mean?)

Thank you both so much for your time, patience, and help. I appreciate it.
Bruce

phranque

10:58 pm on Nov 15, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I'm still not sure if I understand this completely since on the Apache RewriteRule Flags page on the net it says this about NC:

refer to the "What is matched?" section of the documentation:
https://httpd.apache.org/docs/current/mod/mod_rewrite.html#what_is_matched
Will this only work for images as in the example or is it more broad as their definition implies? I wonder if adding in the [NC] might solve the case issue.

it applies to all requested paths.
it won't solve the problem of arbitrary case-correction in the Substitution string.
I'm hoping the forum software will simply take care of the case issue, but I've been unable to find out if that is true or not.

i'm not sure what canonicalization is built in to VB but i wouldn't count on it.
most likely solution will be writing a script to examine the requested path and generate the correct 301 response after consulting the db.
use mod_rewrite to internally rewrite requests that look like these to the script and let the script return the 301 status code response and the Location: header.
As for the discussion about the...

this is getting into the esoteric and arcane art of specifying the most restrictive and/or most efficient pattern.
i would go with this:
RewriteRule ^forum/[^/]+/([0-9]+-[^/]+)\.html$ /forum/content.php?r=$1 [L]

btw if you use the [NC] flag here, the effect will be to allow lower/upper/mixed casing of "forum" and "html" while matching the requested path.
this is not what you want or need.
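
one thing to watch: [^/]+ in the pattern above matches any directory name under /forum/, so it would also catch the thread urls and send them to content.php. a possible way to keep the two url families apart, sketched here for an htaccess at the site root and assuming the url shapes from the first post (untested against the real site):

```apache
RewriteEngine on

# CMS articles first: the literal "content" directory would otherwise be
# swallowed by the [^/]+ in the thread rule below.
RewriteRule ^forum/content/([0-9]+-[^/]+)\.html$ /forum/content.php?r=$1 [L]

# Forum threads: any other subforum name; keep only the numeric thread id.
RewriteRule ^forum/[^/]+/([0-9]+)-[^/]+\.html$ /forum/showthread.php?t=$1 [L]
```

the CMS rule still passes the slug exactly as it was requested, so the casing question remains open.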

lucy24

2:07 am on Nov 16, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The forum folder is in the root of the site, so nothing should come before it in sequence. I do not have, for example: something.example.com/forum/123.html (Is that what you mean?)
No, the hostname isn't part of the string-to-match. But the issue isn't where the /forum/ directory is located; it's where the RewriteRule is located.

:: detour here to verify that we are in htaccess ::

OK, here's how it works. When your RewriteRules are located in a directory context--which by definition includes any htaccess file, anywhere--the part-to-match begins at the root of the directory, excluding the leading / slash. Assuming your htaccess is located at the site root, as they generally are:

^foldername
will match anything located in
example.com/foldername

foldername
will match the same, but also
example.com/directory/foldername
example.com/xfoldername
example.com/blahblah/more/gibberishfoldername
and so on.

/foldername
will match
example.com/directory/foldername
example.com/1/2/3/4/5/foldername
and so on, but it will NOT match
example.com/foldername
because the slash immediately after your hostname has already been omitted.*

And most importantly
^/foldername
will match nothing, ever.

And your point is...?
In the interest of efficiency, you should try to constrain the rule as narrowly as possible. If your aim is to capture only within the /forum/ directory, the rule should be expressed as
RewriteRule ^forum/(stuff-to-capture) etcetera
That way, when the server meets a request for stuff in other directories-- /blog/ or /images/ or /funstuff/ or what-have-you--it can get out of there immediately, rather than continue looking at the whole request in hopes of later finding a ([0-9]+-[^/]+)\.html

In the Apache docs on [NC], the very important word is match. It means you're comparing something against a pattern--or, with the [NC] flag, comparing two case-leveled somethings (ABC = Abc = abc). The target of the rule can basically be two things: literal text like /directoryname/, or stuff that was captured earlier in the rule. But that means the text that was actually captured, in its original casing, not the case-flattened version that the server used when deploying the [NC] flag.
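
To make that concrete, here is an illustrative rule (the pattern is an assumption built from the CMS example earlier in the thread, not tested on the real site):

```apache
# With [NC], a request for /forum/content/178-RAFT-Cork.HTML matches this
# pattern even though "HTML" is upper case.
# $1 still holds "178-RAFT-Cork", exactly as the visitor typed it;
# [NC] changes how the comparison is made, not what gets captured.
RewriteRule ^forum/content/([0-9]+-[^/]+)\.html$ /forum/content.php?r=$1 [NC,L]
```

So the flag widens what the rule accepts, but it cannot turn "raft" into "Raft" on the way out.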

If the forums software doesn't do case leveling by default, there may be a plugin you can use. Especially useful if people are typing on smartphones that are too smart for their own good, so they see fit to capitalize the first letter of everything.


* Fun but irrelevant fact: If you have a malformed URL with // two or more consecutive slashes, they can only be matched in a RewriteCond, not in the body of the rule. Or at least, that was the case in 2.2; I fortunately haven't had need to try it in 2.4.