URL rewriting based on specific folders

         

olegko

7:42 pm on Oct 3, 2014 (gmt 0)

10+ Year Member



So I have Magento ecommerce installed in the root folder of a domain and a Wordpress blog (their old site) installed in /blog/.

There are several pages (contact, about, learning center, etc.) that are part of the WordPress install, but we want their URLs to be on the root of the site.

So we want to make a bunch of the wordpress pages not appear to be in the /blog/ folder.

site.com/blog/contact/ --> site.com/contact/
site.com/blog/about/team/ --> site.com/about/team/ (have quite a few sub pages to /about/)
site.com/blog/blog/post/ --> site.com/blog/post/
site.com/blog/blog/ --> site.com/blog/
site.com/nonwp/page.html --> site.com/nonwp/page.html (shouldn't affect non-wordpress pages)

So how can I make a rewrite that would check if the current page is site.com/x/ (where x is set by me manually to include individual pages as well as all subpages within a category) and load the url at site.com/blog/x/?

Seems really messy but I hope that's clear. Thanks in advance!

lucy24

3:49 am on Oct 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Preliminary question: Is the WP installation-- including its own htaccess-- in a subdirectory? It kinda sounds as if it is.

Normally what you want to do is quite straightforward. First a redirect to point requests for any with-blog URL to a new without-blog URL, and then an internal rewrite to serve content from a physical file inside /blog/ -- or, for that matter, from a nonexistent file that WP thinks is using a with-blog URL. It's really no different from the redirect-to-rewrite two-step that you use with short pretty URLs.

The complication is that WP relies on mod_rewrite ... and mod_rewrite, unlike normal Apache mods, isn't inherited.

So if you have a request for
example.com/wpblog/blahblah

then it will first pass through the htaccess it finds at
example.com/
(root)
and then through a second htaccess at
example.com/wpblog/
(inside the subdirectory)

And if both of those htaccess files contain RewriteRules, then only the ones in the inner directory will apply. The rules from the outer directory will be discarded as if they had never existed. You can bypass this by saying
RewriteOptions inherit

in the inner htaccess (this line, itself, is not inherited :)) but this can lead to further complications.
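For illustration, the inner htaccess might start like this. It is a sketch only, assuming the inner file is /blog/.htaccess and that the stock WordPress block follows the new line.

```apache
# /blog/.htaccess (inner directory), sketch only
RewriteEngine On

# Pull in the RewriteRules from the parent (root) htaccess.
# With the Inherit option, the parent's rules are applied
# after the rules in this file, not before them.
RewriteOptions Inherit

# ... the stock WordPress rewrite block follows here ...
```

One of the "further complications" is exactly that ordering: the inherited root rules only get a turn once the inner rules have run.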

So before anything else, you have to figure out where to put the rules.

not2easy

4:46 am on Oct 4, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It is so much easier than that just because it is WP. You can make it appear to be in the domain root just by changing the settings to http://example.com instead of the http://example.com/blog where it currently says it is, and putting a copy of index.php in the root folder. Presto change-o the blog is at http://example.com (not really, but it appears to be.)

In other words, rather than rewrite the WP URLs, let WP rewrite them for you. You will need to adjust your permalinks (also in "settings") so the old WP URLs change to the new URLs - but WP can 301 them for you.

Step by step instructions at WordPress codex: [codex.wordpress.org...]

lucy24

9:07 am on Oct 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well, ###. Did you mean that you want to change ALL the /blog/ URLs? Yes, in that case it's simple :( It's only complicated if you're changing some-but-not-all.

olegko

8:33 pm on Oct 6, 2014 (gmt 0)

10+ Year Member



"and putting a copy of index.php in the root folder"

yeah, therein lies the problem. We have Magento installed in the root folder and WP installed in the "blog" folder.

Once we remove the /blog/ from the URL, there is no way to know in advance whether a page should be processed via WP or Magento, which is why I'm thinking we'd need to manually set up a bunch of redirect/rewrite rules in the htaccess to tell the server which pages to rewrite and pull from the /blog/ equivalent URL.

not2easy

9:08 pm on Oct 6, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



WP URLs would change based on your settings so that all WP URLs would be shown as example.com/page-whatever and WP wouldn't change your Magento URLs because it only writes internal URLs (within the /blog/ directory). Unless they both use pages with the same name there shouldn't be any interference, the current root directory (Magento) URLs would stay as they are. The blog stays where it is, but the internal URL structure would change to appear it is in the root directory.

I am not familiar with Magento's workings but unless it requires index.php to function they should co-exist fine. Your setup may be such that it can't work that way, it is just a way to a simple fix and it won't take care of everything for everyone. I suggest it because it can be difficult to get WP to do things it is not configured to do in its settings.

olegko

9:44 pm on Oct 6, 2014 (gmt 0)

10+ Year Member



Magento does have an index.php file that contains data so I can't overwrite it. I've changed the settings before and the links work. My difficulty is getting a site.com/page URL treated as part of WordPress and not just as a Magento 404 page.

olegko

3:40 pm on Oct 14, 2014 (gmt 0)

10+ Year Member



looking for any other ideas.

lucy24

7:46 pm on Oct 14, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Go back to the beginning and explain exactly what you need to rewrite. Give the actual names of the pages, excluding domain name; it's easier.*

example.com/magento-stuff-here
example.com/blog/wp-stuff-here

... except that some-but-not-all of the stuff in /blog/ needs to be rewritten so it appears to be in the root? So (a) you have to prevent Magento from handling this rewritten material and (b) you have to make sure WP does still handle it.

Is that right?

How many htaccess files do you currently have, and where are they located?


* I say this with trepidation, because I vividly remember one long-ago asker whose page-and-directory names made it pretty evident he was running an escort service ;) File under: tmi.

olegko

4:00 pm on Oct 15, 2014 (gmt 0)

10+ Year Member



hahaha, it's not like that (this time). just want to protect identity. everything you wrote is correct except...

all pages from /blog/ need to shift down a level.

the difficulty lies in points a and b you stated.

There are 2 htaccess files. 1 for Magento in root folder, another for Wordpress in /blog/ folder.

Here are all the pages that WordPress should handle when the URLs below are requested without the /blog/ prefix.
/blog/info/
/blog/randomtexturl-page1/
/blog/randomtexturl-page2/
...
/blog/page28/
/blog/about/
/blog/about/randomtexturl-page1/
...
/blog/about/page10/
/blog/products/
/blog/testimonials/
/blog/members/
/blog/contact/
/blog/blog/
/blog/blog/randomtexturl-post1/
...
/blog/blog/randomtexturl-post-to-infinity/
/blog/blog-category/category1/
...
/blog/blog-category/category4/

lucy24

8:09 pm on Oct 15, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Oh, painful. htaccess files are based on physical directories. So no matter what the URL, all requests for /blog/ will pass through the root htaccess, which contains the Magento material.

Does your existing root-level htaccess contain any RewriteRules that aren't directly concerned with the workings of either CMS? Other content doesn't matter, just mod_rewrite. For example, rules ending in the [F] flag for unconditional lockouts. I'm trying to figure out whether you want to set
RewriteOptions inherit

or not. You probably do.

Now, each of your two htaccess files contains a block that looks something like this:

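## CMS begins here
<IfModule mod_rewrite.c>
RewriteEngine on

# requests for index.php itself pass through untouched
RewriteRule ^index\.php$ - [L]

# anything that is not an existing file or directory goes to the CMS
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# (one of the two packages may have one more standard condition here)
RewriteRule . /index.php [L]
</IfModule>
# end-CMS-stuff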

That's from memory, but there's something that looks roughly like that in each htaccess, right?

So in the root htaccess you need a rule before the Magento package that says

RewriteRule ^blog/ - [L]


meaning "never mind about the /blog/ directory, I'll deal with it later". Right next to this-- still before the Magento rules-- you need a rule that says

RewriteRule ^((page1|page2|page3).+) /blog/$1 [L]


listing all your WP pages or directories by name.
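Put together, the top of the root htaccess might look something like the sketch below. The page names are a few taken from the list earlier in the thread, not the complete set. The tighter pattern in the second rule is an assumption rather than the rule given above: it also accepts the bare name without a trailing slash, and it avoids matching a longer name that merely starts the same way.

```apache
# Root .htaccess: these lines go before the Magento block.
RewriteEngine On

# 1. Leave anything requested under /blog/ alone; the htaccess
#    inside /blog/ deals with it.
RewriteRule ^blog/ - [L]

# 2. Internally rewrite the named WordPress pages to their /blog/
#    equivalents. (?:/.*)?$ allows "about", "about/" and
#    "about/team/", but not "aboutus".
RewriteRule ^((?:info|about|products|testimonials|members|contact)(?:/.*)?)$ /blog/$1 [L]

# ... Magento's own rewrite block follows here ...
```

Neither rule carries an R flag, so both are internal: the visitor's address bar keeps showing the short URL.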

Do you really have URLs that say /blog/blog/ twice? Is this a legacy of past mistakes, or do you want to keep them? If you want to redirect /blog/blog/blahblah to /blog/blahblah alone, things will get messy. Not impossible, but ugly.

Final crucial question: Have the /blog/ URLs ever been publicly visible, so people-- including search engines-- ask for them by name? If so, you will need another rule-- before all the [L] rules-- that says something like

RewriteCond %{THE_REQUEST} /blog/
RewriteRule ^blog/(.*) http://www.example.com/$1 [R=301,L]


(This is the part where your duplicate /blog/blog/blog/ URLs cause trouble.)
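A sketch of where that redirect might sit, assuming the canonical host is www.example.com as in the rule above. The tighter THE_REQUEST pattern is an assumption, not something from the thread: it anchors on the request line so that only a request that literally began with /blog/ triggers the redirect.

```apache
# Goes above the two [L] rules, so it is tested first.
# THE_REQUEST looks like: GET /blog/contact/ HTTP/1.1
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/blog/
RewriteRule ^blog/(.*)$ http://www.example.com/$1 [R=301,L]

# The doubled URLs: /blog/blog/post/ is redirected to /blog/post/,
# and a fresh request for /blog/post/ matches this rule again and
# is redirected on to /post/.

# Caveat: if /blog/ has its own htaccess with RewriteRules and no
# "RewriteOptions inherit", requests under /blog/ may never reach
# a rule placed in the root htaccess.
```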

:: dammit, Forums, I never said [ red ] even once, let alone twice. It was the cat ::

olegko

10:17 pm on Oct 15, 2014 (gmt 0)

10+ Year Member



cool cool, making moves.

With the current setup, it now redirects all the pages I entered manually to their current location and ignores those I didn't.

site.com/about --> site.com/blog/about

One thing to note, I'm setting this up on a dev domain in a folder (site.com/dev/*). I added:

RewriteBase /dev/

before all the rules and also changed /blog/$1 to ./blog/$1

Would that take care of any folder mismatch issues?

lucy24

11:52 pm on Oct 15, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is that a . in the target? What for?

olegko

1:01 am on Oct 16, 2014 (gmt 0)

10+ Year Member



without it, i get a "Not Found

The requested URL /blog/about/ was not found on this server."

assumed the . would move it up into the right folder (seemed so when it redirected to the current blog url)

lucy24

6:07 am on Oct 16, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In the target, a leading / refers to the domain root. It doesn't matter where the htaccess file is located. (It does make a difference in patterns.)

A RewriteBase is only used in internal rewrites, and then only when the target starts in bare text (neither http nor / slash). Normally it's safer to make the target say what you need it to say, including any subdirectories in the path. When you change from development or local to live, just fire up the text editor and do some global replaces in the htaccess file.
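A small sketch of the difference, assuming the htaccess file sits in the /dev/ folder and using two page names from the thread. It may also explain why the ./blog/$1 target behaved differently from /blog/$1: a target starting with a dot is relative, so the RewriteBase is applied to it.

```apache
# /dev/.htaccess, sketch only
RewriteEngine On
RewriteBase /dev/

# Relative target (no leading slash, no scheme): RewriteBase is
# prepended, so the content comes from /dev/blog/about/.
RewriteRule ^about/?$ blog/about/ [L]

# Target with a leading slash: RewriteBase is not used, and the
# path is read from the domain root, i.e. /blog/contact/.
RewriteRule ^contact/?$ /blog/contact/ [L]
```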

A relative link in a RewriteRule target-- or, for that matter, anywhere in any type of config file-- gives me the fantods. But that may just be me. (phranque? you out there?)

olegko

3:03 pm on Oct 17, 2014 (gmt 0)

10+ Year Member



when I change it to /dev/blog/$1, it does the same redirect - so back to the issue. How do I make it load the content instead of redirecting to it?

lucy24

8:43 pm on Oct 17, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do you have a RewriteCond looking at %{THE_REQUEST} ? You almost always need one when you're using the redirect-to-rewrite package.

:: looking back ::

Oh, oops, you're the multiple-CMS guy.
"I'm setting this up on a dev domain in a folder (site.com/dev/*)"

Is the domain accessible from "outside"? Either from the Internet if it's on your live server, or locally if it's a MAMP/WAMP/X-thingy type setup. Once you've got the dns pointing to the domain root, no other /directory/ references should be necessary. (In rare situations you may need the physical path in the pattern. Not in the target.)

olegko

8:52 pm on Oct 17, 2014 (gmt 0)

10+ Year Member



domain is accessible from the internet.

Not sure what to do with %{THE_REQUEST}.. here is what I have.

RewriteEngine On
RewriteBase /dev/

RewriteRule ^blog/ [L]
RewriteRule ^((list|of|all|pages).+) /dev/blog/$1 [L]

olegko

11:17 pm on Oct 20, 2014 (gmt 0)

10+ Year Member



From what I'm reading, THE_REQUEST is for RewriteCond but I want the rewrites to be unconditional (all visitors). Confused how adding a condition would change a redirect to a rewrite.

lucy24

2:43 am on Oct 21, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A RewriteBase is only used in internal rewrites, and then only when the target begins in a bare directory name (no / slash). So if your targets say explicitly /dev/blahblah with leading slash, any RewriteBase simply won't apply.

A RewriteCond is more likely to be needed in the case of an external redirect. For example:
RewriteRule ^(widget|foobar) /blogs/index.php?pg=$1 [L]


Now, if someone comes along asking for
/blogs/index.php?pg=widget
by name, you would want to redirect them forcibly to your preferred short pretty URL. So before the [L] rule you put one that says
RewriteCond %{QUERY_STRING} pg=(widget|foobar)
RewriteRule ^blogs/index\.php http://www.example.com/%1 [R=301,L]

(Oops. This was not a perfect example, because a RewriteCond-- a different one-- was necessary anyway in this scenario I'm making up.)

But once you have these two rules, you've set up an infinite loop, because the server doesn't distinguish between requests from "outside" and requests that are the result of your own rewriting ... unless you add a RewriteCond that says
RewriteCond %{THE_REQUEST} index\.php\?pg=widget

Now the rule says: "Only execute this redirect if my original visitor asked for this page name."

That's what a %{THE_REQUEST} condition is for.
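Assembled into one place, the made-up widget/foobar example might look like the sketch below. Two details are assumptions added here: the single THE_REQUEST condition does the job of both conditions mentioned above, and the trailing ? on the redirect target drops the old query string so the short URL comes out clean.

```apache
RewriteEngine On

# 1. External redirect: long URL -> short URL, but only when the
#    visitor's own request line asked for the long form.
#    %1 is the page name captured by the condition.
RewriteCond %{THE_REQUEST} \s/blogs/index\.php\?pg=(widget|foobar)[&\s]
RewriteRule ^blogs/index\.php$ http://www.example.com/%1? [R=301,L]

# 2. Internal rewrite: short URL -> the script that serves it.
RewriteRule ^(widget|foobar)$ /blogs/index.php?pg=$1 [L]
```

After rule 2 rewrites /widget, the request passes through the rules again, but THE_REQUEST still holds the original "GET /widget ..." line, so rule 1 does not fire and there is no loop.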

RewriteRule ^blog/ [L]

Is something missing here? Probably a - (null target) in the middle.

When someone (including yourself) asks for
www.example.com/
(the root), where do they physically end up? If the physical path involves /dev/blog/ then you don't need to put that part in the RewriteRule target. Looking at your site's error logs may shed light. Infinite loops, in particular, will stick out like a sore thumb because they come in batches of 30 or so. (Exact number depends on your server settings, which you may or may not be able to change.)

"Confused how adding a condition would change a redirect to a rewrite."

It doesn't. Redirect vs. rewrite is determined by the [R] flag and/or by giving a full protocol-plus-domain in the target. Any type of RewriteRule, with any type of flag, can come with or without conditions.
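A short sketch of that distinction. The old-about name and the ExampleBot user agent are hypothetical, made up for illustration and not part of the setup in this thread.

```apache
RewriteEngine On

# Internal rewrite: no R flag and a local path as the target.
# The address bar keeps showing /about/.
RewriteRule ^about/?$ /blog/about/ [L]

# External redirect: the R flag sends a 301 to the browser,
# which then makes a fresh request for /about/.
RewriteRule ^old-about/?$ /about/ [R=301,L]

# A condition can sit in front of either kind of rule. Here an
# internal rewrite is skipped for one hypothetical user agent.
RewriteCond %{HTTP_USER_AGENT} !ExampleBot
RewriteRule ^members/?$ /blog/members/ [L]
```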

olegko

6:25 pm on Oct 21, 2014 (gmt 0)

10+ Year Member



heyo, got it done. The code was correct but WP was redirecting the rewritten URLs on its end. Changed the settings for Wordpress to have the "home" be the root folder, not /blog/ and bingo!

Thanks for your help lucy24!