Rewrite rule for URL containing a capital letter

         

tntpower

10:44 am on May 11, 2009 (gmt 0)

10+ Year Member



I would like to rewrite all URLs (html files) that contain a capital letter P to pt-br/ for all visits to subdomain.mydomain.com

Will it work?

RewriteBase /
RewriteCond %{HTTP_HOST} !^subdomain.mydomain.com$ [NC]
RewriteRule ^(.+)P(.*)\.html$ pt-br/$1P$2

Usually (in 99% of cases), the URL contains only one capital letter P.

So that:

aboutusP.html goes to pt-br/aboutus
courseP-20090501.html goes to pt-br/courseP-20090501

Thanks !

jdMorgan

1:59 pm on May 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your rule would only affect hostnames that were NOT subdomain.example.com, because "!" means NOT.

If a URL-path that contained an uppercase P was requested from the main domain or any subdomain other than "subdomain.example.com", then the rule would create an infinite loop, because it rewrites xyzPabc to itself -- note the uppercase P in the RewriteRule's substitution.

I'd suggest:


RewriteBase /
RewriteCond %{HTTP_HOST} ^subdomain\.example\.com [NC]
RewriteRule ^([^P]*)P([^.]*)\.html$ pt-br/$1p$2 [L]

Unfortunately, another problem becomes apparent here... The pattern on the left side of the RewriteRule matches the URL requested by a client (e.g. browser). The path on the right side of the RewriteRule specifies the new filepath to be used, or a new URL to be sent back to the client in a redirect response, telling the client to ask again at that new URL. Therefore, for this rule to work, none of your files can have .html extensions (or any other extensions), and therefore, you will have great difficulty serving the proper MIME-type in the server's Content-Type response header (among other things).

However, your first sentence above talks about "html files" so it is possible that you've got the entire RewriteRule backwards.

As documented [httpd.apache.org], here is how a RewriteRule is specified:


# Internally rewrite requests for "requested-URL" to the internal filepath "new-filepath-or-URL"
RewriteRule requested-URL new-filepath-or-URL [flags]
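
To make the two uses of the right-hand side concrete, here is a minimal sketch. It assumes the rules live in a .htaccess file in the document root, and the names moved.html, new-location.html, old-page.html and new-page.html are made-up placeholders, not files from this thread. The first rule sends a new URL back to the client; the second only changes which file the server reads.

```apache
RewriteEngine On

# External redirect: a client asking for /moved.html gets a 301 response
# telling it to ask again at the new URL (placeholder names).
RewriteRule ^moved\.html$ http://www.example.com/new-location.html [R=301,L]

# Internal rewrite: a client asking for /old-page.html is served the
# content of /new-page.html; the URL shown in the browser does not change.
RewriteRule ^old-page\.html$ /new-page.html [L]
```

In both cases the pattern on the left is tested against the requested URL-path; only the right-hand side and the flags decide whether the client ever sees the new location.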

Jim

g1smd

2:10 pm on May 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



*** courseP-20090501.html goes to pt-br/courseP-20090501 ***

Which one of those is the URL used out on the web, and which one is the name of the file stored inside the server?

A rewrite accepts a URL request as input, then fetches a file from the server to service that request. Rewrites do not 'make' URLs.

Or is this a case of redirecting requests for an old URL to a new URL? Though you did say "rewrite" in the question, so I guess not.

I am guessing that the original example is worded exactly backwards from what is really required.

tntpower

4:55 pm on May 11, 2009 (gmt 0)

10+ Year Member



Thanks for the reply.

I should add another rewrite condition:

RewriteCond %{REQUEST_URI} !^/pt-br/

Shouldn't I?

Thanks,

tntpower

4:58 pm on May 11, 2009 (gmt 0)

10+ Year Member



>Which one of those is the URL used out on the web, and which one is the name of the file stored inside the server?

Hi g1smd,

courseP-20090501.html is a static HTML file.

I am migrating the entire site (all static HTML files) to a Drupal site. pt-br/courseP-20090501 is a URL alias (through Drupal's per node path setting). Internally, it is node/123. pt-br is a language prefix.

Thanks,

g1smd

5:29 pm on May 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK. Your rewrite takes a URL request for example.com/pt-br/courseP-20090501 and rewrites that to fetch content from the file /courseP-20090501.html.

In that case, your code in the very first post is exactly backwards.

However, you also need a redirect so that people asking for the old URL are redirected to the new URL. That redirect should force the domain name and use the [R=301,L] flags.
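
A rough sketch of such a redirect might look like the following. It assumes a .htaccess file in the document root, and it assumes the new URL is simply the old filename without ".html" under pt-br/, which the thread has not confirmed for every page; adjust the substitution if your aliases differ.

```apache
RewriteEngine On

# Externally redirect old static URLs that contain an uppercase P
# (e.g. courseP-20090501.html) to the extensionless URL under pt-br/.
# [^/.]* excludes slashes and dots, so paths already under pt-br/ are not matched.
# Assumption: the new URL is the old name minus ".html".
RewriteCond %{HTTP_HOST} ^subdomain\.example\.com [NC]
RewriteRule ^([^/.]*P[^/.]*)\.html$ http://subdomain.example.com/pt-br/$1 [R=301,L]
```

Because the target has no .html extension, the redirected request does not match the pattern again, so the rule cannot loop on its own output.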

jdMorgan

6:39 pm on May 11, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



An html file is not a URL. Therefore, you are probably confusing yourself, and making this project seem a lot more difficult than it really is.

A URL is what you see in your browser's status line when you hover over a link on one of your pages.

A file is a physical file on the server.

URLs are not at all the same thing as filepaths, and are only "associated," not "equivalent."

URLs are used "out on the Web," while filepaths are used "inside your server," and these two "realms" are very different. The basic job of a server is to convert HTTP client-requested URLs to internal filepaths, so that it can find the content that is being requested.

If you click a link to the URL "http://example.com/privacy" on a Web page, then the example.com server may get the file to serve that request from a file-path like "/apache/users/ex/example.com/public/www/privacy.html". Obviously, the linked URL does not contain most of the directory-path info needed inside the server, and the server has no need of "http:" once the request has already arrived at the server.

Mod_rewrite allows you to change the URL-to-filepath association even further. For example, you could add a rewrite so that requests for the URL "http://example.com/privacy" would be served with a file at "/apache/users/ex/example.com/public/www/some-new-subdirectory/new-privacy-policy.html".
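
As an illustration only, the privacy example could be written roughly like this, assuming the rule sits in a .htaccess file in the directory that the server maps to the site root (the ".../public/www/" directory above):

```apache
RewriteEngine On

# Requests for the URL /privacy are served from the file
# some-new-subdirectory/new-privacy-policy.html under the document root.
# The visitor still sees http://example.com/privacy in the address bar.
RewriteRule ^privacy$ /some-new-subdirectory/new-privacy-policy.html [L]
```

Note that the left side is still a URL-path and the right side is still where the content comes from; the rule does not create or change any links on your pages.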

Hopefully, this clarifies the very-important difference between a URL and a filepath.

Please post an example of one of the "uppercase P" links on your pages, and an example of the filepath that you wish that request to resolve to, including all subdirectory path information.

Jim