Single var mod rewrite failing

I was sure this one would be easy...

     
4:46 pm on Sep 8, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7256
votes: 3


And it probably is easy for an expert, but I can't for the life of me figure out what I'm doing wrong here.

I have a file in:-

https://example.com/sub-directory-parent/a44fG667/index.php

I want the user to be able to get here by visiting:-

https://example.com/a44fG667

(and preferably, but non-critically, also):-

https://example.com/a44fG667/

sub-directory-parent is static. a44fG667 is actually a UID/randomised directory name that is always made up of letters and numbers. It's part of a flatfile DB type structure (yeah Brett would love it ;) ).

My current design in my Apache config is:-

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://example.com [R=301,L]

RewriteCond %{HTTP_HOST} ^([a-zA-Z0-9]+)/?$ [NC]
RewriteRule ^(.*)$ https://example.com/sub-directory-parent/$1/index.php [L]

The www to non-www redirect works fine, but the dynamic one doesn't. I was struggling to debug it, to the point where I started randomly trying to fluke the correct regex pattern, then thought I'd better stop and ask you chaps :).

Any help would be much appreciated.

PS : long time no see WebmasterWorld - hope all is well here!
5:32 pm on Sept 8, 2016 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts: 328
votes: 24



RewriteCond %{HTTP_HOST} ^([a-zA-Z0-9]+)/?$ [NC]
RewriteRule ^(.*)$ https://example.com/sub-directory-parent/$1/index.php [L]


You probably meant REQUEST_URI instead of HTTP_HOST in your condition, but that still wouldn't be quite right because REQUEST_URI starts with a slash (as opposed to the URL-path matched by the RewriteRule pattern that doesn't) - however, the condition is not actually required at all here. You only need a single RewriteRule.

For example:


RewriteRule ^([a-zA-Z0-9]+)/?$ https://example.com/sub-directory-parent/$1/index.php [L]


This does make the trailing slash optional (as you mention), however, in doing so you potentially have a duplicate content issue - do you really need that trailing slash?
8:22 pm on Sept 8, 2016 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11871
votes: 245


besides the RewriteCond issue mentioned by whitespace...


RewriteRule ^([a-zA-Z0-9]+)/?$ https://example.com/sub-directory-parent/$1/index.php [L]


This does make the trailing slash optional (as you mention), however, in doing so you potentially have a duplicate content issue - do you really need that trailing slash?

you really want to externally redirect the noncanonical version(s) of the url to the canonical version and then only serve the internally rewritten content to the canonical url request.
9:49 pm on Sept 8, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7256
votes: 3


Thanks guys :)

I have tried this, but it doesn't seem to work (pages 404-ing, but I have confirmed that the index.php page does exist and I can access it with the full URI):-

RewriteRule ^([a-zA-Z0-9]+)/?$ https://example.com/sub-directory-parent/$1/index.php [L] 


I also tried:-

RewriteRule ^([a-zA-Z0-9]+)$ https://example.com/sub-directory-parent/$1/index.php [L] 


... but equally no joy.

Could I have something else configured incorrectly? Is there an easy way to debug via logging/other ?

you really want to externally redirect the noncanonical version(s) of the url to the canonical version and then only serve the internally rewritten content to the canonical url request.


Thanks Phranque. By "externally" you think I need a proxy sitting in front?

do you really need that trailing slash?


I could live without it. From a product point of view (sharable link containing a collection of images) it makes sense to be a "folder" because it's a collection of stuff. Which is why having that trailing slash is nice. But it's non-critical.
10:03 pm on Sept 8, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7256
votes: 3


OK, I admit it, I hadn't googled "is there an easy way to debug mod_rewrite" so I did that and found the re-write logger!
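
For anyone else hunting for it, here is a minimal sketch of how the rewrite logging can be switched on. The log file path is an assumption, so adjust it for your own setup. Which form applies depends on the Apache version: RewriteLog and RewriteLogLevel exist in 2.2, while 2.4 dropped them in favour of a per-module LogLevel.

```apache
# Apache 2.2: dedicated rewrite log (server config or virtual host only).
# The path below is a placeholder, not taken from a real setup.
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4 and later: RewriteLog was removed. Raise the log level for
# mod_rewrite instead; the trace lines go to the normal error log.
# LogLevel alert rewrite:trace3
```

Higher levels are very verbose and slow the server down, so it is probably best to turn this off again once the rule is working.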

With that enabled I see this in my rewrite.log:-
With the trailing slash:-


192.168.199.1 - - [08/Sep/2016:23:05:04 +0100] [example.com/sid#7f42e437ee70][rid#7f42de5590a0/initial] (2) init rewrite engine with requested uri /x8b967d98/
192.168.199.1 - - [08/Sep/2016:23:05:04 +0100] [example.com/sid#7f42e437ee70][rid#7f42de5590a0/initial] (3) applying pattern '^(.*)$' to uri '/x8b967d98/'
192.168.199.1 - - [08/Sep/2016:23:05:04 +0100] [example.com/sid#7f42e437ee70][rid#7f42de5590a0/initial] (3) applying pattern '^([a-zA-Z0-9]+)$' to uri '/x8b967d98/'
192.168.199.1 - - [08/Sep/2016:23:05:04 +0100] [example.com/sid#7f42e437ee70][rid#7f42de5590a0/initial] (1) pass through /x8b967d98/


Without:-


192.168.199.1 - - [08/Sep/2016:23:07:22 +0100] [example.com/sid#7f42e437ee70][rid#7f42de55b0a0/initial] (2) init rewrite engine with requested uri /x8b967d98
192.168.199.1 - - [08/Sep/2016:23:07:22 +0100] [example.com/sid#7f42e437ee70][rid#7f42de55b0a0/initial] (3) applying pattern '^(.*)$' to uri '/x8b967d98'
192.168.199.1 - - [08/Sep/2016:23:07:22 +0100] [example.com/sid#7f42e437ee70][rid#7f42de55b0a0/initial] (3) applying pattern '^([a-zA-Z0-9]+)$' to uri '/x8b967d98'
192.168.199.1 - - [08/Sep/2016:23:07:22 +0100] [example.com/sid#7f42e437ee70][rid#7f42de55b0a0/initial] (1) pass through /x8b967d98


I'm guessing that "pass through" means the regex somehow did not match x8b967d98 (that is the actual value; the only thing I changed in the log above is swapping my domain for example.com), but I can't understand why.
10:15 pm on Sept 8, 2016 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts: 328
votes: 24



RewriteRule ^([a-zA-Z0-9]+)/?$ https://example.com/sub-directory-parent/$1/index.php [L]


Oops, if this is an internal rewrite then the scheme and hostname should be stripped from the substitution (Apache will tend to force an external redirect otherwise).


My current design in my Apache config is:-


Oops, I was somehow assuming this was in an .htaccess file - so, this is directly in your server config, not in a <Directory> section or .htaccess file? In which case, bringing this together, try the following:


RewriteRule ^/([a-zA-Z0-9]+)/?$ /sub-directory-parent/$1/index.php [L]


Note the extra slash at the start of the RewriteRule pattern.
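
To spell out the difference between the two contexts, here is a rough side-by-side sketch. The two rules are alternatives, not something to use together, and the .htaccess variant assumes the file sits in the document root:

```apache
# Server config or <VirtualHost>: the pattern is matched against the full
# URL-path, which starts with a slash (e.g. "/x8b967d98").
RewriteRule ^/([a-zA-Z0-9]+)/?$ /sub-directory-parent/$1/index.php [L]

# .htaccess or <Directory> (per-directory context): the directory prefix,
# including the leading slash, is removed before matching (e.g. "x8b967d98"),
# so the pattern must not start with a slash.
# RewriteRule ^([a-zA-Z0-9]+)/?$ /sub-directory-parent/$1/index.php [L]
```

That matches what the rewrite log shows: the pattern is applied to '/x8b967d98', with the leading slash, so a pattern beginning ^([a-zA-Z0-9]+) can never match there.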

But phranque certainly has a point here... should this be an external redirect? (Thus avoiding the duplicate content issue etc.) What is the canonical URL? An external redirect is just that: no proxy is involved, only an R or R=301 flag on the RewriteRule. For example:


RewriteRule ^/([a-zA-Z0-9]+)/?$ https://example.com/sub-directory-parent/$1/index.php [R=302,L]
11:20 pm on Sept 8, 2016 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7256
votes: 3


Thank you Whitespace.

RewriteRule ^/([a-zA-Z0-9]+)/?$ /sub-directory-parent/$1/index.php [L]


That works perfectly :)

What is the canonical URL?


From a product point of view I think it'd have to be with the slash.

I wouldn't say this should be an external redirect though. The user should never see the background URL with the sub-directory-parent and index.php. It's a shareable link, so the shorter the better and it should ideally be the only way to access that page.
8:41 am on Sept 9, 2016 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts: 328
votes: 24


You're welcome, glad you got it working.

From a product point of view I think it'd have to be with the slash.


In that case, you could redirect to the canonical URL (as phranque suggested) if you wish, with something like:


RewriteRule ^/([a-zA-Z0-9]+)$ /$1/ [R=301,L]
RewriteRule ^/([a-zA-Z0-9]+)/$ /sub-directory-parent/$1/index.php [L]


The first directive redirects a request for "/a44fG667" (no trailing slash) to "/a44fG667/" (the canonical URL, with a trailing slash). The second directive internally rewrites just the canonical URL (with a slash) as before (except the slash on the end of the pattern is no longer optional).
10:49 pm on Sept 9, 2016 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11871
votes: 245


you probably don't want to expose the internal urls, so you should also implement external redirects for /sub-directory-parent/[a-zA-Z0-9]+/index.php requests.
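
A rough, untested sketch of one way to do that, assuming the rules stay in the server config as above and that the trailing-slash URL is the canonical one. It checks THE_REQUEST (the original request line sent by the browser), so only direct requests for the internal path are redirected, not requests that were rewritten internally:

```apache
# Place this before the internal rewrite.
# THE_REQUEST looks like "GET /sub-directory-parent/a44fG667/index.php HTTP/1.1".
# It is not changed by internal rewrites, so this cannot cause a redirect loop.
# The optional group also catches direct requests for the bare directory.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/sub-directory-parent/([a-zA-Z0-9]+)/(?:index\.php)?[?\s]
RewriteRule ^ /%1/ [R=301,L]
```

Here %1 is the directory name captured by the RewriteCond pattern, so a direct request for /sub-directory-parent/a44fG667/index.php would be sent back to /a44fG667/.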