
Apache Web Server Forum

    
subdomains and using main assets/images
mihomes




msg:4594541
 11:23 am on Jul 19, 2013 (gmt 0)

Running into a problem and not sure what the best practice is going to be or maybe I am thinking about it all wrong.

test.example.com is a subdomain with its contents located at example.com/sub_ds/test/

I am using some redirects and a rewrite to take care of this scenario appropriately as discussed and solved in [webmasterworld.com ]


# Externally redirect client requests for example.com/sub_ds/<subdomain>/<URLpath> to <subdomain>.example.com/<URLpath>
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /sub_ds/
RewriteRule ^sub_ds/([^/]+)/(.*)$ http://$1.example.com/$2 [R=301,L]

# Externally redirect client requests for www.[subdomain].example.com/sub_ds/<subdomain>/<URLpath> to <subdomain>.example.com/<URLpath>
RewriteCond %{HTTP_HOST} ^www\.([a-z0-9-]+)\.example\.com
RewriteRule ^sub_ds/([^/]+)/(.*)$ http://$1.example.com/$2 [R=301,L]

# Internally rewrite <subdomain>.example.com/<URLpath> to example.com/sub_ds/<subdomain>/<URLpath>
RewriteCond $1 !^sub_ds/
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteRule (.*) /sub_ds/%1/$1 [L]


All good. Now, let's say example.com/sub_ds/test/index.html uses the following in it:

<link rel="stylesheet" href="/css/styles.css" type="text/css" />

or

<img alt="" height="10" src="/images/test.png" width="10">

or

<a href="/test.htm">test</a>

These will no longer work (in terms of their real file location) as they now use test.example.com as the root.

Now, before you say use absolute urls, I will be forcing https on a specific subdomain, say secure.example.com, and force http on everything else to prevent dupe content.

Would the best practice be to store all images, css, and js on their own subdomain called assets.example.com, then allow both http and https on that subdomain? This still brings up the issue of dupe content with secure/non-secure versions of all images, css, and js though.

It's been a long night and I'm not thinking too clearly so forgive the strange thought process right now.

 

JD_Toims




msg:4594742
 1:58 am on Jul 20, 2013 (gmt 0)

I like the subdomain idea with both http and https "open" and I think any "dupe content" issue (if there is one for images and js files) could be resolved with a canonical header for the pages.

<FilesMatch "[^.]+\.(js|gif|jpg)$">
SetEnvIf Request_URI ^(/[^.]+\.(js|gif|jpg))$ CANONICAL_LOCATION=$1
Header set Link: "<http://www.example.com%{CANONICAL_LOCATION}e>; rel='canonical'"
</FilesMatch>

Just edit the protocol to https if that's the canonical location you prefer for search engines and I think you should be fine.

mihomes




msg:4594765
 7:07 am on Jul 20, 2013 (gmt 0)

I like this idea as well, but it will prove troublesome when actually designing the websites as the pictures will not be visually available ugh.

Never thought to use the canonical headers on the assets, and that would certainly solve having secure / non-secure copies of the same files. Of course, that only matters if duplicate non-html/text content is actually an issue.

Thanks for the input!

The only other thing that crossed my mind was doing an internal rewrite for subdomain assets to the main domain location. Not sure if this would pose a problem with http/https though. This way all files would be referenced normally when designing the site (as long as I used relative from root), would show as its location on the web, and keep all assets in the main domain.
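
A rough, untested sketch of that internal-rewrite idea. It assumes the subdomains are served out of the main domain's document root (as the /sub_ds/ rewrite above implies) and that the shared assets live in /css/, /js/ and /images/ there; the directory names come from the example tags above and are otherwise assumptions:

```apache
RewriteEngine On

# Must sit BEFORE the internal rewrite to /sub_ds/<subdomain>/.
# Root-relative asset requests on any host are left untouched, so
# test.example.com/css/styles.css is served from the main /css/ directory.
RewriteRule ^(css|js|images)/ - [L]
```

Nothing is redirected here, so the browser keeps seeing test.example.com/css/styles.css over whichever protocol the page itself used. Page links such as /test.htm are not covered and still resolve inside the subdomain's folder.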

mihomes




msg:4594908
 9:09 pm on Jul 20, 2013 (gmt 0)

Nobody else chiming in? I thought for sure this would be a hot topic... perhaps not enough people have the need for this type of setup?

Certainly people use subdomains so someone must have run into the problem of reusing the main domain assets if the page layout/design does not change much from the main...

JD_Toims




msg:4594944
 12:05 am on Jul 21, 2013 (gmt 0)

but it will prove troublesome when actually designing the websites as the pictures will not be visually available

Not sure what you mean? If you're worried about designing you have options:

A.) Don't use the "assets" subdomain initially for the design process and upload everything to the main domain, then when it's designed upload the assets to the subdomain, put a redirect for the assets into the .htaccess of the main domain and change the <img> <script> to absolute URLs referencing the subdomain.

B.) More complicated, because you can't "rewrite" to a subdomain, but you can "work-around" a bit, like this: Put all the images/js on the main domain. Put a 301 redirect on the main domain for the .ext of the assets to the assets subdomain. Put a rewrite in the .htaccess of the assets subdomain for all images/js to a "get-assets.php" file. Use file_get_contents('/full-server/path-to-the-main/domain'.$_SERVER['REQUEST_URI']); in "get-assets.php" and get the images/js from the main domain. (It's basically a loop, because you redirect to the sub, then rewrite and use PHP on the sub to "walk the file path internally" then "grab" the assets from the main domain's directory "silently" as far as browsers and bots are concerned.) (If you do this make sure you set the headers correctly via PHP for the specific type of images/js files or they will not display in some browsers correctly, regardless of the file extension.)

C.) Probably more I'm not thinking of...
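
For option B, the two mod_rewrite pieces might look roughly like the following. This is an untested sketch: the extension list is only an example, and it assumes the assets subdomain has its own directory with the "get-assets.php" script at its root.

```apache
# --- .htaccess on the main (www) domain ---
RewriteEngine On
# Send asset requests to the assets subdomain
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule \.(gif|jpg|js)$ http://assets.example.com%{REQUEST_URI} [R=301,L]

# --- .htaccess on the assets subdomain (separate directory) ---
RewriteEngine On
# Hand every asset request to the PHP script; the script still sees the
# originally requested path in $_SERVER['REQUEST_URI'] and reads the real
# file from the main domain's directory
RewriteRule \.(gif|jpg|js)$ /get-assets.php [L]
```

The rewrite target ends in .php, so it does not match the asset pattern again and the rule does not loop on itself.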

JD_Toims




msg:4594948
 12:22 am on Jul 21, 2013 (gmt 0)

D.) One of the things I didn't think of: You could set up the subdomain so the "host directory" is the same as the main domain (usually public_html in cPanel) then you could redirect requests for images/js from the www to the assets subdomain and redirect all non-image/non-js requests from the assets subdomain to the main www domain. This way they would look like they're separate externally, but run off the same directory internally so the entire site could be uploaded in one place. (I haven't thought all the mod_rewrite for it through, but I'm sure it could be done this way too.)

mihomes




msg:4594962
 2:24 am on Jul 21, 2013 (gmt 0)

D sounds interesting and is similar to what I was thinking just in a rewrite/redirect fashion. Setting the same host directory did not cross my mind though. I will have to run some tests on this later tonight and see if it causes any issues.

lucy24




msg:4594968
 2:56 am on Jul 21, 2013 (gmt 0)

perhaps not enough people have the need for this type of setup?

Lots of people probably do, but most of them don't speak fluent Apache :) (I, for example, only do Regular Expressions.)

then allow both http and https on that subdomain?

For variety's sake, this may be more a human problem than a search-engine problem. Current browsers tend to yap when a secure page includes non-secure content. Maybe future browsers will decide that the distinction is pretty meaningless when it comes to images, and then it will become a non-problem.

But there have been other threads in this subforum about the http vs. https issue in supporting files. It comes down to --in paraphrase--

If the requested file is a non-page and the protocol doesn't match the protocol of the referer (this is most easily done by giving the name of the referring page or directory, since its protocol is already correct), redirect the request. If there is no referer-- as with search engines-- force http.
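
In mod_rewrite terms that paraphrase might be sketched as below. Untested, and the host name secure.example.com and the extension list are assumptions carried over from earlier in the thread. Also note that browsers commonly omit the Referer header when an https page requests an http resource, so the first rule may rarely fire in practice.

```apache
RewriteEngine On

# Asset requested over http, but the referring page is on the secure host:
# redirect the asset request to https
RewriteCond %{HTTPS} !=on
RewriteCond %{HTTP_REFERER} ^https://secure\.example\.com/
RewriteRule \.(css|js|gif|jpg|png)$ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# Asset requested over https with no referer (search engines) or with a
# referer that is not a secure page: force http
RewriteCond %{HTTPS} =on
RewriteCond %{HTTP_REFERER} !^https://secure\.example\.com/
RewriteRule \.(css|js|gif|jpg|png)$ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```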

This still brings up the issue of dupe content with secure/non-secure of all images, css, and js though

Do you want your javascript and stylesheets indexed? Some people slap a noindex label on anything with a non-page, non-image extension; this is easily done with a <FilesMatch> in htaccess.
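
A minimal sketch of such a noindex label, assuming mod_headers is available and that .js and .css are the extensions to keep out of the index:

```apache
<IfModule mod_headers.c>
  # Ask search engines not to index scripts and stylesheets
  <FilesMatch "\.(js|css)$">
    Header set X-Robots-Tag "noindex"
  </FilesMatch>
</IfModule>
```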

I don't know whether Duplicate Content is an issue with images, especially if the duplication is limited to protocol rather than domain name. Someone undoubtedly knows-- but you're more likely to find them in a g###-related subforum.

JD_Toims




msg:4594969
 2:57 am on Jul 21, 2013 (gmt 0)

Cool!

I haven't tested it, but I think I'd get the mod_rewrite for the redirects out of the way quick and put them before everything else, so I'd start the file with: (Edit the extensions to what you want on assets.)

RewriteEngine on
# If the host is assets.example.com or empty
# and the extension is .gif .jpg .js end the rewriting
RewriteCond %{HTTP_HOST} ^(assets\.example\.com)?$
RewriteRule \.(gif|jpg|js)$ - [L]

# If the host is NOT assets.example.com or empty
# redirect to the assets.example.com sub
RewriteCond %{HTTP_HOST} !^(assets\.example\.com)?$
RewriteRule \.(gif|jpg|js)$ http://assets.example.com%{REQUEST_URI} [R=301,L]

# If the extension is NOT .gif .jpg .js but
# the host IS assets.example.com redirect to the www
#
# NOTE: You might need to move this one "lower" in the .htaccess
# to avoid chaining redirects together. I'm not 100% sure, because
# without seeing the whole file together I have a tough time with
# figuring that stuff out... You might need to put this one right
# before any www canonicalization for efficiency and not chaining
# redirects together.
#
RewriteCond %{HTTP_HOST} ^assets\.example\.com$
RewriteRule !\.(gif|jpg|js)$ http://www.example.com%{REQUEST_URI} [R=301,L]

mihomes




msg:4595014
 10:32 am on Jul 21, 2013 (gmt 0)

Alright, actually I didn't think about this till now. When designing, the subs are going to be in subfolders in the directory structure, so relative links will work fine. Duh... overthinking this whole thing.

I like the idea of the assets as root then just point all assets calls (sub or not) to the assets subdomain. Although, you might as well just do the same for all assets to their normal non-sub location on the main domain if you have to rewrite everything anyways. This would also prevent redirects for the main domain itself. This being that I now realize designing with relative is not an issue (what a stupid moment of non-clarity hehe).

I am not sure if calling the assets from the subdomain would benefit from the speed increase in parallel (maybe not the right word, but I know you can serve more content quickly when using multiple domains/locations) downloading or not since there would be a redirect called on each. Is there a way within apache to 'map' calls to a specific location that would prevent a redirect from happening server side? That is beyond my knowledge of Apache. I am going to read up on this more now. If anyone knows about this please comment.

Lastly, another thing not thought of earlier, what would be the seo repercussions from having a 301 on all these assets (subdomains only if the above method was used)? This is what originally brought up the server map question above to eliminate the 301's.

lucy24




msg:4595018
 11:07 am on Jul 21, 2013 (gmt 0)

A single redirect on images? Honestly I think it isn't even worth worrying about. Especially since the redirect only applies to requests that would otherwise be duplicates. Once the googlebot has found the image URLs you want it to use, it will always find the same images in the same place.

Question I don't know the answer to: Would Googlebot-Image even try to get images using https? I can't imagine why it would bother. Remember, it won't be coming in with any specific page as referer. It just uses the page to get information about other linked files, including images. The actual request is completely independent. There have been rare cases where the googlebot itself-- not the imagebot-- asks for a non-page file, giving the page as referer. But nobody has ever figured out why it does this, so it isn't worth worrying about ;)

Edit:
# If the host is assets.example.com or empty
# and the extension is .gif .jpg .js end the rewriting
RewriteCond %{HTTP_HOST} ^(assets\.example\.com)?$
RewriteRule \.(gif|jpg|js)$ - [L]

# If the host is NOT assets.example.com or empty
# redirect to the assets.example.com sub
RewriteCond %{HTTP_HOST} !^(assets\.example\.com)?$
RewriteRule \.(gif|jpg|js)$ http://assets.example.com%{REQUEST_URI} [R=301,L]

Once the first condition is out of the way-- "the host IS assets.example.com or nothing"-- and you've met your [L] flag, is there any possible circumstance where the condition in the follow-up rule would NOT be met? Seems like the condition wouldn't even be necessary.

Note for people reading along: Yes, this is one of the special cases where a non-redirect goes before a redirect in your ordering of rules. In fact rules ending in - [L] ("Stop here and don't do anything") often go before all rules that, well, Do Stuff.
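
For illustration, the shortened pair without the second condition might look like this. It is an untested sketch, and it only behaves the same as long as the two rules stay in this order:

```apache
RewriteEngine On

# Host is assets.example.com (or empty) and the file is an asset: stop here
RewriteCond %{HTTP_HOST} ^(assets\.example\.com)?$
RewriteRule \.(gif|jpg|js)$ - [L]

# Any asset request that gets this far came in on another host: redirect it
RewriteRule \.(gif|jpg|js)$ http://assets.example.com%{REQUEST_URI} [R=301,L]
```

If the first rule is ever moved or removed, the unconditional redirect would send assets.example.com requests back to themselves.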

mihomes




msg:4595034
 1:30 pm on Jul 21, 2013 (gmt 0)

More reading has raised a few more question/ideas/thoughts.

I noticed on my server that when I create a subdomain, say test.example.com, the dns zone for example.com gets an A record for both:

test and www.test

This leaves the question - can I just remove the www.test A record to prevent that url structure from being used/accessed, or is there some underlying reason it is needed? Lucy, you will remember in the other thread where I was doing a rewrite in htaccess for this www.subdomain.domain.com example to prevent dupes. More so, so that someone could not purposely link to me that way to create dupes.

Secondly, I looked more into cnames for mapping. You can create a cname record in the dns for assets.example.com and then point it to example.com. This maps everything on the main domain.

I could then do a catch all redirect on all assets links to the assets.example.com location.

Issues and thoughts with this method :

- Would need a redirect for the assets on the main to the assets.example.com location or send a canonical header for it. Speaking of which, everything I have read about canonical (in terms of Goog) says this is only a recommendation to them, not a definite rule they will follow.

- Files other than assets could be viewed at assets.example.com since everything is mapped from the main. Redirect non-assets to the main.

- secure and non secure... since all assets are being redirected to assets.example.com the rules would need to include whether the request is http or https and redirect appropriately.

- On top of the above this would let me allow both http and https on the assets.example.com location (assuming this works since these are mapped from the main) and deny everywhere else. This is also another reason to include canonical headers for the assets to specify the non secure version of them.

- Actual secure areas which would probably be secure.example.com could be controlled with redirects as well. Always https.
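
Pulling the points in that list together, a rule set along these lines might be a starting point. It is an untested sketch: it assumes all hosts share one .htaccess in the main document root, the extension list is only an example, and the order may need tuning to avoid chained redirects.

```apache
RewriteEngine On

# 1. secure.example.com is always https
RewriteCond %{HTTP_HOST} ^secure\.example\.com$ [NC]
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://secure.example.com%{REQUEST_URI} [R=301,L]

# 2. Assets requested on any host other than assets.example.com go to the
#    assets host, keeping the protocol of the original request
RewriteCond %{HTTP_HOST} !^assets\.example\.com$ [NC]
RewriteCond %{HTTPS} =on
RewriteRule \.(css|js|gif|jpg|png)$ https://assets.example.com%{REQUEST_URI} [R=301,L]

RewriteCond %{HTTP_HOST} !^assets\.example\.com$ [NC]
RewriteRule \.(css|js|gif|jpg|png)$ http://assets.example.com%{REQUEST_URI} [R=301,L]

# 3. Non-assets requested on assets.example.com go back to the main domain
RewriteCond %{HTTP_HOST} ^assets\.example\.com$ [NC]
RewriteRule !\.(css|js|gif|jpg|png)$ http://www.example.com%{REQUEST_URI} [R=301,L]

# 4. Anything else arriving over https (not the secure or assets host)
#    is forced back to http
RewriteCond %{HTTP_HOST} !^(secure|assets)\.example\.com$ [NC]
RewriteCond %{HTTPS} =on
RewriteRule ^ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```

Pages that already reference assets.example.com with absolute URLs never hit rule 2, so the 301s only act as a safety net for links that were left root-relative.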

I really don't like the idea of so many 301s happening on every page request (all assets would be redirected) so with some css, js, and images you might have 30 or more 301s on any given page.

I think that strained my brain a bit heh.

JD_Toims




msg:4595079
 6:08 pm on Jul 21, 2013 (gmt 0)

* Haven't made it to the last post in this thread yet.

I am not sure if calling the assets from the subdomain would benefit from the speed increase in parallel (maybe not the right word, but I know you can serve more content quickly when using multiple domains/locations) downloading or not since there would be a redirect called on each.

The speed is generally gained by using a subdomain to force a visitor's browser to open more connections to load the page. The general default for browsers is 2 connections / hostname. By using a different subdomain (hostname) for assets you can force 4 connections.

No, there is no way to replicate it server-side. The browser must actually request the files from the subdomain.

Once the first condition is out of the way-- "the host IS assets.example.com or nothing"-- and you've met your [L] flag, is there any possible circumstance where the condition in the follow-up rule would NOT be met? Seems like the condition wouldn't even be necessary.

You're correct, the 2nd condition isn't technically necessary, but I've had other people change the rule order of .htaccess files before and if those were changed and the 2nd condition was not present it would cause a 500 Internal Server Error for all files on the assets.example.com sub as it was repeatedly redirected to itself trying to get the files.



I'm not sure I understand the need to use relative rather than absolute URLs for the asset files. That would make the redirects "insurance" in case one got missed or left as relative.

Personally, if the designer (or whoever) couldn't handle working with absolute URLs I'd find someone who could, because it's really not that difficult to put assets.example.com/the-path-to-the-image/the-file.jpg especially when the path is exactly the same as it would be on the www domain.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved