
Question about mod_rewrite

mod_rewrite, subdomains


Lukich

7:30 pm on Jun 28, 2010 (gmt 0)

10+ Year Member



Hello. I'm fairly new to network technologies, so this question might seem trivial to you, but please bear with me :)

I was asked to create a setup where people can type in a URL of the form someword.example.com and all be taken to the same main page, where different info is pulled up based on the subdomain. The subdomains have to be generated dynamically; there will be thousands of them.

I did some research, and it looks like I can simply add a wildcard (*) record on my DNS server pointing to my IP address. Then, to weed out bad subdomains, I can set up mod_rewrite on Apache to make sure subdomains contain only alphanumeric characters and rewrite them to the same main page. Does this sound right? Is there a better approach?
Also, here's what I think my rewrite rules/conditions will look like:

RewriteCond %{HTTP_HOST} ^[a-zA-Z]+\.example\.com$
RewriteRule ^(.+) http://www.example.com/

Does this look right?
Any help is much appreciated.
Thanks!
Luka

[edited by: jdMorgan at 2:51 am (utc) on Jul 6, 2010]
[edit reason] example.com [/edit]
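
For reference, a common way to set this up on the Apache side is a single vhost with a wildcard ServerAlias plus an internal rewrite. This is a minimal sketch only; the DocumentRoot path and the handler name /index.php are illustrative assumptions, not anything stated in the thread:

```apache
# Sketch: catch-all vhost for *.example.com (Apache 2.2-era syntax)
<VirtualHost *:80>
    ServerName example.com
    ServerAlias *.example.com
    DocumentRoot /var/www/example    # illustrative path

    RewriteEngine On
    # Accept only purely alphanumeric subdomain labels
    RewriteCond %{HTTP_HOST} ^[a-z0-9]+\.example\.com$ [NC]
    # Don't rewrite the handler onto itself
    RewriteCond %{REQUEST_URI} !^/index\.php$
    # Internal rewrite (no http://... target), so the original Host
    # header - and therefore the subdomain - survives for the script
    RewriteRule .* /index.php [L]
</VirtualHost>
```

Note that an internal rewrite keeps the requested hostname intact; a rule whose target is an absolute http://www.example.com/ URL would instead send an external redirect and throw the subdomain away.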

g1smd

9:06 pm on Jun 28, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



How will you handle requests for robots.txt, images, CSS, JS files etc?

Is there more than one page URL on the subdomains? Will those page URLs be "extensionless" or not?

Fully defining the requirements is the largest part of this job.

The code is trivial, but can't be started until all requirements are known.

Lukich

10:18 pm on Jun 28, 2010 (gmt 0)

10+ Year Member



We don't care about robots at this point. CSS/JS will be identical. Essentially, the site will be identical in all respects except for certain information pulled from the database based on the subdomain value. The URLs will be extensionless, like so:

myname.sitename.com/offerName

thank you!
Luka
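
For extensionless offer URLs like that, one possible shape for the rule is to capture the subdomain from the Host header and the offer name from the path. A sketch only; the handler name /index.php and the query-string parameter names are assumptions:

```apache
RewriteEngine On
# Capture the subdomain label; %1 below refers to this capture
RewriteCond %{HTTP_HOST} ^([a-z0-9]+)\.sitename\.com$ [NC]
# In per-directory (.htaccess) context the pattern has no leading slash.
# myname.sitename.com/offerName -> /index.php?sub=myname&offer=offerName
RewriteRule ^([a-zA-Z0-9-]+)$ /index.php?sub=%1&offer=$1 [L,QSA]
```

The script can then look up the offer for that subdomain in the database.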

g1smd

11:48 pm on Jun 28, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It is clear that you didn't understand the question.

Your "design" MUST include provision for requests for the robots.txt file - either return a valid file or a 404 error. Your current code rewrites the request for that file into your script. If your script doesn't return a valid file or a proper 404 header, then you have a big problem.

Likewise, your current code requires that your script must build images, CSS and JS files and serve them through the script. They cannot exist as separate files.

Your example code FORCES that, and that is not the right way to do things.

So, again, exactly "how" will those things be handled? Where will they be inside the server, and what URL space will be used to request those things?

So far, your requirements are so vague there is no way to even begin.
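
One common way to answer that question - a sketch, assuming the images, CSS, JS and robots.txt exist as real files on disk under the DocumentRoot, and assuming a handler named /index.php - is to let Apache serve anything that physically exists and rewrite only the rest:

```apache
RewriteEngine On
# If the request maps to a real file or directory, serve it as-is;
# rewrite everything else into the script
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* /index.php [L]
```

With this, a request for robots.txt either hits the real file or falls through to the script - so the script still has to return a proper 404 for anything it doesn't recognise, exactly as described above.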

Lukich

12:27 am on Jun 29, 2010 (gmt 0)

10+ Year Member



Well, to be honest with you, I understood your response even less. Please bear with me, as I'm not very comfortable with servers yet.

Here are the requirements as they were presented to me - I have multiple subdomains (name.site.com) that all lead to the same landing page, but with different URLs. All subdomains share the same JS/CSS, with the exception of some data assigned to the page dynamically. That's it.

Regarding your questions - just curious, why do I need a specific provision for robots.txt? If I simply omit it, wouldn't the server return a 404 if the file is missing?

I'm not entirely sure what you mean by this paragraph: "Likewise, your current code requires that your script must build images, CSS and JS files and serve them through the script. They cannot exist as separate files. Your example code FORCES that, and that is not the right way to do things." Would you please clarify?

The URL space I'd like to use for everything would be the url space of the main landing page, since all my subdomains ultimately point to it.

Thank you.
Luka

jdMorgan

3:05 am on Jul 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Simply put, you are rewriting all URL requests to a single script file.

Therefore that script must *generate* a response for every possible URL that could be requested from your server.

It must *create* all image, css, and robots.txt data, and send it back, along with a correct HTTP response header, to the requesting client (e.g. browser or search engine robot).

The typical implementation is to exclude all image, media, CSS, and document requests from being rewritten, and to rewrite only those requests that the script knows how to generate a response for.

For example, to exclude some of those filetypes, you could add a RewriteCond like this to your rule:

RewriteCond %{REQUEST_URI} !\.(gif|jpe?g|png|bmp|ico|css|js|pdf|doc|xls|mp3|wav|swf|flv|mp4|avi|wmv|mov)$ [NC]

Adjust the excluded "file types" list to suit your site; anything that your script cannot generate must be excluded.

Or you could use a less-specific RewriteCond that excludes *every* request whose last path segment includes a "file type" extension, such as:

RewriteCond %{REQUEST_URI} !^/([^/]+/)*[^/]+\.[a-z0-9]+$ [NC]

Jim
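
Putting those pieces together, the whole ruleset might look like this - a sketch; the script name /index.php is an assumption, since the thread never names the handler:

```apache
RewriteEngine On
# Skip any request whose last path segment has a file-type extension;
# those should be served as ordinary static files
RewriteCond %{REQUEST_URI} !^/([^/]+/)*[^/]+\.[a-z0-9]+$ [NC]
# Everything else is generated by the script, which reads the
# subdomain from the Host header
RewriteRule .* /index.php [L]
```

Because /index.php itself ends in a file-type extension, the exclusion also prevents the rule from rewriting the script onto itself in a loop.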