At a complete loss as to why my server is requesting files

JS_Harris

6:29 am on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



My site has an area that, via php, calls a template file to serve as layout for the content in one section of my site. It's not much different than any site which calls a template file, except this template is not the main template, it only handles the content in one area. No iframes, all php, nothing fancy.

The code requests the file like it does ANY other file, and it works well, but the server logs tell a different story. Have you ever come across this?

- Visitor loads a page
- ip address of that visitor is stored along with the url requested
- ip address of the domain ip is stored along with the template file
- ip address of that visitor is stored along with the url of everything else (image/css etc).

For the life of me I can't seem to figure out why this template file always displays in log files as having been requested by the domain ip instead of by the visitor ip. In fact I don't know why it appears at all; template file names usually don't. Is there a common issue that might cause this?

Problem it's causing:
- If I block requests that have no referrer or user agent, ONLY this section returns a 403 error, with the error document appearing right on the page in that area, even when the visitor provides both. The server passes neither referrer nor user agent when delivering this file, so it always returns the 403 error document there.

lucy24

6:45 am on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



this template file always displays in log files as having been requested by the domain ip instead of by the visitor ip

But isn't that exactly who is requesting it? (Assuming "the domain IP" means the IP of the server where all this is happening.) The visitor requested the overall page, and will go on to request images and scripts and stylesheets. But unless there was a serious mistake in coding, they're not requesting your php template; that's an internal request that the user knows nothing about.

If I block requests

If you're doing this in mod_rewrite, make sure you use the [NS] flag so the rule isn't triggered on internal requests. You may also need a RewriteCond referencing %{THE_REQUEST}, though the flag alone should be enough.
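For example (the blank-UA pattern here is only an illustration of the kind of blocking rule in question):

# hypothetical blocking rule; [NS] stops it firing on internal subrequests
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [F,NS]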

JS_Harris

8:13 am on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



No, they aren't requesting the php template file for this area, the server is. It's a shared environment so the ip actually resolves to many sites on that server if you look it up. I have two similar sections on each page but only one is having this problem. The NS flag doesn't work precisely because the request is seen as coming from a different source (hence it shows up in logs). I'm concerned with why only this one template file is being requested by the server instead of... well, not showing up in logs at all like the others.

Bleh, I had to retype that, I keep hitting the "post reply" button below instead of submit when I'm not paying attention, lol. That erases my post! Hope it's clear anyway.

[edited by: JS_Harris at 8:19 am (utc) on Feb 11, 2016]

whitespace

8:15 am on Feb 11, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



My site has an area that, via php, calls a template file


Presumably this "template file" is being include()'d (or file_get_contents(), fopen() etc.) in order to construct the page to return to the client?

It sounds like you are making an external HTTP request to this file instead of an internal filesystem request. ie. you are doing "something like":


// By specifying an absolute URL, PHP makes an external HTTP request for the resource
include('http://example.com/templates/mytemplate.html');


Instead of a "normal" PHP include:


// Include the file from the local filesystem (normal)
include($_SERVER['DOCUMENT_ROOT'].'/templates/mytemplate.html');



JS_Harris

8:22 am on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The php that calls the file is an include. Included is one file that performs all the tasks, one of which is to display the output within the template file contents. Unfortunately that file is encrypted and seems to be detecting the url path all the way to the server level... which is actually working. I know this is probably specific to the software, so the answer will be no, but is there a generic way of overriding such a request? None of the other files are encrypted, just the one that barks out orders.

It's using
dirname(__FILE__) . "/etc...

JS_Harris

10:57 am on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



actually, it's automatically finding the path of the template file, which appears to go all the way back to /home/account/public_html/folder/templates/etc.

JS_Harris

7:58 pm on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



**UPDATE**

To keep things simple for further testing I created a file that says hello and included it
<?php include('hello.php');?>

The above works. I added some code to the file which requires a few other files to be processed. Again, the above still works. At this point in my log files I see hello.php being called by the domain ip while the page itself is called by my computer's ip. hello.php shouldn't be showing up at all, which suggests a misconfiguration of some sort.

I added the following to my htaccess file
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [F]

Result: where hello.php appears on the page has been replaced with my 403 ErrorDocument, but the rest of the page looks fine. The server logs show the request did not come with a user agent or referrer, so it was blocked. I changed the htaccess code to this...

RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [NS,F]

and now the contents of hello.php appear, BUT the RewriteRule is no longer blocking empty (or dashed) user agents, or anything else. I used an addon to strip my browser's user agent, for example, and no luck, the page shows. I further tested by blocking xenu and blocking my own IP, but with the NS flag the line no longer works. Is NS not a flag my host supports? I don't know.

Would it then be my host that is viewing includes as external and not my code? Still at a loss. The top of my htaccess file on this site looks like this, if it matters....
Options -Indexes
Options +FollowSymlinks
Options -MultiViews
RewriteEngine On
RewriteBase /

lucy24

9:01 pm on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<tangent>
I remember when I made a deny-blank-UA rule, it inexplicably wouldn't work as a RewriteRule. To achieve the identical end I shifted it to mod_setenvif:
BrowserMatch ^-?$ keep_out
followed by a mod_auththingummy directive.

And then, of course, you'll need an exemption for the specified IP to say !keep_out
</tangent>
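As a sketch, that combination would look something like this (the IP is a placeholder, and the access-control lines are old-style Apache 2.2 directives):

# flag requests whose User-Agent is empty or a bare hyphen
BrowserMatch ^-?$ keep_out
# exemption: clear the flag for your own IP (placeholder address)
SetEnvIf Remote_Addr ^203\.0\.113\.42$ !keep_out
Order Allow,Deny
Allow from all
Deny from env=keep_out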

Incidentally, you can combine all those "Options" statements in a single line, since they all begin with a + or -.
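That is:

Options -Indexes +FollowSymlinks -MultiViews

(This only works when every keyword carries its own + or -, as yours do.)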

Now, for the heck of it you could try
AddOutputFilter INCLUDES .php
It should make no difference, since you're dealing in php includes rather than SSIs, but who knows.

whitespace

9:21 pm on Feb 11, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



<?php include('hello.php');?>


Is this still within your "software", or is this a completely separate example on your server?

You said earlier that it was just that one "template" that appeared to exhibit this behaviour, yet you seem to have been able to easily reproduce it with a simple include?

It's not "normal" for this to trigger an Apache subrequest. (In fact, I would have said it was very wrong?) There are specific PHP functions for doing that, if that is required. It's not uncommon for a site to have hundreds of include()'s to generate a single page and for each to trigger a subrequest would be... well, a nightmare!?

Just try specifying a path in the include, like this:

<?php include('./hello.php');?>


Yes, this is essentially the same as above (the current directory); however, by explicitly including a path (just "./"), the include_path will not be searched. What is the include_path?
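To illustrate the difference (the include_path shown in the comments is just a typical default):

<?php
// With a bare filename, include() consults include_path first
// (often something like ".:/usr/share/php"), then falls back to the
// calling script's own directory and the current working directory.
include('hello.php');

// With an explicit path component, even just "./", include_path is
// bypassed and only that location is tried.
include('./hello.php');

// You can inspect the configured search path directly:
echo get_include_path();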

JS_Harris

10:04 pm on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This was a new example; yes, I was able to reproduce the problem, so it's not a problem within the software but a general one, apparently.
<?php include('hello.php');?> and <?php include('./hello.php');?> both resolve the same way; both are triggering the subrequest. This is a primary domain (and only domain so far) on a shared account with a host I haven't used before (but it's popular). All of the DNS settings look correct, everything is at its default, I've changed nothing.

According to the server documentation the path is /home/account/public_html/, with "account" being replaced with my account name. The documentation says that files in public_html will be treated as domain root. I am able to assign other domains within the account by placing their contents in a folder inside public_html and going through their process, but this site is not contained within a folder under the root; it's all placed directly inside public_html.

Complicating things is that this is also happening with require_once, not just include (I am still testing). I've asked for support; they say all looks fine and suggest my code for blocking blank referrers is likely wrong, but they can't evaluate code. They said the pathing is fine.

The site is working, I'm not sure what else I can look into to fix this. There are no other addons or CDNs or caching involved, and not even any software besides a nearly empty test file now.

whitespace

10:33 pm on Feb 11, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Just another example to try...

Create "helloworld.php" containing the following (just a function definition):


<?php
function helloworld() {
return 'Hello World';
}


Then, in your calling file (index.php?):


<?php
error_reporting(E_ALL | E_STRICT);
ini_set('display_errors','1');
include('helloworld.php');
echo helloworld();


Is "Hello World" output? If this involved an Apache subrequest I wouldn't necessarily have expected this to work. (?)

JS_Harris

11:46 pm on Feb 11, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes it was index.php. I performed your suggestion and it does output 'Hello World'. I named the requesting file 'test.php' and the server logs show my IP requesting test.php and the domain ip requesting helloworld.php.

I also /facepalmed when I noticed something else in the logs just now. The site is new, but a real visitor hit the site and this happened:

- visitor loaded up the index page
- a hit with the visitor ip was generated for '/' and the referrer was the site linking to mine
- a hit with the visitor ip was generated for the css file and the referrer was my domain (http://www.example.com/)
- a hit with the visitor ip was generated for the header image and the referrer was my domain
- a hit with the domain ip was generated for the template folder requested by the include file, no referrer

I was focusing on the domain ip issue, but shouldn't the referrer have been the same for ALL of those requests from that one hit? And shouldn't all of them have been the domain he came from? I don't have this problem with my other shared hosting account at another company, or on my main servers with a 3rd company, so I'm really wondering if something is wrong with the settings at this point. It's probably not code related... is it?

lucy24

1:30 am on Feb 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



shouldn't the referrer have been the same for ALL of those requests from that one hit?

No, what you're describing is perfectly normal: The human user requests the page, with referer = whoever sent them (linking page or search engine, for example). Once your human's browser has loaded up the html, it then requests all supporting files, now giving the page as referer. That's why a browser is called a User-Agent: it does all the behind-the-scenes stuff your human doesn't know has to be done, so the human doesn't have to type in twenty separate URLs for a single page request. The only time you don't see this referer pattern is in some infuriating Android variants where the page and supporting files all give the same referer, such as a search engine, and boy does it make a mess of logs. Conversely, requests for the favicon will generally come through without any referer if it's not explicitly named in the html (link rel="icon" or whatever the wording is).
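In a combined-format access log the pattern looks something like this (IP, timestamps, sizes and URLs all invented for illustration):

203.0.113.7 - - [11/Feb/2016:06:29:00 +0000] "GET / HTTP/1.1" 200 5120 "http://referring-site.example/links.html" "Mozilla/5.0 ..."
203.0.113.7 - - [11/Feb/2016:06:29:01 +0000] "GET /style.css HTTP/1.1" 200 880 "http://www.example.com/" "Mozilla/5.0 ..."

Same visitor, but the referer shifts from the linking site to your own page once the supporting files are requested.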

I still don't get why the internal request is even showing up in logs at all, though. Mine don't. Is it your own server? Does the internal request go away if you change the LogFormat or CustomLog directive or, uh, something else that whitespace will be able to explain?

JS_Harris

2:01 am on Feb 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's not my own server, it's a low cost shared account with a host that is popular for those, hostgator. I've never hosted a site with them before, until this project. I realized about the referrer logs after I posted the above; I've been staring at them all day trying to figure this out, and nothing seems right after a while, oops.

Anyway. An empty htaccess file with only the following (which blocks blank user agents) causes whatever is included to return a 403 error for only that section of the page, but the rest of the page itself is fine.
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [F]


.htaccess is in the right place, in the public_html folder where the rest of the primary domain's contents are. Without the above the page loads fine, save for the odd log data, which I could live with if it didn't stop me from blocking blank referrer/UA requests. I'm wondering if moving the site contents to a sub-folder would clear up the path issue; perhaps this only happens to the primary site. They don't support having a primary site in a sub-folder, so I'd have to get another domain on the account to swap it with.

lucy24

3:20 am on Feb 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As a stopgap, you can add a RewriteCond:
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteCond %{REQUEST_URI} !whatever-the-filename-is
RewriteRule .? - [F]

JS_Harris

4:26 am on Feb 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteCond %{REQUEST_URI} !=/full/path/of/problempage\.php
RewriteRule .? - [F]
The above worked to allow blocking of blank user agents/referrers. !^/problempage.php, !^/problempage\.php and !^problempage\.php did not work; the full path with '=' was required?! The server still insists on calling that one file, so every log entry for a valid url that requires it to be included results in a second log entry for that one file from the domain ip. I did not add [NC] to the REQUEST_URI condition since the server will never request mixed or upper case, but people might, or is that incorrect?

This stopgap also added another bit of info to the puzzle:
- when the first include works, they ALL work
- when the first include fails they ALL fail
- I could not, for example, set the REQUEST_URI to the second include file and make that work without adding the first
- Nor do I need to add the second include filename to the above to make the second one work if the first is including properly
- That's why only one file is displayed in the code snippet above, even though two are being called. The problem seems to cascade.

More clues but less of an idea what's going on. At least my visitors won't know, thanks for the help.

lucy24

5:49 am on Feb 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I did not add [NC] to the request URI since the server will never request mixed or upper case but people might, or is that incorrect?

Nope, it's correct.* The only time [NC] or equivalent is appropriate is when you genuinely need to allow all possible casings of the entire request. If the only options are "filename" and "FileName", or "html" and "HTML", write them out. [NC] is equivalent to saying
[Ff][Ii][Ll][Ee]
... et cetera, I get tired even typing it out! The server will not gratuitously change the case of anything.

full path with '=' required?!

That is odd. Does it work if you omit the anchors and just give the filename, assuming you don't have unrelated files with the same name in other directories? I'm also assuming the include file really is located in the root, because otherwise of course the form ^/filename (with anchor) won't work. Requiring the full path does imply that this rule is executing at some later time, after paths have been worked out.

:: vague association with behavior of RewriteRules inside a <Files> envelope, but you're officially not supposed to do that so it's probably not a useful comparison ::

That's why only one file is displayed in the code snippet above, even though two are being called. The problem seems to cascade.

Ooh, this is all so interesting. I don't understand it, but it's interesting. If I'm getting it right, the server only counts the first include as a loggable request, and then everything else happens quietly behind the scenes the way it's supposed to?

:: grasping at straws ::

Does any part of the code involve an unfamiliar php version, located somewhere other than where the server would normally expect to find it?


* At least in English. Negative questions are tricky.

whitespace

10:57 am on Feb 12, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



I still don't get why the internal request is even showing up in logs at all, though. Mine don't.


Mine neither.

The logged referers (ie. your domain) for your CSS and images look correct though - as lucy24 says.


RewriteCond %{REQUEST_URI} !=/full/path/of/problempage\.php


I have come across situations on shared servers where it was required to include "RewriteOptions Inherit" in the .htaccess file in order to seemingly "correct" URL/paths like this - as if dependent on some parent config in some way? Not sure exactly what was going on though. Incidentally, if you are using the "=" operator, there is no need to escape the dot (although I guess it doesn't matter either?).
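For reference, that Inherit directive just goes near the top of the .htaccess file (worth a try here, at least):

# pull mod_rewrite configuration down from the parent scope; on some
# shared hosts this seems to normalise how URL/paths are seen
RewriteOptions Inherit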

Requiring the full path does imply that this rule is executing at some later time, after paths have been worked out.


Although the REQUEST_URI variable would never "normally" contain the filesystem path? (Or could it?) The REQUEST_FILENAME variable, on the other hand, contains the full filesystem path when processed later in the request ("after paths have been worked out", such as in .htaccess). However, when processed early (such as in the server config), REQUEST_FILENAME contains the same value as REQUEST_URI.

The REQUEST_URI variable does get updated when the request is rewritten. Could this imply that the request has already been rewritten (by a server config)?
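For what it's worth, that distinction is why the common front-controller idiom tests REQUEST_FILENAME rather than REQUEST_URI; the file checks only make sense once the URL has been mapped to the filesystem (a generic sketch, not anything from this thread):

# route requests for non-existent files/directories to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]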

If I'm getting it right, the server only counts the first include as a loggable request, and then everything else happens quietly behind the scenes the way it's supposed to?


Really?! If it's only the first include() that gets logged then a (server) "solution" feels even less likely. (?)

My guess (wild stab in the dark) is that it's something to do with the (CGI?) handler used to process the PHP files? But if it's only happening to the first include()....?!

Aside: PHP provides the virtual() function (never used it myself) for specifically triggering Apache subrequests (although not recommended for PHP files, and it's only available when PHP is installed as an Apache module). However, even for this function, the docs state, "The requested file will not be listed in the Apache access log."
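From the docs, usage is simply this (the URI is made up):

<?php
// Performs an Apache subrequest for the given URI and sends its output
// to the client; only available when PHP runs as an Apache module.
virtual('/footer.shtml');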

JS_Harris

6:33 pm on Feb 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Incidentally, if you are using the "=" operator, there is no need to escape the dot (although I guess it doesn't matter either?).

Correct, BUT it does matter: the dot could not be escaped. I double-checked the working code, and the dot is not escaped, because escaping it causes the rule to fail and throw a 403 error for that request. ie: the internal request is not ignored; the server takes the backslash as a literal part of the URI. I put the wrong version up above; remove the dot escape if you try it.

I moved my test files over to two other domains on two other hosts, both with php 5.4+ and up to date on everything else that is relevant, and this problem is not reproduced. So for that reason, given the workaround that lets me properly block/allow traffic, I'm going to tuck this issue away and move on. Perhaps a future update of the server will resolve the issue; the workaround shouldn't affect anything.

If you can think of anything else to look at, test or try I'd be glad to see what the outcome is.

whitespace

7:21 pm on Feb 12, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Correct BUT it does matter


Thanks for the clarification. My first thought was that it was in error and would have matched a literal backslash, but since you said it was working I went with it. (Since backslash escapes are allowed in some other non-regex arguments.)

Can you just confirm whether it was just the first include() that resulted in the loggable request and not all include()s? Thanks.

lucy24

9:42 pm on Feb 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Although the REQUEST_URI variable would never "normally" contain the filesystem path? (Or could it?)

Probably depends on your definition of normality :)

<tangent>
On one site I have a clutch of RewriteRules inside a <FilesMatch> envelope. I didn't know you aren't supposed to, and it seemed a quick way to add rules that apply only to image requests without having to specify in every single rule; now that I've done it, it isn't worth changing. The two things I learned are:
#1 Unlike rules in <IfModule> envelopes, these are executed at an entirely different time, so you need to say "RewriteEngine on" all over again. (I didn't bother with "RewriteOptions inherit", because who cares if an image request comes in with the wrong www, and I don't get enough blockable non-page requests to bother about.)
#2 In the body of the rule, the ^ anchor can't be used, because ^ no longer means "domain root". I've assumed-- but haven't tested-- that the same would apply to a RewriteCond pertaining to REQUEST_URI, since this is generally identical to whatever is given in the body of the rule.
</tangent>

JS_Harris

6:36 pm on Feb 13, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Somewhere between 4:30am and 4:37am the problem resolved itself. I was asleep at the time, so I have to assume someone at the hosting company fixed whatever was causing the issue. I downloaded a copy of all related files and compared them with those on my desktop and they are identical, so nothing within the code appears changed. I dropped the exception from htaccess and no errors occur, nor does the extra call by the domain appear in logs.

I can confirm only the first instance caused a logged response, whitespace. I'd still really like to know what was causing this in case it comes back.

JS_Harris

8:30 am on Feb 18, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, the bandage put on by my host behind the scenes worked to resolve this, but it also broke my cpanel's visitor reports, so when I contacted them again they reverted whatever they had fixed and the problem is back... but I figured it out (99% sure anyway). In case others experience this:

- Shared environment
- Host allows addon domains
- For each addon domain the host creates a subdomain of the same name for the primary domain(!?!)
- The addon domain is reachable directly OR by visiting addon.primarydomain.com
- The host creates a catchall redirect from subdomains to a folder of the subdomain at the server level (not in htaccess, can't be overridden)

That last part is the problem: anything.primarydomain.com becomes primarydomain.com/anything, and there is no way to stop this behavior; it's how they handle addon domains. They do add proper DNS records, but the mandatory catchall-type redirect, despite being redundant, remains.

Turns out my htaccess security won't allow some of what my host does behind the scenes, so things break: getting a path via php for an include, for example, or all of the stats gathering in cpanel. I have that security in place so that if people look for subdomains that aren't whitelisted they get nowhere.

Thing is, I don't want people (and especially not search engines) seeing my addon domain content by visiting addon.primary.com or primary.com/addon, so the blocking will stay. I could redirect those, but since it's a catchall redirect my host is doing, I don't want to. I just need to put up with a few broken cpanel functions and needing to whitelist some calls to make things work as I want them.

My biggest worry, actually, is that my host notices error messages in THEIR server logs coming from my account and they decide to be helpful and undo what I've done. I'm sure they know that creating subdomains for the primary domain for each addon domain isn't a great idea... they have to know.

lucy24

8:12 pm on Feb 18, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



My "a" key is acting up. I got tired of chsing after it.
The addon domain is reachable directly OR by visiting addon.primarydomain.com
- The host creates a catchall redirect from subdomains to a folder of the subdomain at the server level(not in htaccess, can't be overridden)

That last part is the problem, anything.primarydomain.com becomes primarydomain.com/anything and there is no way to stop this behavior, it's how they handle addon domains

I'm confused. So, with the redirect, each addon is reachable either as
addon.com
or as
example.com/addon
(the latter corresponding to the physical directory structure in a "primry/addon" setup)
but not as
addon.example.com/
because that third option is redirected at the server level? Then what's the point of the /addon redirect (an URL form that nobody would ever think to try, except possibly a search engine looking for loopholes that deserve a 404?)

I understand that you can't stop the server-level redirect. But what's to stop you from instituting a further redirect of your own, from
example.com/addon/blahblah
to
addon.com/blahblah
? You'd need a %{THE_REQUEST} condition to prevent infinite loops. Sure, this will occasionally lead to a chained redirect-- but that serves the search engine right for requesting an URL they had no business requesting in the first place.
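Something along these lines in the primary domain's .htaccess, with "addon" and the domains standing in for the real names:

# only fire when the client literally asked for /addon/...;
# THE_REQUEST holds the original request line and is not changed by
# internal rewrites, so the redirect can't loop
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteCond %{THE_REQUEST} \s/addon/ [NC]
RewriteRule ^addon/(.*)$ http://addon.com/$1 [R=301,L]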

JS_Harris

12:24 am on Feb 19, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The redirect takes (.*).primarydomain.com/ and converts it to primarydomain.com/(.*), so to speak. It's happening in a way that cannot be undone by the site owner, even with full control of htaccess and DNS records. It's a permanent entry that the host simply isn't giving the webmaster control of in his/her cpanel. It does show up on the redirects page within the webmaster's dashboard for the primary domain, but deleting it does nothing; it remains despite saying it was deleted. The dashboard says "deleted" but then shows you it's not, regardless of how many times you try.

That catchall redirect IS resulting in the problem I described in the OP since my site's htaccess forbids it from working. I've essentially broken a lot of things in my cpanel by returning a 404 for any subdomain or folder access on the primary domain.

Short version: The host redirect has to work for the problem above not to happen. Unfortunately the host redirect causes 3 different versions of the content found on any addon domain.

siteb.sitea.com
sitea.com/siteb
siteb.com

All three of those resolve to the same content. Google doesn't usually find it, but a savvy webmaster who figures out your host can bombard the internet with links to siteb.sitea.com to create unlimited copies of the content on siteb.com. If he uses adult words, the catchall passes those via 301 to the addon domain.

In other words I'm keeping my site broken to prevent that from happening. I choose having problems with an include over potential massive duplicate and spam issues with a search engine.

edit: Come to think of it, this is also a huge security issue. A webmaster may think siteb.com is protected thanks to the htaccess file on siteb.com, but since reaching the content on siteb.sitea.com uses the htaccess file of sitea to access siteb, some of the rules may or may not work at all. I really don't like the host creating a subdomain and catchall redirect for EVERY addon domain placed on the hosting account. BOTH htaccess files would need to work together.

God forbid you have a nocache value in the primary domain's htaccess file and a max age 42000 in the addon domain's htaccess file; that would become "nocache, must revalidate, max age 0, max age 42000" when you visit the addon domain, for example. Yeah, not loving addon domains at the moment; that catchall redirect is not a good thing.

*confirmed*, I just tested that last paragraph by placing this in the htaccess file of both the primary domain and the addon domain
Header always set X-Content-Type-Options "nosniff"
Visiting the addon domain and checking headers I see
X-Content-Type-Options nosniff,nosniff
so it's combining commands from both htaccess files.

lucy24

4:26 am on Feb 19, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



reaching the content on siteb.sitea.com uses the htaccess file of sitea to access siteb

I don't see how this cn hppen. I hve ordered a new keybord.

htaccess is based on physical filepath, not URL, so no matter what URL you use, the request will pass through all the same htaccess file(s), just as it would be governed by the same <Directory> sections in config. (Tht's how, for exmple, I cn hve all my access-control directives in the htaccess file for my userspace, even though this directory is not and cannot be reached by any site visitor. Only the server nd the Fetchpuppy knows tht it exists.) The only difference is in potential <Location> sections of config, but tht doesn't seem to be n issue.

whitespace

9:24 pm on Feb 19, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



...but I figured it out(99% sure anyway).
That catchall redirect IS resulting in the problem I described in the OP since my site's htaccess forbids it from working.


I fail to see how the addon domain, subdomain, redirects or .htaccess have anything to do with your OP of the (internal) PHP include appearing in your log file?

The PHP include discussed above should be an entirely "internal" request (for want of a better term) on the server, handled internally by the PHP engine/handler. Subdomains, redirects and .htaccess files act only on external requests. (?)

I think "the problem" is something which happens much earlier. And the catchall redirect is perhaps just another manifestation of that problem.


- The addon domain is reachable directly OR by visiting addon.primarydomain.com
- The host creates a catchall redirect from subdomains to a folder of the subdomain at the server level


Shouldn't that be "to a folder of the primarydomain"?

Errm, that redirect really shouldn't happen (at least not "out of your control"). Although this is a feature of cPanel (to redirect), you generally don't want that (I think it just adds a directive to .htaccess anyway). It is more common to redirect the other way and make the folder inaccessible. Incidentally, accessing a subdomain changes the document root. Accessing a subfolder obviously does not. So, code might not work without modification when accessed by the subfolder.

That last part is the problem, anything.primarydomain.com becomes primarydomain.com/anything and there is no way to stop this behavior


So, it's not possible to have a subdomain behave like a subdomain?!

Unfortunately the host redirect causes 3 different versions of the content found on any addon domain.

siteb.sitea.com
sitea.com/siteb
siteb.com


But doesn't siteb.sitea.com redirect to sitea.com/siteb? So, only really 2 different versions are found?

whitespace

9:34 pm on Feb 19, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



a webmaster may think siteb.com is protected thanks to the htaccess file on siteb.com but since reaching the content on siteb.sitea.com uses the htaccess file of sitea to access siteb some of the rules may or may not work at all.


Since siteb is an addon domain (also a subdomain siteb.sitea.com) which is located in a subfolder of sitea, then the same .htaccess files are processed regardless of which URL is accessed (siteb.com, siteb.sitea.com or sitea.com/siteb) - as lucy24 states, "htaccess is based on the physical filepath". However, the host and file paths are obviously different.

JS_Harris

11:57 pm on Feb 20, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



But doesn't siteb.sitea.com redirect to sitea.com/siteb? So, only really 2 different versions are found?

No, the redirect the host uses isn't behaving like a normal redirect. You can see the redirect instruction (catchall) on the redirects page, but if you visit the subdomain it is NOT sending you to the related folder; all 3 versions resolve without redirecting.

That being said, if my addon site, which resides in a folder under the primary domain, makes a call to include a file, the pathing is wrong. Accessing it with "/includedfile.php" or "./includedfile.php" works, but if INCLUDEDFILE.PHP has code on it requiring a further include of something like a template file, then things get screwy: the path must go all the way back up the chain, and a local call does not work for includedfile.php, hence the 500 error message where the template area would have been on the original page.

I totally get that an .htaccess file works based on the physical filepath, but that's not happening as intended with this host. I further tested it.

To test: On the primary domain level I included an htaccess file telling a browser to return a nosniff header. On the addon domain level, I placed a 2nd htaccess file in the root folder of that addon domain which said to return a noimageindex header. ie: primary domain = only a nosniff entry and addon domain = only a noimageindex header.

Result: When visiting the primary domain there is only a nosniff header, but when visiting the addon domain you get nosniff + noimageindex headers with this host: 2 different commands from 2 different files placed at the root level of 2 individual sites (1 primary, 1 addon) = both show for the addon domain. They are indeed BOTH having an effect on a visitor's browser.

I've never experienced anything like this before. I actually went and removed everything from this server to make sure nothing else was influencing what I see during testing, and it's not. It's also not a cache issue. Whatever is misconfigured, I don't have access to it, and apparently that's how the host handles addon domains.

lucy24

2:27 am on Feb 21, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To test: On the primary domain level I included an htaccess file telling a browser to return a nosniff header. On the addon domain level, I placed a 2nd htaccess file in the root folder of that addon domain which said to return a noimageindex header.
<snip>
Result: When visiting the primary domain there is only a nosniff header result but when visiting the addon domain you get nosniff + noimageindex headers with this host, 2 different commands from 2 different files placed at the root level of 2 individual sites(1 primary, 1 addon) = both show for the addon domain.

But, but, splutter-- But that's exactly what I would expect to see. Requests for the "primary" domain pass only through the outer htaccess, and are subject only to its directives. Requests for the "addon" domain pass through the outer htaccess and then the inner htaccess, and are therefore subject to both directives. (Obligatory reminder: mod_rewrite behaves differently, since it does not inherit by default. All other mods do.)
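To picture it (paths and header names invented purely for illustration):

# /home/account/public_html/.htaccess -- seen by EVERY request,
# including requests for the addon living in the subfolder:
Header set X-Outer-Htaccess "primary"

# /home/account/public_html/addon/.htaccess -- seen only by requests
# that map into the addon's folder:
Header set X-Inner-Htaccess "addon"

Requests for the primary domain come back with only X-Outer-Htaccess; requests for the addon come back with both.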

Snipped:
ie: primary domain = only a nosniff entry and addon domain = only a noimageindex header
Do you mean that this is what you expected to happen? The expectation was incorrect, though :(

Now, it's possible that your site files are aliased in such a way that you think of the "primary" and "addon" as being parallel. But in fact the addon is inside the primary; that's essentially what the terms mean.