Exploding when you don't know the number of elements

         

csdude55

5:55 am on Jan 7, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have a ton of parked domains, and force all of them to rewrite (R=301) to "www" via Apache.

I include a variables.php script on every PHP script of the site, and I use this to determine the domain name being used:

list($subdomain, $domain, $tld) = explode('.', $_SERVER['HTTP_HOST']);


That's all good 99.9999% of the time, but occasionally I see this warning:

Undefined offset: 2

I'm assuming this means that the $tld param doesn't exist, which implies that there's no www. So I'm guessing that when someone types in my website without the "www", the variables.php script fires for long enough to write this error, but then it redirects before the user notices.

This isn't a major problem, but since I'm working on all the warnings... why not?

I can think of a few ways to potentially fix it:

1. I could use str_replace() to remove any of the subdomains that are expected before exploding;

2. I could use substr_count() to count the number of . in the string, then explode accordingly;

3. The most appealing option is to use an array instead of list(), then assign names backwards manually *.

But since this is a glitch that's only going to occur on one page per session and is totally invisible to the user, I don't think that it's worth any extra load on every page. So before I do that, any other ideas?


* assigning names manually, something like:

$arr = explode('.', $_SERVER['HTTP_HOST']);

$arrLength = count($arr);

$tld = end($arr);
$domain = $arr[$arrLength - 2];

// since this might not exist, give it a fallback
$subdomain = $arr[$arrLength - 3] ?? false;

JorgeV

10:40 am on Jan 7, 2021 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Hello,
$arr=explode('.',$_SERVER['HTTP_HOST']);

$tld=array_pop($arr);
$domain=array_pop($arr);
$subdomain=join('.',$arr);


if there is no subdomain, the string will be empty. If there are several subdomains, they'll all be listed in the $subdomain variable, separated by dots.
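
A quick way to see this behaviour is to wrap the snippet in a function and feed it a few host names. The function name split_host() and the sample hosts are made up for illustration; this is only a sketch of the approach described above.

```php
<?php
// Hypothetical helper wrapping the array_pop() approach above.
function split_host(string $host): array
{
    $arr = explode('.', $host);

    $tld = array_pop($arr);        // last label
    $domain = array_pop($arr);     // second-to-last label, or null if there is none
    $subdomain = join('.', $arr);  // whatever is left, joined back with dots

    return [$subdomain, $domain, $tld];
}

var_dump(split_host('www.example.com'));  // subdomain is "www"
var_dump(split_host('example.com'));      // subdomain is ""
var_dump(split_host('a.b.example.com'));  // subdomain is "a.b"
```

Because array_pop() returns null on an empty array rather than raising a warning, a host with no dots at all would leave $domain as null instead of triggering "Undefined offset".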

csdude55

5:07 pm on Jan 7, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks @JorgeV :-)

I tried a third option, too, that turned out to be marginally faster:

$arr = explode('.', $site);
$arrLength = count($arr);

if ($arrLength == 3)
list($subdomain, $domain, $tld) = $arr;

elseif ($arrLength == 2)
list($domain, $tld) = $arr;


I benchmarked all 3 of them over 10,000 iterations, but I had to put the explode() statement in the loop or the test on yours didn't work right (it kept popping off the same array). The results:

Using end(): 0.002885103225708
Using array_pop(): 0.0027980804443359
Using list(): 0.0025370121002197

But! There's also a question of size. Since the 2-element-length is really just a backup plan to prevent an error, I also have to consider if it takes longer for variables.php to download. So end() is 143 characters, array_pop() is 76, and list() is 164... meaning, the fastest to process also takes a few ms longer to download, so it might still take longer in the long run.

The benchmark test:

$site = 'www.example.com';

// Test 1
$arr = [];
$tld = $domain = $subdomain = false;
$start_time = microtime(TRUE);

for ($i = 0; $i < 10000; $i++) {

$arr = explode('.', $site);

$arrLength = count($arr);

$tld = end($arr);
$domain = $arr[$arrLength - 2];

// since this might not exist, give it a fallback
$subdomain = $arr[$arrLength - 3] ?? false;

}

$end_time = microtime(TRUE);

echo "Test 1: ($tld - $domain - $subdomain) ";
echo $end_time - $start_time;
echo "\n";


// Test 2
$arr = [];
$tld = $domain = $subdomain = false;
$start_time = microtime(TRUE);

for ($i = 0; $i < 10000; $i++) {

$arr = explode('.', $site);

$tld = array_pop($arr);
$domain = array_pop($arr);

// don't use this, just use $arr as an array of subdomains
// $subdomain = join('.', $arr);

}

$end_time = microtime(TRUE);

echo "Test 2: ($tld - $domain - $subdomain) ";
echo $end_time - $start_time;
echo "\n";


// Test 3
$arr = [];
$tld = $domain = $subdomain = false;
$start_time = microtime(TRUE);

for ($i = 0; $i < 10000; $i++) {

$arr = explode('.', $site);

$arrLength = count($arr);

if ($arrLength == 3)
list($subdomain, $domain, $tld) = $arr;

elseif ($arrLength == 2)
list($domain, $tld) = $arr;
}

$end_time = microtime(TRUE);

echo "Test 3: ($tld - $domain - $subdomain) ";
echo $end_time - $start_time;
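
The "popping off the same array" problem mentioned above comes from array_pop() modifying its argument in place. A small sketch (the sample host is assumed) shows why explode() has to be repeated inside the loop for that test:

```php
<?php
$parts = explode('.', 'www.example.com');

// First pass: behaves as expected.
$tld = array_pop($parts);      // "com"
$domain = array_pop($parts);   // "example"

// $parts has been shortened in place and now holds only ["www"].
// A second pass without re-exploding reads the wrong labels:
$tld2 = array_pop($parts);     // "www"
$domain2 = array_pop($parts);  // null, the array is already empty

var_dump($tld, $domain, $tld2, $domain2);
```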

w3dk

6:49 pm on Jan 7, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



So I'm guessing that when someone types in my website without the "www", the variables.php script fires for long enough to write this error, but then it redirects before the user notices.


If Apache redirects to "www" then your "variables.php" script would not be called at all (until the next request). So if "variables.php" is being called without the "www" subdomain then it implies Apache has not redirected the request for some reason.

So, to confirm, every domain should be redirected to the "www" subdomain? "subdomain.example.com" should be redirected to "www.example.com" and "example.com" should be redirected to "www.example.com"?

One issue with using array_pop() (as in JorgeV's post) or end() (as in your suggestion) is that it will fail for any domain registered under a second-level domain (SLD), such as "example.co.uk", or for a host given as a fully qualified domain name (FQDN) with a trailing dot (or maybe this is already canonicalised?). You could use array_shift() instead, but you then have the same problem if the subdomain is missing (without additional checks), and array_shift() is slower. (Although performance really should not be an issue here. On an array of just 3 elements the difference is probably infinitesimal.)

Are you only interested in the $domain? (ie. $subdomain and $tld are just a means to get at the $domain?)
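
To make the SLD and FQDN concern above concrete, here is a small sketch. The helper name last_two_labels() and the sample hosts are assumptions for illustration, and the rtrim() call is just one possible mitigation, not something proposed in the thread.

```php
<?php
// Hypothetical helper: take the last two labels as domain and TLD.
function last_two_labels(string $host): array
{
    $arr = explode('.', $host);
    $tld = array_pop($arr);
    $domain = array_pop($arr);
    return [$domain, $tld];
}

// Domain registered under a second-level domain such as "co.uk":
var_dump(last_two_labels('www.example.co.uk'));  // ["co", "uk"], not "example"

// Name with a trailing dot: explode() yields an empty last element.
var_dump(last_two_labels('www.example.com.'));   // ["com", ""]

// Stripping the trailing dot first (an assumption, see above):
var_dump(last_two_labels(rtrim('www.example.com.', '.')));  // ["example", "com"]
```

Stripping the dot only covers the second case. Names under suffixes like "co.uk" cannot be split correctly by position alone and would generally need a list of known suffixes.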

csdude55

7:41 pm on Jan 7, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If Apache redirects to "www" then your "variables.php" script would not be called at all (until the next request). So if "variables.php" is being called without the "www" subdomain then it implies Apache has not redirected the request for some reason.

My thoughts, too, but that's the only reason I can think of for this offset error...

So, to confirm, every domain should be redirected to the "www" subdomain? "subdomain.example.com" should be redirected to "www.example.com" and "example.com" should be redirected to "www.example.com"?

Correct, yes. I manually define subdomains to be ignored in Apache's /etc/apache2/conf.d/userdata/ssl/2_4/account_name/example.com/configuration.conf file:

# potential subdomains
RewriteCond %{HTTP_HOST} !^(?:www|ww2|beta|images|i|inc|epsilon)\. [NC]
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]


"images", "i", and "inc" represent directories that don't use the variables.php script, and "ww2", "beta", and "epsilon" are completely separate sections with their own variables.php script. So, in theory, this script is ONLY fired when they go to "www"... or maybe no subdomain at all.

Are you only interested in the $domain? (ie. $subdomain and $tld are just a means to get at the $domain?)

$subdomain is just discarded for now, but I do use $tld...

robzilla

11:15 pm on Jan 7, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I also have to consider if it takes longer for variables.php to download. So end() is 143 characters, array_pop() is 76, and list() is 164... meaning, the fastest to process also takes a few ms longer to download, so it might still take longer in the long run.

Hold up, who's "downloading" variables.php? Your script includes it, probably from memory, and we're talking a few bytes here and there. You're not going to notice the difference. The client only downloads the output from the server, and there's not even any output here in this bit of code.

Anyway, if you like brevity, here's another solution in just two lines:
$arr = array_reverse(explode('.', $site));
[$tld, $domain, $subdomain] = [$arr[0], $arr[1], $arr[2] ?? false];

(Note that those should be two question marks near the end there, i.e. the null-coalescing operator. My code is apparently too emotional for the WebmasterWorld forum software.)

Please don't benchmark it and tell me it's slower. Yes, it probably is a millionth of a second slower.

You could use the same shorthand with array_pop(), like so:
$arr = explode('.', $site);
[$tld, $domain] = [array_pop($arr), array_pop($arr)];

(That's if you don't care about the subdomain. If you do, just employ the null-coalescing operator as above.)
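
One caveat with the array_reverse() version above is that $arr[1] is still read unguarded, so a single-label host such as "localhost" would raise the same kind of warning. A possible variant, sketched here with array_pad() swapped in for the null-coalescing operator (the function name and sample hosts are made up for illustration):

```php
<?php
// Hypothetical helper based on the array_reverse() approach above.
function split_host_reversed(string $host): array
{
    // Reverse so the TLD is at index 0, then pad to three elements with false
    // so that missing labels never trigger an "undefined offset" warning.
    $arr = array_pad(array_reverse(explode('.', $host)), 3, false);

    [$tld, $domain, $subdomain] = $arr;

    return [$tld, $domain, $subdomain];
}

var_dump(split_host_reversed('www.example.com'));  // ["com", "example", "www"]
var_dump(split_host_reversed('example.com'));      // ["com", "example", false]
var_dump(split_host_reversed('localhost'));        // ["localhost", false, false]
```

With more than three labels, only the label directly before the domain ends up in $subdomain; the rest are ignored.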

[edited by: robzilla at 11:40 pm (utc) on Jan 7, 2021]

[edited by: phranque at 12:35 am (utc) on Jan 8, 2021]
[edit reason] fixed null coalescing operator for clarity [/edit]

w3dk

11:36 pm on Jan 7, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month



So, in theory, this script is ONLY fired when they go to "www"... or maybe no subdomain at all.


Well, I'm not sure what triggers your script, but that rule certainly prefixes the "www" subdomain to every host (providing it doesn't start with one of the subdomains mentioned) - which would include "no subdomain at all". So, "example.com" would be redirected to "www.example.com" by that rule - if it is processed at all.



..."subdomain.example.com" should be redirected to "www.example.com" ...


Correct, yes.


Although that rule would redirect "subdomain.example.com" to "www.subdomain.example.com", if "subdomain" is not manually defined in your rule.

I'm assuming this means that the $tld param doesn't exist, which implies that there's no www.


This is an "assumption"... you need to debug this further and log the value of HTTP_HOST when this notice occurs. Maybe "www." is being prefixed, but the initial request is to "localhost" or something?!
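
A minimal sketch of that kind of logging, assuming it runs in variables.php just before the explode() line. The helper name unexpected_host_message() and the message format are hypothetical; error_log() writes to whatever log PHP's error_log setting points at.

```php
<?php
// Hypothetical helper: returns a log message when the host does not have
// exactly three dot-separated labels, or null when it looks as expected.
function unexpected_host_message(string $host, string $uri = ''): ?string
{
    if (count(explode('.', $host)) === 3) {
        return null;
    }
    return sprintf('Unexpected HTTP_HOST "%s" for request "%s"', $host, $uri);
}

// In variables.php, before the list()/explode() line:
$message = unexpected_host_message(
    $_SERVER['HTTP_HOST'] ?? '',
    $_SERVER['REQUEST_URI'] ?? ''
);
if ($message !== null) {
    error_log($message);
}
```

Once a few of these entries have been collected, the logged host values should show whether the request really arrived without "www" or with something else entirely.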

w3dk

11:41 pm on Jan 7, 2021 (gmt 0)

10+ Year Member Top Contributors Of The Month


@robzilla To type double-? you can do this hack: ?[b][/b]?

phranque

12:36 am on Jan 8, 2021 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



@robzilla To type double-? you can do this hack

I made the edit to fix this above.

[edited by: phranque at 12:00 pm (utc) on Jan 8, 2021]

JorgeV

11:19 am on Jan 8, 2021 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Hello,

Just to add that PHP code is compiled into opcode and, with an opcode cache such as OPcache enabled, kept in memory.