Retrieving hostname for an IP address

using $ENV

         

Seige

5:11 am on Apr 24, 2005 (gmt 0)

10+ Year Member



I've been attempting to display some user details with the code below. Somehow, $host always returns nothing. Can anyone explain why this happens, and how to fix it?

#!/usr/bin/perl -w
#
use CGI;
#
$browser = $ENV{'HTTP_USER_AGENT'};
$host = $ENV{'REMOTE_HOST'};
$ip = $ENV{'REMOTE_ADDR'};
#
print "Content-type: text/html\n\n";
#
print "$browser<br>\n";
print "$host<br>\n";
print "$ip<br>\n";
#
exit;
#

output:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;)

123.65.43.21

wruppert

2:17 pm on Apr 24, 2005 (gmt 0)

10+ Year Member



You may want HTTP_HOST.

Here is a CGI script that I use to print server stuff.


#!/usr/bin/perl -T
## printenv -- print CGI environment and perl @INC

print "Content-type: text/plain\n\n";

print "Environment variables:\n\n";
foreach $var (sort(keys(%ENV))) {
$val = $ENV{$var};
$val =~ s|\n|\\n|g;
$val =~ s|"|\\"|g;
print "${var}=\"${val}\"\n";
}

print "\n\nPerl \@INC:\n\n";
foreach $var (@INC) {
print "$var\n";
}

Seige

10:56 pm on Apr 24, 2005 (gmt 0)

10+ Year Member



Thanks, but HTTP_HOST returns my own domain name instead of the remote host, which is what I'm trying to obtain. I think I failed to explain that.

That script looks good, but it doesn't seem to give me my visitors' hostnames.

Does anybody else have a solution?

wruppert

12:47 am on Apr 25, 2005 (gmt 0)

10+ Year Member



You have to have "HostnameLookups On" in your Apache httpd.conf. This forces a reverse DNS lookup on each access. The env variable "REMOTE_HOST" holds the result.

See [httpd.apache.org...]
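To illustrate the point above: with HostnameLookups off, Apache normally leaves REMOTE_HOST unset, so a script can only show the IP. Here is a minimal sketch of a fallback, assuming you are happy to display the address when no hostname is available. The helper name visitor_label is made up for this example and is not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: prefer the hostname the web server resolved,
# otherwise fall back to the bare IP address.
sub visitor_label {
    my ($env) = @_;
    my $host = $env->{REMOTE_HOST};
    return $host if defined $host && length $host;
    return defined $env->{REMOTE_ADDR} ? $env->{REMOTE_ADDR} : 'unknown';
}

print "Content-type: text/plain\n\n";
print visitor_label(\%ENV), "\n";
```

This does no DNS work of its own; it only reports whatever the server already put in the environment.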

Seige

8:45 am on Apr 25, 2005 (gmt 0)

10+ Year Member



Thanks, but those are pretty foreign to me. Don't understand a single thing. :(

Anyway, I found a new piece of code that extracts the "Remote Host" using a different method. I'm not sure what it does, but it works and that's what matters.

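$ip = $ENV{'REMOTE_ADDR'};
@numbers = split(/\./, $ip);          # split the dotted quad into four octets
$ip_number = pack("C4", @numbers);    # pack the octets into a 4-byte address
($host) = (gethostbyaddr($ip_number, 2))[0];   # reverse lookup; 2 is the usual numeric value of AF_INET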
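For anyone unsure what the snippet above does: it packs the IP into four bytes and asks the resolver for the name that belongs to it. Roughly the same thing can be written with the standard Socket module, which supplies inet_aton and the AF_INET constant in place of the hand-rolled pack and the literal 2. This is only a sketch; the function name ip_to_host is an assumption for illustration.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Hypothetical helper: return the hostname for a dotted-quad IPv4 address,
# or undef if the address is malformed or has no reverse DNS entry.
sub ip_to_host {
    my ($ip) = @_;
    return unless defined $ip;

    # Accept only four numeric octets in the range 0..255.
    my @octets = split /\./, $ip;
    return unless @octets == 4;
    return if grep { !/^\d{1,3}$/ || $_ > 255 } @octets;

    my $packed = inet_aton($ip);    # same 4 bytes as pack("C4", @octets)
    return unless defined $packed;

    # In scalar context gethostbyaddr returns just the name (or undef).
    my $host = gethostbyaddr($packed, AF_INET);
    return $host;
}
```

Called as ip_to_host($ENV{'REMOTE_ADDR'}), it still performs a DNS lookup on every call, so the performance concerns raised in this thread apply to it as well.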

sitz

1:56 am on Apr 29, 2005 (gmt 0)

10+ Year Member



You're doing a DNS lookup on the IP address. Note that you'll take a performance hit for doing this; DNS lookups can be very fast, or very slow; it depends on the nameserver which ultimately returns a response.

As an aside, I advise against doing this. Performing DNS lookups is *such* a performance hit, they're generally disabled on the webserver (that's why you're getting the IP and not the hostname in your script). DNS lookups can come back in fractions of a second, or take multiple seconds to run; do you really want to incur that kind of performance penalty for a request to your site?

I run webservers for a living; if I found out that one of my developers wrote a script that did a DNS lookup on each request, the first thing I would do is disable the script. The second thing I would do is take the developer (and a large stick) out behind the woodshed for a stern talking to. =)

Seige

3:48 am on Apr 29, 2005 (gmt 0)

10+ Year Member



Thanks for the input.

The script I'm writing has two functions. One acts as a counter: it extracts data about each visitor and stores it in a pipe-delimited .db file. When a visitor calls the script, it extracts the relevant data (including the hostname) and returns a tiny GIF image.

The other function is basically just to display the data to me in a tabular format.

So, will this still be a "hit" in performance even though it's just a small GIF image?

sitz

11:53 pm on Apr 29, 2005 (gmt 0)

10+ Year Member



Yes; the size of the GIF is not the issue, it's the time it takes to perform the DNS lookup. When you're performing a DNS lookup, one of three basic things can happen (ok, a lot more than three. For this discussion, there's three. =))

1) Your local nameserver has the ip -> host mapping cached already. This will happen when that nameserver has already had to resolve that IP address. In those cases, the only speed limitations are the ones imposed by the network between your webserver and the resolver, and the load under which the resolver is currently operating.

2) If your local nameserver does /not/ have that IP -> hostname data cached, it needs to run out and find it. This can be near-instantaneous, or take a while, depending on the IP address. Results can (literally) take fractions of a second:


$ time host www.webmasterworld.com
www.webmasterworld.com has address 64.33.51.156

real 0m0.018s
user 0m0.000s
sys 0m0.003s

...or quite a bit longer:


$ time host www.example.co.uk
www.example.co.uk has address 212.111.222.33

real 0m1.036s
user 0m0.001s
sys 0m0.002s

3) The IP couldn't be looked up for some reason. This can happen because a) the IP has no reverse DNS or b) the resolver ultimately responsible for that IP -> hostname mapping is unreachable for some reason (network outage, power outage, server issues, etc). In these cases, things can take MUCH longer:


$ time host www.example.com resolver.example.net
;; connection timed out; no servers could be reached

real 0m10.063s
user 0m0.000s
sys 0m0.003s

That's 10 seconds during which a) your webserver is tied up, waiting for a response that's never going to happen (which means that child process or thread can't do anything else, reducing your capacity) and b) your user is sitting there waiting for a response.
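The same kind of measurement can be made from inside Perl, which is closer to what a CGI script would actually experience. The sketch below times a single reverse lookup with the core Time::HiRes module. The function name timed_lookup and the use of 127.0.0.1 as the test address are assumptions for illustration, and the timings you see will depend entirely on your resolver.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);
use Time::HiRes qw(time);

# Hypothetical helper: resolve an IPv4 address and report how long it took.
# Returns (hostname or undef, elapsed seconds).
sub timed_lookup {
    my ($ip) = @_;
    my $start  = time();
    my $packed = inet_aton($ip);
    my $host   = defined $packed ? gethostbyaddr($packed, AF_INET) : undef;
    return ($host, time() - $start);
}

my ($host, $elapsed) = timed_lookup('127.0.0.1');
printf "%s -> %s (%.3f s)\n",
    '127.0.0.1', defined $host ? $host : '(no hostname)', $elapsed;
```

Running it against a handful of addresses from your own logs should give a feel for how much time each request would spend waiting on DNS.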

Seige

10:23 am on Apr 30, 2005 (gmt 0)

10+ Year Member



Geez... That does look bad. Is there a less harmful way to retrieve the hostname? Surely there's got to be one.

But I've come to a point where retrieving hostname is a necessity. So, I'll proceed with it until I find a better method.

sitz

1:15 pm on Apr 30, 2005 (gmt 0)

10+ Year Member



Which begs the question "Why, exactly, do you need the hostname?"

Seige

2:48 pm on Apr 30, 2005 (gmt 0)

10+ Year Member



The hostname gives a hint of what a set of numbers means: where the visitor is from, what service he/she is using, whether the visitor is on dial-up, etc.

Instead of looking up suspicious IP addresses one by one, the hostname gives you an idea right away.

wruppert

3:03 pm on Apr 30, 2005 (gmt 0)

10+ Year Member



I do the hostname and geo lookups from my logs, after the fact, each night. Then I have a nice report waiting for me in the morning. No need to slow down the server and my visitors.

Seige

3:18 pm on Apr 30, 2005 (gmt 0)

10+ Year Member



That's a good idea. That would be an option. But I'm not Perl-literate enough to manipulate the data that is being stored.

Heck, I don't even know how to delete one set of data.

I guess, I'll have to live with this for a while.

sitz

5:22 pm on Apr 30, 2005 (gmt 0)

10+ Year Member



logmucking is easy enough:

cat /path/to/access.log | perl -lane '$ip = $F[0]; $host = (your gethostbyaddr code here); print "$ip:$host"' > ip2hostname_map.txt

Run that on the previous day's logs, and you're set. This assumes, of course, that you don't need real-time information.
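Spelled out as a small script, the one-liner above might look like the sketch below. Like the one-liner, it assumes the client IP is the first whitespace-separated field of each log line. It also resolves each distinct IP only once, which is an addition not in the original command. The names resolve_ip, map_log_ips and ip2host.pl are made up for this example.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Default resolver: reverse DNS for one IPv4 address, '' if nothing is found.
sub resolve_ip {
    my ($ip) = @_;
    my $packed = inet_aton($ip);
    my $host = defined $packed ? gethostbyaddr($packed, AF_INET) : undef;
    return defined $host ? $host : '';
}

# Read log lines from $in, take the first field as the IP, and return
# "ip:host" strings, looking up each distinct IP only once.
sub map_log_ips {
    my ($in, $resolver) = @_;
    $resolver ||= \&resolve_ip;
    my (%seen, @out);
    while (my $line = <$in>) {
        my ($ip) = split ' ', $line;
        next unless defined $ip && !$seen{$ip}++;
        push @out, "$ip:" . $resolver->($ip);
    }
    return @out;
}

# Usage: perl ip2host.pl /path/to/access.log > ip2hostname_map.txt
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print "$_\n" for map_log_ips($fh);
    close $fh;
}
```

Because the lookups happen offline, a slow or unreachable nameserver only delays the report, not a visitor's page.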

Seige

7:38 pm on Apr 30, 2005 (gmt 0)

10+ Year Member



Yes, somewhere around there.

The data is stored in a pipe delimited format:
data1|data2|data3|...|IpData|HostNameData|...|... etc...

Ideally, when a visitor calls the script, it would skip the hostname retrieval. Then, when I call it, it would scan through the data, find the records where the hostname field is empty, perform the lookup, and write the result into that blank.

This way, it'll solve the problem of slow response for my visitors, and I'll still have the hostnames in my logs.

A complicated process, I would say. I don't have the time, energy or resources to work on it at the moment.

Thanks very much for your help, sitz.
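The backfill idea described above could be sketched roughly as follows. The real field positions are not given in the thread, so this assumes, purely for illustration, that the IP is the fourth field and the hostname the fifth. The names backfill_record, resolve_ip, backfill.pl and visitors.db are also hypothetical.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Assumed layout for this sketch only: field 3 is the IP, field 4 the hostname
# (counting from 0). Adjust to match the real .db file.
my ($IP_FIELD, $HOST_FIELD) = (3, 4);

# Default resolver: reverse DNS for one IPv4 address, '' if nothing is found.
sub resolve_ip {
    my ($ip) = @_;
    my $packed = inet_aton($ip);
    my $host = defined $packed ? gethostbyaddr($packed, AF_INET) : undef;
    return defined $host ? $host : '';
}

# Fill in the hostname field of one pipe-delimited record if it is empty.
sub backfill_record {
    my ($line, $resolver) = @_;
    $resolver ||= \&resolve_ip;
    chomp $line;
    my @fields = split /\|/, $line, -1;    # -1 keeps trailing empty fields
    my $ip   = $fields[$IP_FIELD];
    my $host = $fields[$HOST_FIELD];
    if (defined $ip && length $ip && !(defined $host && length $host)) {
        $fields[$HOST_FIELD] = $resolver->($ip);
    }
    return join '|', @fields;
}

# Usage: perl backfill.pl visitors.db > visitors.new
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    while (my $line = <$fh>) {
        print backfill_record($line), "\n";
    }
    close $fh;
}
```

Records whose lookup fails keep an empty hostname, so a later run will simply try them again. Writing to a new file instead of editing in place keeps the original data safe if something goes wrong.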