Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Which Proxy headers do you capture?
There are so many
Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4222008 posted 4:14 am on Oct 26, 2010 (gmt 0)

Google is becoming useless for research. I started googling these individually, and it was a waste of time: site after site of garbage.

I even tried searching just for the Squid documentation on this and couldn't find it... :(

Anyhow, these are the $_SERVER entries I've found to be indicative that we're dealing with a proxy:

$_SERVER['FORWARDED'],
$_SERVER['FORWARDED_FOR_IP'],
$_SERVER['HTTP_CLIENT_IP'],
$_SERVER['HTTP_FORWARDED_FOR'],
$_SERVER['HTTP_FORWARDED_FOR_IP'],
$_SERVER['HTTP_PROXY_CONNECTION'],
$_SERVER['HTTP_VIA'],
$_SERVER['HTTP_X_FORWARDED'],
$_SERVER['HTTP_X_FORWARDED_FOR'],
$_SERVER['MT-PROXY-ID'],
$_SERVER['VIA'],
$_SERVER['X-FORWARDED-FOR'],
$_SERVER['X-PROXY-ID']

But I bet most of them show practically nothing, so I figured you old pros would know better than anyone: which are worth tracking?

I was thinking of:

$_SERVER['FORWARDED'],
$_SERVER['HTTP_CLIENT_IP'],
$_SERVER['VIA'],
$_SERVER['X-FORWARDED-FOR']

Am I missing any? Are all of the above useful?
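
One detail that affects the lists above: under the usual PHP server APIs, a request header shows up in $_SERVER with an HTTP_ prefix, uppercased, with hyphens turned into underscores. Keys such as 'VIA' or 'X-FORWARDED-FOR' would therefore normally stay empty unless the web server sets them itself. Below is a minimal sketch of collecting the candidates under that assumption; the function name collect_proxy_headers is hypothetical, and the header names are simply the ones from the first list.

```php
<?php
// Hypothetical helper: returns the proxy-related request headers found in a
// $_SERVER-style array, keyed by their on-the-wire header name.
function collect_proxy_headers(array $server): array
{
    // Header names taken from the list in the post above.
    $names = [
        'Forwarded', 'Forwarded-For-IP', 'Client-IP', 'Forwarded-For',
        'Proxy-Connection', 'Via', 'X-Forwarded', 'X-Forwarded-For',
        'MT-Proxy-ID', 'X-Proxy-ID',
    ];

    $found = [];
    foreach ($names as $name) {
        // "X-Forwarded-For" becomes "HTTP_X_FORWARDED_FOR".
        $key = 'HTTP_' . strtoupper(str_replace('-', '_', $name));
        if (isset($server[$key]) && $server[$key] !== '') {
            $found[$name] = $server[$key];
        }
    }
    return $found;
}
```

Calling collect_proxy_headers($_SERVER) returns an empty array when none of these headers were sent, which is a cheap "no proxy announced itself" check. It says nothing about proxies that stay silent.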

 

jdMorgan

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4222008 posted 12:34 am on Oct 27, 2010 (gmt 0)

Via and X-Forwarded-For are the ones most often received on my servers. Via shows up most often, frequently together with X-Forwarded-For, rarely with Forwarded, and even more rarely with any of the other proxy-related headers.
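
For anyone storing these values: X-Forwarded-For is conventionally a comma-separated list of addresses, one added per proxy hop, and it is client-supplied, so it can contain anything. A rough sketch of splitting it and keeping only well-formed IPs follows; the function name parse_x_forwarded_for is made up for illustration.

```php
<?php
// Hypothetical helper: split an X-Forwarded-For value into its entries and
// keep only those that are syntactically valid IPv4/IPv6 addresses.
function parse_x_forwarded_for(string $value): array
{
    $addresses = [];
    foreach (explode(',', $value) as $part) {
        $part = trim($part);
        // filter_var() returns false for anything that is not an IP address,
        // e.g. "unknown" or an empty string.
        if (filter_var($part, FILTER_VALIDATE_IP) !== false) {
            $addresses[] = $part;
        }
    }
    return $addresses;
}
```

The result is only what the request claims. Since the header is easy to forge, it is better treated as a hint for logging than as the visitor's real address.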

However, I only log all these headers when responding with a 403-Forbidden, and that may well skew my view of how common each of these headers is in "acceptable" requests.

To be clear, I do not block requests solely because a proxy is being used. I may block that proxy's IP address range, I may block because *any* of the HTTP request headers are malformed, missing, or inappropriate to the request being made, but not simply because a proxy is in use.

Again, since I only log these proxy headers for rejected requests, take care in interpreting my data...
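
A sketch of what "log the proxy headers only on a 403" could look like in PHP. This is an illustration under assumptions, not the setup described in this post: the function name rejection_log_line and the choice of three headers are hypothetical.

```php
<?php
// Hypothetical helper: build one log line describing the proxy headers of a
// request that is about to be rejected.
function rejection_log_line(array $server): string
{
    $keys = ['HTTP_VIA', 'HTTP_X_FORWARDED_FOR', 'HTTP_FORWARDED'];
    $parts = ['ip=' . ($server['REMOTE_ADDR'] ?? '-')];

    foreach ($keys as $key) {
        // json_encode() quotes the value and escapes control characters, so a
        // crafted header cannot inject extra lines into the log. A header
        // that was not sent is written as null.
        $parts[] = $key . '=' . json_encode(
            $server[$key] ?? null,
            JSON_INVALID_UTF8_SUBSTITUTE
        );
    }
    return implode(' ', $parts);
}

// Usage, at the point where the request has already been judged bad:
// http_response_code(403);
// error_log(rejection_log_line($_SERVER));
```

Logging only on rejection keeps the log small, at the cost of the sampling bias mentioned above: it shows which headers accompany bad requests, not how common they are overall.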

Jim

Clark

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4222008 posted 11:11 pm on Oct 27, 2010 (gmt 0)

Thanks Jim. I coded it to grab everything. After I launch and give it some time to populate, I'll report back with my findings too.
