Webalizer User Agents

akogo

6:12 am on Aug 2, 2003 (gmt 0)

Anyone get "PHP/4.2.3" as a user agent in Webalizer? It shows 56268 hits in one day. What exactly is happening?

Goober

7:05 am on Aug 2, 2003 (gmt 0)

Howdy,

I've never seen it before, but searching for it on Google turned up this result, among others:

freshmeat.net: PHP 4.2.3

mincklerstraat

7:25 am on Sep 10, 2003 (gmt 0)

This "agent" is simply something running the scripting language PHP: either a PC with a local webserver used for development, or a "real" webserver on the internet with its own DNS entry and everything. PHP is the most widely installed scripting language that I'm aware of, and 4.2.3 is a fairly recent version of it. I get a lot of PHP/4.2.2 in my stats, a whole lot, so I'm also suspicious.

This is a little weird since PHP, though a remarkably powerful and easy-to-use language, is not well suited to the heavy-duty spidering done by most of the bot user agents you'll see in your stats. I only know of one search engine / spider combination in PHP, php-dig, and it's very rarely used. PHP is better for producing page-view content and database manipulation based on page queries than it is for running operations that aren't triggered by pageviews and go on for an extended period of time.

I'm guessing at the moment that our PHP hits come from some content we make available to other websites. We have a 'PAD file' written in XML which describes some software we have available, and other webservers can grab this and look at it to display information on our software - either they can put it in their own database, or they can grab it at the moment they need to display the information. The file also points to a screenshot, which can get grabbed the same way.

If you have an RSS feed or any other content that other sites might use, this might be the PHP program on the other site grabbing your content, analyzing it, and getting it ready to show to the visitors of that other site.
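One way to test that guess is to look past the Webalizer summary and check the raw access log for what the "PHP/" agents actually request. Here is a minimal sketch in Python, assuming the server writes Apache's "combined" log format; the function name `summarize_php_hits` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Apache "combined" log format (an assumption about the server's log layout):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def summarize_php_hits(lines, agent_prefix="PHP/"):
    """Count hits per requested path and per client host for matching agents."""
    paths, hosts = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if match.group("agent").startswith(agent_prefix):
            paths[match.group("path")] += 1
            hosts[match.group("host")] += 1
    return paths, hosts

# Hypothetical sample lines, not taken from any real log.
SAMPLE_LOG = [
    '192.0.2.10 - - [02/Aug/2003:06:00:01 +0000] "GET /pad/product.xml HTTP/1.0" 200 2326 "-" "PHP/4.2.3"',
    '192.0.2.10 - - [02/Aug/2003:06:00:02 +0000] "GET /pad/product.xml HTTP/1.0" 200 2326 "-" "PHP/4.2.3"',
    '198.51.100.7 - - [02/Aug/2003:06:00:05 +0000] "GET /images/screenshot.png HTTP/1.0" 200 10240 "-" "PHP/4.2.2"',
    '203.0.113.5 - - [02/Aug/2003:06:01:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

paths, hosts = summarize_php_hits(SAMPLE_LOG)
for path, count in paths.most_common():
    print(count, path)
```

If nearly all the hits land on a feed, a PAD file, or a single image, syndication by other sites is the likely explanation; if they are spread across every page, it looks more like a crawler.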

We don't do RSS but have a JavaScript syndication feed; however, I think the agent for that one would show up as the visitor's browser, since JavaScript is a client-side include and not server-side. The browser gets the HTML page, and inside it is the JavaScript command to go get the file on our site and do what the script says to do. It's the browser, then, that makes the actual request, and not the site's server program - the site's server program merely spits out the HTML that the browser interprets and acts on.
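The server-side half of that distinction is easy to see from a script. When a program on a server fetches a URL, the agent string comes from the HTTP library it uses, not from any visitor's browser. The sketch below uses Python's standard `urllib` purely as an analogy; nothing is sent over the network, and the URL and the "ExampleFeedReader/1.0" name are made up. PHP's URL-fetching functions would presumably behave in a similar way, which would explain a bare "PHP/4.2.3" agent, but that is an inference from this thread and not something checked here.

```python
import urllib.request

# A freshly built opener carries the library's own default agent string.
opener = urllib.request.build_opener()
default_agent = dict(opener.addheaders)["User-agent"]
print(default_agent)  # starts with "Python-urllib/"

# A script can override the default; the hypothetical name below is what
# would then show up in the remote site's stats instead.
request = urllib.request.Request(
    "http://example.com/feed.xml",  # placeholder URL, never actually fetched
    headers={"User-Agent": "ExampleFeedReader/1.0"},
)
custom_agent = request.get_header("User-agent")
print(custom_agent)
```

Scripts that never set their own agent string all look alike in the stats, so a single "PHP/4.2.3" entry may well be many unrelated sites lumped together.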