Server: Linux/Apache 2.2
In my HTTP access log, a normal request for the home page looks like this:
GET /index.htm
which is equivalent to /public_html/index.htm.
Almost all requests have this format.
GET is followed by a space and a /, and the remainder of the request path is relative to that.
But there are also infrequent log entries that look like this:
GET http://example.com/index.htm
There is no forward slash after GET, and the request contains the full scheme and host name, which I assumed would normally have been stripped off by Apache.
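To pick these entries out of a log, something like the following could work. It is a minimal sketch, assuming the request line has already been extracted from the log entry as a string such as "GET /index.htm". The function name target_form is hypothetical; the labels "origin-form" and "absolute-form" are the HTTP/1.1 specification's terms for a target that starts with a slash and one that carries the full URL.

```python
def target_form(request_line):
    """Classify the request target of a logged request line."""
    parts = request_line.split()
    if len(parts) < 2:
        return "malformed"
    target = parts[1]
    if target.startswith("/"):
        # The usual case: a path relative to the site root.
        return "origin-form"
    if "://" in target:
        # Scheme and host are included in the target itself.
        return "absolute-form"
    return "other"


for line in ["GET /index.htm", "GET http://example.com/index.htm"]:
    print(line, "->", target_form(line))
```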
The result code is 200 and the bytes transferred seem to indicate that the server sent the correct page.
On my Apache server at home, I've tried to reproduce this with a browser and with wget, but cannot.
If I send a request like this:
GET http://example.com/http://example.com/index.htm
it shows in the log like this, still with the leading slash, and the server returns a 404 for it:
GET /http://example.com/index.htm
I cannot craft a request such that the server log entry doesn't start with "GET /".
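Browsers and wget build the request line themselves, so a raw socket is one way to control exactly what is sent. The following is a minimal sketch, assuming a test server reachable on port 80; build_request and send_raw are hypothetical helper names, and whether this reproduces the log entry on a given server is not verified here.

```python
import socket


def build_request(target, host):
    """Return raw HTTP/1.1 request bytes with the target used verbatim."""
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_raw(request, address, port=80):
    """Send the bytes as-is and return the first line of the response."""
    with socket.create_connection((address, port), timeout=10) as sock:
        sock.sendall(request)
        response = b""
        # Read only until the status line is complete.
        while b"\r\n" not in response:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
    return response.split(b"\r\n", 1)[0].decode("latin-1")


# Show the bytes that would go on the wire; nothing is sent here.
print(build_request("http://example.com/index.htm", "example.com"))
```

To try it against a server, pass the result to send_raw together with the server's address, for example send_raw(build_request("http://example.com/index.htm", "example.com"), "127.0.0.1"), and then check the access log.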
Any ideas what text format is being used for these strange requests, or what download tool? Actually, it doesn't seem like the download tool would matter. The mystery is how an HTTP request can be crafted to look like that in the log.