In my logs (Apache 2, RH 7.3), I see the script is effective, but I also see a potential problem.
Here is my question:
Of these two log entries, www.example.com/ and www.example.com-, why is the second one returning a 302? What causes the second type of request log entry? Is it simply the difference between requests for ht tp://www.example.com/ (first example) and ht tp://www.example.com (second example)? The problem is that the PHP script apparently is not being used for the second kind of request. How would I cover that case as well, without killing all such requests regardless of their origin?
TIA.
Also, there are many threads here concerning the trailing slash problem -- most resolved, but some not. They may give you some ideas to test and narrow down the problem.
Jim
I've been reading many of the 'trailing slash' problem threads, but I don't understand *why* there is an issue at all.
I type in ht tp://www.example.com and receive ht tp://www.example.com/index.php. The same is true for entering ht tp://www.example.com/. In my log, both requests end up being for the one with the trailing slash. Also, in either case, as the page is loading, my browser shows the URL with the trailing slash (no file name in either case), whether I leave it off or not.
What kind of request would generate an error that results in a request for ht tp://www.example.com-? <edit>Sorry ... I meant to say that my server has no problem handling either request, trailing slash or not. I'm just trying to understand why any request would end with the hyphen, as noted. It's only found when the root domain address is the request.
I am wondering what kind of request causes the trailing hyphen in my log.
Note that the request with the trailing hyphen results in a 302, whereas the request with the trailing slash gets processed by my PHP script and results in a 404. Thanks!</edit>
You might want to investigate the IP addresses and hostnames of those user-agents requesting a hyphenated domain -- I'm actually surprised it resolves to your site at all -- and see if they look like they might be in IP ranges known for foul play.
Jim
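For anyone who wants to try the investigation suggested above, here is a minimal sketch of one way to tally which clients produce the 302 entries. It assumes the log is in Apache's standard "combined" format, which may not match the custom format used in this thread. The function name tally_by_client and the sample lines are made up for illustration, and the IP addresses come from the documentation-only ranges.

```python
import re
from collections import Counter

# Apache "combined" format (an assumption; a custom LogFormat needs a different pattern):
# ip ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def tally_by_client(lines, status="302"):
    """Count (ip, user-agent) pairs among entries with the given status code."""
    counts = Counter()
    for line in lines:
        m = COMBINED.match(line)
        if m and m.group("status") == status:
            counts[(m.group("ip"), m.group("agent"))] += 1
    return counts

# Invented sample lines, not taken from any real log.
sample = [
    '192.0.2.10 - - [10/Oct/2004:13:55:36 -0700] "GET / HTTP/1.1" 302 - "-" "cfetch/1.0"',
    '192.0.2.10 - - [10/Oct/2004:13:56:01 -0700] "GET / HTTP/1.1" 302 - "-" "cfetch/1.0"',
    '198.51.100.7 - - [10/Oct/2004:13:57:12 -0700] "GET / HTTP/1.1" 404 512 "-" "Mozilla/4.0"',
]

for (ip, agent), n in tally_by_client(sample).most_common():
    print(n, ip, agent)
```

The resulting IP addresses can then be checked by hand against reverse DNS or known bot ranges. Lines that do not match the assumed format are silently skipped, so a count of zero may just mean the pattern needs adjusting.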
These log entries (with the trailing hyphen) are exclusively requested by bots of some kind. I see them as requests from known bots (Googlebot et al.) and from automatons of a different nature (voyager, cfetch, et al.).
I suspect you are correct in that the Apache log is indicating a bit of bad data, as it does for other elements in the log string (like %{SID}e). My guesses are currently leaning toward a bad bot configuration ... maybe it does that with a blank line at the end of an old array of file names to target, or something. Since 302 means 'The requested resource resides temporarily under a different URI', I'm guessing the trailing hyphen log entry is the result of an outdated URL being probed, and the server tries to resolve it to the current URI ... which ends up giving the bot a 404 when it finally gets there.
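To illustrate the 'bad data' reading: Apache's access log writes a lone hyphen for a field that has no value, which is what happens with %{SID}e when that variable is unset. The sketch below only mimics that convention in Python; it is not Apache's code. The two-field layout (host name immediately followed by one more field, with no separator) is purely an assumption about the custom log format, and render_field and render_entry are made-up names.

```python
def render_field(value):
    """Mimic the access-log convention of writing "-" for a missing or empty value."""
    return value if value else "-"

def render_entry(host, second_field):
    # Hypothetical layout: host name directly followed by one more field.
    return render_field(host) + render_field(second_field)

print(render_entry("www.example.com", "/"))   # www.example.com/
print(render_entry("www.example.com", None))  # www.example.com-
```

If the format really is laid out that way, the trailing hyphen would mean 'this field was empty' and would not be part of what the bot actually requested.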
I dunno. Always learning ...
Thanks again!