When I idly browsed my server logs today I noticed a number of referrals from Yahoo Search results for which the response length did not make sense. I typed the same search term into Yahoo Search and saw a directory from a site of mine among the search results.
On clicking the link I saw a page that I did not expect: the output from a script that does certain lookup-y things, taking the file name from a URL request for anything other than a directory. (The whole thing has to do with the site being for lookups of data; I wanted to provide short URLs for linking, and the script is called via mod_rewrite.)
In the standard configuration of most web servers, a missing trailing slash on a directory URL makes only a small difference (the web server automatically serves a redirect to the correct URL), but in some cases this behaviour breaks the functionality of Yahoo Search's result pages. Is this issue known? I'd imagine it would be readily usable for cloaking by black hats (and at the same time it would reduce the referrals from Yahoo Search to some white hats).
Yes, Yahoo does normalize URLs to remove trailing slashes. When requesting or recording specific URLs, users don’t use trailing slashes and this form is perceived to be more readable. Yahoo’s goal is a quality user experience, and so this normalization was adopted to be consistent with user expectations and prevailing standard Web usage.
This behavior requires webmasters to configure their servers to accept requests both with and without the trailing slash.
Tim
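For anyone hitting the same problem, here is a minimal Python sketch of the routing decision a server would need to make so that both forms of a directory URL work. It is only an illustration under stated assumptions: the directory list, the `route` function and the `/lookup.py` script name are all hypothetical and are not taken from the site discussed above. A real setup would express the same check in the web server's own configuration.

```python
# Hypothetical list of directory URLs; the real site layout is not known.
KNOWN_DIRECTORIES = {"/", "/lookup/", "/docs/"}


def route(path):
    """Return (status, target) for a request path.

    A directory requested without its trailing slash is redirected to the
    slashed form. Only paths that are not directories in either form are
    handed to the (hypothetical) lookup script.
    """
    if path in KNOWN_DIRECTORIES:
        # Already the canonical directory URL: serve it directly.
        return 200, path
    if path + "/" in KNOWN_DIRECTORIES:
        # Slashless directory URL, e.g. as normalized by a search engine.
        return 301, path + "/"
    # Anything else is treated as a short lookup key.
    return 200, "/lookup.py?key=" + path.lstrip("/")
```

With this check in place, `route("/docs")` yields a redirect to `/docs/` instead of falling through to the lookup script, which is the failure described in the question.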