|Yahoo! and Display URIs|
Trimming the trailing forward slash, Bad Yahoo!
Okay, so I'm going through some 404 stuff and I notice more than a few 404s for what should be valid URIs. Upon investigation, it appears we overlooked "forcing" the trailing forward slash on this particular group of URIs. Just a note: whenever you implement a rewrite in IIS, the default IIS behavior is overridden, and the trailing forward slash is no longer appended the way it would be without the rewrite.
So, we decided to start backtracking and sure enough, most, if not all, of them lead back to Yahoo! and Live. They are the perpetrators of our woes. We're fortunate in that we account for the trailing forward slash in "almost" all implementations. I say almost because apparently we overlooked one. It is a small group of pages, but there were enough 404s being generated to take notice.
I go out to Yahoo! and do a site: search and sure enough, every single URI is stripped of its trailing forward slash, every one. Even when you force the slash, Yahoo! and Live trim them back to no slash.
YOU CANNOT DO THAT!
But you are! And you have been forever. Why must you break the protocol?
Here's what I think happens: some scrapers are designed to grab a "visual" URI reference. In Yahoo!'s case, the visual reference is minus the trailing forward slash. For most sites this isn't a problem, because rules are in place to keep the version without the trailing forward slash from being browsed to; you just 301 it to the slash version. If you are using Content Negotiation, it may be the other way around. But those who don't know about this little flaw may be seeing more than a handful of 404s for those references. I'd check.
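For anyone who wants to see the "just 301 it to the slash" rule spelled out, here is a rough sketch in Python of the decision such a rule makes. It is not our actual IIS rewrite; the function name and the "a dot in the last segment means it's a file" test are assumptions for illustration only, so adjust them to your own URI scheme.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def slash_redirect_target(url: str) -> Optional[str]:
    """Return the URI a request should be 301'd to, or None if it is fine as-is.

    Hypothetical helper: assumes every extensionless path is meant to end
    with a trailing forward slash.
    """
    parts = urlsplit(url)
    path = parts.path or "/"

    # Already has the trailing slash: nothing to do.
    if path.endswith("/"):
        return None

    # Assumption: a dot in the last segment means a real file (logo.gif, page.asp).
    last_segment = path.rsplit("/", 1)[-1]
    if "." in last_segment:
        return None

    # Append the slash and keep the query string and fragment intact.
    return urlunsplit(
        (parts.scheme, parts.netloc, path + "/", parts.query, parts.fragment)
    )
```

A request for /widgets/blue would come back as /widgets/blue/ (send that in the Location header with a 301), while /widgets/blue/ and /logo.gif come back as None and are served normally.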
So, can someone explain to me why both Yahoo! and Live strip the trailing forward slashes off the display URIs in the SERPs?
I find this behaviour annoying too. The only reason I can think of for them to do it is because they think it looks nicer. But then, I've seen some crazy spidering behaviour from both slurp and msnbot, so maybe they think trailing slashes are optional? ;)
Despite the fact we've always had the trailing slash, Yahoo insists on removing it. We're constantly rewriting it with a 301. I wonder how much wasted work Yahoo creates around the world by doing this - you'd think they'd just store the correct URI.
|So, can someone explain to me why both Yahoo! and Live strip the trailing forward slashes off the display URIs in the SERPs? |
It's hard to believe that after all this time, Yahoo and Live would be clueless on this topic. So I'm going to assume they are smart enough to know they're doing this to us. Therefore, there must only be one answer why...
And this is symptomatic of the problem at Live and Yahoo. They don't care about the details. And the details matter - just ask Google.
This is not a new phenomenon, here are a few earlier discussions on the topic (yes, from 2005):
For the reasons why Yahoo does this, see the comment by Tim in this thread:
I'm sorry, I just don't buy what Tim is selling.
I don't buy it either, Billy. Yahoo should be a bit friendlier to developers and webmasters, instead of trying to tell us they know what the user wants; a dwindling market share would suggest they don't know as much as they think they do...
|This is not a new phenomenon, here are a few earlier discussions on the topic (yes, from 2005) |
lol, and one I started in 2007 January...
Display URI's in the SERPs - Google vs Yahoo! vs MSN
Us poor Windows folks get treated like "red-headed step-children" as they say. :(
And I too find Yahoo!'s response above unsatisfactory, although I'm sure their reasoning weighed the pros vs cons. From my perspective, the cons far outweigh any pros from a user standpoint.
I was going to say "That's okay..." but it's not. For those of you on a Windows server who have a rewrite implemented, or have done something else to override the default server settings, I'd recommend verifying that you have the facilities in place to capture these incorrect URI requests and 301 them to their proper destination.
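If you want a quick way to do that verification, something along these lines might help. It is only a sketch: it assumes you have already pulled the 404'd paths out of your logs and that you have a list of the URIs you know are valid (with their slashes). The function name and both inputs are made up for illustration.

```python
from collections import Counter


def missing_slash_404s(not_found_paths, valid_paths):
    """Count 404'd paths that would have been valid with a trailing slash.

    not_found_paths: iterable of request paths that returned 404.
    valid_paths: set of known-good paths, each ending in "/".
    """
    hits = Counter()
    for path in not_found_paths:
        # Only flag requests that differ from a valid URI by the missing slash.
        if not path.endswith("/") and path + "/" in valid_paths:
            hits[path] += 1
    return hits
```

Anything that shows up in the result is a URI where the slashless version is slipping past your rules and needs a 301 put in place.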
Bad Yahoo! Bad MSN! Who else does it?