| 5:37 pm on Jan 5, 2012 (gmt 0)|
Yes - you can ignore it unless the backlink is valuable.
You don't need to "fix" 404 crawl errors coming from external sites. This has been confirmed by several Google spokespeople on their own Webmaster Forums. So there's no need to generate massive .htaccess files and play whack-a-mole with this nonsense.
| 7:43 pm on Jan 5, 2012 (gmt 0)|
To be clear, is it the URL in the href that is being truncated, or just the anchor text?
If it is the anchor text, then simply ignore it.
| 8:03 pm on Jan 5, 2012 (gmt 0)|
@g1smd Great question - it's actually just plain text on the page: Google is picking it up and pulling URLs out of it. See:
as an example.
| 8:09 pm on Jan 5, 2012 (gmt 0)|
Google started pulling anything that looks like a URL from text on pages and testing the server response for their "guess" only a few weeks ago.
The correct response for junk URLs is to return 404. If Google get tens of millions of 404 responses for these, hopefully they will stop this silliness.
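As a rough illustration of "return 404 for junk URLs", here is a minimal Python sketch. Everything in it is an assumption made for the example: the marker list, the function names `looks_truncated` and `status_for`, and the idea of checking against a set of known paths. It is not how any particular server or CMS does it, and most sites already return 404 for unknown paths without any extra code.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical markers of a URL that was shortened for display in page text.
TRUNCATION_MARKERS = ("...", "..", "\u2026")


def looks_truncated(url: str) -> bool:
    """Return True if the decoded URL path ends with an ellipsis-style marker."""
    path = unquote(urlsplit(url).path)
    return path.endswith(TRUNCATION_MARKERS)


def status_for(url: str, known_paths: set) -> int:
    """Return 404 for truncated or unknown paths, 200 for known ones."""
    path = unquote(urlsplit(url).path)
    if looks_truncated(url) or path not in known_paths:
        return 404
    return 200
```

Note that this check would also flag a legitimate path that happens to end in "..", so treat it as a way to spot the pattern in logs rather than a rule to deploy blindly.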
| 10:06 pm on Jan 5, 2012 (gmt 0)|
|Google started pulling anything that looks like a URL from text on pages and testing the server response for their "guess" only a few weeks ago. |
Seems like it's just occurring to Google that this is creating some unnecessary concern, and they are "looking into ways of making that a bit clearer."
This was just discussed at length in another thread here....
Google Following URLs Without Hyperlinks
There's also a link in the thread to some additional comments by Google's John Mueller on the subject.
| 11:42 pm on Jan 5, 2012 (gmt 0)|
@g1smd Trust me, it wasn't a few weeks ago - I started cataloging these bizarre searches (oh, the racy sex ones, you wouldn't believe!) last summer when I first noticed it.
@Robert Thanks for the link, will read there now. I sure hope Google figures this out. My Webmaster reports since last summer are just splattered with ".." 404 URLs.
By the way, one thing I didn't mention and that hasn't been brought up: it's not easy to just dismiss these searches as a nuisance, since search is one of the most resource-intensive things for most servers. In my case, I have 2 "0 found" search result failovers: parsing the search phrase to find near matches, and if that fails, looking for "did you mean" matches. So 2,000 of these bogus Googlebot hits in a day add up.
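One way to limit that cost, sketched here under stated assumptions, is to skip both "0 found" failovers when the request looks like one of these truncated crawler guesses. The function names (`is_bogus_query`, `search`) and the three callables passed in are hypothetical stand-ins for whatever the site's search code actually does, and the user-agent check is only a heuristic, since that header can be spoofed.

```python
from typing import Callable, List

# Hypothetical endings that suggest a phrase was cut off in page text.
TRUNCATION_MARKERS = ("..", "\u2026")


def is_bogus_query(query: str, user_agent: str) -> bool:
    """Heuristic: a Googlebot request whose search phrase ends in '..' or an ellipsis."""
    from_googlebot = "googlebot" in user_agent.lower()
    return from_googlebot and query.rstrip().endswith(TRUNCATION_MARKERS)


def search(
    query: str,
    user_agent: str,
    exact: Callable[[str], List[str]],
    near: Callable[[str], List[str]],
    did_you_mean: Callable[[str], List[str]],
) -> List[str]:
    """Run the exact search, then the two failovers unless the query looks bogus."""
    results = exact(query)
    if results or is_bogus_query(query, user_agent):
        # Either we found something, or this is a truncated crawler guess
        # that is not worth the expensive fallback work.
        return results
    results = near(query)
    if results:
        return results
    return did_you_mean(query)
```

With this shape, a truncated Googlebot query costs one search instead of three, while ordinary visitors still get the near-match and "did you mean" behavior.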
| 12:05 am on Jan 6, 2012 (gmt 0)|
I meant "months", not "weeks".
I didn't actually see any of this until more recently than some other people.