Is this what you're saying - Google sees the alpha-numeric string that is only a variable for your script and treats it like a URL on its own right?
What you're seeing might be auto-generated spam links that have been improperly formatted.
I have also noticed an upswing in 404s on URLs where Gbot is requesting a hyphen in place of an underscore. WMT is reporting only 1 discovery for those URLs, so I checked the sources and they are linking correctly. I set up a 301 for those, but if the upward trend continues, I will be setting up 301s for everything that Google can't follow correctly.
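For anyone curious, here's roughly the kind of .htaccess rule I mean — a minimal sketch, not my exact config, and the paths are made up:

```apache
# Hypothetical example: 301 the hyphen variant Gbot keeps requesting
# back to the real underscore URL. One rule per affected path for now;
# test the pattern against your own URLs before going site-wide.
RewriteEngine On
RewriteRule ^articles/my-page-name$ /articles/my_page_name [R=301,L]
```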
The URL I posted was edited/modified, but the original URL was from a site/directory that showed the following
1. <link>Click Here</link>
2. Description. Included about 1 or 2 lines of search engine scraped data.
3. www.address (not a link); if too long, the beginning of the address followed by "..."
Google tries to crawl item 3 (and reports 404 in WMT).
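In other words, it looks as if Gbot runs something like a URL-matching regex over the plain text of the page. A toy Python sketch of that idea — the pattern and the address are made up, nobody outside Google knows the real heuristic:

```python
import re

# Rough stand-in for a crawler's "this text looks like a URL" heuristic.
# (Assumption: the real detector is unknown; this is only for illustration.)
URL_LIKE = re.compile(r'\bwww\.[\w./-]+')

# Item 3 from the directory listing: display text, not an href,
# truncated with "..." because the real address was too long.
snippet = 'Description line. www.example-site.com/some_long_pa...'

found = URL_LIKE.findall(snippet)
# The truncated display text gets picked up as if it were a real URL;
# requesting it 404s because the "..." cut off the rest of the path.
print(found)
```

The point is that the extracted string is not the real URL, so every fetch of it lands on a 404, which matches what WMT is reporting.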
Yes, it's all from auto-generated spam links, but they have been properly formatted.
I can't tell if you're saying that Gbot is following a link that isn't a link or incorrectly following live links that are properly formatted? I'm confused on that point, but I am seeing lots of 404s on correctly-formatted links being reported in my WMT. I have wondered if Googlebot can create links from text, for cases where people may type a website address in context but not make it an href (which also sounds like what you're reporting?)
This information occurs on the other website, not yours - correct? That means you don't need to worry about the 404 messages. They are just there for your information, in case you WANT to make some changes in how your server responds to those requests.
But you are also saying that, maybe, you can get Google "credit" for an inbound link (assuming your server replies with a 200 OK status) from unlinked text on another website that looks like a URL.
Have I got that right?
(My apology for any confusion cause by the edit,
but we will not ask people to visit and analyze other
specific sites, especially spammers and scrapers.)
You get credit for a mention, the site was obviously talking about your site. Google never said a link had to be an href, in fact it can be anything they can quantify as "buzz".
How much value is there in that? Only Google knows. I wouldn't go pasting your address everywhere :-)
What? Since when did Google start crawling links that aren't clickable? Why would they want to do that?
We often turn links in comment sections into text that isn't clickable. Why should Google crawl them when the site on which they appear doesn't want them to?
It is a big joke if they do that.
Why is Google not providing information to the owner of the site on which a broken link appears?
None of the webmaster tools (including Bing's) report broken links on the sites on which they appear. Why is that?
Sorry for the confusion. Gbot is following a link that isn't a link
|I can't tell if you're saying that Gbot is following a link that isn't a link or incorrectly following live links that are properly formatted? |
Google did not invent the internet. A link, up until now, at least to me, meant an href.
|Google never said a link had to be an href. |
Yep - that was the question I had in mind.
|But you are also saying that, maybe, you can get Google "credit" for an inbound link (assuming your server replies with a 200 OK status) from unlinked text on another website that looks like a URL. |
Have I got that right?
(Disclaimer: I'm not a G spokesman.) Unless that site is hooked up with them, such as by using WMT, Google, like any other search engine or any other company, has no relationship with the site owner.
|Why is google not providing information to the owner of the site on which a broken link appears? |
For your own site, you may wish to use tools such as xenu.
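If you'd rather script it yourself, a bare-bones checker along these lines does the same job as a desktop tool — a sketch only, with the fetcher injectable so you can swap in whatever HTTP client you prefer:

```python
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

def check_link(url, opener=None):
    """Return the HTTP status code for url, or None if unreachable.

    `opener` is injectable so the logic can be exercised without a
    network; by default it issues a HEAD request via urllib.
    """
    fetch = opener or (lambda u: urlopen(Request(u, method='HEAD'), timeout=10))
    try:
        return fetch(url).status
    except HTTPError as err:
        return err.code   # 4xx/5xx still tells us the status
    except URLError:
        return None       # DNS failure, refused connection, etc.

def broken(status):
    # Treat unreachable or any 4xx/5xx as broken, as a link checker would.
    return status is None or status >= 400
```

Feed it the hrefs you've scraped from your own pages and log anything where `broken()` comes back true.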
I checked the "cached" version of that spammer/scraper site in Google. It's from an earlier date than the date in WMT, and yes, the live link is correctly formatted, while the incorrectly formatted text (which includes "..." and which Google treats as a link) is not a live link.
Just wanted to make sure that the site didn't fix their problem "after" Google visited. Now it's clear that Gbot acts oddly.
I just found this Google Help page that seems to be addressing this situation. In the SERP, it is dated as March 20, 2011, so it seems like it is pretty new, if the SERP date can be believed.
|Unexpected 404 errors |
...Google strives to detect these types of issues and resolve them so that they will disappear from Crawl Errors. In general, 404 errors won't impact your site's search performance, and you can safely ignore them if you're certain that the URLs should not exist on your site.
|why should google crawl them when the site on which they appear doesn't want them to? |
Because it's their Internet and they'll crawl if they want to?
Idk, they're getting annoying... Really, their need to "overdo" everything may be their biggest downfall. To me, they really are getting just plain annoying in some ways, and IMO if other site owners feel the same way, it could be an issue for them in the long run.