Msg#: 4521039 posted 11:20 am on Nov 21, 2012 (gmt 0)
I was looking at the Webmaster Tools "Links to my site" report this morning, and it lists tons of spammy sites that, when I inspect them further, contain no link to my site at all. Is there a reason for this, and what can I do about it?
Msg#: 4521039 posted 2:46 pm on Nov 21, 2012 (gmt 0)
Many spam links are auto-generated and exist only for a short time before they are deleted and replaced by other spam links. So the links you checked were there when Googlebot crawled those pages, but aren't there now.
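If you want to re-check a batch of reported pages rather than eyeballing each one, something like the following could help. It is only a rough sketch: the names HrefCollector and links_to_domain are made up for illustration, and it assumes you have already saved or fetched each page's HTML yourself (fetching is left out on purpose).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_domain(html, domain):
    """Return the hrefs in `html` whose host is `domain` or a subdomain of it."""
    collector = HrefCollector()
    collector.feed(html)
    matches = []
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""
        if host == domain or host.endswith("." + domain):
            matches.append(href)
    return matches
```

An empty result only tells you the link is not in the copy of the page you checked today. It says nothing about what the page contained when it was crawled.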
Msg#: 4521039 posted 5:11 pm on Nov 21, 2012 (gmt 0)
In addition to spam sites that go up and down, and change their content constantly, there are also bugs in Googlebot.
First, Googlebot will take a piece of text that looks like a URL and treat it as if it were a link. This is particularly problematic on some search scraper sites. They sometimes show truncated URLs as the anchor text of tracking URLs that Googlebot can't crawl. Something like <a href="http://track.scraper.example.com/?8373888383">www.mysite.example.com/main_widget_arti..</a>. This is one mechanism through which Googlebot finds all these 404 "pages" on my site that were never meant to exist in the first place.
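To spot that pattern on a scraper page, you could look for anchors whose visible text mentions your domain while the href points somewhere else. A minimal sketch, with made-up names (AnchorCollector, text_only_mentions) and the simplifying assumption that a plain substring match on the domain is good enough:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AnchorCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None   # None means we are not inside an <a>
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def text_only_mentions(html, domain):
    """Return anchors whose text mentions `domain` but whose href goes elsewhere."""
    collector = AnchorCollector()
    collector.feed(html)
    found = []
    for href, text in collector.anchors:
        host = urlparse(href).hostname or ""
        on_domain = host == domain or host.endswith("." + domain)
        if domain in text and not on_domain:
            found.append((href, text))
    return found
```

Run against the example above with domain "mysite.example.com", this returns the tracking URL paired with the truncated anchor text, which is the text that ends up being requested as a 404.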
I also recently found a case where a scraper site did link to my site as well as to other sites, and Googlebot seemed to mash the links together. The site linked to othersite.example.com/big_bad_widgets.html and to mysite.example.com. Wouldn't you know it, Googlebot started crawling mysite.example.com/big_bad_widgets.html on my site. That URL turned up as a 404 in my logs (requested by Googlebot) and also on the WMT dashboard. I have no idea what Googlebot would be doing in that case other than making a mistake.
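For anyone who wants to pull these out of their own logs, here is a rough sketch that counts 404s requested by anything calling itself Googlebot. It assumes the Apache "combined" log format, and the names LOG_LINE and googlebot_404s are hypothetical. It matches on the user-agent string only, which can be spoofed, so treat the output as a starting point.

```python
import re
from collections import Counter

# Matches the request, status, size, referer and user-agent fields
# of a line in Apache "combined" log format (an assumption about your logs).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_404s(lines):
    """Count 404 responses per path for requests whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        if match.group("status") == "404" and "Googlebot" in match.group("agent"):
            counts[match.group("path")] += 1
    return counts
```

Usage would be along the lines of opening the access log and passing the file object in, e.g. googlebot_404s(open("access.log")), where the file name is whatever your server uses.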
Msg#: 4521039 posted 7:25 pm on Nov 21, 2012 (gmt 0)
Directories and search-results pages. The ones g### begs you to exclude from its searches because their own algorithm isn't clever enough to weed them out. Sites where you click More, and More, and More, and eventually there's a link to something on your site. Or there was, at the time g### crawled the page. Oh, and the occasional hotlink.
Collectively: Garbage Links.
But at least they've figured out that piwik isn't a link. Whew.
Msg#: 4521039 posted 4:56 pm on Nov 27, 2012 (gmt 0)
I sometimes see these auto-generated links in Webmaster Tools and am wondering whether they have an impact on a site.
Should you do something about them? Or, since the links to your site are no longer there when you visit the listed pages, are the auto-generated links showing up in Webmaster Tools probably not affecting you?
If something should be done, what would be the best thing to do?