Forum Moderators: Robert Charlton & goodroi

My GWT is stuffed with "not found" mistakes. Time to use no-archive?

         

Sgt_Kickaxe

8:24 am on Aug 31, 2010 (gmt 0)



Way back in Feb of 2007 I migrated a section of one of my sites to a subdomain. During the move I accidentally broke a series of sidebar links for about an hour; during that hour they had <webURL> type tags attached.

Fast forward 3.5 years: some scraper site has copied, verbatim, my entire site from that one-hour period, including the broken sidebar links. All of the links point to my site, of course.

Google Webmaster Tools is reporting this to me as one instance of MY site having a "page not found" last linked from my home page sidebar in 2007 and dozens of missing pages all linked from this scraper site.

What I don't get is that the links are broken, they lead to 404 error pages, yet GWT still assigns them as missing from MY site.

This, combined with the clickable "cache" version of my entire site on a Google.com domain, has me feeling that it's time to add a "noarchive" meta tag to my site. I can't see any reason not to at this point. I already implemented a frame buster script that pushed my traffic up 11% on an image-heavy site. Unless someone can point out an obvious reason NOT to remove archived copies of my site... is it time?

tedster

8:37 am on Aug 31, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What I don't get is that the links are broken, they lead to 404 error pages, yet GWT still assigns them as missing from MY site.

If a good site were linking to a 404 URL on your site, you'd definitely want to know about it. So this is a bit of courtesy information for the webmaster, not a report that you MUST fix an "error".
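To separate the reports worth acting on from the scraper noise, something like the following might help. It is only a rough sketch: the (missing URL, linked-from URL) pairs are made-up sample data typed in by hand, not a real Webmaster Tools export format, and `split_by_link_source` and the example.com hosts are hypothetical names.

```python
from urllib.parse import urlsplit


def split_by_link_source(reports, own_host):
    """Split (missing_url, linked_from) pairs into internal and external sources.

    A link source counts as internal when its host is own_host or one of its
    subdomains. The input format is an assumption made for this sketch.
    """
    internal, external = [], []
    for missing_url, linked_from in reports:
        host = (urlsplit(linked_from).hostname or "").lower()
        if host == own_host or host.endswith("." + own_host):
            internal.append((missing_url, linked_from))
        else:
            external.append((missing_url, linked_from))
    return internal, external


# Hypothetical sample data: one broken link from the site itself,
# two from a scraper copy hosted elsewhere.
reports = [
    ("http://example.com/old-page", "http://example.com/"),
    ("http://example.com/widgets", "http://scraper.example.net/copy/index.html"),
    ("http://example.com/gadgets", "http://scraper.example.net/copy/index.html"),
]

internal, external = split_by_link_source(reports, "example.com")
print(len(internal), len(external))  # prints: 1 2
```

Anything in the internal list is a link you control and can fix; the external list is the "courtesy information" part.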

I don't completely follow you with noarchive, because it only keeps a copy of your page out of the Google public cache. That won't stop scrapers.
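For reference, the directive goes in a robots meta tag in the page head. If you do add it, a quick check along these lines could confirm that a page actually carries it. This is a minimal sketch using only the Python standard library; `RobotsMetaParser`, `has_noarchive`, and the sample markup are illustrative names and data, not anything from the site under discussion.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            for directive in attr_map.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noarchive(html_text):
    """Return True if the markup contains a robots meta tag with noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noarchive" in parser.directives


# Sample markup for illustration only.
page = '<html><head><meta name="robots" content="index, follow, noarchive"></head><body></body></html>'
print(has_noarchive(page))  # prints: True
```

Note that this only checks for the tag on your own pages. It says nothing about copies held elsewhere.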

Sgt_Kickaxe

10:39 am on Sep 1, 2010 (gmt 0)



The only place to get a 2007 copy of my site is from the archive. Without it there wouldn't have been anything to scrape in the first place.

I'm not seeing any value in having old copies of my site floating around on other companies' domains.