Does anyone know of a tool/util/software that will analyse a website's indexed pages and report back on old URLs (from Google or other websites) that are not being redirected to new URLs via .htaccess?
I am trying to make sure that, over the website's 10-year history, no inbound links are being wasted by throwing up a 404 instead of being caught and redirected properly.
Is there a system/software that would do this for me?
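For reference, the check I want automated is roughly this (a quick, untested Python sketch using the third-party requests library; old_urls.txt is a placeholder for whatever list of old URLs you can assemble from logs, old sitemaps, etc.):

```python
# Untested sketch: flag old URLs that 404 instead of redirecting.
# "old_urls.txt" is a placeholder -- one URL per line.
import requests

with open("old_urls.txt") as f:
    old_urls = [line.strip() for line in f if line.strip()]

for url in old_urls:
    # allow_redirects=False so we see the redirect response itself,
    # not whatever page it eventually lands on
    resp = requests.head(url, allow_redirects=False, timeout=10)
    if resp.status_code in (301, 302):  # add 307/308 if you use them
        print(f"OK   {url} -> {resp.headers.get('Location')}")
    elif resp.status_code == 404:
        print(f"404  {url}   <-- needs a redirect rule")
    else:
        print(f"{resp.status_code}  {url}")
```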
Have you been going through Google Webmaster Tools and looking at the crawl errors?
I have found LOTS of old broken links by going through there. It seems like over the last several months, Google has really been digging up a lot of old links.
If there are broken links to your site that GWT is NOT showing, they MIGHT NOT be worth worrying about... (hopefully others will chime in here).
The other thing is to ask your web host if they have any software that will help you look through the logs and find the referring pages for your 404s. That way you can see every time a 404 was served and what the referring page was.
I think that will be better than trying to find something that would "scrape" the search indexes looking for broken links to your site, but maybe I am wrong.
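If your host doesn't have anything, it only takes a few lines of script to do it yourself. A rough, untested sketch in Python (assuming the standard Apache combined log format; access.log is a placeholder path, so adjust for your host):

```python
# Untested sketch: pull every 404 and its referrer out of an Apache
# "combined" format access log, most-hit URLs first.
import re
from collections import Counter

# combined format: ip - user [time] "request" status size "referer" "agent"
LINE = re.compile(r'"(?P<req>[^"]*)" (?P<status>\d{3}) \S+ "(?P<ref>[^"]*)"')

hits = Counter()
with open("access.log") as f:  # placeholder path
    for line in f:
        m = LINE.search(line)
        if m and m.group("status") == "404":
            # the request field looks like: GET /old-page.html HTTP/1.1
            parts = m.group("req").split()
            path = parts[1] if len(parts) > 1 else m.group("req")
            hits[(path, m.group("ref"))] += 1

for (path, ref), count in hits.most_common(20):
    print(f"{count:5d}  {path}  referred by  {ref}")
```

The top of that list tells you which redirect rules are worth adding first.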