Forum Moderators: goodroi

Link Checker / Site map generator

One that observes robots.txt and meta tags

         

fish_eye

10:25 pm on Nov 2, 2006 (gmt 0)

10+ Year Member



Does anyone know of a link checker that honors robots.txt and meta tags (FOLLOW/NOFOLLOW specifically)?

FYI: The problem behind my request is that I am using a CMS to generate several thousand pages (fewer than 20k), but I only want the decent ones indexed, that is, those with a reasonable amount of content. So, in the PHP backend I generate a robots meta tag of "nofollow, noindex" in most cases, "follow, index" when a page has a base set of data attributes and images (many pages do), and "follow, noindex" for pages that are used to browse the site.
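To make the three cases concrete, here is a rough sketch of that decision in Python (the real backend is PHP, so treat this as illustration only). The function name, its parameters, and the "at least one image" threshold are all assumptions of mine; the post does not say exactly what counts as a base set of data attributes and images.

```python
def robots_meta(has_base_attributes: bool, image_count: int, is_browse_page: bool) -> str:
    """Return a robots meta tag for one generated page.

    Hypothetical helper: the inputs and the image threshold are assumptions,
    not the actual CMS logic.
    """
    if is_browse_page:
        # Navigation pages: let spiders pass through, but keep them out of the index.
        content = "follow, noindex"
    elif has_base_attributes and image_count > 0:
        # Pages with enough content: index them and follow their links.
        content = "follow, index"
    else:
        # Everything else (the majority): thin pages, keep spiders away.
        content = "nofollow, noindex"
    return f'<meta name="robots" content="{content}">'
```

For example, `robots_meta(False, 0, True)` gives the "follow, noindex" tag used on browse pages.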

My motivation is, to some degree, fear of duplicate content (I mean "being perceived as having several thousand remarkably similar pages" not "people stealing my content").

If there isn't one - does anyone want to pay me to develop it ;) I can't be the only person who wants something like this shirley!

goodroi

4:05 pm on Nov 3, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If you are using a link checker on your own site, I would think you would want to check all pages accessible to users (regardless of robots.txt), to make sure you don't send users to a page full of broken links.

fish_eye

1:27 am on Nov 4, 2006 (gmt 0)

10+ Year Member



I'm trying to determine (and test) what it is that spiders should see (and therefore potentially index).
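Roughly, the check I have in mind looks like the sketch below, using only the Python standard library. It is a minimal outline, not a finished tool: `RobotsAwareParser` and `spider_view` are names made up for this example, the page HTML and robots.txt text are passed in as strings rather than fetched, and it only understands the basic `noindex`, `nofollow` and `none` directives.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsAwareParser(HTMLParser):
    """Collects the links and the robots meta directives from one HTML page."""

    def __init__(self):
        super().__init__()
        self.directives = set()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


def spider_view(url, html, robots_txt, user_agent="*"):
    """Report what a well-behaved spider may do with one page (hypothetical helper)."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        # Blocked by robots.txt: the spider never sees the page or its meta tag.
        return {"fetch": False, "index": False, "follow": []}

    page = RobotsAwareParser()
    page.feed(html)
    blocked_all = "none" in page.directives
    index = not blocked_all and "noindex" not in page.directives
    follow = not blocked_all and "nofollow" not in page.directives
    # Only hand back links when the page allows them to be followed.
    return {"fetch": True, "index": index, "follow": page.links if follow else []}
```

A crawler built on this would queue only the URLs in `follow`, and the pages reported with `index` set to true would be the ones expected to end up in a site map.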

netchicken1

3:30 am on Nov 4, 2006 (gmt 0)

10+ Year Member



I bought A1 Sitemap Generator.

It does the job well and is highly configurable.

fish_eye

5:33 am on Nov 7, 2006 (gmt 0)

10+ Year Member



Excellent, thanks - and a 30 day trial to boot! You would not believe the URLs I'm generating and trying to get indexed from my hacked Joomla code!