What would be the best way to search for those bad outbound links on your site?
Then you would evaluate them one by one to find bad neighborhoods. Only a hand check of the unbroken outbound links will do this, because you've got to evaluate the quality of the neighborhood you are linking out to.
For an easier approach:
If not, verify your databases (table by table, field by field) with a regexp and extract all links.
Or just write a script that parses all the static HTML files on disk for links.
It's kind of easy ... for a coder.
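For what it's worth, the "parse the static files" idea could look roughly like the sketch below. It is only an illustration, not a finished tool: the names `LinkCollector`, `outbound_links` and `scan_site` are made up for this example, and it assumes the pages are UTF-8 `.htm`/`.html` files and that you pass in your own hostnames so everything else counts as outbound.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def outbound_links(html_text, own_hosts):
    """Return hrefs whose hostname is not one of own_hosts (an assumed set)."""
    collector = LinkCollector()
    collector.feed(html_text)
    result = []
    for href in collector.links:
        try:
            host = urlparse(href).hostname
        except ValueError:
            # Malformed URL: keep it in the report so a human can look at it.
            result.append(href)
            continue
        # Relative links and mailto: links have no hostname and are skipped.
        if host and host.lower() not in own_hosts:
            result.append(href)
    return result


def scan_site(root, own_hosts):
    """Walk root recursively and map each HTML file to its outbound links."""
    report = {}
    for path in Path(root).rglob("*.htm*"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        links = outbound_links(text, own_hosts)
        if links:
            report[str(path)] = links
    return report
```

You would call something like `scan_site("public_html", {"example.com", "www.example.com"})` (both the directory and the hostnames are placeholders) and then hand check the resulting list, since a script can only extract the links, not judge the neighborhood they point to.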
Good luck!
I guess that to be a rogue link, the link must point to a live page, so dead-link checkers would not help at all.
@TheSEODude:
I am a coder and I could write something like that, but the problem is: in case of an XSS attack, the rogue link does not exist on your site! It is only created for Googlebot and only exists when a specially crafted URI, planted ON ANOTHER SITE, is fed into it. That makes any reporting Google provides so much more valuable than any testing tool I can write, because I need to know how Google (or Y! or MSN for that matter) PERCEIVES my site, not what I (and other "normal" visitors) think it contains.
Trouble is: they don't provide any reporting for this particular site. In their view the site is clearly abusing something (only G knows what), it's been stripped of its former PR5, and Webmaster Central says that Googlebot was last on my homepage on Jan 01, 2007, even though I see the hits every other day.
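To illustrate what I mean (a made-up sketch, not the actual hole that was on my site): assume a hypothetical page that echoes a `q` query parameter back into its HTML without escaping. The functions `render_unsafe` and `render_safe`, the `q` parameter and the example hostnames are all invented for the illustration.

```python
from html import escape
from urllib.parse import parse_qs, urlparse


def render_unsafe(url):
    """Hypothetical vulnerable page: echoes the 'q' parameter unescaped."""
    q = parse_qs(urlparse(url).query).get("q", [""])[0]
    return "<p>Results for: " + q + "</p>"


def render_safe(url):
    """Same page, but the echoed value is HTML-escaped first."""
    q = parse_qs(urlparse(url).query).get("q", [""])[0]
    return "<p>Results for: " + escape(q) + "</p>"


# A crafted URI of the kind someone could publish on another site.
# The payload is a percent-encoded <a> tag pointing at a spam host.
crafted = (
    "http://www.example.com/search?q="
    "%3Ca%20href%3D%22http%3A%2F%2Fspam.example.net%2F%22%3E"
    "cheap%20pills%3C%2Fa%3E"
)

# The unescaped page now contains a real outbound link.
print(render_unsafe(crafted))
# The escaped page shows the payload as plain text instead.
print(render_safe(crafted))
```

No file on disk and no database row contains that link, so a scan of the site's own content would not find it. It only shows up in the response to that one crafted request.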
I wish MSN could be used as tedster suggests but their reach is laughable and they did not manage to index more than 300 pages of this 300,000-page site in all these years.
Bottom line: not knowing exactly what's going on with G banning the site makes me really paranoid and makes me look for and blame things that are out of my control. A couple of confirmed cases of XSS links (long since fixed) are just pouring more oil on the flames, so to speak.
Justin
If you can't use any of the methods mentioned in this thread, you should not have asked the question, unless you just needed reassuring words from us, like:
I'm sure your site is clean and pretty!
I'm sure Googlebot / others won't mind the extra links!
I'm sure attackers could not break in!
I'm sure hacking links into your site is not possible!
Good luck.