Forum Moderators: open
With a *nix system, or even a Windows system with *nix utilities (my setup), it is quite easy, at least for those familiar with regular expressions:
1. In Google advanced search, set results to 100 per page.
2. Do a link: search.
3. Go to the end of the last page and click "repeat the search with the omitted results included."
4. "Save As" the page or pages returned; I use the type "Web Page, HTML only".
5. Now comes the fun part, as Google periodically changes the format of the results. I only want to save the URL and the title as an anchor. Currently each result is preceded by a <p class=g>; I use sed to replace that with a <li> starting on a new line. I also put every <br> tag on a new line, which leaves the URL plus anchor on a line of its own.
6. Use grep on the output to get the lines starting with <li>. Since there are no <li>'s in the original, that matches only the lines where you inserted one, i.e. the ones you want.
7. Do a grep -v yourURL to exclude internal links.
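Steps 5-7 can be sketched as a single pipeline. This is only an illustration: the filename results.html, the site name example.com, and the <p class=g> marker are assumptions (the marker is whatever Google currently wraps each result in, so adjust it when the format changes), and the \n in the replacement relies on GNU sed.

```shell
# Break each result onto its own line, keep only those lines,
# then drop links from your own site (internal links).
# results.html = the saved results page; example.com = your site.
sed -e 's/<p class=g>/\n<li>/g' -e 's/<br>/\n<br>/g' results.html \
  | grep '^<li>' \
  | grep -v 'example.com'
```

The first sed expression turns each result marker into a <li> on a fresh line; the second pushes every <br> onto its own line so the anchor ends cleanly; the two greps then do steps 6 and 7.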
Basically the same approach (with different sed commands) can be used to get the links returned by alltheweb into the same format; if you sort both files by URL you can compare the results. Note that alltheweb does not report internal links, but it does give external links to all of your pages.
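The sort-and-compare step might look like this; the filenames google.txt and atw.txt are assumptions (one URL per line, one file per engine), and comm requires its inputs to be sorted:

```shell
# Sort each cleaned link list in place, then compare them.
sort -o google.txt google.txt
sort -o atw.txt atw.txt
comm -12 google.txt atw.txt   # links both engines report
comm -23 google.txt atw.txt   # links only Google reports
comm -13 google.txt atw.txt   # links only alltheweb reports
```

comm is handy here because it splits the two sorted lists into "only in file 1", "only in file 2", and "in both" in one pass, which is exactly the comparison you want between the two engines.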
Definitely complicated if you are not intimately familiar with regular expressions, but quite easy to adapt as Google changes its format if you are.