Hi all -- I am using wget to check for bad links on my site. We have a lot of links on our site that our servers redirect to other sites (e.g. a link to an affiliate that we want to track in our logs). They are dynamic links that look static; that is, they all end with a special suffix such as *redir.html.
I am having a hard time preventing wget from following these links. Here's an example:
wget --recursive --delete-after --no-directories --no-host-directories --reject="*redir.html" [mysite.com...]
There are two problems. First, wget doesn't seem to be honoring the --reject= option (which should prevent it from following all links that match this pattern, or so I believe). Second, regardless of whether host spanning is on or off, or what I have in --exclude-domains, it still follows this redirect.
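To make clear what I expect the pattern to do, here is a small shell sketch of the matching I have in mind. It uses the shell's own glob matching on the last path component of a URL; it is only my assumption of how --reject behaves, not wget's actual code. The function name would_reject and the example.com URLs are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: mimics the glob match I expect --reject="*redir.html" to apply.
# It relies on the shell's pattern matching, not on anything inside wget.
would_reject() {
    # Strip everything up to the last slash to get the "file name" part of the URL.
    name=${1##*/}
    case "$name" in
        *redir.html) echo "reject" ;;   # name ends in redir.html
        *)           echo "follow" ;;   # anything else
    esac
}

# Made-up URLs, for illustration only.
would_reject "http://example.com/partners/acme-redir.html"
would_reject "http://example.com/about.html"
```

By that logic I would expect every link ending in redir.html to be skipped, which is not what I am seeing.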
I know this is not the greatest place to post this, but I appreciate any help I can get :-)
Thanks --
Tom