Linux, Unix, and *nix like Operating Systems Forum

wget question
I don't want it to follow our redirects to other sites

 6:00 pm on Sep 20, 2004 (gmt 0)

Hi all --

I am using wget to check for bad links on my site. We have a lot of links on our site that our servers redirect to other sites (e.g. a link to an affiliate that we want to track in our logs). They are dynamic links that look static; that is, they end with a special suffix such as *redir.html.

I am having a hard time preventing wget from following these links. Here's an example:

wget --recursive --delete-after --no-directories --no-host-directories --reject="*redir.html" [mysite.com...]

There are two problems. First, wget doesn't seem to be honoring the --reject option (which should prevent it from following any link matching this pattern, or so I believe). Second, regardless of whether host spanning is on or off, or what I put in --exclude-domains, it still follows this redirect.

I know this is not the greatest place to post this, but I appreciate any help I can get :-)

Thanks --




 4:53 am on Sep 22, 2004 (gmt 0)

--reject doesn't get applied to .html files.

The argument is that --recursive doesn't make any sense if you are excluding some .html files. I don't entirely agree, but I see the logic.

I didn't know that wget would follow a redirect recursively into a new site (without specifying --span-hosts). I'd call that a bug.

I've never seen that problem... Looking at a bunch of wget scripts, I notice that I always use --no-parent. It doesn't make much sense, but perhaps --no-parent causes wget to skip the link because it's not below the starting point?

It's worth a try.
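
For reference, the original command with --no-parent added might look like the sketch below. The check_links function name and the URL in the usage comment are placeholders, not anything from the original post, and whether --no-parent actually keeps wget from following the redirect is untested here.

```shell
# Hypothetical wrapper: check_links is an illustrative name, not a wget feature.
# $1 is the start URL of the site to crawl.
check_links() {
    wget --recursive --no-parent \
         --delete-after --no-directories --no-host-directories \
         --reject="*redir.html" \
         "$1"
}

# Usage (placeholder URL):
#   check_links "http://www.example.com/"
```

The only change from the command in the first post is the extra --no-parent; everything else is left as it was so the effect of that one option can be judged on its own.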


 9:16 pm on Sep 24, 2004 (gmt 0)

Thanks -- I'll give that a try and let you know how it works. In subsequent research, I found I was using version 1.8 and that a version 1.9 was available that was supposed to fix a problem with redirects going to foreign sites. After downloading, configuring, compiling and so on, it still did the same thing. Maybe I'll dust off my C programming and fix it myself :-)
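
One more thing that may be worth trying on a newer wget than the two versions mentioned above: later releases document a --max-redirect option that caps how many redirects are followed per resource. This is a rough sketch, assuming the installed wget supports that option (check `wget --help` first). The function name and the URL in the usage comment are made up for illustration.

```shell
# Hypothetical wrapper: check_links_no_redirects is an illustrative name.
# Assumes a wget build that supports --max-redirect.
# $1 is the start URL of the site to crawl.
check_links_no_redirects() {
    wget --recursive --max-redirect=0 \
         --delete-after --no-directories --no-host-directories \
         "$1"
}

# Usage (placeholder URL):
#   check_links_no_redirects "http://www.example.com/"
```

With a limit of 0, wget should stop at the redirect response itself instead of fetching the target on the other site. It will probably report each *redir.html link as an error, so those entries would need to be filtered out of the results when looking for genuinely bad links.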


 11:03 pm on Sep 24, 2004 (gmt 0)

Linklint might be more suitable for your purposes...
