Forum Moderators: mack


Site extractor recommendation please


Jon_King

12:06 pm on May 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From time to time it is really handy to just download an entire site over the web using a site "ripper", as they are called, rather than deal with a client's FTP access issues.

I've been using JOC Web Spider and have actually been very happy with it. When the trial expired I ordered it online, but I must have made a typo, because they took my money and sent me an email with an error message!

There is NO possible way to contact the company: no phone, no address, no email (except for the payment-processing company), nothing. I'm sorry, but I just can't do business like that.

I wish I had noticed the lack of contact information prior to ordering. I would not have ordered.

Can someone please recommend one of these tools? I am happy to pay for one (although there are many freeware versions), as I would like a trustworthy tool we can rely on.

jpjones

12:10 pm on May 16, 2003 (gmt 0)

10+ Year Member



I use the Unix command "wget", though I'm sure there's a Windows version too. It can mirror a site if necessary.

One point to know about these site-ripper tools, though: they only save the output from the web server, so if your clients' sites use any form of server-side scripting, e.g. PHP, you won't get the source code :(

HTH,
JP

outrun

12:13 pm on May 16, 2003 (gmt 0)

10+ Year Member



I have used BlackWidow in the past, but I've heard Teleport Pro is really good.

regards,
Mark

Jon_King

12:14 pm on May 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, I only use this kind of tool on the simplest of 'nothing dynamic' html sites.

Woz

12:29 pm on May 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have actually used Teleport Pro to take a snapshot of a dynamic site (my own) to burn to CD. Quite useful.

Onya
Woz

nobody

7:48 pm on May 16, 2003 (gmt 0)

10+ Year Member



Hello,
I normally use wget to download; 'tis very good and quick. (-r and -k are the options to get it to download a whole site and convert the links.) However, unlike WinHTTrack(?), wget doesn't understand JS rollovers and some other JS-related linking.
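For anyone trying this, a minimal sketch of the wget invocation described above (flags as documented for GNU wget; the URL is just a placeholder):

```shell
# Download a whole site recursively (-r) and rewrite links for
# local browsing (-k, i.e. --convert-links). -p also fetches page
# requisites such as images and CSS, and -np stops wget from
# wandering above the starting directory.
wget -r -k -p -np http://www.example.com/
```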

Jon_King

11:47 pm on May 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have tried both Black Widow and Teleport Pro. Both are similar and function well. I appreciate the suggestions very much.

Thank you,
Jon

[edited by: Jon_King at 1:34 am (utc) on May 17, 2003]

john316

12:07 am on May 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here's a Java-based solution; it works quite well.

[www-2.cs.cmu.edu...]

quotations

5:40 pm on May 19, 2003 (gmt 0)

10+ Year Member



I have used lynx in both its UNIX and Windows versions.

There are command line options for various ways of getting the site contents.

Look at -source and -dump; there may be other useful variants, but those are the ones I use most often.
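To illustrate the two options mentioned above (standard lynx flags; the URL and output filenames are placeholders):

```shell
# -source writes the raw HTML of a single page to stdout;
# -dump writes a rendered plain-text version, followed by a
# numbered list of the links found on the page.
lynx -source http://www.example.com/ > page.html
lynx -dump   http://www.example.com/ > page.txt
```

Note that lynx works one page at a time, so for grabbing a whole site you'd still script a loop around it or use wget's recursive mode.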

I have also used AnaSoft's Websnake, but not for a while.