Forum Moderators: open


Web Image Miner / Scraper

Seeking free software for web mining.


jasonfnorth

4:45 pm on Jul 20, 2006 (gmt 0)

10+ Year Member



I need to gather a TON of images together to battle-test some software, and I can't seem to find an open-source or free "web miner" or "web scraper" that sucks images down off web sites.

Is anyone aware of such a tool that I can quickly use?

Cheers,
NORTH

jimbeetle

4:51 pm on Jul 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hey jasonfnorth, I don't want to be too sarcastic, but can you see the irony of asking a community of webmasters for a recommendation for software that will suck down a ton of their images? ;-}

jasonfnorth

5:11 pm on Jul 20, 2006 (gmt 0)

10+ Year Member



Yea, sorry... I didn't mean to ignore addressing the idea of unlawful intent. Trust me, I'm not interested in leveraging copyrighted work for a for-profit venture; if that were the case, I surely wouldn't be posting here. I simply need a LOT of images to run a simulation. I don't even care what they are of.

kaled

6:35 pm on Jul 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How many is a lot? A hundred, ten thousand, a million?

What simulation are you trying to run? There may be a better way of doing it.

Kaled.

jimbeetle

6:41 pm on Jul 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Beat me to it, Kaled.

Depending on what's needed, I was going to suggest picking up a couple of those 10,000 image CDs. If necessary, copy the whole shebang into a new folder and use a file renaming utility. Repeat as needed.

jasonfnorth

6:45 pm on Jul 20, 2006 (gmt 0)

10+ Year Member



Yea, we need a few hundred thousand images. I've found PixGrabber and Web Site Downloader, but these products seem weak. Where's the php script or simple prog to do this?

[edited by: encyclo at 11:20 pm (utc) on July 20, 2006]
[edit reason] no URL drops to tools please, see forum charter [/edit]
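For what it's worth, the "simple prog" being asked about here is only a few lines in Python's standard library. This is a hedged sketch, not an endorsement of scraping sites without permission: it only extracts image URLs from a page's HTML (the `extract_image_urls` function name and sample markup are illustrative, not from any tool mentioned in the thread).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImgCollector(HTMLParser):
    """Collect absolute image URLs from <img src=...> tags."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page's base URL.
                self.urls.append(urljoin(self.base_url, src))

def extract_image_urls(html, base_url):
    """Return the absolute URLs of all images referenced in html."""
    parser = ImgCollector(base_url)
    parser.feed(html)
    return parser.urls
```

Actually fetching the files (e.g. with `urllib.request`) is the easy part; the hard part, as the rest of the thread points out, is doing it politely: honoring robots.txt, throttling requests, and only hitting sites whose owners allow it.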

rocknbil

6:46 pm on Jul 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



can you see the irony of asking a community of webmasters for a recommendation for software that will suck down a ton of their images?

Agreed, but there is more to this issue than just legalities. It is also probably why this thread [webmasterworld.com] went completely ignored: such practices present untold annoyances - false hits in stats, unnecessary increments in bandwidth (which site owners do have to pay for), and of course now the webmasters must find ways to stop you from doing so - we've no idea of your intent and can only assume it's nefarious.

See if you can find another way; those of us who know of scraping techniques will most likely not share. :-)

jasonfnorth

6:52 pm on Jul 20, 2006 (gmt 0)

10+ Year Member



No problem. I'm very resourceful and will figure it out.

THE END.

jimbeetle

8:09 pm on Jul 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



but there is more to this issue than just legalities

Agreed, that's why I didn't cite 'em, kept it general.

jimbeetle

8:18 pm on Jul 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



jasonfnorth, resourceful is good, but why go about it the hard way when you can find plenty of CDs with a few hundred thousand images at your nearest Staples or CompUSA?

zCat

8:25 pm on Jul 20, 2006 (gmt 0)

10+ Year Member



Wikipedia has masses of images available for download:
[download.wikimedia.org...]