Home / Forums Index / Code, Content, and Presentation / Databases
Forum Library, Charter, Moderators: physics

Databases Forum

    
Custom Spider/Scraper - Help!
Ivan




msg:3800985
 4:34 pm on Dec 5, 2008 (gmt 0)

Hey!

I'm new here, so a short introduction: I'm from Toronto, Canada, and I run a small finance-focused website.

The problem: many financial institutions publish their data online and update it daily. There are over 60 institutions, and following each one by hand is very challenging. I want to create a summary page with financial data from those institutions: release a spider once a day, collect their updates, and then post them all together on the website.

Obviously copy and paste is off the table, since it takes at least 1.5 hours to go through all the lenders and get their data. The only workable solution, it seems, is to set up a custom spider that crawls specific fields (div tags, table cells), extracts the data and compiles it into one file. The question is: do you know of any software that can do this? I know there are plenty of scrapers out there, but the requirement is that the spider can extract data from specified table cells and, in some cases, div tags.
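For illustration, the kind of extraction described above (pulling text out of specified table cells and div tags) can be sketched with nothing but Python's standard library. This is only a rough sketch under assumptions, not a recommendation of a particular tool: the element ids `rate-5yr` and `updated` and the sample page are made up, and real lender pages would need their own selectors. In practice each page would be downloaded first (for example with `urllib.request.urlopen`), and only with the site's permission.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect the text of <td> and <div> elements whose id is in wanted_ids."""

    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.results = {}     # element id -> extracted text
        self._current = None  # id of the element currently being captured
        self._tag = None      # tag name of that element
        self._depth = 0       # nesting depth of same-named tags inside it
        self._chunks = []     # text fragments seen inside the element

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            # Already capturing: only track nested tags of the same name.
            if tag == self._tag:
                self._depth += 1
            return
        if tag in ("td", "div"):
            elem_id = dict(attrs).get("id")
            if elem_id in self.wanted_ids:
                self._current, self._tag = elem_id, tag
                self._depth, self._chunks = 0, []

    def handle_endtag(self, tag):
        if self._current is None or tag != self._tag:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        # Closing tag of the captured element: store whitespace-normalized text.
        self.results[self._current] = " ".join("".join(self._chunks).split())
        self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def extract_fields(html, wanted_ids):
    """Return {id: text} for the requested table cells and div tags."""
    parser = FieldExtractor(wanted_ids)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical page fragment; the ids below are assumptions for this sketch.
SAMPLE_PAGE = """
<table><tr><td>5-year fixed</td><td id="rate-5yr"> 5.49% </td></tr></table>
<div id="updated">Updated <b>Dec 5</b>, 2008</div>
"""

fields = extract_fields(SAMPLE_PAGE, ["rate-5yr", "updated"])
print(fields)
```

Running the extractor once per institution and writing the resulting dictionaries to a single CSV file would cover the "compile it into one file" step. Pages that mark their cells by position or class rather than by id would need a different matching rule.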

I can't go to a data extraction company since they charge too much (do they?). Please let me know if you're aware of any applications that match those requirements.

Any help is appreciated, guys! Thanks!

 

LifeinAsia




msg:3801016
 5:00 pm on Dec 5, 2008 (gmt 0)

I think the bigger problem is the legality of what you want to do. Do you have permission to republish their information?

If so, why don't you ask them for RSS feeds or some other way of having them deliver the data to you in a more easily usable format?
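To show why a feed is easier to work with than scraped HTML, here is a minimal sketch of reading an RSS 2.0 feed with Python's standard library. The feed content is invented for the example; whether any institution actually offers such a feed, and which fields it would carry, is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; a real one would be fetched from the lender's URL.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Lender Rates</title>
    <item>
      <title>5-year fixed</title>
      <description>5.49%</description>
      <pubDate>Fri, 05 Dec 2008 16:00:00 GMT</pubDate>
    </item>
    <item>
      <title>1-year fixed</title>
      <description>4.95%</description>
      <pubDate>Fri, 05 Dec 2008 16:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return one dict per <item>, with empty strings for missing fields."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "value": item.findtext("description", default=""),
            "published": item.findtext("pubDate", default=""),
        }
        for item in root.iter("item")
    ]


items = read_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], entry["value"], entry["published"])
```

Because the field names are fixed by the feed format, the same few lines would work for every institution that publishes one, with no per-site selectors to maintain.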

ZydoSEO




msg:3801603
 4:36 pm on Dec 6, 2008 (gmt 0)

Sounds like this post should be in the Content, Writing, and Copyrighting forum.

And I agree with LifeinAsia. If you don't have permission to scrape these sites and take their content, then you have much bigger issues with the law.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
Terms of Service ¦ Privacy Policy ¦ Report Problem ¦ About
© Webmaster World 1996-2014 all rights reserved