auto adjustment of prices


loft - 5:27 am on Nov 27, 2011 (gmt 0)


Hello there,

Since this topic was left unanswered, I would like to finally answer my own question: "Where can I find a tool to read the 'source code'?"

I finally found a couple of so-called "web/screen scraping" tools that do what I need:

- Mechanize (Ruby based; not tested yet because I have no Ruby knowledge)
- Nokogiri (Ruby based; not tested yet because I have no Ruby knowledge)
- Hpricot (Ruby based; not tested yet because I have no Ruby knowledge)
- ScrAPI (Ruby based; not tested yet because I have no Ruby knowledge)
- Rubyful Soup (Ruby based; not tested yet because I have no Ruby knowledge)
- ARIEL (not tested yet)
- Scrapy (Python based and very powerful, tested and works)
- prudsys (sounded very promising, but it is very expensive and has some powerful features I don't need; not tested yet)
- scRUBYt (Ruby based; not tested yet because I have no Ruby knowledge)
- Anemone (Ruby based; not tested yet because I have no Ruby knowledge)
- mozenda (very easy to use, but it lacks some functions I need)
- uBot (very easy to use, but it lacks some functions I need; I tested the demo version and will not upgrade to the full version)
- Google Docs (supports only 50 ImportXML functions, which makes it useless for my purpose)
- 80legs (80app) (very easy to use, but it lacks some functions I need)
- 30digits (Web Extractor) (very easy to use, but it lacks some functions I need)
- my own PHP crawling script (works great, but it's not finished yet; it still needs to save each item into an SQL database)
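
For anyone wondering what the missing "save the item into an SQL database" step could look like, here is a minimal sketch. It is written in Python rather than PHP and uses only the standard library, so it is not the script mentioned above. The names `PriceParser`, `parse_price`, `save_prices`, the `price` CSS class, and the `items` table are all assumptions made up for illustration. It also assumes prices are written like "$1,299.00" (comma as thousands separator, dot as decimal point).

```python
import re
import sqlite3
from html.parser import HTMLParser


def parse_price(text):
    """Return the first number in text as a float, or None if there is none.

    Assumes a dot as decimal separator; commas are treated as thousands
    separators and dropped.
    """
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


class PriceParser(HTMLParser):
    """Collects prices from elements whose class attribute includes 'price'.

    The class name 'price' is a hypothetical example; real shop pages
    will use their own markup.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._in_price = True

    def handle_data(self, data):
        # Take the first text chunk after a matching start tag that holds a number.
        if self._in_price:
            price = parse_price(data)
            if price is not None:
                self.prices.append(price)
                self._in_price = False


def save_prices(conn, url, prices):
    """Store one row per scraped price in a (hypothetical) 'items' table."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (url TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO items (url, price) VALUES (?, ?)",
        [(url, price) for price in prices],
    )
    conn.commit()


if __name__ == "__main__":
    # Sample HTML stands in for a downloaded page; no network access here.
    html = '<div><span class="price">$1,299.00</span></div>'
    parser = PriceParser()
    parser.feed(html)
    conn = sqlite3.connect(":memory:")
    save_prices(conn, "http://example.com/item", parser.prices)
    print(conn.execute("SELECT url, price FROM items").fetchall())
```

A real crawler would download the page first and would probably need a more forgiving price parser, but the parse-then-insert flow stays the same.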

"Use the right tool for the right job!" So I need more time to test them all. My current favorite is Scrapy, by the way!

Thanks for reading

[edited by: lorax at 2:49 pm (utc) on Nov 28, 2011]


Thread source: http://www.webmasterworld.com/ecommerce/4388672.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com