Hello there,
Since this topic has been left unanswered, I would like to finally answer my own question: "Where can I find a tool to read the "source code"?"
I found a number of so-called "web / screen scraping" tools that do what I need:
- Mechanize (Ruby-based; not tested yet because I have no Ruby knowledge)
- Nokogiri (Ruby-based; not tested yet because I have no Ruby knowledge)
- Hpricot (Ruby-based; not tested yet because I have no Ruby knowledge)
- ScrAPI (Ruby-based; not tested yet because I have no Ruby knowledge)
- Rubyful-soup (Ruby-based; not tested yet because I have no Ruby knowledge)
- ARIEL (not tested yet)
- Scrapy (Python-based and very powerful; tested and works)
- prudsys (sounded very promising, but it is very expensive and has some powerful features I don't use or need; not tested yet)
- scRUBYt (Ruby-based; not tested yet because I have no Ruby knowledge)
- Anemone (Ruby-based; not tested yet because I have no Ruby knowledge)
- mozenda (very easy to use, but it lacks some functions I need)
- uBot (very easy to use, but it lacks some functions I need; I tested the demo version and will not upgrade to the full version)
- Google Docs (supports only 50 ImportXML functions, so it is useless for my purpose)
- 80legs (80app) (very easy to use, but it lacks some functions I need)
- 30digits (Web Extractor) (very easy to use, but it lacks some functions I need)
- my own PHP crawling script (works great, but it's not finished yet; the script should still save the items into a SQL database)
"Use the right tool for the right job!" So I need more time to test them all. My current favorite is Scrapy, by the way!
Thanks for reading
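
P.S. For anyone wondering what the "save the item into a SQL database" step from the list above could look like, here is a minimal sketch in Python. It is not the PHP script and not how Scrapy does it internally; it only uses the Python standard library (html.parser and sqlite3). The names LinkParser, extract_items, save_items, the "items" table and the sample HTML are all assumptions made up for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects (href, link text) pairs from <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_items(html):
    """Return a list of (url, title) tuples found in the HTML string."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return parser.items


def save_items(conn, items):
    """Store the items; running it again updates rows instead of duplicating."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (url TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO items (url, title) VALUES (?, ?)", items
    )
    conn.commit()


if __name__ == "__main__":
    # Hypothetical page source; a real crawler would download this first.
    sample_html = '<ul><li><a href="/item/1">First item</a></li></ul>'
    connection = sqlite3.connect(":memory:")  # use a file path to keep the data
    save_items(connection, extract_items(sample_html))
    for row in connection.execute("SELECT url, title FROM items"):
        print(row)
```

Using the URL as primary key is just one possible choice: it keeps the table free of duplicates when the same page is crawled twice.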
[edited by: lorax at 2:49 pm (utc) on Nov 28, 2011]