Forum Moderators: Ocean10000 & incrediBILL

Scraper Test Drive

Open Discussion of Scraper Tools and Success Rates

   
8:58 pm on Aug 29, 2013 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Someone from Distil Networks posted a series of scraper test drives using various tools and techniques against their technology.

Scrape Bot Protection Test:
[extract-web-data.com...]

You can obviously use their testing methodologies to validate your protection against bots.

It's possible you'll find holes in your methods and either decide to switch to a service like theirs or get a better script for your own hosting.

Scraping through a CAPTCHA:
[extract-web-data.com...]

A couple of other related scraper testing posts on that site are also worth a read.

I found it interesting to say the least ;)

DISCLOSURE: I'm not affiliated with, and have no personal interest in, the site, service or links posted; they're presented here strictly for educational purposes.
11:35 am on Aug 30, 2013 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Bill,
Personally, I'd be leery of running an outside product against my sites and/or htaccess. It may not fly anyway and I'd likely have to make exceptions for access.

This could surely be considered a 3rd-party product, so why would we invite the possibility of this org using creative solutions to increase their profits?

Don
3:41 am on Aug 31, 2013 (gmt 0)

WebmasterWorld Administrator incredibill



The guy who wrote the article is actually from a hosting company that claims to have content protection built into their basic service. They aren't the only ones combining content protection and hosting, as it's becoming somewhat of a trendy thing.

However, I was more interested in all the tools and methods he used to test their service, including the CAPTCHA blow through services.

I use a few programs to attack my own sites every now and then just to see how well they stand up and found some of his methods interesting as well.

Going to see how his stuff measures up to mine, should be amusing.
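
For anyone who wants to try the kind of self-test described in this thread, here is a minimal sketch in Python. It is not taken from the linked articles or from any tool mentioned above; the User-Agent strings, the status codes treated as "blocked", and the names `classify`, `probe` and `run` are all assumptions for illustration. It requests one URL on your own site with a few different User-Agent headers and reports whether each request was served or blocked.

```python
import urllib.error
import urllib.request

# Hypothetical User-Agent strings to probe with; swap in whatever you want to test.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 6.1; rv:23.0) Gecko/20100101 Firefox/23.0",
    "python-default": "Python-urllib/3",
    "curl": "curl/7.30.0",
}

# Assumption: these status codes mean the request was refused by bot protection.
BLOCKED_STATUSES = (403, 429, 503)


def classify(status):
    """Map an HTTP status code (or None) to a rough verdict."""
    if status is None:
        return "no response"
    if status in BLOCKED_STATUSES:
        return "blocked"
    if 200 <= status < 300:
        return "served"
    return "other"


def probe(url, user_agent, timeout=10):
    """Fetch url once with the given User-Agent; return the HTTP status or None."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions but still carry a status code.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return None


def run(url, fetch=probe):
    """Probe url once per User-Agent and return {label: verdict}."""
    return {name: classify(fetch(url, ua)) for name, ua in USER_AGENTS.items()}
```

Calling `run("http://www.example.com/")` against a site you control returns a small dict of verdicts. Only point it at your own sites, and expect a real scraper to vary far more than the User-Agent header.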