---- Should Google Tank the Crowd Sourced Content Scrapers?
incrediBILL - 9:08 pm on Sep 8, 2012 (gmt 0)
They are not bots and aren't bound by bot conventions. Hotlink protection and robots.txt aren't going to help.
Who said anything about robots.txt? That's just a suggestion for good bots, and many don't obey it anyway, but that's another discussion for another day.
The humans are directing single-purpose scraper bots/tools to make copies of stuff. Those tools don't always ID themselves as browsers, and even when they do, most often they can be stopped from copying things from sites without permission. You can't copy images off some of my servers unless you jump thru serious hoops; you REALLY have to want to steal it badly.
Not only that, you can build web pages that trick both people and those tools. My favorite trick is to make the image the BACKGROUND image and then lay a transparent 1x1 pixel image over it, resized to fill the image area (table cell, div, etc.). All they end up copying is a blank transparent image. Not to mention all the tricks to disable the right mouse button, the view-page-source hot keys, etc.
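For anyone who wants to see the background-image trick spelled out, here is a rough Python sketch that generates the markup. It is only an illustration, not the exact code running on any real server: the function name `protected_image_html` and the pixel path `/img/blank.gif` are made-up placeholders, and it assumes you already host a 1x1 transparent GIF somewhere.

```python
from html import escape

# Hypothetical path to a 1x1 transparent GIF hosted on your own site.
BLANK_PIXEL_URL = "/img/blank.gif"


def protected_image_html(image_url: str, width: int, height: int) -> str:
    """Build a div that shows image_url as a CSS background, covered by a
    stretched transparent pixel. A right-click "save image" or a naive
    <img> scraper picks up the blank pixel instead of the real picture."""
    safe_url = escape(image_url, quote=True)
    style = (
        f"width:{width}px;height:{height}px;"
        f"background-image:url('{safe_url}')"
    )
    return (
        f'<div style="{style}">'
        f'<img src="{BLANK_PIXEL_URL}" width="{width}" height="{height}" alt="">'
        "</div>"
    )


if __name__ == "__main__":
    print(protected_image_html("/photos/cat.jpg", 400, 300))
```

Keep in mind the real image URL is still sitting in the style attribute, so this only stops casual copying and tools that grab `<img>` tags. Somebody who digs thru the source will still find it.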
( inserting URLs, meta data, meta tags etc etc ) get stripped with just a 1 pixel crop on the image by the
I'm not sure how stripping 1 pixel strips a URL typed across the image, 'splain it to me.
You must be confusing URLs printed on the image with meta data URLs embedded in the image; they're totally different.
... and bot blocking encompasses both whitelisting and blacklisting.
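To show what combining the two looks like, here is a bare-bones Python sketch of a user agent check that blacklists known copy tools and whitelists known good bots plus anything that looks like a browser. The lists and the name `allow_request` are placeholders I made up for illustration, not anyone's production rules.

```python
# Hypothetical lists; real ones would be much longer and tuned per site.
BLOCKED_TOOLS = ("wget", "curl", "httrack", "python-requests")  # blacklist
ALLOWED_BOTS = ("googlebot", "bingbot")                         # whitelist
BROWSER_HINTS = ("mozilla", "opera")                            # browser-like UAs


def allow_request(user_agent: str) -> bool:
    """Return True if the request should be served, based only on the
    User-Agent string. Blacklist wins, then whitelist, then browsers."""
    ua = (user_agent or "").lower()
    if not ua:
        return False  # no user agent at all: treat as a tool
    if any(tool in ua for tool in BLOCKED_TOOLS):
        return False  # blacklisted copy tool
    if any(bot in ua for bot in ALLOWED_BOTS):
        return True   # whitelisted crawler
    # Everything else must at least look like a browser.
    return any(hint in ua for hint in BROWSER_HINTS)


if __name__ == "__main__":
    for ua in ("Wget/1.21", "Mozilla/5.0 (compatible; Googlebot/2.1)", "SomeScraper/1.0"):
        print(ua, "->", allow_request(ua))
```

User agents are trivially faked, so a check like this is only the first layer. Anything claiming to be a whitelisted bot or a browser would need further verification before you trust it.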