thirteen - 3:56 am on Aug 13, 2011 (gmt 0)
In this particular case, I think I want to deny the referring site and not the visitor. This site is acting like a portal. Visitors go to this scraper site and click on a link to my page.
Instead of sending the visitor to my site, the scraper site fetches the page the visitor wants to see from my site and serves it up to the visitor itself.
I want to block any traffic coming from that scraper site so the visitor will have to come to my site directly if they want to see my content. I want to keep the visitor and lose the scraper site.
I can identify when they come scraping. In my log, I will see an entry like this:
Number of Entries:1
Entry Page Time:Aug 11 2011 10:45:44 AM
Visit Length:0 seconds
Location:Arlington, Massachusetts, United States
IP Address:Psinet (188.8.131.52) [Label IP Address]
Referring URL: www.#*$!.org/cgi-bin/nbbw.cgi
The only constant is the Referring URL line. It's always the same URL. So when I see www.#*$!.org/cgi-bin/nbbw.cgi on that line, I know my page has been scraped.
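
Since the referring URL is the one thing that never changes, the block could key on that alone. Here is a minimal sketch of the check in Python, not a finished solution. The host scraper.example is a hypothetical stand-in because the real one is masked in the log above, and the function names are made up for illustration. It assumes the scraper keeps sending that same referrer on every request.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in: the real host is masked in the log entry above.
BLOCKED_HOST = "scraper.example"
# Script path taken from the Referring URL line in the log.
BLOCKED_PATH = "/cgi-bin/nbbw.cgi"


def is_scraper_referrer(referer):
    """Return True when the referrer points at the scraper's CGI script."""
    if not referer:
        # Direct visitors usually send no referrer at all; let them in.
        return False
    referer = referer.strip()
    # Log lines often omit the scheme; add one so urlsplit can find the host.
    if "://" not in referer:
        referer = "http://" + referer
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    # Treat www.scraper.example and scraper.example as the same site.
    if host.startswith("www."):
        host = host[4:]
    return host == BLOCKED_HOST and parts.path == BLOCKED_PATH


def status_for(referer):
    """Pick an HTTP status: 403 for the scraper, 200 for everyone else."""
    return 403 if is_scraper_referrer(referer) else 200
```

A visitor who comes directly, or from any other site, gets a 200 and sees the page as normal. Only requests carrying the scraper's referrer get the 403.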