Fighting bad crawlers with a php script.

     

rudyten

6:07 am on Apr 25, 2008 (gmt 0)




I'm new to the whole PHP thing, but I like it... hehehe

EVIL Spammers & Scammers
I know I can block these evil little crawlers that visit my site with Apache or .htaccess.

I know that IP addresses can be spoofed.

But there are some IPs that do indeed belong to these Evil Spammers & Scammers.

When I detect them, I like redirecting them to a nice place like a popup test site, or giving them a 404 header.

I place a simple IF statement at the top of my header to do this.
I log them, then I send them to paradise.
So every time I learn of a new Spammer/Scammer I modify my IF statement.
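For readers following along, the "IF statement at the top of the header" approach could look roughly like the sketch below. It is not the poster's actual code: the function names, the log file path, and the addresses (taken from the reserved documentation ranges) are all made up for illustration.

```php
<?php
// Hypothetical block list; these are documentation-range addresses, not real offenders.
const BLOCKED_IPS = ['192.0.2.10', '198.51.100.23'];

// True when the visitor's address is on the list.
function is_blocked_ip(string $ip, array $blocked = BLOCKED_IPS): bool
{
    return in_array($ip, $blocked, true);
}

// Log the hit, then end the request with a 404.
function reject_if_blocked(string $ip, string $logFile): void
{
    if (!is_blocked_ip($ip)) {
        return;
    }
    // Message type 3 appends the line to the given file.
    error_log(date('c') . " blocked $ip\n", 3, $logFile);
    http_response_code(404);
    exit;
}

// Usage at the very top of the header include (assumed log location):
// reject_if_blocked($_SERVER['REMOTE_ADDR'] ?? '', __DIR__ . '/blocked.log');
```

The check has to run before any output is sent, otherwise the status code can no longer be changed.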

I've been wanting to change that IF statement to check against a database table of EVIL Spammer/Scammer IPs, and make the whole thing a simple INCLUDE file I can just add to my header. And of course create a simple Add/Remove/Edit page to manage these evil little buggers. I have seen several software packages that promise this, but they are so complex.
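A minimal sketch of the database-backed version might look like this, assuming a PDO connection and a hypothetical table named `blocked_ips` with `ip` and `note` columns. None of these names come from the thread; the helper functions are only the bare bones that an Add/Remove page would call.

```php
<?php
// Assumed (hypothetical) schema: blocked_ips(ip, note). VARCHAR(45) fits IPv6 too.
function create_block_table(PDO $db): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS blocked_ips (
        ip VARCHAR(45) PRIMARY KEY,
        note VARCHAR(255)
    )');
}

function remove_blocked_ip(PDO $db, string $ip): void
{
    $stmt = $db->prepare('DELETE FROM blocked_ips WHERE ip = ?');
    $stmt->execute([$ip]);
}

function add_blocked_ip(PDO $db, string $ip, string $note = ''): void
{
    // Delete first so re-adding an address simply updates its note.
    remove_blocked_ip($db, $ip);
    $stmt = $db->prepare('INSERT INTO blocked_ips (ip, note) VALUES (?, ?)');
    $stmt->execute([$ip, $note]);
}

// The lookup the header include would run on every request.
function ip_is_in_table(PDO $db, string $ip): bool
{
    $stmt = $db->prepare('SELECT 1 FROM blocked_ips WHERE ip = ?');
    $stmt->execute([$ip]);
    return $stmt->fetchColumn() !== false;
}
```

Prepared statements keep the address out of the SQL string, which matters because an admin page will be feeding form input into these functions.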

Some DOMAINS produce nothing but EVIL... So blocking an Evil domain, I guess, is good too...
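One way domain blocking could be sketched is with a reverse DNS lookup on the visitor's IP followed by a suffix match against a list of domains. The function names and the example domain are hypothetical. Keep in mind that reverse DNS is set by whoever controls the IP range, and that many addresses have no hostname at all, so this is a hint rather than proof.

```php
<?php
// True when $host is one of the blocked domains or a subdomain of one.
function host_in_blocked_domain(string $host, array $blockedDomains): bool
{
    $host = strtolower(rtrim($host, '.'));
    foreach ($blockedDomains as $domain) {
        $domain = strtolower($domain);
        // Require a dot before the suffix so "notbad.example" does not match "bad.example".
        if ($host === $domain || substr($host, -strlen($domain) - 1) === '.' . $domain) {
            return true;
        }
    }
    return false;
}

// Resolve the IP to a hostname, then apply the suffix check.
function ip_in_blocked_domain(string $ip, array $blockedDomains): bool
{
    $host = gethostbyaddr($ip);
    // gethostbyaddr() returns the IP unchanged when the lookup fails.
    if ($host === false || $host === $ip) {
        return false;
    }
    return host_in_blocked_domain($host, $blockedDomains);
}
```

The lookup adds a DNS round trip to the request, so it is probably worth caching the result per IP rather than resolving on every page view.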

What are some thoughts on this?
What do you do to fight these spammers?

irix

12:32 pm on Apr 25, 2008 (gmt 0)




Try bad-robots.php by Alex Kemp. He has posted the code on this forum.

I have an 8-CPU database server, and my site was being crawled so aggressively that the server would run at 800% utilization and robots were pushing humans off the server - causing me to lose money.

I put the code in as is and it worked great, then I started tinkering and customizing it to exclude Google and Yahoo from being throttled back. All in all I'm very happy, and Alex's program is nothing short of a miracle. I run a shopping site and my revenues have been up about 50% ever since, which goes to show how much money those badly behaved robots were costing me. I found some very interesting things going on as well: fast scrapers that were pulling down my web pages without regard to my site's performance. When I find fast scrapers now, I just block them out of my site after I've confirmed their IP address has no business scraping me.
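To illustrate the general idea of throttling fast scrapers, here is a generic sliding-window counter. It is not Alex Kemp's bad-robots.php and makes no claim about how that script works; the function name and the default limits are assumptions picked for the example.

```php
<?php
// $hits holds the earlier request timestamps (in seconds) for one IP.
// Returns [allowed, updatedHits] so the caller can store the new list.
function check_rate(array $hits, int $now, int $maxHits = 10, int $window = 5): array
{
    // Keep only the requests that fall inside the current window.
    $recent = array_values(array_filter($hits, fn($t) => $t > $now - $window));
    $recent[] = $now;
    return [count($recent) <= $maxHits, $recent];
}

// Usage sketch: load $hits for the visitor's IP from wherever you keep them,
// call check_rate($hits, time()), save the returned list, and send a 503
// (or block outright) when the first element is false.
```

Where the per-IP timestamps live (a file, a database table, shared memory) is left out on purpose. Crawlers you want to exempt, as irix did with Google and Yahoo, would simply skip this check once you have verified who they are.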

 
