Bot banning consultants and/or bad bot banning services

Do they exist? Help for those of us who haven't a clue

         

Webwork

11:28 pm on Sep 12, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



It would likely take forever and a day to figure out how to do what you all routinely talk about back here. I mean, identify the bad guys, lay traps, master the game of whack-a-bot, . . .

Which leads me to ask: ARE THERE folks or companies that offer services to know-nothings like myself, to help us set up bot banning systems?

What can you all tell me about such companies, folks and services - if they exist? Any such entities have street cred? Any offer some form of automated service? Any school (besides here) where I can go to learn "how to" if there is no such service provider?

Shame if such bot banning assistant services don't exist. Seems like it should be a booming business based on what I am reading back here and "out there", about scrapers, content thieves, MFAs, etc.

Thanks for your guidance. I wish I had your brains for this stuff. Anyone want to give your brain a rest and loan it to me? I promise I won't work it too hard. :0)

P.S. I love the cloak and dagger intrigue of the spider/bot threads. Kind of like Sherlock Holmes meets Dashiell Hammett meets Harry Potter meets Monty Python. ;)

Ocean10000

2:58 pm on Sep 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There are a few. I think they will chime in or sticky you, since dropping the names would get me TOS slapped. But I can at least ask some of the more common questions they would ask you.

What OS and Webserver software are you currently running?
Examples
(1). linux/unix & apache
(2). Microsoft OS & IIS

What type of sites are you running?
(1). Personal
(2). directory
(3). wikis
(4). others

What exactly are you looking for them to do for you?
(1). Help identify the bots in your logs?
(2). Help identify & block them from accessing your site(s)?
(3). Cloaking, to hide your real content and show them fake content?

wilderness

3:42 pm on Sep 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It would likely take forever and a day to figure out how to do what you all routinely talk about back here. I mean, identify the bad guys, lay traps, master the game of whack-a-bot, . . .

Actually, all it takes is some initiative!
Do a search at Google for htaccess and sift through the crap until you find an example that you're able to comprehend, and then you have your beginning.
It doesn't take a schizophrenic on meds to determine crawls in your visitor logs, just the time to analyze.
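As a rough illustration of that kind of log analysis, here is a minimal sketch that counts requests per IP and user agent. It assumes the server writes Apache's "combined" log format; the function name `count_by_agent` and the sample lines are made up for the example and do not come from any real log.

```python
import re
from collections import Counter

# Apache "combined" format:
# IP identity user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def count_by_agent(lines):
    """Return a Counter mapping (ip, user_agent) to its number of requests."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not parse are skipped
            hits[(match["ip"], match["agent"])] += 1
    return hits

# Made-up sample lines; 192.0.2.x and 198.51.100.x are documentation addresses.
SAMPLE = [
    '192.0.2.10 - - [13/Sep/2006:15:42:01 +0000] "GET /a.html HTTP/1.1" 200 5120 "-" "ExampleScraper/1.0"',
    '192.0.2.10 - - [13/Sep/2006:15:42:01 +0000] "GET /b.html HTTP/1.1" 200 4096 "-" "ExampleScraper/1.0"',
    '192.0.2.10 - - [13/Sep/2006:15:42:02 +0000] "GET /c.html HTTP/1.1" 200 2048 "-" "ExampleScraper/1.0"',
    '198.51.100.7 - - [13/Sep/2006:15:43:10 +0000] "GET /a.html HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0"',
    'this line is not a log entry',
]

if __name__ == "__main__":
    # Busiest clients first; a burst of pages from one agent is worth a closer look.
    for (ip, agent), count in count_by_agent(SAMPLE).most_common():
        print(count, ip, agent)
```

Sorting by request count only surfaces candidates. Deciding which of them are actually unwelcome is still a judgment call for each site.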

Which leads me to ask: ARE THERE folks or companies that offer services to know-nothings like myself, to help us set up bot banning systems?

There are people and/or companies that will accept your fees to do most anything you desire, both set-up and maintenance.

What can you all tell me about such companies, folks and services - if they exist? Any such entities have street cred? Any offer some form of automated service? Any school (besides here) where I can go to learn "how to" if there is no such service provider?

You mean their company gets like twenty million visitors a month, they have advertising revenue and a Dun & Bradstreet credit rating exceeding $10 mil?
Good luck!

Shame if such bot banning assistant services don't exist. Seems like it should be a booming business based on what I am reading back here and "out there", about scrapers, content thieves, MFAs, etc.

There's really not a market for this?

Let's assume that you have a website or two that gets 5-10 million visitors a month?
Those websites also reap substantial income from advertising.

Why on earth would the company or webmaster reduce their advertising revenue by cutting the visitor traffic that inflates the counts used to fix ad fees?
It's a bit of a circle, sort of like a dog chasing its own tail, however some websites do it quite effectively.
Eventually advertisers understand they are being hoaxed and disconnect themselves from the union.

Your tongue-in-cheek comments below are not worthy of comment.

Thanks for your guidance. I wish I had your brains for this stuff. Anyone want to give your brain a rest and loan it to me? I promise I won't work it too hard. :0)

P.S. I love the cloak and dagger intrigue of the spider/bot threads. Kind of like Sherlock Holmes meets Dashiell Hammett meets Harry Potter meets Monty Python. ;)

Webwork

9:40 pm on Sep 13, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Ocean, my answer to your questions in every case would be "all of the above" (a-c, etc.): Win2K and soon LAMP. All manner of websites. Ban all the bad behavior.

Wilderness, with my tongue firmly planted where it wasn't heretofore planted: Thanks.

I think.

;-P

incrediBILL

11:01 pm on Sep 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I feel like I'm being baited.

However, Googlebot spoofing is ancient history, stay tuned for the post when Dan wakes up.
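For readers wondering how a spoofed Googlebot can be told from a real one, a common approach is a forward-confirmed reverse DNS check. The sketch below is only an illustration of that idea, not anything Bill describes here: the function name and the injectable lookup parameters are assumptions made so the logic can be exercised without live DNS.

```python
import socket

def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for an IP claiming to be Googlebot."""
    # Default to real DNS lookups; callers can inject fakes for offline testing.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        # Step 1: the reverse (PTR) name must sit under a Google crawler domain.
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Step 2: that name must resolve back to the very same IP.
        return ip in forward_lookup(host)
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return False
```

The second step matters because anyone who controls the reverse DNS for their own address block can make it claim a googlebot.com name. Only the forward lookup is under Google's control.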

Bewenched

11:39 pm on Sep 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I do think that there would be a market for a company who actively helped identify site scrapers, notified appropriate search engines, notified their customers, kept a running list of bad bots and or helped identify them.

I spend about 4 hours a week doing my "search and destroy" on these types of sites and I would certainly pay a reputable company to do this for me and give me a weekly report.

martinibuster

12:02 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



>>>I feel like I'm being baited.

Bill is being modest. ;)

Bill gave a lively talk on this at the San Jose SES this summer that had the Google and Yahoo guys hanging on every word. Great stuff.

He will be reprising that presentation with a newly updated discussion in Vegas. I would encourage anyone with an interest in preventing scrapers and other bandwidth munching, content stealing rogue bots to check out his presentation.

wilderness

1:23 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I do think that there would be a market for a company who actively helped identify site scrapers,

Gary does this free with his weekly reports

notified appropriate search engines,

notify what SE's?
1) the majors
a) most only answer with automated replies asking the contactee to jump through the same hoops that initiated the automated reply in the first place. Rarely is there any resolution presented. The same goes for internet providers.

2) harvesters
a) notify them of what?

notified their customers,

I'm not sure who the "their" is here?
The initial paying company's customers or the SE's customers?
I submit a report to you and then it's also required that I submit the same report and initiate communications with your customers?
1) I've not given you permission, in the fees you paid, to pass my reports on to your customers.
2) Let your customers contact me and arrange a contract the same as you did.

kept a running list of bad bots and or helped identify them.

Gary does this free with his weekly reports

I spend about 4 hours a week doing my "search and destroy" on these types of sites and I would certainly pay a reputable company to do this for me and give me a weekly report.

Just go to Gary's website and download the files.

However. . . you need to keep in mind that each website has its own marketing goals.
Each website decides on its own what is detrimental and what is beneficial.
Each website may or may not need visitors (or customers) from different regions of the world. As a result, eliminating those regions from your traffic could reduce your problems drastically.
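A minimal sketch of what eliminating a region amounts to: checking each visitor's address against a list of network ranges. The ranges below are reserved documentation blocks used purely as placeholders, and `is_blocked` is a hypothetical helper; real regional allocations would have to be looked up with a registry such as ARIN.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges) standing in for
# whatever regional allocations a site owner decides to exclude.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

def is_blocked(ip_string, networks=BLOCKED_NETWORKS):
    """Return True if the address falls inside any of the blocked networks."""
    address = ipaddress.ip_address(ip_string)
    return any(address in network for network in networks)
```

On an Apache server the same ranges would more likely live in htaccess or a firewall rule, so this only shows the matching logic.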

For your four hours, how much are you willing to pay weekly for a "standard" report whose content you would have no input into?
$400, $4,000 or even $40,000?
The price could depend on how much customization and explanation you require in your reports.

If, on the other hand, you're looking for somebody to spend hours going over your visitor logs for an initial set-up against scrapers and harvesters?
The simplest solution is to set up a bot trap or train some personnel in your company to do the monitoring.
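For anyone who has not seen one, a bot trap is usually a URL that no human would follow and that robots.txt tells polite crawlers to avoid, so whatever requests it can be banned on the spot. The sketch below shows only that core logic; the class name, the `/trap/` path and the in-memory ban list are assumptions for illustration, not a description of any particular product.

```python
class BotTrap:
    """Ban any client that requests a URL no human or polite crawler should visit."""

    def __init__(self, trap_path="/trap/"):
        # Hypothetical path; it would also be listed under Disallow in robots.txt
        # and linked from pages in a way human visitors never see.
        self.trap_path = trap_path
        self.banned = set()

    def handle(self, ip, path):
        """Return the HTTP status code to send for this request."""
        if ip in self.banned:
            return 403  # already caught earlier
        if path.startswith(self.trap_path):
            self.banned.add(ip)  # walked into the trap: ban from now on
            return 403
        return 200
```

A real installation would persist the ban list and probably expire old entries, since IP addresses get reassigned.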

In final summary, most companies or webmasters don't have the time and couldn't give two squats about analyzing the data in their visitor logs. The subject is just NOT cost/profit efficient.

Hell, most people don't even know how to find either their visitor logs or even ARIN. Nor are they aware of the capabilities of a firewall or htaccess.
Of course they could learn all this with a little understanding and poking around the internet.

incrediBILL

1:35 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If, on the other hand, you're looking for somebody to spend hours going over your visitor logs for an initial set-up against scrapers and harvesters?
The simplest solution is to set up a bot trap or train some personnel in your company to do the monitoring.

It's possible to do it with 100% automation so you don't need to train anyone to do anything except install the bot stopper which is no harder than installing a blog.

In final summary, most companies or webmasters don't have the time and couldn't give two squats about analyzing the data in their visitor logs. The subject is just NOT cost/profit efficient.

Sorry, but that's an opinion, not a final summary.

Of course people that don't know any better and/or are unaware of the situation won't be concerned, that's what marketing is all about. You would be amazed how people that know nothing about these things react to as little as 15 minutes of education on the topic.

Just go to Gary's website and download the files.

No offense to Gary, but although Gary's files are a good resource they aren't the end-all-be-all that you think. Gary is only aware of things that look at his niche or that other people in specific niches share with Gary. I have several sites in different niches and can tell you for certain that there are bots in common with all the sites and other bots that only visit a specific niche.

[edited by: incrediBILL at 1:39 am (utc) on Sep. 22, 2006]

wilderness

1:46 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Gary's files are a good resource but not the end-all-be-all that you think as Gary is only aware of things that look at his niche. I have several sites in different niches and there are bots in common and other bots that only visit a specific niche, you would be amazed.

You're making quite a few assumptions here, Bill.
And you know the reference to that!

I used Gary's files as an example.
There's not an "end-all-be-all" to anything, either websites or any other subject you may choose to inject.

After doing this for six years, nothing amazes me.
Not even the lack of effort by visitors to improve the simplest of search skills.
Nor the extent of data that many people assume exists on the internet when they are unable to find the hard copy data. Or, for that matter, even the name of what they are looking for.

Back to lurking and my work, as I've spent enough non-productive time at WebmasterWorld.

incrediBILL

1:53 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You're making quite a few assumptions here, Bill.

I'm a big fan of Gary's work but I've compared Gary's data to my data and he's missing a few things and so am I, so I don't know what assumption I'm making because my data isn't the end-all-be-all either.

Someone alerted me to 4 commercial bots I've never even heard of just 2 days ago so I don't pretend to know everything either ;)

My purpose wasn't to slam Gary, but you made it sound like getting that file, while it's a great start, was all someone needed to do. My point, which may not have come across well, was that it may not contain something that crawls in your niche, so more work may be required.

"I know what I know if you know what I mean" - Edie Brickell

[edited by: incrediBILL at 1:55 am (utc) on Sep. 22, 2006]

wilderness

2:32 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I suppose then the best option for the two inquirers in this thread is to sticky you, since you seem to have both the answers and the questions ;)

"I am trying to see your point of view.....
But I can't get my head that far up my ass" KAT

incrediBILL

3:01 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



lovely quote LOL

I never said I have all the answers but maybe, someday, if I keep asking all the right questions.

When I don't know I ask Gary ;), Ocean or Pfui [wherever the heck she is...]

GaryK

3:01 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



For some reason the DEA comes to mind. They like to brag about how much they catch, but they usually neglect to mention how much more gets through.

I think it's the same with user agents. Even with Bill, Ocean, myself and others all looking for bad bots we will never, ever have the definitive list. And if we somehow did, not everyone would agree with what we consider definitive. :)

wilderness

3:41 am on Sep 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There's a gal who rarely participates in this forum any longer and who is sharper than most.

Unable to recall her screen name.

Jim might, as she provided him with a solution that makes a single-line expression work as a sort of OR option.
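The exact solution isn't given in this thread, but the general idea of a one-line "OR" is plain regular-expression alternation: several patterns joined with `|` so a single expression matches any of them. A minimal sketch follows, with made-up user agent names and a hypothetical `is_bad_agent` helper.

```python
import re

# One expression, three alternatives; the names are invented for illustration.
BAD_AGENTS = re.compile(r"ExampleScraper|SiteGrabber|EmailHarvester", re.IGNORECASE)

def is_bad_agent(user_agent):
    """Return True if the user agent string contains any listed name."""
    return BAD_AGENTS.search(user_agent) is not None
```

The same alternation syntax works in a single htaccess pattern, which saves writing one condition per bot.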

volatilegx

3:00 am on Sep 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



By the way I'm really looking forward to hearing incrediBILL's presentation in Vegas this November. Brett asked me to speak during that session, too, but I imagine Bill's talk will be a little more fun.