Which leads me to ask: ARE THERE folks or companies that offer services to know-nothings like myself, to help us set up bot banning systems?
What can you all tell me about such companies, folks and services - if they exist? Any such entities have street cred? Any offer some form of automated service? Any school (besides here) where I can go to learn "how to" if there is no such service provider?
Shame if such bot banning assistant services don't exist. Seems like it should be a booming business based on what I am reading back here and "out there", about scrapers, content thieves, MFAs, etc.
Thanks for your guidance. I wish I had your brains for this stuff. Anyone want to give your brain a rest and loan it to me? I promise I won't work it too hard. :0)
P.S. I love the cloak and dagger intrigue of the spider/bot threads. Kind of like Sherlock Holmes meets Dashiell Hammett meets Harry Potter meets Monty Python. ;)
What OS and Webserver software are you currently running?
Examples
(1). linux/unix & apache
(2). Microsoft OS & IIS
What type of sites are you running?
(1). Personal
(2). directory
(3). wikis
(4). others
What exactly are you looking for them to do for you?
(1). Help identify the bots in your logs?
(2). Help identify & block them from accessing your site(s)?
(3). Cloaking, to hide your real content and show them fake content?
It would likely take forever and a day to figure out how to do what you all routinely talk about back here. I mean, identify the bad guys, lay traps, master the game of whack-a-bot, . . .
Actually, all it takes is some initiative!
Do a search at Google for htaccess and sift through the crap until you find an example that you're able to comprehend, and then you have your beginning.
It doesn't take a schizophrenic on meds to determine crawls in your visitor logs, just the time to analyze.
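For anyone who wants a starting point for that kind of log analysis, here is a minimal sketch in Python. It assumes your server writes Apache "combined" format logs, and the function name `count_by_agent` is made up for illustration. It simply tallies requests per IP and user agent so the heavy hitters stand out; deciding which of them are bad bots is still up to you.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_by_agent(lines):
    """Count requests per (IP, user agent) pair; lines that don't parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            counts[(match.group("ip"), match.group("agent"))] += 1
    return counts

def top_visitors(lines, limit=20):
    """Return the busiest (IP, user agent) pairs, most requests first."""
    return count_by_agent(lines).most_common(limit)
```

Typical use would be something like `top_visitors(open("access.log"))`, where `access.log` stands in for wherever your host keeps the visitor logs.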
Which leads me to ask: ARE THERE folks or companies that offer services to know-nothings like myself, to help us set up bot banning systems?
There are people and/or companies that will accept your fees to do most anything you desire, both set-up and maintenance.
What can you all tell me about such companies, folks and services - if they exist? Any such entities have street cred? Any offer some form of automated service? Any school (besides here) where I can go to learn "how to" if there is no such service provider?
You mean their company gets like twenty million visitors a month, they have advertising revenue and a Dun & Bradstreet credit rating exceeding $10 mil?
Good luck!
Shame if such bot banning assistant services don't exist. Seems like it should be a booming business based on what I am reading back here and "out there", about scrapers, content thieves, MFAs, etc.
There's really not a market for this?
Let's assume that you have a website or two that gets 5-10 million visitors a month?
Those websites also reap substantial income from advertising.
Why on earth would the company or webmaster cut into their advertising revenue by reducing the visitor traffic that inflates the counts used to set ad fees?
It's a bit of a circle, sort of like a dog chasing its own tail; however, some websites do it quite effectively.
Eventually advertisers understand they are being hoaxed and disconnect themselves from the union.
Your tongue-in-cheek comments below are not worthy of comment.
Thanks for your guidance. I wish I had your brains for this stuff. Anyone want to give your brain a rest and loan it to me? I promise I won't work it too hard. :0)
P.S. I love the cloak and dagger intrigue of the spider/bot threads. Kind of like Sherlock Holmes meets Dashiell Hammett meets Harry Potter meets Monty Python. ;)
I spend about 4 hours a week doing my "search and destroy" on these types of sites and I would certainly pay a reputable company to do this for me and give me a weekly report.
Bill is being modest. ;)
Bill gave a lively talk on this at the San Jose SES this summer that had the Google and Yahoo guys hanging on every word. Great stuff.
He will be reprising that presentation with a newly updated discussion in Vegas. I would encourage anyone with an interest in preventing scrapers and other bandwidth-munching, content-stealing rogue bots to check out his presentation.
I do think that there would be a market for a company who actively helped identify site scrapers,
Gary does this free with his weekly reports
notified appropriate search engines,
notify what SE's?
1) the majors
a) most only answer with automated replies requesting the contactee to jump through the same hoops that initiated the automated reply in the first place. Rarely is there any resolution presented. The same goes for internet providers.
2) harvesters
a) notify them of what?
notified their customers,
I'm not sure who the "their" is here?
The initial paying company's customers or the SE's customers?
I submit a report to you and then it's also required that I submit the same report and initiate communications with your customers?
1) I've not given you permission in the fees you paid to pass my reports onto your customers?
2) Let your customers contact me and arrange a contract the same as you did.
kept a running list of bad bots and/or helped identify them.
Gary does this free with his weekly reports
I spend about 4 hours a week doing my "search and destroy" on these types of sites and I would certainly pay a reputable company to do this for me and give me a weekly report.
Just go to Gary's website and download the files.
However. . . you need to keep in mind that each website has its own different marketing goals.
Each website decides on their own what is detrimental and what is beneficial.
Each website may or may not have the need for (or customers from) different regions of the world. As a result, eliminating those regions from your traffic could reduce your problems drastically.
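To make the region idea concrete, here is a rough Python sketch of checking a visitor's IP against a list of address ranges you have decided you don't need. The ranges shown are reserved documentation blocks used purely as placeholders, and `is_blocked` is a hypothetical helper; real allocations would have to be looked up (ARIN and the other regional registries publish them) and chosen to fit your own site.

```python
import ipaddress

# Placeholder ranges for illustration only (reserved documentation blocks).
# Replace with ranges you have looked up and actually want to exclude.
BLOCKED_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

def is_blocked(ip, ranges=BLOCKED_RANGES):
    """Return True if the IP address falls inside any of the given ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)
```

The same list of ranges could just as well be turned into firewall or htaccess rules; the check above only shows the matching logic.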
For your four hours, how much are you willing to pay weekly for a "standard" report of which you would not have any input into the content?
$400, $4,000 or even $40,000?
The various prices could all depend on the customization of the explained content that you require in your reports.
If, on the other hand, you're looking for somebody to spend hours going over your visitor logs for an initial set-up of scrapers and harvesters?
The simplest solution is to initiate a bot trap or train some personnel in your company to do the monitoring.
In final summary, most companies or webmasters don't have the time and could give two squats about analysis of data in their visitor logs. The subject is just NOT cost/profit efficient.
Hell, most people don't even know how to find either their visitor logs or even ARIN. Nor are they aware of the capabilities of a firewall or htaccess.
Of course they could learn all this with a little understanding and poking around the internet.
If, on the other hand, you're looking for somebody to spend hours going over your visitor logs for an initial set-up of scrapers and harvesters?
The simplest solution is to initiate a bot trap or train some personnel in your company to do the monitoring.
It's possible to do it with 100% automation so you don't need to train anyone to do anything except install the bot stopper which is no harder than installing a blog.
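As a rough illustration of the bot trap idea (not anyone's actual product), here is a small Python sketch. The assumptions: you have a URL that no human would visit, called `/bot-trap/` here as a made-up example, it is listed under Disallow in robots.txt so well-behaved crawlers stay out, and your logs are in Apache common or combined format. Anything that requests the trap anyway gets its IP collected, and the output is a set of old-style Apache `Deny from` lines you could paste into htaccess (newer Apache versions use `Require not ip` instead).

```python
import re

TRAP_PATH = "/bot-trap/"  # hypothetical trap URL; also Disallow it in robots.txt

# Assumption: Apache common/combined log format (IP first, request in quotes).
REQUEST_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (?P<path>\S+)'
)

def trapped_ips(lines, trap_path=TRAP_PATH):
    """Return a sorted list of IPs that requested anything under the trap path."""
    ips = set()
    for line in lines:
        match = REQUEST_PATTERN.match(line)
        if match and match.group("path").startswith(trap_path):
            ips.add(match.group("ip"))
    return sorted(ips)

def deny_lines(ips):
    """Format IPs as Apache 2.2-style 'Deny from' directives."""
    return ["Deny from " + ip for ip in ips]
```

Run on a schedule, something like this gets you part of the way toward automation, though a real bot stopper would also need to whitelist the search engines you want and expire old bans.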
In final summary, most companies or webmasters don't have the time and could give two squats about analysis of data in their visitor logs. The subject is just NOT cost/profit efficient.
Sorry, but that's an opinion, not a final summary.
Of course people that don't know any better and/or are unaware of the situation won't be concerned, that's what marketing is all about. You would be amazed how people that know nothing about these things react to as little as 15 minutes of education on the topic.
Just go to Gary's website and download the files.
No offense to Gary, but although Gary's files are a good resource they aren't the end-all-be-all that you think. Gary is only aware of things that look at his niche or that other people in specific niches share with Gary. I have several sites in different niches and can tell you for certain that there are bots in common with all the sites and other bots that only visit a specific niche.
[edited by: incrediBILL at 1:39 am (utc) on Sep. 22, 2006]
Gary's files are a good resource but not the end-all-be-all that you think as Gary is only aware of things that look at his niche. I have several sites in different niches and there are bots in common and other bots that only visit a specific niche, you would be amazed.
You're making quite a few assumptions here Bill.
And you know the reference to that!
I used Gary's files as an example.
There's not an "end-all-be-all" to anything, either websites or any other subject you may choose to inject.
After doing this for six years, nothing amazes me.
Not even the lack of effort by visitors to improve the simplest of search skills.
Nor the extent of data that many people assume exists on the internet when they are unable to find the hard copy data. Or for that fact, even what the name is of what they are looking for.
Back to lurking and my work as I've spent enough non-productive time at Webmaster World.
You're making quite a few assumptions here Bill.
I'm a big fan of Gary's work but I've compared Gary's data to my data and he's missing a few things and so am I, so I don't know what assumption I'm making because my data isn't the end-all-be-all either.
Someone alerted me to 4 commercial bots I've never even heard of just 2 days ago so I don't pretend to know everything either ;)
My purpose wasn't to slam Gary, but you made it sound like getting that file, while it's a great start, was all someone needed to do. My point, which may not have come across well, was that it may not contain something that crawls in your niche, so more work may be required.
"I know what I know if you know what I mean" - Edie Brickell
[edited by: incrediBILL at 1:55 am (utc) on Sep. 22, 2006]
I think it's the same with user agents. Even with Bill, Ocean, myself and others all looking for bad bots we will never, ever have the definitive list. And if we somehow did, not everyone would agree with what we consider definitive. :)