wilderness,
You're blocking the Apple browser:
Safari/413 UP.Link/6.3.1.15.0
Hobbs,
If Apple is dense enough to use such a term in their UA, then they deserve to be denied.
The same goes for similar terms like "crawl" or "spider". AltaVista or somebody else uses one of these and gets denied every time they visit my sites.
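For reference, a keyword deny along those lines in .htaccess looks roughly like this - a minimal sketch, assuming Apache with mod_setenvif, and the keywords are only examples:

# Deny any visitor whose user-agent contains one of these keywords
# (case-insensitive). Note how a broad term like "link" also catches
# Safari's "UP.Link" token - the side effect discussed above.
SetEnvIfNoCase User-Agent (crawl|spider|link) bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot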
Don
My server blocked it because of the "http://" in the user agent
Googlebot 2.1: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
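Blocking on that substring is a one-liner - a minimal sketch, assuming Apache with mod_rewrite enabled:

RewriteEngine On
# Refuse (403) any request whose User-Agent contains "http://",
# which also catches the Googlebot UA string quoted above.
RewriteCond %{HTTP_USER_AGENT} http:// [NC]
RewriteRule .* - [F]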
It is good practice to include in the user-agent a link back to the creator explaining what the bot is doing and why. I think people in this forum would be the first to cry foul over bots that don't include such information in their user-agents.
As for blind blocking on the basis of keywords, it is just sad really - all you achieve is encouraging bot writers to stop providing this information in the user-agent.
It is good practice to include in the user-agent a link back to the creator explaining what the bot is doing and why
Well stated from the perspective of a bot operator.
From the perspective of a webmaster, this feature is often abused, specifically on blogs, where it's called "REFERRER SPAM" - which is why I block "http://" in the first place.
If the user agent is whitelisted then it bypasses my referrer spam block.
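That combination - block the pattern, but let known agents through first - might look like this in .htaccess; a minimal sketch, assuming mod_rewrite, with "Googlebot|Slurp" as a purely illustrative whitelist:

RewriteEngine On
# Skip the block for whitelisted agents, then refuse anything else
# that carries "http://" in its user-agent.
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Slurp) [NC]
RewriteCond %{HTTP_USER_AGENT} http:// [NC]
RewriteRule .* - [F]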
Obviously it is your choice, but holding it against legit bot operators who obey robots.txt in the first place - the fact that they actually put a URL back to their site in the UA (you'd be the first to complain if it was not done) so that webmasters can decide whether to block them - is wrong in my view.
This referrer spam is overrated anyway - if you don't publish your log reports to the whole world, it won't affect you, and in any case search engines can trivially detect backlinks coming from such log reports. In fact, if they see lots of those spammed URLs in logs, it is easier for search engines to decide that the linking page was spammed. Effectively, this spam approach helps weed out spam.
You guys just need to take it easy - your efforts are probably not making the smallest dent in spammers' activities anyway; there is really no need to take it so personally.
IGMC.
What happens beyond my whitelist is black magic
Disclaimers for the coding-challenged, please. Limited knowledge is more dangerous than full ignorance, and I'm still floating somewhere in the middle :-)
If Apple is dense enough to use such a term in their UA, then they deserve to be denied.
wilderness, that's scary stuff - "link" is not an offensive, harm-implying word. I'll give you "reap", "grab" and "capture", but "link"?
Really?
How many link grabbing tools are out there?
I use Xenu now and again to verify my own sites; however, the majority of the time it's denied.
There are many other similar tools.
Most of these run through entire sites in seconds.
Each webmaster must decide what is beneficial or detrimental to their own websites.
The Mac/Apple users that visit my widget sites are a smaller percentage than California residents, which in turn is exceeded by visitors from Oceanic countries.
And just to get your goat ;)
I also have the following OS-UA's denied:
Linux, Opera and a couple of others - in some instances outright, while in others the denials are based on multiple criteria (UA & IP, or UA & Referer, or Referer & IP, etc., etc.); a sketch of this kind of combined rule follows below.
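Something like the following expresses a combined rule - a minimal sketch, assuming mod_rewrite, with placeholder values (192.0.2.x is a reserved documentation range, not a real offender):

RewriteEngine On
# Deny only when BOTH conditions match: a UA keyword AND a source IP.
RewriteCond %{HTTP_USER_AGENT} Opera [NC]
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
RewriteRule .* - [F]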
In addition, nearly all cell phones and/or PDA's are denied at my sites. The majority of my pages are lengthy articles and contain simply too much material for these small screens.
It's not my intention (regardless of future technologies) to provide multiple views for different OS's.
There was a time when my web pages were my primary focus.
Today, however, my websites are simply a tool that makes available a very small selection of high-profile articles and images from my immense accumulation of older materials.
Should visitors not wish to conform to my restrictions (whether they are aware of the restrictions or not), then they may simply go somewhere else (non-existent) to view the materials.
(A broader explanation of this preference and my widgets is simply beyond the scope of this forum.)
Don
How many times has Bill stated that he uses MULTIPLE criteria for his whitelisting?
And yet he's taken to task on "beginning with ["!...]
Go figure.
Don
Obviously it is your choice, but holding it against legit bot operators who obey robots.txt
Hundreds of copies of Nutch and Heritrix out there obey robots.txt and put a path to their server in the user agent with "http://", but that doesn't make them legit IMO until they have a viable service that can send traffic.
your efforts are probably not making the smallest dent in spammers' activities anyway; there is really no need to take it so personally.
Here's an example of why it becomes personal.
One of them hit my server hard the other day during a period of heavy load. With all the visitors and legit bots on at the same time, its requests for 8K pages in a few seconds - a literal DoS - caused a complete server overload.
It took a couple of minutes to clear up, even with the automatic bot blocker nailing them quickly, because all of the legit visitors and bots were by then backlogged and the whole thing snowballed out of control.
So I can either:
a) block as much as possible or,
b) buy bigger server hardware upgrading from dual CPUs to quad CPUs.
At the moment, blocking seems to be the cheapest solution, and I'd still need to do it even with quad CPUs for many other reasons as well.
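For what it's worth, the "automatic bot blocker" side of option (a) can be approximated with an off-the-shelf module - a minimal sketch, assuming Apache with mod_evasive installed; the thresholds are illustrative, not tuned recommendations:

<IfModule mod_evasive20.c>
    # Temporarily 403 any client that asks for the same page more than
    # 5 times in 1 second, or more than 50 pages site-wide in 1 second.
    DOSHashTableSize    3097
    DOSPageCount        5
    DOSPageInterval     1
    DOSSiteCount        50
    DOSSiteInterval     1
    DOSBlockingPeriod   60
</IfModule>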