The block worked as designed, so no more bots without any ID could look at my website. Then I read a post from one of the more knowledgeable members of this Forum, who suggested that blocking blank U-As might also exclude friendly check bots, sent in stealth mode to verify links or check compliance with terms of service.
I took this reasoning to heart and removed the two lines that blocked blank U-As, and lo and behold, this arrived yesterday:
216.71.84.187 [Sun May 25 22:19:44 2003] "<undefined>"
216.71.84.187 [Sun May 25 22:27:25 2003] "(HTML Validator [searchengineworld.com...]
My advice is do NOT block blank U-A's. If a bot with a blank UA visits your website, do a DNS lookup on its IP in Sam Spade. If it comes from China, Russia, or another country that is home to spammers, and if it also just indexes your home and guestbook pages, it is probably unfriendly. Block its IP address with a "deny from" rule rather than blocking it for its blank ID. That is what I now do.
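For anyone unfamiliar with the syntax, here is a minimal .htaccess sketch of that approach, assuming Apache 1.3/2.x mod_access directives; the address shown is only the one from the log excerpt above and stands in for whatever IP your own lookup flags as unfriendly:

# Block one unfriendly visitor by IP address instead of by its blank UA.
# 216.71.84.187 is just the example address from the log above.
Order Allow,Deny
Allow from all
Deny from 216.71.84.187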
Wiz
As I've mentioned previously, denying blank UA's is not a sound policy. However, that doesn't prevent me from using it to suit my purposes and traffic trends.
In the end, each webmaster must make their own decision as to what is crucial to their individual sites.
I use Xenu to verify broken links on my websites; however, I turn off the deny rule for that software during the session and turn it back on afterwards.
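As a sketch only, assuming the deny lives in .htaccess and matches Xenu's User-Agent string via mod_setenvif, the toggle is just a matter of commenting the rule out for the session and restoring it afterwards:

# Deny requests whose User-Agent contains "Xenu".
# Prefix these lines with "#" while running a link check,
# then remove the "#" when the session is finished.
SetEnvIfNoCase User-Agent "Xenu" deny_bot
Order Allow,Deny
Allow from all
Deny from env=deny_bot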
There are exceptions to many things; however, IMO, I wouldn't base my daily methods on those exceptions. Should a webmaster do that, more often than not they will end up on the wrong end of the proverbial "shaft."
I suppose you could just use the "Report Problem" link at the bottom of this page to report it as an "enhancement request."
Wizcrafts,
The bottom line is that you or someone else used the HTML validator and one of the other tools in the Search Engine World toolbox to check your site. One provided a User-Agent string and the other did not. So, as Don says, you could always block it until you needed to enable it, or make an exception for that IP address.
HTH,
Jim
I see that I have opened a can of worms with my report. It sure would be a boring world if everybody had the same opinion on security issues. ;-)
Wiz
That's part of what makes this forum and the others at Webmaster World successful.
For the most part, we have folks discussing key issues and sharing similar ideas and goals with diverse objectives, CALMLY. And yet, there is rarely a participant who feels the need to be so overbearing as to insist that "their way is the only way."
Who would have thought a few years back that the goals of individual web sites could be so directed to cater to specific clientele?
Or even that very regionalized websites would have the capability to reach a global audience?
Don't post anything without the permission of the other party, though...
It's not a bug, but I believe that it is "good form" for all user-agents to identify themselves and provide a link to a page that indicates their purpose.
Getting back to your "can of worms" post, I'm far more tolerant of user-agents that identify themselves than I am of those that don't - or worse yet, that try to disguise themselves. My rule-of-thumb is that if the purpose of a user-agent helps my visitors, my sites, or people interested in the subjects of my sites, I'll let it in. Otherwise, I suppose it depends on my mood and recent bandwidth overages. :)
Jim
From what I have seen in my logs, there are more unidentifiable or RIPE-network blank-agent IPs than otherwise. Therefore, I have uncommented the blockers to make them active again. I will keep a watch on my logs to see whether I am blocking any traceable good guys, and try to allow them to bypass the gate.
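The blocker lines themselves are not quoted in this thread, but a common mod_rewrite form of a blank-UA block, which can be switched on and off by uncommenting or commenting it, looks roughly like this (a sketch, not necessarily the exact rules referred to above):

RewriteEngine On
# Return 403 Forbidden to any request that arrives with an empty
# User-Agent header; prefix both Rewrite lines with "#" to disable.
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule .* - [F]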
As it stands right now, all spam is being sent to previously harvested addresses on my website, most of which were in plain-text mailto links before I knew better. Some I redirect to a catch-all account to view and report to SpamCop, while others are sent directly to a dead-drop account that I never see at all. I have implemented a policy whereby all email links are JavaScript includes, with a noscript tag telling non-scripted viewers to use my contact form instead of email. My form keeps all of the addresses in a Perl script that is not world-readable (permissions 711), and the form recipient lines use numeric aliases only. This is working fine for now, but I am ready to change again if the bots become smart enough to decode the scripted links or infiltrate my cgi-bin files.
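For illustration only, here is a minimal sketch of that kind of scripted email link; the file name, the address parts, and the wording of the noscript notice are all made up for the example, not taken from my actual pages.

In the page:

<script type="text/javascript" src="email.js"></script>
<noscript>JavaScript is disabled - please use my contact form instead of email.</noscript>

In email.js, the address is assembled at display time, so a harvester parsing the static HTML only sees the external script reference, never a complete mailto link:

// Placeholder address parts for the example.
var user = "webmaster";
var host = "example.com";
document.write('<a href="mailto:' + user + '@' + host + '">' + user + '@' + host + '<\/a>');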