The intent is not to gather names and IPs but rather to monitor, and to some extent control, what is done with the content of your websites and who is given it.
A collection of names and IPs is just part of being organized. At least in some instances ;)
Ex: You're selling grits, and nobody in the Far East has heard of grits or, even worse, everybody in the Far East is allergic or opposed to grits.
Why allow these visitors and bots related to these visitors to traverse your grits website?
They're not going to buy your products. Why waste your bandwidth and expose your content to parties whose only intent is to gather it in an ill-behaved way?
Additionally,
Why allow companies which are charging fees to customers to use your resources in gathering content which is resold without even providing you a thank you or a FOAD?
I can see a point to it, as the reason I have been researching this is that some robots are returning errors when they visit the site. I need to make sure that the errors are not showing up on the customer side as well. Specifically, I am concerned that "LNSpiderguy" at 198.185.18.207 is using malformed data as a link and trying to muddle its way through my website.
What I guess I can't see is a reason for spending tons of time analyzing the data for new I.P. addresses. As far as I am concerned, the more people that visit, the better.
Avoid e-mail address scraping by robots for spammers
Prevent page-jacking and site-jacking
Avoid excessive bandwidth penalties
Maintain effective cloaking - there are legitimate reasons to cloak; some call it UA- and IP-based redirection
Prevent mass (automatic) theft of proprietary content
Deny access to companies which spy on employees and filter their internet access (It's your management problem, so don't spend my money fixing it by stealing my bandwidth!)
Deny companies that offer the preceding service to their clients for a fee - and use my bandwidth to do it.
Don't abuse my sites and keep out if you are not personally interested in the information it contains or at least interested in helping other people who are interested in it. I'm willing to pay for legitimate visitors, directory editors, and search engine robots to access the sites. Otherwise, I'll kick out anyone I choose - It's my house, and I pay for it...
If you are not monitoring your site logs, you might be surprised at the amount of cr@p going on - On a small niche site, abuse can account for 50% of the bandwidth consumed!
There are some nice scripts and techniques floating around here on WebmasterWorld to keep this under control so that you don't have to baby-sit your site constantly to avoid trouble.
Jim
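For anyone who wants to check the bandwidth claim above against their own logs, here is a minimal Python sketch, not one of the scripts mentioned in the post. It assumes Apache common/combined log format; the function names and the sample lines (which use reserved documentation IPs) are made up for illustration. It simply totals bytes sent per client IP so the heaviest consumers stand out.

```python
import re
from collections import Counter

# Assumed format: the start of an Apache common/combined log line:
# client IP, identity, user, [timestamp], "request", status, bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) (\d+|-)')


def bytes_by_ip(lines):
    """Return a Counter mapping each client IP to the total bytes sent to it."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        ip, _status, size = match.groups()
        totals[ip] += 0 if size == "-" else int(size)  # "-" means no body sent
    return totals


def bandwidth_share(totals, ip):
    """Return the fraction of all logged bytes that went to one IP."""
    total = sum(totals.values())
    return totals[ip] / total if total else 0.0


# Hypothetical sample lines, using documentation-only IP addresses.
sample = [
    '192.0.2.10 - - [10/Oct/2003:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 3000',
    '203.0.113.5 - - [10/Oct/2003:13:55:40 -0700] "GET /grits.html HTTP/1.0" 200 1000',
    '192.0.2.10 - - [10/Oct/2003:13:55:41 -0700] "GET /missing HTTP/1.0" 404 -',
]

totals = bytes_by_ip(sample)
for ip, size in totals.most_common():
    print(f"{ip}\t{size} bytes\t{bandwidth_share(totals, ip):.0%} of total")
```

On a real site you would pass the open log file instead of the sample list. Requests containing escaped quotes will not match the pattern and are skipped, so treat the totals as a rough guide.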
Until the realization slaps you in the face that much of your traffic is both the useless gathering of your data and the gathering of your data for commercial use for which YOU are not being compensated.
To gauge the effectiveness of your site and any goals you may have pointed your site towards REQUIRES that you separate these useless visitors from your intended market.
So yes there is a difference in visitors. Just because the numbers are a success doesn't mean either you or your sites are a success.
Although I realize it is confusing as to what goes on here.
On the Lexis/Nexis site I saw references to "alliance" as in traffic as well.
Just deny the ranges I previously provided.
I'm curious as to your intent.
Your initial mail acted rather coy.
Why not just ask for this information up front rather than bury it in a thread?
It's similar to the deception the ill-behaved bots attempt while spidering sites :(
Don
BTW, I believe there also exists a private subscription Nexis data service used by attorneys.
Also, it allows me to see that over 90% of my traffic is tracked from legitimate sources, e.g. only U.S. visitors.
Maybe there is something that I am missing, but I would consider us a fairly large, high-growth company that has no need to worry about bandwidth or content theft, as our content is already available from product manufacturers or is part of the public domain.
I do keep my own list of known spammers which I actively stop at the webserver based upon e-mail or open relay.
I would think that placing certain restrictions based upon I.P. address would be somewhat futile though. Most people that employ the types of tactics that you are talking about generally do so from behind a firewall or proxy server, or use a dynamic I.P. address.
This would mean that if I blocked an I.P. address for something on Friday, that my "real" visitor or potential customer could obtain that very same I.P. address from their ISP on Saturday and be denied access ...right?
Nothing more, nothing less.
I am not trying to be deceptive or bury my real questions.
I was just curious why everyone seemed so concerned about logging every I.P. address for the major search engine spiders. I thought there was a way to coax the spider into visiting or re-visiting your website once you had this type of information.
Believe me, I have been put through the wringer when it comes to web marketing.
I owned a web-based company whose site was built in frames. It was also banned from the Open Directory Project, where the editor was not even a native English speaker, yet had omnipotent control over my category... meaning I had ZERO recourse for over 3 years.
In the way in which you've applied it, you are correct.
However, why would you deny access to a potential customer without first having realized who your potential customers are?
In the follow up mail:
<snip>I was just curious why everyone seemed so concerned about logging every I.P. address</snip>
I believe you're referring to the Google bot mail?
Brett asked for assistance to clear up his corrupted list and was assisted by many participants.
BTW, there is a way to effectively use frames. I have two portions of my sites that are in frames, and each page of the two fifty-page sections is spidered.
Then one day I looked up one of those IPs and I immediately realized it couldn't be an ordinary web surfer. I did a little more research and was led to a web page. I don't remember the name of the company, but something about their purpose was a little disturbing. My impression at the time was that that company would gather information from various web sites and then sell it to a third party. Their purpose was tailor-made to their clients' needs. I don't know why, but the word "commando" popped into my head at the time. This raised a red flag in my mind. But, I was only mildly disturbed.
Sometime later, I got another hit from a similar company. Well, not "a hit," exactly. More like, it gulped down my entire web site in a matter of minutes.
I did further research and almost immediately landed in these forums. The more I read around here, the more sense it made to block various "factions" from having access to my content - every single word and picture of which I sweated over while creating myself.
I quickly slapped together an htaccess file consisting of three lines of "deny from" three different IPs and nothing else. Shortly thereafter, someone came back from one of those IPs and tried to suck down my now bigger site and I knew I was on the right track.
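As a rough illustration of that kind of three-line file, here is a small Python sketch that turns a list of addresses into "deny from" lines. The helper name and the addresses are hypothetical (the addresses are reserved documentation ranges, not the IPs I actually blocked), and it assumes an Apache setup where "deny from" is still honored; newer Apache versions prefer the Require syntax.

```python
import ipaddress


def deny_lines(addresses):
    """Return one 'deny from' line per valid, de-duplicated IP or CIDR range."""
    seen = set()
    lines = []
    for raw in addresses:
        # ip_network raises ValueError on malformed input, so typos are
        # caught before they reach the .htaccess file.
        network = ipaddress.ip_network(raw.strip(), strict=False)
        if network in seen:
            continue
        seen.add(network)
        # Single hosts are written without the /32 (or /128) suffix.
        if network.num_addresses == 1:
            target = str(network.network_address)
        else:
            target = str(network)
        lines.append(f"deny from {target}")
    return lines


# Hypothetical block list using documentation-only addresses.
blocked = ["192.0.2.15", "198.51.100.7/24", "203.0.113.200"]
print("\n".join(deny_lines(blocked)))
```

Validating each entry first is the main point of the design: a single typo in an .htaccess file can cause a server error for every visitor, so it seems safer to reject bad input up front.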
There's a lot more to it by now, and in some respects I'm more lenient than some, and in other respects more strict than some, but that's the short of it.
And, I'd like to take this opportunity to thank everyone, especially Brett, JD Morgan, wilderness, pendanticist, fiestagirl, and well, there's too many to think of right now... But thank you for everything I've learned here.
Sorry for going long and a little off topic here.