

Blocking Monitoring Services

Looking for a list of IPs.

   
2:36 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Does anyone have a comprehensive list of website monitoring service IP addresses?

I've got a few collected from the forum here, but I just thought maybe someone had a full list of these rogue unwanted bots.

At a minimum, do you know these 4:
www.aignes.com
www.watchthatpage.com
www.trackengine.com
www.infominder.com

Aside from blocking them, what kind of "fun" can be had via cloaking with these bots?

<added>

Website monitoring services take away users. Why would a user visit the site if they can "monitor" it from elsewhere? That defeats mission-critical branding, it defeats promotion efforts, it defeats advertising, and that defeats your site's goals.

When a user visits your site and does not find updated content, they may find content or advertising they have not been exposed to. It's like channel surfing: it is how visitors are exposed to content they may not have seen before.

On the technical side, if people don't visit a site, it also means you are not counted by page counters such as the Google, Yahoo, and Alexa toolbars. That in turn may hurt your search engine rankings.

By not actively blocking these monitors, you are allowing and endorsing the poaching of your visitors. Website monitors are worse than Gator to me. At least with Gator, they have to visit your site.

Aside from the approved bots and partnerships with search engines, we do not allow unauthorized programmed querying of the site.

2:58 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



InternetSeer.com

But this one is fairly polite, only hitting the server once every couple of hours.

Can't quite see how you could cloak 'em. If you pretended your site wasn't there, you'd just generate lots of spam emails notifying you of their "service".

3:09 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



My one thought on that is to send them the same page forever. Users use their service, and get stale pages.
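
A minimal mod_rewrite sketch of that idea, for what it's worth (the IP prefix and the snapshot filename are hypothetical placeholders):

RewriteEngine On
# Hypothetical: serve one frozen snapshot, forever, to a known monitor IP range
RewriteCond %{REMOTE_ADDR} ^203\.0\.113\.
RewriteRule !^stale-snapshot\.html$ /stale-snapshot.html [L]
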
3:13 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



...Or use the current time and date to generate pseudo-random content, and send them something different every millisecond. Then, assuming they send e-mail notices, they'll spam their subscribers...
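
A quick PHP sketch of that prank (purely illustrative):

<?php
// Stamp the page with a hash of the current time so every fetch
// differs, tripping a naive change detector on each poll.
header('Cache-Control: no-store');
echo '<html><body><p>' . md5(microtime()) . '</p></body></html>';
?>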

Must have had some "evil" coffee this A.M. ;)

Jim

3:39 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Jim,
Although I don't recall the reference, I do recall mention of redirecting a visitor to a sort of "null land" where the bot is held in space for a very long delay.

Any details or reference on that?
TIA

3:46 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



InternetSeer is the worst: every hour, almost on the hour, and sometimes your requests for them to stop go ignored, even robots.txt! I'm tired of sending emails to those guys.
3:52 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Yesterday I saw an interesting thread on Usenet about copyright infringement.
Somebody provided this link (for the US):
http://www.copyright.gov/onlinesp/list/

It would be nice if some similar accepted compliance scheme were in place for spidering.
It would sure clean up these pests.

3:56 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Don,

An SSI include to slow down bad_bots:
<!--#exec cmd="sleep 20" -->
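
(For the exec to work, the page must be server-parsed and SSI exec allowed; a sketch for .htaccess, assuming Apache with mod_include and AllowOverride Options:)

# Parse .shtml files with exec permitted (+IncludesNOEXEC would block it)
Options +Includes
AddType text/html .shtml
AddHandler server-parsed .shtml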

Jim

3:59 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use InternetSeer and it only hits every hour on average for my site. I think it's a good free service.
4:08 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I use InternetSeer on a few sites, too. But if you are not a subscriber, they access your site anyway, and send you a marketing e-mail pitch if your site goes down.

Has anyone tried 403ing them?

(I think this thread needs to be split - Brett is asking about "web site content update monitoring services" and we are apparently veering off into "server uptime monitoring services".)

Jim

4:10 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Jim
Only SSI? No option in other modules?
TIA
Don
4:22 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Two other abusers ...

Zeus
Turnitin

4:29 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



changedetection.com
changedetect.com

But they appear to be rather polite.

4:30 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Brett
A Google search turned up these:
1stMonitor Web Site Monitoring
Site Vigil
YourSiteUp.com
NodeBlue.net
Uptime100 Professional Website Monitoring
Alertra Web Site Monitoring
WatchDog
Affiliate Selling and Marketing Software
Atomic Watch
WebSitePulse
Peer-to-Peer distributed
InternetSeer
SiteProbe
WatchMouse
NetWhistle
PingAlink
Elk Fork
Uptime100

I also used to get pestered by a Canadian one which stopped on the first dead link it hit and reported MANY dead links or some such nonsense.
It was named twentyfourseven or something similar.

4:53 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



In PHP: <?php sleep(20); ?>

Every script language will have something like that.

A guy by the name of Gary Keith keeps an updated browscap file, and he has a flag for web strippers, along with search engines and every obscure web browser out there.

As links are not permitted, do a Google search for: gary keith browscap.ini

I use this all the time in my PHP scripts via the get_browser function.
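
A minimal sketch of that check (assumes the browscap directive is set in php.ini; the lowercase 'stripper' key is an assumption based on the flag described above):

<?php
// Look up the visiting user-agent in browscap.ini and throttle
// anything flagged as a stripper before turning it away.
$info = get_browser($_SERVER['HTTP_USER_AGENT'], true);
if (!empty($info['stripper'])) {
    sleep(20); // make the bot wait
    header('HTTP/1.0 403 Forbidden');
    exit;
}
?>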

4:56 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi:

Off the original topic, but here goes...

>>> Has anyone tried 403ing them?

I do. They go away after a while.

>>> redirecting a visitor to a sort of "null land"

Send them to a "black hole": a simple mod_rewrite rule to catch certain offenders (requests for default.ida, formmail.cgi, a UA of internetseer, a referer of iaee.org, a whole bunch of stuff you do not want to bother with, and do not want bothering you!) and send them to a non-existent IP address. That will hang each request for about 20 seconds, and all your server does is send out a redirect.
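
A minimal .htaccess sketch of that black hole (patterns taken from the examples above; 192.0.2.1 is a placeholder address that should never answer):

RewriteEngine On
RewriteCond %{REQUEST_URI} (default\.ida|formmail\.cgi) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} internetseer [NC,OR]
RewriteCond %{HTTP_REFERER} iaee\.org [NC]
# Redirect the offender at an unroutable address and let the client hang
RewriteRule .* http://192.0.2.1/ [R,L]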

I have a post on this in Webmaster General, and someone there suggested an IP range that is pretty good!

dave

4:58 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



IPs, baby - who's collected IPs?
5:17 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



FairAd Client: 217.226.85.248
tracerlock: 209.61.182.37
Cyveillance: 63.148.99.224 - 63.148.99.255 and 65.118.41.192 - 65.118.41.223
TurnitinBot: 64.140.48.25 - 64.140.48.27
NameProtect: 12.148.196.128 - 12.148.196.255
Linkwalker: 209.167.50.16 - 209.167.50.31
BDFetch: 204.92.59.0 - 204.92.59.255
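
For reference, a sketch of how ranges like these translate into .htaccess deny lines (CIDR masks shown where a range lines up exactly):

order allow,deny
deny from 217.226.85.248
deny from 209.61.182.37
deny from 63.148.99.224/27
deny from 65.118.41.192/27
deny from 64.140.48.25 64.140.48.26 64.140.48.27
deny from 12.148.196.128/25
deny from 209.167.50.16/28
deny from 204.92.59.0/24
allow from all
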
5:30 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



I know a lot about web page monitoring services.

Some of you are mixing up website monitoring services and webpage monitoring services.

Brett asked about webpage monitoring services, which are entirely different from a robot that spiders/downloads your entire website over and over again.

Web page monitoring services are not:
> email harvesters
> domain spiders or
> trademark search tools

Web page monitoring services allow your visitors to check a single web page on your site once a day.

So web page monitoring is not in the same class as something like InternetSeer, Zeus or Turnitin.

Brett: Your list of web page monitoring services is nearly complete if you add changedetect.com and changedetection.com (similar names, but not related) as Markus pointed out:

Markus said:
changedetect.com changedetection.com
But they appear to be rather polite.

And ChangeDetect is indeed a polite web page monitoring service that does not negatively impact bandwidth. Here is a quote from the website:

The ChangeDetect automated page monitor tool is a "good bot" (robot). No matter how many users monitor a single page on your website, your web server opens only one session per page. ChangeDetect runs only once a day to monitor the page.

http://www.changedetect.com/?page=reduce-bandwidth-website

Why would someone block a tool like this?... Brett?

5:55 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



I think that Turnitin is a plagiarism-fighting service.

I'd love to drop InternetSeer. Those guys are a pain in the neck, especially for really small new sites. Nothing more annoying than opening up a 7 KB log file and seeing half of it is InternetSeer and Nimda scans...

6:03 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



<snip>Why would someone block a tool like this?... Brett?</snip>

Frank
Change detection comes as an accessory to my sitemeter.
I have a particular page in which the content is only valid from late May to late October.
I mistakenly added a page date which resolves to a new date daily. The change detection service sends a notification and the visitor views the page EVERY DAY.
I tried removing the "new date daily" from the HTML, and it had no effect on the change detection or this visitor. Guess there are exceptions to everything.

6:13 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have seen three different incarnations of Turnitin bots over the last couple of years. They do obey the robots.txt file, if anyone cares.

ia_archiver can chew up a lot of bandwidth if you do not ban it. However, you will see your site in the Wayback Machine.

Sorry Brett, slightly off topic.

6:30 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



What would the need be for IP addresses? If there is a good purpose, I'll whip up a small script to start logging the IP and UA for UAs that are considered strippers.

Then we can parse the log to get the IPs.

But I don't see how this is better than just checking the UA in the first place since strippers may be run from an xDSL account where the IP changes. The next person that gets that IP may be a valid user. I wouldn't want to block him.

6:33 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



See what I added to the lead post above.

IPs are needed to block the bots in an htaccess file. We will add them to the close-to-perfect htaccess ban list [webmasterworld.com] that has spread across the net over the last year.

6:49 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



Ok, I'll spend some time writing a little script that will process an Apache log using browscap.ini to find IP addresses. I'll post it when there is something to show.
6:54 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



Brett,

I noted your update to the original message. Interesting fears, but they do not really match up with my experience.

A web page monitoring service actually increased my site traffic and generated repeat visitors.

Have you seen those alert services or "monitor this page" forms on a content page?

Sure you have... because you even have an alert service for this forum. ;) Well, a page monitoring service is really no different.

Anyway, I do not have a forum, but I was able to use a web page monitoring service to replace a giant mailing list. I now push my content/news to a targeted audience with zero spamming and no newsletter publishing.

However since you have a zero-tolerance policy toward automated programs and already have an alert service for this forum, I somewhat understand your wanting to block any and all page monitoring services.

For me though, I consider them a good way to generate repeat visitors.

Frank

[edited by: frankray at 7:40 pm (utc) on April 9, 2003]

7:24 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



As much as I understand your concern, Brett, I have a few problems with this.

1) These programs (for the most part) are helping users solve a huge problem on the net: keeping up with all the information. Software that is very helpful to users tends to stick around for a long time, even if whole webmaster communities or governments try to block it. Just look at Gator, KaZaA, etc. Many people, companies, and governments have done all they can to shut these down, but they are still here because users want them so much.

2) I am not exactly sure what programs and services it is you want to block. Is it all services that request your site outside of a "normal" browser? Is it programs that grab only part of your page? If so, one very large new project is going to cause you a great deal of headache: the new Lycos Europe personalization and "clipping" feature. I think you can only see it on Min.Jubii.dk (Danish) now, but it will soon be rolled out all over Lycos Europe. With this tool you can "clip" any part of any website you wish and show all the clippings on one page. The full page is downloaded, using your local IP and browser agent name, but only the part you selected is shown.

Personally I love this service, and I use it to monitor news sites and forums that I frequently use. It gives me access to more information in less time and helps me pick the right news to read and the right discussions to participate in. In fact, this thread was one that "popped up" in my personalized min.jubii.dk portal; who knows when I would have found the time to come here and see it otherwise :)

7:59 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



Ok, I've whipped up a little PHP shell script that will process an Apache log and find unique IPs with UAs that my browscap.ini considers "strippers".

Sticky me if you would like the url for the script along with my output file.

My log isn't finished processing yet and I have found 199 unique IPs so far...
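
The script itself isn't posted here, but a minimal sketch of the idea (assuming a combined-format access log and browscap configured in php.ini; the path and the 'stripper' key are illustrative):

<?php
// Scan a combined-format Apache log, run each user-agent through
// browscap, and collect the unique IPs whose UA is flagged a stripper.
$ips = array();
$log = fopen('/var/log/apache/access_log', 'r'); // illustrative path
while (($line = fgets($log)) !== false) {
    // combined format ends with "referer" "user-agent"; grab IP and UA
    if (!preg_match('/^(\S+).*"([^"]*)"\s*$/', $line, $m)) {
        continue;
    }
    $ua = get_browser($m[2], true);
    if (!empty($ua['stripper'])) {
        $ips[$m[1]] = true;
    }
}
fclose($log);
echo implode("\n", array_keys($ips)), "\n";
?>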

10:12 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mikkel_Svendsen,

Gator and KazaA are "still here" because:
a. The users don't even know it.
b. The user knows it but it cannot be uninstalled with regular methods.

One of the losses for site owners with offline readers is the loss of page views. That is, if a group of people look at cached pages, my advertisements and other links do not register on my web site. In essence, the caching is depriving me of income. Another is the count of actual visits, which, as we all know, is a key selling factor to advertisers.

I believe Brett is concerned about non-friendly systems that do not take the visited web site into consideration.

10:37 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member


Is this one of them?
http://www.seventwentyfour.com
I find it very useful. It tells me about broken links for free.