
Website Analytics - Tracking and Logging Forum

WHAT is this critter?

 9:17 am on Apr 14, 2006 (gmt 0)

I found a good 40-50 hits from www.yournetdetective.com in my logs.
Each one requested the same 3rd level page (some map with a little text)
and every last one had a different user agent.
Yournetdetective really rang the changes.
Every version of MSIE, Windows, Linux, Macintosh, Konqueror .. even Commodore 64!

Each UA was different, one hit each. Now what is the purpose of all that?

I Googled up www.yournetdetective.com expecting to find some discussion here, but nothing.
Instead, I find all sorts of 'Affiliate Programs'.
Is THAT what they are all about? If so, I want no part of it.

The question remains: What possible purpose is a long crazy string of hits like that?
Is somebody trying to see if my pages render differently? If I'm cloaking?
They certainly won't get me to sign up. Anybody else see this recently? -Larry



 2:25 pm on Apr 14, 2006 (gmt 0)


Well, they are back again, on today's access logs.

Same exact method of operation, only now they picked another 3rd level page,
as always without the accompanying image, and with the same stew of unique user agents.

Before I 86 them via .htaccess, I would like to know if this pest is unique to my site.
Any info at all much appreciated. -Larry
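For what it's worth, the .htaccess block can be sketched like this; a minimal example assuming Apache with mod_rewrite enabled (test on your own setup before relying on it):

```apache
# Deny (403) any request whose Referer header mentions yournetdetective.com
RewriteEngine On
RewriteCond %{HTTP_REFERER} yournetdetective\.com [NC]
RewriteRule .* - [F,L]
```

Since their user agents are all different, matching on the referrer is the only stable handle; blocking by IP range would be the fallback if they drop the referrer.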


 3:19 pm on Apr 14, 2006 (gmt 0)

Looking at the site, it appears they offer some sort of net monitoring services.

When I get hit by bots from people like this, especially when they lie in their user agent string, I block them first and (sometimes) ask questions later.


 5:30 pm on Apr 14, 2006 (gmt 0)

I'm curious. Why block them?


 8:36 pm on Apr 14, 2006 (gmt 0)

cgrantski asked: I'm curious. Why block them?

What good reason would they have for violating robots.txt and then lying about who and/or what they are?



 9:58 pm on Apr 14, 2006 (gmt 0)

I'd be interested in seeing the exact string that included "Commodore 64" within it, because the web browser I wrote for the 64 includes "Commodore 64" within the user-agent string.



 10:37 pm on Apr 14, 2006 (gmt 0)

Anyone who abuses "Commodore 64" is unlikely to honor Good Friday and Chocolate Saturday.
PS: The creep is back again. - Larry


 10:27 pm on Apr 15, 2006 (gmt 0)

Here's the exact Commodore user agent given:

"http://www.yournetdetective.com" "Mozilla/4.0 (compatible; X 10.0; Commodore 64)"

There were others for Amiga, Macintosh, all sorts of stuff. -Larry


 4:59 pm on Apr 16, 2006 (gmt 0)

cgrantski asked: I'm curious. Why block them?

stapel said: What good reason would they have for violating robots.txt and then lying about who and/or what they are?

me: I too would like to know why they should be blocked, other than the reason of disliking their tactics or thinking they are suspicious. I'm serious. What tangible harm could they do?

I'm not defending them and I don't think cgrantski is either based on his past posts about ethical subjects. I'm just wondering in general about actual bad effects to the site of this kind of visitor. A little bandwidth usage, yes. The question of "what if everybody did this," okay. General principles, sure. But other than those generalities, is there a tangible threat, any real consequences?


 5:28 pm on Apr 16, 2006 (gmt 0)

McElvoy asked: What tangible harm could they do?

Since most of us pay for our bandwidth, bandwidth theft isn't a "hypothetical" "generality". Since most of us don't prefer to be hacked or scraped, or to face the dangers and difficulties of hack attempts, most of us view these issues as ones involving "tangible harm".

I'm curious as to why you aren't leery of people/bots/etc who intentionally break the rules. Why would you assume violators to be benign?



 1:35 pm on Apr 17, 2006 (gmt 0)

First, I'm not convinced anybody is "lying." They could be using an emulator program to see how a certain kind of page displays in a lot of browsers. Maybe they are about to try a page that contains the same map technology and they want to see on an existing page whether that map technology displays correctly under all circumstances. Or maybe the page has another bit of code that they want to check for compatibility before using it. Emulators are legitimate tools used by developers.

Yes, it's using another site for their own purposes, but who hasn't gotten ideas from another site, looked at source code on another site to help with something we're working on, looked at links to another site as a way of finding advertisers, or checked out a company before doing business with them? Or, for that matter, looked closely at a competitor's site or business?

Regarding bandwidth theft - 40 or 50 pages a day with no images is not going to cost anybody anything.

Regarding scraping, I don't see how this could be scraping. And I can't think how this is hacking or preparation for hacking, i.e. stealing somebody's data or bringing down a site. I guess anything's possible.

It's anybody's choice to block anybody for any reason, but in this case it seems to me like an overreaction. If somebody skillful was planning on hacking the site, blocking the IP won't stop them at all.


 1:46 pm on Apr 17, 2006 (gmt 0)

It could be referrer spam. They just hit your site with a weird referrer and you go visit their site. Looks like it worked.


 6:40 am on Apr 19, 2006 (gmt 0)

First, I'm not convinced anybody is "lying."

Occasionally I see phenomena like this where my sites receive a number of obviously "fake" hits, e.g. multiple hits from the same IP range, no or random referers, "real" UA strings which don't match the hit pattern (i.e. they fetch pages but not CSS, Javascript, etc). Often when I trace the IP, I find it belongs to some "web analysis" or "intellectual property management" company that is evidently mining my sites for its own commercial purposes without having the decency to inform me about it, e.g. by an explanatory URL in the UA string.

OK, it doesn't directly cost me anything, but it doesn't benefit me either, so in the filter they go.
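The tell described above (pages fetched but never the CSS/Javascript a real browser would request) is easy to script against an access log. A rough sketch, assuming Apache "combined" log format; the field layout and file extensions are illustrative, adjust for your own logs:

```python
import re
from collections import defaultdict

# Matches the IP and request path of an Apache combined-format log line.
LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (\S+) [^"]*"')

def suspect_ips(log_lines):
    """Return IPs that requested pages but never any page assets."""
    pages = defaultdict(int)   # page requests per IP
    assets = defaultdict(int)  # CSS/JS/image requests per IP
    for line in log_lines:
        m = LINE.match(line)
        if not m:
            continue
        ip, path = m.groups()
        if re.search(r'\.(css|js|png|gif|jpe?g|ico)(\?|$)', path):
            assets[ip] += 1
        else:
            pages[ip] += 1
    # Real browsers fetch a page's assets; flag IPs that never did.
    return [ip for ip in pages if assets[ip] == 0]
```

Run nightly over the day's log, this gives a short list of candidates to inspect by hand rather than an automatic ban list.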


 7:35 am on Apr 19, 2006 (gmt 0)

>I'm curious. Why block them?

If you run a PPC site then you have to block all bots that fail to obey robots.txt or you end up allowing bots to generate false clicks to your customers.

Paying for the bandwidth they consume is an issue, them generating false clicks can be a killer!

There are spam bots that can easily generate 20,000+ false clicks per day. They must be blocked.

I hate all bots that fail to obey robots.txt. I block them as fast as I can catch them. But, some of my more savvy clients still get click throughs from their crawling....never a good scene!


 10:05 am on Apr 20, 2006 (gmt 0)

Good points for a lot of situations. My question was about this particular situation. Nobody mentioned PPC clicks happening because of this entity. 40-50 page views is the situation here, and its bandwidth cost is trivial. And so forth.

In this case, it seems to me that the minutes spent fussing with this one entity could be better spent improving the site or its marketing. My main action in this particular case would be to filter them out of the stats for the main report, and into the stats for the spiders/bots report.

I'm looking for balance; you're looking for the 100% solution and a litmus test. That's fine. I just want to clear up the impression that I was saying that nothing ever needed to be blocked. I've been running and analyzing web sites since Mosaic 1.0 and, believe me, I would not be that naive.
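The "filter rather than block" approach above is simple to script: route hits from the watched entity into the spiders/bots report instead of the main report. A rough sketch (the referrer substring is just this thread's example; a real stats pipeline would match on more than the referrer):

```python
# Referrer substrings whose hits belong in the bots report, not the
# main traffic report.
WATCHED = ("yournetdetective.com",)

def partition_hits(hits):
    """Split (referrer, path) hits into main-report and bot-report lists."""
    main, bots = [], []
    for referrer, path in hits:
        bucket = bots if any(w in referrer for w in WATCHED) else main
        bucket.append((referrer, path))
    return main, bots
```

The entity still shows up in the logs, so you can keep an eye on it, but it stops skewing the numbers you actually act on.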

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved