
Forum Moderators: Ocean10000 & incrediBILL


Been getting 200+ page loads per day from same IP

recorded by JS-based analytics

     
9:33 pm on May 11, 2012 (gmt 0)



Hi All

These visits are recorded by StatCounter, so it doesn't appear to be a traditional spider, and there are up to 5 minutes between some visits.

It's no load problem for my servers, but it does make a nonsense of analytics for a low-traffic site, especially given the shortcomings in unique-visit identification.

Could it be one person visiting 200 pages day after day, sometimes the same pages too?

Do other folk see this type of profile?

Do you ban 'em?

Can supply IP range if helpful.
5:00 am on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



You haven't provided any information that would enable anybody to assist you.

Stats and stats software provide generalized data.

You need to view and provide the data from your "raw visitor logs", before anybody may help.
7:40 am on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



:: peering into crystal ball ::

It started out as a human. But then they forgot to close the tab containing your site, and

(1) every so often their computer crashes, and when it restarts, the browser reopens all former tabs (OK, this is not likely to be happening every few minutes rather than every few days)

(2) after a while the browser's auto-caching kicks in and the page is quietly reloaded at fixed intervals

(3) every time the user does anything else in the browser, all open tabs also reload.

Oh, wait. 200 different pages? I get weirdly repeated visits to the same page.

Does it seem to be a human? Don't look at the spacing of hits to pages themselves. See whether all associated files-- css, images etc-- are loaded up. This part happens faster with a human than with a robot. The js-based analytics alone isn't enough; I'm waging an ongoing battle to keep robots out of piwik. And I don't mean slimy Ukrainian bots either-- I mean well-known search engines.
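
For anyone wanting to try the "are the associated files loaded?" check against raw logs, here is a minimal sketch. It assumes Apache combined-format log lines; the function name `asset_ratio_by_ip` and the extension list are made up for illustration, so adjust both to your own setup.

```python
import re
from collections import defaultdict

# Start of an Apache combined-format line: IP, two fields, [timestamp], "METHOD path ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+)')

# Assumed list of "associated file" extensions; extend to match your site.
ASSET_EXT = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico")


def asset_ratio_by_ip(lines):
    """Return {ip: (page_hits, asset_hits)} for the given raw log lines."""
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        ip, path = m.groups()
        path = path.split("?", 1)[0].lower()  # ignore the query string
        if path.endswith(ASSET_EXT):
            counts[ip][1] += 1
        else:
            counts[ip][0] += 1
    return {ip: tuple(c) for ip, c in counts.items()}
```

An IP with many page hits and few or no asset hits is worth a closer look, though a browser with a warm cache can produce the same pattern, so treat it as a hint rather than proof.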
8:47 am on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month




You need to view and provide the data from your "raw visitor logs", before anybody may help.


ditto
1:48 pm on May 12, 2012 (gmt 0)



38.99.***.*** HTTP/1.1 Mozilla/5.0+(X11;+U;+Linux+x86_64;+en-US)+AppleWebKit/533.3+(KHTML,+like+Gecko)+Qt/4.7.1+Safari/533.3


There appear to be up to 4 JavaScript-enabled crawlers from the 38.99. range.

I am now certain they are crawlers because I diverted them into a cul-de-sac and they just keep hitting the same page at the same rate, even though they've been redirected from up to 500 different pages.
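
The "same rate" observation can be put into numbers. A rough sketch, assuming you have already pulled the hit times for one IP out of the logs as epoch seconds; `interval_stats` is a hypothetical helper name, not part of any stats package.

```python
from statistics import mean, pstdev


def interval_stats(timestamps):
    """Return (mean_gap, stdev_gap) in seconds, or None with fewer than two hits."""
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if not gaps:
        return None
    return mean(gaps), pstdev(gaps)
```

A spread that is small compared with the mean gap suggests a timer rather than a person. Where to draw that line is a judgment call.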
2:21 pm on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



38.99.***.***


Forum practice is to obscure only the Class D (last group).

Deny from 38.
or
RewriteCond %{REMOTE_ADDR} ^38\. [OR]

You need two lines or one combined to catch this UA and similar pests.
1) missing semi-colons
2) plus signs as opposed to spaces.

There are some other things you could key on; however, they are personal preference.
I don't allow "Linux" users, at least when so designated in the UA.
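
If you want to test the same two conditions offline against log data before touching .htaccess, something like the following would do. This is only a sketch: `looks_suspicious` is a made-up name, and it covers just the `38.` prefix and the plus-signs-for-spaces pattern from the posts above.

```python
def looks_suspicious(remote_addr, user_agent):
    """Hypothetical heuristic: flag the 38. range or a UA using '+' instead of spaces."""
    in_range = remote_addr.startswith("38.")  # same idea as ^38\. in the RewriteCond
    plus_for_space = "+" in user_agent and " " not in user_agent
    return in_range or plus_for_space
```

One caution: some log formats (IIS W3C extended logging, for one) write spaces in the UA field as plus signs, so it is worth confirming what the client actually sent before blocking on that pattern.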
2:34 pm on May 12, 2012 (gmt 0)



Thanks, I'll give those a shot.
6:41 pm on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



I have 96.0 - 99.255 listed (and enabled) as Scoutjet robot from Blekko.

Beyond that I have cogent completely blocked at 38.96.0.0 - 38.127.255.255.
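
For anyone who prefers CIDR notation, the range quoted above can be converted with Python's standard `ipaddress` module. A small sketch; `is_blocked` is just an illustrative helper.

```python
import ipaddress

# Range as quoted above: 38.96.0.0 - 38.127.255.255
first = ipaddress.ip_address("38.96.0.0")
last = ipaddress.ip_address("38.127.255.255")

# summarize_address_range yields the smallest set of CIDR blocks covering the range
blocks = list(ipaddress.summarize_address_range(first, last))


def is_blocked(addr):
    """Return True if addr falls inside any of the summarized blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in blocks)
```

Here the whole range collapses to a single block, 38.96.0.0/11.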

I have notes of bots from Discovery, Trustwave and Voyager/Kosmix with the blocked range going back more than two years; they may be no longer on that range.

Whatever, if it's a (semi-)genuine bot it should have an identity within the UA. Your example UA has no such thing.

I would have suspicions about safari coming from linux unless it were a skilled user or hacker OR a bot. Default browser (at least for ubuntu) is konqueror but I think most people use firefox or opera. Webkit is mostly a Mac or Chrome browser or used by google as a site scraper - sorry, "Web Preview" bot. Can't see the latter running on cogent but it's always possible, I suppose, although I think the UA is wrong for that.
6:53 pm on May 12, 2012 (gmt 0)



Interesting that you mention Cogent.

Were these automata you blocked JS-enabled?
7:01 pm on May 12, 2012 (gmt 0)

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Sorry, no idea about JS. I think scoutjet probably is, though. Most large engines seem to be, now.
 
