
Forum Library, Charter, Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

    
Unusual Traffic from Limelight Networks
kinboshi




msg:3560876
 1:56 pm on Jan 29, 2008 (gmt 0)

Anyone know anything about Limelight Networks in Tempe, Arizona (other than what I've been able to read on their website)?

The site I'm working on is a UK-focused car sales site. The vast majority of the traffic comes from the UK, and a sizeable chunk from search engines.

We noticed that over the past week, a high amount of 'direct traffic' started to come to the site. Looking through the stats, they all originated from one source - Limelight Networks in Tempe, Arizona. Until a week ago, they'd never visited the site before.

In Google Analytics, it's showing each of the visits as a separate visitor (rather than the same visitor viewing multiple pages). They aren't focusing on any particular pages, but are acting in the way I'd expect a search-engine spider to act (possibly). They aren't visiting the same page multiple times, they're going to new pages each time, and spending only a second on these pages.

It's not caused us any issues in terms of load on the server, so we're not unduly worried. However, I'd be interested to hear if anyone else has had them suddenly come to your site and act in the same way? I've done a bit of looking around on Google, and did find one other mention of them doing the same thing on another site back in 2006.

 

blend27




msg:3560923
 2:47 pm on Jan 29, 2008 (gmt 0)

Hi Kinboshi,

Is there a specific User Agent associated with these requests?
Are visitors coming from the same IP Address?
Do these visitors download the images as well as JavaScript/CSS files?

kinboshi




msg:3560935
 3:03 pm on Jan 29, 2008 (gmt 0)

I'm just getting access to our log files now. Unfortunately, Google Analytics doesn't provide IP addresses of visitors or details of exactly what they are doing.

Hopefully the log files will tell me a bit more.

kinboshi




msg:3561009
 3:58 pm on Jan 29, 2008 (gmt 0)

The visits are all coming from the same IP range.

Yes, they are downloading images and requesting JS, and the user agent is declaring itself as HTTP/1.0 Mozilla/5.0+ and is running on a Linux box.
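
For anyone who wants to answer the same three questions (user agent, IP addresses, asset downloads) from their own raw logs, here is a rough Python sketch. It assumes the Apache "combined" log format, which nobody in this thread has confirmed, and the sample lines, paths and the `summarize` helper are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
ASSET_SUFFIXES = (".js", ".css", ".gif", ".jpg", ".jpeg", ".png")


def summarize(lines, ip_prefix):
    """Count IPs, user agents and asset requests for clients matching ip_prefix."""
    ips, agents, assets = Counter(), Counter(), 0
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or not m.group("ip").startswith(ip_prefix):
            continue
        ips[m.group("ip")] += 1
        agents[m.group("agent")] += 1
        # Ignore any query string before checking the file extension.
        path = m.group("path").split("?", 1)[0].lower()
        if path.endswith(ASSET_SUFFIXES):
            assets += 1
    return {"ips": ips, "agents": agents, "asset_requests": assets}


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    '208.111.154.16 - - [29/Jan/2008:13:56:01 +0000] "GET /cars/1 HTTP/1.0" 200 5120 "-" "Mozilla/5.0 (X11; U; Linux i686)"',
    '208.111.154.16 - - [29/Jan/2008:13:56:02 +0000] "GET /js/site.js HTTP/1.0" 200 900 "-" "Mozilla/5.0 (X11; U; Linux i686)"',
    '208.111.154.21 - - [29/Jan/2008:13:56:03 +0000] "GET /img/logo.gif HTTP/1.0" 200 300 "-" "Mozilla/5.0 (X11; U; Linux i686)"',
    '81.2.69.160 - - [29/Jan/2008:13:56:04 +0000] "GET /cars/2 HTTP/1.1" 200 5000 "-" "Mozilla/4.0 (compatible; MSIE 7.0)"',
]

report = summarize(SAMPLE, "208.111.154.")
print(len(report["ips"]), "IPs,", report["asset_requests"], "asset requests")
```

The trailing dot in the prefix keeps the match confined to that one range. In practice you would pass an open log file instead of `SAMPLE`.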

Brad




msg:3562876
 2:39 pm on Jan 31, 2008 (gmt 0)

I am getting hit with the same spider. Here is the info I have:

Domain Name: llnw.net?
IP Address: 208.111.154.# (Limelight Networks, LLC)
ISP: Limelight Networks, LLC
Location:
  Continent: North America
  Country: United States
  State: Arizona
  City: Tempe
  Lat/Long: 33.4357, -111.9171
Language: English (U.S.), en-us
Operating System: Linux UNIX
Browser: Mozilla 1.8.1.11
  Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.11) Gecko/20080109
Javascript version: 1.5
Monitor:
  Resolution: 1300 x 1300
  Color Depth: 16 bits

I have written to Limelight asking for an explanation, although I'm not holding my breath for a response.

Anyone know anything about them?

kinboshi




msg:3562910
 3:08 pm on Jan 31, 2008 (gmt 0)

I did get a response. In fact I got two.

One said it could be related to the banner ads on my site that might be delivered by their CDN (which I guess is Content Delivery Network).

Someone else from Limelight also responded, and they said that the IPs relate to Searchme, who are a client of Limelight, and that it could be their spider.

They did say that they don't want to cause any problems with anyone's server, so if we wanted we could be added to their no-crawl list. If it's a new search engine that's going to be launched then we're quite happy to be spidered, indexed and ranked. It is interesting though that the spider doesn't appear to be acting like other search-engine spiders.

Rehan




msg:3562935
 3:50 pm on Jan 31, 2008 (gmt 0)

A whois on the IP address will show that it belongs to Limelight, but if you do a reverse DNS lookup then the hostname of the IP address should give you more info. My guess is that it's kavam/searchme that's crawling your site.
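
A minimal Python sketch of that reverse DNS check, for anyone who would rather script it than look up each address by hand. The helper names are my own, the hostname at the bottom is only a sample, and the "last two labels" shortcut is a naive assumption that gives the wrong answer for suffixes like .co.uk.

```python
import socket


def reverse_dns(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


def registered_domain(hostname):
    """Naive guess at the owning domain: the last two labels of the hostname."""
    return ".".join(hostname.rstrip(".").split(".")[-2:]).lower()


# Sample hostname used only to show the output shape (no network call here).
owner = registered_domain("v21.nat.svl.kavam.net")
print(owner)
```

Calling `reverse_dns("…")` with a real address does a live lookup. PTR records are set by whoever controls the IP block, so treat the result as a hint rather than proof.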

Web_speed




msg:3565206
 12:43 pm on Feb 3, 2008 (gmt 0)

I am seeing the same thing in my logs... on 5 sites that I currently monitor.

Very weird spider behaviour... it requests JavaScript, images, CSS, the lot. It behaves like a browser and inflates impressions on a number of ad networks currently running on these sites. No increase in click activity though, impressions only.

Seems to be all over the place and has been showing an increase in activity over the last few days.

Limelight Networks Inc.

Reverse dns:
v21.nat.svl.kavam.net
v18.nat.svl.kavam.net
v20.nat.svl.kavam.net ......etc. all over my logs

does not smell too good.

[edited by: Web_speed at 12:45 pm (utc) on Feb. 3, 2008]

Aldebaran




msg:3565490
 1:09 am on Feb 4, 2008 (gmt 0)

Hi,
I too have been seeing traffic from Kavam.net and Limelight Networks and contacted my traffic tracking company for their opinion.

<snip>

The traffic is from a Searchme robot...these are not human visitors.

For some odd reason, they are triggering javascript web traffic tracking code, which normal spiders don't do. Please do read my article and let me know if this is what you all are seeing too.

[edited by: engine at 9:37 am (utc) on Feb. 5, 2008]
[edit reason] No urls, thanks. See TOS [webmasterworld.com] [/edit]

Web_speed




msg:3566369
 2:17 am on Feb 5, 2008 (gmt 0)

@ Aldebaran

Yes, exactly what I am experiencing.

Update:
This bot continues to hammer my sites heavily. This is no normal crawler...it acts like a browser and executes javascript code, not just URLs in the code but the entire code just like a browser would.

I'm itching to completely block this thing on all fronts...thoughts anyone?

Rehan




msg:3566376
 2:25 am on Feb 5, 2008 (gmt 0)

For the foreseeable future you'll probably get more traffic from the spider itself than what you get from the search engine (which is not even live yet). I'd block it if I was in your shoes.

phranque




msg:3566392
 3:18 am on Feb 5, 2008 (gmt 0)

welcome to WebmasterWorld [webmasterworld.com], Aldebaran!

Aldebaran




msg:3566448
 6:04 am on Feb 5, 2008 (gmt 0)

Thank you very much for the welcome!
J

Aldebaran




msg:3566449
 6:05 am on Feb 5, 2008 (gmt 0)

Oh, and I suppose the best way to block it would be via an .htaccess file. When I simply used robots.txt, the bot seemed to ignore it, but I suppose you should try robots.txt first.

Web_speed




msg:3566515
 9:35 am on Feb 5, 2008 (gmt 0)

For the foreseeable future you'll probably get more traffic from the spider itself than what you get from the search engine (which is not even live yet). I'd block it if I was in your shoes.

Thanks. Makes a lot of sense. I decided to just block the darn thing.

In case anyone is interested, here is the .htaccess code:

<Limit GET HEAD POST>
# Bye bye Limelight Networks. You are not welcome here.
# With "Order Allow,Deny", a matching Deny overrides "Allow from all".
Order Allow,Deny
Allow from all
Deny from 208.111.154
</Limit>

[edited by: Web_speed at 9:40 am (utc) on Feb. 5, 2008]

tobyism




msg:3566931
 6:20 pm on Feb 5, 2008 (gmt 0)

We found a link to a page on the Searchme site in our logs. It's not linked anywhere from the homepage. Not sure yet if the bot follows its own advice, but the robots.txt information is:

User-Agent: Charlotte
Disallow: /

great name for a spider eh?
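
As a quick sanity check on what that rule covers, here is a small Python sketch using the standard library's robots.txt parser. It only shows how a well-behaved parser reads the two lines above; it says nothing about what Searchme's crawler actually does. The test path is hypothetical, and the browser-style string is the user agent reported earlier in this thread.

```python
from urllib import robotparser

# The two lines quoted above, fed straight into the parser.
RULES = ["User-Agent: Charlotte", "Disallow: /"]

parser = robotparser.RobotFileParser()
parser.parse(RULES)

# User agent string reported earlier in this thread for the Limelight traffic.
BROWSER_UA = (
    "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.11) Gecko/20080109"
)

# "/any-page.html" is a hypothetical path.
charlotte_allowed = parser.can_fetch("Charlotte", "/any-page.html")
browser_allowed = parser.can_fetch(BROWSER_UA, "/any-page.html")

print("Charlotte allowed:", charlotte_allowed)
print("Browser-style UA allowed:", browser_allowed)
```

The rule matches on the name "Charlotte", so a client that presents a browser-style user agent falls outside it. A `User-Agent: *` entry would cover everything, but only for crawlers that choose to honour robots.txt.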

sc440




msg:3567434
 6:04 am on Feb 6, 2008 (gmt 0)

Yesterday this bot did 20 clicks on the AdSense ads of a couple of my sites... It doesn't show in Google's reports but my tracking software picked it up.

This is not cool!

Thanks for the htaccess code...

Aldebaran




msg:3567436
 6:17 am on Feb 6, 2008 (gmt 0)

sc440,
How does your traffic tracking pick up on adsense clicks?

sc440




msg:3567827
 5:26 pm on Feb 6, 2008 (gmt 0)


I use a script named asRep

Bewenched




msg:3568173
 2:20 am on Feb 7, 2008 (gmt 0)

When they spidered my site recently it was multiple IP addresses not using any identification.

208.111.154.16
208.111.154.189
208.111.154.15
208.111.154.67
208.111.154.193
208.111.154.66
208.111.154.182
208.111.154.21
208.111.154.183
208.111.154.184
208.111.154.65
208.111.154.68
208.111.154.199
208.111.154.188
208.111.154.186
208.111.154.195
208.111.154.197
208.111.154.200
208.111.154.62
208.111.154.69
208.111.154.63
208.111.154.64
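
A short Python check that every address in the list above falls inside the range covered by the `Deny from 208.111.154` rule posted earlier. A three-octet prefix in that Apache directive corresponds to 208.111.154.0/24; the variable names here are just for illustration.

```python
import ipaddress

# The three-octet prefix "208.111.154" corresponds to this /24 network.
BLOCK = ipaddress.ip_network("208.111.154.0/24")

# Addresses copied from the list above.
SEEN = [
    "208.111.154.16", "208.111.154.189", "208.111.154.15", "208.111.154.67",
    "208.111.154.193", "208.111.154.66", "208.111.154.182", "208.111.154.21",
    "208.111.154.183", "208.111.154.184", "208.111.154.65", "208.111.154.68",
    "208.111.154.199", "208.111.154.188", "208.111.154.186", "208.111.154.195",
    "208.111.154.197", "208.111.154.200", "208.111.154.62", "208.111.154.69",
    "208.111.154.63", "208.111.154.64",
]

# Any address not covered by the block would end up in this list.
outside = [ip for ip in SEEN if ipaddress.ip_address(ip) not in BLOCK]
print(len(SEEN), "addresses checked,", len(outside), "outside", BLOCK)
```

If `outside` ever comes back non-empty for your own logs, the crawler is using addresses beyond that one range and the block would need widening.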

[edited by: Bewenched at 2:23 am (utc) on Feb. 7, 2008]

Mone




msg:3568363
 8:50 am on Feb 7, 2008 (gmt 0)

I've been watching this ever since I began seeing visits from Tempe, Az. Today it changed from just displaying the IP to crawl1.nat.svl.searchme.com.

I found an explanation of what they're up to.
[searchme.com...]

They're still hammering me like I'm a nail that won't go in. Grrr.

jenkers




msg:3568380
 9:39 am on Feb 7, 2008 (gmt 0)

I was having a look through my logs to see why the hell I was being hammered so badly and went to the help page listed above.

Unfortunately the UA that is blitzing my site does not show Charlotte - it appears to be spoofing a normal browser UA - basically their advice is - pretty much useless.

I've added the lines shown in that earlier post to my htaccess and it worked a treat. Don't know what these guys are up to but they're about to get a stinking email - nobody needs to spider any of my sites so completely in such a short space of time...

tobyism




msg:3568738
 5:19 pm on Feb 7, 2008 (gmt 0)

This SearchMe company definitely does not follow its own rules, they are still hammering our site. Going to try the .htaccess now. Not sure how these 'experimental' bots these people come up with plan to get anywhere by ticking off webmasters. Had the same problem with cuill.com and their twiceler bot not too long ago.

Need More Hits




msg:3569067
 10:54 pm on Feb 7, 2008 (gmt 0)

Hey guys I am sure we are currently discussing the same issue in this thread
[webmasterworld.com...]

I have been getting hit by the same bots, like kavam.net, as Web_speed has mentioned. Maybe we can get an admin to merge these threads so we can all try to figure this out together.

Brad

Mone




msg:3569074
 11:06 pm on Feb 7, 2008 (gmt 0)

I wrote them and this is the reply:

"Yes, the activity you are seeing is coming from one of our crawlers. We are busily refreshing our index in preparation for the public launch of our search engine. We hope that inclusion in our index will prove beneficial to you, but understand if you would prefer that we exclude you. We will add you to our “do not crawl” list today. You should see all activity from our spiders cease within an hour or two, so you should not need to modify your robots.txt configuration. We will stay off your site until such time as you explicitly request that we index you again.
Sorry for any inconvenience."

Hmmmmm.

Tusserte




msg:3569390
 6:18 am on Feb 8, 2008 (gmt 0)

This bot is making my counter go crazy! Oh well, I guess I could use the traffic :D

Badger37




msg:3569504
 9:47 am on Feb 8, 2008 (gmt 0)

"Need More Hits" - No I think these are different issues.

I'm also seeing the kavam.net activity.
NB today they seem to have changed from kavam to searchme with the same UA: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.11) Gecko/20080109

Why they aren't using a meaningful UA like a normal bot, I can't say.

The link you mention is from another source trying to do something malicious with URL redirects - as discussed in the thread.

Unless... everyone seeing the kavam/searchme activity has also started seeing the strange outgoing links. Then I guess they could be connected.

But to me kavam/searchme just looks like an amateur bot when the other activity looks like a malicious attack.

Just checked my logs and kavam came visiting on one site and the dodgy outgoing link activity (see NMH's link) started on the same day on one site and 3 days later on another. Does all seem a bit coincidental now you mention it!

Mone




msg:3569800
 5:23 pm on Feb 8, 2008 (gmt 0)

It seems my email to them worked. True to his word, I haven't been crawled. Maybe that's the best way to go. I didn't have to do anything to stop or block them.

The tip is to write in a friendly manner, make it a bit funny and non-threatening. Works like a charm.

Too many males (sorry, guys) tend to be formal and at times confrontational and it backfires. I'm tired of the phrase "outside the box" but that's where I go and it hasn't failed me yet.

I'm going to monitor the site and wait for it to become operational and then perhaps send another email inviting them back.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved