Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Anonymous Visits from Hosting Companies
aristotle




msg:4680264
 12:35 pm on Jun 16, 2014 (gmt 0)

I've been seeing quite a few bot-like visits from IP addresses associated with small web hosting companies. Does anyone know the purpose of these visits?

 

incrediBILL




msg:4680393
 10:27 pm on Jun 16, 2014 (gmt 0)

You would have to give us the user agents of those bots and a list of their actions before we could give you any insights.

There are tons of scrapers and spammers hosted everywhere, often using compromised machines to avoid detection via fast flux IP changes.

It's crazy out there.

aristotle




msg:4680397
 11:03 pm on Jun 16, 2014 (gmt 0)

Here's the Latest Visitors entry for a case I saw today. It just downloaded the bare page without images. I haven't saved any other cases, but I think they're similar to this one.
Host: 209.190.64.233
/
Http Code: 200 Date: Jun 16 11:45:17 Http Version: HTTP/1.1 Size in Bytes: 39902
Referer: -
Agent: Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.2)

Here is the IP information I found:
IP: 209.190.64.233
Hostname: e9.40.be.static.xlhost.com
ISP: eNET
Organization: XLHost.com
Services: None detected
Type: Corporate
Assignment: Static IP
Country: United States
State/Region: Ohio
City: Columbus

So I really don't know what it is, or its purpose.

not2easy




msg:4680401
 11:41 pm on Jun 16, 2014 (gmt 0)

I blocked that whole XLHost range about a year ago when a scraper got stuck in a trap. It came from 209.190.3.218, not the same IP, but humans don't generally visit from hosting IPs, while anonymous robots do. If in doubt, it is always best to check what they were up to. In this case they were not asking for robots.txt.
209.190.0.0 - 209.190.127.255
209.190.0.0/17 ENET-XLHOST
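
A quick way to confirm that both addresses mentioned so far fall inside that /17 is Python's standard `ipaddress` module. This is only a minimal sketch: the helper name `in_range` is made up for illustration, and 209.190.128.1 is just an arbitrary address picked to show something outside the block.

```python
import ipaddress

# Range quoted in the post above (ENET-XLHOST).
XLHOST = ipaddress.ip_network("209.190.0.0/17")


def in_range(ip: str, network: ipaddress.IPv4Network = XLHOST) -> bool:
    """Return True if the dotted-quad address lies inside the network."""
    return ipaddress.ip_address(ip) in network


# First and last address of the block, to compare with the dash notation.
print(XLHOST[0], "-", XLHOST[-1])

print(in_range("209.190.64.233"))  # the visitor from the log entry
print(in_range("209.190.3.218"))   # the trapped scraper's address
print(in_range("209.190.128.1"))   # arbitrary address just past the block
```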

brotherhood of LAN




msg:4680405
 12:04 am on Jun 17, 2014 (gmt 0)

Possibly a VPS used as a proxy or VPN.

keyplyr




msg:4680407
 1:01 am on Jun 17, 2014 (gmt 0)

@ aristotle

I manually look through my raw server logs many times every day, and I look up every questionable hit. If it is from a hosting company, colocation service, cloud, or anything aimed at business services... I block that IP range.

IMO none of the above have any valid reason to request files from my server.

Occasionally there is some collateral damage, like company employees surfing during work hours. Each webmaster has to fine tune these types of gray areas.
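
Part of that manual routine can be roughed out in a script. The sketch below assumes a log where the client IP is the first whitespace-separated field (as in Apache's common and combined formats). The function name `hosting_hits` and the two sample lines are invented for illustration, and the only range listed is the XLHost /17 quoted earlier in the thread.

```python
import ipaddress
from collections import Counter

# Ranges already identified as hosting space (only the one from this thread).
HOSTING_RANGES = [ipaddress.ip_network("209.190.0.0/17")]


def hosting_hits(log_lines):
    """Count requests per hosting range, keyed by the range's CIDR string."""
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address
        for net in HOSTING_RANGES:
            if ip in net:
                hits[str(net)] += 1
                break
    return hits


# Made-up sample lines in combined log format (203.0.113.7 is a
# documentation-only address).
SAMPLE_LINES = [
    '209.190.64.233 - - [16/Jun/2014:11:45:17 +0000] "GET / HTTP/1.1" 200 39902 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.2)"',
    '203.0.113.7 - - [16/Jun/2014:11:46:02 +0000] "GET / HTTP/1.1" 200 39902 "-" "ExampleBrowser/1.0"',
]

for cidr, count in hosting_hits(SAMPLE_LINES).items():
    print(cidr, count)
```

This only flags hits from ranges you have already listed. Deciding whether an unfamiliar IP belongs to a hosting company still takes a whois or reverse lookup.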

Pfui




msg:4680425
 3:30 am on Jun 17, 2014 (gmt 0)

xlhost is eminently blockable, and mixed-ranging. Here's another of their Columbus, Ohio-based ranges:

64.79.85.200 - 64.79.85.207
64.79.85.200/29

Last month, "niki-bot" (that's the entire UA) ran from one of those IPs and slammed into 403s for five minutes straight.

keyplyr




msg:4680438
 5:46 am on Jun 17, 2014 (gmt 0)


Pfui wrote:
"xlhost is eminently blockable, and mixed-ranging. Here's another of their Columbus, Ohio-based ranges:

64.79.85.200 - 64.79.85.207
64.79.85.200/29"

That XLHost range is actually:

64.79.64.0 - 64.79.95.255
64.79.64.0/19
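
The relationship between the two ranges can be checked with Python's standard `ipaddress` module: the /29 is a small slice of the /19, and the dash notation converts to CIDR mechanically. This is a minimal sketch. The helper name `range_to_cidrs` is invented for illustration, and `subnet_of` needs Python 3.7 or later.

```python
import ipaddress

narrow = ipaddress.ip_network("64.79.85.200/29")  # the 8-address block
parent = ipaddress.ip_network("64.79.64.0/19")    # the wider allocation


def range_to_cidrs(first: str, last: str) -> list:
    """Convert a 'first - last' address range into the CIDR blocks covering it."""
    nets = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in nets]


print(narrow.subnet_of(parent))                   # is the /29 inside the /19?
print(narrow.num_addresses, parent.num_addresses)
print(range_to_cidrs("64.79.64.0", "64.79.95.255"))
print(range_to_cidrs("64.79.85.200", "64.79.85.207"))
```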

Pfui




msg:4680567
 3:36 pm on Jun 17, 2014 (gmt 0)

Thanks! I usually block the closest CIDR and go upstream as needed. But when it comes to iffy server farms, blocking the parent's probably a good first move.

wilderness




msg:4680625
 9:34 pm on Jun 17, 2014 (gmt 0)

For this one (and others using similar broken UAs?), see this thread [webmasterworld.com ]

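# Flag user agents with stray spaces around punctuation.
# keep_out still has to be acted on elsewhere (e.g. Deny from env=keep_out).
SetEnvIf User-Agent " ; " keep_out
SetEnvIf User-Agent " \( " keep_out
# Semicolon followed by TWO spaces; a single space would match most normal browsers.
SetEnvIf User-Agent "; [ ]" keep_out
SetEnvIf User-Agent "\) ; " keep_out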

or

RewriteCond %{HTTP_USER_AGENT} \)\ \) [OR]
RewriteCond %{HTTP_USER_AGENT} \ ;[\ ] [OR]
RewriteCond %{HTTP_USER_AGENT} \ \([\ ] [OR]
RewriteCond %{HTTP_USER_AGENT} ;\ [\ ] [OR]
RewriteCond %{HTTP_USER_AGENT} \)\ ;[\ ]
RewriteRule .* - [F]
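
Before putting rules like these on a live site, it can help to try the patterns against a few strings offline. The sketch below is a hand translation of the RewriteCond patterns above into one Python regular expression, not something Apache produces. The name `is_broken_ua` and the three malformed sample strings are made up. It also assumes the `+` signs in the logged agent string are the log viewer's stand-in for spaces.

```python
import re

# Hand-translated from the RewriteCond lines: each alternative is one condition.
BROKEN_UA = re.compile(
    r"\) \)"     # ") )"   close paren, space, close paren
    r"| ; "      # " ; "   space before and after a semicolon
    r"| \( "     # " ( "   space before and after an open paren
    r"|; {2}"    # ";  "   semicolon followed by two spaces
    r"|\) ; "    # ") ; "  close paren, space, semicolon, space
)


def is_broken_ua(user_agent: str) -> bool:
    """Return True if the user agent contains one of the odd spacing patterns."""
    return BROKEN_UA.search(user_agent) is not None


# Agent string from the log entry earlier in the thread, with "+" read as space.
logged = "Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.2)".replace("+", " ")

samples = [
    logged,
    "Mozilla/4.0 (compatible ; MSIE 8.0)",     # made-up: space before semicolon
    "Mozilla/4.0 (compatible;  MSIE 8.0)",     # made-up: two spaces after semicolon
    "Mozilla/5.0 ( compatible; Examplebot )",  # made-up: space after open paren
]

for ua in samples:
    print(is_broken_ua(ua), repr(ua))
```

Read that way, the logged agent string is normally spaced and none of the patterns fire on it. The rules only catch agents whose spacing really is off.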

lucy24




msg:4680699
 6:32 am on Jun 18, 2014 (gmt 0)

aristotle wrote: "So I really don't know what it is, or its purpose."

If it's a new, just-getting-started robot it probably doesn't have its own server yet. So any IP lookup will take you only back to the (shared) host's name.

dstiles




msg:4680887
 6:18 pm on Jun 18, 2014 (gmt 0)

Major reasons for bots on multiple IPs, even from servers, include scraping and planting viruses.

Hits may come from broadband-based botnets, server-based botnets or server space rented for the occasion, although that may be more expensive and less fun than hiring a botnet.

Dammy




msg:4685413
 1:52 pm on Jul 6, 2014 (gmt 0)

I'm also facing the same issue.
Regardless of the robots.txt file, they keep visiting, and they are wasting my bandwidth.

not2easy




msg:4685429
 3:28 pm on Jul 6, 2014 (gmt 0)

Hi Dammy, welcome to the Forums. The robots.txt file does not have any control over what robots can or will do. It is your wish list, as in, "This is what I want you to do". Not all robots even look at that file, and some that do go on to ignore it.

The place where you can set limits is your .htaccess file and there are many resources here in the Forums to help you learn how to do that. I suggest you look through this Forum's Library if you want to be able to make them stop.
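
As a rough illustration of what such limits look like, the sketch below prints a block of deny lines for the two XLHost ranges named earlier in this thread. It assumes Apache 2.2-style access directives (`Order`, `Allow from`, `Deny from`). On Apache 2.4 these only work through mod_access_compat, so check your server version first. The function name `htaccess_deny_block` is invented for illustration.

```python
import ipaddress


def htaccess_deny_block(cidrs):
    """Build Apache 2.2-style directives that deny the given CIDR ranges."""
    lines = ["Order Allow,Deny", "Allow from all"]
    for cidr in cidrs:
        # ip_network() validates the range and raises ValueError on a typo
        # such as host bits set (e.g. "209.190.64.233/17").
        net = ipaddress.ip_network(cidr)
        lines.append(f"Deny from {net}")
    return "\n".join(lines)


print(htaccess_deny_block(["209.190.0.0/17", "64.79.64.0/19"]))
```

With `Order Allow,Deny`, a request matching both an Allow and a Deny line is refused, so everyone is let in except the listed ranges.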

Dammy




msg:4685646
 2:57 pm on Jul 7, 2014 (gmt 0)

@not2easy
Thanks for the reply. I'm always getting into problems with .htaccess, but anyway I'm going to look for that.
Thanks

aristotle




msg:4688140
 12:39 pm on Jul 16, 2014 (gmt 0)

Another small hosting company that shows up is Iliad Hosting in France. I found some information about it:
IP-range/subnet: 62.210.0.0-62.210.255.255
Description: IP Pool for Iliad-Entreprises Business Hosting Customers
Location: France (FR)
Registry: ripe
Number of domains hosted: 40,660
Number of nameservers hosted: 1,665
Number of mailservers hosted: 1,849
Number of SPAM hosts hosted: 120
Number of IP routes: 2

Can someone advise me on whether I should block this entire IP range? Also, what does it mean where it says that the number of "SPAM hosts" hosted is 120?

not2easy




msg:4688184
 3:20 pm on Jul 16, 2014 (gmt 0)

"Business Hosting Customers" says you will never see a human visitor from that range. Whether to block or not is your call, I would if they were found "visiting".

aristotle




msg:4688206
 4:52 pm on Jul 16, 2014 (gmt 0)

not2easy -- Thanks for the reply. But I just got a private email from Leosghost telling me that Iliad is also an ISP to the general French public. So I've decided not to block it out of concern that I could also block ISP users. There might be a way to differentiate, but I don't have the time, knowledge, or skill to make it worthwhile to pursue it any further.

