
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 92 message thread spans 4 pages: < < 92 ( 1 2 [3] 4 > >     
Server Farms - March 2013
Ongoing WMW server farm report
wilderness




msg:4552057
 10:45 am on Mar 7, 2013 (gmt 0)

Continued from previous thread: [webmasterworld.com...]


The old thread has become too large, and there is no longer any method of linking to individual submissions within threads at Webmaster World, thus making the previous thread useless as a reference (they do come up in the search results).

Joe's Datacenter
JOESDC-02 204.27.56.0 - 204.27.63.255 204.27.56.0/21
JOESDC-01 208.94.240.0 - 208.94.247.255 208.94.240.0/21
JOESDC-01 69.195.128.0 - 69.195.159.255 69.195.128.0/19
JOESDC-01 96.43.128.0 - 96.43.143.255 96.43.128.0/20
JOESDC-01 2604:5800:: - 2604:5800:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF

[edited by: incrediBILL at 12:59 am (utc) on Mar 8, 2013]
[edit reason] Added link to previous thread [/edit]

 

Kendo




msg:4567798
 6:53 am on Apr 25, 2013 (gmt 0)

I have the whole of the hurricane range blocked


Why would anyone want to block services running on the Hurricane network?

Also, do you realise that the IP ranges mentioned above are only the tip of their iceberg?

wilderness




msg:4567833
 7:45 am on Apr 25, 2013 (gmt 0)

I have the whole of the hurricane range blocked

FWIW, the only reason I listed the Hurricane ranges (which we've all denied previously) was because they are shown in the Fork ranges.

Why would anyone want to block services running on the Hurricane network?


You've apparently been residing on another planet for the past decade!

[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]

keyplyr




msg:4567834
 7:53 am on Apr 25, 2013 (gmt 0)



RE: US Net Inc
They list abuse contact as nobody@example.com

wilderness




msg:4567838
 7:54 am on Apr 25, 2013 (gmt 0)

They list abuse contact as nobody@example.com


Please note; This is not the Apache forum ;)

keyplyr




msg:4567844
 8:38 am on Apr 25, 2013 (gmt 0)


Point being, the company is non-compliant with web standards by falsifying registration information and is possibly in violation of ICANN. Who knows what else they're hiding... making it block-worthy by default.

wilderness




msg:4567845
 8:46 am on Apr 25, 2013 (gmt 0)

i.e., the subject line of this thread and its predecessor.

keyplyr




msg:4567871
 10:34 am on Apr 25, 2013 (gmt 0)

FYI - US Net Inc is registered as an ISP, not a server farm. However, they do host private web sites as well as business web sites and offer other server products associated with hosting. Some people would automatically block them just because of the hosting services, not me. I usually do not block companies that *also* offer ISP connectivity to residential & biz accounts because of the potential loss of human traffic, loss of sales.

However, since these guys *seem* to be unethical and hiding information, that's enough to sway me to block them. That is the reason I posted about it.

I really don't care about the subject line of a thread. I'm going to comment if I feel the information is relevant to the discussion. Just because a company registered as an ISP is mentioned in a server farm thread, doesn't in itself make that company a server farm, and as such, doesn't automatically qualify it as something to block IMO. Everything here is worthy of discussion. This is a forum after all.

There's a wide audience here at WW. Some members block almost everything, others are more selective. Some don't even understand why some companies should be blocked at all.

dstiles




msg:4568041
 8:04 pm on Apr 25, 2013 (gmt 0)

Other block-worthy registrations are those declaring their email address to be @hotmail, @gmail, @yahoo etc. They are HIDING! I doubt very much if they ever read those mailboxes - or even own them any more.

dstiles




msg:4568280
 4:52 pm on Apr 26, 2013 (gmt 0)

Google appengine hits from a new (to me) range today...

8.35.192.0 - 8.35.207.255
OrgName: Google Apps.
Range Owner: Level3

Blocked.
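For anyone doing this kind of block in application code rather than server config, here is a minimal sketch using Python's standard `ipaddress` module. The range quoted above works out to 8.35.192.0/20; the names `BLOCKED_NETWORKS` and `is_blocked` are made up for illustration and are not anyone's actual setup.

```python
import ipaddress

# The range quoted above, 8.35.192.0 - 8.35.207.255, written as a CIDR block.
# BLOCKED_NETWORKS and is_blocked are hypothetical names for this sketch.
BLOCKED_NETWORKS = [ipaddress.ip_network("8.35.192.0/20")]


def is_blocked(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside any blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("8.35.200.17", "8.35.208.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Adding further ranges is just a matter of appending more networks to the list.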

Kendo




msg:4568286
 5:14 pm on Apr 26, 2013 (gmt 0)

You've apparently been residing on another planet for the past decade!


Actually, what I have been doing for the last decade and more has been the realisation and enjoyment of the services provided by Hurricane Electric as the best on the planet, which is why I was curious about the fuss.

bots hosted by Hurricane Electric


Is there such a thing as an ISP of any significant size who doesn't have bots and spammers hidden amongst their clientele?

dstiles




msg:4568306
 7:38 pm on Apr 26, 2013 (gmt 0)

I am reasonably happy with my own server host but I also block their complete range of server IPs.

There is no reason for one server to access another outside of a mutually established relationship. Search engines are tolerated - just - but 99.99% of all bots are harmful or wasteful or both.

If a company offers broadband then let them identify this fact in their DNS records; few do, although some ISPs are now starting to add DSL to their DNS entries. If they operate only server farms or are suspect and do not identify broadband ranges they get blocked. Being the best server host on the planet has nothing to do with blocking their IP ranges.

It is my own experience that hurricane electric users have only my disinterest at heart, no matter how good the service itself may be; although I admit they are by no means the worst offender. But unless they can be shown to have broadband ranges they are completely blocked. Why would you NOT block a server farm, excepting only specific beneficial bots? I block most of G and MS, excepting only certain bot ranges: I certainly would not hesitate to block arbitrary scrapers and (quite probably) compromised servers.

I have, in the past, posted replies in the google forum hereabouts to people complaining their sites are being scraped. I suggested they employ bot-blocking methods to alleviate their problems. Not even a reply, just more complaints about scraped content!

moxie




msg:4568323
 8:29 pm on Apr 26, 2013 (gmt 0)

I agree with dstiles; since the beginning I have blocked all IP ranges from our hosting provider, and they are "the best on the planet". ;)

keyplyr




msg:4568373
 10:55 pm on Apr 26, 2013 (gmt 0)


@dstiles - Do you have other ranges assigned to Google Apps? (sorry I missed them if posted earlier.)

Kendo




msg:4568391
 1:08 am on Apr 27, 2013 (gmt 0)

Why would you NOT block a server farm


By "server farm" you are probably referring to a data center that provides hosting for colocated and owned servers that will include a large variety of services:

- sites/services that you have used in the past and probably still use
- DNS servers for hundreds of clients and possibly thousands of domains
- mail services for hundreds of companies (not just ADSL users)
- ecommerce gateways linked to banks
- caching servers
- proxy gateways
- website mirrors
- perhaps one of your social networks
- this forum is sitting on one of them somewhere!

A while back I was managing an ISP, back in the days of modem dial-up, and our users were having problems contacting family and friends overseas. The most common cause was that networks in Europe were blocking the IP range that we lived in, assuming that the whole range belonged to Taiwan when in fact it was broken up and shared by many different locations like us in Australia.

Also, often I have seen ownership of an IP range listed as belonging to a major provider and the actual usage located on a different continent.

keyplyr




msg:4568396
 1:31 am on Apr 27, 2013 (gmt 0)


Kendo, you're missing the point. We are blocking hits to our web sites coming from server farm ranges. These ranges have no reason to send requests to our web site servers other than to steal content, gather data for their own use, or to hack and inject malicious scripts.

Sure, some websites hosted at server farms (e.g. Hurricane Electric and others) offer great services. But these sites have no business hitting our servers. Yes, there are a few beneficial bots that many of us allow... the rest we block.

Kendo




msg:4568399
 1:37 am on Apr 27, 2013 (gmt 0)

This is kinda related...

Yesterday I received abuse from someone who had received a support email from moi. His complaint was about the MailScanner warning in bold red letters about his beloved web site being a scam. Well, I thought that we were sending and receiving via our own mail server, which is a virtual server hosted somewhere near New York. So I mistakenly assumed that it was his service using MailScanner. But this wasn't the case, because we must have switched over to our local ADSL provider's service while the data center re-routed their network... and it was never changed back.

The point? Lots of IT professionals use their own mail servers and they can be hosted anywhere, perhaps in one of the "server farms" that you have blocked. Do you really think that rabidly blocking whole IP ranges is good for your business?

Kendo




msg:4568402
 1:49 am on Apr 27, 2013 (gmt 0)

no reason to send requests to our web site servers other than to steal content


Hey, I feel the same pain. You should see our block list. In fact most of our sites run scripts to weed out non-persons-of-interest that trap most bots and scrapers. Well, that is, commonly used scrapers that are provided as free open sources of annoyance.

But what about the bots that will create membership on your site, reply to email validation and then log in to scrape member content? Anyone seriously scraping content will have such an app custom designed. I see projects like this advertised on outsourcing services all the time.

If you have forums a heck of a lot of hits will appear to come from genuine surfers who are really using SEO forum spam software that may never be seen coming from the same IP address.

wilderness




msg:4568404
 2:01 am on Apr 27, 2013 (gmt 0)

The point? Lots of IT professionals use their own mail servers and they can be hosted anywhere, perhaps in one of the "server farms" that you have blocked. Do you really think that rabidly blocking whole IP ranges is good for your business?


In this instance you're using the term "IT professionals" rather loosely.
Any professional that uses his personal email servers for his IT business has a few marbles missing.

I hope the day never comes when email logs are integrated into "websites raw visitor logs", as that would be a real nightmare. Viewing them via my router logs is already a nightmare.

The primary premise of this forum for more than a decade has been the following:

"Each webmaster must determine what is beneficial or detrimental to their own website (s)."

keyplyr and I agree on many methods (IPs, UAs and more); at the same time, we disagree on many others, because we each have different goals and detriments.

Behind the same logic (beneficial or detrimental) is the reason that there does NOT exist a standard set of IPs and UAs to "black-list" in place for all websites to copy and paste.

Behind the same logic (beneficial or detrimental) is also the reason that nothing similar exists for "white-listing".

keyplyr




msg:4568421
 4:42 am on Apr 27, 2013 (gmt 0)

The point? Lots of IT professionals use their own mail servers and they can be hosted anywhere, perhaps in one of the "server farms" that you have blocked. Do you really think that rabidly blocking whole IP ranges is good for your business?

Not related. Blocking an IP from requesting files from my web server has nothing to do with sending mail from one mail server to another. I also agree with what wilderness spelled out.

Kendo, since you've been a member at WW since 2005 it's hard to imagine you haven't read the thousands of posts regarding server farms and the logic behind blocking them. I'm beginning to think you just like to argue.

dstiles




msg:4568553
 6:36 pm on Apr 27, 2013 (gmt 0)

keyplyr - I have the usual set of G IPs, all of which are blocked except those assigned crawler status in DNS and a very small range of utilities. This range was a new range, set up in DNS late-ish last year. I can post a full list of what I have if required.
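One common way to check "crawler status in DNS" is a forward-confirmed reverse lookup: resolve the IP to a hostname, check the hostname suffix, then resolve that hostname back and confirm it returns the same IP. The sketch below is only an illustration under assumptions: the suffix list, the function names and the injectable resolver arguments are hypothetical, and only IPv4 forward lookups are handled.

```python
import socket
from typing import Callable, List

# Hostname suffixes accepted as crawler hosts; an assumption for this sketch,
# adjust to whatever each search engine documents for its own bots.
CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip: str) -> str:
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> List[str]:
    """A-record lookup: hostname to its list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_crawler(
    ip: str,
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """True only if the PTR name has an accepted suffix AND resolves back to ip."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(CRAWLER_SUFFIXES):
            return False
        return ip in forward(host)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses:
        # treat any failed lookup as "not verified".
        return False
```

The resolver functions are passed in as arguments so the logic can be exercised without touching the network; in real use the defaults do live DNS lookups, so results should be cached.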

kendo - you are confusing web and other internet services. Mail, FTP, SSH etc should never access web sites on ports 80 or 443. Apart from which, they run on different servers (although often on the same computer); I run mail, FTP and web servers on a single computer and each service has a different server and associated ports. Mail servers run on such ports as 25, 587, 110 and will never send get or post requests to web servers. Hence they can be blocked as unwanted. If any mail service tries to access our web sites they are doing so "illegally". Those of us running web forms or webmail make our own inter-service arrangements.
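To make the web-versus-mail distinction concrete, here is a small sketch of a deny rule that is scoped to web ports only. It is an illustration, not how any particular server or firewall stores its rules: `DenyRule`, `RULES` and `is_denied` are hypothetical names, and the network used is one of the ranges quoted elsewhere in this thread.

```python
import ipaddress
from dataclasses import dataclass
from typing import FrozenSet

# Ports treated as "web" in this sketch.
WEB_PORTS = frozenset({80, 443})


@dataclass(frozen=True)
class DenyRule:
    """Deny a network, but only for the listed destination ports."""
    network: ipaddress.IPv4Network
    ports: FrozenSet[int] = WEB_PORTS

    def matches(self, remote_addr: str, dest_port: int) -> bool:
        return (dest_port in self.ports
                and ipaddress.ip_address(remote_addr) in self.network)


# Example rule using a range quoted in this thread (hypothetical rule set).
RULES = [DenyRule(ipaddress.ip_network("65.19.128.0/18"))]


def is_denied(remote_addr: str, dest_port: int) -> bool:
    """True if any rule denies this source address on this destination port."""
    return any(rule.matches(remote_addr, dest_port) for rule in RULES)
```

With this rule set, a request from 65.19.130.5 to port 80 is denied while a connection from the same address to port 25 is not, which is the separation described above.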

As we've said, there are only a few bots we allow or should allow.

As to "IT professionals" using their own mail servers - in some cases that is valid, providing the server is hosted on a proper server with a fixed IP and proper DNS setup. I do that for myself and my clients. I've seen (and blocked) a fair few idiots trying to send mail from their "home" or "office" computer, often on a dynamic broadband line. They invariably claim to be "professionals" and that their mail is not rejected by other mail services. It's always fun when they turn off the computer each night.

People who run internet services of any kind need to know at least how to run their own services and preferably know a reasonable amount about services they do not operate.

Kendo




msg:4568611
 1:21 am on Apr 28, 2013 (gmt 0)


In this instance you're using the term "IT professionals" rather loosely.
Any professional that uses his personal email servers for his IT business has a few marbles missing.


Not loosely at all. Firstly, I said "use their own mail servers" and not "uses his personal email servers", which has a totally different meaning.

Only amateurs and lazy people might use Gmail or their ISP mail for business mail. Gmail is not secure and cannot be considered private, especially when considering that the major leakage of corporate information (data leakage) has been found to be from Gmail. ISP mail usually has too many restrictions, most of which are designed to suppress bulk mail, and at times it can be overloaded, have a long mail queue or be taken offline for regular maintenance. For example, many provincial ISPs shut down and reboot all services regularly. One major ISP in Australia reboots DNS and mail services at 3 am every morning.

So maintaining one's own mail server is the best option for anyone trading on the Internet, especially if you need to manage many domains each with several different email addresses for different staff or just to use as sacrificial anodes.

And if you are going to block my mail server then you are preventing me from signing up for or buying anything on your web site, because you will more than likely have your own mail services on the same server that is affected by your firewall rules.


But these sites have no business hitting our servers.

My point is that you don't know that for sure. I can think of scores of reasons for those hits without them being a site scraper.


I'm beginning to think you just like to argue.

Argument or open discussion of alternatives?

keyplyr




msg:4568614
 1:53 am on Apr 28, 2013 (gmt 0)



So Kendo... which server farm do you work for?

wilderness




msg:4568615
 1:57 am on Apr 28, 2013 (gmt 0)

So Kendo... which server farm do you work for?


Hurricane is a no brainer ;)

keyplyr




msg:4568630
 3:42 am on Apr 28, 2013 (gmt 0)

Any Hurricane ranges I've missed? Thanks in advance :)

64.62.128.0 - 64.62.255.255
64.62.138.0/25

64.71.128.0 - 64.71.191.255
64.71.175.0/24

65.19.128.0 - 65.19.191.255
65.19.128.0/18

65.49.0.0 - 65.49.127.255
65.49.0.0/17

72.52.64.0 - 72.52.127.255
72.52.64.0/18

74.82.0.0 - 74.82.63.255
74.82.0.0/18

184.104.0.0 - 184.105.255.255
184.104.0.0/15

209.51.160.0 - 209.51.191.255
209.51.160.0/19

216.218.128.0 - 216.218.255.255
216.218.128.0/17
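When a list pairs each start/end range with a CIDR, the two are easy to cross-check. The sketch below does that with Python's standard `ipaddress` module for the pairs as posted above; `PAIRS`, `cidr_covers_range` and `mismatches` are names invented for this example.

```python
import ipaddress

# (first address, last address, CIDR) exactly as posted above.
PAIRS = [
    ("64.62.128.0", "64.62.255.255", "64.62.138.0/25"),
    ("64.71.128.0", "64.71.191.255", "64.71.175.0/24"),
    ("65.19.128.0", "65.19.191.255", "65.19.128.0/18"),
    ("65.49.0.0", "65.49.127.255", "65.49.0.0/17"),
    ("72.52.64.0", "72.52.127.255", "72.52.64.0/18"),
    ("74.82.0.0", "74.82.63.255", "74.82.0.0/18"),
    ("184.104.0.0", "184.105.255.255", "184.104.0.0/15"),
    ("209.51.160.0", "209.51.191.255", "209.51.160.0/19"),
    ("216.218.128.0", "216.218.255.255", "216.218.128.0/17"),
]


def cidr_covers_range(first: str, last: str, cidr: str) -> bool:
    """True if the CIDR block starts at `first` and ends at `last`."""
    net = ipaddress.ip_network(cidr, strict=False)
    return (net.network_address == ipaddress.ip_address(first)
            and net.broadcast_address == ipaddress.ip_address(last))


def mismatches(pairs):
    """Return the pairs whose CIDR does not span the stated range."""
    return [p for p in pairs if not cidr_covers_range(*p)]


if __name__ == "__main__":
    for first, last, cidr in mismatches(PAIRS):
        print(f"{cidr} does not span {first} - {last}")
```

Run over the list as posted, this flags the first two entries: the /25 and /24 shown are narrower than the ranges beside them, which would correspond to 64.62.128.0/17 and 64.71.128.0/18.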

dstiles




msg:4568989
 6:31 pm on Apr 29, 2013 (gmt 0)

kendo - you really need to read up on mail and web differences. There is no way that me blocking your mail server from accessing my WEB site on port 80 will affect the capability of your mail server from accepting mail from my on-server mail server sending on port 25. Here, we all block certain types of WEB activity. How we block MAIL activity is entirely different and is carried out via the mail server, not the web server.

keyplyr - Some of those are a bit short! I have the following, all blocked:

64.62.128.0 - 64.62.255.255
64.71.128.0 - 64.71.191.255
65.19.128.0 - 65.19.191.255
65.49.0.0 - 65.49.127.255
66.220.0.0 - 66.220.31.255
72.52.64.0 - 72.52.127.255
74.82.0.0 - 74.82.63.255
184.104.0.0 - 184.105.255.255
199.192.152.0 - 199.192.159.255
204.140.16.0 - 204.140.31.255
209.51.160.0 - 209.51.191.255
216.66.0.0 - 216.66.95.255
216.218.128.0 - 216.218.255.255
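For anyone who needs these start/end ranges in CIDR form, Python's standard `ipaddress.summarize_address_range` does the conversion. This is a small sketch; `range_to_cidrs` is a helper name made up here. Note that a range does not always collapse to a single block.

```python
import ipaddress
from typing import List


def range_to_cidrs(first: str, last: str) -> List[str]:
    """Convert an inclusive start/end range into the minimal list of CIDR blocks."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in networks]


if __name__ == "__main__":
    print(range_to_cidrs("66.220.0.0", "66.220.31.255"))
    print(range_to_cidrs("199.192.152.0", "199.192.159.255"))
    # This one needs two blocks because 96 /24s is not a power of two.
    print(range_to_cidrs("216.66.0.0", "216.66.95.255"))
```

The same helper works for IPv6 ranges such as the 2604:5800:: entry in the first post of this thread.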

keyplyr




msg:4569085
 12:13 am on Apr 30, 2013 (gmt 0)

Thanks, had those other ranges, just missed them when posting. None look short.

I have:
204.140.16.0 - 204.140.31.255
204.140.16.0/20
as LISINC, Connex Internet Services which looks to be current host.

wilderness




msg:4569102
 1:29 am on Apr 30, 2013 (gmt 0)

as LISINC, Connex Internet Services


ditto, figured it was the result of a stray mouse ;)

dstiles




msg:4569413
 7:55 pm on Apr 30, 2013 (gmt 0)

Thanks for the change: my entry was only a year old, too. :(

Kendo




msg:4569551
 6:34 am on May 1, 2013 (gmt 0)

kendo - you really need to read up on mail and web differences. There is no way that me blocking your mail server from accessing my WEB site on port 80 will affect the capability of your mail server from accepting mail from my on-server mail server sending on port 25


This is the first mention of blocking port 80 in this thread and it is most important because most visitors will not know that web traffic can be limited to certain ports. If they do not know that, then they are likely to block the IP outright and the methods for doing that are diverse.

When someone adds rules to block an IP range, will they indeed specify that the rule only applies to port 80, 8080, 443 or any of the many different ports that are used by different web editors and web services such as ftp, chats, conferencing, media streaming, etc?

Would not these people noticing such bots and not really knowing what their real purpose is, be inclined to block their IP ranges from all services (across all ports) and be done with it?

Kendo




msg:4569556
 6:40 am on May 1, 2013 (gmt 0)

So Kendo... which server farm do you work for?


None. But I do provide support to hundreds of IT managers to solve problems with their own web apps and here's a question...

How do you block users running TOR software:

- on linux servers?
- on Windows servers?

keyplyr




msg:4569611
 9:38 am on May 1, 2013 (gmt 0)

How do you block users running TOR software:

- on linux servers?
- on Windows servers?

Let's get back On Topic shall we. Kendo please ask these types of questions in the proper forum.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved