Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Hailoo Search
Hailoo? Hailoo? Anybody Home?
incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3883354 posted 7:32 pm on Apr 1, 2009 (gmt 0)

I keep getting hit furiously by something whose reverse DNS claims it's from hailoo.com, which is a dead domain.

IP: 38.105.244.nnn -> hailoo.com
"Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.16) Gecko/20080702 Iceweasel/2.0.0.16 (Debian-2.0.0.16-0etch1)"

If you didn't know, Iceweasel is the name of Firefox on Debian, because Debian doesn't use the Mozilla build of Firefox, which causes licensing issues. It's more legal nonsense than I have ever read in one sitting, but truth is stranger than fiction:

[en.wikipedia.org...]

Anyway, back to the source of this IP...

Whois on the IP says:
network:Org-Name:Hailoo LLC

Searching for Hailoo LLC turns up this parked domain:

Domain Name: HAILOO.US
Registrant Organization: Hailoo Search Inc.
Registrant Email: <snip> @hailoo.com

So it all appears to be related, yet there's nothing to see anywhere.

Hailoo, is anyone there?
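
For anyone wanting to test a reverse DNS claim like this, here is a minimal sketch of a forward-confirmed reverse DNS check: look up the PTR name for the IP, then check that the name resolves back to the same IP. The function names and the 192.0.2.x address are illustrative assumptions, not anything from the logs above, and the lookups are injectable so the logic can be tried without touching the network.

```python
import socket
from typing import Callable, List, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses hostname resolves to (empty on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_forward_confirmed(
    ip: str,
    reverse: Callable[[str], Optional[str]] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True only if the PTR name for ip resolves back to ip."""
    hostname = reverse(ip)
    if hostname is None:
        return False
    return ip in forward(hostname)


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address used purely as a placeholder.
    print(is_forward_confirmed("192.0.2.10"))
```

A PTR record is set by whoever controls the IP block, so on its own it proves little; the forward lookup is what ties the name back to the address.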

 

Umbra

10+ Year Member



 
Msg#: 3883354 posted 9:04 pm on Apr 1, 2009 (gmt 0)

I just googled the domain and found a job listing. It describes the company as another tech startup with "potential for massive future growth" that's developing search for Middle Eastern users.

dstiles

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 3883354 posted 10:30 pm on Apr 1, 2009 (gmt 0)

Domain is "live" here (UK) but getting "no server" at the domain and www.domain.

Record expires on 13-Oct-2009.
Record created on 13-Oct-2005.

Registrant:
Hailoo, Dwan
Hailoo Search Inc.
(address in East Setauket, NY)
US

Whois gives the IP block 48/29 for hailoo but a different (Newark) address.

Sounds like a (badly behaved?) SE using mozilla?

Just found a bit in a job advert:
"(developing) sophisticated search technology for a large, Middle Eastern user-base..." (Linux-based)

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3883354 posted 11:22 pm on Apr 1, 2009 (gmt 0)

Whatever it is, it's definitely crawling around the web as I'm seeing a lot of referrals from image loads of my banners on other sites.

The use of Firefox could indicate someone taking screen shots, which is exactly what I saw with Snap, Searchme, etc.

Demaestro

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3883354 posted 11:45 pm on Apr 1, 2009 (gmt 0)

Bill, any signs that it is respecting or even reading robots.txt?

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3883354 posted 12:11 am on Apr 2, 2009 (gmt 0)

Firefox typically doesn't request robots.txt, and I saw no requests for it.
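
A quick way to check this from an access log is sketched below. It assumes Apache-style common or combined log format; the function name, the sample lines and the 192.0.2.x / 198.51.100.x addresses are made up for illustration and are not taken from the real logs.

```python
import re
from typing import Iterable

# Start of a common/combined log format line:
# client IP, two ignored fields, [timestamp], "METHOD path protocol"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')


def requested_robots_txt(lines: Iterable[str], ip_prefix: str) -> bool:
    """True if any client whose IP starts with ip_prefix asked for /robots.txt."""
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        client, _method, path = match.groups()
        # Ignore any query string when comparing the path.
        if client.startswith(ip_prefix) and path.split("?", 1)[0] == "/robots.txt":
            return True
    return False


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.7 - - [01/Apr/2009:19:32:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.7 - - [01/Apr/2009:19:32:01 +0000] "GET /banner.gif HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '198.51.100.3 - - [01/Apr/2009:19:33:00 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "ExampleBot/1.0"',
]

if __name__ == "__main__":
    print(requested_robots_txt(SAMPLE_LOG, "192.0.2."))
    print(requested_robots_txt(SAMPLE_LOG, "198.51.100."))
```

Keep the trailing dot on the prefix, otherwise "192.0.2.1" would also match 192.0.2.10 and the like.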

Demaestro

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3883354 posted 12:21 am on Apr 2, 2009 (gmt 0)

Oh, I thought that it was using that as the UA string but that it was acting as a bot. The registrant organization is "Hailoo Search Inc.", and since it was hitting you hard I assumed they were developing something on top of Iceweasel.

[edited by: Demaestro at 12:27 am (utc) on April 2, 2009]

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3883354 posted 12:36 am on Apr 2, 2009 (gmt 0)

That's my theory, and people mostly build screenshot tools on top of Firefox. My batting average at spotting those things has been running really high ;)

thetrasher

5+ Year Member



 
Msg#: 3883354 posted 12:01 pm on Apr 2, 2009 (gmt 0)

IP: 38.105.244.nnn -> hailoo.com
FTR: IP(hailoo.com)-3 = 38.105.244.nn -> hailoo.com

Her search engine is in stealth mode (since 2005) or her servers are (again) under foreign control (hailoo+one+of+my+computers+is+sending+out+spam).

.com, .net, .org, .biz, .info, .de, .dk, .ru, .ir, many domains for "a small hi-tech Internet start-up company based in New York", operating in stealth mode.

a large, Middle Eastern user-base
Russia is not in the Middle East.

dstiles

WebmasterWorld Senior Member, Top Contributor of All Time, 5+ Year Member



 
Msg#: 3883354 posted 5:31 pm on Apr 2, 2009 (gmt 0)

There are a few references on google in what looks to my untrained eye like Arabic. One (google-translated) is a forum that suggests it's a valid SE but another is very confusing.

It's not unusual for an SE type of company to register domains world-wide.

I don't dispute that the server(s) may be hijacked or otherwise under the control of spammers. A lot of botnets are controlled through and utilise US servers and this may be the case here, especially if the site has been taken down - although in that case it's odd there is no "hijack" site replacing it.

phred

5+ Year Member



 
Msg#: 3883354 posted 9:21 pm on Apr 2, 2009 (gmt 0)

38.105.244.nn -> hailoo.com

38.0.0.0/8, Performance Systems International Inc. = home of Voyager, Kosmix, Scoutjet, Hailoo, and those are only the ones I'm aware of.

Entire range blocked.

Phred
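
For reference, a range block like this can be tested with Python's standard ipaddress module. This is only a sketch: the 38.0.0.0/8 range is the one named above, while the function name and the test addresses are arbitrary illustrations.

```python
import ipaddress

# The one range named in this thread; add more networks as needed.
BLOCKED_NETWORKS = [ipaddress.ip_network("38.0.0.0/8")]


def is_blocked(ip: str, networks=BLOCKED_NETWORKS) -> bool:
    """True if ip falls inside any of the blocked networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    # Arbitrary example addresses, not taken from any log.
    for candidate in ("38.1.2.3", "39.0.0.1"):
        print(candidate, is_blocked(candidate))
```

Bear in mind that a /8 covers 16,777,216 addresses, so it is a blunt instrument, and it does nothing about the same agent arriving from other networks.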

Hobbs

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3883354 posted 7:15 pm on Apr 3, 2009 (gmt 0)

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.18) Gecko/20081030 Iceweasel/2.0.0.18 (Debian-2.0.0.18-0etch1)

The same user agent came from PSI's 38.99.65.nn and got denied.
It then tried from Level3 8.20.84.nn and was also denied.

That IP is showing an "Apache 2 Test Page powered by CentOS".

A slightly different agent then tried from an Egyptian DSL IP and from Austria (212.31.90.nn) and was denied both times:

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.6) Gecko/2009020409 Iceweasel/3.0.6 (Debian-3.0.6-1)

Chatter on forums includes forum owners reporting a signed in member called 'Hailoo' going through all posts fast (crawling) and one of them confirming it is a new search engine.

I'd say the name Hailoo is a fake front for a bot owner running several boxes to populate a database for another project under a different name.
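
To compare the two agents side by side, the sketch below pulls the version fields out of an Iceweasel user agent string. The regex and the function name are assumptions written against the two strings quoted in this post only; other Iceweasel builds may format things differently.

```python
import re
from typing import Dict, Optional

# Gecko revision, Gecko build date, Iceweasel version and the optional
# Debian package tag, as they appear in the two strings quoted above.
ICEWEASEL_UA = re.compile(
    r"rv:(?P<rv>[\d.]+)\) Gecko/(?P<gecko>\d+) Iceweasel/(?P<version>[\d.]+)"
    r"(?: \((?P<package>Debian-[^)]+)\))?"
)


def parse_iceweasel(ua: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the version fields of an Iceweasel UA, or None if it doesn't match."""
    match = ICEWEASEL_UA.search(ua)
    return match.groupdict() if match else None


if __name__ == "__main__":
    ua = (
        "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.18) "
        "Gecko/20081030 Iceweasel/2.0.0.18 (Debian-2.0.0.18-0etch1)"
    )
    print(parse_iceweasel(ua))
```

Matching the string only tells you what the client claims to be. It can't separate a crawler from a real Debian user, so it needs to be combined with other signals such as request rate or source network.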

Umbra

10+ Year Member



 
Msg#: 3883354 posted 8:59 pm on Apr 3, 2009 (gmt 0)

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.18) Gecko/20081030 Iceweasel/2.0.0.18 (Debian-2.0.0.18-0etch1)

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.6) Gecko/2009020409 Iceweasel/3.0.6 (Debian-3.0.6-1)

Forgive my ignorance, but aren't these user agents for an obscure but legitimate browser?

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3883354 posted 1:21 am on Apr 4, 2009 (gmt 0)

Forgive my ignorance, but aren't these user agents for an obscure but legitimate browser?

Re-read my first post, I linked out to all the information about it.

Iceweasel is a Debian build of Firefox, which isn't obscure at all, just the wacky alias.

Umbra

10+ Year Member



 
Msg#: 3883354 posted 1:52 am on Apr 4, 2009 (gmt 0)

Re-read my first post, I linked out to all the information about it.

Right, but how did Hobbs know to deny the user agent from the Egyptian and Austrian residential IPs when it could have been a real browser?

Iceweasel is a Debian build of Firefox, which isn't obscure at all, just the wacky alias.

I should've used a better word than "obscure". How about "rare"? I was implying that only a tiny minority of people use Linux for personal PCs, so it's an uncommon user agent but can still be a legitimate user.

[edited by: Umbra at 1:59 am (utc) on April 4, 2009]

Hobbs

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3883354 posted 8:28 am on Apr 4, 2009 (gmt 0)

Hi,
I'll PM you an answer to that question.

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3883354 posted 8:35 am on Apr 4, 2009 (gmt 0)

I was implying that only a tiny minority of people use Linux for personal PCs, so it's an uncommon user agent but can be still a legitimate user

It's less rare in countries other than the US; many people are trying to break the bonds of MS software.
