
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 43 message thread spans 2 pages: 43 ( [1] 2 > >     
new private search engine?
jeteye.com
macrost




msg:407823
 5:32 pm on Jul 11, 2004 (gmt 0)

Been having this "engine" spider me today. Will post IP and other details in a bit.

 

macrost




msg:407824
 5:43 am on Jul 12, 2004 (gmt 0)

Well, waited for a while for the thread to be posted. Found out it was gigabot. Anyone seen any activity from this bot lately?

tafkar




msg:407825
 10:06 am on Jul 12, 2004 (gmt 0)

Gigabot hit a few of my sites yesterday. Each time it grabbed only the robots.txt and the index page, nothing more.

fiestagirl




msg:407826
 2:25 pm on Jul 12, 2004 (gmt 0)

Started seeing it 7/9.
UA: Gigabot/1.0
IP: 64.62.142.226 - 64.62.142.246
Host: jeta.jeteye.com (64.62.142.231)
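To check your own logs against the range reported above, here is a minimal Python sketch. It assumes the range is inclusive and that the client IP is the first whitespace-separated field of each log line, as in common access-log formats. The function names are made up for illustration.

```python
import ipaddress

# Range as reported in the post above; treating both ends as inclusive is an assumption.
RANGE_START = ipaddress.IPv4Address("64.62.142.226")
RANGE_END = ipaddress.IPv4Address("64.62.142.246")


def in_reported_range(ip: str) -> bool:
    """Return True if ip falls inside the reported crawler range."""
    addr = ipaddress.IPv4Address(ip)
    return RANGE_START <= addr <= RANGE_END


def crawler_hits(log_lines):
    """Yield log lines whose first field is an IPv4 address in the range."""
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            if in_reported_range(fields[0]):
                yield line
        except ipaddress.AddressValueError:
            continue  # first field was not an IPv4 address
```

Passing an open log file to `crawler_hits` should work as-is, since a file object iterates line by line.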

volatilegx




msg:407827
 2:29 pm on Jul 12, 2004 (gmt 0)

Excellent! Thanks macrost.

fiestagirl




msg:407828
 4:52 am on Jul 27, 2004 (gmt 0)

Now it's crawling with the UA: Jetbot/1.0

balam




msg:407829
 3:10 pm on Jul 27, 2004 (gmt 0)

Is this someone perhaps licensing technology from Gigablast?

It was only a couple of days ago that I saw the Gigabot go from v1.0 to v2.0...

(I've also seen Gigabot visit from 207.114.174.2 to .26, as recently as a couple of weeks ago. I thought there was a post somewhere around here from Matt, the Gigablast owner, listing IPs he crawls from, but I can't find it...)

Argyll




msg:407830
 3:46 pm on Jul 27, 2004 (gmt 0)

JetBot/1.0 has been hitting everything on my site, including trackbacks and comments. Is it benign, or is there no way to tell?

D_Blackwell




msg:407831
 10:36 pm on Jul 31, 2004 (gmt 0)

Jetbot/1.0 - Took the robots.txt and left. (Hit 404 redirect first.)

macrost




msg:407832
 2:19 pm on Aug 2, 2004 (gmt 0)

Ok, seems that this bot is well... stupid in some regards. It sees what the url looks like, but sometimes doesn't follow the actual link. Kinda weird, it generated 8 errors within my application because of that yesterday.

Basically, instead of following what the actual href is, it will take something out of the href display and add it to the actual href.

Am I making any sense? :)

JetEye




msg:407833
 12:58 am on Aug 5, 2004 (gmt 0)

Hello all. I meant to post this sooner. Yes, JetBot is the spider for JetEye.com, a new Internet technology company with a public beta due in a couple months. Yes, we are benign, and yes we are licensing some technology from GigaBlast. So JetBot is as well behaved as gigabot. If you have any questions or concerns you can use the contact page on our site at [jeteye.com...]

All the best,
The JetEye Team

volatilegx




msg:407834
 7:14 pm on Aug 5, 2004 (gmt 0)

Thanks for posting your info JetEye, and welcome to WebmasterWorld.

nativenewyorker




msg:407835
 11:01 pm on Aug 8, 2004 (gmt 0)

JetEye,

Why is your site so vague about your business? What do you actually mean by JetBot being benign? Am I supposed to interpret that as the spider is not a mail harvester?

Given that JetEye is collecting all this info and offering limited access via a login, my suspicions lead me to believe that JetBot is a spybot.

D_Blackwell




msg:407836
 11:20 pm on Aug 8, 2004 (gmt 0)

Aren't they all? We just decide which ones we like - targeted and qualified traffic generators:))

IRWINjim




msg:407837
 11:29 am on Aug 9, 2004 (gmt 0)

I also was visited by JetBot. If only googlebot indexed as many pages as JetBot, I'd be happy.

Wonder what JetBot is? My site had lots of cities & states on it. Is that consistent with other sites it's indexed?

uncle_bob




msg:407838
 11:41 am on Aug 9, 2004 (gmt 0)

JetBot seems to be acting up over the weekend. I have a site on example.com with a 301 redirect from www.example.com to example.com. There are no inbound links anywhere on the web pointing to www.example.com; all links point to example.com, yet JetBot repeatedly requests pages from www.example.com. This includes the robots.txt page, which it then ignores.

Finally decided to ban it at the firewall.

chrisnrae




msg:407839
 12:49 pm on Aug 9, 2004 (gmt 0)

Interesting. I was wondering what that was when I saw it in my logs. Anyone have any information on how this company plans to gain any market share worth letting them spider me for?

volatilegx




msg:407840
 4:09 pm on Aug 9, 2004 (gmt 0)

Given that JetEye is collecting all this info and offering limited access via a login, my suspicions lead me to believe that JetBot is a spybot.

Looks like they are not quite ready for a public beta test, that's probably why a login is required. I don't know why that would set any alarms ringing.

surfin2u




msg:407841
 12:45 am on Aug 13, 2004 (gmt 0)

Jetbot has visited 500 of my site's pages so far today. It's still at it, but seems well behaved in that it's taking them slowly.

This is the first time I've noticed them. I agree that it's annoying that they (jetbot/jeteye) won't reveal anything about what they're up to. Has anyone registered there or contacted them to try to get more info? (now don't violate that NDA of theirs ;-)

wilderness




msg:407842
 2:04 am on Aug 13, 2004 (gmt 0)

When the UA was first submitted here, I added it into my robots.txt.

Thus far, the bot has respected that entry on my sites.
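The post does not show the actual entry, so the following is only a sketch of what such a robots.txt might look like. It is checked here with Python's standard `urllib.robotparser`. The `Jetbot` and `Gigabot` tokens are taken from the UA strings reported earlier in the thread. Whether the real crawler matches on exactly those tokens is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out the two crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: Jetbot
Disallow: /

User-agent: Gigabot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this UA may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

This only models what a well-behaved crawler would do with the file. Nothing in robots.txt stops a bot that chooses to ignore it.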

surfin2u




msg:407843
 1:00 pm on Aug 13, 2004 (gmt 0)

I can't decide whether to block them or not. They could turn out to be a good source of traffic for me in the future. Anyone know a good reason not to wait and see?

wilderness




msg:407844
 2:03 pm on Aug 13, 2004 (gmt 0)

surfin,
PERSONALLY, I previously had Gigabot denied, and this bot is an extension of that technology. For me, a simple choice.

For you?
If there is any possibility that the bot will increase desired traffic to your site(s), then why deny it?

surfin2u




msg:407845
 2:26 pm on Aug 13, 2004 (gmt 0)

Thanks, wilderness. Why did you decide to block Gigabot?

Lord Majestic




msg:407846
 2:28 pm on Aug 13, 2004 (gmt 0)

Anyone know a good reason not to wait and see?

Would you pick a free lottery ticket with some potential of win, big or small?

And what are the costs of having these crawled documents served to that spider, Gigabot or not? Next to zero? Then why even contemplate banning it so long as it does not overload your site?

Easy: live and let live - you might benefit from this in the future.

wilderness




msg:407847
 3:07 pm on Aug 13, 2004 (gmt 0)

Why did you decide to block Gigabot?

surfin,
hopefully it's not a misunderstanding of words?

I relate the use of "block" to htaccess.
"Deny" may be used in both htaccess and robots.txt; however, in those two contexts it has entirely different meanings.

It's not my desire to appear facetious here, so please do not interpret it that way.

Giga is listed in my robots.txt and has thus far honored that request as has JET.

I do NOT have giga in my htaccess under a reference to either the UA or IP range.
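To make the distinction concrete: a robots.txt entry is a request that the crawler may or may not honor, while a server-side block refuses the request regardless of what the crawler wants. The sketch below shows the second kind as a tiny Python WSGI app rather than as htaccess rules. The block list and the function names are hypothetical and are not taken from anyone's actual configuration.

```python
# Hypothetical list of user-agent substrings to refuse at the server.
BLOCKED_UA_SUBSTRINGS = ("gigabot", "jetbot")


def is_blocked(user_agent: str) -> bool:
    """Case-insensitive substring match against the block list."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_SUBSTRINGS)


def app(environ, start_response):
    """Tiny WSGI app: refuse blocked user agents, serve everyone else."""
    if is_blocked(environ.get("HTTP_USER_AGENT", "")):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]
```

A UA check like this is easy for a bot to evade by changing its user-agent string, which is why blocks are often keyed on IP range as well.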

I'm not exactly sure when I began utilizing htaccess. It was even before I came to Webmaster World.
Over time I have simply made decisions based on what had transpired and on what might possibly transpire.
In the beginning, I didn't keep the detailed notes to myself for later reference that I do today.

When my sites first began, I used one of those submission pages, and I recall Gigabot being part of those submissions.
Some bots are just not worth the effort they take for what they return to websites. I've "apparently" made that determination about giga without documenting it in my notes. Were it a major personal issue, I'd go back into my monthly back-ups, reviewing the logs in the process, and make an evaluation. It's just not an urgent issue.

Jeeves is another that has far too much activity at my sites for the small amount of visitors the SE returns.

surfin2u




msg:407848
 6:24 pm on Aug 13, 2004 (gmt 0)

I have heard good reasons to allow these spiders access to my site and none to deny them, so I'll let them have their fun. Thanks for your help.

Rugles




msg:407849
 2:49 pm on Aug 20, 2004 (gmt 0)

Real nice he shows up here, but jeteye, you need to provide a little more info.
You are grabbing hundreds of my pages, why?

Docsboard




msg:407850
 4:10 pm on Sep 27, 2004 (gmt 0)

Jetbot has indexed EVERY page of my forums and has been on my site for 6 days, 24-7.
I am getting a bit nervous reading here whether it is some sort of spam harvester.

Come back and tell us what you are doing, Jeteye.

idoc




msg:407851
 3:21 pm on Sep 28, 2004 (gmt 0)

I banned them straight off by i.p. when I saw their homepage. You need to really be careful these days who you let download your content in whole... especially with all the cloaking and hijacking going on. Not to imply this is going to happen from this bot... only that to me the risk is not worth the perceived benefit to my sites... not that in this case I see any perceived benefit to start with.

surfin2u




msg:407852
 12:53 pm on Oct 1, 2004 (gmt 0)

I can understand wanting to block spam harvesters from getting email addresses. I accomplish that by requiring anyone interested in obtaining an email address to allow me to set a cookie in their browser first. I use the cookie to count how many email addresses they've taken today and cut them off after 5.
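The post gives no implementation details, so this is only a rough Python sketch of the idea. The cookie format (`YYYY-MM-DD:count`) and the function name are assumptions made for illustration; the limit of 5 comes from the post.

```python
from datetime import date

DAILY_LIMIT = 5  # cutoff mentioned in the post


def handle_email_request(cookie, today):
    """Return (reveal_address, cookie_to_set).

    cookie is the raw value the browser sent back, or None if it sent none.
    The 'YYYY-MM-DD:count' format is a made-up convention for this sketch.
    """
    day = today.isoformat()
    if cookie is None:
        # No cookie yet: set one and make the visitor ask again.
        return False, f"{day}:0"
    try:
        cookie_day, count_text = cookie.split(":")
        count = int(count_text)
    except ValueError:
        return False, f"{day}:0"  # malformed cookie: start over
    if cookie_day != day:
        count = 0  # new day, counter resets
    if count >= DAILY_LIMIT:
        return False, f"{day}:{count}"
    return True, f"{day}:{count + 1}"
```

A plain counter cookie like this can be edited or cleared by the client. It works mainly because typical harvesters do not return cookies at all, so they never get past the first step.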

As for sites that view all of my content, I don't have a problem with that. I want my site to be more widely known and allowing bots to view the content is a good way to do that.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved