Ask - Teoma Forum

Deep Crawl.

 5:48 am on Sep 30, 2004 (gmt 0)

Ask is crawling my site deeper than Google has ever done; it's fetched over 60,000 pages in the last 24 hours... How often do they come up with a new index?



 8:27 am on Sep 30, 2004 (gmt 0)

Ahh - I thought it had finished crawling and was looking out for an update, must still be doing its rounds.

In my experience Jeeves has been updating on about a 6 weekly(ish) schedule - although it is more than 6 weeks since the last update, perhaps they want to get a bit more data.

Updates tend to be at weekends too (last 2 or 3 I think) - if it is still crawling it might be too early for this weekend.


 2:25 pm on Sep 30, 2004 (gmt 0)

Ya, I was surprised. It's fetched over 250,000 pages this week, which is more than all other SEs combined have done for the month.


 6:42 pm on Sep 30, 2004 (gmt 0)

Wow yeah, Teoma is on a rampage on one of my sites:

88151 Mozilla/2.0 (compatible; Ask Jeeves/Teoma)
71159 msnbot/0.11 (NULLED)
28485 msnbot/0.3 (NULLED)
23009 Googlebot/2.1 (NULLED)

Crazy, this is the first time I've seen it spider so vigorously.
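For anyone who wants a per-crawler tally like the one above from their own logs, here is a minimal Python sketch. It assumes the access log is in Apache's Combined Log Format, where the user agent is the last quoted field on each line. The sample lines, IP addresses, and the `count_user_agents` helper are made up for illustration and are not taken from the poster's site.

```python
import re
from collections import Counter

# Assumption: Combined Log Format, which ends with: "referer" "user-agent"
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its hit count."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines, for illustration only.
sample = [
    '192.0.2.1 - - [30/Sep/2004:18:42:00 +0000] "GET /a.html HTTP/1.0" 200 5120 "-" "Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"',
    '192.0.2.1 - - [30/Sep/2004:18:42:03 +0000] "GET /b.html HTTP/1.0" 200 4096 "-" "Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"',
    '192.0.2.9 - - [30/Sep/2004:18:42:05 +0000] "GET /a.html HTTP/1.1" 200 1024 "-" "Googlebot/2.1"',
]

counts = count_user_agents(sample)
# Print the busiest user agents first, count then name.
for agent, hits in counts.most_common():
    print(hits, agent)
```

To run it against a real log, pass an open file object instead of `sample`, since the function only needs an iterable of lines.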


 7:44 am on Oct 2, 2004 (gmt 0)

Jeeves/Teoma has updated today - but for my sites there is a lot more crawled data not listed.

Perhaps more updating is still being factored in......


 6:45 am on Oct 3, 2004 (gmt 0)

ya still going strong here, fetched another 60,000 pages in the last 24 hours


 10:07 am on Oct 3, 2004 (gmt 0)

Seeing this thread, I checked my latest access_log file. Sure enough, Ask/Teoma has crawled pretty much all of my one smallish site (about 130 pages), all in the span of a single 24 hour period. Crawler hits were nicely spaced, maybe 30 seconds apart, and it pulled in robots.txt repeatedly. It looks well behaved from my little corner of the net.

I'm not sure if there is any cause-effect relation here, but my single 3-letter keyword brought my site up on the first page, #10 out of 1.7 million. I can't complain.

- Larry


 3:23 pm on Oct 3, 2004 (gmt 0)

>Jeeves/Teoma has updated today - but for my sites there is a lot more crawled data not listed.

Yep. And if ask.com was the dominant search engine I could get filthy rich doing professional SEO. Ask.com isn't an easy nut to crack, but once you know exactly how to do so it's easy to rank well.


 7:55 am on Oct 10, 2004 (gmt 0)

Has anyone got much data into Jeeves following the recent crawl?

The amount of data added to the index was only a very small percentage of the pages crawled for my sites.

Bit of a waste of bandwidth unless these pages do get added. :(


 11:14 pm on Oct 12, 2004 (gmt 0)

Ask Jeeves is storming the LOFI version of my forum.
8000 hits in the last 24 hours.

I've never had links from Teoma/Ask Jeeves and this is strange!


 1:51 pm on Oct 20, 2004 (gmt 0)

The AJ/Teoma crawler is also hitting one of my sites quite hard, a page every 2 or 3 seconds, 24/7 (there are a few hundred thousand pages on the site).

Every now and then it triggers my mod_throttle rules (too many pages accessed by one IP in a given time frame). I then return a 503 error code. But it just carries on pounding away, ignoring that for hundreds more pages...
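The kind of rule described here (too many pages from one IP in a given time frame, answered with a 503) can be sketched as a per-IP sliding-window counter. This is only an illustration of the idea in Python, not how mod_throttle itself is implemented or configured; the `PerIpThrottle` class, its parameters, and the IP address are hypothetical.

```python
from collections import defaultdict, deque


class PerIpThrottle:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def status_for(self, ip, now):
        """Return the HTTP status to send for a request from ip at time now."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return 503  # Service Unavailable: the client is over the limit
        hits.append(now)
        return 200


# Hypothetical limits: 3 requests per 10 seconds from one address.
throttle = PerIpThrottle(max_requests=3, window_seconds=10)
statuses = [throttle.status_for("192.0.2.7", t) for t in (0, 2, 4, 6, 10)]
print(statuses)  # [200, 200, 200, 503, 200]
```

A 503 is only a request to back off. As the post notes, nothing forces a crawler to honor it, so a rule like this limits what gets served rather than how often the bot asks.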

Of course, none of the tens of thousands of pages it grabs per day are indexed yet. It just has one page from the site in the SERPS, a holding page from many months ago.

A key to the search engine wars is comprehensive, fresh results. They seem to be trying a bit hard for the first part of that, and failing totally on the second part.


 5:53 pm on Nov 22, 2004 (gmt 0)

AskJeeves is behaving rather strangely on my site, eating TONS of bandwidth (more so than Googlebot, and for fewer pages). Here are the stats for both:

Bot        Pages  Bandwidth
Googlebot  71606  1.95 GB
AskJeeves  51414  9.25 GB

Makes little to no sense to me.
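Dividing bandwidth by page count makes the gap easier to see. The short Python calculation below uses only the two rows quoted above. It assumes the first number is pages fetched and that the stats tool reports binary gigabytes; the ratio between the two bots does not depend on the second assumption.

```python
GIB = 1024 ** 3  # assumption: the stats tool reports binary gigabytes

# (pages fetched, gigabytes transferred) as quoted in the post
stats = {
    "Googlebot": (71606, 1.95),
    "AskJeeves": (51414, 9.25),
}


def bytes_per_page(pages, gigabytes):
    """Average transfer size per fetched page, in bytes."""
    return gigabytes * GIB / pages


per_page = {bot: bytes_per_page(*row) for bot, row in stats.items()}
ratio = per_page["AskJeeves"] / per_page["Googlebot"]

for bot, size in per_page.items():
    print(f"{bot}: {size / 1024:.1f} KiB per page")
print(f"AskJeeves transfers about {ratio:.1f}x as much per page")
```

On these figures AskJeeves pulls roughly 6.6 times as many bytes per page as Googlebot, so the difference lies in the size of each transfer rather than in the number of pages fetched.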



 8:02 pm on Nov 22, 2004 (gmt 0)

Maybe the majority of the pages spidered by Googlebot were fetched by the Mozilla/5.0 variant?

This bot supports HTTP/1.1 and accepts gzip-compressed pages, which are sometimes as small as 1/8th the transfer size of uncompressed pages.
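A rough way to get a feel for how much gzip can save is to compress a page locally and compare sizes. The Python sketch below uses made-up, highly repetitive HTML, so it will compress much better than most real pages; the actual saving depends on the content. Compression is also only applied when the client asks for it with an `Accept-Encoding: gzip` request header and the server is set up to honor that.

```python
import gzip

# Made-up, repetitive HTML standing in for a templated page (illustration only).
row = '<tr><td class="item">Example product</td><td class="price">9.99</td></tr>\n'
html = ("<html><body><table>\n" + row * 500 + "</table></body></html>").encode("utf-8")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)

print(f"uncompressed: {len(html)} bytes")
print(f"gzip:         {len(compressed)} bytes ({ratio:.1%} of original)")

# Round trip to confirm that nothing is lost.
assert gzip.decompress(compressed) == html
```

Checking the raw logs for the bytes-sent field per user agent, as the next reply suggests, is the way to see what the saving is on a real site.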


 5:54 pm on Nov 23, 2004 (gmt 0)

That's very possible. I will take a look at the raw logs and see if this is the case. It just seems strange to me that AskJeeves (which never spidered more than a few hundred pages a month before) is all of a sudden hammering my entire site. =)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved