pendanticist

msg:837807 | 12:43 am on Feb 17, 2004 (gmt 0) |
[webmasterworld.com...]
|
4serendipity

msg:837808 | 3:06 am on Feb 17, 2004 (gmt 0) |
Yahlurp lives :) I spotted Yahlurp in a couple logs this morning.
|
Brett_Tabke

msg:837809 | 3:14 am on Feb 17, 2004 (gmt 0) |
That is straight from Yahoo (thanks Tim [webmasterworld.com])
|
markis00

msg:837810 | 3:41 am on Feb 17, 2004 (gmt 0) |
Let's hope they increase the speed of the spider!
|
4serendipity

msg:837811 | 3:44 am on Feb 17, 2004 (gmt 0) |
To verify, the UA mentioned in the first post was exactly what I saw in the logs: Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
|
sidyadav

msg:837812 | 5:15 am on Feb 17, 2004 (gmt 0) |
Cool :) Now we have Yahoo's own noisily eating Web Spider :) (Look at definition #4 of slurp [google.com] at Google.) Sid
|
4serendipity

msg:837813 | 7:59 am on Feb 17, 2004 (gmt 0) |
| Now we have Yahoo's own noisily eating Web Spider :) (Look at definition #4 of slurp at Google.) |
| Yeah, it always gives me a kick how web-centric the Google define: function is. You have to scroll waaay down the list for define:yahoo to find the traditional meaning of the term.
|
Kirby

msg:837814 | 5:01 pm on Feb 17, 2004 (gmt 0) |
Anyone notice that Yahoo is archiving as well? [help.yahoo.com...]
|
GlynMusica

msg:837815 | 9:41 am on Feb 18, 2004 (gmt 0) |
What I am seeing is that sites that were kicked out of Google following the recent filter hoohah are included in the Yahoo listings. Which is nice. For a start.
|
Wail

msg:837816 | 3:31 pm on Feb 18, 2004 (gmt 0) |
I think this was a brave move by Yahoo! I think they did the right thing. I agree with their reason for keeping the Slurp name. Backwards compatibility, it's everyone's friend. Why's it brave? Slurp was hardly a prestigious spider. I think it was pretty dumb and fairly cowardly; especially for query strings. I'm shooting from the hip but I suspect the most intelligent spider Yahoo! bought on their search-engine-fest was FAST. I'd see Fast Yahoo! as a more serious threat to Googlebot, but although I shouldn't judge the new spider by its name, I still snigger when I see Slurp's user agent. Yahoo Shopper, of course, has been bimbling around robotland for a while now.
|
4serendipity

msg:837817 | 9:54 pm on Feb 18, 2004 (gmt 0) |
| Anyone notice that Yahoo is archiving as well? |
| Yes, and after a quick review of some large pages, it looks like it is caching pages over 101KB in size.
|
Tim

msg:837818 | 10:09 pm on Feb 18, 2004 (gmt 0) |
We index 500K for HTML, larger than Google's 101K. This is published in SearchDay.
|
Powdork

msg:837819 | 4:06 am on Feb 19, 2004 (gmt 0) |
Can we call you TimothY!?
|
newsphinx

msg:837820 | 8:03 am on Feb 19, 2004 (gmt 0) |
Spotted +Yahoo!+Slurp in my logs. However, it does not appear in my Sawmill report.
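A minimal sketch of matching the spider in a raw log, assuming the log writes spaces in the user-agent field as `+` (as the string quoted above suggests). Whether this is why Sawmill misses it is not established here. The function names and the sample strings are hypothetical, and the user agents are shortened because the full string is truncated in this thread.

```python
def normalize_user_agent(raw: str) -> str:
    """Turn a '+'-separated user-agent field back into a space-separated one.

    Only meant for matching: a literal '+' in the agent is also replaced.
    """
    return raw.replace("+", " ")


def is_yahoo_slurp(raw_user_agent: str) -> bool:
    """Return True if the field names Yahoo! Slurp in either spelling."""
    return "yahoo! slurp" in normalize_user_agent(raw_user_agent).lower()


# Hypothetical, shortened user-agent fields for illustration only.
samples = [
    "Mozilla/5.0+(compatible;+Yahoo!+Slurp)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp)",
    "Googlebot/2.1",
]

hits = [ua for ua in samples if is_yahoo_slurp(ua)]
print(len(hits), "of", len(samples), "sample agents look like Slurp")
```

A report filter that looks for the literal text "Yahoo! Slurp" would not see the `+` form unless it normalizes the field first.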
|
stripey

msg:837821 | 1:09 pm on Feb 19, 2004 (gmt 0) |
Slurpoo! (heheheh) ;-)
|
farside847

msg:837822 | 9:03 pm on Feb 19, 2004 (gmt 0) |
The new Y! spider is hitting files banned by my robots.txt file... Anyone else seeing this?
|
misja

msg:837823 | 10:15 pm on Feb 19, 2004 (gmt 0) |
Well, as far as I can see it's obeying robots.txt, but I think Yahoo is rewriting some of my URLs, like site.com/directory/?search=test to site.com/directory?search=test, while my robots.txt says:
Disallow: /directory/
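Why the rewritten URL slips through can be sketched with Python's standard `urllib.robotparser`, which matches Disallow rules by path prefix. The `User-agent: *` line, the `http://` scheme, and the `Slurp` agent token are assumptions added for the sketch; only the Disallow line and the two URLs come from the post.

```python
from urllib.robotparser import RobotFileParser

# Rules as described in the post; the User-agent line is an assumption.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /directory/",
])

original = "http://site.com/directory/?search=test"
rewritten = "http://site.com/directory?search=test"

# The rule is a prefix match on the path, so the trailing slash matters.
original_allowed = rules.can_fetch("Slurp", original)
rewritten_allowed = rules.can_fetch("Slurp", rewritten)

print("original allowed: ", original_allowed)
print("rewritten allowed:", rewritten_allowed)
```

Under this reading the spider can be obeying robots.txt and still fetch the rewritten URL, because `/directory?search=test` does not begin with `/directory/`. A broader rule such as `Disallow: /directory` would cover both forms, at the cost of also matching any other path that starts with `/directory`.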
|
4serendipity

msg:837824 | 2:18 am on Feb 20, 2004 (gmt 0) |
Tim, thanks for the heads up about the searchday article
|
a_chameleon

msg:837825 | 9:05 pm on Feb 20, 2004 (gmt 0) |
I'm seeing a bot in my log files: | Yahoo-MMCrawler/3.x (mm dash crawler at trd dot overture dot com) |
| that resolves to mmscrm03-1.sac.overture.com, and the whois says overture.com; but I also get:
Network Data
Network id#: 1
Qwest Cybercenters QWEST-CYBERCENTER-2 (NET-66-77-0-0-1) 66.77.0.0 - 66.77.255.255
Fast Search, Inc. QWEST-MCC-FASTSRCH3 (NET-66-77-73-0-1) 66.77.73.0 - 66.77.73.255
ARIN WHOIS database, last updated 2004-02-19 19:15 |
| Is this a new crawler that's using a FAST/Overture combo of some sort?
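The two WHOIS entries are not contradictory: the Fast Search block sits inside the larger Qwest allocation. A small sketch with Python's standard `ipaddress` module shows the nesting. The two ranges are taken from the WHOIS output above; the crawler address and the helper name are hypothetical, since the post does not give the actual IP.

```python
import ipaddress


def range_to_network(first: str, last: str) -> ipaddress.IPv4Network:
    """Convert a first-last address range into a single CIDR network."""
    networks = list(ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)))
    if len(networks) != 1:
        raise ValueError("range does not map to exactly one CIDR block")
    return networks[0]


# Ranges copied from the WHOIS output in the post.
qwest = range_to_network("66.77.0.0", "66.77.255.255")
fast = range_to_network("66.77.73.0", "66.77.73.255")

# Hypothetical address inside the Fast Search block, for illustration only.
crawler_ip = ipaddress.IPv4Address("66.77.73.10")

print("Qwest block:", qwest)
print("Fast Search block:", fast)
print("Fast block inside Qwest block:", fast.subnet_of(qwest))
print("Sample IP in both:", crawler_ip in fast and crawler_ip in qwest)
```

An address in the smaller block matches both records, which is why a lookup lists the hosting provider and its customer together.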
|
Tim

msg:837826 | 9:13 pm on Feb 20, 2004 (gmt 0) |
Multimedia crawler for Alltheweb, AV and other Overture partners Index.
|
bonanza

msg:837827 | 12:41 pm on Feb 21, 2004 (gmt 0) |
The Slurp spider has been checking robots.txt several times a day, almost hourly, but nothing else, for weeks now. I find myself reading that behavior like tea leaves, and I don't like it. The site is well established but absent from the Y! serps. On the other hand, the Yahoo Seeker spider is spidering deeply. *shrug*
|
cityres

msg:837828 | 10:07 pm on Feb 22, 2004 (gmt 0) |
So what is Yahoo using currently? Inktomi or is it their own engine now? How does it work? Any tips on optimization?
|
Wail

msg:837829 | 9:59 am on Feb 23, 2004 (gmt 0) |
| Multimedia crawler for Alltheweb, AV and other Overture partners Index. |
| Hi Tim, it's great to have you on the forums. Is this a fair assessment?
+ Scooter still finds results for AV, and AV is still a stand-alone search engine in its own right.
+ FAST still finds results for AtW, and AtW is still a stand-alone search engine in its own right.
However, new technologies developed by Yahoo! are likely to be single items shared by all the search engines owned by the company? Yahoo-MMCrawler is an example of this?
I think the new Yahoo results look really good. The web page summaries are especially fair.
|