Yahoo! Slurp only takes robots.txt

Site not refreshed by robots

         

seindal

6:25 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



I have a site, mostly an affiliate site, where Yahoo! Slurp only reads robots.txt, which it grabs a couple of times each day. It doesn't even take the homepage, only robots.txt.

This is the only kind of trace of Yahoo! Slurp I have in my logs for that site:

lj1001.inktomisearch.com - - [25/Feb/2004:16:20:31 +0100] "GET /robots.txt HTTP/1.0" 200 174 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http*://help.yahoo.com/help/us/ysearch/slurp)"

The robots.txt is very simple:

User-agent: *
Disallow: /search
Disallow: /random.html
Disallow: /questions.html
Disallow: /comments.html
Disallow: /honey.html

There is nothing in there that should keep Slurp or any other robot away.
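One way to double-check that claim is to feed the rules to a robots.txt parser and ask whether a given user agent may fetch a given path. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `is_allowed` and the user-agent string passed to it are assumptions for illustration, and Slurp's own parser may interpret rules differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /random.html
Disallow: /questions.html
Disallow: /comments.html
Disallow: /honey.html
"""


def is_allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch path (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/", "/index.html", "/search", "/honey.html"):
        print(path, is_allowed("Yahoo! Slurp", path))
```

If the homepage comes back as allowed, the robots.txt itself is probably not what keeps the crawler away.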

The site is doing OK in Google, and a Yahoo search for "site:www.example.com" gives 48000 hits, so the site is indexed, but it is not refreshed and it gets very little traffic from Yahoo.

What can I do to get Slurp to refresh my site, short of giving them all my money and still having to pray that I will be included?

Sunset_Jim

9:49 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



Can someone tell me what my server log entry is reporting?

2004-02-25 16:43:07 66.196.65.36 - ipswjw0029atl2 xx.xx.xx.xx 80 GET /robots.txt 404 251 199 0 www.widgets.net Mozilla/5.0+(compatible;+Yahoo!+Slurp;+http://help.yahoo.com/help/us/ysearch/slurp)

I would like to know if the Yahoo/Inktomi robot is crawling my site, and if it is requesting anything other than a nonexistent robots.txt file.
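A quick way to answer that kind of question is to count which paths Slurp actually requests in the access log. Here is a rough sketch in Python; the function name `slurp_paths` and the file name `access.log` are made up for illustration, and the regex simply looks for `GET` or `HEAD` followed by a path, which happens to match both the Apache-style and IIS-style lines quoted in this thread. Other log layouts may need a different pattern.

```python
import re
from collections import Counter

# Matches the request method followed by the requested path,
# e.g. '"GET /robots.txt HTTP/1.0"' or 'GET /robots.txt 404'.
REQUEST_RE = re.compile(r'\b(?:GET|HEAD)\s+(\S+)')


def slurp_paths(log_lines):
    """Count requested paths on lines whose text mentions Slurp (hypothetical helper)."""
    counts = Counter()
    for line in log_lines:
        # Case-insensitive substring check on the whole line; this is a
        # heuristic, since user-agent strings can be faked.
        if "slurp" not in line.lower():
            continue
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # "access.log" is an assumed file name; point this at your own log.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for path, hits in slurp_paths(log).most_common():
            print(hits, path)
```

If the only path that shows up is /robots.txt, the robot is checking in but not fetching any pages.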

walkman

10:05 pm on Feb 25, 2004 (gmt 0)



Same thing happens to me, and I have a penalty.

farside847

10:06 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



seindal: my guess is that the Yahoo bot is refreshing all of its robots.txt entries across all of the spiders before crawling again. The bot is actively crawling the net this week. It has been hitting many content pages on a few of my sites. In time, I bet they will hit you too.

seindal

10:20 pm on Feb 25, 2004 (gmt 0)

10+ Year Member



I have a bunch of other sites, some mostly content sites and some mostly affiliate sites, but only this one is not being crawled by Slurp. Unfortunately, it is my best earner.

I have noticed that the site does get some limited traffic from Yahoo, and as mentioned, it is present in the index.

I guess I just have to be patient and see if the robot comes around. At least I know it goes around :-)

DreamMaster

2:43 am on Feb 26, 2004 (gmt 0)



Seindal,

I'm in the same boat as you. My top-converting site, which is doing great in Google, has apparently been penalized by Inktomi/Yahoo, and it doesn't end there, because it's nowhere to be seen in AltaVista or AlltheWeb either. Maybe Yahoo shares the "penalty" database with those other engines, since they all have the same owner... Can you or anybody confirm this?

I get about 5 requests for my robots.txt from Slurp per day, and recently got the following weird requests:

207.126.231.94 - - [18/Feb/2004:12:17:54 -0600] "GET /***/***.gif HTTP/1.1" 200 4192 "http://idev11.inktomi.com/rt/left_frame.cgi?frame=top&url=http%3A//www.***/***.html&cat=rtj1000.inktomi.com:2281" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461; Q312461; .NET CLR 1.0.3705)"
207.126.231.94 - - [18/Feb/2004:12:17:54 -0600] "GET /***/***.css HTTP/1.1" 200 850 "http://idev11.inktomi.com/rt/left_frame.cgi?frame=top&url=http%3A//www.***/***.html&cat=rtj1000.inktomi.com:2281" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461; Q312461; .NET CLR 1.0.3705)"

I read in another thread that this could be an Inktomi hand check. Can anyone confirm this?

Anyway, I am thinking of moving the site to another domain, or maybe even making a clone of the site on another domain and blocking all spiders except Slurp. What do you people think of that last idea?
After much thought on this matter, I have come to the conclusion that Slurp is either very sensitive to redirects or has a low threshold for duplicate content.

dauction

2:46 am on Feb 26, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Same here... it only grabs robots.txt.

I have to believe that it's a matter of being patient, and not anything wrong with our sites.

walkman

3:35 am on Feb 26, 2004 (gmt 0)



DreamMaster,
Let's see what they say at the conference this week. I think they plan to address this banning/penalizing issue. It's ridiculous.

Check out my log: this is the first time a Y! Slurp has asked for my homepage, ever. The bad news? I think it's doing it for Inktomi, since I paid to have the site indexed and it is listed dead last because of some penalty. Notice Ink gets the robots.txt, and the Y! Slurp gets the /. I wonder if they coordinate or if this is just a coincidence.

66.196.65.28 - - [25/Feb/2004:08:55:08 -0500] "GET /robots.txt HTTP/1.0" 200 939 "-" "Mozilla/5.0 (Slurp/si; slurp@inktomi.com; [inktomi.com...]
66.196.73.51 - - [25/Feb/2004:08:55:09 -0500] "GET / HTTP/1.0" 200 13198 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]

As for moving to another domain, it has crossed my mind, but I'd lose the linkbacks, and that sucks. Beats not being listed at all, though.

flobaby

3:37 am on Feb 26, 2004 (gmt 0)

10+ Year Member



I got the same thing: Ink goes for robots.txt, Yahoo gets the PFI page.