This is the only kind of trace of Yahoo! Slurp I have in my logs for that site:
lj1001.inktomisearch.com - - [25/Feb/2004:16:20:31 +0100] "GET /robots.txt HTTP/1.0" 200 174 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
The robots.txt is very simple:
User-agent: *
Disallow: /search
Disallow: /random.html
Disallow: /questions.html
Disallow: /comments.html
Disallow: /honey.html
There is nothing there that should keep Slurp or any other robot away.
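One way to double-check that claim is to feed the rules above to Python's standard `urllib.robotparser`. This is only a rough sketch: the helper name `allowed` and the sample paths are made up for illustration, and Python's parser is not guaranteed to interpret the file exactly the way Slurp does.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt quoted above, copied verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /random.html
Disallow: /questions.html
Disallow: /comments.html
Disallow: /honey.html
"""


def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch path (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths are illustrative; only /search and /honey.html come from the file.
    for path in ("/", "/index.html", "/search", "/honey.html"):
        print(path, allowed("Yahoo! Slurp", path))
```

If the homepage and ordinary pages come back as allowed, the robots.txt itself is unlikely to be what keeps the crawler away.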
The site is doing OK in Google, and a Yahoo search on "site:www.example.com" gives 48000 hits, so the site is indexed, but it is not being refreshed and it gets very little traffic from Yahoo.
What can I do to get Slurp to refresh my site, short of giving them all my money and still praying that I will be included?
2004-02-25 16:43:07 66.196.65.36 - ipswjw0029atl2 xx.xx.xx.xx 80 GET /robots.txt 404 251 199 0 www.widgets.net Mozilla/5.0+(compatible;+Yahoo!+Slurp;+http://help.yahoo.com/help/us/ysearch/slurp)
I would like to know whether the Yahoo/Inktomi robot is crawling my site and whether it is requesting anything more than a nonexistent robots.txt file.
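A quick way to answer that is to tally which paths Slurp actually requests. The sketch below assumes an Apache-style combined log like the first one posted in this thread (it will not parse the IIS-style line above without changes), and the function name `slurp_paths` and the file name `access.log` are made up for illustration.

```python
import re
from collections import Counter

# Apache "combined" log format: host, identity, user, [time], "request",
# status, size, "referer", "user agent".
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def slurp_paths(lines):
    """Count requested paths for lines whose user agent mentions Slurp."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "slurp" in match.group("agent").lower():
            counts[match.group("path")] += 1
    return counts


if __name__ == "__main__":
    # "access.log" is a placeholder; point it at your own log file.
    with open("access.log", encoding="utf-8", errors="replace") as handle:
        for path, hits in slurp_paths(handle).most_common():
            print(hits, path)
```

If /robots.txt is the only path that shows up, the robot is checking in but not crawling any pages.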
I have noticed that the site does get some limited traffic from Yahoo, and as mentioned it is present in the index.
I guess I just have to be patient and see if the robot comes around. At least I know it goes around :-)
I'm in the same boat as you. My top converting site, which is doing great in Google, has apparently been penalized by Inktomi/Yahoo, and it doesn't end there, because it's nowhere to be seen in AltaVista and AlltheWeb either. Maybe Yahoo shares the "penalty" database with those other engines, since they all have the same owner... Can you or anybody confirm this?
I get about 5 requests for my robots.txt from Slurp per day, and recently got the following weird requests:
207.126.231.94 - - [18/Feb/2004:12:17:54 -0600] "GET /***/***.gif HTTP/1.1" 200 4192 "http://idev11.inktomi.com/rt/left_frame.cgi?frame=top&url=http%3A//www.***/***.html&cat=rtj1000.inktomi.com:2281" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461; Q312461; .NET CLR 1.0.3705)"
207.126.231.94 - - [18/Feb/2004:12:17:54 -0600] "GET /***/***.css HTTP/1.1" 200 850 "http://idev11.inktomi.com/rt/left_frame.cgi?frame=top&url=http%3A//www.***/***.html&cat=rtj1000.inktomi.com:2281" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461; Q312461; .NET CLR 1.0.3705)"
I read in another thread that this could be an Inktomi hand check. Can anyone confirm this?
Anyway, I am thinking of moving the site to another domain, or maybe even making a clone of the site on another domain and blocking all spiders except Slurp. What do you people think of that last idea?
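For what it's worth, a robots.txt for that "everyone out except Slurp" clone might look like the one embedded below, checked again with Python's `urllib.robotparser`. This is only a sketch: it assumes the crawler answers to the `Slurp` user-agent token, the helper name `can_crawl` is made up, and Python's parser may not match how each real spider reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the cloned domain: Slurp may fetch everything,
# every other robot is shut out. An empty Disallow means "nothing is disallowed".
CLONE_ROBOTS_TXT = """\
User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""


def can_crawl(user_agent, path="/", robots_txt=CLONE_ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch path (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("Slurp", "Googlebot"):
        print(agent, can_crawl(agent))
```

Keep in mind this only asks well-behaved robots to stay out; it does nothing to stop a spider that ignores robots.txt.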
After much thought on this matter, I have come to the conclusion that Slurp is either very sensitive to redirects or has a low threshold for duplicate content.
Check out my log: this is the first time a Y! Slurp has asked for my homepage, ever. The bad news? I think it's doing it for Inktomi, since I paid to have the site indexed and it is listed dead last because of some penalty. Notice that Ink gets the robots.txt and the Y! Slurp gets the /. I wonder if they coordinate or if this is just a coincidence.
66.196.65.28 - - [25/Feb/2004:08:55:08 -0500] "GET /robots.txt HTTP/1.0" 200 939 "-" "Mozilla/5.0 (Slurp/si; slurp@inktomi.com; [inktomi.com...]
66.196.73.51 - - [25/Feb/2004:08:55:09 -0500] "GET / HTTP/1.0" 200 13198 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; [help.yahoo.com...]
As for having another domain, it has crossed my mind, but I'd lose the linkbacks, and that sucks. Beats not being listed at all, though.