Slurp Seems A Little Weak

Does it crawl more than one page anymore?

         

rankboy

3:46 am on Apr 12, 2002 (gmt 0)

10+ Year Member



Hey everyone,
I am posting this message because I have been a little disappointed with the way Slurp is crawling my sites. No matter what navigation I have in place, I can't get it to crawl past the index page. Granted, I haven't paid for inclusion, but at $30 a page I would go bankrupt before I got everything I wanted in the index. Meanwhile, Googlebot is crawling like I have never seen before and hitting almost every page on my sites. Does anyone else have this problem, and maybe a potential solution? Traffic from MSN, AOL, Overture, iWon, etc. can be pretty big if you can get in. Thanks

msgraph

3:57 am on Apr 12, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi rankboy and welcome to WebmasterWorld!

I get hit by Slurp almost every day, both on the index page and on various sub-pages.

Few questions for you...

Which version of Slurp are you seeing that just crawls your index page?

How often do you update your pages?

Do you have anything disallowed in your robots.txt?

Are all your pages static or dynamic?

Are your links in Java?

Of course you don't have to answer all this but I'm just trying to see if we can narrow down the problem.

rankboy

6:53 am on Apr 12, 2002 (gmt 0)

10+ Year Member



Thanks msgraph,
I've been hit by both the si and cat spiders. The two don't seem to crawl any differently; however, about 70% of the Inktomi spiders are si spiders.
I update my pages approximately every week.
Nothing is disallowed in robots.txt.
The pages are static, with plain old href links that are absolute, not relative.
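
For anyone who wants to double-check the "plain old href links" point on their own pages, here is a minimal sketch using only the Python standard library. The sample HTML, the example.com URLs, and the names LinkCollector and classify_links are all made up for illustration; it only sorts anchors into absolute, relative, and javascript: links, and says nothing about how Slurp itself parses a page.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def classify_links(html):
    """Sort hrefs into absolute, relative, and javascript: pseudo-links."""
    collector = LinkCollector()
    collector.feed(html)
    result = {"absolute": [], "relative": [], "script": []}
    for href in collector.links:
        parts = urlparse(href)
        if href.lower().startswith("javascript:"):
            result["script"].append(href)
        elif parts.scheme and parts.netloc:
            result["absolute"].append(href)
        else:
            result["relative"].append(href)
    return result


# Hypothetical index page fragment, not taken from any real site.
SAMPLE_HTML = """
<a href="http://www.example.com/products.html">Products</a>
<a href="about.html">About</a>
<a href="javascript:openPage('contact')">Contact</a>
"""

if __name__ == "__main__":
    for kind, hrefs in classify_links(SAMPLE_HTML).items():
        print(kind, hrefs)
```

If everything lands in the "absolute" or "relative" bucket, the navigation is at least the kind of link a spider can follow in principle.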

rankboy

7:00 am on Apr 12, 2002 (gmt 0)

10+ Year Member



Sorry, I was a little off about si and cat. Here are the exact numbers:

si = 167 hits
cat = 15 hits
percentage of cat to si: 8.9%

Granted, all of these hits are to the index pages of my sites, and it seems to come back to the same pages quite a bit.
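
The counts above can be turned into shares with a few lines of Python. This is just the arithmetic on the two numbers quoted in the post (167 and 15); the variable names are mine. Note that "cat relative to si" (15 out of 167, roughly 9%) is a different figure from cat's share of all Slurp hits (15 out of 182).

```python
# Hit counts as quoted in the post above.
si_hits = 167
cat_hits = 15
total_hits = si_hits + cat_hits

# Ratio of cat hits to si hits, as a percentage.
cat_to_si = cat_hits / si_hits * 100

# Share of each variant among all Slurp hits.
si_share = si_hits / total_hits * 100
cat_share = cat_hits / total_hits * 100

print(f"cat relative to si:      {cat_to_si:.1f}%")
print(f"si share of Slurp hits:  {si_share:.1f}%")
print(f"cat share of Slurp hits: {cat_share:.1f}%")
```

So si accounts for a bit over nine in ten of the Slurp hits, well above the earlier 70% estimate.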

rankboy

7:06 am on Apr 12, 2002 (gmt 0)

10+ Year Member



Also, to clarify, I don't use a robots.txt file on my server. From what I've seen before, if a spider requests robots.txt but doesn't find it, then it is free to crawl the entire site. Maybe I'm mistaken and Inktomi does things a little differently.

The funny thing about this whole situation is that Googlebot, FAST, and Thunderstone (hehe) are all crawling so many of my pages. Inktomi seems to be the only one getting confused by all of my index pages.
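
On the robots.txt point above, a quick way to sanity-check the "no rules means free to crawl" assumption is Python's standard urllib.robotparser. This is only a sketch: an empty rule list stands in for the missing file (a simplification), the helper name allowed and the example.com URLs are hypothetical, and it shows how Python's parser reads the rules, not how Inktomi's crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# No rules at all, standing in for a site without a robots.txt file.
NO_RULES = []

# A hypothetical file that blocks Slurp from one directory only.
BLOCK_PRIVATE = [
    "User-agent: Slurp",
    "Disallow: /private/",
]

DEEP_PAGE = "http://www.example.com/products/widgets.html"
PRIVATE_PAGE = "http://www.example.com/private/notes.html"

if __name__ == "__main__":
    print(allowed(NO_RULES, "Slurp", DEEP_PAGE))
    print(allowed(BLOCK_PRIVATE, "Slurp", DEEP_PAGE))
    print(allowed(BLOCK_PRIVATE, "Slurp", PRIVATE_PAGE))
    print(allowed(BLOCK_PRIVATE, "Googlebot", PRIVATE_PAGE))
```

With no rules, every URL comes back as allowed, which matches the usual reading of the robots exclusion convention.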

msgraph

1:33 pm on Apr 12, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Si" is an update checker, unless they've changed things. It will check your index page and robots.txt, more often than any other page, to see if you have made any updates.

"Cat" (not the paid one) is a deep crawler that spiders from a predetermined URL list. Again, unless things have changed, it doesn't come to your index page and follow the links down.

Since the other robots don't have problems crawling your pages, it might be something wrong on Inktomi's side.

Do you have links pointing to these pages from other sites? Perhaps INK is getting a little stubborn these days and won't check your pages unless there is an outside link pointing to them.

If you have nothing to hide ;), send them a note with your URL stating what your problem is. They are pretty good at replying.

slurp-help@inktomi.com

If that doesn't work then try this.

support@inktomi.com
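
To see the si/cat split described above in your own logs, something like the following sketch could help. It assumes Apache combined log format and assumes the variant appears in the user-agent as "Slurp/si" or "Slurp/cat"; the sample lines, IP addresses, and the function name tally_slurp_hits are invented for illustration, so check the exact strings in your own logs before relying on it.

```python
import re
from collections import Counter, defaultdict

# Hypothetical log lines; the user-agent strings are assumptions.
SAMPLE_LOG = """\
192.0.2.10 - - [12/Apr/2002:03:10:00 +0000] "GET / HTTP/1.0" 200 5120 "-" "Mozilla/3.0 (Slurp/si; slurp@inktomi.com)"
192.0.2.10 - - [12/Apr/2002:03:10:02 +0000] "GET /robots.txt HTTP/1.0" 404 - "-" "Mozilla/3.0 (Slurp/si; slurp@inktomi.com)"
192.0.2.11 - - [12/Apr/2002:05:42:17 +0000] "GET / HTTP/1.0" 200 5120 "-" "Mozilla/3.0 (Slurp/si; slurp@inktomi.com)"
192.0.2.12 - - [12/Apr/2002:06:01:44 +0000] "GET /products/widgets.html HTTP/1.0" 200 2048 "-" "Mozilla/3.0 (Slurp/cat; slurp@inktomi.com)"
192.0.2.20 - - [12/Apr/2002:06:30:09 +0000] "GET /about.html HTTP/1.0" 200 1024 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
"""

# Request path and user-agent from a combined-format line.
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Slurp variant, e.g. "si" or "cat".
VARIANT_RE = re.compile(r"Slurp/(\w+)")


def tally_slurp_hits(log_text):
    """Return {variant: Counter(path -> hits)} for Slurp requests only."""
    hits = defaultdict(Counter)
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        variant = VARIANT_RE.search(match.group("agent"))
        if variant:
            hits[variant.group(1)][match.group("path")] += 1
    return hits


if __name__ == "__main__":
    for variant, paths in tally_slurp_hits(SAMPLE_LOG).items():
        print(variant, dict(paths))
```

If si only ever shows up on / and /robots.txt while cat never requests anything deeper, that would fit the update-checker versus URL-list picture rather than pointing to a problem with the site's links.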

rankboy

6:57 pm on Apr 20, 2002 (gmt 0)

10+ Year Member



Today I noticed I am ranking for the index pages I had crawled. Rather well, in fact. However, the only pages that show up in Inktomi are my index pages. None of the other pages on my domain were crawled, and likewise no other pages show up in the index. Does anyone know if Inktomi is limiting crawling to the exact URL you submit via their free submit, rather than an entire domain?

keywordbuys

8:56 pm on Apr 20, 2002 (gmt 0)

10+ Year Member



Not totally sure, but Inktomi has only crawled index pages since they started PFI (paid for inclusion). If your free submit ever had them crawl past the index, they may do it again, but if you've been included within the last year or so, they probably won't.