| 12:18 am on Aug 12, 2000 (gmt 0)|
The only thing I can think of is that there may be an error
during submission. This often happens to us. It's either
A) human error, where you may have accidentally skipped a submission, or B) an error during the submission process,
where the script didn't pick up the URL even though the "thank you" page says it did.
This is actually very common. It's even common for the spider to
crawl all the pages but not index the results.
Hope this has been some kind of help :-)
| 12:21 am on Aug 12, 2000 (gmt 0)|
That is ink - it's j6000.inktomi.com
>The question is...why does it
>get them randomly and why does it skip some[?]
Yes, that is the question.
| 1:17 am on Aug 12, 2000 (gmt 0)|
Welcome to the Inktomi forum - look forward to seeing you around.
| 2:02 am on Aug 12, 2000 (gmt 0)|
Just hope I can contribute.
| 12:53 pm on Aug 13, 2000 (gmt 0)|
I checked the logs after introducing a 1-minute delay between submissions (25 pages per day). The INK spider did not skip any submission; however, it still hit them randomly and all at the same time. It appears that submissions are run through a filter that removes some of them (criteria unknown, but timing may be one), and are then batched for the spider. The filter also seems to contain a list of banned domains: I have two it will not spider. I am consumed with getting into the INK database; then I will worry about optimization.
| 7:23 pm on Aug 13, 2000 (gmt 0)|
I just checked my logs again for 10 new domains. I submitted 10 pages a day, with a 1-minute delay. I found that the spider only got between 2 and 6 of the pages! The spider gets them randomly and all show the same time. Now I am confused. I'd go get drunk, but I don't drink. So it appears the results are about the same whether you submit 100 or 10, and with or without a time delay: you will get 20 to 60 percent into INK. I would guess that these pages would not pass the 75% dupe test. Perhaps about 90% dupe!
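For anyone who wants to repeat this kind of log check, here is a rough Python sketch of how it could be done. It is only an illustration under a few assumptions: that the server writes Common Log Format with resolved hostnames, and that the spider can be recognized by a hostname ending in inktomi.com (as with j6000.inktomi.com above). The function names `spider_hits` and `coverage` are made up for this example.

```python
import re

# Assumed Common Log Format:
#   host ident user [timestamp] "METHOD path PROTOCOL" status size
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|HEAD) (\S+)[^"]*"')


def spider_hits(log_lines, host_suffix="inktomi.com"):
    """Map each requested path to the timestamps of spider requests for it.

    A request counts as a spider hit when the client hostname ends with
    host_suffix (an assumption; it requires hostname lookups in the log).
    """
    hits = {}
    for line in log_lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like a GET/HEAD request
        host, stamp, path = match.groups()
        if host.lower().endswith(host_suffix):
            hits.setdefault(path, []).append(stamp)
    return hits


def coverage(submitted, hits):
    """Split submitted paths into fetched and skipped, with percent fetched."""
    fetched = [p for p in submitted if p in hits]
    skipped = [p for p in submitted if p not in hits]
    pct = 100.0 * len(fetched) / len(submitted) if submitted else 0.0
    return fetched, skipped, pct
```

Feeding it the day's access log and the list of submitted paths gives the fetched pages, the skipped ones, and the percentage picked up. Comparing the timestamps in `hits` would also show whether the fetches really do arrive in a single batch.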