Msg#: 1360 posted 12:01 am on Aug 12, 2000 (gmt 0)
I have been experimenting with some domains and it has been driving me crazy. I can't seem to spot the INK spider in my logs. I found that 18.104.22.168 from Exodus comes the day after I submit to the three INK sites and grabs the pages. I have been submitting 25 per day and it never gets all 25; it seems to get them in random order and skips some, with the average being 15. The day after it gets the pages, they appear in the INK sites. It appears that all the pages it gets make it into INK. The question is: why does it get them randomly, and why does it skip some? It is possible I am hallucinating!
Msg#: 1360 posted 12:18 am on Aug 12, 2000 (gmt 0)
The only thing I can think of is that there may be an error during submission. This often happens with us. It's either A) human error, where you may have accidentally skipped a submission, or B) an error during the submission process, where the script didn't pick up the URL even though the "thank you page" says it did.
This is actually very common. It's even common for it to spider all the pages but not index the results.
Msg#: 1360 posted 12:53 pm on Aug 13, 2000 (gmt 0)
I checked the logs after introducing a 1 minute delay between submissions (25 pages per day). The INK spider did not skip any submission; however, it still hit them randomly and all at the same time. It appears that submissions are run through a filter which takes out some submissions (criteria unknown, but may be time), and are then batched for the spider. The filter seems to contain a list of banned domains: I have two it will not spider. I am consumed with getting into the INK database; then I will worry about optimization.
I just checked my logs again for 10 new domains, where I submitted 10 pages a day with a 1 minute delay. I found that the spider only got between 2 and 6 of the pages! The spider gets them randomly and all show the same time. Now I am confused. I'd go get drunk, but I don't drink. So, it appears that the results are about the same whether you submit 100 or 10, or use a time delay or not: you will get 20 to 60 percent into INK. I would guess that these pages would not pass the 75% dupe test. Perhaps about 90% dupe!
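For anyone who wants to run the same check without eyeballing the logs, here is a rough Python sketch of the comparison described above: pull out the requests made by the suspected spider address and see which submitted pages it fetched or skipped. It assumes the server writes Common Log Format, which may not match your setup. The names spider_hits, coverage and SAMPLE_LOG are made up for illustration, and the sample log lines and the 192.0.2.10 address are placeholders, not real spider data.

```python
import re

# Assumed layout (Common Log Format):
# host ident user [time] "METHOD path PROTO" status size
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|HEAD) (\S+)[^"]*" (\d{3}) \S+'
)

def spider_hits(log_lines, spider_ip):
    """Return {path: [timestamps]} for requests made by spider_ip."""
    hits = {}
    for line in log_lines:
        m = LOG_LINE.match(line)
        if m and m.group(1) == spider_ip:
            hits.setdefault(m.group(3), []).append(m.group(2))
    return hits

def coverage(submitted_paths, hits):
    """Return (fetched, skipped, percent_fetched) for the submitted paths."""
    fetched = [p for p in submitted_paths if p in hits]
    skipped = [p for p in submitted_paths if p not in hits]
    if submitted_paths:
        percent = 100.0 * len(fetched) / len(submitted_paths)
    else:
        percent = 0.0
    return fetched, skipped, percent

# Invented sample lines, only to show the expected input shape.
SAMPLE_LOG = [
    '192.0.2.10 - - [12/Aug/2000:03:15:02 +0000] "GET /page1.html HTTP/1.0" 200 5120',
    '192.0.2.10 - - [12/Aug/2000:03:15:02 +0000] "GET /page3.html HTTP/1.0" 200 4096',
    '198.51.100.7 - - [12/Aug/2000:03:16:40 +0000] "GET /page2.html HTTP/1.0" 200 2048',
]
```

Calling spider_hits(SAMPLE_LOG, "192.0.2.10") and passing the result to coverage along with your list of submitted paths gives you the fetched pages, the skipped pages, and the percentage fetched. The timestamps kept per path also make it easy to see whether the hits all arrive at the same moment, as they would if the submissions are batched.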