|Fast only getting to robots.txt and leaving...|
Happened 2x and I'm desperately trying to get picked up
| 1:43 pm on Nov 3, 2002 (gmt 0)|
Here's the scoop....
10/21 & 10/22 - Submitted pages to FAST/All The Web
10/28 - Submitted a few more to All the Web and Lycos
I got hit on 10/28 and 11/03, both by FAST-WebCrawler/3.6 but they only requested my robots.txt file and went on their merry way.
cr048r01-2.sac2.fastsearch.net - - [03/Nov/2002:06:09:57 -0500] "GET /robots.txt HTTP/1.0" 200 36 "-" "FAST-WebCrawler/3.6 (atw-crawler at fast dot no; [fast.no...]
cr048r01-2.sac2.fastsearch.net - - [28/Oct/2002:05:17:30 -0500] "GET /robots.txt HTTP/1.0" 200 36 "-" "FAST-WebCrawler/3.6 (atw-crawler at fast dot no; [fast.no...]
Does anyone have any idea why this is happening? I realllllly would like to get into Fast/All the Web/Lycos, but I'm feeling doubtful, as they can't even get past robots.txt. I just have a general robots.txt set up like this....
I only did the disallow above because I had some funky stuff happening with requests for scripts that didn't even exist. I don't know what I could be doing, but if anyone has any ideas, I'd appreciate it. I seem to be getting hit ok, by other spiders.
| 1:50 pm on Nov 3, 2002 (gmt 0)|
I wouldn't worry at this point. It's interesting that the spider came your way after submitting. I usually let them pick up new pages and sites the same way Google does: by setting up links to the new site. The best option, as always, is getting an ODP listing.
Anyway: Fast knows your site now. Doing some preliminary crawls before they send the regular spider for content crawling is nothing unusual.
I would wait a few more weeks before starting to worry.
Your robots.txt looks just about right anyhow.
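If you want to double-check that a robots.txt isn't shutting a crawler out, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper and the example.com URLs below are made up for illustration, since the actual file isn't shown in this thread; swap in your own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; the real file
# from this thread is not shown, so replace this with your own.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The home page is not covered by the Disallow rule.
print(is_allowed(ROBOTS_TXT, "FAST-WebCrawler/3.6", "http://www.example.com/"))
# Anything under /cgi-bin/ is blocked for every user agent.
print(is_allowed(ROBOTS_TXT, "FAST-WebCrawler/3.6", "http://www.example.com/cgi-bin/test.pl"))
```

If the first check comes back True for your real file, the robots.txt itself is probably not what is keeping the spider from going further.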
| 2:22 pm on Nov 3, 2002 (gmt 0)|
Thank you for your insight heini. I have submitted to DMOZ, and am currently waiting, as well as waiting for the other big guys to pick me up. As I find myself in the midst of all this waiting (sigh), I'm trying to analyze who has been where and when. Too much time on my hands maybe? :)
One more unrelated question though, if I may?
On 11/01 I had the following entry...
inktomi5-bre.server.ntl.com - - [01/Nov/2002:20:44:40 -0500] "GET / HTTP/1.0" 200 8115 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Is this really the Inktomi spider, or a fake? They grabbed a few pages, but no robots.txt. Could it possibly mean that inclusion in Inktomi's SERPs is near?
I'm sorry to ask dumb questions, but I'm still learning when it comes to differentiating who is who and why they're there. :)
| 2:26 pm on Nov 3, 2002 (gmt 0)|
It's a proxy from Inktomi that NTL uses:
| 2:31 pm on Nov 3, 2002 (gmt 0)|
Thanks for the info :)
| 2:50 pm on Nov 3, 2002 (gmt 0)|
An important though often overlooked part of web promotion is the art of waiting ;)
But with listings in ODP and other directories underway, you have done what it takes to make it into every free spidering engine.
Not at all. What you are doing (looking at your logs, identifying the bots and figuring out what's what) is the best way to get a grip on things.
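For what it's worth, that kind of log reading can be partly automated. Below is a rough sketch, assuming Apache combined log format like the entries quoted earlier in this thread. The regex, the `summarize` and `only_fetched_robots` helpers, and the shortened FAST user-agent strings are my own choices for illustration (the full FAST user-agent is cut off in the quoted logs, so I left the rest out).

```python
import re
from collections import defaultdict

# Assumed layout: Apache combined log format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Entries based on the ones in this thread; the FAST user-agent
# is shortened because the original is truncated.
SAMPLE_LOG = [
    'cr048r01-2.sac2.fastsearch.net - - [28/Oct/2002:05:17:30 -0500] '
    '"GET /robots.txt HTTP/1.0" 200 36 "-" "FAST-WebCrawler/3.6"',
    'cr048r01-2.sac2.fastsearch.net - - [03/Nov/2002:06:09:57 -0500] '
    '"GET /robots.txt HTTP/1.0" 200 36 "-" "FAST-WebCrawler/3.6"',
    'inktomi5-bre.server.ntl.com - - [01/Nov/2002:20:44:40 -0500] '
    '"GET / HTTP/1.0" 200 8115 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]


def summarize(lines):
    """Group requested paths and user agents by client host."""
    visits = defaultdict(lambda: {"agents": set(), "paths": []})
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        entry = visits[match["host"]]
        entry["agents"].add(match["agent"])
        entry["paths"].append(match["path"])
    return dict(visits)


def only_fetched_robots(paths):
    """True if a host requested /robots.txt and nothing else."""
    return bool(paths) and all(path == "/robots.txt" for path in paths)


summary = summarize(SAMPLE_LOG)
for host, info in summary.items():
    print(host, sorted(info["agents"]), info["paths"],
          only_fetched_robots(info["paths"]))
```

A host that only ever asks for robots.txt, like the FAST crawler here, stands out right away, while a browser-style user agent that never requests robots.txt is more likely a proxy or a person than a spider.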
| 4:08 pm on Nov 3, 2002 (gmt 0)|
Thanks for reassuring me that I am normal. :)
I hope the art of waiting is something I can learn to do more gracefully...lol
BTW, although I'm learning new things every day, I've done some editing for Zeal to bide my time, and have also applied for editing at DMOZ. Hopefully, between the two, I will be able to relieve some of my boredom :)