|Robots files, Spiders, and Search Engines|
| 7:31 pm on Feb 29, 2000 (gmt 0)|
I'm stuck on some basics and need any help that anyone can provide. I have a robots.txt file which, in theory, disallows search engines from various pages on my site based upon the spider the engine uses. I've used a robots.txt syntax checker to verify that my syntax is ok.
I've included robots meta tags with the index, follow options on my pages.
One of three things seems to happen when a spider visits my site:
1. They visit my home page and go away. I've tried changing the size of my home page by adding 100-200 blanks at the end of the page. Didn't help.
2. They get the robots.txt file, go away and never index the site.
3. Occasionally, a spider will pick up one page in addition to the robots.txt page.
If this were just one engine or spider, I would think that it was just something unique about that spider. But with it happening on virtually all of them I assume it must be something I'm doing wrong.
Oh, I should mention that when I posted the latest buddy links pages, AV visited the site, picked up the pages, got the robots.txt file, visited my home page, and then went back to the robots.txt file and then disappeared.
Can anyone suggest what I might be doing wrong?
| 11:51 pm on Feb 29, 2000 (gmt 0)|
What you are probably seeing is the initial crawl, in which search engines check, during or shortly after submission, whether the submitted site is accessible at all. If it is, your site will be placed in the queue for later extensive crawling.
Also, you might make a point of including as many links to your other web pages as you reasonably can (you can use invisible links nicely, too, so there shouldn't really be a limit), as this will help get your site deep crawled faster. Note, however, that there are only so many deep-crawling engines left these days, so you would be best advised to submit every single page individually; otherwise you won't get them into most search engine indices.
| 3:11 am on Mar 1, 2000 (gmt 0)|
You might want to post your robots.txt for us to have a look at. There could be a syntax problem that spiders are having with it.
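In the meantime, one rough way to sanity-check how a parser reads the file is to run it through Python's standard `urllib.robotparser`. The sketch below is only an illustration: the robots.txt content, the `ExampleBot` and `OtherBot` names, and the `allowed` helper are all made up here, since the real file hasn't been posted, and an actual spider may interpret the rules differently than Python's parser does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the poster's real file is not shown in the thread.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /cgi-bin/
"""


def allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical spider names and paths, chosen only to exercise each rule.
    checks = [
        ("ExampleBot", "/private/page.html"),
        ("ExampleBot", "/index.html"),
        ("OtherBot", "/cgi-bin/search"),
        ("OtherBot", "/private/page.html"),
    ]
    for agent, path in checks:
        print(agent, path, allowed(SAMPLE_ROBOTS_TXT, agent, path))
```

If a spider you expect to be allowed comes back as blocked for your home page, that would point at the robots.txt rules rather than at the pages themselves. Note that a spider matching a named record follows only that record, so in this sample `ExampleBot` is not bound by the `*` rules.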