Thanks aakk999 and lucy24.
To answer your questions: I don't even have a robots.txt file for this particular site, which is why I didn't think I had been accidentally blocking the bots. As far as I know, there's no other way to block them if a robots.txt file doesn't give any explicit instructions, is there?
As for my logs, it does seem like they're crawling through. Here's a sample log entry:
66.249.73.35 - - [27/Oct/2013:03:02:06 -0400] "GET /myurl HTTP/1.1" 200 12523 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
The IP location is Mountain View, CA. The user-agent string in the log also says Googlebot, so it looks likely to be the real thing. If they're crawling through, is there any reason why they aren't indexing?
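In case anyone wants to double-check that a hit like this really is Googlebot rather than something spoofing the user agent, here's a rough Python sketch. As far as I understand it, the check Google suggests is a reverse DNS lookup on the IP, confirming the hostname ends in googlebot.com or google.com, then a forward lookup to confirm the hostname maps back to the same IP. The function names (parse_line, is_verified_googlebot) and the regex are my own and assume Apache's combined log format, so treat this as an illustration rather than a definitive tool.

```python
import re
import socket

# Apache "combined" format (assumed):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hostname suffixes accepted as Google-owned crawler hosts.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def parse_line(line):
    """Return a dict of fields from one combined-format log line, or None."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-then-forward DNS check.

    The two lookup arguments are hypothetical hooks so the logic can be
    exercised without network access; by default they use the socket module.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        # The hostname must resolve back to the IP that made the request.
        return ip in forward_lookup(host)
    except OSError:  # covers socket.herror and socket.gaierror
        return False


SAMPLE = (
    '66.249.73.35 - - [27/Oct/2013:03:02:06 -0400] "GET /myurl HTTP/1.1" '
    '200 12523 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"'
)

entry = parse_line(SAMPLE)
print(entry["ip"], entry["status"], "Googlebot" in entry["agent"])
```

Calling is_verified_googlebot(entry["ip"]) with no extra arguments performs the live DNS lookups, so the result depends on the network and on whatever DNS returns at the time.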
aakk999 - thanks for the suggestion to try Fetch as Googlebot. I'd missed that one and have just done it. I clicked "Submit to Index", so I hope that's the right way to get it indexed.
Additional context:
- In Google Webmaster Tools, the Crawl Errors page states that there are no crawl errors. The Crawl Stats page also shows that the bots have been crawling the site.
- The site is secured with an SSL cert, and I've submitted both the http and https versions of my site to Webmaster Tools.
And yep, I'd already tried the site:myurl.com search and it returned 0 results.
lucy24 - I'm not in North America but the server is (which is probably what really matters), and the site does end with .com. And yes, AFAIK the search engines don't even have to respect nofollow, since it's really just a request on your part to them.
So the question now is: with no robots.txt file, and seeing that Googlebot is actually crawling through, is there any reason why my site isn't being indexed?
Appreciate the help!