

Has lack of a robots.txt confused Googlebot?

Site not penalized, but not included in the index.

dragonlady7

1:39 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



OK, I hate to add to the white noise in this forum, but I have been engaged in debate on another forum about a truly bizarre situation, and I think it's something worth debating on here where there's a lot more knowledge floating around. A member of this other forum has had a website indexed in Google since 1997 for a completely non-competitive keyphrase (his strange hobby), and has dozens of high-quality backlinks and a PR of about 5.
This past month he's noticed that typing his domain name into Google doesn't yield his site as a result. However, his backlinks all show up as links to his domain, and if he looks at his site the toolbar shows his PR intact. It's very weird and nobody over there, including me, has ever seen anything like it. Basically, everything around and pertaining to his site is there, but the pages actually in his site aren't returning as results. (No adult filter would be triggered, either; it's an utterly innocent and innocent-sounding hobby.)
He has never used anything remotely sketchy to optimize his site. The only thing he changed was that he switched his entire site from HTML to XHTML. It validates in every validator to be found on the Web-- XHTML 1.0 Strict, I think. I checked it myself and there's really nothing wrong with it at all-- it's more than correct.
Another person on that board insists that it's his XHTML-- Google isn't indexing it. I countered that there was no way that was true-- Google indexes HTML, XHTML, Word, PowerPoint, PDF, TXT... All kinds of formats. And XHTML was designed to be backwards-compatible in older browsers, besides. So that's simply not the answer, right?
A possibility I've thought of, though, for a site being excluded but not penalized could be this:
Robots.txt.
Last month, I noticed Googlebot was coming to my site, asking for my robots.txt, and not getting it because I didn't have one. It would then repeat the request over and over again, up to maybe a dozen times. And since it never got robots.txt, it would leave.
So, I uploaded a blank robots.txt and all was well. I'm indexed, I'm in, everyone's happy.
(Since then I've gone back and added some lines to the robots.txt so it's actually functional, but that hasn't really changed anything except protecting my testing directory from embarrassingly getting spidered. It was careless of me not to protect it before.)
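For anyone wanting to do the same, a minimal robots.txt along those lines might look something like this (the "/testing/" path is just a placeholder for whatever directory you want kept out):

```
# Let every crawler in, but keep the testing directory unspidered
User-agent: *
Disallow: /testing/
```

Note that an entirely blank robots.txt also works -- it simply allows everything -- but then your testing directory is fair game.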

Could that possibly be the problem? I'm not sure whether the site owner in question has a robots.txt, but I am sure it's not his XHTML that's made his site go missing. (I'll post back if it turns out he already had one, in which case that isn't the answer -- but a missing robots.txt is my hunch.)

Anyhow, thanks for any suggestions and any knowledge you can impart.

rainborick

4:09 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When a site for all the world looks well-positioned in Google except for the fact that it won't show up in a search result, you have to suspect it's being penalized. And since this is a recent occurrence, I'd check two things that I believe Google is focusing on with its recent changes -- invisible text and bad-neighborhood links. So, I'd go over the HTML by hand to make sure he hasn't accidentally made some text invisible. I've seen people use WYSIWYG page editors to re-edit a page so often that the multiple contradictory <font> tags are so nested and intertwined that it's a miracle any of the page gets rendered. Then check the PageRank of any site he's linking to, especially the main page of any web ring or similar co-operative linking scheme. I wouldn't be surprised if one of these two things is at the root of your friend's problems with Google. Good luck!

BigDave

4:37 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are there custom 404 pages on these sites?

If there are, do they return the 404 status they are supposed to, or do they redirect and return a 200?

If the custom 404 page returns a 200, then Google will try to interpret the returned page and won't be able to. It may then assume that you do not want to be crawled.

By putting up a blank robots.txt, you stop returning the defective 404 page for that request.
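One way to check this by hand is to request a page that definitely doesn't exist and look at the status code that comes back. A rough sketch in Python (the URL in the comment is made up -- substitute a nonexistent path on the site in question):

```python
# Check whether a missing page returns a real 404, or a "soft 404"
# (a custom error page served with status 200, which can confuse spiders).
import urllib.request
import urllib.error

def fetch_status(url, user_agent="Mozilla/5.0"):
    """Return the HTTP status code that url responds with."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx arrive as exceptions; the code is on them

def diagnose_missing_page(status):
    """Interpret the status code a nonexistent URL came back with."""
    if status == 404:
        return "good: real 404, spiders know the page is gone"
    if status == 200:
        return "soft 404: error page served as 200 -- may confuse Googlebot"
    if status in (301, 302):
        return "redirects instead of 404 -- check where it lands"
    return "status %d -- inspect by hand" % status

# e.g. print(diagnose_missing_page(fetch_status(
#     "http://www.example.com/no-such-page-xyz.html")))
```

If the diagnosis says "soft 404", the custom error page is the thing to fix.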

dragonlady7

4:43 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



It turns out that he did, indeed, have a robots.txt up, so that theory's been shot down.
The page's code is definitely squeaky-clean. Just totally redone.

Which leaves the question of links. I noticed he's in a web ring. I never understood how those could be bad; before search engines, they were how I got around. But I'll tell him to look into that. It seems the most likely culprit.

So the toolbar still isn't sorted out, then? His page was showing up as PR 5 but not showing in Google searches. Most odd.

Well, thanks for help anyway. :D

dragonlady7

5:07 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



OK, upon further consultation we have inspected the webrings he belongs to.
They're cute webrings-- for people sharing his odd style of hobby, etc., and one is The Original WebRing, the first one ever. He is very fond of these webrings and is distressed to think that linking to them could incur a penalty.
Does it seem likely that the webrings are it? I don't understand how webrings are bad things-- they're a collection of people with similar websites. Why not link to them?
How can we tell where the problem is?

rainborick

5:44 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'd check the main page of the web ring master site. If the Toolbar PageRank shows 0 or graybar, I'd suggest that your friend stop linking to it.

ogletree

6:05 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What comes up if you type in site:domain.com -asdf? If you see old results, put 301s on those. If you just get the domain, then you'll just have to wait for Google to spider him. Has Googlebot ever spidered the new files? There have been a lot of people just falling off; Google may simply be acting weird. I agree -- I don't think it's right for Google to penalize legit webrings. I think they're spending too much time trying to hurt SEO and not enough time making results better where there is no SEO. There are a lot of keywords out there that bring up worthless results.
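If the server is Apache (an assumption -- adjust for whatever the host runs), those 301s can go in an .htaccess file; both file names below are made-up examples:

```
# Permanent (301) redirects from retired file names to their replacements
Redirect permanent /old-page.html http://www.example.com/new-page.html
Redirect permanent /old-section/ http://www.example.com/new-section/
```

A 301 tells Google the old URL has moved for good, so it can transfer the listing to the new address instead of dropping it.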

dragonlady7

6:18 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



But the original site in question -- which may or may not be penalized, but either way is excluded -- has a toolbar PR of 5... so is the toolbar working or not, or is that unknown?

Hmm... if that's not it, someone else has pointed out that his doctype declaration has a linebreak in it and perhaps that's confused Googlebot.
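For comparison, this is the XHTML 1.0 Strict doctype written out on a single line. (A linebreak inside the declaration is legal per the spec, so this probably isn't the culprit, but putting it on one line is a harmless way to rule the theory out.)

```
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```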

Man, I do not know. These things are why webmasters who really really know what they're doing make the big bucks. There is too much crap that's essential for machines and nearly invisible to humans. Blechh, I need a nap!

ogletree

6:36 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



PR won't matter if there are no entries in Google. I type in site:domain.com -asdf every day to see how many entries I have in Google. It changes all the time, even if I don't make any changes. Google is very weird. I have pages that are not indexed even though the pages right above and below them on my sitemap are indexed. Google is random sometimes.

HitProf

7:01 pm on Jul 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi dragonlady,

I hope the problem is solved by fixing the doctype. If not:

Do you have access to the log files? Does Googlebot try to index pages or not? Are the links plain hrefs?

Did the file names change while redoing the site? Perhaps the old pages are now 404s and the new ones are not yet indexed.

There may be a problem if the old pages are redirected to a custom 404 page with a 301 / 200. You can check this with the server header checker:
[searchengineworld.com...]
(type in one of the old file names that no longer exist).

dragonlady7

7:06 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



Thanks! Very helpful. We'll have to give your suggestions a shot if the doctype thing doesn't work out. Not like anyone will know anytime soon... but I'll tell him to keep an eye on the logs and let us all know. :D

tschild

7:24 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



One other thing that might have gone wrong: I heard of a site in another forum a few months ago which had also suddenly dropped out of Google. What we finally found out was that the web host, to stop people downloading sites' content with wget or similar, had barred user-agents that did not have "Mozilla" or "MSIE" in their user-agent string. That made the site display fine in browsers but excluded search engine spiders...

So it might also be a good idea to check whether the site's pages can be retrieved with wget, with the user-agent string set to 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)'.
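A quick way to test for that kind of user-agent filtering without wget is to fetch the same URL twice with different user-agent strings and compare the status codes. A rough sketch in Python -- the browser UA string is just an example, and the Googlebot UA is the one quoted above:

```python
# Compare how a site responds to a browser-like User-Agent vs Googlebot's.
import urllib.request
import urllib.error

GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

def status_for(url, user_agent):
    """Fetch url with the given User-Agent and return the HTTP status code."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        return err.code

def ua_filtered(browser_status, bot_status):
    """True if the site serves browsers fine but blocks the bot UA."""
    return browser_status == 200 and bot_status != 200

# e.g. (URL is hypothetical):
# b = status_for("http://www.example.com/", BROWSER_UA)
# g = status_for("http://www.example.com/", GOOGLEBOT_UA)
# ua_filtered(b, g) being True suggests the host is blocking spiders by UA
```

If the browser-style request succeeds and the Googlebot-style one is refused, the host-level filter is the problem, not the pages.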

dragonlady7

7:57 pm on Jul 23, 2003 (gmt 0)

10+ Year Member



>That made the site display fine to browsers but excluded search engine spiders...

Yo! That's a bad one. Wow. A true "d'oh" moment.
I'll pass that one on...