Could that possibly be the problem? I'm not sure whether the site owner in question has a robots.txt, but I am sure it's not his XHTML that's made his site go missing. (I'll post back if it turns out he did have one, since that would rule this out. For now, my hunch is that this is the answer.)
Anyhow, thanks for any suggestions and any knowledge you can impart.
If there are, do they return the 404 status that they are supposed to, or do they redirect and return a 200?
If the custom 404 page returns a 200, then Google will try to interpret the page returned and won't be able to. It will then assume that you do not want to be crawled.
By putting up a blank robots.txt you will no longer return the defective 404 page.
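For anyone who wants to test this themselves, here is a rough sketch in Python of the check described above: request a file that may not exist (robots.txt, say) and see whether the server answers 404 or hands back the custom error page with a 200. The function names and the example.com address are made up for illustration, and this only shows what the server sends, not how Google actually reacts to it.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return (status, final_url) after urllib has followed any redirects."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx answers; the status code is still useful.
        return err.code, err.url


def diagnose_missing(status, requested_url, final_url):
    """Describe how a server answered a request for a file that should be missing."""
    if status in (404, 410):
        return "ok: missing file is reported as missing"
    if status == 200 and final_url != requested_url:
        return "suspect: redirected to another page that returns 200"
    if status == 200:
        return "suspect: a page was served with status 200"
    return "unexpected status %d" % status


if __name__ == "__main__":
    # Hypothetical site; replace with the one being checked.
    url = "http://www.example.com/robots.txt"
    status, final_url = fetch_status(url)
    print(status, final_url, diagnose_missing(status, url, final_url))
```

If the site really has a robots.txt, a 200 is of course fine; the "suspect" answers only matter when the file is known not to exist.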
Which leaves the question of links. I noticed he's in a web ring. I never understood how those could be bad; before search engines, those are how I got around. But I'll tell him to look into that. It seems the most likely.
So the toolbar still isn't sorted out then? Because his page was showing up as PR 5 but not showing in Google searches. Most odd.
Well, thanks for help anyway. :D
Hmm... if that's not it, someone else has pointed out that his doctype declaration has a linebreak in it and perhaps that's confused Googlebot.
Man, I do not know. These things are why webmasters who really really know what they're doing make the big bucks. There is too much crap that's essential for machines and nearly invisible to humans. Blechh, I need a nap!
I hope the problem is solved with fixing the doctype. If not:
Do you have access to the log files? Does Googlebot try to index pages or not? Are the links plain hrefs?
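If you do have the logs, something like the following Python sketch can answer the Googlebot question quickly. It assumes the Apache "combined" log format, which may not match your server, and the sample line in the comment uses an invented IP address; the function name is made up too.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust the pattern for other servers.
# Example of a line it should match (invented data):
# 192.0.2.1 - - [10/Oct/2003:13:55:36 -0700] "GET /robots.txt HTTP/1.0" 404 512 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count (path, status) pairs for requests whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[(match.group("path"), int(match.group("status")))] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical log location; use the path your host gives you.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for (path, status), count in sorted(googlebot_hits(log).items()):
            print(count, status, path)
```

No Googlebot lines at all suggests it is not visiting; lines with unexpected status codes point at the pages worth checking.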
Did the file names change while redoing the site? Perhaps the old pages are now a 404 and the new ones are not yet indexed.
There may be a problem if the old pages are redirected to a custom 404 page with a 301 / 200. You can check this with the server header checker:
[searchengineworld.com...]
(type in one of the old file names that no longer exist).
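If you would rather check from your own machine, here is a minimal Python sketch that does roughly the same job as a header checker: it sends a HEAD request, does not follow redirects automatically, and records each hop so a 301 that ends in a 200 is visible. The helper names are my own, and it assumes the server answers HEAD requests the same way it answers GET.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def head_once(url, timeout=10):
    """Send one HEAD request without following redirects.

    Returns (status, location); location is None when no Location header is sent.
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace(url, fetch=head_once, max_hops=5):
    """Follow redirects by hand, recording every (url, status) hop."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status in (301, 302, 303, 307, 308) and location:
            url = urljoin(url, location)
        else:
            break
    return hops


def ends_in_200(hops):
    """True if the last hop answered 200, which is wrong for a removed page."""
    return hops[-1][1] == 200


if __name__ == "__main__":
    # Hypothetical old file name; use one that no longer exists on the site.
    for hop_url, hop_status in trace("http://www.example.com/old-page.html"):
        print(hop_status, hop_url)
```

A removed page should show a single 404 (or 410) line. A 301 followed by a 200 is the pattern described above.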
So it might also be a good idea to check if the site's pages can be retrieved with Wget, with the user-agent string set to 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)'
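For those without Wget handy, the same test can be sketched in Python with the standard library. This is only an approximation of what Wget would do: it sets the user-agent string quoted above and reports the status and body size, so you can compare against a request made with a browser-like string. The function names and the "Mozilla/5.0" comparison string are my own choices.

```python
import urllib.error
import urllib.request

# User-agent string quoted in the post above.
GOOGLEBOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"


def build_request(url, user_agent=GOOGLEBOT_UA):
    """Build a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Return (status, body_length) for url when fetched with user_agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent),
                                    timeout=timeout) as resp:
            return resp.status, len(resp.read())
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers arrive as exceptions; report the code, no body.
        return err.code, 0


if __name__ == "__main__":
    # Hypothetical site; replace with the one being checked.
    url = "http://www.example.com/"
    print("as Googlebot:", fetch_as(url))
    print("as a browser:", fetch_as(url, user_agent="Mozilla/5.0"))
```

If the two lines differ noticeably, the server is treating the Googlebot user agent differently, which would be worth following up.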