lucy24 - 8:34 am on May 6, 2013 (gmt 0)
So this cannot happen if the website is hosted on some other server, right?
It doesn't matter whose server it is. I meant: the server that the site lives on. The key part is: It doesn't matter what response your server sends out. What matters is what response the user receives. If you've got pure static html, they will be the same. But they don't have to be. I have only recently wrapped my mind around this myself.
Can this actually happen? I mean it's not going to take a minute or an hour, everything performed within 3 to 5 secs right? Can this occur within that time frame?
It only takes a microsecond for the server to output a line break. That blank space on your page before the leading <?php can be lethal: unless output is being buffered, once even one character has gone out the headers have gone with it, status code included, and a 200 can no longer be changed into a 404. In fact what you describe sounds like that is exactly what's happening: You get the beginnings of a page, but then the server realizes that it has been handed a set of bum parameters, and shoves in the content of your 404 page. You need to tweak the code to make sure no html is generated before the server is sure there will be a valid page at the end of the process. One way is output buffering. ("Prepare this stuff, but don't release it until I say so.") There are probably at least six other ways. But that's a php question, which means it will be better answered by almost anyone in the world other than me ;)
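For what it's worth, here is the "don't release it until I say so" idea as a rough sketch. It is in Python rather than php, purely to show the order of operations; in php the rough equivalent would be ob_start() at the top and ob_end_clean() when you bail out. The names VALID_PAGES and render_page are made up for the illustration and are not anything from your site.

```python
import io

# Hypothetical stand-in for "does this request map to a real page?"
VALID_PAGES = {"rats", "cats"}

NOT_FOUND_BODY = "<html><body><p>Page not found</p></body></html>\n"


def render_page(page_id):
    """Return (status, body). Nothing reaches the user until both are final."""
    buffer = io.StringIO()  # output is collected here instead of being sent
    buffer.write("<html><body>\n")

    if page_id not in VALID_PAGES:
        # Bad parameters: throw away the partial page and answer with a
        # real 404 status, not a 200 with 404 text pasted into it.
        return 404, NOT_FOUND_BODY

    buffer.write(f"<h1>{page_id}</h1>\n")
    buffer.write("</body></html>\n")
    return 200, buffer.getvalue()


if __name__ == "__main__":
    for page_id in ("rats", "qrkltejtl"):
        status, body = render_page(page_id)
        print(page_id, status, len(body))
```

The point is only that the status is decided before a single byte goes out, so the half-built page never gets mixed with the 404 content.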
How does Google actually discover that requesting anything returns a 200 for a website? I mean it only follows the internal links through its crawling process, right?
I don't know what triggers a Garbage Request. But now and then in logs I'll find something like "/paintings/rats/qrkltejtl.html" which is obviously not a typo or garbled link. The search engine is testing whether your site returns a 404 when there can't possibly be a page matching the request. There are two kinds of bad response. One is the global redirect to the front page ("soft 404"); the other is the one you're getting.
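You can run the same kind of test against your own site. The sketch below is only a guess at the general idea, not how any search engine actually does it: it asks for a random nonsense filename and looks at what comes back. The function names (garbage_path, classify, probe) and the example.com address are placeholders I made up.

```python
import random
import string
import urllib.error
import urllib.request
from urllib.parse import urljoin


def garbage_path(length=9):
    """Build a path like /qrkltejtl.html that cannot match a real page."""
    name = "".join(random.choice(string.ascii_lowercase) for _ in range(length))
    return f"/{name}.html"


def classify(status, requested_url, final_url):
    """Describe the response to a request for a page that does not exist."""
    if status in (404, 410):
        return "correct: not found"
    if final_url != requested_url:
        return "soft 404: redirected elsewhere"
    if status == 200:
        return "wrong: 200 for a nonexistent page"
    return f"other: status {status}"


def probe(base_url):
    """Request one garbage URL under base_url and classify the result."""
    url = urljoin(base_url, garbage_path())
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return classify(response.status, url, response.geturl())
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the status is in err.code
        return classify(err.code, url, url)


if __name__ == "__main__":
    # Placeholder address; substitute your own site.
    print(probe("https://example.com/"))
```

If it prints anything other than the "correct" line, the server is answering garbage requests with something other than a real 404.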