[webmasterworld.com...]
If HTTP_IF_MODIFIED_SINCE is set up the right way, it will keep bots from re-downloading every document on a site again and again.
With If-Modified-Since, the bot sends the date of its last visit (something the bot knows), and the server answers 304 Not Modified for files that haven't changed since then, so only the changed files get downloaded again.
This way the bot can spider more efficiently, uses less bandwidth on repeat readings and (most importantly, especially for large and frequently changed sites) will get to the new and changed pages more often.
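The server-side logic above can be sketched roughly like this. This is just an illustrative Python sketch, not anything from Google; the function name and the example dates are made up, but the header parsing uses the standard library and the 304-vs-200 decision follows the HTTP rules for conditional GETs:

```python
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

def respond(last_modified, if_modified_since):
    """Decide a conditional GET: return (status, body).

    last_modified: datetime when the page last actually changed.
    if_modified_since: raw value of the If-Modified-Since request
    header (CGI scripts see it as HTTP_IF_MODIFIED_SINCE), or None.
    """
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full 200
        # HTTP dates have one-second resolution, so drop microseconds.
        if since and last_modified.replace(microsecond=0) <= since:
            return 304, b""  # Not Modified: the bot skips re-downloading
    return 200, b"<html>full page</html>"

# Hypothetical example: page last changed May 1, 2024 at noon UTC.
changed = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
print(respond(changed, "Wed, 01 May 2024 12:00:00 GMT")[0])  # 304
print(respond(changed, "Tue, 30 Apr 2024 12:00:00 GMT")[0])  # 200
print(respond(changed, None)[0])                             # 200
```

The bandwidth win is in that 304 branch: the body is empty, so a revisit to an unchanged page costs headers only.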
Correct, and it saves Google some bandwidth....all of which is good enough reason for correct implementation.
However, as for "Freshie" listings, I believe it is the difference between the last cached copy and the current crawl.....which is not necessarily that fresh, as Google may see the "same difference" across multiple crawls over a few weeks for a single change. They need a better way to update the comparison IMHO, so that the truly very, very fresh pages are given the edge.
But that is only one indicator, and they will spider a page again even if it hasn't been modified. It's just that if they see a file has been modified since the last visit, they may be more apt to go deeper and see if other pages have been modified too. (imho)