I'm developing a site that relies on a data feed which is updated about every hour. I'm caching it anyway, so as not to strain the other server and to speed up page loads.
Using cached data, the page loads are very quick, usually under 0.01 seconds, but when this data needs to be updated on the fly, it can take between 1-20 seconds. Typically it's 3-7 seconds. Unfortunately, it's not feasible to preload and cache this data.
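For context, the caching logic is roughly like this (a simplified sketch, not my actual code -- the feed URL, cache path, and function name are just placeholders):

```python
import os
import time
import urllib.request

# Placeholder values -- not the real feed URL or cache location.
FEED_URL = "http://example.com/feed.xml"
CACHE_FILE = "/tmp/feed_cache.xml"
CACHE_TTL = 3600  # the feed updates roughly hourly

def get_feed():
    """Return the feed data, refreshing the cache only when it has expired."""
    try:
        age = time.time() - os.path.getmtime(CACHE_FILE)
        if age < CACHE_TTL:
            # Fast path: serving from cache, usually well under 0.01s.
            with open(CACHE_FILE, "rb") as f:
                return f.read()
    except OSError:
        pass  # no cache file yet; fall through to a fresh fetch
    # Slow path: the on-the-fly update that takes 1-20 seconds.
    data = urllib.request.urlopen(FEED_URL).read()
    with open(CACHE_FILE, "wb") as f:
        f.write(data)
    return data
```

So whoever happens to hit the page first after the hour is up eats the slow fetch, which is exactly the request a spider might land on.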
First of all, is that kind of page load time going to be a problem for spiders? Let's say pages with expired cache data load in an average of 5-6 seconds. I'm really only concerned about Googlebot and Inktomi/Slurp. If it would be a problem, would it be OK to serve a page containing the most recently cached data to those spiders?
I don't see a problem with it. Even a shadow spider would only see insignificant differences. Titles and headings would always be the same. Any thoughts? Am I taking a risk at all by doing that? I know Inktomi is known to send shadow spiders...would slight differences set off any cloaking flags?
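To make the idea concrete, here's roughly what I have in mind for spotting those two bots and handing them the cached copy instead of making them wait on a refresh (just a sketch; the function names and TTL are illustrative):

```python
import time

# The two crawlers I'm concerned about: Googlebot and Inktomi/Slurp.
KNOWN_SPIDERS = ("googlebot", "slurp")
CACHE_TTL = 3600  # feed updates roughly hourly

def is_spider(user_agent):
    """True if the request's User-Agent matches one of the bots we care about."""
    ua = (user_agent or "").lower()
    return any(bot in ua for bot in KNOWN_SPIDERS)

def choose_data(cached_data, cached_at, user_agent, fetch_fresh):
    """Serve the stale cached copy to spiders; only humans trigger a refresh."""
    expired = time.time() - cached_at > CACHE_TTL
    if expired and not is_spider(user_agent):
        return fetch_fresh()  # the slow 1-20 second on-the-fly update
    return cached_data        # spiders always get the instant cached copy
```

The cached copy a spider sees would be at most an hour older than what a human sees, with the same titles and headings, which is why I'm guessing the differences are insignificant.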
I've never done any testing to see how long Slurp or Google will stick around for a page request, though it would seem that 5 seconds isn't very long.
Still, I'd think feeding the bots the cached pages makes sense -- there would be less chance of getting a server error indexed. I can't see how this would be considered a negative thing in the eyes of the search engines.