My site's PR is 5 and I update my home page every day. Google used to pick up my page every other day, but now it's only once a month. I have a robots.txt which is perfectly OK, and my server is always up.
What can be the reason? All inputs are welcome.
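For anyone who wants to rule out robots.txt as the cause, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `EXAMPLE_ROBOTS` and the example.com URLs are made up for illustration; paste in your own file's contents and your own home page URL.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Hypothetical robots.txt content, not taken from the original poster's site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(googlebot_allowed(EXAMPLE_ROBOTS, "http://www.example.com/"))
    print(googlebot_allowed(EXAMPLE_ROBOTS, "http://www.example.com/private/a.html"))
```

If the home page comes back as not allowed, the crawl drop is probably a robots.txt problem rather than anything to do with links or PageRank.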
There could be many reasons why your home or index page gets (or got) the honour of a frequent spidering schedule from Google. I very much doubt that having your page updated or changed every day is the single criterion. That would probably be too easy. My personal feeling is that pages that have recently received new inbound links, mainly external ones from important (high PageRank) pages, are more eligible for frequent crawling, or what used to be called Fresh! (the date stamp next to your listing in the Google results). Naturally, your index page will most probably be the page of your site with the most inbound external links.
It could be that once your index page stops getting new inbound links, Google decides to stop giving your page the frequent crawling status, since other sites seem to find your (home) page less interesting to link to lately. Once your page has reached a certain PageRank threshold, Google could decide you deserve permanent frequent spidering/re-indexing. There have been postings here on WebmasterWorld stating that a good DMOZ or Yahoo link would be enough to keep the frequent crawling status for that page, and there is some good logic in that, as Google tends to value both directories.
Furthermore, even the highest ranking sites do not carry a Fresh date (cache) stamp every day. Google tends to skip a day every once in a while.
I do not understand why you should rank higher once you have been recently crawled and indexed for a couple of days though. This should only be the case for recently added keywords not on your homepage a month ago.
Yes, as I said earlier, I update my page every day, so keywords are added or removed almost every day. I can understand the logic of being crawled daily if one has a higher PageRank or a regular increase in inbound links. But my problem is why Google regularly reverts back to my one-month-old page.
I also know there are many inbound links to my internal pages, but those backward links are not shown in Google. All my internal pages have a PR of less than 4. Also, the backward links of my home page do not include my internal pages.
But my problem is why Google regularly reverts back to my one-month-old page.
That is a good question. Why not keep the most recent cached version, instead of reverting back to old stuff?
I have a feeling a lot of this frequent crawling plus short reindexing/caching is still in the testing phase.
One of the moderators recently had a nice analogy of two different crawlers (father-crawler and son-crawler or something similar). The father-crawler does the heavy monthly crawling and indexing. The son-crawler only goes for so-called fresh stuff and interferes with the normal result rankings by adding in some recently crawled data (such as your newly added keywords) when relevant. Probably the son-crawler has a limited database size for the moment, and therefore the cache reverts back to dad's database every so often when one's webpage is not considered fresh enough ;).
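To make the father/son analogy concrete, here is a toy Python sketch. It is pure guesswork about the behaviour described above, not how Google actually works: the class name `TwoTierCache`, the capacity limit, and the "evict the oldest fresh entry" rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy model: a big monthly index plus a small, bounded fresh index."""

    def __init__(self, fresh_capacity: int):
        self.monthly = {}            # father-crawler: url -> monthly snapshot
        self.fresh = OrderedDict()   # son-crawler: url -> recent snapshot
        self.fresh_capacity = fresh_capacity

    def deep_crawl(self, url: str, snapshot: str) -> None:
        """The monthly crawl always stores its snapshot."""
        self.monthly[url] = snapshot

    def fresh_crawl(self, url: str, snapshot: str) -> None:
        """The fresh crawl stores a snapshot, evicting the oldest if full."""
        self.fresh[url] = snapshot
        self.fresh.move_to_end(url)
        while len(self.fresh) > self.fresh_capacity:
            self.fresh.popitem(last=False)  # assumed policy: drop the oldest

    def cached(self, url: str):
        """Prefer the fresh snapshot; otherwise fall back to the monthly one."""
        if url in self.fresh:
            return self.fresh[url]
        return self.monthly.get(url)


if __name__ == "__main__":
    cache = TwoTierCache(fresh_capacity=1)
    cache.deep_crawl("mysite", "page from last month")
    cache.fresh_crawl("mysite", "page from today")
    print(cache.cached("mysite"))   # fresh copy is served
    cache.fresh_crawl("othersite", "their page from today")
    print(cache.cached("mysite"))   # fresh copy evicted, monthly copy is back
```

In this model, a page whose fresh snapshot gets pushed out falls back to the month-old copy, which matches the "reverting to the old page" symptom without the monthly data ever being wrong.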