Now, my current question is about Google freshness. I know about the dance, the deepcrawl, and freshbot. I know that freshbot comes by and indexes my updated pages. I know that if they are not fresh the next day, they disappear the day after in favor of a page that is weeks old. The question is, WHY? If it's in the index for a day, why can't it stay there? Why does Google think that the weeks-old page is more important or useful to the surfer? This is what bothers me with freshbot. Can someone tell me the point? I don't understand, and that is not good for someone who is so opinionated.
Thanks,
Hank (TheComte)
Sites with good PR - important in the eyes of web surfers - whose content changes every few days or weeks might get permanently updated like that in the future (once they have sorted out the spamming issues), and the others might only be indexed during the deepcrawl.
--- Anyway, just a wild thought of mine ---
Leo
It seems like Google could provide more relevant search results if the "fresh index" was cumulative, with each fresh index being added to the previous one, rather than replacing it.
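To make the "cumulative" idea concrete, here is a rough sketch of the difference (the structures and names are invented purely for illustration - nobody outside Google knows how the fresh index actually relates to the main one):

```python
# Hypothetical sketch of "replacing" vs. "cumulative" fresh results.
# Everything here is made up to illustrate the suggestion, not Google's internals.

day1 = {"site-a.com/news": "Tuesday's crawl"}
day2 = {"site-b.com/news": "Wednesday's crawl"}   # Wednesday's crawl missed site-a

# Replacing (the apparent current behaviour): only the latest batch is searchable,
# so site-a's fresh listing simply vanishes on Wednesday.
fresh_replacing = day2

# Cumulative (the suggestion): each day's batch is merged into a running fresh index,
# so site-a's listing survives until something newer supersedes it.
fresh_cumulative = {}
fresh_cumulative.update(day1)
fresh_cumulative.update(day2)

print("site-a.com/news" in fresh_replacing)    # False - the fresh listing is gone
print("site-a.com/news" in fresh_cumulative)   # True - it stays until superseded
```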
After all, you still have all the other options in the SERPs to click on, haven't you? So the loss of one page - even though it was fresh but not completely evaluated by the algo - is not a biggie, surely?
>>So the loss of one page - even though it was fresh but not completely evaluated by the algo - is not a biggie, surely?<<
No, that's not a biggie. None of my sites are a biggie in the great scheme of things. The question is, why does Google prefer my older pages? They are not relevant. The current page is relevant. Can their algo determine that? I just get fed up with so many generalities in this world as opposed to specifics. My child is mine. If you don't agree, we'll cut him in half. Is that the way it should be? It seems that Google is moving toward the big-business paradigm of presenting the blandest content possible. Is that a good thing? Maybe.
I understand what you are saying. That may apply to YOUR site, but may not apply to thousands of others that are working and spamming Mr Freshy for all it's worth. (You just have to see a few posts today!) In other cases, not yours, the new content may indeed not be more relevant - it may be fresher, but it has not been properly evaluated for PR etc. It may actually be "freshy spam" - so Google may think it's better to give freshened pages their day in the sun and then tuck them into bed until they get fully evaluated in the next update.
Things may change, however - and I'm sure they will. I'm sure that freshy is still beta.
My model is based on the news site. Three days ago results for 'iraq latest news' picked up pages with that title. Today results for the same term drop those pages and pick up today's pages with that title. Hence the results stay fresh. Hopefully the pages that the main googlebot picks up are relevant in the longer term.
So, the fact that new pages fall out of the index and are replaced by older ones just indicates that the storage capacity for fresh pages is limited, and Google can't store all the fresh pages. When they run out of room for your site's fresh page(s), your page's fresh listing gets dropped, and the search results listing reverts to the data stored in the deepcrawl index.
So, it's a technical limitation of the current experimental setup; I'll bet they are deeply enjoying the discussion here about how to manipulate the freshbot, and are busily designing in safeguards for the next beta version. If Google does roll out the freshbot as a full implementation, I think most of us will be much happier, except for those who are "faking" fresh content.
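If that capacity theory is right, the behaviour people are seeing would look something like this toy model (pure guesswork - the capacity, the eviction rule, and the names are all invented, just to illustrate "limited room, fall back to the deepcrawl data"):

```python
from collections import OrderedDict

# Toy model of the "limited fresh storage" theory. Entirely speculative.
DEEP_INDEX = {"yoursite.com/page": "deepcrawl copy (weeks old)"}
FRESH_CAPACITY = 2           # pretend only two fresh listings fit at a time

fresh_store = OrderedDict()  # ordered so the oldest listing is evicted first

def freshbot_adds(url, content):
    # A newly crawled fresh page goes in; if there's no room, the oldest fresh
    # listing is evicted and its result reverts to the deepcrawl data.
    fresh_store[url] = content
    while len(fresh_store) > FRESH_CAPACITY:
        fresh_store.popitem(last=False)

def search_result(url):
    # Serve the fresh copy while it exists, otherwise the deepcrawl copy.
    return fresh_store.get(url, DEEP_INDEX.get(url))

freshbot_adds("yoursite.com/page", "fresh copy (today)")
print(search_result("yoursite.com/page"))   # fresh copy (today)

freshbot_adds("othersite.com/a", "fresh")
freshbot_adds("othersite.com/b", "fresh")   # capacity hit: your listing is evicted
print(search_result("yoursite.com/page"))   # back to the deepcrawl copy (weeks old)
```

In this toy model the fresh listing only survives as long as there's room; the moment it's evicted, the SERP simply shows whatever the last deepcrawl stored.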
Freshbot is probably necessary because, as the Web grows, it may become impossible to re-index the whole thing and re-calculate PageRank once a month. I recall the figure is something like seven million new pages a day? The PageRank calculation probably already takes a long time, and will take longer and longer as the Web grows. Freshbot is a useful experiment to learn how to find the content that does need to be updated monthly, and to differentiate it from that which is static over time. Integrating the new pages it discovers using an "approximate" or "guessed" PR method may make it possible to reduce the monthly PageRank calculations which must be performed as well.
The above is my opinion - I don't have any inside track to Google.
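For what it's worth, the heaviness of that monthly job is easy to see in the published power-iteration formulation of PageRank: every added page and link adds work to every pass over the whole graph. A toy sketch (the graph, damping factor, and iteration count here are just illustrative, nothing to do with Google's actual setup):

```python
# Toy PageRank via power iteration, roughly as in the published paper.
# Real link graphs have billions of edges, which is why a full monthly
# recalculation is such an expensive job.

DAMPING = 0.85
links = {              # tiny example link graph: page -> pages it links to
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}

pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):    # iterate until the ranks settle
    new_rank = {p: (1 - DAMPING) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = DAMPING * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

print({p: round(r, 3) for p, r in rank.items()})
```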
I hope the experiment is successful - it will vastly reduce the energy I must put into managing client expectations; no one likes to wait 30, 60, or 90 days to find their Web site well-indexed, especially when search results for all the other sites are already available almost instantly.
YMMV,
Jim
John wrote:
>>results for 'iraq latest news' picked up pages with that title<<
Ah ha! Looks like we are competitors. Ours probably replaced yours yesterday, and yours will probably replace ours tomorrow. Keep on writing new pages to keep overwriting our pages, and keep them specific to certain aspects of the war. Competing with the possibly millions of pages on Iraq news at the moment is a hard ask, but optimizing the Iraq war news for a certain spin - e.g. military strategy, the peacenik movement, humanitarian aid, implications for trade, media coverage analysis - means you are competing in a much smaller pool, one more targeted to the aims and readership of your site.