

Opinions and Freshness

Or George, you should have listened.


TheComte

5:38 pm on Apr 4, 2003 (gmt 0)

10+ Year Member



Ok, I have opinions. Anyone who knows me will tell you that. If I pass you on the street, I will pass on my opinions, even if I have to wrestle you to the ground to make you listen. If George Washington had listened to me, we would still be a British colony. So, see, I have credibility.

Now, my current opinion is about Google freshness. I know about the dance, the deepcrawl and freshbot. I know that freshbot comes by and indexes my updated pages. I know that if they are not fresh the next day, they disappear the day after in favor of a page that is weeks old. The question is, WHY? If it's in the index for a day, why can't it stay there? Why does Google think that the weeks old page is more important or useful to the surfer? This is what bothers me with freshbot. Can someone tell me the point? I don't understand, and that is not good for someone who is so opinionated.

Thanks,

Hank (TheComte)

John_Caius

5:41 pm on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The point of freshbot is that it highlights pages that are fresh *now*, not pages that 'were fresh a week ago'. The idea is best exemplified by a breaking news story. If you search for 'iraq latest news', you'd hope to pick up what is the latest news now, not what was the latest news last week.

TheComte

5:45 pm on Apr 4, 2003 (gmt 0)

10+ Year Member



But that's exactly the point, John. For myself, if I am searching for a news item, I would rather see the story that was updated two days ago than one from two or three weeks ago. Same analogy for my web sites. I think the pages that are two days old are more useful than the pages that are two weeks old.

le_gber

5:46 pm on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The freshbot might be some kind of experiment from Google to see if it is viable for them to update their index every day.

The well-rated PR sites - important in the eyes of web surfers - with content modified every day or week might get permanently updated like that in the future (once they have sorted out the spamming issues), and the others might only be indexed during the deepcrawl.

--- Anyway, just a wild thought of mine ---

Leo

swerve

5:54 pm on Apr 4, 2003 (gmt 0)

10+ Year Member



I agree with Hank. In the current system there are two extremes: "now" (fresh) or "up to a month ago". The result is that searchers often don't receive the most relevant search results.

It seems like Google could provide more relevant search results if the "fresh index" was cumulative, with each fresh index being added to the previous one, rather than replacing it.
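The difference between the replace-each-day behaviour swerve describes and the proposed cumulative fresh index can be sketched as a toy in a few lines of Python. Everything here (function names, the index-as-dict representation) is purely illustrative, not Google's actual design:

```python
# Toy model of the two fresh-index strategies under discussion.
# All names and data structures are illustrative, not Google's actual design.

def serve(query_terms, deep_index, fresh_index):
    """Return fresh hits first, then deep-crawl hits not shadowed by a fresh copy."""
    fresh_hits = [url for url, terms in fresh_index.items() if query_terms & terms]
    deep_hits = [url for url, terms in deep_index.items()
                 if query_terms & terms and url not in fresh_index]
    return fresh_hits + deep_hits

def refresh_replace(fresh_index, todays_crawl):
    """Observed behaviour: each day's fresh crawl replaces yesterday's,
    so a page fresh two days ago reverts to its weeks-old deep-crawl copy."""
    return dict(todays_crawl)

def refresh_cumulative(fresh_index, todays_crawl):
    """Proposed behaviour: each day's crawl is added to the fresh index,
    so recently-fresh pages keep shadowing the stale deep-crawl copies."""
    merged = dict(fresh_index)
    merged.update(todays_crawl)
    return merged
```

Under `refresh_replace`, yesterday's fresh page vanishes and queries fall back to the deep-crawl entry; under `refresh_cumulative` it would keep serving until the next deepcrawl re-evaluates it.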

chiyo

6:09 pm on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think the key reason for this is that since fresh pages go up straight away, there is not enough time to evaluate their real authority via PR or other methods. I know a lot of our fresh content goes up, but in 90% of cases it does not really deserve it. The one-day appearance (I thought it was a bit longer?) just reflects a compromise between freshness and credibility/authority.

After all, you still have all the other options in the SERP to click on, haven't you? So the loss of one page that was fresh but not completely evaluated by the algo is not a biggie, surely?

TheComte

8:09 pm on Apr 4, 2003 (gmt 0)

10+ Year Member



>>So the loss of one page that was fresh but not completely evaluated by the algo is not a biggie, surely?<<

No, that's not a biggie. None of my sites are a biggie in the great scheme of things. The question is: why does Google prefer my older pages? They are not relevant. The current page is relevant. Can their algo determine that? I just get fed up with so many generalities in this world as opposed to specifics. My child is mine. If you don't agree, we'll cut him in half. Is that the way it should be? It seems that Google is moving toward the big-business paradigm of presenting the blandest content possible. Is that a good thing? Maybe.

chiyo

8:40 pm on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>why does Google prefer my older pages? They are not relevant. The current page is relevant.<<

I understand what you are saying. That may apply to YOUR site, but it may not apply to thousands of others that are working and spamming Mr Freshy for all it's worth. (You just have to see a few posts today!) In other cases, not yours, the new content may indeed not be more relevant - it may be fresher, but it has not been properly evaluated for PR etc. It may actually be "freshy spam" - so Google may think it's better to give fresh pages their day in the sun and then tuck them into bed until they get fully evaluated in the next update.

Things may change, however - and I'm sure they will. I'm sure that freshy is still in beta.

John_Caius

8:56 pm on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry, I think we actually agree.

My model is based on the news site. Three days ago results for 'iraq latest news' picked up pages with that title. Today results for the same term drop those pages and pick up today's pages with that title. Hence the results stay fresh. Hopefully the pages that the main googlebot picks up are relevant in the longer term.

jdMorgan

9:29 pm on Apr 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I believe that GoogleGuy stated at the outset that the freshbot is an experiment (but I can't find that thread). Given that, and the observed behaviour of page listings, it is fairly obvious that the fresh results are processed and stored separately from the main index, and are then injected in the search query response phase.

So, the fact that new pages fall out of the index and are replaced by older ones just indicates that the storage capacity for fresh pages is limited, and Google can't store all the fresh pages. When they run out of room for your site's fresh page(s), your page's fresh listing gets dropped, and the search results listing reverts to the data stored in the deepcrawl index.
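The capacity hypothesis above - a separate, size-limited fresh store consulted before the deep-crawl index, with evictions reverting a URL to its older copy - can be sketched as a toy in Python. The class name, the oldest-first eviction policy, and the capacity are all guesses for illustration, not anything Google has confirmed:

```python
from collections import OrderedDict

class FreshStore:
    """Toy capacity-limited fresh index, consulted before the deep-crawl index."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # url -> fresh snapshot, oldest first

    def add(self, url, snapshot):
        if url in self.pages:
            self.pages.pop(url)             # re-crawled: move to the newest slot
        elif len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # out of room: drop the oldest listing
        self.pages[url] = snapshot

    def lookup(self, url, deep_index):
        # A fresh copy shadows the deep-crawl copy; once evicted,
        # results revert to the (possibly weeks-old) deep-crawl data.
        return self.pages.get(url) or deep_index.get(url)
```

With a capacity of two, adding a third fresh page silently evicts the first, and lookups for the evicted URL fall back to the stale deep-crawl snapshot - exactly the disappearing-after-a-day behaviour described in the thread.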

So, it's a technical limitation of the current experimental setup; I'll bet they are thoroughly enjoying the discussion here about how to manipulate the freshbot, and are busily designing in safeguards for the next beta version. If Google does roll out the freshbot as a full implementation, I think most of us will be much happier, except for those who are "faking" fresh content.

Freshbot is probably necessary because, as the Web grows, it may become impossible to re-index the whole thing and re-calculate PageRank once a month. I recall the figure is something like seven million new pages a day? The PageRank calculation probably already takes a long time, and will take longer and longer as the Web grows. Freshbot is a useful experiment to learn how to find the content that does need to be updated monthly, and to differentiate it from content that is static over time. Integrating the new pages it discovers using an "approximate" or "guessed" PR method may also make it possible to reduce the monthly PageRank calculations that must be performed.

The above is my opinion - I don't have any inside track to Google.

I hope the experiment is successful - it will vastly reduce the energy I must put into managing client expectations; no-one likes to wait 30-60-90 days to find their Web site well-indexed, especially when search results for all the other sites are already available almost instantly.

YMMV,
Jim

chiyo

5:10 am on Apr 5, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jdMorgan.. good points, though I'm still not convinced it's a storage-limit problem. I think it's more to do with the amount of evaluation they can do for a page in a few hours. For example, it's impossible to determine PR without comparing that page to all other pages - which they can only do once a month. In the meantime they have a very rough "guess" based on just a few criteria.
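The contrast being drawn here - a full, global PageRank computation once a month versus a rough one-hop guess for a freshly crawled page - can be sketched as follows. The formulas follow the published PageRank definition; the function names and the one-hop shortcut are illustrative, not Google's actual method:

```python
# Toy contrast: monthly full PageRank vs. a quick one-hop estimate for a
# fresh page. Illustrative only; not Google's actual implementation.

DAMPING = 0.85

def pagerank(links, iters=50):
    """Full power iteration over the whole link graph (the monthly job)."""
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        nxt = {p: (1 - DAMPING) / n for p in pages}
        for p, outs in links.items():
            share = pr[p] / len(outs) if outs else 0.0
            for q in outs:
                if q in nxt:
                    nxt[q] += DAMPING * share
        pr = nxt
    return pr

def guess_fresh_pr(inlink_prs, inlink_outdegrees, n):
    """Rough guess for a new page from the already-known PR of its inlinks,
    without re-running the global iteration over all n pages."""
    return (1 - DAMPING) / n + DAMPING * sum(
        pr / deg for pr, deg in zip(inlink_prs, inlink_outdegrees))
```

The guess only looks one hop back, so it can be computed in hours for a freshly crawled page, but it can be badly wrong until the next full iteration - which would explain both the quick appearance and the conservative one-day lifespan.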

John wrote:

>>results for 'iraq latest news' picked up pages with that title<<

Ah ha! Looks like we are competitors. Ours probably replaced yours yesterday, and yours will probably replace ours tomorrow. Keep on writing new pages to keep overwriting ours, and keep them specific to certain aspects of the war. Competing with the possibly millions of pages on Iraq news at the moment is a hard ask, but optimizing your Iraq war news for a certain spin - e.g. military strategy, the peacenik movement, humanitarian aid, implications for trade, media coverage analysis - means you are competing in a much smaller pool, one more targeted to the aims and readership of your site.