

Does history cause time-related filters?

Filters on Google


Whitey

3:35 am on Feb 28, 2007 (gmt 0)




This caught my eye:

Sites out of the Sandbox [webmasterworld.com]

We have 9 repaired sites, and 2 of those have not come back into the SERPs after 8-9 months. They all have similar architecture and varied content.

The 2 that are filtered out of the SERPs are the oldest and the best supported in terms of quality inbound links. However, they did have duplicate content issues, which lasted for 8 months, up to around July 2006. They also have the best PR, at 5 and 6.

However, the other 7 had little history, were also affected by duplicate content, and yet preceded them back into the SERPs after just 3-4 months.

My sense is there is a continual revalidation process going on.

Does anyone have thoughts on whether Google's algo applies time delays to filters that prolong the recovery of a previously affected site, based on its history? Any ideas how this works?

tedster

4:19 am on Feb 28, 2007 (gmt 0)




This may not be a direct answer, but I feel a study of Google's patent application, Information retrieval based on historical data [appft1.uspto.gov], can reveal some important possible factors. The application lists many kinds of historical data and details a large number of potential uses.

Sometimes the language is remarkably vague and ambivalent:

[0039] Consider the example of a document with an inception date of yesterday that is referenced by 10 back links. This document may be scored higher by search engine 125 than a document with an inception date of 10 years ago that is referenced by 100 back links because the rate of link growth for the former is relatively higher than the latter. While a spiky rate of growth in the number of back links may be a factor used by search engine 125 to score documents, it may also signal an attempt to spam search engine 125. Accordingly, in this situation, search engine 125 may actually lower the score of a document(s) to reduce the effect of spamming.
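
To make that concrete, here is a minimal Python sketch of the kind of scoring paragraph [0039] seems to describe. Everything in it - the function name, the spike threshold, the penalty factor - is my own guess, not anything the patent specifies:

from dataclasses import dataclass

@dataclass
class DocumentHistory:
    age_in_days: int     # days since the document's inception date
    backlink_count: int  # back links observed so far

def link_growth_score(doc: DocumentHistory,
                      spike_threshold: float = 5.0,
                      spam_penalty: float = 0.5) -> float:
    # Score by the *rate* of link growth, not the raw link count,
    # as [0039] describes. A suspiciously spiky rate is damped
    # rather than rewarded, to reduce the effect of spamming.
    rate = doc.backlink_count / max(doc.age_in_days, 1)
    if rate > spike_threshold:
        return rate * spam_penalty
    return rate

# The patent's own example: 10 links in 1 day vs. 100 links in 10 years.
new_doc = DocumentHistory(age_in_days=1, backlink_count=10)
old_doc = DocumentHistory(age_in_days=3650, backlink_count=100)
assert link_growth_score(new_doc) > link_growth_score(old_doc)

Run on the patent's example, the young document outscores the old one on growth rate, while a spiky rate gets damped instead of rewarded.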

The patent also details some of the minutiae that Google watches:

...freshness of a link associated with the document is based on at least one of a date of appearance of the link, a date of a change to the link, a date of appearance of anchor text associated with the link, a date of a change to anchor text associated with the link, a date of appearance of a linking document containing the link, and a date of a change to a linking document containing the link
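
Note the "at least one of" wording - the patent doesn't say how these dates combine. One plausible (but entirely assumed) reading is simply the most recent of whatever dates are known:

from datetime import date
from typing import Optional

def link_freshness(link_appeared: date,
                   link_changed: Optional[date] = None,
                   anchor_appeared: Optional[date] = None,
                   anchor_changed: Optional[date] = None,
                   linking_doc_appeared: Optional[date] = None,
                   linking_doc_changed: Optional[date] = None) -> date:
    # Freshness as the most recent of the dates the claim lists.
    # The claim only says "at least one of", so taking the max of
    # whatever is known is a guess, not the patent's method.
    known = [d for d in (link_appeared, link_changed,
                         anchor_appeared, anchor_changed,
                         linking_doc_appeared, linking_doc_changed)
             if d is not None]
    return max(known)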

These two sections caught my eye as potentially affecting the recovery rate of URLs that were previously penalized.

41. The method of claim 1, wherein the one or more types of history data includes information relating to a prior ranking history of documents; and wherein the generating a score includes: determining a prior ranking history of the document, and scoring the document based, at least in part, on the prior ranking history of the document.

42. The method of claim 41, wherein the scoring the document includes: determining a quantity or rate that the document moves in rankings over a time period, and scoring the document based, at least in part, on the quantity or rate that the document moves in the rankings.
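
Claim 42 suggests that volatility in the rankings is itself a signal. As a sketch of one naive reading (the damping factor and the penalty shape are both assumptions on my part):

def ranking_volatility(rank_history: list[int]) -> float:
    # Total movement in the rankings over a period (claim 42):
    # the sum of absolute position changes between snapshots.
    return sum(abs(b - a) for a, b in zip(rank_history, rank_history[1:]))

def history_adjusted_score(base_score: float,
                           rank_history: list[int],
                           damping: float = 0.01) -> float:
    # Score the document based, in part, on how much it has moved
    # in the rankings - here by damping the base score as
    # volatility rises. The damping factor is invented.
    return base_score / (1.0 + damping * ranking_volatility(rank_history))

# A page that bounced 50 -> 3 -> 40 -> 5 scores below a stable one.
assert (history_adjusted_score(1.0, [4, 5, 4, 5])
        > history_adjusted_score(1.0, [50, 3, 40, 5]))

Under that reading, a URL that has bounced around the SERPs would carry its instability forward as a lower score - which might be one mechanism behind a prolonged recovery.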


Whitey

10:16 pm on Feb 28, 2007 (gmt 0)




This was a fascinating read - thanks for providing the link.

On a general level, it certainly demonstrates that Google's thinking strongly considers time, quality, and change as a key relationship for scoring documents. Time and change appear to be related to:

- Document Inception Date
- Content Updates/Changes
- Query Analysis
- Link-Based Criteria
- Anchor Text
- Traffic
- User Behavior
- Domain-Related Information
- Ranking History
- User-Maintained/Generated Data (e.g. Bookmarks, Favourites)
- Unique Words, Bigrams, Phrases in Anchor Text
- Linkage of Independent Peers (Unnatural link patterns)
- Document Topics (Themes)

Patterns in the past (i.e. history) then appear to have a direct relationship to the current scoring - perhaps as a weighted combination of the factors above, as in the sketch below.
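
A purely illustrative Python sketch of that idea (the factor names are lifted from the patent's headings; the weights are entirely invented):

HISTORY_WEIGHTS = {
    "inception_date": 0.15,
    "content_updates": 0.15,
    "link_growth": 0.25,
    "anchor_text": 0.10,
    "traffic": 0.10,
    "user_behavior": 0.10,
    "ranking_history": 0.15,
}

def history_score(signals: dict[str, float]) -> float:
    # Weighted combination of normalised (0..1) history signals.
    # Factor names follow the patent's headings; weights invented.
    return sum(weight * signals.get(name, 0.0)
               for name, weight in HISTORY_WEIGHTS.items())

# e.g. strong link growth but a shaky ranking history:
print(history_score({"link_growth": 0.9, "ranking_history": 0.2}))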

There is more in this document than my head can absorb in one go [perhaps ever], but it helps to build a picture of where Google's emphasis lies.

One thing that strikes me is that it seems a sounder strategy to get things right from the start and keep building, rather than try to repair or change. I recall [somewhere] that Adam Lasnik emphasised this, and several senior webmasters have as well.

On this basis, the so-called "trust rank" factor builds an underlying foundation against potential "hiccups" which might otherwise undermine a site.

My perception is that many webmasters who do not get the foundations right at the beginning may therefore face challenges in sustaining their websites.

IMO, the task becomes harder with time, as filters kick in and get scored into the history of the site.

I wonder: can such sites recover "trust rank" with good changes?

What are the practical experiences of webmasters in overcoming the historic patterns of websites they have become custodians of?

walkman

1:25 am on Mar 1, 2007 (gmt 0)



What about the sites that clean up, resubmit, and move to the top of the SERPs?

We have seen it many times. A fresh start from G?

walkman

1:03 am on Mar 2, 2007 (gmt 0)



>> FYI - we have some filter holding 2 of our sites out of the index for almost 9 months - I've no idea why

Are you out of the index, or just out of top rank?

Whitey

2:07 am on Mar 2, 2007 (gmt 0)




We are indexed, but filtered out of top rank - in fact, out of virtually any rank for most pages.

Prior to the progressive release of the other 7 sites, they followed a similar pattern - the results behaved the same way.
