BillyS - 5:52 pm on Mar 12, 2005 (gmt 0)
Spam is a problem for Google. Spammers can easily get hold of popular keywords or phrases. Google also creates a primary results database that is fast and is based on frequently queried terms. Ahh, a good fit for the engineers - right? Exclude certain websites from the primary index and you can return fast, stable and spam-free results. The casualty will be new websites of high quality - right? But why should the engineer care about that? If a search term is popular, then it has probably been asked and answered millions of times. If you're asking for information on "hotels in california" then how stale can sites more than 1 year old be? (pretending the filter is based on time).
But how do you keep spammers out of the primary index or database, i.e. how do you sandbox sites? That is the challenge for the engineer. How can they mark a page or site to indicate that it does not qualify for inclusion in the primary index? One way is to create a composite filter that results in a "spam score." They have a lot of information, so this is quite easy. The problem for us is to reverse engineer what the score might be based on so we can avoid tripping it - if that is even possible.
Personally, I think that Google is giving us two pieces of information to work with already:
1. Brand new websites seem to avoid the sandbox in their first few weeks of existence.
2. The link: command does not return all the links that Google is aware of. It appears to be broken - or is it?
When I am involved in problem solving at work, the first thing I like to do is list the facts. Then I see if they start to group naturally. Then I start to ask "why?" There is a lot of good information that can be shared, since it appears many sites were just released. If we keep taking the same approach, we will keep coming up with a dead end.
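To make the composite "spam score" idea concrete, here is a minimal sketch of what such a filter could look like. Everything in it is an assumption for illustration: the signal names, the point values and the threshold are made up, and nothing here is known to reflect what Google actually does. It only shows how several weak signals could be added up into one score that decides which database a site lands in.

```python
from dataclasses import dataclass


@dataclass
class SiteSignals:
    """Hypothetical signals an engine might hold about a site."""
    age_days: int                 # pretending the filter is partly time based
    inbound_links: int            # links the engine is aware of
    targets_popular_phrase: bool  # site competes on a frequently queried term


# Hypothetical threshold: at or above this, the site is kept out
# of the primary index.
SPAM_THRESHOLD = 7


def spam_score(site: SiteSignals) -> int:
    """Add up points for each signal that looks risky (made-up weights)."""
    score = 0
    if site.age_days < 365:
        score += 5
    if site.inbound_links < 10:
        score += 3
    if site.targets_popular_phrase:
        score += 2
    return score


def in_primary_index(site: SiteSignals) -> bool:
    """A site qualifies for the primary index only below the threshold."""
    return spam_score(site) < SPAM_THRESHOLD


new_site = SiteSignals(age_days=60, inbound_links=4, targets_popular_phrase=True)
old_site = SiteSignals(age_days=2000, inbound_links=500, targets_popular_phrase=True)

print(spam_score(new_site), in_primary_index(new_site))
print(spam_score(old_site), in_primary_index(old_site))
```

The point of the sketch is that no single signal trips the filter on its own; it is the combination that does, which is part of why reverse engineering it from the outside would be hard.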
Let's pretend that I am on the right track and there are one or more databases that Google uses to return results. I am going out on a limb here and stating that Google is competent and does not make mistakes, only compromises...