
Google SEO News and Discussion Forum

New Google Patent Details Many Google Techniques
msgraph

Msg#: 28814 posted 3:47 pm on Mar 31, 2005 (gmt 0)

Probably one of the best bits of information released by them in a patent.

A large number of inventors are listed on it, including Matt Cutts, that guy who attends all those SE conferences. It explains a bit about what is already known through experience, as well as through comments made by search engine representatives.

Example:


[0039] Consider the example of a document with an inception date of yesterday that is referenced by 10 back links. This document may be scored higher by search engine 125 than a document with an inception date of 10 years ago that is referenced by 100 back links because the rate of link growth for the former is relatively higher than the latter. While a spiky rate of growth in the number of back links may be a factor used by search engine 125 to score documents, it may also signal an attempt to spam search engine 125. Accordingly, in this situation, search engine 125 may actually lower the score of a document(s) to reduce the effect of spamming.

USPTO version [appft1.uspto.gov]

< Note: the USPTO has at times either moved or removed this
patent. If that happens again, here's an online back-up copy:
Information retrieval based on historical data [webmasterwoman.com]>
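
To make [0039] concrete: the idea is to reward a high rate of back-link growth relative to document age, but to treat a rate spiky enough to look like link spam as a reason to lower the score instead. Here is a toy Python sketch of that logic; the threshold and the dampening factor are invented for illustration, since the patent gives no numbers.

from dataclasses import dataclass

@dataclass
class Doc:
    age_days: int   # days since the document's inception date
    backlinks: int  # back links currently pointing at the document

def link_growth_score(doc, spike_threshold=50.0):
    # Links acquired per day since inception; guard against dividing
    # by zero for documents first seen today.
    rate = doc.backlinks / max(doc.age_days, 1)
    if rate > spike_threshold:
        # A spiky growth rate may signal an attempt to spam, so the
        # score is lowered rather than raised (the 0.1 factor is made up).
        return rate * 0.1
    return rate

new_doc = Doc(age_days=1, backlinks=10)      # inception yesterday, 10 links
old_doc = Doc(age_days=3650, backlinks=100)  # ten years old, 100 links

# Matches the example in [0039]: the newer document scores higher because
# its rate of link growth is relatively higher.
assert link_growth_score(new_doc) > link_growth_score(old_doc)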


 

Charlie

Msg#: 28814 posted 12:32 am on Apr 12, 2005 (gmt 0)

So do AdWords clicks count toward the CTR?

Vadim

Msg#: 28814 posted 7:11 am on Apr 12, 2005 (gmt 0)

this is more a process of "training" than "tuning", which is why there's no particular problem of instability that results from throwing in a few dozen more variables.
(ronburk)

I don't think so. I believe that any irrelevant parameter increases instability. We all actually observe such instabilities, most probably because of irrelevant variables.

Here is a simple example of how it may happen. Let us imagine that someone decides that the word "hat" might be important for fashion sites. They add the word "hat" as a variable and train the robot, giving it good and bad sites as examples. If this happens just as hats are coming into fashion, there will be a high correlation between the good sites that noticed this trend and the word "hat". The word "hat" will get a high weight from the training.

Now what happens when hats go out of fashion? Probably the really good sites that notice this new trend first will lose their high index positions, because they stop using the word "hat".

So irrelevant variables increase instability and require manual corrections in data mining as well.
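
To make that concrete, here is a toy sketch of such a training run; the scheme and the numbers are invented, and only illustrate how a transiently correlated word can destabilize rankings.

# Toy "training": a feature's weight is the average label of the
# example documents containing it. Purely illustrative, not Google's method.
def train_weights(examples):
    totals, counts = {}, {}
    for features, label in examples:
        for f in features:
            totals[f] = totals.get(f, 0) + label
            counts[f] = counts.get(f, 0) + 1
    return {f: totals[f] / counts[f] for f in totals}

# During the hat craze, the good sites (label 1) all mention "hat".
training = [({"hat"}, 1), ({"hat", "fashion"}, 1), ({"fashion"}, 0)]
weights = train_weights(training)  # "hat" -> 1.0, "fashion" -> 0.5

def score(features):
    return sum(weights.get(f, 0.0) for f in features)

# After the craze: the genuinely good site dropped "hat" first, while a
# stale site kept it. The stale site now outranks the good one.
print(score({"fashion"}))  # 0.5 -- good site, no longer says "hat"
print(score({"hat"}))      # 1.0 -- stale site still stuffed with "hat"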

Bottom line: if a parameter has little relation to good content, like, for example, the length of the domain registration period, do not bother about it.

Even if Google uses this parameter, it will stop doing so quickly.

If it is irrelevant, it is irrelevant.

Vadim.

ronburk

Msg#: 28814 posted 2:40 am on Apr 13, 2005 (gmt 0)

Bottom line: if a parameter has little relation to good content, like, for example, the length of the domain registration period, do not bother about it.

The success of data mining is the success of brute-force formula construction doing a better job of discovering what is reliably relevant than humans can.

Even if Google uses this parameter, it will stop doing so quickly.

Of course! By using data mining instead of manual human tweaking, Google can automatically abandon or de-emphasize a variable quite quickly -- and do the exact opposite if conditions change again.
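
A hedged sketch of what that automatic de-emphasis could look like; the update rule and decay rate here are my invention, not anything Google has published.

# Each retraining cycle pulls a feature's stored weight toward the
# correlation the latest data actually supports.
def retrain(weight, measured_correlation, rate=0.5):
    return weight + rate * (measured_correlation - weight)

w = 1.0  # the weight "hat" earned during the craze
for corr in [0.9, 0.4, 0.0, 0.0]:  # correlation fading as fashion moves on
    w = retrain(w, corr)
    print(round(w, 2))  # roughly 0.95, 0.68, 0.34, 0.17 -- emphasis fades on its own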

If it is irrelevant, it is irrelevant.

Again, the reason data mining is used more and more rather than less and less is that data mining can often do a better job than humans of deciding what is relevant, whether the task is interpreting astronomical images, deciding which supermarket customers should be mailed coupons for which products, or ranking search results.

When the data relationships to be mined are quite complex, you can pretty reliably change that "often" to "always".

Vadim

Msg#: 28814 posted 1:42 am on Apr 14, 2005 (gmt 0)

data mining can often do a better job of deciding what is relevant than humans can.

It sounds like "Your Big Brother Data Mining Robot knows better than you what you need". For this to be true, the data mining robot would need the life experience and knowledge of a particular human, or at least of a human expert in a particular subject. At the present state of the art that is simply impossible, because it amounts to solving the problem of artificial intelligence.

The keyword here is life experience, i.e. *history*.

So it seems that the Google patent is about history and its importance in search results.

This is the right direction, but at the present state of the art it seems the only way to achieve stability is to feed the data mining robot highly relevant criteria from the start, i.e. to ask the human experts first.

The present Google instabilities seem to show that they do not spend enough effort selecting, or rather composing, really relevant criteria before entering them into their data mining robot.

Vadim.

mann

Msg#: 28814 posted 4:40 pm on Apr 16, 2005 (gmt 0)

>>62. The method of claim 61, wherein the indication of link churn is computed as a function of an extent to which one or more links provided by the linking document change over time.

>>63. The method of claim 62, wherein adjusting the ranking includes penalizing the ranking if the link churn is above a threshold.

Can anyone help me understand what the threshold is? Or can you be more specific about what Google means by a threshold?

claus

Msg#: 28814 posted 11:42 pm on Apr 16, 2005 (gmt 0)

I personally think that it means that the Google engineers do some calculations about the average number of links that change during some specified amount of time -- either in general, or for some subset of sites, or both -- and then they measure if your site is above or below that average rate of change for links.

I can't see from the quote if it is links pointing in, or links pointing out, or both.
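
If you wanted to model that guess, a rough sketch might look like this: measure each site's fraction of changed links between two crawl snapshots, average across all sites, and flag anything far above the average. The 2x margin and the snapshot format are pure invention on my part.

# Fraction of links that changed (added or dropped) between two snapshots.
def churn(old_links, new_links):
    changed = len(old_links ^ new_links)  # symmetric difference
    return changed / max(len(old_links | new_links), 1)

# Hypothetical crawl data: (links at time 1, links at time 2) per site.
snapshots = {
    "site-a": ({"x", "y", "z"}, {"x", "y", "z"}),  # stable: 0% churn
    "site-b": ({"x", "y"}, {"x", "y", "w"}),       # modest growth: ~33%
    "site-c": ({"p", "q", "r"}, {"s", "t", "u"}),  # total turnover: 100%
}

rates = {site: churn(old, new) for site, (old, new) in snapshots.items()}
average = sum(rates.values()) / len(rates)  # ~0.44 for this sample

# Flag sites whose rate of change sits well above the average.
flagged = [site for site, r in rates.items() if r > 2 * average]
print(flagged)  # ['site-c']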

neuron

Msg#: 28814 posted 7:53 am on Apr 17, 2005 (gmt 0)

57. The method of claim 54, wherein the linkage data includes a rank based, at least in part, on links and anchor text provided by one or more linking documents and related to the linked document.

Apparently, 62 and 63 are talking about one of two types of documents, a "linking document" and a "linked document", but it did take a bit of searching to clarify that issue.

>>62. The method of claim 61, wherein the indication of link churn is computed as a function of an extent to which one or more links provided by the linking document change over time.

The indication here is the change over time of the links provided by the linking document itself, but "indication" could mean something such as a "numerator" or a "single factor or function".

>>63. The method of claim 62, wherein adjusting the ranking includes penalizing the ranking if the link churn is above a threshold.

The indication of 62 is then compared to the "threshold" of 63, but I do not see where that threshold comes from. It could be just this site's own link churn history; it could be some semantic/topological/industry-specific threshold; it could be an overall threshold across all indexed documents; or it could even be one based on traffic, PR, or other factors. That makes the whole thing a mixed cauldron of complexity. Nonetheless, with its array of definitions, derivations, and indications of the direction of ranking adjustments, the patent is somewhat decipherable. We could list all these factors, along with their positive or negative effect on rankings, and end up with a handy checklist for scoring our own pages, at least in a quasi-consistent manner relative to this patent application in particular.

I have believed for a long time that updating content on a site will give it a boost. This document contains a specific reference to that in >>60. A full analysis of this document is in order.
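
As a very first pass at that checklist, here is a sketch. The signal names paraphrase claims discussed in this thread; the +1/-1 weights are placeholders of mine, and the patent gives directions, not magnitudes.

# Patent signals with a crude direction of effect on rankings.
SIGNALS = {
    "content_updated_recently":   +1,  # >>60, per the discussion above
    "steady_backlink_growth":     +1,  # [0039]: a rising link rate can score higher
    "spiky_backlink_growth":      -1,  # [0039]: spikes can read as link spam
    "link_churn_above_threshold": -1,  # >>62-63: churn penalty
    "relevant_anchor_text":       +1,  # >>57: anchor text of linking documents
}

def checklist_score(page_signals):
    # Sum the direction of every signal the page exhibits.
    return sum(SIGNALS[s] for s, present in page_signals.items() if present)

page = {"content_updated_recently": True, "spiky_backlink_growth": True}
print(checklist_score(page))  # 0 -- the update boost offsets the spike penalty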

panicbutton

Msg#: 28814 posted 11:07 pm on Apr 17, 2005 (gmt 0)

"...the dreaded alternative of simply incrementally building a website with good content over time..."

Hahaha, you hit the nail on the head!

Someone mentioned they thought Google went black-box after the Florida update. I absolutely agree with this point, although I think it happened earlier than that.

It's weird that something (Google) can be valued at a zillion dollars when *nobody* knows how it works.

mann

Msg#: 28814 posted 6:11 am on Apr 19, 2005 (gmt 0)


Thank you, Claus.

But I am wondering now: what linking strategy should I implement for good PR and rankings?
