I was thinking the same thing, Brett. As an overall system it seems patentable, but many of the individual points seem like they would be self-evident to any search engine company.
Patents are a sticky business.
Side Note: I always wondered what it would be like to be married to a patent lawyer...if they spoke like they did their work. :-) That would be exhausting.
[edited by: StephenBauer at 8:29 pm (utc) on Mar. 31, 2005]
Maybe the patent is a double sly. Good way to throw everyone off the real scent :)
Filed it 1 day too early.
> but am surprised they feel it is patent worthy.
It seems almost like the patriot act of cyberspace. More than a patent it looks like the new "law" by which we will be judged.
Google is laying down the law.
Other search engines may try to make this document their new doctrine as well. Thus the patent.
> but am surprise they feel it is patent worthy
I was always under the impression that for something to be patent worthy it had to be new, not already publicly available.
I'd say the majority of what is there has been discussed in places such as WebmasterWorld extensively?
They could have summarised it and added it here:
"I'm not surprised by what's in the doc, but am surprised they feel it is patent worthy"
I doubt they'll sue Y! (or win if they did) if Y! started to use it; most of this is common sense. They just put it in writing.
What bothers me:
1. You can buy sitewide links with totally unrelated terms and your competitor is toast. Someone could do it to you. If you read this, links and anchor text are still king; directly or indirectly they make up about 90% of the ranking. They can make or break you or your competitor. The prices will definitely go down now, so it's not expensive to nuke your competitor.
2. If your rankings are bad because of 1, and you somehow manage to remedy it, you're stuck in a cycle because previous rankings matter, and the changes are too drastic.
So to summarize, to rank well you need to:
-Carry on building links forever. Old links decay in value.
-Don't add them too fast, or risk the spam filter.
-Don't add them too slowly, or you won't get enough.
-Get links from fresh pages.
-Ask linking sites to move the link to a different page to make it "fresher".
-Vary anchor text over time.
-Don't change the content of your key pages, so it keeps reflecting incoming anchor text.
-Register your domain for several years.
-Use a solid server for nameservers (whatever that means).
-Add new pages/content to your site all the time (poor bastards with lots of content need to add content faster).
-Get all your buddies to add your site to their favourites.
-Link out to Amazon.
-Put AdSense on your site and make sure it gets good clickthrough.
-Have ranked well in the past, because Google counts your old rank in your current rank.
-Don't jump up and down in the SERPs too much. Google likes stable rankings.
-Make your site sticky. Sticky sites are favoured.
Anything I missed?
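Just to make the list above concrete, here's a toy sketch of how a history-based score combining those signals might look. Every signal name, weight, and threshold here is my own invention for illustration; nothing is taken from the patent text:

```python
from dataclasses import dataclass

@dataclass
class SiteHistory:
    """Hypothetical per-site signals; names and units are invented."""
    new_links_per_month: float      # recent link growth rate
    typical_links_per_month: float  # historical baseline growth rate
    anchor_text_variants: int       # distinct anchor texts seen
    registration_years: int         # years the domain is registered ahead
    avg_visit_seconds: float        # "stickiness"

def toy_history_score(site: SiteHistory) -> float:
    """Toy score: reward steady link growth, varied anchors, long
    registration, and sticky visits; penalize link-growth spikes."""
    score = 0.0
    # Steady link growth is fine; a big spike over baseline looks like spam.
    growth_ratio = site.new_links_per_month / max(site.typical_links_per_month, 1.0)
    score += 1.0 if 0.5 <= growth_ratio <= 2.0 else -1.0
    # Varied anchor text looks natural (capped at 10 variants).
    score += min(site.anchor_text_variants, 10) * 0.1
    # Multi-year registration as a weak trust hint (capped at 5 years).
    score += min(site.registration_years, 5) * 0.2
    # Sticky visitors (capped at 5 minutes average).
    score += min(site.avg_visit_seconds / 60.0, 5.0) * 0.1
    return score
```

With these made-up weights, a site with steady growth and varied anchors outscores one whose link count suddenly explodes - which is exactly the "don't add them too fast" point.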
Is it possible that favoring a site with one advertiser over a site with another advertiser could raise issues regarding Restraint of Trade?
Before reading it I read this thread, and now I seriously doubt whether I should read it at all.
On second-hand impression it sounds just like a long "wish list" of sorts - the kind of thing that really should have nothing to do with the patent system, but would seem appropriate for a long evening at the local pub.
"On second hand impression it sounds just like a long "wish list" of sorts"
They'll probably not implement them all; they just listed them there to warn/scare us, or to "patent" it.
Great find, msgraph! Thank you.
|For example, search engine 125 may determine that a query is likely commercial if the average (median) score of the top results is relatively high and there is a significant amount of change in the top results from month to month. Search engine 125 may also monitor churn as an indication of a commercial query. For commercial queries, the likelihood of spam is higher, so search engine 125 may treat documents associated therewith accordingly. |
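The "churn" idea in that excerpt is easy to sketch. Here's a toy version, assuming churn is measured as turnover among the top-N result URLs from month to month; the 0.4 threshold and all function names are invented for illustration, not from the patent:

```python
def result_churn(month_a: list[str], month_b: list[str]) -> float:
    """Fraction of the top results that changed between two months:
    1 minus the overlap of the two top-N URL sets."""
    top_a, top_b = set(month_a), set(month_b)
    if not top_a and not top_b:
        return 0.0
    overlap = len(top_a & top_b)
    return 1.0 - overlap / max(len(top_a), len(top_b))

def looks_commercial(monthly_top_results: list[list[str]],
                     churn_threshold: float = 0.4) -> bool:
    """Flag a query as likely commercial if the average month-to-month
    churn in its top results exceeds a (made-up) threshold."""
    churns = [result_churn(a, b)
              for a, b in zip(monthly_top_results, monthly_top_results[1:])]
    return bool(churns) and sum(churns) / len(churns) > churn_threshold
```

A query whose top results barely move ("theory of relativity") would come out non-commercial; one whose top spots turn over constantly ("cheap hotels") would be flagged, and per the excerpt its documents would get extra spam scrutiny.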
Thanks Slydog for the summary
Now I'll just wait for Claus's and I'll be set.
Excellent find. And good summary too.
* Good ol' fashioned sites with lots of content
* Backlinks & anchortext still relevant?
* Please do this so we may continue serving "relevant" search results
* Your competition can really screw you.
* How does the blogging phenomenon come into play?
Nice! More information like that can be real useful.
|-add new pages/content to your site all the time (poor bastards with lots of content need to add content faster) |
LMAO - I nearly fell off my chair when I read this too. What shot through my brain was - the more I write, the faster I need to write.
By the way, it appears the domain registration information for WebmasterWorld was tweaked today. Perhaps Brett is taking this stuff seriously too.
I use my own nameservers. Does that mean I am doomed to oblivion?
When is April 1st? Seems a bit early.
> I'd say the majority of what is there has been discussed in places such as WebmasterWorld extensively?
LOL. My first thought was that either members of this forum must have contributed to their brainstorming, or they must have hired pseudo-optimizers and quasi-spammers. Scoring might now be based on both users' and webmasters' behaviours. But it is this last point that is unusual. The detail in which they describe all the different ways in which webmasters can alter their documents is just impressive. This effectively will regulate, if not control, the way we work.
It's interesting. Presumably Google got a bunch of very smart people together in a room with a large whiteboard and said "Come on, let's list ALL the ways we could score/rank websites in our results, no matter how wild" and then filed the patent. Covers them for the next N iterations of search engine innovation if it gets granted, no matter how many of the individual ideas within it they implement. Smart move.
> I'd say the majority of what is there has been discussed in places such as WebmasterWorld extensively?
Yeah, I reckon Matt Cutts got more from the SEO's at pubcons than they did from him.
I think the people on this board could have easily come up with a mish-mash of techniques far more substantive than these. It's an attempt to try to box in MSN and Yahoo by patenting everything imaginable, I'd guess, but not a very good one.
I do think though that Google is weighting some "signals of quality" (or maybe better put.. far-off page factors) heavily right now.
Perhaps Google will soon involve tarot cards and tea leaves in their algos as well.
But Brett, what other way is there to legitimately and innocently get it publicly published? This should do very nicely to confuse and scare the pants off even more people than ever.
|I'm not surprised by what's in the doc, but am surprised they feel it is patent worthy. |
If they are using a lot of these tactics, it would be sad. I see many of these "tactics" as stereotypes of websites, and not the best way to determine relevancy.
I'm certain that there are a lot of sites that don't renew 10 years in advance. I'm sure there are a lot of great sites that don't get bookmarked. I'm sure there are a lot of information sites that don't add new pages every day, nor change the current information they have up.
I'd imagine that this would open the door for toolbar spamming, much like what has happened with Alexa. We'll be seeing people run bots to go from site to site across their own network all day. We'll be seeing competitors run ROS links to competitors on 40,000 page sites.
You'd think that over time, Google would have developed complex ways of determining sites' relevancy to specific keywords. However, they are using amateur techniques that can be manipulated more easily than their current linking structure allows. This certainly opens the door once again, which is fine by me.
I suppose when people continue to ask "What happened to Google's results?", we now have a more legitimate answer.
Great find msgraph!
I think the following defines the sandbox:
| Search engine 125 may take measures to prevent spam attempts by, for example, employing hysteresis to allow a rank to grow at a certain rate. In another implementation, the rank for a given document may be allowed a certain maximum threshold of growth over a predefined window of time. |
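That excerpt describes capping rank growth per time window, which is simple to sketch. This is only my reading of the quoted paragraph; the 25% figure and the function name are invented, and the real mechanism is not disclosed:

```python
def capped_rank(previous_rank: float, raw_rank: float,
                max_growth_per_window: float = 0.25) -> float:
    """Let a document's rank rise by at most a fixed fraction per update
    window; drops pass through unchanged. The 25% cap is made up."""
    ceiling = previous_rank * (1.0 + max_growth_per_window)
    return min(raw_rank, ceiling)

def windows_to_reach(start: float, target: float,
                     max_growth_per_window: float = 0.25) -> int:
    """How many update windows a capped rank needs to catch up to a
    much higher raw rank - i.e., how long the 'sandbox' would last."""
    rank, windows = start, 0
    while rank < target:
        rank = capped_rank(rank, target, max_growth_per_window)
        windows += 1
    return windows
```

Under this reading, a brand-new page whose raw score suddenly deserves the top spot still takes many windows to get there, which matches the sandbox behaviour people report.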
|If they are using a lot of these tactics, it would be sad. I see many of these "tactics" as stereotypes of websites, and not the best way to determine relevancy. |
Well, it's profiling. In the same way that an insurance company will say that you're a high risk of crashing a car because of your age and sex, even if in reality you're the most careful driver in the world.
The thing with profiling web sites is that a lot of innocent websites will get hit in the process; however, in the grand scheme of things it will greatly reduce spam.
How it will affect relevancy, I'm not sure; I've never operated a large-scale hypertext search engine!
I like the idea of credit for number of bookmarks. I have a sizable % coming in on bookmarks. Bookmark visitors stay longer as well.
I don't want to be punished for being evergreen though. I'm sure my rate of new links is much slower now.
I suppose we will all have our favorites and most dreaded possibilities from that list. Who knows how it will really be implemented though.
One interesting thing is that they seem to be looking more at the whole site and less at individual pages.
From where I sit:
Filing a patent requires disclosure, so it is only done after weighing the risk and benefit of giving up your secrets. If it's not a strong patent, as Brett suggests, where is the benefit in the public disclosure?
A patent is not a plan. No one said they were doing what is in the patent application. I've seen IP produce patents that the tech people laughed about, because they were unimplementable or common knowledge or in conflict with other activities.
Are the hard core search people represented? Scientists want their names on quality work, and not on sub-standard work. Who is on this?
I listened to Cutts say webmasters should host at data centers close to their geographic locations, to assist local search. Using length of registration term as a trust factor is just as naive. Sometimes these ideas are so ridiculous you have to wonder if people like Matt are technically competent, floating balloons to gauge response, or deliberately trying to influence opinion to make life easier for G at the expense of good sense.
Game on for sure, with a renewed enthusiasm. GoogleTrust is at an all time low.
My site is still pagerank 0 after about 6 months "live". Quite a few Yahoo links, and some MSN links, but few G links.
My WHOIS info is obscured by domains by proxy courtesy of Godaddy.
Could this be part of the reason why?
Anyone with proxied Whois info with Pagerank >0?
If not, I'm taking myself off the proxy!
Didn't Google become a registrar so they could look at the whois info anyway? Or would it still be hidden?
"I listened to Cutts say webmasters should host at data centers close to their geographic locations, to assist local search."
Is it possible to see restraint of trade applications in that statement? I'm just asking.
Seems that being required to favor one commercial enterprise over another in order to receive some benefit from a third party might be seen as restraint of trade. This might also apply to the idea (in the patent being discussed here) that G could consider a site using Advertiser A more relevant that a site with ads from advertisers B.
Again, I really don't know, just asking.
I see too much confusion here. I'd guess that most of the people who are worried really didn't take a look at the (26 pages printed) document.
I would not even dare to summarize the patent in a set of 7-10 statements, because it would involve too much generalization.
|16. The method of claim 15, wherein the scoring the document includes assigning a higher score to the document when the document is selected more often than other documents in the set of search results over a time period. |
How can this be done?
Is Google going to implement 302 exit-link counters?
Or are they tracking the clicks on SERPs already?
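Whatever the collection mechanism, the scoring side of claim 16 is straightforward. A toy sketch, assuming Google simply compares each result's share of clicks over the window (the function name and the +share boost formula are my own, not from the claim):

```python
def ctr_boost(clicks: dict[str, int]) -> dict[str, float]:
    """Give each result a score multiplier proportional to its share of
    clicks over a time window: documents selected more often than their
    peers get a higher multiplier. Purely illustrative."""
    total = sum(clicks.values())
    if total == 0:
        # No click data in the window: leave all scores unchanged.
        return {doc: 1.0 for doc in clicks}
    return {doc: 1.0 + count / total for doc, count in clicks.items()}
```

The hard part, as the questions above suggest, is gathering `clicks` at all - via redirect counters on the SERPs, the toolbar, or something else.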
|(3) the extent to which the advertisements generate user traffic to the documents to which they relate (e.g., their click-through rate). Search engine 125 may use these time-varying characteristics relating to advertising traffic to score the document. |
I heard a few posts here relating this to AdSense CTR, but...
It seems they are referring to rating a page based on the amount of traffic it sends to its (affiliate) advertisers' links. That makes sense on some levels (spamming high-traffic keywords with off-topic aff ads) but not at all on others (forums and such will have low CTR no matter how relevant the aff ads are, because people don't want to leave via ads).
|Excellent point. Another hole in the "can't be hurt by competitors" claim. |
Lot of leaks to plug in that one...
Pretty comprehensive there, SOD ;-)
|On second hand impression it sounds just like a long "wish list" of sorts. |
Yea, looks like the "you're dreaming" lists of button pushing tools I give to my programmers.
[edited by: nuevojefe at 12:11 am (utc) on April 1, 2005]