Forum Moderators: Robert Charlton & goodroi
40. A method, comprising: aggregating information regarding documents that have been removed by a group of users; and assigning scores to a set of documents based on the aggregated information.
41. The method of claim 40, wherein aggregating information regarding documents that have been removed by a group of users includes: identifying a set of legitimate users and a set of illegitimate users; and collecting information regarding documents that have been removed by the set of legitimate users.
42. The method of claim 40, wherein aggregating information regarding documents that have been removed by a group of users includes: identifying a set of users with a defined relationship; and collecting information regarding documents that have been removed by the set of users.
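A rough sketch of what claims 40 and 41 describe, in Python (the function names, data shapes, and the penalty weight are my own invention; the application doesn't give a formula):

```python
def aggregate_removals(removal_events, legitimate_users):
    """Count how many distinct legitimate users removed each document.

    removal_events: iterable of (user_id, doc_id) pairs.
    legitimate_users: set of user ids judged trustworthy (claim 41).
    """
    removed_by = {}
    for user, doc in removal_events:
        if user in legitimate_users:
            removed_by.setdefault(doc, set()).add(user)
    return {doc: len(users) for doc, users in removed_by.items()}

def remove_list_score(doc, removal_counts, penalty=0.1):
    # More removals by legitimate users -> lower score (floored at 0).
    # The linear penalty is purely illustrative.
    return max(0.0, 1.0 - penalty * removal_counts.get(doc, 0))
```

Note how removals by users outside the legitimate set are simply ignored, which is the whole point of claim 41.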
Removing documents [appft1.uspto.gov]
Abstract...The system may aggregate information regarding documents that have been removed by a group of users and assign scores to a set of documents based on the aggregated information.
[0103] The IR score, link-based score, and remove list score may be combined in some manner to generate a total score that is assigned to a document. The assigned scores may be used to rank the documents (block 1830).
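A minimal sketch of the combination step in [0103], assuming a simple weighted sum (the weights and function names are invented; the application only says the scores "may be combined in some manner"):

```python
def total_score(ir, link, remove_list, w=(0.5, 0.3, 0.2)):
    """Weighted combination of the three component scores.
    The weights are illustrative, not from the application."""
    return w[0] * ir + w[1] * link + w[2] * remove_list

def rank(docs):
    """docs: dict of doc_id -> (ir, link, remove_list) scores.
    Returns doc ids ordered best-first, as in block 1830."""
    return sorted(docs, key=lambda d: total_score(*docs[d]), reverse=True)
```

With identical IR and link scores, a low remove-list score alone is enough to push a document down the ranking.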
The section (Improving Search Results) where you pulled out that 2nd quote is a good read.
Interesting. Suppose I hire a group from another country, get them a proxy server or servers so the IP is random, and begin hitting my competition with delete actions.
I wonder how many it would take to hurt or diminish their standings...
The most legitimate, authenticated group of users I can think of are Adsense publishers who put fraudulent, replicated and garbage sites on their filtered list.
Why not AdWords "content network" advertisers who put fraudulent, replicated, and garbage sites on their filtered lists?
if they can nail down legitimate users
I imagine Google would have little trouble in filtering out illegitimate users. Think about all the information they'll have about a Google account – search history, click history, browsing history (through analytics), just to name a few variables in their arsenal. They'll be able to tell exactly how legitimate a user is and weight their effect on the search results accordingly. If this were a simple “one user, one vote” system, they wouldn't be applying for a patent.
Faking a natural-looking Google account using raw CPU power would probably be about as difficult as getting your computer to generate a creative writing essay for you. The way I see it, it would require man-hours to create a set of legitimate-looking accounts and use them to influence the search results. Man-hours that would probably be better spent doing real SEO.
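The weighting described above could look something like this (a toy heuristic; the signal names and the linear weighting are my assumptions, not anything from the patent):

```python
def legitimacy_weight(account):
    """Toy heuristic: the more independent signals of normal activity an
    account shows, the more its removals count. Signal names are invented."""
    signals = ("search_history", "click_history", "browsing_history")
    present = sum(1 for s in signals if account.get(s))
    return present / len(signals)  # 0.0 (no history) .. 1.0 (full history)

def weighted_removal_votes(removals, accounts):
    """removals: list of (user_id, doc_id) pairs.
    Returns doc_id -> sum of legitimacy-weighted removal votes."""
    votes = {}
    for user, doc in removals:
        votes[doc] = votes.get(doc, 0.0) + legitimacy_weight(accounts.get(user, {}))
    return votes
```

A freshly minted account with no history contributes nothing, which is exactly why "one user, one vote" gaming wouldn't work here.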
True, as verification of actual users, not based on what filters they use for their own business. Site filters applied for monetary reasons have nothing to do with search results. Sure, they can help weed out crap sites, but many sites play the AdSense ad-space-cramming game yet have good content.
Interesting. Suppose I hire a group from another country, get them a proxy server or servers so the IP is random, and begin hitting my competition with delete actions.
First, "a" proxy server obviously won't make the IP addresses look random at all. Second, even if you got 100 proxy servers (how much money ya got to spend on this?), you then have to make the activity look "normal". How many deletes per week does a "normal" user perform? What are the statistical norms for other activities Google can detect (searches, Google account activities, Google toolbar activity, etc.)?
When there's no penalty for collateral damage, Google can afford to do auto-detection that's pretty good at eliminating the bad guys. Just ask anybody who got auto-banned from AdSense because one of their students went to the computer lab and clicked on their ads for an hour every day.
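The kind of auto-detection described here could be as simple as flagging statistical outliers – for instance, a z-score test on weekly delete counts (the cutoff and data shapes are invented for illustration):

```python
import statistics

def flag_outliers(deletes_per_week, z_cutoff=3.0):
    """Flag users whose weekly delete count is far above the population norm.
    deletes_per_week: dict of user_id -> count. Threshold is illustrative."""
    counts = list(deletes_per_week.values())
    mean = statistics.mean(counts)
    sd = statistics.pstdev(counts)
    if sd == 0:
        return set()  # everyone behaves identically; nothing to flag
    return {u for u, c in deletes_per_week.items()
            if (c - mean) / sd > z_cutoff}
```

One account hammering the delete button stands out immediately against a population of users who each remove a result or two per week – and with no penalty for collateral damage, a crude test like this is good enough.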
As someone who used to work at a search engine, I see these patents as advancements. Yes, this will make our lives as SEOs a little more difficult, but adaptation is the key that has driven this business for years.
Most of the old SEOs who realized they couldn't cheat as easily have dropped out. The one thing I have learned is that if you have to use "special tactics" to drive rankings, it will most likely backfire at some point or another.
As marketers in general, we know that if you can build a solid, compelling offering, your business will thrive. SEO seems to be moving in this direction every year. This theme actually excites me rather than scares me.
I think personal search will be a great function and will help move us forward.
I am sure people will try to game this system to take a short cut, but as we know Google and other engines are very good at detecting and eliminating this over time.
I am going to think about this more and blog about it. Best wishes.
I think that this will be based upon login information as well. It is very interesting that Google was requiring a referral to join Gmail. Now they are asking for an active mobile number. I am sure that their central user ID system will have a direct impact on votes. I would also imagine that it may take hundreds of extremely unique searches and votes to validate a site or not.
doing real SEO.
I think that SEO in general could be on its way out in Google, to a degree. For example, I'd guess they will be looking at the way users interact with the website they are on: how long they spend on a site, whether they buy, and so on. The point I am getting at is to make your website as good as possible; if users like it, based on its history, then you'll rank better or at least get a good "quality score" ;)
RJ
Wow this is going to be interesting to see in action.
We know they've been collecting the data to do this for at least two years. We've discussed it here before. The "removal" is users clicking on the "remove this from the search results" link in google searches.
The patent was filed in August '05.
Google will not have waited until the patent got granted (or not) before actually using it in the field.
I wouldn't expect any wholesale changes that we haven't already seen.
So the actions of individual users of personal search - taken in aggregate - may affect all search users? Is that how we can read this?
I believe that's about the size of it.
I'll see if I can dig up the old threads.
Just for absolute clarity, this is not a new patent application. This is a patent application from August 2005. The only thing that has changed is that it just got granted by the patent office.
TJ
"New! Google finds the search results most relevant to you, based on your search history. Learn more."
It links to this page: [google.com...]
It's probably been around for ages and I just haven't noticed it. However, it's interesting that they've decided to promote it now.
I remember people discussing this issue of excluding results when they promoted the custom search as "build your own niche search engine". People knew that this was about collecting data ever since last year, and Google knew that we knew so they are probably very cautious using anything they gathered.
doing real SEO.
I think that SEO in general could be on its way out in Google, to a degree.
I disagree. I think it will just be that the definition of SEO will change.
It will include (well, it already does)
1. helping your clients create link bait to develop "natural" links (whatever they are),
2. reviewing content not only for keyword placement, density and semantics, but also stickiness i.e. how well a site responds to a search query,
3. helping your clients get the balance right between monetising your site traffic immediately and developing a longer term relationship with the user,
4. writing titles and manipulating the Google snippet so that you not only get higher rankings but also a higher CTR (which in turn will get you higher rankings).
I could go on, but I think that makes my point. For some time, good SEO has not been simply a question of tagging, link building and doing metrics on page content. Increasingly it incorporates more creative and subjective elements. SEOs are becoming real professionals.
Doing it properly means having a certain security that, whatever algo changes are afoot, your clients are going to come off better.
no more open to gaming than link voting
Which is INCREDIBLY open to gaming :)
High volume crappy links still work in Google in competitive markets (debt, hosting).
Google is not as clever (yet) as lots of people give it credit for. Google isn't intuitive - it's not AI. It has to use simple 'yes/no' rules (however many, however inter-related) to make its decisions. If they are going to try to look for pointers of human / natural activity these have to be broken down into simple patterns that can be identified. These could be re-created. Google have to know that.
In an internet full of keyword-dense pages, their algorithm based on links was perfect - until people figured out how it worked. IF they implement a system like this, it will be cracked and the information put out there in time. Then it will be open to abuse.
they will be looking at the way the user interact with the website they are on, how long they spend on a site and if they buy and so on
Say I want train time info; I do a search, get my info in 5 seconds and I'm gone. I want to order a pizza and do a local search; I get the phone number and I'm done. I've just spent 1 hour reading user reviews of the best, cheapest MP3 players and been recommended to use 'mp3s-r-us.com' and buy a particular model. I go to the site and buy the model I want straight away. Are these useful sites? Yes.
Or how about I click away in less than 30 seconds because the page loads too slow, or it's MFA or a sex site, or just not quite what I wanted? Are these useful sites? No.
Suppose I spend 30 minutes looking for information all over a site and click deep into it before giving up? Is that site useful?
Should I even be trusted? Suppose I'm an idiot and can't find the useful information right in front of me?
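The ambiguity in these examples is easy to make concrete. A toy dwell-time rule (the threshold is invented for illustration) gets both cases wrong:

```python
def naive_usefulness(dwell_seconds, threshold=30):
    """Naive rule: longer dwell time = more useful page.
    The threshold is invented; the examples above show why the rule fails."""
    return dwell_seconds >= threshold

# Train-times lookup: answered in 5 seconds, yet labelled "not useful".
quick_answer = naive_usefulness(5)       # False, but the page did its job
# 30 minutes lost clicking around a confusing site, labelled "useful".
long_struggle = naive_usefulness(1800)   # True, but the visit was a failure
```

Any signal built on dwell time alone inverts the truth for exactly the two kinds of visit described above.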
To try to understand human interaction with a website as a gauge of 'usefulness' is just so far beyond Google (or anyone) at the moment as to be laughable.
To try to understand human interaction with a website as a gauge of 'usefulness' is just so far beyond Google (or anyone) at the moment as to be laughable.
Don't forget that their original way of classifying popularity by comparing the "votes" of other content providers was very successful, although a bit naive to some. I don't think that broadening the model by including users/consumers in the voting process will be a wrong move.
Actually, I am pretty sure we have been witnessing it already for some time.