Forum Moderators: Robert Charlton & goodroi

Patent Granted To Google's Panda On SERPs Ranking

         

engine

7:29 pm on Mar 25, 2014 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hat tip Bill Slawski

Invented by Navneet Panda and Vladimir Ofitserov
Assigned to Google
US Patent 8,682,892
Granted March 25, 2014
Filed: September 28, 2012

Patent Granted For Google's Panda On SERPs Ranking [seobythesea.com]

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for ranking search results. One of the methods includes determining, for each of a plurality of groups of resources, a respective count of independent incoming links to resources in the group; determining, for each of the plurality of groups of resources, a respective count of reference queries; determining, for each of the plurality of groups of resources, a respective group-specific modification factor, wherein the group-specific modification factor for each group is based on the count of independent links and the count of reference queries for the group; and associating, with each of the plurality of groups of resources, the respective group-specific modification factor for the group, wherein the respective group-specific modification for the group modifies initial scores generated for resources in the group in response to received search queries. [patft.uspto.gov...]

indyank

5:30 am on Mar 31, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



@Whatson, it is not just about the legitimacy/independence of links (the conditions for being counted as natural links) but rather a new way of looking at them. In simple terms, if the links you have are not supported by adequate user engagement signals and brand signals like reference queries, then the new way of ranking and displaying search results will demote you among the results for the group your page/site falls into.

gouri

6:49 pm on Mar 31, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It looks like there is a link factor to Panda, in addition to on-site content; it is not just Penguin that is looking at links.

whatson

7:06 pm on Mar 31, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ok, so an example might be you have a link to your site with link text "las vegas travel" from a Las vegas themed site.
If visitors reach your site by searching for las vegas travel, and they bounce back, stay only a short time, or have very low pageviews/visitor, then this site is not deemed worthy of this keyword.

Is this closer to what you are thinking?

kevina

9:13 am on Apr 3, 2014 (gmt 0)

10+ Year Member



@planet13 @whatson

The patent describes how "groups" or "plurality of groups" are built. This is a new platform I believe.

Also, the goal of the method is to "improve quality of sites" according to the patent, which is the goal of Panda.

And the method is also related to user experience, as it takes queries into account to make the magic formula.

I understand some people are disappointed to discover that the algo used to evaluate the quality of a site is just a mathematical formula (I was too), but that is the Google way.

7_Driver

11:26 am on Apr 3, 2014 (gmt 0)

10+ Year Member



I think this is definitely Panda (or a large part of it).

At first glance - it looks like something else - because it's different to how Panda was spun to the public by Amit Singhal - we were expecting analysis of on-page content.

But having studied Panda for three years - and read the patent carefully (several times!) - I think this fits very well with what Panda actually did (rather than how it was spun).

For me it explains:

* Why Panda hit "mid-size" sites - leaving both tiny sites and big brands alone.

* Why Panda is so "Brand-Centric"

* Why so few sites hit by Panda have been able to recover

* Why Panda had to be a site-wide penalty rather than page-specific

* Why Panda failed to take out the high-profile content farms everyone thought it was aimed at

* Why older sites tended to be more affected than newer ones

etc etc

It's all in there - and more. Well worth studying if you've been hit by Panda - or want to avoid being hit in the future.

Future

11:51 am on Apr 3, 2014 (gmt 0)

10+ Year Member Top Contributors Of The Month



Perfectly written 7_Driver.

webstuck

5:45 am on Apr 4, 2014 (gmt 0)

10+ Year Member



7_Driver,

You seem to have a good grasp of Panda, if that is what this patent is about. Could you explain how you believe Panda works?

indyank

11:09 am on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



For people who want to understand how it works, take a look at the following diagrams included with the patent.

[techblissonline.com...]
[techblissonline.com...]
[techblissonline.com...]
[techblissonline.com...]
[techblissonline.com...]

The last one makes the flow very clear.

The problem that Navneet Panda appears to have solved is the technical challenge of computing the group modification factor for each group of resources (address-based grouping means grouping resources by domain name or host name) and storing it in the database. This is a very resource-intensive process, as it needs to be done for most sites on the web. The group modification factor can be understood as a per-site value calculated from the total incoming independent links (IL) and total reference queries (RQ) to that site. It is calculated as the ratio of IL to RQ, i.e. IL/RQ.
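
As a rough illustration of the IL/RQ ratio described above, here is a minimal Python sketch. The function name, the dictionary inputs, the example host names and the choice to skip groups with zero reference queries are all assumptions made for illustration; the patent text quoted in this thread does not specify any of them, and how the factor then adjusts a score is not covered here.

```python
def group_modification_factors(independent_links, reference_queries):
    """Return {group: IL / RQ} for every group that has reference queries.

    independent_links: hypothetical mapping of group -> count of independent
        incoming links (IL).
    reference_queries: hypothetical mapping of group -> count of reference
        queries (RQ).
    """
    factors = {}
    for group, il in independent_links.items():
        rq = reference_queries.get(group, 0)
        if rq == 0:
            # Assumption: the ratio is undefined without reference queries,
            # so such groups are simply left out of the result.
            continue
        factors[group] = il / rq
    return factors


# Made-up counts, purely to show the arithmetic.
links = {"bigbrand.example": 1000, "midsite.example": 1000, "noqueries.example": 5}
queries = {"bigbrand.example": 5000, "midsite.example": 50}
factors = group_modification_factors(links, queries)
```

With these invented numbers the two sites have the same link count but very different ratios, which is the contrast between brands and link-heavy sites discussed in this thread.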

Though I, like many others, initially believed that Google had a penalty factor in its Panda algo, this document makes it very clear that there is no penalty element in it. But one has to bear in mind that this might not be the only patent relevant to Panda. There might be more such patents, and reading them all would throw more light on how user engagement is measured and used in arriving at a quality-adjusted ranking score. According to this document, Google's secret sauce for determining quality sites is "reference queries". These are used to determine the weight that the ranking algo needs to assign to the total incoming independent links, and this is done in the form of a group modification factor. Though there might still be many other signals in use, reference queries seem to be an important and significant determinant of the final ranking score. Top brands like Wikipedia, NYT, Amazon, eBay and others must have a significant number of reference queries, and hence Panda appears to have benefited their final ranking score, while those with just links that are not supported by a significant number of reference queries (a brand signal) have lost out.

I have found answers to most, if not all, changes that I noticed in SERPS after Panda went live.

indyank

11:15 am on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Though there isn't any penalty element, the reliance on reference queries makes it more biased towards large brands as compared to topical authority sites. What Panda has really done is promote popularity over authority, due to its reliance on brand signals like the count of reference queries.

Martin Ice Web

11:15 am on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



indyank, what does "reference queries" mean? That the query contains the site or domain name?

indyank

11:18 am on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



IMO yes. I have explained my understanding in the comments section here as rajesh.

[seobythesea.com...]

Martin Ice Web

12:36 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, but this would make Google's algo based on user popularity and not on content. And it would make it vulnerable to brands taking over the SERPs (like they did).
And this would make the algo not depend on content or quality content like MC says.

indyank

12:48 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The way they went about it was to identify sample sites of both high and mediocre quality. The likes of Wikipedia and NYT obviously figured in their "high quality" list, and the data they had on reference queries for sites would have distinguished these two types of sites better.

indyank

12:52 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You must also remember that a group of resources need not always be all the resources sharing a common domain or host name. It can also be a portion of them.


For example, a particular group can include only a portion of the resources that can be accessed using a particular host name or a particular domain name.
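
To make the grouping idea concrete, here is a small Python sketch of address-based grouping. It groups URLs by host name, and optionally by host name plus leading path segments to mimic a group that covers only a portion of a host. The function names and the `path_depth` parameter are hypothetical; the patent text quoted here does not say how a portion of a host would actually be chosen.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_key(url, path_depth=0):
    """Build a grouping key from the host name and, optionally, the first
    `path_depth` path segments (a made-up way to take a portion of a host)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]
    return "/".join([host] + segments[:path_depth])


def group_resources(urls, path_depth=0):
    """Return {group key: [urls]} for the given resources."""
    groups = defaultdict(list)
    for url in urls:
        groups[group_key(url, path_depth)].append(url)
    return dict(groups)


urls = [
    "http://www.example.com/blog/post-1",
    "http://www.example.com/blog/post-2",
    "http://www.example.com/shop/widget",
    "http://other.example.org/",
]
by_host = group_resources(urls)
by_section = group_resources(urls, path_depth=1)
```

With `path_depth=0` all three www.example.com pages fall into one group; with `path_depth=1` the blog and shop sections become separate groups.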

Dymero

1:26 pm on Apr 4, 2014 (gmt 0)

10+ Year Member



So is a reference query something like [{brand retailer} widgetname]?

Also, are the modifications associated with reference queries made every so often or constantly? If it's constantly, this might explain strange ranking changes I've observed that seem to follow sales patterns in my niche.

Martin Ice Web

2:37 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@indyank, this makes perfect sense. Thanks a lot. The question is how big these groups are.
Like:
-cars is a group / good for big guys, bad for smaller
or
-convertibles
-Trucks
-small cars
-big cars

If the groups are broad, it would be very hard for a small niche site to gain any rankings against the big ones. If a group is divided into smaller pieces, you can get better rankings and beat the big guys by making good user content.
So it depends on how Google treats a group.

And it explains why they are after link builders. Their main basis for building the groups is links. Many links to one domain for one niche would make the algo think that this site is the reference site.

7_Driver

2:50 pm on Apr 4, 2014 (gmt 0)

10+ Year Member



Webstuck wrote:

Could you explain how you believe Panda works?


I can't really improve on what indyank has written above - according to this patent the ratio of reference queries to links is the secret.

(You can think of reference queries as Navigational Queries - possibly including other keywords as well, or possibly not).

Of course it's true that there may be other factors in play within Panda (and there certainly are within the ranking algorithm as a whole) - but IL / RQ on its own would explain most of the effects of Panda we've seen.

Martin Ice Web wrote:

And this would make the algo not depend on content or quality content like MC says.


Yes and No. You're right that we've all been sold a dummy - all the talk of "Quality" had led us to think that it was either an on-page quality assessment - possibly based on machine learning, a function of user metrics like bounce rate, time on site etc, or some combination of the two.

But what they seem to have done is taken a set of "high quality sites" (CNN, NYT, Wikipedia etc) and "Low Quality Sites" - and noticed that the ratio of Reference Queries to Links correlates well enough with these two groups, that sharply reducing the visibility of sites with the wrong ratio has improved the quality of search results overall. Of course there's massive collateral damage with a blunt instrument like that.

But there is an argument that by making your site "higher quality" in the eyes of your users - and providing information they can't find elsewhere (not just unique words) - they're more likely to search for you by name in future - hence improving your ratio of navigational queries to links, and lifting you out of the Panda zone.

Of course there are other ways to achieve the same result - including heavy "brand" advertising to raise your brand awareness - but Google wouldn't say that - they'd rather talk about "quality".

I've seen some exact match domains, which are quite poor quality sites - yet have done amazingly well since Panda - so it may be that some queries are getting counted as navigational when they aren't.

Martin Ice Web

3:46 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



providing information they can't find elsewhere (not just unique words)


You hit the nail on the head. I think that is where I misunderstood MC when he says to make compelling sites. It's not the words themselves.


I've seen some exact match domains, which are quite poor quality sites - yet have done amazingly well since Panda - so it may be that some queries are getting counted as navigational when they aren't.


Yes, a keyword in the search string could be misread by the algo as a sign that the user wants to reach the EMD.
In this case the bounce should show the algo that this is not the case (I will patent that, so Google, keep your hands off it).

My URL is not an EMD or PMD, but as I read this it instantly made me think that a part of the URL is a specific term in my niche.
So this must be why I was hit in this specific niche by the EMD algo update.
EMD is not just EMD or PMD; it must be correlated with the "navigational" or "reference" query that the Panda algo has determined for this niche.

Hence, the next question is: does a direct domain search count? I know so many people who don't use the address bar but type the domain name into the search field.

What makes me think about quality is that brands and content scrapers like Amazon get a new boost every day because they are so popular. But is the intent of a search engine to show popular content, or good content that users don't know about?
I think Panda makes Google run after the user instead of showing them the way.
So users are leading the algo, not the algo itself.

EditorialGuy

3:50 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



But there is an argument that by making your site "higher quality" in the eyes of your users - and providing information they can't find elsewhere (not just unique words) - they're more likely to search for you by name in future - hence improving your ratio of navigational queries to links, and lifting you out of the Panda zone.


That argument could work for what might be called "recurring searches" (e.g., medical symptoms, recipes, or camera reviews), but it isn't applicable to "one-off searches" (planning a once-in-a-lifetime climb of Aconcagua or weighing the environmental impact of cremation vs. burial). Still, that kind of thinking might help to explain why--to use a real-life example--I find a 2009 article from Smithsonian Magazine among the top five results for a query about a city transit system. (Smithsonian is a fine magazine, but would you really want to use it when figuring out how to get from point A to point B?)

I've seen some exact match domains, which are quite poor quality sites - yet have done amazingly well since Panda - so it may be that some queries are getting counted as navigational when they aren't.


I see that with location-based EMDs. For some informational queries that I watch, the top 10 results are largely a mixture of big-name megasites and unknown (sometimes tiny) sites with EMDs.

indyank

4:49 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



But let me reiterate one shouldn't be carried away that this patent is the only one used in Panda. There could be a few more related to measurement and use of user engagement metrics etc. in the mix.

I see that with location-based EMDs. For some informational queries that I watch, the top 10 results are largely a mixture of big-name megasites and unknown (sometimes tiny) sites with EMDs.


It is not just EMDs but a few other smaller sites which could have been spared by the initial Panda version. If you look at the flow diagram, the resource modification factor derived from the group modification factor is applied only when the initial score is more than a threshold value. The initial score is derived from relevance and quality (determined by the number of incoming links) measures. The initial score is often high for popular sites and over-optimized sites. So the Panda modification factor is applied only for groups where the initial score exceeds a threshold value. For smaller sites and for newer sites, this initial score may not be that high, and hence the Panda modification factor might not be applied. So they get an unmodified initial score, which might help them rank higher.

But there is no clear explanation of how the threshold values are determined.
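
The threshold step can be sketched roughly as follows. This is only an illustration of the flow as read above: the function name, the example numbers, and in particular the assumption that the factor is applied by multiplication are mine, not something the patent text quoted in this thread confirms.

```python
def adjusted_score(initial_score, modification_factor, threshold):
    """Apply the modification factor only when the initial score is above
    the threshold; otherwise return the initial score unchanged.

    Assumption: "applying" the factor means multiplying the initial score
    by it. The thread does not say how the factor is actually applied.
    """
    if initial_score <= threshold:
        return initial_score
    return initial_score * modification_factor


# Made-up values: a small site below the threshold keeps its score,
# while a high-scoring site above it gets modified.
small_site = adjusted_score(0.4, 0.5, threshold=0.6)
popular_site = adjusted_score(0.8, 0.5, threshold=0.6)
```

In this invented case the small site keeps its initial score because it never reaches the threshold, which is the effect suggested above for smaller and newer sites.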

indyank

5:09 pm on Apr 4, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I will go on to add that the group modification factor was most likely computed/determined initially, in Panda 1.0, for sites with higher visibility for the top 12% of search queries. And I had one of those sites ;)

muzza64

10:17 pm on Apr 5, 2014 (gmt 0)

10+ Year Member



The brand queries aspect certainly explains a lot of the effects of Panda I've experienced, like partial recoveries that follow sustained marketing efforts (e.g. ramping up Adwords campaigns) and subsequent drops when that effort reduced or stopped. It explains why sites that build a big and genuine social media following might avoid Panda issues.

It also explains why some sites reported recoveries after noindexing or removing parts of a site, or moving content to sub domains.

It also explains why two of my sites were not affected by Panda despite being low quality in my mind. One was a PMD, the other an EMD with very few links.

In theory this could be gamed by hiring people to do brand searches for your site, but presumably there is a 'natural' balance with other factors that would enable Google to penalise anyone trying to game this.

muzza64

10:45 pm on Apr 5, 2014 (gmt 0)

10+ Year Member



Another thought. What if Google looks at where visitors go on your site after arriving for a brand search and finds that there are some areas people are not going to. That could be an indication of low quality, especially if links are being acquired for those areas (which you would only expect if those areas of the site were popular or prominent).

I recall that some webmasters had Panda success from improving internal linking, which might be enough to address the problem for some sites.

Planet13

3:11 pm on Apr 6, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"I recall that some webmasters had Panda success from improving internal linking, which might be enough to address the problem for some sites."

Well, improving internal linking is probably a good thing whether a site was hit by Panda or not (I know, Captain Obvious strikes again).

But if it is true that Google doesn't use analytics data (as they have claimed in the past), then it would be tracking user interaction some other way.

Granted, even without GA data, google has access to a lot of resources for user browsing data. And I know they said that Panda was only supposed to affect X number of sites, so maybe there are enough data points to make a determination like you suggested for that limited number of sites / queries that Panda was supposed to target?

Martin Ice Web

9:26 am on Apr 7, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@muzza64,
I totally agree with you.
And that explains why people who don't care about incoming links and don't disavow them gain traffic: their sites still get queries from people who search for them.

So all you have to do is make people search for your site, and you don't have to worry about the quality of your content.
And I thought Google was analysing the content! Bah. It is all fake, because they can't.

Martin Ice Web

9:40 am on Apr 7, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So here is how to game the system:

Instead of a link to our page, have people write down just our domain name and the reference article. People will search for it with Google. Google will know your site to be popular.

aakk9999

10:55 am on Apr 7, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



People will search for it with Google. Google will know your site to be popular.

I am not so sure about this. It would take a great interest from me to search for a company name from the reference article. But if it was a link, I may just click on it with not much extra effort...

Martin Ice Web

11:13 am on Apr 7, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@aakk9999, and this is all that is behind Panda. Great interest in a site. Not links but reference queries.

Errioxa

11:48 am on Apr 7, 2014 (gmt 0)

10+ Year Member



Here’s a quick summary from the patent of what happens in the process it describes:

-Determining, for each of a plurality of groups of resources, a respective count of independent incoming links to resources in the group

-Determining, for each of the plurality of groups of resources, a respective count of reference queries

-Determining, for each of the plurality of groups of resources, a respective group-specific modification factor, wherein the group-specific modification factor for each group is based on the count of independent links and the count of reference queries for the group

-Associating, with each of the plurality of groups of resources, the respective group-specific modification factor for the group, wherein the respective group-specific modification for the group modifies initial scores generated for resources in the group in response to received search queries.


If one site has a total of 1000 backlinks but only 10 contain the query, and another site has only 50 backlinks and 10 contain the query, which is better for the user?

The second site, which focuses more on the query than the first site.

indyank

12:31 pm on Apr 7, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if a site has a total of 1000 backlinks but only 10 contain the query.


What do you mean by only 10 contain the query? Reference queries have nothing to do with anchor text.

On internal links, they aren't considered as independent links for determining the modification factor. But it doesn't mean they aren't useful as long as they are done prudently.