Popularity as algo factor?

Stop working so hard Google.......


markdidj

5:31 am on May 8, 2003 (gmt 0)

10+ Year Member



I still think some kind of popularity-based algo should be included along with the spam-search algo. If they started work on it now, they'd be finalising it in a year, and then they could let the index work itself. And ease up on the banning: if they included popularity, spammy sites would quickly work their way to the bottom.

chiyo

6:10 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Huh? Don't they have a popularity index already (PR)? Though I accept that link farms, bought links, and most recip-linking devalue it, it has been relatively resilient compared to earlier forms of relevance testing such as on-page text analysis and meta-tag spidering.

If you are referring to indexing how many people visit certain sites compared to others, how can this be achieved? Direct Hit failed; it never got beyond doing this reliably for a small number of one-term queries. Alexa's attempt uses a highly skewed sample, and even Alexa's supporters acknowledge that below, say, the top 1,000 or 10,000 sites things get very smudgy.

Use the Google toolbar? Another highly skewed sample.

Use "karma"-type ratings like happy and sad faces and all the rest? A recipe for spam city.

And even then you are restricted mainly to a site basis rather than a page basis.

And with millions of sites, how can Google check the popularity of each one, especially as ranking niche, specialist sites is Google's speciality? First Google would have to theme much better than it can now, so you are not comparing apples to oranges.

Finally, it would involve the "tyranny of the popular", a bugbear of modern life where the most useful information in the world is hidden from public view because savvy marketers can make a transient lowest-common-denominator talent like Brittany Spheres more popular than enduring talent.

It would be a step backward in my opinion, as until now Google has managed something close to the impossible: cutting through a lot of the marketing hype to real, useful information that may not, in fact, be popular.

vitaplease

6:22 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



..some kind of popularity based algo should be included

I think it makes sense for more frequent spidering.

A site getting a disproportionately high number of visitors may be worth respidering and reindexing more often.

In a way Pagerank has a bit of popularity in it:

Indeed, one way of viewing Pagerank is that it puts a number on how easy (or difficult) it is to find particular pages by a browsing-like activity.

from: www.almaden.ibm.com/cs/k53/www9.final/
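That "browsing-like activity" view can be sketched as a power iteration over a link graph. The three-page graph, damping value, and iteration count below are all made up for illustration:

```python
# Minimal PageRank power iteration on a toy link graph.
# Page names and link structure are invented for this example.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}

damping = 0.85  # probability the "random surfer" follows a link
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):  # iterate until the ranks settle
    new = {p: (1 - damping) / len(pages) for p in pages}
    for page, outlinks in links.items():
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new[target] += share
    rank = new

# Pages that are easy to reach by browsing end up with higher rank;
# here "c" (linked from both "a" and "b") beats "b" (linked only from "a").
```

The ranks always sum to 1, which is what makes the "probability of landing on a page while browsing" reading work.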

markdidj

6:43 am on May 8, 2003 (gmt 0)

10+ Year Member



Chiyo,
That's OK for text-based websites, but what about interactive websites that teach through other mediums, visual and audio? These have a lot less textual information, but are just as informative, sometimes more so.
A text-based website will not teach drums as well as an interactive site, but will get indexed higher.

What about those like myself who wish to deliver content with JavaScript rather than PHP or other server-side programs?

The web has become fast enough to deal with mediums other than just text, yet Google relies 100% on text for its indexing.

ncsuk

6:47 am on May 8, 2003 (gmt 0)

10+ Year Member



They should do PR on a scale of 1-100, not 1-10, because then there is more scope in the results.

digitalghost

6:49 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



PR= 7.6374635262 Hmm... ;)

ncsuk

6:49 am on May 8, 2003 (gmt 0)

10+ Year Member



3.14159265358979323846 (anyone know what that is?)

BigDave

6:54 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I still do not see how popularity on one search will apply to another search, even if the keywords are closely related.

Lots of people search on widgets, and they go to the widgetcity site. It is hugely popular. It happens to have a page on widget history, but it's pretty weak. None of those people searching on "widget" go to the Widget Museum site, because they just want to buy them, they don't care that Ole Olson created the original widget out of straw and lutefisk in his barn in 1863.

Now if you search on "widget history" you will still get that lame page on the widgetcity site instead of the widget museum.

With the current system, even if the widgetcity page has higher PR, the widget museum site will probably rank higher for "widget history" because it will have many more external links with that link text.
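That anchor-text effect can be sketched as a toy scoring function; the sites and link texts below are invented, and the scoring (count inbound anchors containing every query term) is a simplification for illustration:

```python
# Toy sketch of anchor-text relevance: score a page for a query by
# counting external links whose anchor text contains all query terms.
# Site names and anchors are hypothetical.
inbound_links = {
    "widgetcity.example/history": ["widgets", "buy widgets", "widget shop"],
    "widgetmuseum.example": ["widget history", "history of widgets",
                             "the widget museum", "widget history archive"],
}

def anchor_score(page, query_terms):
    """Count inbound anchors that contain every query term as a word."""
    return sum(
        all(term in anchor.split() for term in query_terms)
        for anchor in inbound_links[page]
    )

query = ["widget", "history"]
scores = {page: anchor_score(page, query) for page in inbound_links}
# The museum wins on "widget history" despite the shop's raw popularity,
# because that is what people link to it *about*.
```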

Morgan

7:23 am on May 8, 2003 (gmt 0)

10+ Year Member



Google already knows how to do popularity. They use it to adjust AdWords right now, showing ads that get a response more often than those that don't.

I don't think there'd be anything wrong with doing it, and they could do it without using toolbar info. For example, if the destination of a result is www.example.com, they send you to google.com/redirect?url=www.example.com, and they have all the stats they'd ever need on which listings get the best response.
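The redirect-counting idea amounts to a click-logging endpoint. A minimal sketch, with hypothetical hostnames and a plain in-memory counter standing in for whatever storage a real engine would use:

```python
# Sketch of click-through counting via a redirect endpoint: the engine
# rewrites result links through /redirect, records the click, then
# forwards the user to the real destination. All names are invented.
from collections import defaultdict
from urllib.parse import parse_qs, urlparse

click_counts = defaultdict(int)  # destination URL -> number of clicks

def handle_redirect(request_url):
    """Record the click, then return the destination to forward to."""
    query = parse_qs(urlparse(request_url).query)
    destination = query["url"][0]
    click_counts[destination] += 1
    return destination

# Every click on a result tells the engine which listing was chosen.
handle_redirect("https://google.example/redirect?url=www.example.com")
handle_redirect("https://google.example/redirect?url=www.example.com")
handle_redirect("https://google.example/redirect?url=www.example.org")
```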

The things that get clicked will be the things that appear most relevant to the searcher. If you were searching for SARS research, you're not going to click on Britney Spears results just because she's popular in general. You'll click where you think you're going to find what you're looking for.

Also, PR definitely has a lot more than 10 values, Google just chooses to define toolbar values that way.

BigDave

7:34 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Doing something with AdWords is very different from trying to find the most relevant information for a searcher.

heini

7:44 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Changed the title a bit so that people know what it's all about.

In a way Google's very foundation is popularity measurement, only they base it on webmasters' votes, not on users' votes.

Including popularity among users in the algo in a meaningful manner would require a lot more than just a simple CTR-style measurement.
Popularity in itself says nothing about the usefulness or relevancy of a result to a query.
Essentially Google would have to get a grip on how satisfied users are with a result once they click it. Time spent on the page would be a bare beginning for measuring that.
I don't even start thinking about the privacy issues involved.

chiyo

7:51 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



markdidj, I understand your problem. Our sites use JS a lot too, but we live with Google by doing things that overcome the problem, like including text on the page. But surely this is a different topic really, about how to get multimedia or interactive sites better ranked?

I can't yet see any solutions that would be "better" than the present systems, or ones that are scalable. For AdWords there are far fewer data points, and people pay for the service, which means more budget can be spent on it. To click-track all clicks on Google, analyse them, and incorporate them into the database would be a massive undertaking. And as I argued before, still open to abuse.

And how do you operationalize "popularity"? Number of unique hits? Amount of time spent on site? Number of pages viewed? Number of returning users? One of the major problems with systems like this is that popularity can be due more to marketing than to the site itself. Savvy marketers can very well draw viewers to a site, but if it's not very good they leave quickly! If you just counted visitors, that site would be seen as "popular", but in the event it's not; it was just well marketed.
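Those competing definitions really do give different answers. A toy illustration, computed from a made-up visit log of (visitor, page, seconds on page) tuples:

```python
# Each "popularity" metric from the list above, computed on fake data.
visits = [
    ("v1", "/home", 5), ("v1", "/drums", 120),
    ("v2", "/home", 3),
    ("v1", "/home", 40),   # v1 comes back later: a returning user
    ("v3", "/home", 2),
]

unique_visitors = {v for v, _, _ in visits}          # unique hits
total_time = sum(t for _, _, t in visits)            # time spent on site
pages_viewed = len(visits)                           # pages viewed
returning = {v for v in unique_visitors              # returning users
             if sum(1 for w, _, _ in visits if w == v) > 1}

# A well-marketed but shallow site can score high on unique visitors
# yet low on time spent or returns -- the metrics disagree.
```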

Remember, with Google at least, an interactive site like yours can be optimized by almost all of the known techniques that are very influential: title tag, incoming hypertext links, link popularity from other sites. That's even before you start adding on-page text to augment your other material.

We have to optimise for all promotional options. Google is just one, and it specialises in plain-text sites, because generally people want to find info quickly and less often want to go to a slower-loading multimedia presentation.

I know Google accounts for a lot of SE traffic now, but that will not always be the case.

One thing you have to do, if you want Google traffic, is make compromises.

One thing we did was change our home page to almost all text, small and very simple. The more complex stuff goes on pages linked from the home page.

On sites where we don't want to make any compromises at all, we don't worry about Google and use AdWords or paid advertising.

sit2510

8:50 am on May 8, 2003 (gmt 0)

10+ Year Member



If the same people are still working at Google, I would not worry much about the future: from the past to the present, they have proven how smart and intelligent the G people are in edging out their giant competitors. But what is more important is that they are really FAIR and ETHICAL!

The "predated" backlinks shown in www2, www3, and sj appear to imply that G are VERY seriously contemplating the new popularity algo, as the strange behaviour seems to run only on secondary datacenters. Has anyone seen the dance in www? For me, I don't see any dance since the update began around 48 hours ago, which is not the same as before. Very curious to see what the new popularity algo would be, which might mean another round of sweating again.

vitaplease

10:30 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Methods and apparatus for employing usage statistics in document retrieval [appft1.uspto.gov]

"Methods and apparatus consistent with the invention provide improved organization of documents responsive to a search query. In one embodiment, a search query is received and a list of responsive documents is identified. The responsive documents are organized based in whole or in part on usage statistics."

from one of the Google patents as Rubble88 posted here: [webmasterworld.com...]

heini

10:38 am on May 8, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>usage statistics
Meaning Google wants our logs?
JK, but that is exactly what I had in mind with my remark on privacy issues.
Measuring user satisfaction would indeed mean measuring, and that can only be done with full access to the webserver serving the site.