
Personalization Based Upon Click Throughs

Ways to Personalize results

         

iseff

5:32 pm on Jul 1, 2003 (gmt 0)

10+ Year Member



In a related thread [webmasterworld.com], it was asked if Google tracks click throughs. I don't think they do, but that question isn't as important to me right now as the more general question, "Should search engines track click throughs?"

(note: I'm beginning my more active role of posting because of an old post made by Brett [webmasterworld.com])

We're moving quickly towards the era of personalization, and search engines are going to need to be at the forefront of this. Many topics are included under this umbrella, such as natural language processing and the most relevant results based on past searches.

My thought is that personalization based on click throughs is a very interesting idea. There may be a large amount of overhead, but it's something to think about. In my head, there are two basic ways to perform this.

First, a search engine could keep track of the click throughs in its database and compare them against the other results for the same query to see which results users tend to like, then begin to place these results higher. This would be a general version of personalization, based upon most people's responses. This could be very effective if you have a difficult query to handle and decipher, like the old case where one word means two things. Take 'spider,' for example. If you return the results for a site on search engine spiders first and spiders the animal second, chances are the second result will be clicked far more often. This data can in turn be used to correct this situation. As webmasters we use this type of data in our logs all the time.
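A minimal sketch of this first approach, assuming the engine keeps a simple log of (query, clicked URL) pairs. The function name, the log format and the example URLs are all hypothetical, and a real engine would have to weigh clicks against many other signals:

```python
from collections import Counter

def rerank_by_clicks(results, click_log, query):
    """Reorder the results for one query by how often each URL was clicked.

    results: list of URLs in the engine's original order.
    click_log: iterable of (query, url) pairs, one per recorded click.
    """
    clicks = Counter(url for q, url in click_log if q == query)
    # sorted() is stable, so URLs with equal click counts keep their original order.
    return sorted(results, key=lambda url: -clicks[url])

# Hypothetical data for the 'spider' case described above.
results = ["search-engine-spiders.example", "spiders-the-animal.example"]
click_log = [
    ("spider", "spiders-the-animal.example"),
    ("spider", "spiders-the-animal.example"),
    ("spider", "search-engine-spiders.example"),
]
print(rerank_by_clicks(results, click_log, "spider"))
```

With this sample log the animal site moves to the top, because it collected two of the three clicks for the query.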

Second, you could go ultra-personal and start with some cookies. By remembering people's results, you can remember their preferences. This would obviously be great for the same exact search, but in a way it could be good for the general case as well. As we progress, future advancements in natural language processing and structure could allow us to generalize specific user inputs and the way they typically type their queries. From these generalizations we may be able to provide more accurate clustering of words and more accurate results - the end goal in search engines.

These are all thoughts, nothing too insightful, but I thought they might open some discussion on theories like this. Please flame me for my stupidity at any time! ;)

Ian

jeremy goodrich

5:38 pm on Jul 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The 'first' point you make is kind of interesting -> and a flashback to DirectHit, which was later purchased by Ask Jeeves. Their 'personalization' essentially tracked click throughs & cookied the end user; the more clicks a site had, the higher it would rise in the SERP.

Trouble is, with more simplistic methods of tracking clicks & 'personalization' of this kind, it's too easily manipulated with a bot that catches & serves cookies, rotates user agent strings, and forges the header info going through a list of anonymous proxy servers ;)

The 2nd idea, while interesting, is a bit problematic due to the metaphorical nature of language itself, as well as the inherent underlying structure of the web. Over time, I could see some linguistic inference being used for better 'personalization' when used in conjunction with a personalized vector calculation, such as that proposed by the BlockRank algorithm or the Quadratic Extrapolation method.

By differentiating the weight of partial vectors in the underlying web map, and using that in conjunction with web site usage data (such as from a toolbar that 'phones home') it is possible to build a better database of surfer likes & dislikes, and then come up with new & sophisticated methods to 'roll that into the algo' so to speak.

Interesting thoughts, though - and NLP is definitely one of the yet untapped avenues of web site categorization on a massive scale, imho, but the problems remain - perhaps with increasing computational power, it will become more feasible to calculate those vectors?

martinibuster

6:22 pm on Jul 1, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Ways to Personalize results

Yahoo has made noises about personalization as something they want to do- but the emphasis seemed more for the ability to target ads, instead of helping people find and do things.

MS Longhorn will integrate search at the desktop level. In an overlooked article in the Seattle Times [seattletimes.nwsource.com], MS dropped a bombshell on what they intend to do:


But Allchin bristled at the comparison. "Google's a very nice system, but compared to my vision, it's pathetic," he said.

Allchin said his goal is to have computers learn about the user, helping set the context for searches.

[edited by: msgraph at 9:01 pm (utc) on July 1, 2003]
[edit reason] cut quote length [/edit]

iseff

5:46 am on Jul 2, 2003 (gmt 0)

10+ Year Member



Jeremy, thanks for the amazing insight. Mind if I pick your brain a bit in the coming days/weeks? :)

You're exactly correct that the first type would be very, very easily manipulated. In time I believe that there must be some sort of search engine which cannot be designed for, optimized for, or spammed. Obviously, this is a pipe dream as it currently stands, but think about cars - 100 years ago when Ford first made his car, there were no doors or roofs and cars were easy to steal. Then came a shell which made it harder. Then came better locks. Then came codes embedded within the keys. Now there are things such as rolling encryption on the new Saabs. They're virtually unstealable. Will someone steal one? Yes, of course - somehow, at some point. Technology for the thieves must catch up first though. By then, hopefully there is new technology available to Saab. Search engines must work this way.

We all know the perfect search engine: One which grabs the entire web, every little piece of data online, takes a query and realizes what the user is actually asking/looking for (this is where personalization comes in), and provides the correct results based upon this info. In a perfect world, SEOs would never be able to catch up quickly enough with the search engine algos to optimize. That way, only true content which does in fact relate, from the heart of the author, will be given in the result.

I have no idea how one would ever be able to do something such as a 'rolling algorithm,' but doesn't it seem great!? :)

Using your second method of adding personalization into the algo would perform this to some extent - each person gets a different result. The SEO then must decide EXACTLY what person he/she is looking to trigger.

martinibuster: MS looks to be primed and ready to take the lead in search if it's on the desktop in Longhorn. Not only do I think it will give users the ease of searching on the desktop, but as an application it will certainly be able to store much more information about the user's preferences for very targeted and personalized results. If they (along with Y!) only see this as an advertising opportunity, they surely do not have the vision they think they have. It will be interesting to see how the SE wars ignite with the launch of Longhorn - if, in fact, this is integrated into the desktop. It might be the perfect opportunity for a new, small guy to break in and steal it all while the big guys break out into a horrific fight. I'm only an undergrad in college, but hopefully I can be that one, right!? ;)

Ian

ulounge

10:54 pm on Jul 2, 2003 (gmt 0)

10+ Year Member



I posted this a while back [webmasterworld.com...] along the same lines... I'm glad there is a thread on it.

My thoughts on building better results...
Google results are on target for the most part.

I really like the way they handle Adwords in that they reward advertisers that have higher click rates and dynamically increase that site's position.

You always find a few sites that don't really fit in the results; sometimes they are completely off topic.

Since Google puts so much stock in DMOZ, why not use a hybrid model between DMOZ and Adwords dynamic listings?

Track click rates on keyword results and drop sites that receive fewer clicks than expected compared to the other sites that are returned. In turn, reward those sites that receive more clicks than expected for their listing position.
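A rough sketch of how that comparison could look, assuming each listing's clicks and impressions are counted per position. The expected click rates in `EXPECTED_CTR` are made-up illustration values, not measured figures, and `adjust_ranking` is a hypothetical name:

```python
# Hypothetical baseline: share of impressions expected to turn into a click
# at each listing position.
EXPECTED_CTR = {1: 0.40, 2: 0.20, 3: 0.10}

def adjust_ranking(listings):
    """Sort listings by observed click rate relative to the expected rate.

    listings: list of (url, position, impressions, clicks) tuples.
    Returns the URLs, best performer relative to its position first.
    """
    scored = []
    for url, position, impressions, clicks in listings:
        observed = clicks / impressions if impressions else 0.0
        # A ratio above 1 means the listing beats expectations for its position.
        ratio = observed / EXPECTED_CTR[position]
        scored.append((ratio, url))
    scored.sort(key=lambda pair: -pair[0])
    return [url for _, url in scored]

listings = [
    ("off-topic.example", 1, 1000, 100),  # well below the expected rate
    ("on-topic.example", 2, 1000, 300),   # well above the expected rate
    ("average.example", 3, 1000, 100),    # about as expected
]
print(adjust_ranking(listings))
```

Dividing by the expected rate matters here: a raw click count would always favour whatever already sits in position 1.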

I often skip many of the results and click only the most pertinent. Basically, this uses the users as human editors.

claus

2:26 pm on Jul 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



allow users the ease of searching on the desktop

At the bottom of my screen there's a button labelled "start". Clicking this reveals an item named "search" which in turn has an option labelled "on the internet"

This has been the case since Windows 98 or 95, so the concept "search integrated with desktop" isn't that new. At the moment, mine points toward google, but this can be changed anytime if i suddenly don't like G anymore. I think, though, that most users stick to the default (i really can't recall what that was ;-)

Back on topic:

I've recently gained a little insight into toolbars - not the G toolbar specifically, rather "add-on/custom toolbars" in general. It's fascinating. With such a device you are actually able to ...well, let's just say monitor stuff and run programs, it won't get too scary that way. Anyway, it's not that far from installing any other type of software - you always have to trust the firm somehow.

Now, if G is not monitoring toolbar use, they are wasting their time. Of course they would do that - if for no other reason, then just to build a better product. On the other hand, if they are not using the results to adjust the serps, this could be for a good reason: Are the toolbar users representative of the users of the main G entrypoints? I think not.

Toolbar users are probably the 10% experts, while the bread and butter is the 90% other users. Adjusted to toolbar clicks, the search term "spiders" would perhaps yield the wrong results for most of their audience, that is.

The idea of the computer knowing the user is somewhat better i think. Not as in "free from privacy issues" but as in "it's closer to the user than the search engine".

It is rather hard for a company like G to develop some kind of personalization that would work for millions of users and still keep the servers running without signs of smoke.

On the other hand personalization at the pc level only has to incorporate preferences of some small number of user profiles. There are fewer elements in the equation, that way.

This is not to say that Microsoft has the lead. Personalization at the pc level could be done using the G toolbar (along with a few "helper apps") as well.

oh... i forgot... is it a good idea?

Well, dunno. I tend to switch all personalization off whenever i come by it (as in "most used menu items" and such). Then again, i'm not exactly an average software user (software used in a very broad sense).

/claus

martinibuster

2:57 pm on Jul 3, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Topic:
Ways to Personalize results
We're moving quickly towards the era of personalization, and search engines are going to need to be at the forefront of this. Many topics are included under this umbrella, such as natural language processing and the most relevant results based on past searches.

At the bottom of my screen there's a button labelled "start". Clicking this reveals an item named "search" which in turn has an option labelled "on the internet"

A search button three clicks off the start button isn't integrated- It is three clicks off the start button.

Longhorn goes way beyond that, and that's what makes it new. What they are proposing is true search integration so that the way you retrieve a file from the hard drive will be the same as how you retrieve something from the internet- it will be seamless.

Their personalized search is tightly integrated with an all new file storage system- no more file folders within file folders. It's completely new.

The context of their proposed personalized search will be the files on your pc- mp3's, docs- everything that your computer says you are, everything that your computer says you like, including most probably your click throughs.

"Many topics are included under this umbrella"
Focusing on click throughs, in the context of a discussion of search personalization, is to limit the whole idea of what personalization is.

It's no stretch to conceive that click throughs will be a small part of Longhorn's search personalization, but what they are proposing is far more comprehensive than that, and any attempt to personalize search will have to encompass much more than a single variable like click throughs.

To take this idea to the next step, you have to ask the question: How do you optimize for a personalized search? We may be heading toward a whole new ball game.

claus

5:16 pm on Jul 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



martinibuster, was i really that much off topic? really, i think i was "under the umbrella". Of course, the integration issue is somewhat off, but the personalization part is right on, i should think?

My key argument (which was perhaps not stated clearly enough) was this:

To personalize anything, you will have to do some data storage, manipulation and retrieval. If you do this on behalf of one user (say, on a pc) it can be done with a reasonable performance.

Example: my five last searches are stored on my pc

If, on the other hand, you have millions of users (say, as a large SE) this will put a lot of pressure on your servers if it has to involve them in the first place.

Example: my five last searches (times x million users) are stored on G

From this perspective, it seems like a good choice to place the "whateveritis" that is actually taking care of the personalization... on the PC. Do it with a bar or with a horn, it's details.
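To make the "five last searches on the pc" example concrete, here is a minimal sketch, assuming the history lives in memory on the user's own machine. The class name `RecentSearches` and its methods are hypothetical; any real toolbar or OS feature would also persist the list to disk:

```python
from collections import deque

class RecentSearches:
    """Keeps only the last few searches of one user, stored locally."""

    def __init__(self, limit=5):
        # A bounded deque silently drops the oldest entry once the limit is hit.
        self._items = deque(maxlen=limit)

    def add(self, query):
        # Re-running an old search moves it to the front instead of duplicating it.
        if query in self._items:
            self._items.remove(query)
        self._items.append(query)

    def latest_first(self):
        return list(reversed(self._items))

history = RecentSearches()
for q in ["a", "b", "c", "d", "e", "f"]:
    history.add(q)
print(history.latest_first())
```

The per-user state is a handful of strings. Kept on the search engine's side, the same structure would have to exist once per user and be looked up on every query, which is the load described above.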


Once (a few years back) a piece of software popped up - i don't know if it's still around, but i still use a copy sometimes. It was called "the brain" and in essence it was a database of pointers. Pointers could be locations on the hard drive, on the network, on the internet, on disks or cd's. And you could set up relations between pointers and topics for them.

That is: The actual documents were distributed, and i did not care where they were located. I navigated subjects that made personal sense to me and when i searched for a keyword, the search ran across all types of storage and documents.

To me, that sounds a bit like the long horn, and perhaps i even like the idea. But this was no "ready-made" - to personalize i had to set up everything manually, and to define headlines, subjects, keywords, and relations myself - just like building a database. It was user-friendly and quite easy to do, but i still had to put in some work.

To make a long story short, personalization ultimately relies on user input. The "whateveritis" has to be told somehow that a and b are related.


Clicks are now the de facto means of navigation on a windows or mac pc (even some *nix flavors use a graphical interface). They are also the tool used on the web.

Reading clicks off the screen and analyzing clickstreams for a particular user could be an interesting way to collect user input about relations - this way the user would not have to explicitly define a lot of stuff, as this could be inferred from activity. So i wouldn't altogether dump the notion of clicks.
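One simple way to infer such relations, sketched under the assumption that clicks are already grouped into sessions (one list of clicked items per sitting). The function name and the sample file names are hypothetical:

```python
from collections import Counter
from itertools import combinations

def related_pairs(sessions):
    """Count how often two items were clicked within the same session."""
    counts = Counter()
    for session in sessions:
        # sorted(set(...)) gives each unordered pair one canonical form
        # and ignores repeated clicks on the same item.
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return counts

sessions = [
    ["report.doc", "budget.xls", "example.com/finance"],
    ["report.doc", "budget.xls"],
    ["holiday.jpg", "example.com/travel"],
]
pairs = related_pairs(sessions)
print(pairs.most_common(1))
```

Pairs that keep showing up together can then be treated as "related" without the user ever defining the relation by hand.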

/claus

martinibuster

5:56 pm on Jul 3, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



So i wouldn't altogether dump the notion of clicks.

You're right, 100%.

iseff

10:32 pm on Jul 3, 2003 (gmt 0)

10+ Year Member



I agree with martinibuster, whatever Longhorn presents will be MUCH easier to use for the average user than a search option hidden away like that. It will capitalize on the new MSN search which has been crawling away for some time now.

With a new engine which presumably will be quite comparable to Google and access to millions of desktops, where does a new player (or even an established player such as G) position themselves?

We know MS will do whatever it takes to get the best results and the most personalized results, because they have the means to (i.e. access to the desktop and ability to easily store data on the machine, as well as ~$30b in cash to work with). They will have the ultimate advantage in personalized results. Where then does that leave the rest of the field? Google surely won't die that quickly? Will they end up pushed out by MS and left only to corporations and universities? It would be bad to see.

However, I don't think G will go down like that. They certainly don't have the funds to compete with MS (does this finally mean IPO time?), but I think they certainly know how to compete as an underdog. All I know is while they are off fighting, I'm going to be working extremely hard finding the best way to launch a new engine - they can compete each other into the ground while I work silently to the top! :)

For someone else to take the top though, they must bring something different. ulounge mentions the happiness of G's results - I beg to differ. Though they are relatively accurate most of the time, a recent study (I can find the link to the WW thread if wanted) showed that something like 50-60% of users were happy with the results. While that's pretty good, it's not great. Imagine being at even just 75%. You'd surely win over many users.

I guess the question to now be asked should be broadened: What will it take, at the web (browser) level to personalize a search engine? I'd like to hear any and all comments about this.

martinibuster

10:47 pm on Jul 3, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



What will it take, at the web (browser) level to personalize a search engine?

That's a great question. I'm not a software engineer so I can't comment with too much authority, but from a layman's point of view it seems that the "web browser" may have reached a limit to its abilities and what's needed is a technology more sophisticated than cookies.

Personalization of search... I'm going to keep my eye on the Overture SE properties, as they've been making a bit of noise about experimenting with new technologies.

claus

7:54 am on Jul 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just remembered...G does personalisation already. You can set preferred language, adult filter and # of results per page - settings are stored in your G cookie, i suppose.

Perhaps that's a point in itself; that good (well executed) personalization tends to be something you forget about as a user when it works fine?

Personalization was a major trend among portals a few years back - now it seems that they've sort of abandoned the idea again. My best guess for a reason is that the possibilities were not widely used by the majority of their users. Some of my own work with portals points in that direction, although that's not for the international market.

I can think of two very important points:

1) ease of use (and setup)

2) not too many possibilities

So, you should choose (a subset of) the most relevant options for personalization, and stick to these. Example: If i could customize the G background color, that would not be a relevant option.

So what are the relevant options, then?

Those G has chosen are good options, imho. Further, i have found the latest x searches to be useful on some occasions. Using a Windows system and IE, they are simply stored in the input box (try double clicking or alt+arrow down) until i clear the input box history.

To do this, all one has to do is to choose a name for the input box that is not likely used anywhere else. The storage takes place on the pc. Then again, if the Yahoo input box has the same name as the G input box, i could reuse searches across engines. I haven't checked to see if they do have the same names.

But to really make the search experience better? That's a tough question.

I can't help thinking about some kind of meta-search. Perhaps one could do a little bit better than "choose how many results you would like to see for each engine" - better parsing, presentation in the same format, and a check for double entries across SE's would be nice features. Also, they would add to the processing time, so perhaps they're not that nice after all.
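A small sketch of the "same format, no double entries" part, assuming each engine's results have already been fetched and parsed into an ordered list of URLs. The engine names and URLs are placeholders, and the URL normalization is deliberately crude:

```python
from urllib.parse import urlsplit

def normalize(url):
    """Crude duplicate key: lowercased host without 'www.', plus the path."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def merge_results(per_engine):
    """Interleave results by rank across engines and drop double entries.

    per_engine: dict mapping engine name -> ordered list of result URLs.
    Returns a list of dicts in one common format.
    """
    merged, seen = [], set()
    longest = max((len(urls) for urls in per_engine.values()), default=0)
    for rank in range(longest):
        for engine, urls in per_engine.items():
            if rank < len(urls):
                key = normalize(urls[rank])
                if key not in seen:
                    seen.add(key)
                    merged.append({"url": urls[rank], "engine": engine, "rank": rank + 1})
    return merged

per_engine = {
    "engine_a": ["http://www.example.com/", "http://a.example/x"],
    "engine_b": ["http://example.com", "http://b.example/y"],
}
for row in merge_results(per_engine):
    print(row)
```

The merging itself is cheap. The extra processing time mentioned above would mostly come from waiting on the slowest engine and parsing each engine's result page.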

Then there's the folder system used by Northern Light. If you could set up folders that made sense to you and gradually fill them, that would be very interesting.

Last, there's the "what are you really searching for" idea. AskJ does something right here. Did you mean to check the word in a dictionary, to consult the yellow pages, to look for a product for sale or did you want to find web pages containing the word?

Don't know if these options are the right ones, this was only a quick brainstorm/wishlist :)

/claus