|I LOVE MSN Search|
The reason why I LOVED (and still do) MSN Search was that it was stupid enough to index many of my blogs (and sites) faithfully and give them top ranking for very good keywords even though they were very new! Thanks MSN :) you send me good traffic :) (I get almost zero traffic from Yahoo! and Google for that blog, and it is nowhere near the top of their SERPs for the site's main keyphrase, unlike on MSN).
I have noticed something while checking MSN. First, it probably does not take a site's 'history' much into consideration (as Google strongly does), simply because MSN is new: it has no older records of sites, nor patterns built up from watching sites for years, as Google and Yahoo! have. Besides that, I have noticed a BIG hole in the MSN search engine: it does not 'understand' the concepts behind words! It is very semantically poor (unlike Google, for instance, and also Yahoo!). It reminds me of the old search engines of many years ago, when all they did was check that a web page contained the words you were searching for, even if the words did not make sense together as a concept. It does NOT understand the relationship between words or how when two words come together they should indicate a specific concept, no, MSN just checks that the page contains those two words and that's all! This is a BIG hole in the MSN search system. Perhaps building a component that would figure out the relationship between words and pack them up into concepts would take a lot from MSN, but they must do it or else their search engine will remain stupid.
Until then, which I think will still take a good while, I'll sit back and relax, enjoying the traffic they stupidly send to my new blog for a good keyphrase :)
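To make the complaint above concrete, here is a minimal sketch of the difference between a keyword-only check and a check that at least requires the words to appear together. This is an illustration only, not MSN's (or anyone's) actual algorithm; the function names and the two sample pages are made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def contains_all_words(page, query):
    """Keyword-only match: every query word appears somewhere on the page."""
    page_words = set(tokenize(page))
    return all(word in page_words for word in tokenize(query))

def contains_phrase(page, query):
    """Stricter match: the query words appear next to each other, in order."""
    page_tokens = tokenize(page)
    query_tokens = tokenize(query)
    n = len(query_tokens)
    if n == 0:
        return False
    return any(page_tokens[i:i + n] == query_tokens
               for i in range(len(page_tokens) - n + 1))

# Hypothetical sample pages: only the first one is about the concept "hot dog".
page_a = "Hot dog stands sell a hot dog with mustard."
page_b = "My dog hates hot weather."
query = "hot dog"

for name, page in (("page_a", page_a), ("page_b", page_b)):
    print(name, contains_all_words(page, query), contains_phrase(page, query))
```

Both pages pass the keyword-only check, but only page_a passes the phrase check. Adjacency is still a long way from 'understanding' a concept; it is just the cheapest step beyond counting words.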
I think the fact it doesn't take age as a factor is a good thing. Why should the age have any bearing on whether a page is relevant? All I know is I can't find any sites for new movies, musicians, groups, etc on Google.
I never even mentioned the word "age"! I said history. The history of a site or a page means how its content changed over time, how frequently it changed, how links to it grew over time, etc. History is an essential part of judging the quality of anything in life. When you hire a new candidate, you check his or her 'history' of work, study and activities. When you vote for a leader, you check his or her history of achievements and actions. History is an integral part of establishing credibility. We use it all the time. For web sites, it is valuable too. As for the "age" of a site or page, this is only one tiny part of its history.
The second fatal mistake you made in your post is using the word "relevant". What I talked about had nothing to do with relevance! I was talking about measuring the quality and value of a site or page. Those are two different things. A search engine, any search engine, attempts to do two things: 1- determine the value of a site/page 2- determine the relevance of a site/page to a search query. Those are two different processes. I was talking about the first one and not the second which you referred to!
Please try to be more accurate next time you talk here at WW.
|was stupid enough to index many of my blogs (and sites) faithfully and give them top ranking for very good keywords even though they were very new! |
Age has nothing to do with it. Unfortunately, if you have stupid blogs and sites, then the traffic won't do much good toward building your branding. That isn't MSN's problem, that's your problem until you implement better quality content development.
|It does NOT understand the relationship between words or how when two words come together they should indicate a specific concept, no, MSN just checks that the page contains those two words and that's all! |
What you describe is how TF/IDF indexing works, and it just so happens that MSN has some of the best LSI minds on board. Exactly what other kind of technology do you think should be implemented that can emulate human comprehension? Does it have a name?
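For anyone who has not met TF/IDF, here is a minimal sketch of the usual textbook form (term frequency times the log of inverse document frequency). It is not a claim about what MSN runs; the function names and the three sample documents are assumptions made up for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf_scores(docs, query):
    """Score each document against the query with a plain TF-IDF sum."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in counts:
                continue  # term absent from this document contributes nothing
            tf = counts[term] / len(tokens)       # how prominent the term is here
            idf = math.log(n_docs / df[term])     # how rare the term is overall
            score += tf * idf
        scores.append(score)
    return scores

# Hypothetical mini-corpus.
docs = [
    "stem cells in cancer research",
    "the dog chased the cat",
    "cancer cells and the immune system",
]
scores = tf_idf_scores(docs, "stem cells")
print(scores)
```

Each query word is scored on its own and the results are summed, so nothing here knows that "stem cells" is one concept rather than two unrelated words.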
Yes it does have a name: semantic networks.
|I think the fact it doesn't take age as a factor is a good thing. Why should the age have any bearing on whether a page is relevant? All I know is I can't find any sites for new movies, musicians, groups, etc on Google. |
I'll second that... it's really refreshing being able to find brand new sites in their search engine.
|it's really refreshing being able to find brand new sites in their search engine. |
Search engines index new sites; I am talking about giving them high rankings, which is a different story.
Use search by date to find fresher sites. Any newbie can do that.
I just have to chip in on the LSI part of this.
As someone who has studied LSI and created a working system for a scientific research site I'd say I have an idea of the resources it would take to 'understand' word relationships.
The database that we created has over 25,000 dimensions. Although it is mostly empty, due to the nature of LSI, it still takes up 2GB of memory (which it needs to run in to be fast enough). The data was created from 268,000 research papers and took a huge amount of processing power.
The result of this is merely a database that can categorise what area of research any document heading is likely to come under, yet it still gets things wrong occasionally. It can also tell you related words but can't 'understand' concepts. For example, it can tell you that 'stem cells' are often referred to in cancer research papers (and specific types) but if you ask it 'how large are stem cells' it will not provide a suitable answer.
To scale up to even a 100,000 term dictionary (no numbers included, just whole English words) and get it to 'understand' concepts would be a massive task. Huge resources would be required, probably entering the petabyte range due to additional data about proximity, and the people to do this are not exactly abundant.
So let's celebrate the fact that MSN and others can't understand concepts yet and enjoy working out how to please their algos.
When I used the word "understand" I did not really mean it in the human (nor even the AI) sense of the word. I did not mean that the system could answer questions such as "How large is a ..." That is a branch of AI and would require HUGE amounts of resources that cannot realistically be made available for something on the scale of the net (though Google claims to be doing 'research' in that direction, but that's another story).
What I did mean by "understand" was just figuring out the relationships between words (relationships are really what matters here, as the WordNet project discovered after a long time).
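As a rough illustration of what "relationships between words" could look like as data, here is a toy semantic network: a graph whose edges carry a relation label. This is a sketch under assumptions, not WordNet's actual format or API; the class, the relation names (`is_a`, `studied_in`) and the sample entries are all hypothetical.

```python
from collections import defaultdict

class SemanticNetwork:
    """Toy graph of labeled word-to-word relations (hypothetical structure)."""

    def __init__(self):
        # edges[source][relation] -> set of target words
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        """Record that `source` stands in `relation` to `target`."""
        self.edges[source][relation].add(target)

    def related(self, word, relation):
        """Direct targets of `relation` starting from `word`."""
        return set(self.edges.get(word, {}).get(relation, set()))

    def ancestors(self, word, relation="is_a"):
        """Follow `relation` transitively and collect every word reached."""
        seen = set()
        stack = [word]
        while stack:
            current = stack.pop()
            for target in self.related(current, relation):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

# Hand-entered sample relations, for illustration only.
net = SemanticNetwork()
net.add("stem cell", "is_a", "cell")
net.add("cell", "is_a", "biological unit")
net.add("stem cell", "studied_in", "cancer research")

print(net.ancestors("stem cell"))
print(net.related("stem cell", "studied_in"))
```

The lookups are trivial; the hard and expensive part is filling such a graph with reliable relations for every term on the web, which is where the resource problem comes in.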
If you do nothing to retain that placement, you will drop in the results. MSN has a tendency to place new sites well, and then if the site is worth it, it will stay in a good ranking. Otherwise it will be pushed out by other sites, given time.
I've gained a couple of links back to me after some networking and was so glad to get them. Surprisingly, my ranking shot down in the MSN SERPs shortly after that!
Umm, maybe MSN interpreted that rapid or instant boost in links as spam, or perhaps it did not much like the sites that linked to me, or more probably it thought they were some kind of 'spam' network, since those sites were indeed within a ring (or a couple of rings) related to my topic.
I was so unhappy about that. And I thought that it's better not to let others link to me unless they are really good sites, 'cause MSN interprets not-so-good links as a bad thing and actually drops your ranking for that. I never thought such a thing could happen!
Anyway, the good news is that now and after some time, my rank floated back up again, and now even higher than it was before, thank God, hurray!
Thanks again MSN. (I'm #4 for my main keyword in MSN, thank God.)