Forum Moderators: mack
I have noticed something while checking MSN. First, it probably does not take a site's 'history' much into consideration (as Google strongly does), simply because MSN is new: they have no older records of sites, nor the patterns that Google and Yahoo! have built up by watching sites for years. Beyond that, I have noticed a BIG hole in the MSN search engine: it does not 'understand' the concepts behind words! It is very semantically poor (unlike Google, for instance, and also Yahoo!). It reminds me of the old search engines of many years ago, when all they did was check that a web page contained the words you were searching for, even if the words did not make sense together as a concept. It does NOT understand the relationship between words or how when two words come together they should indicate a specific concept, no, MSN just checks that the page contains those two words and that's all! This is a BIG hole in the MSN search system. Perhaps building a component that would figure out the relationship between words and pack them up into concepts would take a lot from MSN, but they must do it or else their search engine will remain stupid.
Until then, which I think will still take a good while, I'll sit back and relax, enjoying the traffic they stupidly send to my new blog for a good keyphrase :)
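To make the complaint above concrete, here is a minimal Python sketch of the difference between checking that a page merely contains both words and checking that the words appear together. It is only an illustration: the function names and the two sample pages are made up, and nothing here is claimed to be how MSN (or any real engine) works.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def contains_all_words(page, query):
    """Bag-of-words test: every query word appears somewhere on the page."""
    words = set(tokenize(page))
    return all(term in words for term in tokenize(query))

def contains_phrase(page, query):
    """Stricter test: the query words appear next to each other, in order."""
    words = tokenize(page)
    terms = tokenize(query)
    n = len(terms)
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))

# Hypothetical sample pages, invented for this example.
pages = {
    "a": "Stem cell research is changing cancer treatment.",
    "b": "The plant stem was cut; the prison cell was cold.",
}

for name, text in pages.items():
    print(name,
          contains_all_words(text, "stem cell"),
          contains_phrase(text, "stem cell"))
```

Page "b" passes the first test but fails the second, which is the kind of match the post is complaining about. Even the phrase test is still plain word matching, not concept understanding.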
The second fatal mistake you made in your post is using the word "relevant". What I talked about had nothing to do with relevance! I was talking about measuring the quality and value of a site or page. Those are two different things. A search engine, any search engine, attempts to do two things: 1- determine the value of a site/page 2- determine the relevance of a site/page to a search query. Those are two different processes. I was talking about the first one and not the second which you referred to!
Please try to be more accurate next time you talk here at WW.
was stupid enough to index many of my blogs (and sites) faithfully and give them top ranking for very good keywords even though they were very new!
Age has nothing to do with it. Unfortunately, if you have stupid blogs and sites, then the traffic won't do much good toward building your branding. That isn't MSN's problem, that's your problem until you implement better quality content development.
It does NOT understand the relationship between words or how when two words come together they should indicate a specific concept, no, MSN just checks that the page contains those two words and that's all!
What you describe is how TF/IDF indexing works, and it just so happens that MSN has some of the best LSI minds on board. Exactly what other kind of technology do you think should be implemented that can emulate human comprehension? Does it have a name?
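For anyone who has not met TF/IDF, a minimal sketch of one common variant follows (term frequency times the log of inverse document frequency, summed over the query terms). The documents and the function name tf_idf_scores are made up for illustration, and this is not a claim about what MSN actually runs.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf_scores(docs, query):
    """Score each document by summing tf * idf over the query terms."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in counts:
                tf = counts[term] / len(tokens)      # share of the document
                idf = math.log(n_docs / df[term])    # rarer terms weigh more
                score += tf * idf
        scores.append(score)
    return scores

# Hypothetical documents, invented for this example.
docs = [
    "stem cell research news",
    "the plant stem was cut and the prison cell was cold",
    "weather report for today",
]
scores = tf_idf_scores(docs, "stem cell")
print(scores)
```

Each query word is weighted on its own, so the second document still gets a non-zero score for "stem cell" even though it has nothing to do with stem cells. It just scores lower because the two words make up a smaller share of the text.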
I think the fact it doesn't take age as a factor is a good thing. Why should the age have any bearing on whether a page is relevant? All I know is I can't find any sites for new movies, musicians, groups, etc on Google.
I'll second that... it's really refreshing being able to find brand new sites in their search engine.
As someone who has studied LSI and created a working system for a scientific research site, I'd say I have an idea of the resources it would take to 'understand' word relationships.
The database that we created has over 25,000 dimensions. Being mostly empty, due to the nature of LSI, it still takes up 2GB of memory (which it needs to run in to be fast enough). The data was created from 268,000 research papers and took a huge amount of processing power.
The result of this is merely a database that can categorise what area of research any document heading is likely to come under, yet it still gets things wrong occasionally. It can also tell you related words but can't 'understand' concepts. For example, it can tell you that 'stem cells' are often referred to in cancer research papers (and specific types) but if you ask it 'how large are stem cells' it will not provide a suitable answer.
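To show the "related words" half of that in miniature, here is a rough Python sketch. Note the assumptions: it is NOT LSI proper (it skips the SVD / dimension-reduction step entirely) and only compares raw rows of a sparse term-by-document matrix with cosine similarity. The four document titles and all function names are made up for illustration.

```python
import math
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_vectors(docs):
    """Build a sparse term-by-document matrix as {term: {doc_id: count}}.

    Only non-zero cells are stored, since most of the matrix is empty.
    """
    vectors = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            vectors[term][doc_id] = vectors[term].get(doc_id, 0) + 1
    return dict(vectors)

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[d] for d, w in u.items() if d in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def related_terms(vectors, term, top_n=2):
    """Return the top_n terms whose document profile is closest to `term`."""
    target = vectors.get(term, {})
    sims = [(other, cosine(target, vec))
            for other, vec in vectors.items() if other != term]
    # Highest similarity first; ties broken alphabetically.
    sims.sort(key=lambda pair: (-pair[1], pair[0]))
    return sims[:top_n]

# Hypothetical paper titles, invented for this example.
docs = [
    "stem cells in leukemia therapy",
    "leukemia stem cells and relapse",
    "galaxy formation in the early universe",
    "dark matter and galaxy clusters",
]
vectors = term_vectors(docs)
print(related_terms(vectors, "stem"))
```

With these titles, "cells" and "leukemia" come out as the closest terms to "stem", purely because they occur in the same documents. Nothing in the data could answer "how large are stem cells".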
To scale up to even a 100,000 term dictionary (no numbers included, just whole English words) and get it to 'understand' concepts would be a massive task. Huge resources would be required, probably entering the petabyte range due to additional data about proximity, and the people to do this are not exactly abundant.
So let's celebrate the fact that MSN and others can't understand concepts yet and enjoy working out how to please their algos.
What I did mean by "understand" was just to figure out the relationship between words (relationships are really what matters here, as the WordNet project discovered after a long time).
Umm, maybe MSN interpreted that rapid or instant boost in links as spam, or perhaps it did not like so much the sites that linked to me, or more probably it thought it was kind of a 'spam' network or something like that, since indeed those sites were within a ring (or a couple of rings) related to my topic.
I was so unhappy about that. And I thought that it's better not to let others link to me unless they are really good sites, 'cause MSN interprets not-so-good links as a bad thing and actually drops your ranking for that. I never thought such a thing could happen!
Anyway, the good news is that now, after some time, my rank has floated back up again, and it is even higher than it was before, thank God, hurray!
Thanks again MSN. (I'm #4 for my main keyword in MSN, thank God.)