Plenty of MSNbot visits, very few pages on index

been visited for at least a year

         

walkman

12:22 am on Jan 21, 2005 (gmt 0)



I have been visited for over a year, with tens of thousands of visits, and every page has been seen by the bot, no doubt in my mind. How come so few pages are in the index? I have plenty of links on the two sites that I'm talking about, and they are indexed fully by G and Y!. For example, on one site only 30 or so out of 1500 pages are in MSN.

Is this happening to anyone else?

steveb

12:55 am on Jan 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not to that degree, no. But none of my sites shows as close to fully indexed (50% to 80%, and all are well under 1000 pages), despite every page that isn't brand new being hit many times.

dazzlindonna

5:11 am on Jan 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Make sure this doesn't apply to you - blogs.msdn.com/msnsearch/archive/2004/11/18/266087.aspx

walkman

7:02 am on Jan 21, 2005 (gmt 0)



Thanks Donna,
I don't see anything fishy. The links thing they put up is stupid unless they check manually. I have links, and all are relevant to my site and the page. Definitely not a "link farm". I also changed the site structure recently, though, and that probably didn't help.

So I checked a few of my competitors. One has 8000 pages on Google, 400 or so on MSN. Another site of mine (extremely white hat & not really commercial) has 90,000 on G (half are real, the rest are supplemental since the re-write) and only 1200 pages on MSN. CNN has 800,000+ on Google, about 50,000 on MSN.
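To put those counts side by side, here is a minimal sketch that expresses each MSN count as a percentage of the Google count quoted above. The `coverage` helper and the labels are made up for illustration, and the figures are just the rough numbers from this post, not measurements.

```python
def coverage(indexed, reference):
    """Return `indexed` as a percentage of `reference`."""
    return 100.0 * indexed / reference


# (MSN count, Google count) as quoted in the post; labels are illustrative.
counts = {
    "competitor": (400, 8000),
    "my other site": (1200, 90000),
    "CNN": (50000, 800000),
}

for name, (msn, google) in counts.items():
    print(f"{name}: MSN shows {coverage(msn, google):.2f}% of the Google count")
```

All three come out in the low single digits, so the gap looks similar across very different sites.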

Hmmm...

zeus

2:34 pm on Jan 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Same here. The site has 2500 pages, but only 51 are in the index. I'm not sure why.

Another, newer site of mine got indexed in Google. It has about 1000 pages but only 11 are indexed. The weird thing is that it's the same on MSN.com: also only 11 pages indexed, and they're not the same pages.

dazzlindonna

3:41 pm on Jan 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, I can't even get msnbot to come to one of my sites - no peep from it yet. I've got decent links pointing to it from indexed pages on other sites, but the bot hasn't been anywhere near my site. It's been a couple months...I'll keep getting links and hoping to lure msnbot my way. Then, maybe I'll have to worry about having your problem. :)

Imaster

4:00 pm on Jan 21, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's happening for most of the sites I monitor. Not only them, all our competitors' sites show the same behavior. So it's more like a testing phase for MSN, and we should probably see a complete index in a couple of days, if Bill gets time to cut the ribbon.

ccton

4:22 am on Jan 22, 2005 (gmt 0)

10+ Year Member



Same for me. The site has about 180,000 pages. Google indexed about 35,000 and Yahoo indexed 450. MSN crawled 9,000 pages during the last 3 weeks, but only 93 of them are indexed.

The most unbelievable thing is Ask Jeeves: it crawled over 120,000 pages last month, and only 115 pages are indexed.

I guess one has to pay for inclusion in the directory related to each search engine (i.e. Yahoo related to venture, ...) and then get more pages indexed.

Hope I'm wrong

zeus

11:11 am on Jan 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maybe it is because of a test phase. That would also mean we will see HUGE changes in the SERPs. Maybe it will be more theme-related search, so we no longer see sites that have only one page relevant to the search.

dazzlindonna

12:00 am on Jan 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



3 days ago I was lamenting that msnbot hadn't visited - today it did! Woohoo! Now maybe, my site can start its walk down the golden runway of msn search. :)

ccton

11:28 am on Jan 28, 2005 (gmt 0)

10+ Year Member



OK, it dropped from about 253 to 93, then went up to 167, and just a minute ago it went to 7040 pages indexed.

Not bad :)

zeus

6:58 pm on Feb 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Any news on why MSN does not include all of a site's pages? In my example it now has 44 pages indexed, but the site has about 2500 pages, while some of my other sites are fully included. So what are the criteria for getting fully included? Any news?

eyezshine

9:31 am on Feb 5, 2005 (gmt 0)

10+ Year Member



I saw a blog on MSN saying they index a site by how many clicks it takes to get to your page. They won't index a page 5 levels deep in your site unless there is another website linking to that page.

So I have five sites that have the same type of category structure and I tested this theory by linking the internal pages of the 5 sites together without cross linking them.

The result was about 1,000 pages indexed per day, per site, for 2 weeks.

The sites all averaged a 1,000-page increase every day that I checked.

But with all those pages indexed, you would think my traffic would have increased on those sites? It did, but so little that I didn't notice it at all.

Maybe the pages have to get some kind of page rank before they actually rank where they should? So I'll wait a month and see if the traffic increases.

But to me they are trying to rank pages similar to how Google does it, except they only value single pages and not an entire site's PR like Google.

So each individual page has to get its own links to rank well. And if the page has no external links pointing to it and it takes more than 2 clicks from the home page to get to that page, then MSN thinks that page isn't important enough to index.
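For anyone who wants to check their own site against this theory, here is a rough sketch. It does a breadth-first search from the home page to get each page's click depth, then lists pages that are deeper than a cutoff and have no external inbound link. The 2-click cutoff is only the guess from this post, not anything MSN has documented, and the function names and sample link map are invented for illustration.

```python
from collections import deque


def click_depths(links, home):
    """Fewest clicks needed to reach each page from `home` (breadth-first search)."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def at_risk_pages(links, home, externally_linked, max_clicks=2):
    """Pages deeper than `max_clicks` (an assumed cutoff) with no external inbound link."""
    depths = click_depths(links, home)
    return sorted(
        page
        for page, depth in depths.items()
        if depth > max_clicks and page not in externally_linked
    )


# Hypothetical internal link map: page -> pages it links to.
site = {
    "/": ["/cat-a", "/cat-b"],
    "/cat-a": ["/cat-a/sub"],
    "/cat-a/sub": ["/cat-a/sub/item-1", "/cat-a/sub/item-2"],
    "/cat-b": [],
}

print(at_risk_pages(site, "/", externally_linked={"/cat-a/sub/item-2"}))
```

In this sample both item pages sit 3 clicks deep, but only `/cat-a/sub/item-1` is flagged, because the other one is assumed to have a link from another website.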

This is one of the reasons MSN's results aren't quite up to par because they don't index deep enough to get the good pages from a site.

I think they should index sites backwards. Just index the home page and then index from the deepest pages to the shallow pages because in reality the deepest pages are really the most important for most sites.

That would give them much better results than all the engines if you ask me. Just give the pages with the least links pointing at them the highest rank.

HAHAHA! Negative page rank! The next generation of SEO.

zeus

8:12 pm on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



eyezshine, I think you got it. That sounds like the real answer to this, thanks.

steveb

9:42 pm on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know if that is true, but it is a good explanation, and given the amazingly anti-sensible choices MSN has made in terms of other things, this deliberate anti-quality bias would fit right in.

If the above is true, this algo is intended to favor blog comment spam (and link buying to a smaller degree).

I haven't seen the 5 levels comment, but clearly they aren't doing that. Anything more than one click seems to be very likely ignored, and even one click pages can be ignored.

What a "search engine": titles of pages are ignored, location of ISP and not content is the prime criterion, and 100 five-page sites crosslinking to each other can all be indexed and algo-loved, while one 500-page site gets 5% of its pages indexed.

Worst conceptual thinking ever.

eyezshine

11:15 pm on Feb 5, 2005 (gmt 0)

10+ Year Member



The 5 levels comment was a small exaggeration. It could be 2 levels or 3 or 4 depending on how many links are pointing at the home page.

They must give pages some kind of page rank because if they didn't they wouldn't index as deep as they do now.

One of my sites they indexed 3 levels deep without any of my help. So just the popularity of the home page was enough to get 3 levels indexed.

I am assuming they are trying to keep their index as small as possible while still serving good enough results while they test everything and learn from their engine.

Once they start to trust their engine I think they will turn up the knob and index deeper. It's easier to work on a small database than it is to work on a big one.

sincraft

10:55 am on Feb 7, 2005 (gmt 0)

10+ Year Member



Well, maybe someone could explain to me why I have 650+ listings on MSN, even some rank #1, page #1 listings, yet >0< listings in G or Y after being up now for 1.5 months..

MSN seems to index things a bit differently than most too. I like the way they list things..

S

zeus

10:22 pm on Feb 9, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



eyezshine, well, your suggestions didn't work for me.

eyezshine

10:42 pm on Feb 9, 2005 (gmt 0)

10+ Year Member



It's working for me so far. It's all in the way you link the pages together.

I just draw 5 dots in a circle. Each dot = one of my websites. Then just connect the dots without cross linking sites together.

It only works one way if you have less than 5 sites. Each internal page should be able to link to 1 page on 2 other sites. But do not cross link the pages! Like A -> B and B -> A is bad.
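Here is one way to read the "dots in a circle" scheme as code: each site links to the next two sites around the circle, and a second helper checks for pairs that link to each other. This is only a sketch of my interpretation of the post. The names `ring_links` and `reciprocal_pairs` and the site labels are made up.

```python
def ring_links(sites, fan_out=2):
    """Each site links to the next `fan_out` sites around the circle."""
    n = len(sites)
    return {
        site: [sites[(i + step) % n] for step in range(1, fan_out + 1)]
        for i, site in enumerate(sites)
    }


def reciprocal_pairs(plan):
    """Pairs of sites that link to each other (the 'A -> B and B -> A' case)."""
    return sorted(
        {
            tuple(sorted((a, b)))
            for a, targets in plan.items()
            for b in targets
            if a in plan.get(b, [])
        }
    )


five = ring_links(["A", "B", "C", "D", "E"])
print(five["A"])               # sites that A links out to
print(reciprocal_pairs(five))  # empty list means no cross links

four = ring_links(["A", "B", "C", "D"])
print(reciprocal_pairs(four))  # with only 4 sites, two links each forces cross links
```

With 5 sites the plan has no pair linking both ways. With 4 sites and two outgoing links each, A and C end up linking to each other, and so do B and D, which fits the remark that fewer than 5 sites only works one way (set `fan_out=1` there).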

Of course I haven't seen a big increase in traffic, but the traffic seems to be slowly going up every day. So far it's been 3 weeks since I did this, and it has definitely increased my pages being indexed on all 5 sites in MSN.

HenryUK

3:12 pm on Feb 12, 2005 (gmt 0)

10+ Year Member



Depth of pages in site does seem to me to be crucial.

I have looked at the number of pages indexed relative to Google on some different types of site. Google-optimised db-driven sites seem to do relatively badly, news/info sites are doing a lot better.

I have a theory about why MSN is working in this way. I think what they are trying to do is to scoop up authoritative information from news and information sites that are regularly updated.

So it will not matter where a story ends up (ie deep in a well-catalogued archive), what matters in terms of importance to MSN is "was this information at some point on or near the home page?"

This is an uncomfortable situation for those like myself who run large db-driven sites and have done well out of optimising them for the Google algo.

However, MSN has to distinguish itself from Google, and it can be argued that their approach has its own validity.

To test my hypothesis I've added a new page close to the home page that links to all the previous day's new data. If anyone is interested, I'll report back on how this works.
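A page like that could be generated along these lines. This is a minimal sketch that assumes each record carries a URL, a title and the date it was added. The field names, the `whats_new_page` function and the sample records are all hypothetical, not how my site actually does it.

```python
from datetime import date, timedelta
from html import escape


def whats_new_page(pages, today):
    """Build a simple HTML list linking to every page added the previous day."""
    yesterday = today - timedelta(days=1)
    fresh = [p for p in pages if p["added"] == yesterday]
    items = "\n".join(
        f'<li><a href="{escape(p["url"], quote=True)}">{escape(p["title"])}</a></li>'
        for p in fresh
    )
    return f"<h1>New on {yesterday.isoformat()}</h1>\n<ul>\n{items}\n</ul>"


# Hypothetical records pulled from the site's database.
records = [
    {"url": "/widgets/red-42", "title": "Red widget in Largeville", "added": date(2005, 2, 11)},
    {"url": "/widgets/blue-7", "title": "Blue widget", "added": date(2005, 2, 1)},
]

print(whats_new_page(records, today=date(2005, 2, 12)))
```

Linking this one page from the home page puts every new record two clicks from the front door, which is the point of the test.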

jimh009

12:44 pm on Feb 15, 2005 (gmt 0)

10+ Year Member



I thought I was having the same problem. MSN Bot has been, by a huge margin, the biggest bandwidth sucking machine on my site the past few months. It's actually nearly brought down my forum a few times - I've had more than 200 MSN bots on my forum as "guests" a few times.

However, MSN bot has diligently put the pages in their index. Initially, I thought that they were spidering in the pages but not putting them in. But, on further research, I realized I was not using the proper commands to see what pages are in their index.

First, don't rely strictly on this command - site:www.yoursite.com

Using that command I was only able to pull up 250 results for my site (which is over 5000 pages). And much of the results were very odd indeed.

Instead, to find if your deep pages are in the MSN index, try doing the following:

site:www.yoursite.com "unique phrase for your site shared on multiple pages"

The above type of search works superbly well if you have a common phrase shared among the pages of your site. After doing that search, I realized MSN had indeed indexed my deep-down pages. It wasn't ranking them well, but guess that is another story for another day. :)
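If you check several sites or phrases, the two query forms above are easy to build with a tiny helper. This is just a sketch: `site_query` is a made-up name, and it only produces the text to paste into the search box, nothing engine-specific.

```python
def site_query(host, phrase=None):
    """Build a site: query, optionally narrowed by an exact phrase in quotes."""
    query = f"site:{host}"
    if phrase:
        query += f' "{phrase}"'
    return query


print(site_query("www.yoursite.com"))
print(site_query("www.yoursite.com", "unique phrase for your site shared on multiple pages"))
```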

Jim

HenryUK

10:24 pm on Feb 16, 2005 (gmt 0)

10+ Year Member



Interesting stuff Jim - although I have to say that when I tried your tip I got fewer pages coming up than on a straightforward "site:" search.

One new oddity that I've seen is that my home page is beginning to rank well for phrases that include words not on the home page - but phrases for which I have indexed pages that I'd expect to do well.

For example, the site is ranking number one (out of around 10,000) for something along the lines of red widgets [town]. The word [town] does not appear on the page, or in the code. The word red appears once, the word widget appears once. The phrase red widgets brings up around *8 million* results. On this search the home page of the site is top 30, despite the minimal on-page optimisation.

This may seem unfair or unreasonable, but in fact the site is a vast and market-leading source of widgets, red and otherwise, including some in [town], so it's actually not a bad result from a user's point of view.

This raises an intriguing new possibility: that MSN search will go through a site, and when it finds pages within the site that are highly relevant to a particular phrase, it boosts the home page of that site for that phrase.

This is totally counter-intuitive to Google-orientated optimisers, but it makes a kind of sense - when you optimise for Google, do you REALLY want people to land on your optimised deep page, or would you actually rather that they came to your home page and saw all you had to offer? If you're like me, you regard the individual pages as a way of hooking people in, to show you have something relevant, but really you want to move them to your home or search page as soon as you can.

Think before you answer! Don't get stuck with your Google mentality - there's more than one way to skin a search cat!

All just hypothesis for now, and all of course MHO only...

Henry

eyezshine

8:51 am on Feb 17, 2005 (gmt 0)

10+ Year Member



I think I would rather have the visitor come directly to the page that has what they are looking for.

Of course that is not usually what happens even though most engines try hard.

But it would be easier on the engines if they could just index the home page and then spider all the deep pages to know what keywords to rank the home page for. Then let the website owners work it out from there?

That seems like a dumb idea, but who knows with Microsoft?

HenryUK

10:36 am on Feb 17, 2005 (gmt 0)

10+ Year Member



eyez

I see your point and respect your preference. However, there are some sites, like mine, which have sophisticated search options within the site designed to let the user specify which size and type and location of widget they would like.

My search page is better at returning the right widget to a particular user than Google is - as you would expect given our specialisation.

A user looking for red widgets in Largeville will find a page from my site by putting "red widgets Largeville" into Google. Now, my site will have maybe a hundred different pages listing differently specified red widgets in Largeville, and what I want to do when that user arrives is to steer them to my search page where they can be a little more specific about the size and shape of the red widgets they are looking for, and get a better result on which they are more likely to make a transaction.

If this is the way MSN is going and (critically important) the users understand it, then I can see it working very well for some sites.

It's not what we're used to, and it may not work out. But I don't think it's dumb - just different.

H

eyezshine

9:17 pm on Feb 17, 2005 (gmt 0)

10+ Year Member



I see where you're coming from and I can also see how it would help sites a lot. But it will confuse users at first and they will need to be trained.

I really don't see MSN doing this at all though, because if they did they wouldn't cache the pages they index.

I think MSN is just trying to keep their database small while they fine tune it. Once they get the bugs worked out, I think they will open the gates and index more pages deeper in the sites and hopefully rank them better.

Right now MSN sends most of the traffic to my home page simply because the internal pages don't rank well for any good keywords. And the reason for that is because the internal pages don't have much link popularity like the home page does.

MSN is ranking by link popularity, but they are only counting link popularity for individual pages and not entire sites like Google and Yahoo do.

So it looks like each individual page on a site has to get its own link popularity to be able to rank well for any kind of keyword.

Which really does make sense to me, but it makes it harder to optimize an entire site for MSN without getting people to link to your internal pages.