Msg#: 3732122 posted 3:48 am on Aug 27, 2008 (gmt 0)
MSN has over 15,000 pages indexed for my site. My site has fewer than 2,500 pages. How can I remove the whole thing from their index and start over with a new XML sitemap?
I have already removed it from their Webmaster Central. An MSN rep has informed me that this will not remove the site from their index. He did not give me the courtesy of telling me how to remove it from the index.
Msg#: 3732122 posted 3:00 pm on Aug 29, 2008 (gmt 0)
Someone will tell you that you just need to update your robots.txt, but that won't get pages that are already indexed kicked out.
Sounds to me like you have a duplicate URL issue. If you REALLY want to get it out of the index, then you would need to serve cloaked content to MSNBot only, giving it a 301 redirect to the home page... every time the spider looks at an old URL, it should then in theory update to the home page, thus taking the old URL out of the index.
BUT... that's not what I would advise. It's a lot of effort and who knows if MSNBot will understand it properly anyway. I think you need to find out why MSN has indexed so many more URLs than you think you have. Then you need to find a way to make it impossible for anyone (humans and bots) to see multiple URLs for the same content. You should do this by redirecting users to a standard URL syntax (OK - if P1R is reading... URI syntax). The redirect should again be a 301, and you also need to be careful that you don't 301 a 301 (that is, redirect to a URL that itself redirects).
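To make the "standard URL syntax" idea concrete, here is a rough Python sketch of how a site might work out the one canonical URL for a request and decide whether a 301 is needed. Everything specific in it is an assumption for illustration only: the host name www.example.com, the list of ignored parameters, the default document names, and the function names canonical_url and redirect_for are all hypothetical, and your own rules will depend on why the duplicates exist. The point is that the final URL is computed in one step, so the redirect never points at another redirect.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumptions for illustration only: the preferred host name, the
# parameters that do not change page content, and the default
# document names are all hypothetical.
CANONICAL_HOST = "www.example.com"
IGNORED_PARAMS = {"sessionid", "sid", "ref"}
DEFAULT_DOCS = ("index.html", "index.php")


def canonical_url(url):
    """Return the single standard form of a URL on this site."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Treat /dir/index.html and /dir/ as the same page.
    for doc in DEFAULT_DOCS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]
            break
    # Drop parameters that only create duplicates, and sort the rest
    # so that ?a=1&b=2 and ?b=2&a=1 map to one URL.
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in IGNORED_PARAMS
    )
    # Scheme and host are forced to one form; the fragment is dropped.
    return urlunsplit(("http", CANONICAL_HOST, path, urlencode(query), ""))


def redirect_for(url):
    """Return (301, target) if a redirect is needed, else None."""
    target = canonical_url(url)
    if target == url:
        return None
    return (301, target)
```

For example, redirect_for("http://example.com/index.html?sid=abc&b=2&a=1") would send the visitor straight to the canonical form with a single 301, and calling redirect_for on that target returns None, so there is no second hop.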
Then... human or computer... the system should correct itself over time.
There may be a short-term downside here. Google may also see all the 301s and cause you a short-term bit of grief, as I've been hearing that wholesale changes are Google's current pet subject - but there's no substitute long term for doing things right... and that is to avoid duplicate URLs.
[edited by: Receptional at 3:02 pm (utc) on Aug. 29, 2008]
Msg#: 3732122 posted 5:20 am on Sep 1, 2008 (gmt 0)
Actually, that count of the number of pages indexed is wrong on every single site I've ever seen. If you click your way through the results far enough (to the end, for a small site), that number will change to something more like what it really is.
That number of pages figure has been wrong 100% of the time for as far back as I can remember.