Forum Moderators: Robert Charlton & goodroi

Mix-up omits key pages from Sitemap. Rankings lost - will we recover?

         

zap995

6:39 pm on Apr 13, 2008 (gmt 0)

10+ Year Member



Our website began as a WordPress-based site. It has since grown into an authority site with lots of very high rankings for very competitive searches in our segment.

We installed a sitemap generator plugin years ago to help with our search optimization. We also use smart URLs in the format of http://www.example.com/page-about-popsicles.html

For the past several months we've been working on a new section of our site, built on a new CMS, and we decided pages generated by this CMS would replace some of the WordPress pages that were not news articles. For example, our imaginary WordPress entry "Page About Popsicles" would be replaced by a page generated by the new CMS with the same title and URL (i.e. http://www.example.com/page-about-popsicles.html).

Sounds good so far, right? I thought so!

Well, despite my grave concerns, my colleagues and our programmer neglected to create a sitemap for the new CMS or add its content to the current sitemap. The pages were deleted from WordPress Mar 26th and the CMS took over for those URLs.

As a result, several of our VERY IMPORTANT pages were removed from Google's index after a week or so. Apparently, Google must have assumed we deleted the pages since they were deleted automatically from the WordPress sitemap.

Finally, our programmer added a second sitemap for our new CMS, with a list of the 519 or so pages included. He created a sitemap index including the new CMS sitemap and the WordPress sitemap. I wanted it all to be in one sitemap to reduce confusion for Google (and to make it look like nothing changed), but he seemed to think this method would be sufficient.
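For readers unfamiliar with the setup described above, here is a minimal sketch of what a sitemap index looks like under the sitemaps.org protocol, generated with a small JavaScript helper. The two child sitemap file names (sitemap-wordpress.xml and sitemap-cms.xml) and the function names are assumptions for illustration only; the thread does not say what the real files were called.

```javascript
// Escape the characters that are not allowed raw inside XML text nodes.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap index document that points at each child sitemap URL.
function buildSitemapIndex(sitemapUrls) {
  const entries = sitemapUrls
    .map((url) => `  <sitemap>\n    <loc>${escapeXml(url)}</loc>\n  </sitemap>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</sitemapindex>",
  ].join("\n");
}

// Hypothetical file names: one sitemap per system, tied together by the index.
const indexXml = buildSitemapIndex([
  "http://www.example.com/sitemap-wordpress.xml",
  "http://www.example.com/sitemap-cms.xml",
]);
console.log(indexXml);
```

Whether one combined sitemap or an index of two is used, the URLs listed are the same; the index only changes how the list is split across files.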

Suffice it to say, the pages have still not returned to Google and traffic is way down. What's going to happen? Will they return in the next few weeks or are we screwed?

[edited by: Robert_Charlton at 6:46 pm (utc) on April 13, 2008]
[edit reason] use example.com - it can never be owned [/edit]

tedster

7:30 pm on Apr 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not 100% clear - is each of your pages now reachable via either of two urls? That can create big problems. If you have any other kind of duplicate url issues in the mix, things can get even worse (see the Hot Topics area [webmasterworld.com], which is always pinned to the top of this forum's index page).

A more general comment - a sitemap is only an additional aid to Google crawling, and not in any way a limit on what urls get spidered and indexed. If a url can be found via regular link-based crawling around the site and the web, that url gets spidered, too.

zap995

8:02 pm on Apr 13, 2008 (gmt 0)

10+ Year Member



There are no two urls.... as stated the urls are the same as the old ones...

mack

8:16 pm on Apr 13, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If the physical URL location still points to the page then that isn't really a reason for Google to drop it.

Have you had any downtime? Has googlebot been hitting your pages?

I know of one example where a sitemap was published with a few pages missing. The pages were dropped from the Google index, but came back within a week. It's unusual, but it kind of makes you wonder just how much Google is relying on sitemaps when a CMS is being used.

Mack.

zap995

8:20 pm on Apr 13, 2008 (gmt 0)

10+ Year Member



I'm sure there's some algo whereby they rely on the sitemap heavily if it's deemed accurate for a period of time. All I know is all the affected pages are ones that were absent from the sitemap for a week or so.

And yes, the physical URLs *are* the exact same. The pages have been de-indexed for 2 weeks at best.

tedster

9:16 pm on Apr 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On a closer reading, you have actually generated what is a new page, but given it an existing URL - correct? Assuming that's the case, here are some further thoughts.

You may have changed the "page" so much that Google cannot rank it the same as it did, especially if that ranking was highly keyed to on-page factors. Also, that url certainly may "come back" after a period of re-examination for trust factors.

Whatever Google has been up to recently, it seems like a lot of routine data crunching has slowed down. So re-scoring your "new" page may be taking longer than it commonly might have.

You haven't mentioned if googlebot is spidering these "missing" urls - that would be a key factor to look at in assessing how big the problem is, or indeed, what it is. Also important: are these urls missing completely, or just missing from their previous place in the search results?

zap995

9:39 pm on Apr 13, 2008 (gmt 0)

10+ Year Member



tedster -- correct.

The body text is generally very similar with some modifications, but many new tabs/features have been added to those pages.

I'm not sure how to check if Google is spidering the pages, but in the second sitemap (for the new CMS pages) only 76 of 519 pages are listed as indexed.

zap995

1:09 am on Apr 14, 2008 (gmt 0)

10+ Year Member



And yes, they're missing completely.

tedster

1:31 am on Apr 14, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Check for spidering issues:

1. Your server logs - see what pages googlebot requests, and when, and what response the server gives.
2. Webmaster Tools - many spidering problems are reported there.
3. Firefox - install the User Agent Switcher and see what response you get with the googlebot user agent.
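As a rough illustration of the first check in the list above, here is a small JavaScript (Node) sketch that pulls googlebot requests out of an access log and shows the path and response code for each. It assumes the server writes the common Apache/Nginx "combined" log format, which the thread does not confirm. The sample log lines are invented for the example, and the function name is hypothetical. Note that a user agent string can be faked, so a match here only means the request claimed to be googlebot.

```javascript
// Matches one line of the "combined" access log format (an assumed format):
// ip ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
const COMBINED_LINE =
  /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"$/;

// Return one record per request whose user agent mentions Googlebot.
function googlebotHits(logText) {
  const hits = [];
  for (const line of logText.split("\n")) {
    const m = COMBINED_LINE.exec(line.trim());
    if (!m) continue; // skip lines that are not in the expected format
    const [, ip, time, method, path, status, userAgent] = m;
    if (/googlebot/i.test(userAgent)) {
      hits.push({ ip, time, method, path, status: Number(status) });
    }
  }
  return hits;
}

// Invented sample data: two googlebot requests and one ordinary browser request.
const sampleLog = [
  '66.249.66.1 - - [13/Apr/2008:18:39:00 +0000] "GET /page-about-popsicles.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
  '203.0.113.7 - - [13/Apr/2008:18:40:12 +0000] "GET /page-about-popsicles.html HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0"',
  '66.249.66.1 - - [13/Apr/2008:18:41:30 +0000] "GET /old-page.html HTTP/1.1" 404 312 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
].join("\n");

const hits = googlebotHits(sampleLog);
for (const hit of hits) {
  console.log(`${hit.time}  ${hit.status}  ${hit.path}`);
}
```

The things to look for in the output are whether the "missing" urls appear at all, and whether the server answered them with a 200 or with an error or redirect.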

zap995

2:53 pm on Apr 14, 2008 (gmt 0)

10+ Year Member



Tedster,

I tested #3 and I don't see anything unusual. Webmaster Tools also doesn't seem to show any issues with spidering.

Could this script be causing any trouble?

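// JavaScript Document
// Blocks text selection and the right-click menu, except inside form fields.

var omitformtags = ["input", "textarea", "select"];

// Join the tag names into one string so indexOf can search it.
omitformtags = omitformtags.join("|");

// Cancel the mousedown (and so the selection) unless the target is a form field.
function disableselect(e) {
  if (omitformtags.indexOf(e.target.tagName.toLowerCase()) == -1)
    return false;
}

function reEnable() {
  return true;
}

// Browsers that support onselectstart use it; others fall back to mouse events.
if (typeof document.onselectstart != "undefined")
  document.onselectstart = function () { return false; };
else {
  document.onmousedown = disableselect;
  document.onmouseup = reEnable;
}
// Disable the right-click context menu.
document.oncontextmenu = function () { return false; };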

TheMadScientist

3:34 am on Apr 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The body text is generally very similar with some modifications, but many new tabs/features have been added to those pages...

...The pages were deleted from WordPress Mar 26th and the CMS took over for those URLs.

I think it is important to remember: the URL and the content accessed by requesting the URL are not necessarily the same thing. They work together, but when the content changes focus, the 'page' is different regardless of the URL.

In reading it seems like you edited the text, added other 'diluting' factors, changed your linking structure (I am guessing if the pages were removed from WP, there is some difference in the way they are linked in?), and at the same time you moved to a new sitemap page, then your rankings for the URL dropped?

My guess is that all of the factors worked together and were interpreted as a 'new page', with a loss of some link weight to the URLs.

If you think about it:
You edited the content.
You added other factors to the content. (Possibly Diluting?)
You removed an established links page (sitemap), and set up a new one.
You changed the way they were linked within the website. (Guessing x 2)

If the last line above is correct, I would guess it is the single biggest contributing factor to the drop, followed by the first two. Unless your WP sitemap has some huge 'link weight' running into it, and passes most of it out to the URLs in question, removing them from one sitemap and placing them on another was probably the smallest contributing factor to the drop.

[edited by: TheMadScientist at 3:37 am (utc) on April 16, 2008]