Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

Thoughts on going into the Supplemental Index

         

arbitrary

7:55 pm on Jul 26, 2005 (gmt 0)

10+ Year Member



A site of mine recently went into the supplemental index. The site had languished in the Google sandbox since its launch. Now when I do site:example.com, all pages are shown to be in the supplemental index.

The site has never had very strong incoming links.

I am hoping you could share your thoughts on why sites go into the supplemental index. Does it relate to incoming links, or the lack of them? To a lack of unique content on pages? To internal linking structure?

What can be done to get sites out of the supplemental index? Has anyone done this successfully and, if so, how?

If you could back your thoughts up with examples or things you have seen that would be much appreciated.

Thanks.

[edited by: ciml at 12:53 pm (utc) on July 27, 2005]
[edit reason] No real domains in examples please. [/edit]

arbitrary

5:39 pm on Jul 27, 2005 (gmt 0)

10+ Year Member



That is a hard edit to understand, as I really don't own 'domain'.

Anyway, any thoughts on this before it disappears into oblivion?

oddsod

5:43 pm on Jul 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WW. You're not allowed to drop URLs or link to most sites from here.

Have you checked for the www vs non-www [google.co.uk] issue? There are numerous threads here about that.
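If the bare domain resolves without redirecting, the usual fix is a site-wide 301 to the www hostname. A minimal sketch for an Apache .htaccess (Apache and the example.com domain are assumptions here, not details from the thread):

```apache
RewriteEngine On
# Any request arriving at the bare domain gets a permanent (301)
# redirect to the www hostname, preserving the requested path.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

With this in place, both hostnames serve the same content but search engines should consolidate on the www version.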

arbitrary

5:51 pm on Jul 27, 2005 (gmt 0)

10+ Year Member



<snip>

[edited by: Brett_Tabke at 6:12 pm (utc) on July 27, 2005]
[edit reason] thanks - please reread the tos on moderation. thanks. [/edit]

arbitrary

6:15 pm on Jul 27, 2005 (gmt 0)

10+ Year Member



Thanks oddsod, but that is not the issue, as I have a redirect from non-www to www.

Dijkgraaf

10:48 pm on Jul 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google is running two different bots for spidering pages: one whose user agent starts with
Googlebot/2.1
and another that starts with
Mozilla/5.0 (compatible; Googlebot/2.1

It appears that pages spidered only by the Mozilla Googlebot will appear as Supplemental Results. The Mozilla bot will spider URLs with more than two parameters, whereas the standard Googlebot won't. The Mozilla Googlebot will also try to find URLs in JavaScript. Do your pages have multiple parameters, or are they only accessible via JavaScript?

Some people have reported pages bouncing back and forth between the supplemental and standard indexes depending on which bot visited last, although I have not observed this on my own site. Which of the two bots visited your pages last?
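One way to answer that question is to scan the server's access logs for the two user-agent strings. The sketch below is an illustration, not something posted in the thread; it assumes Apache-style combined log format, and the sample lines are invented:

```python
import re

# Match the request line, status, size, referrer, and user agent of a
# combined-format access log entry. Adjust for other log formats.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<url>\S+) [^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def last_bot_per_url(log_lines):
    """Return, for each URL, which Googlebot variant fetched it last."""
    seen = {}
    for line in log_lines:
        m = LINE.search(line)
        if not m:
            continue
        ua = m.group("ua")
        if ua.startswith("Mozilla/5.0 (compatible; Googlebot/2.1"):
            seen[m.group("url")] = "mozilla"
        elif ua.startswith("Googlebot/2.1"):
            seen[m.group("url")] = "standard"
    return seen

# Invented sample entries; later entries win, so /page.html maps to
# the variant that visited last.
sample = [
    '66.249.0.1 - - [27/Jul/2005:10:00:00 +0000] "GET /page.html HTTP/1.1" '
    '200 1234 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '66.249.0.2 - - [27/Jul/2005:11:00:00 +0000] "GET /page.html HTTP/1.1" '
    '200 1234 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
]
print(last_bot_per_url(sample))
```

Running this over a few weeks of logs would show whether the Mozilla variant has taken over crawling, as Dijkgraaf describes.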

arbitrary

11:15 pm on Jul 27, 2005 (gmt 0)

10+ Year Member



Dijkgraaf, thanks; unfortunately, I think your hunch is right.

Since about July 4th, Googlebot/2.1 has only grabbed robots.txt. My site pages are actually being crawled by Mozilla/5.0, which fits your theory that pages crawled by that bot end up in the supplemental index.

Here is something interesting, though. As I replied to oddsod, I do have a 301 redirect from non-www to www, and it is functioning correctly on the site.

Oddly, when I do site:, 90% of the pages that Google shows in the cache are non-www pages, and the cache of these non-www pages is months old. Even the 10% of pages that Google shows with the www have very old data, and those pages are also in the supplemental index. I put the redirect from non-www to www in place many months ago; I don't know the exact date, but I'm guessing sometime between February and April.

My pages are static HTML and have no parameters, and I don't have JavaScript on my site. I do, however, have some dynamic pages, which carry a noindex, nofollow meta tag. I noticed before that Google would crawl these pages (despite the noindex, nofollow tag) and then display them as URL-only. Perhaps it sent in Mozilla/5.0, if that is what is supposed to spider those pages.

Okay, so what do I do now? How can I get out of the supplemental index? How can I even get Google to spider my www pages? Like I said, my redirect is functioning properly (I have checked it with the tool here at WW).
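For anyone wanting to verify a redirect without a third-party tool: request the non-www host directly and inspect the raw response instead of following it. A minimal Python sketch (the hostnames are placeholders, not the poster's real domain):

```python
import http.client

def check_redirect(host, path="/"):
    """Fetch one URL without following redirects and report the status
    code and Location header that a crawler would see."""
    conn = http.client.HTTPConnection(host, timeout=10)
    conn.request("GET", path)
    resp = conn.getresponse()
    status, location = resp.status, resp.getheader("Location")
    conn.close()
    return status, location

# A correctly configured site should answer the bare domain with a 301
# and a Location header pointing at the www hostname, e.g.:
#   check_redirect("example.com", "/page.html")
```

A 302 here instead of a 301 would be worth fixing, since only a permanent redirect tells Google the non-www URLs are gone for good.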

g1smd

11:43 pm on Jul 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A page may not be supplemental for every search in which it appears in the SERPs!

I have a page that is indexed normally for its current content: it shows in the SERPs when you search for that content, it is cached every few days, and the snippet reflects the current page. However, if you search for a word that was removed from the page over two years ago, the same page is still returned as a match, but this time it is flagged as a supplemental result. The cache shown is the same up-to-date one, so the word you searched for is nowhere in that cache. It even says "the following terms only appear in links pointing to this page", which is untrue; the only link pointing to it is the Google SERP itself. This type of supplemental result is a pain: it takes you to a page where that content used to be, shows you a snippet with the information you are looking for, and then you find that neither the real page nor the Google cache contains the information at all.

I have five friends who all have sites with the same issues. I got them all to contact Google help within the same week and ask the same questions using almost exactly the same words. When we asked about getting the old data deleted from the database, the responses from Google ranged from "definitely yes" and "yes" through to "no" and "definitely no".

arbitrary

11:56 pm on Jul 27, 2005 (gmt 0)

10+ Year Member



g1smd, sorry but I did not follow all of that.

When I do site:, all my pages are supplemental. 90% of them show as non-www pages, which have not been accessible on my server for months. 10% show as www pages, but the cache of these pages is also months old.

What did your friends specifically ask Google? What were the responses? What action did they take and what were the results?

So far I have come up with the following possible courses of action:

1) Submit the site using the "Add URL to Google" hoping that they will start fresh.
2) Submit my site using Google sitemaps.

Any other ideas?

wiseapple

3:13 pm on Jul 30, 2005 (gmt 0)

10+ Year Member



One remedy I am testing at the moment: if all of the files reside in a subdirectory, rename the subdirectory and do a 301 redirect to the new directory. Make sure to update all links pointing to the directory.

Redirect permanent /directoryxyz http://www.example.com/directoryabc

This seems to fix some problems.

I am using this to fix problems where Google is reporting ten times the number of files that exist in the directory. I am also hoping it will fix pages that have gone URL-only.

Anyone have thoughts on this?

Our site had been reporting 80,000 pages where there are only 20,000. Ever since we did the above, it has slowly been getting back to correct page counts.

arbitrary

3:59 pm on Jul 30, 2005 (gmt 0)

10+ Year Member



wiseapple, thanks for a creative solution; I am going to try it in the near future.

I too have a large site like yours. Google was reporting three to four times the actual number of pages as well. Now it is reporting a close-to-accurate count, but the pages are in the supplemental index. Most pages reported are non-www and show months-old data, despite the redirect from non-www to www being in place.

"Our site has been reporting 80,000 pages where there is only 20,000. Every since we have done the above it is slowy getting back to correct pages counts."

It seems to be working for you in terms of getting accurate page counts, but were your pages in the supplemental index, and did this help get them out?

wiseapple

7:07 pm on Jul 30, 2005 (gmt 0)

10+ Year Member



Time will tell... A good chunk were in the supplemental index or were showing as URL-only. Since I moved them to the new directory, they have slowly started to show as normal.

Yahoo and MSN handle the 301 redirect really well. Google is a little slower at this.

Also, the page count goes down radically.

Not sure if this will really help. We lost most of our Google traffic after Feb. 2nd, so with Google there is not much left for us to lose; only MSN and Yahoo provide traffic. I have been trying to somehow get the site back into Google's good graces.

wiseapple

7:10 pm on Jul 30, 2005 (gmt 0)

10+ Year Member



This reminds me of another trick... I had a new site that was listed as supplemental, with maybe one or two incoming links. I did a press release for the site and distributed it through a professional service. It went out to Google News, Yahoo News, and a bunch of other news services. A couple of weeks later, the supplemental listing was gone. Now the site is indexed normally with a PR of 1.

arbitrary

7:54 pm on Jul 30, 2005 (gmt 0)

10+ Year Member



Thanks wiseapple, everything you're saying helps a lot.

It is good to know that Yahoo and MSN handle the redirect well. I too have a feeling that links may help get you out of the supplemental index.

I will be putting out press releases for my site as well when I release some new content, which should be coming soon. It is nice to know that it helped one of your sites.