

Why are "Site:" command page counts inflated?

Site Command

     

wiseapple

12:22 pm on Aug 17, 2005 (gmt 0)

10+ Year Member



Just curious... How many people have accurate page counts when using the "Site:" command at Google?

If you do not, how far is it off from the actual number of pages you have?

If it is off, is your site in penalty mode?

What do you think is causing your site to have an incorrect count in the number of pages?

Has anyone corrected the page count and got it to reflect the right value? What technique did you use?

Thanks.

wiseapple

12:58 am on Aug 18, 2005 (gmt 0)

10+ Year Member



(Bump) Am I the only one?

Rosalind

1:18 am on Aug 18, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think my largest site has an inflated count of around 3X the actual number of pages indexed. However, no site under 100 pages seems to have an inaccurate count, as far as I can tell.

Google has some trouble with old pages in the index, but that doesn't account for this anomaly.

lammert

7:48 am on Aug 18, 2005 (gmt 0)

WebmasterWorld Senior Member lammert is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



The site: command counts all URLs associated with the site, which is not the same as all pages of that site, or all indexed URLs. Some examples of URLs which are counted in the site: command:

  • URLs temporarily deleted with the URL removal tool
  • URLs from other sites doing a 302 hijack of your site (should be fixed by now)
  • Obsolete URLs that still have links to them from other sites, which Google visits now and then just to see if they are active
  • Links to your site with typos in them, e.g. www.yourdomain.com/fiel.html instead of www.yourdomain.com/file.html. At one time I had many copies of my sitemap in the SERPs because I used the sitemap as my 404 page. Except for the original sitemap they have now all gone supplemental, but Google still counts them.
  • URLs that have been marked with "noindex,follow".

Google keeps track of many more URLs of your site, but I don't know if these are counted in the site: result. For example, if you have a 301 redirect from domain.com to www.domain.com, then Google must know that domain.com/file.html exists, but is equivalent to www.domain.com/file.html. So there has to be some database record or field somewhere with information about domain.com/file.html, but I don't know if this one inflates the number in the site: command.
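
To make the domain.com / www.domain.com point concrete, here is a rough Python sketch of how host variants can double up a URL count. It is only an illustration, not a description of how Google stores or counts anything; the function names `canonical_key` and `count_urls` and the default host `www.domain.com` are made up for the example.

```python
from urllib.parse import urlsplit


def canonical_key(url, preferred_host="www.domain.com"):
    """Collapse the www and non-www variants of one site onto a single key.

    Hypothetical helper: preferred_host is an assumed canonical host name.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Derive the bare host ("domain.com") from the preferred one.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        host = preferred_host
    path = parts.path or "/"
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def count_urls(urls, preferred_host="www.domain.com"):
    """Return (number of distinct raw URLs, number left after collapsing hosts)."""
    distinct = {canonical_key(u, preferred_host) for u in urls}
    return len(set(urls)), len(distinct)
```

Feeding it `http://domain.com/file.html`, `http://www.domain.com/file.html` and `http://www.domain.com/` gives three raw URLs but only two distinct pages, which is the kind of gap between "URLs known" and "pages that exist" described above.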

g1smd

1:40 am on Aug 19, 2005 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Most inflated sites that I have seen have been serving both www and non-www without a redirect. This is duplicate content.

Add a 301 redirect to fix that problem.
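
For anyone unsure what that looks like in practice, below is a minimal sketch written as a Python WSGI app. Treat it as one possible way to do it under stated assumptions, not the only one: most sites would do this in the web server configuration instead. `CANONICAL_HOST`, `redirect_target` and `app` are names invented for this example, and the sketch assumes plain http.

```python
CANONICAL_HOST = "www.domain.com"  # assumption: the host name you want indexed


def redirect_target(host, path, query=""):
    """Return the canonical URL to redirect to, or None if host is already canonical."""
    hostname = host.split(":")[0].lower()  # drop any port from the Host header
    if hostname == CANONICAL_HOST:
        return None
    url = "http://" + CANONICAL_HOST + path
    if query:
        url += "?" + query
    return url


def app(environ, start_response):
    """WSGI app: 301 any request on a non-canonical host, otherwise serve a stub page."""
    target = redirect_target(
        environ.get("HTTP_HOST", ""),
        environ.get("PATH_INFO", "/"),
        environ.get("QUERY_STRING", ""),
    )
    if target:
        # Permanent redirect, so the duplicate URL can be folded into the canonical one.
        start_response("301 Moved Permanently",
                       [("Location", target), ("Content-Length", "0")])
        return [b""]
    body = b"ok"  # placeholder for the real page content
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

A request for domain.com/file.html gets a 301 with `Location: http://www.domain.com/file.html`, while requests already on www.domain.com are served normally. To try it locally, the standard library's `wsgiref.simple_server.make_server` can host `app`.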

wiseapple

1:48 am on Aug 19, 2005 (gmt 0)

10+ Year Member



The 301 redirect would be the logical way to get things back in line... However, what happens when Google grabbed pages in 2004 and has never updated them since? If they don't revisit, they will never see the 301, so the stuff stays in the index.

bull

6:05 am on Aug 19, 2005 (gmt 0)

10+ Year Member



Also included: URLs crawled only by the Mozilla Googlebot. I can verify this on one of my domains.

Nuttakorn

4:58 pm on Aug 21, 2005 (gmt 0)

10+ Year Member



My client's site is also affected by this. It now shows only the URL filename, with no title or description as before. When will it recover?

Ossifer

7:19 am on Aug 22, 2005 (gmt 0)

10+ Year Member



site: is fine for me, but link: is screwed up completely
 
