The slow fading of big sites

Threshold PR, estimate, or something else?


claus

12:28 am on Jul 13, 2004 (gmt 0)




There's been quite a bit of speculation that you need some given level of PR (or inbound links) to get pages indexed, and there's also been evidence that some sites are losing pages that were once indexed.

By accident, I've come to track one Google search: "bbc site:bbc.co.uk"

Oct 12, 2003 [webmasterworld.com]: 3,100,000 pages
Apr 09, 2004 [webmasterworld.com]: 823,000 pages
Jul 12, 2004 [google.com]: 696,000 pages
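
That works out to roughly a 78% decline between the first and last figures. A quick back-of-the-envelope sketch (Python, using the counts from the list above):

oct_2003 = 3100000   # "about N pages" figures as reported by Google
jul_2004 = 696000

drop = 1 - jul_2004 / float(oct_2003)
print 'decline: %.0f%%' % (drop * 100)   # -> decline: 78%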

I don't believe that BBC has deleted about 80% of their pages in nine months. So:

A) Given that Google will only display 1,000 results in any case, is this decline in the number of pages indexed real, or

B) Is the "about 696,000" figure from the SERPs highly unreliable? (In the Google API this number is referred to as an estimate, not a count; see the sketch at the end of this post.)

If (A), then: is the limit PR, folder level (path length), or a combination of the two? And what PR or path length seems to be the threshold?

Any opinions?
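
On (B): a rough sketch of how one could log this figure over time via the Google Web APIs (the SOAP search service) and the 2004-era SOAPpy Python library. The license key below is a placeholder for your own; note that the API's own field name calls the number an estimate:

from SOAPpy import WSDL
import time

server = WSDL.Proxy('http://api.google.com/GoogleSearch.wsdl')
result = server.doGoogleSearch(
    'insert-your-key-here',   # your Google API license key
    'bbc site:bbc.co.uk',     # the query tracked above
    0, 10,                    # start index, max results per page
    False, '', False, '',     # filter, restrict, safeSearch, lr
    'utf-8', 'utf-8')         # input/output encodings

# an estimate, not a count:
print time.strftime('%b %d, %Y'), result.estimatedTotalResultsCount

Run that once a day and append the output to a file, and you get a series like the one above.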

vitaplease

6:08 am on Jul 13, 2004 (gmt 0)




Claus,

Not sure, but I think nowadays Google also filters out subdomains with this "site:" search function?

BBC 2002 [webmasterworld.com...]

<<added>> Sorry, that does seem to be partly the case when "www" is added to the query.

claus

6:48 am on Jul 13, 2004 (gmt 0)




I should add that the BBC has made some changes to its site during this period, which could mean they don't use the term "BBC" as much on their pages and in anchor text as they used to (however strange that may sound).

So, for the search "site:bbc.co.uk" (without keywords), the figures are:

Oct 12, 2003: 3,100,000 pages plus (*est)
Apr 09, 2004: 1,350,000 pages
Jul 13, 2004: 1,210,000 pages

Which is not an 80% drop, but still about 61% (from 3,100,000 down to 1,210,000).

------------
(*est) Estimate, as the figure without the keyword would logically have to be at least as high as the figure with the keyword. As I recall, you couldn't perform that search without a keyword in October 2003.

Powdork

8:26 am on Jul 13, 2004 (gmt 0)




Did they ever have 3.1 million pages, or is it possible that Google is getting better at deciding what is, and isn't, a page? Or did the BBC move from frames to regular pages? Or did the BBC have dynamic pages with URLs rewritten to static, so that both (or more) versions were counted? There are many possibilities, especially with Google's improved ability to weed out duplicate content.

gpmgroup

11:43 am on Jul 13, 2004 (gmt 0)




If you go to the BBC home page, www.bbc.co.uk, in the left-hand column below "Explore bbc.co.uk", the site states:

"More than 2 million amazing BBC pages"