Forum Moderators: Robert Charlton & goodroi


Decreased indexed pages in Google


minii

1:43 pm on Nov 19, 2005 (gmt 0)

10+ Year Member



Hi,

Day by day, Google is decreasing the number of indexed pages from my website. Are there any specific reasons why this is happening, and how can I prevent it?

Many Thanks,

Minii

tantalus

5:33 pm on Dec 4, 2005 (gmt 0)

10+ Year Member



Ah-hhh. Good Ole Mozzy Bot. (Which is what I like to call it, as it reminds me of a little mosquito buzzing from page to page.)

As search is a problem at the moment here, I thought I would give a couple of links to threads that discuss it. I'm sure dayo has more ;)

[webmasterworld.com ]
[webmasterworld.com ]

bull

11:03 am on Dec 7, 2005 (gmt 0)

10+ Year Member



64.233.179.104's index seems to have pages only crawled by Mozilla Googlebot.
Index size: > 25 billion.

walkman

3:52 pm on Dec 7, 2005 (gmt 0)



>> 64.233.179.104's index seems to have pages only crawled by Mozilla Googlebot.
Index size: > 25 billion.

something is "wrong" (maybe by design?) with Google. They only have 1/3 of their pages on main G. I searched for "a" and got 25 bill on one, and 8.7Bill on the Google.com IP. My site has no dupe issues yet, many pages have been dropped by the DCs that matter.
All I can think is that PR4...

Rosamunda

5:00 pm on Dec 10, 2005 (gmt 0)

10+ Year Member



I'm sorry for asking a question that maybe is a bit... erm... dull... but what's the difference between the "normal" Googlebot and the "Mozilla" Googlebot?

:)

bumpski

9:14 pm on Dec 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



One of the big differences between the two bots is that the Mozilla bot sets the bit requesting GZIP compression from your webserver. Your webserver has the capability of compressing your web pages' content by a factor of about 3 or 4 times: a web page that is 20K in size can be compressed by GZIP to 5K before your server sends it out. What a saving in bandwidth for the web and for Googlebot! This is an optional request, and your server may just serve your page uncompressed as well.

All modern web browsers set this optional GZIP compression bit in their requests to webservers. MOST webservers do not provide GZIP-compressed content! Why? So they make more money on bandwidth! (At least that's one reason.)

So if your server never provides compressed content, there isn't much point in Google using a bot that requests it. This may be one factor in which bot you see.

Your webserver logs should show a very different size for the same page between the two bots if your server is GZIP-compressing. Of course, your website would be faster if your webserver did serve GZIP-compressed content.
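
A quick way to see this for yourself is sketched below in Python (the URL is a placeholder and your server's behaviour may differ): request the same page once without and once with the optional Accept-Encoding: gzip header, then compare the transferred sizes.

import urllib.request

URL = "http://www.example.com/"  # placeholder; test one of your own pages

def fetch_size(ask_for_gzip):
    req = urllib.request.Request(URL)
    if ask_for_gzip:
        # the same optional "please compress" bit the Mozilla bot sets
        req.add_header("Accept-Encoding", "gzip")
    with urllib.request.urlopen(req) as resp:
        body = resp.read()  # urllib does not decompress, so this is the transferred size
        encoding = resp.headers.get("Content-Encoding", "none")
    return len(body), encoding

plain_size, _ = fetch_size(False)
gzip_size, encoding = fetch_size(True)
print(f"uncompressed transfer: {plain_size} bytes")
print(f"with Accept-Encoding: gzip: {gzip_size} bytes (Content-Encoding: {encoding})")
# Roughly 3-4 times fewer bytes plus Content-Encoding: gzip means the host is
# compressing; identical sizes mean the request is being ignored.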

g1smd

9:26 pm on Dec 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Although compression uses less bandwidth, it also uses more processing power on the server, unless the server pre-compresses and keeps a cached copy of the compressed version too. So for high usage, the compression process may be what slows the server down more than the bandwidth used.
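
One illustrative sketch of that pre-compress-and-cache idea, in Python with hypothetical paths (not anything any poster here is actually running): compress a static page to a .gz file once, refresh it only when the source changes, and let the server send the cached copy with Content-Encoding: gzip instead of compressing on every hit.

import gzip
import os
import shutil

def ensure_gz_copy(path):
    """Create or refresh path.gz when the source file is newer than the cache."""
    gz_path = path + ".gz"
    if (not os.path.exists(gz_path)
            or os.path.getmtime(gz_path) < os.path.getmtime(path)):
        with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return gz_path

# e.g. ensure_gz_copy("htdocs/index.html") -> "htdocs/index.html.gz"
# The CPU cost is paid once per change, not once per request.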

Rosamunda

1:08 pm on Dec 11, 2005 (gmt 0)

10+ Year Member



Thanks both of you guys!
Now I have a more accurate idea of what you're talking about...

:)

roxah

9:48 pm on Dec 12, 2005 (gmt 0)

10+ Year Member



Basically the Mozilla Googlebot does not add pages to the index. Well, very, very occasionally.

Its purpose is unknown.
-----

gotta love that quote

Dayo_UK

10:04 am on Dec 13, 2005 (gmt 0)



roxah

He he - things have changed since then though - it does look like a new index is being built using Mozilla Googlebot data on the test DC.

walkman

2:07 pm on Dec 13, 2005 (gmt 0)



Mozilla got another 450 pages today. The "real" GB...just 4.

Dayo_UK

2:11 pm on Dec 13, 2005 (gmt 0)



walkman

Have you seen any of the Mozilla Googlebot-crawled pages in the test DC index?

MC indicated on his blog that the test DC has a different crawl and indexing infrastructure - so perhaps Mozilla Googlebot is now going to kick in and is the future :) - I have tried to start a new thread. Hope it gets approved soon.

HenryUK

3:45 pm on Dec 13, 2005 (gmt 0)

10+ Year Member



site search on Google - 5500 pages
site search on 64.233.179.104 - 350,000 pages
actual live pages on site - 50,000
pages now expired - c. 75,000

go figure...

H

bumpski

11:28 am on Dec 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes Brett can give us quite a lecture on this one (GZIP)!

g1smd

Unfortunately, by default Apache compresses at level 6, which is CPU intensive. Were it simply compressing at level 1, most of the compression efficiency would still be gained, with reduced CPU usage. One can also compress using PHP; I'm not sure at what level.

With the correct level of compression, total CPU usage should actually decline, and obviously bandwidth usage is drastically reduced (DMA usage as well, which is very important).

Regardless, for my customers it's way faster! And if Google succeeds in using compression, as many other bots could, the whole internet would be faster, and way faster for dialup users! I know Apache caused Brett problems, but if GZIP compression were successfully implemented, Brett probably would not have had to pull WW from the search engines (at least today, anyway!)
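
For anyone curious about the level 1 versus level 6 trade-off, here is a rough Python sketch with a made-up sample page (real pages and servers will give different numbers):

import gzip
import time

html = ("<html><body>"
        + "<p>widget widget widget widget widget</p>" * 2000
        + "</body></html>").encode("utf-8")

for level in (1, 6):
    start = time.perf_counter()
    for _ in range(200):  # repeat to get a measurable elapsed time
        compressed = gzip.compress(html, compresslevel=level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(compressed)} bytes "
          f"({len(html) / len(compressed):.1f}x smaller), "
          f"{elapsed:.3f}s for 200 runs")
# Typically level 1 already gets most of the size reduction
# for a fraction of the CPU time that level 6 needs.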

Now we have ISPs compressing our content in their caches, so we (you, the webmaster) don't see all the hits we should in our logs. Your pages are being served by the ISPs' caches, not by you! Not good for tracking your website's usage.

texasville

8:27 pm on Dec 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So how do we know if our server is doing gzip? And is that Apache only? You didn't mention others.

bumpski

3:56 pm on Dec 15, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually, by default Apache is not serving compressed content. Most web hosts still are not serving compressed content. My web host only hinted at how I could use a back-door PHP method.

One way to tell whether your web host is serving GZIP-compressed content is by reviewing your website logs. Find a web page that both the Mozilla 1.1 Googlebot and the "classic" 1.0 Googlebot have crawled. If you see a very large difference in bytes transferred, then you know your web host is "GZIPing" your web page's content. Typically the bytes transferred will be 3 to 4 times less for the Mozilla 1.1 bot, if your host supports GZIP.
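
A small Python sketch of that log check, assuming the common Apache "combined" log format and a hypothetical access.log path (adjust the path and pattern for your own host):

import re
from collections import defaultdict

LOG_FILE = "access.log"  # hypothetical path; use your own log
# combined format: ip - - [date] "GET /page HTTP/1.1" status bytes "referer" "agent"
LINE_RE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" \d{3} (\d+) "[^"]*" "([^"]*)"')

bytes_by_page = defaultdict(dict)
with open(LOG_FILE) as fh:
    for line in fh:
        m = LINE_RE.search(line)
        if not m or "Googlebot" not in m.group(3):
            continue
        url, size, agent = m.group(1), int(m.group(2)), m.group(3)
        kind = "mozilla" if agent.startswith("Mozilla") else "classic"
        bytes_by_page[url][kind] = size

for url, sizes in bytes_by_page.items():
    if "classic" in sizes and "mozilla" in sizes:
        print(url, "classic:", sizes["classic"], "mozilla:", sizes["mozilla"])
# Pages showing roughly 3-4 times fewer bytes for the Mozilla bot suggest the
# host is serving it GZIP-compressed content (assuming the page itself did not
# change between the two crawls).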

There has been a great tool online for years to test webhosts, but today when I checked it was gone. Had it been there I would have sticky mailed it.

Wait, just found another commercial site with a good test. I can sticky mail it. (Google serves compressed content)

I wonder if compressed content is considered another minor "atta-webmaster" by Google for ranking?

walkman

5:42 pm on Dec 15, 2005 (gmt 0)



Yeah,
the Mozilla/other DC has all my pages and I'm getting sick of it. I want them on the regular DCs ;)

guru5571

8:13 pm on Dec 17, 2005 (gmt 0)

10+ Year Member



I'm noticing my page count drop daily, and I am seeing the same thing with all my competitors. This week it's been dropping by several thousand a day on sites with several hundred thousand pages. Will someone who knows please explain what is going on? Thanks.

rkhare

8:31 pm on Dec 17, 2005 (gmt 0)

10+ Year Member



It's the other way round for me; the page count has been going up every day since 12/15.