Pages Dropping Out of Big Daddy Index

         

GoogleGuy

6:11 am on Apr 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Continued from: [webmasterworld.com...]


One thing to bear in mind is that Bigdaddy will have different crawl priorities. That can account for some of it. If you've run into any spam problems in the past, you might also want to do a reinclusion request. Otherwise, please send an email to bostonpubcon2006 at gmail.com with the subject line "crawlpages" (all one word), and I'll ask someone to see if they notice any commonalities.

internetheaven

11:23 pm on May 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had 20,300 pages showing for a site:www.example.com search yesterday and for the past month. Today it dropped to 509 but my traffic is still pretty constant. I normally get around 4,500 - 5,000 visits to that site per day and today I've already got 4,000.

So, either Google doesn't account for even a small percentage of my traffic (which I doubt) or the way Google stores information about my site has changed, i.e. the 20,300 pages are still there but Google will only tell me about 509 of them. As far as I can tell, the other pages have been moved to the supplemental results.

tedster

2:42 am on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That is a very telling post. I wonder how many other people who are seeing low numbers from the site: query are still seeing no drop in traffic.

hutcheson

4:44 am on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>This is the new normalcy. Get used to it.

>Well one should better tell G/Y/MS etc that

One doesn't tell Bill Gates the truth unless one wants to witness a temper tantrum. Microsoft Uber Alles!

As for Yahoo and Google, I suspect they already know that the war against e-mail spam, just like the wars against terrorism or organized crime or disease or hunger, will never be over. Victories here and there, surely -- but the war goes on. If the evil forces lose badly enough, they just change their name and start over. (How many names has VStore gone through now? Or Caldera?)

tigger

5:38 am on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>That is a very telling post. I wonder how many other people who are seeing low numbers from the site: query are still seeing no drop in traffic.

When my pages dropped I saw a 30% to 40% reduction in traffic. A few days ago I saw a small increase with the site:url command, not much, from 148 to 204, but the odd thing is that traffic has shot up. In fact yesterday was my best day for a few months, and according to my stats G traffic has increased by 50%.

Lorel

2:53 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One common factor I'm seeing from 3 sites leaking pages from the index is that the disappearing pages didn't have enough content for Google to determine they weren't duplicate content when comparing the text on the page with the template.

Site 1 had inventory pages with not enough text to differentiate them from other inventory pages, or from other text-only pages that had only one paragraph of text.

Site 2 had paintings for sale with not enough description to tell Google that each page is different from the others.

Site 3 had descriptions and text on the page too similar to other pages.

i.e., each page needed more text to get past the duplication penalty.
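A rough way to check your own pages for this kind of overlap is to compare their visible text. The sketch below uses word shingles and Jaccard similarity. That is an assumption chosen for illustration, not a description of how Google detects duplicates, and the function names and sample page text are hypothetical.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in text, lowercased."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def page_similarity(text_a, text_b, k=3):
    """Similarity between 0.0 (no shared shingles) and 1.0 (identical)."""
    return jaccard(shingles(text_a, k), shingles(text_b, k))


# Hypothetical example: two pages that share a template and differ
# only in a short description.
template = "Acme Gallery home paintings contact us copyright 2006"
page_1 = template + " oil painting of a harbour"
page_2 = template + " watercolour of a meadow"
similarity = page_similarity(page_1, page_2)
```

The higher the score, the more of each page is shared template text rather than unique content. Any cut-off for "too similar" would be a guess.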

mattg3

3:01 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That is a very telling post. I wonder how many other people who are seeing low numbers from the site: query are still seeing no drop in traffic.

The server that lost 90% in sites actually gained traffic, the one that lost 40% has lost traffic, lol.

reseller

4:13 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have noticed changes in the SERPs of these two DC sets for my search keywords. Don't know whether that brings any good news for your sites.

[72.14.203.99...]
[72.14.203.104...]

[72.14.207.99...]
[72.14.207.104...]

g1smd

4:43 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> I had 20,300 pages showing for a site:www.example.com search yesterday and for the past month. Today it dropped to 509 ... <<

Are you sure that both results came from the same datacentre? For one site I am looking at now, I see 3500 pages indexed in BigDaddy datacentres, and just 45 pages in the "experimental" datacentre.

Check the IP address of the datacentre where you see each result to be sure.
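For anyone keeping a log, the comparison can be done with a few lines of Python once the site: counts have been noted by hand for each datacentre. This is only a sketch: the counts are typed in manually, the labels are placeholders, and treating the largest count as the baseline is an assumption.

```python
def percent_missing(baseline, observed):
    """Percentage of the baseline page count that is not shown."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return max(0.0, 100.0 * (baseline - observed) / baseline)


def compare_datacentres(counts):
    """Map each datacentre label to its percent missing.

    counts: dict of label (or IP address) -> site: result count,
    entered by hand. The largest count is assumed to be the baseline.
    """
    baseline = max(counts.values())
    return {dc: round(percent_missing(baseline, n), 1)
            for dc, n in counts.items()}


# Figures from the site mentioned above: 3500 pages in the BigDaddy
# datacentres, 45 in the "experimental" one.
report = compare_datacentres({"bigdaddy": 3500, "experimental": 45})
```

Keying the dict by the IP address of each datacentre keeps results from different indexes from being mixed up.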

Lorel

5:02 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hmmmmm,

The site I'm monitoring that has the worst effect shows:

87% of pages missing on this DC: [72.14.207.104...]
29% missing from this DC: [72.14.203.104...]

phantombookman

5:07 pm on May 7, 2006 (gmt 0)

10+ Year Member



Tedster
That is a very telling post. I wonder how many other people who are seeing low numbers from the site: query are still seeing no drop in traffic.

I was going to post about this the other day.
I have a solid established site (no funny business or dup etc) that has lost 75% of its pages on a site: search.
Traffic however is only fractionally down, and that can be accounted for now that the sun is coming out.

This led me to do random searches looking for internal pages, all of which normally rank top 5.
I did a lot and nearly always found the page.
There are definitely some missing but I could not hit the mathematical 75% failure rate.

I do wonder whether, as well as other problems in G, their site count is now also inaccurate for sub 1,000 page sites
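The spot check above can be turned into a rough estimate. The sketch below computes a Wilson score interval for the fraction of pages missing, assuming the pages checked are a fair random sample of the site. The sample figures (50 pages checked, 40 found) are made up for illustration.

```python
import math


def wilson_interval(missing, checked, z=1.96):
    """Wilson score interval for the true missing fraction.

    missing: number of sampled pages not found in the results
    checked: total number of pages sampled
    z: normal quantile (1.96 is roughly a 95% interval)
    """
    if checked <= 0 or not 0 <= missing <= checked:
        raise ValueError("need 0 <= missing <= checked and checked > 0")
    p = missing / checked
    denom = 1 + z * z / checked
    centre = (p + z * z / (2 * checked)) / denom
    half = z * math.sqrt(p * (1 - p) / checked
                         + z * z / (4 * checked * checked)) / denom
    return centre - half, centre + half


# Hypothetical spot check: 50 pages searched for, 40 found, 10 missing.
low, high = wilson_interval(missing=10, checked=50)
site_count_consistent = low <= 0.75 <= high
```

If the 75% loss reported by site: falls well outside the interval, that points to the site: count being off rather than the pages actually being gone.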

phantombookman

5:10 pm on May 7, 2006 (gmt 0)

10+ Year Member



Lorel
Your 29% DC shows more pages for one of my sites also, however for a couple of others it shows fewer.

Interestingly my ecommerce site shows no lost pages at all on any of the DCs, indeed is gaining the new pages I add!
It's all very strange

trinorthlighting

5:39 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I still have pages dropping out of the index. One or two a day... But my traffic seems to be going a bit more towards my home page

g1smd

5:57 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Which of the three indexes are you dropping out of?

Google has at least three versions of their index in play at the moment.

Lorel

6:26 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




But my traffic seems to be going a bit more towards my home page

I've noticed this also along with traffic slightly off by 15%

darnoc

6:53 pm on May 7, 2006 (gmt 0)

10+ Year Member



g1smd: how do you know which index is being used?

reseller

7:15 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Lorel

Here is another DC set which might belong to the "29% missing" class :-)

[64.233.167.99...]
[64.233.167.104...]

Or am I wrong?

g1smd

7:37 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> How do you know which index is being used? <<

I have many selected test query phrases for which I know the returned results for the last 4 years. When I see something different I note the IP address and keep an eye on it. Google's results morph every few months.

Google always has 2 or 3 indexes in play. At the moment they have what I call BigDaddy "A" and BigDaddy "B", which have been the main results since late last year. Additionally, there are the "experimental" results at 72.14.207.99 which last week gave very bad results and completely different handling of Supplemental Results. Those results have now migrated to several other datacentres, but in doing so, some now show a "cleaned up" version of the "experiment" with all Supplemental Results from before 2005 June no longer showing in the results.

I have waited years for that to happen, and hope it sticks and spreads.

Lorel

7:56 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




Here is another DC set which might belong to the "29% missing" class :-)

[64.233.167.99...]
[64.233.167.104...]

Or am I wrong?

Yes, both of those show about the same amount but now it's 28%

thanks,

claus

10:15 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



once upon a time we had a "link:" command that worked ... just saying that it may be the "site:" command that's been tampered with, not the index...

F_Rose

10:27 pm on May 7, 2006 (gmt 0)

10+ Year Member



Does anyone show that Google has indexed additional pages of their site lately? We have had 24 pages indexed for the past three weeks. Google is not indexing most of our pages, why is it so? Is something wrong with our site? Should I do a resubmission?

Google used to have all of our pages in their database, and for some reason they dropped everything except for 24 pages. Why is it so?

g1smd

10:30 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Have they done that in all datacentres or just some of them?

Quote the IP address where you see the effect, as Google has three or four different indexes in play right now...

Atomic

10:36 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



once upon a time we had a "link:" command that worked ... just saying that it may be the "site:" command that's been tampered with, not the index...

This is not the case. When I search for terms on pages that used to be in the index, the search results do not include those pages or similar pages. They are not in the index, but pages that do show with a site:domain query do show up when terms on those pages are searched for.

julinho

10:40 pm on May 7, 2006 (gmt 0)

10+ Year Member



Since yesterday, my homepage (and only the homepage) is missing from google.com, when I search for [keyword].

However:
- it is ranking better than ever in google.co.uk and google.ca
- when I use the old trick of searching for [keyword -adfadfs] , the page appears in google.com at around the same position as in google.ca and google.co.uk

This happened to me in the past with other sites and keywords, and eventually the site reappeared with better rankings.
I hope the same happens now.

F_Rose

10:44 pm on May 7, 2006 (gmt 0)

10+ Year Member



g1smd,

I have noticed that there are three sets of datacenters. In two sets, when I do a search for site:, 24 results (pages) come up for our site.

In the third set, which is our default, 24 come up as regular results and the rest (approx. 600-1000) come up as supplemental results.

If I show 24 pages on all datacenters, does that mean our site has a problem?

g1smd

10:46 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Please quote the actual IP address of where you see the results. Accesses to google.TLD are serving results from one of 80 datacentres, and different people will get results from different datacentres at the same time, hence they will see different results.

For me google.co.uk was serving stuff from 72.14.207.104 for a while earlier then 66.249.93.104 later on and right now -- which means that what I now see at google.co.uk is totally different to what I was seeing earlier.
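One way to see which addresses a hostname currently resolves to is a DNS lookup from your own machine. A minimal sketch using Python's standard `socket.gethostbyname_ex` follows. The `resolver` parameter is an addition of mine so the function can be tried without a network connection, and the lookup only shows what your own resolver returns at that moment, not the full set of datacentres.

```python
import socket


def resolve_ips(hostname, resolver=socket.gethostbyname_ex):
    """Return the sorted, de-duplicated IPv4 addresses for hostname.

    resolver must behave like socket.gethostbyname_ex and return
    (canonical_name, alias_list, ip_address_list).
    """
    _name, _aliases, ips = resolver(hostname)
    return sorted(set(ips))


# Offline demonstration with a stand-in resolver that returns the two
# addresses quoted above. Call resolve_ips("google.co.uk") without the
# resolver argument for a real lookup.
def fake_resolver(hostname):
    return hostname, [], ["72.14.207.104", "66.249.93.104", "72.14.207.104"]


seen = resolve_ips("google.co.uk", resolver=fake_resolver)
```

Noting the address alongside each site: count makes it possible to tell a change in the index from a switch between datacentres.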

claus

10:48 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> When I search for terms on pages that used to be in the index

You mean a quoted search (search terms in quotes)?

I've had some erroneous results with these queries lately, with them not displaying what they ought to.

F_Rose

11:04 pm on May 7, 2006 (gmt 0)

10+ Year Member



If most of our pages are not in Google's experimental and cleanup datacenters, does that mean that Google considers our pages duplicates and that is the reason they are getting rid of them?

g1smd

11:11 pm on May 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I am not sure whether that is the case, or whether it just needs the site to be recrawled (or a bottleneck in the new Crawl Cache to be cleared) first.

I hope this will be more clear in the coming weeks. About time Matt Cutts gave a hint or two methinks...

F_Rose

11:20 pm on May 7, 2006 (gmt 0)

10+ Year Member



I see other sites going through the same drama... That means I am not the only one affected.

But what concerns me is this: if Google is doing this on purpose, for duplicate content (which I don't think we have), I would want to know. If there is something I need to do to resolve this issue, every day that passes is a pity.

I feel it's a crime what Google is doing to us. Google, speak up. You are getting people so annoyed. We are losing our entire trust in you. It's a shame.

Atomic

12:22 am on May 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> When I search for terms on pages that used to be in the index
You mean a quoted search (search terms in quotes)?

Yes, and without quotes as well. Some of the terms are very unusual and my pages used to show up. Now they don't, they are no longer cached (they were previously) and they don't appear in a site:domain query.

I guess nothing is ever 100% sure but it looks to me like the pages are no longer in this particular index. On the bright side the pages are slowly returning so I have fewer and fewer pages to check.
