Forum Moderators: Robert Charlton & goodroi


Pages Dropping Out of Big Daddy Index

Part 2

         

GoogleGuy

7:59 pm on May 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Continued from: [webmasterworld.com...]


internetheaven, you said:

I had 20,300 pages showing for a site:www.example.com search yesterday and for the past month. Today it dropped to 509 but my traffic is still pretty constant. I normally get around 4,500 - 5,000 to that site per day and today I've already got 4,000.

So, either Google doesn't account for even a small percentage of my traffic (which I doubt) or the way Google stores information about my site has changed. i.e. the 20,300 pages are still there, Google will only tell me about 509 of them. As far as I can tell, I think the other pages have been supplemented.

That resonated with something that I was talking about with the crawl/index team. internetheaven, was that post about the site in your profile, or a different site? Your post aligns exactly with one thing I've seen in a couple of ways. It would align even more if you were talking about a different site than the one in your profile. :) If you were talking about a different site, would you mind sending the site name to bostonpubcon2006 [at] gmail.com with the subject line of "crawlpages" and the name of your site, plus the handle "internetheaven"? I'd like to check the theory.

Just to give folks an update, we've been going through the feedback and noticed one thing. We've been refreshing some (but not all) of the supplemental results. One part of the supplemental indexing system didn't return any results for [site:domain.com] (that is, a site: search with no additional terms). So that would match with fewer results being reported for site: queries but traffic not changing much. The pages are available for queries matching the supplemental results, but just adding a term or stopword to site: wouldn't automatically access those supplemental results.

I'm checking with the crawl/index folks if this might factor into what people are seeing, and I should hear back later today or tomorrow. In the meantime, interested folks might want to check if their search traffic has gone up/down by a major amount, and see if there are fewer/more supplemental results for a site: search for their domain. Since folks outside Google couldn't force the supplemental results to return site: results, it needed a crawl/index person to notice that fact based on the feedback that we've gotten.

Anyone that wants to send more info along those lines to bostonpubcon2006 [at] gmail.com with the subject line "crawlpages" is welcome to. So you might send something like "I originally wrote about domain.com. I looked at my logs and haven't seen a major decrease in traffic; my traffic is about the same. I used to have about X% supplemental results, and now I hardly see any supplemental results with a site:domain.com query."
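For anyone who wants to check their logs for the kind of traffic change described above, here is a minimal Python sketch that counts visits referred by each search engine. It assumes the server writes the common "combined" access log format, where the referrer is the second-to-last quoted field; the `ENGINES` mapping and the function names are illustrative assumptions, not anything Google or the posters specified.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping of referrer-host fragments to engine names.
ENGINES = {"google.": "Google", "search.msn.": "MSN", "search.yahoo.": "Yahoo"}

# Matches every double-quoted field on a log line.
QUOTED = re.compile(r'"([^"]*)"')


def engine_for(referrer):
    """Return the engine name for a referrer URL, or None if it is not a known engine."""
    host = urlparse(referrer).netloc.lower()
    for fragment, name in ENGINES.items():
        if fragment in host:
            return name
    return None


def count_search_referrals(lines):
    """Count log lines per search engine, based on the referrer field."""
    counts = Counter()
    for line in lines:
        fields = QUOTED.findall(line)
        # Combined format has three quoted fields: request, referrer, user agent.
        if len(fields) < 3:
            continue
        name = engine_for(fields[-2])
        if name:
            counts[name] += 1
    return counts
```

Running `count_search_referrals(open("access.log"))` on the logs for two different periods (the file name is a placeholder) gives per-engine totals that can be compared side by side. User agents containing escaped quotes would confuse this simple parser, so treat the numbers as approximate.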

I've still got someone reading the bostonpubcon email alias, and I've worked with the Sitemaps team to exclude that as a factor. The crawl/index folks are reading portions of the feedback too; if there's more that I notice, I'll stop by to let you know.

[edited by: Brett_Tabke at 8:07 pm (utc) on May 8, 2006]

ClintFC

2:49 am on May 9, 2006 (gmt 0)

10+ Year Member



In fact, thinking about it, wasn't Big Daddy seeded from an index dated around July/August last year?

It's not possible that Google's backlink index hasn't been "refreshed" at all since BD was rolled out, is it?

icedowl

4:16 am on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



interested folks might want to check if their search traffic has gone up/down by a major amount

I've seen quite a change, what may be a migration to another SE, and quite a drop. I sure hope that Google can claw its way out of the hole it has dug for itself.

Looking at just the uniques from the last year for one of my sites:


Month--Google----MSN----Yahoo--
May 05..4,706....306....2,334
Jun 05..3,162....233....1,386
Jul 05..3,031.....88....1,298
Aug 05..2,829....118....2,018
Sep 05..2,258....266....2,477
Oct 05..2,409....367....2,669
Nov 05..3,055....217....2,211
Dec 05..4,334....328....2,675
Jan 06..1,960.....92....1,275 First appearance of Supplemental pages
Feb 06....276....172....1,134
Mar 06....121....763....1,032
Apr 06....152..1,945....1,366
May 06.....48....178......355 (8 days only)

Pathetic.
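To put numbers on the drop, here is a minimal Python sketch that works out the month-over-month change in Google uniques and Google's share of the three engines, using the figures from the table above. The function names are illustrative, and the partial May 06 row (8 days only) is left out because it is not comparable to full months.

```python
# Monthly uniques copied from the table above: (month, Google, MSN, Yahoo).
ROWS = [
    ("May 05", 4706, 306, 2334),
    ("Jun 05", 3162, 233, 1386),
    ("Jul 05", 3031, 88, 1298),
    ("Aug 05", 2829, 118, 2018),
    ("Sep 05", 2258, 266, 2477),
    ("Oct 05", 2409, 367, 2669),
    ("Nov 05", 3055, 217, 2211),
    ("Dec 05", 4334, 328, 2675),
    ("Jan 06", 1960, 92, 1275),
    ("Feb 06", 276, 172, 1134),
    ("Mar 06", 121, 763, 1032),
    ("Apr 06", 152, 1945, 1366),
]


def pct_change(previous, current):
    """Percentage change from previous to current."""
    return (current - previous) / previous * 100.0


def google_share(google, msn, yahoo):
    """Google's percentage of the combined uniques from the three engines."""
    return google / (google + msn + yahoo) * 100.0


if __name__ == "__main__":
    previous_google = None
    for month, google, msn, yahoo in ROWS:
        share = google_share(google, msn, yahoo)
        if previous_google is None:
            change = "n/a"
        else:
            change = f"{pct_change(previous_google, google):+.1f}%"
        print(f"{month}: Google share {share:.1f}%, change vs. prior month {change}")
        previous_google = google
```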

bkpix

4:20 am on May 9, 2006 (gmt 0)

10+ Year Member



Hi Google Guy:

I'm glad someone from Google is paying attention. Like Clint, my problem is not with supplemental results but with the fact my site has almost totally disappeared from Google.

Basically: New site, submitted to Google in March, with about 50 pages. All original content, no spam, no black hat craziness of any kind. Site was initially indexed up to about 40 pages worth; then in April pages started to disappear from the index. Now I have only the main URL left, in two versions.

Anything I can do to fix this?

snowweb

4:39 am on May 9, 2006 (gmt 0)

10+ Year Member



Hmm.. I don't see this as just a 'backlink' problem. If you have a low PR, your pages should still show in the index under a site:domain.com search. If they don't, I see that as a bug.

To GoogleGuy, I would like to say that it is great to at last have some feedback, but it does sound as though you are barking up the wrong tree and still don't really recognise the scale of the problems being experienced. You appear to be of the opinion that those whose sites have all but disappeared from the index were deserving of such treatment because you can see that they have a penalty.

First of all, did you check to see whether the penalty really was deserved or whether it was awarded in error? Did you actually see the offence yourself? How do you know that penalties are not being awarded erroneously?

I know for sure that I had no duplicate content, no spam techniques, just a simple website selling web hosting and web design. It has about 40 pages (only the homepage is displayed now in site:, though). All page titles and descriptions are different. I even have a sitemap and can see that Googlebot has visited the pages in the past.

Secondly, why can you see whether we have a penalty and we can't? Is there really any harm in putting a red dot on our Sitemaps account summary page with the text "Duplicate content penalty" and a link to what this means, and even specifically what content?

Surely that would beat all the emails Google must receive daily asking why their site has been dropped.

Also, there should be a simple link to press if you believe the penalty was awarded in error. A human should then review the pages and correct the penalty if it was an error.

Honestly, I'm exhausted by Google. I don't have trouble with Yahoo or MSN, my pages are always reliably there. Why is Google such hard work? I want to spend my time writing content for my users, and designing web sites for clients not working as a Google slave.

I never considered writing any special code for the search engines until now. There was simply no need, but I'm forced now, for the first time, to do that. Since I now have nothing to lose (except the homepage), I have written a separate template which PHP will use for Googlebot when it comes. It's nothing fancy; it simply removes the fancy navigation, images, etc., which has the effect of putting the content nearer the top of the page. That way, if the problem is that it only checks the first 200 lines of my template and never sees the real content, it should then avoid awarding a duplicate content penalty.

It will also hopefully increase my keyword density (or at least help make Google's calculation of it accurate).

GG, please don't think I'm getting at you. It's great that someone at Google is taking our concerns seriously. I hope you're able to kick some butt there and get things sorted out for us real soon!

Regards

peter

reseller

5:37 am on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Good morning Folks

The following DCs show the highest number of indexed pages for a site I watch.

Take a look. Hopefully something good for you on these DCs

[64.233.179.104...]
[64.233.183.104...]
[64.233.187.104...]

[66.102.9.104...]
[66.102.11.104...]

[216.239.59.104...]

Wish you all a great day.

Relevancy

5:39 am on May 9, 2006 (gmt 0)

10+ Year Member



Clint, that is what I am thinking as well.

Big Daddy took out dmoz clone sites and, I believe, a lot of crap directories (probably more types of sites too). Newer sites rely on backlinks to get indexed faster, since they can't build PR on their own, so with little or no link credit from directories our sites can't get indexed beyond the first level until that credit increases.

It's all about authority status... Big Daddy killed what little authority newish sites had and therefore killed our indexed pages. I run lots of newer sites as well as older sites. Only the newer ones were hit. Older ones still have their pages and get new pages indexed.

tigger

6:07 am on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm seeing a small recovery in page count, but talk about slow! My site dropped to 148, then started to recover; now I'm at 213 for a site with 600 pages, and at this rate it's going to take all year to get my content back in.

reseller

6:28 am on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi Folks

I suggest nominating tigger for the "Reward for being patient" :-)

arubicus

6:49 am on May 9, 2006 (gmt 0)

10+ Year Member



Seems like patience is wearing thin around here lately. At least for me. I am one of the most patient people you would ever meet but my ulcer is telling me enough is enough.

jenkers

6:53 am on May 9, 2006 (gmt 0)

10+ Year Member



I'm seeing pages come back for one particular site after dropping out for about 6 weeks now.

However, the cache date on the ones returning is 14th and 15th August last year, so I am not getting my hopes up about these sticking again (but they are now not supplemental).

arubicus

6:53 am on May 9, 2006 (gmt 0)

10+ Year Member



Maybe we should sacrifice a cow or toss a virgin in a volcano.

I vote we throw reseller in first. LOL

reseller

7:03 am on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



arubicus

"Maybe we should sacrifice a cow or toss a virgin in a volcano.

I vote we throw reseller in first. LOL"

How about starting with Dayo_UK first :-)

moftary

7:35 am on May 9, 2006 (gmt 0)

10+ Year Member



OK, I have been keeping an eye on all the relevant threads about this, but the mystery isn't resolved yet.

No one knows the reasons behind pages dropping from the index; that is OK. But how come a site of mine that used to have 150,000 indexed pages now has one, with tons of inner pages retaining their PR?

To keep it short, how can a page that is not indexed in Google have its own PR?

tigger

7:42 am on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>To keep it short, how can a page that is not indexed in Google have its own PR?

LOL I've got loads of those on my sites wanna swap - most are PR3's as well, sure makes linking interesting right now

moftary

7:46 am on May 9, 2006 (gmt 0)

10+ Year Member



LOL I've got loads of those on my sites wanna swap - most are PR3's as well, sure makes linking interesting right now

My interior pages have PR5, so I beat you :P
Also, I have lots and lots of precious Googlebot activity that is priceless!
