Forum Moderators: Robert Charlton & goodroi
So: if your pages are still supplemental, feel free to write to sesnyc06 [at] gmail.com with the subject line of "stillsupplemental" (all one word), and I'll ask someone to check the emails out.
Hope that helps, and I'm glad that lots of people are seeing a full recovery,
GoogleGuy
[edited by: Brett_Tabke at 5:20 pm (utc) on Mar. 22, 2006]
I believe they said no 'major' changes, and I don't think most people see any 'major' changes... if any at all.
--- Instead it is a fundamental change to their infrastructure, new bot, new indexes, etc.
I guess they would create a new bot and build a new index to have the same results?!?!?!
---(possibly even all pages that are new since January-ish).
You've got to be kidding... 80% of my site has been built since then... all there!
---Yes this may well get rid of some spam, but it will also get rid of all of the quality content too.
I suppose everyone should expect to see '0 results found' pretty soon! That would be a bad business move...
---I guess you are just lucky that you aren't affected (yet). Maybe you should just count your blessings instead of coming over all smug?
I just try to build quality, useful websites... a strategy that's worked for me for almost 10 years...
Has anyone else seen this type of behavior recently? Wondering if this is an isolated issue or not. I've just set up a Google Sitemap with all of the new URLs and submitted it successfully... next step will be to create one with the old URLs and see if they reindex the redirect.
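Before resubmitting the old URLs, it's worth confirming from the server logs that Googlebot is actually receiving 301s on them. A minimal sketch, assuming an Apache-style combined-format access log; the log path and the sample lines below are made-up placeholders, not the poster's real site:

```shell
# Stand-in for a real access log (combined format assumed).
cat > /tmp/access.log <<'EOF'
66.249.65.1 - - [22/Mar/2006:10:00:00 +0000] "GET /old-page.html HTTP/1.1" 301 0 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
66.249.65.2 - - [22/Mar/2006:10:05:00 +0000] "GET /new-page.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
EOF

# Field 9 of the combined log format is the HTTP status code:
# count Googlebot hits that were served a 301.
redirects=$(awk '/Googlebot/ && $9 == 301 {n++} END {print n+0}' /tmp/access.log)
echo "Googlebot 301 hits: $redirects"
```

If that count stays at zero after the redirects went live, the sitemap resubmission won't help, because the bot never saw the 301s in the first place.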
Trouble is, their new bot and/or index contains a serious bug that indiscriminately discards perfectly good pages (possibly even all pages that are new since January-ish).
If that's the case, isn't it reasonable to assume that they'll work on identifying and fixing the bug after the rollout is complete? Maybe I'm naive, but I don't think Google wants to discard "perfectly good pages" at random or to dump all pages that have been published since January.
Re Your Post:
"Where is Google pulling titles from lately?
We're ranked well for a particular KW but where they are pulling the title from has me completely stumped! "
I've noticed that for some results Google has been returning our DMOZ/Google Directory title and description, and for others a mixture of the two.
Then there is the index cache date at the bottom, which will say something like 20 March, but when you click on the cached page link it'll give a variety of cache-dated pages, anywhere from Dec 2005 to 20 March 2006 at the moment.
Can you all look in your logs and see if both Googlebots are visiting your site, if you had damage in the last few weeks?
Also, if you are not affected, can you see whether both or only one bot is crawling your site?
Perhaps there is a conflict between these two.
old : Googlebot/2.1 (+http://www.google.com/bot.html)
new : Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
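A quick way to do that check, as a sketch: count hits from each of the two user-agent strings above. The log path and the two sample log lines are placeholders; point the greps at your own access log.

```shell
# Stand-in for a real access log; replace with your own file.
cat > /tmp/access.log <<'EOF'
1.2.3.4 - - [20/Mar/2006:09:00:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
1.2.3.5 - - [21/Mar/2006:09:00:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
EOF

# The old UA starts immediately with "Googlebot/2.1 after the quote;
# the new (Bigdaddy) UA wraps it in a Mozilla/5.0 string.
old=$(grep -c '"Googlebot/2.1' /tmp/access.log)
new=$(grep -c 'Mozilla/5.0 (compatible; Googlebot/2.1' /tmp/access.log)
echo "old-bot hits: $old, new-bot hits: $new"
```

Affected sites could then compare: are they seeing only the old bot, only the new one, or both?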
This has nothing to do with them getting rid of MFA junk or autogenerated spam or attacking mega-sized sites; it is a bug, pure, plain and simple. I know of several other "authority" sites that have been hit the same way the one site of ours has been. GG says it's a bug, MC says it's a bug, and all of my team's research says it's a bug.
The first is the current URL and version; the second, indented, is last year's version of the same URL, showing as supplemental. A clear example of the same URL being listed twice, once as a supplemental and once as a normal index listing, which shows me Google is treating them as separate URLs even though it's the same one. Either the current URL is getting the old URL's ranking (in that it's ranked supplemental), or having one URL treated as two URLs means you can't help but trip a duplicate filter.
The difference between the two versions is not the content but a series of links to inner pages that were removed.
It seems to me that they most certainly are not aware of any problem with BD. Only today, Matt Cutts was proudly announcing that the complete rollout is now only days away, with just two bugless, fully-functioning, non-BD DCs remaining. Hooray!
A few weeks back Google thought they might have a minor problem. Matt Cutts then announced that the minor problem had been fixed. One or two people then thanked him profusely. And here we all still are.
Those of us who have been affected know that there is a problem, but Google absolutely do not. It's not like they are quietly beavering away at a fix. They aren't, because they have yet to acknowledge or even notice that it's broken.
We’re down to just 1-2 data centers left in the switchover to Bigdaddy. It’s possible that the Bigdaddy switchover will be complete in the next week or two. Just as a reminder, Bigdaddy is a software upgrade to Google’s infrastructure that provides the framework for a lot of improvements to core search quality in the coming months (smarter redirect handling, improved canonicalization, etc.). A team of dedicated people has worked very hard on this change; props to them for the code, sweat, and hours they’ve put into it.
- Matt Cutts
If Google is doing this with other websites, the quality of Google search results has just taken a major dive. I can see why they might want to truncate the titles of sites that simply contain strings of spammy words, but removing the spam-free description of a website from the displayed title makes no sense and will only confuse users.
I do not understand this at all; how can both be? If it has a cache date of just a few days old, why would it still show as supplemental? Could it be that the page was crawled within the past few days but G has not reindexed it as of yet, and when it does it will no longer show as supplemental?
Sure hope somebody can explain this to me where it will make some sense.
Summary:
1 page is showing - almost zero Google traffic
Total estimated pages for caching: around 95,200
Viewed first 999 results; assume the rest are similar

64.233.179.104
Home page cached 8 Mar
All remaining pages: 42,099 supplementals
Last caches around 25 Jul 05

72.14.207.104
Home page cached 8 Mar
All remaining pages: 23,099 supplementals
Last caches around 25 Jul 05

66.249.93.104
Home page cached 8 Mar
All remaining pages: 23,109 supplementals
Last caches around 25 Jul 05

301 redirects from old to new pages done around 21 Feb
Site Maps submitted around 21 Feb
March Traffic
Uniques
Yahoo.........10255
MSN............2733
Google..........360
Altavista.......341
Overall traffic down around 60-70% through Google
Is this pattern shared amongst webmasters?
64.233.179.104
7 out of first 10 pages cached in March 06
last 3 out of 10 cached in June 05
remainder: 99% URL only.
64.233.187.99
Just changed since last night!
The June 05 cache date starts at position # 11
remainder of site: 98% is URL only.
Here's where the problem defines itself:
All DCs still show a very nasty June 05 cache-dated 404 page (which happens to be an https page that never really existed) showing up as an indication of where the bots get hung (around position 11-14, with indications that it is progressively moving down the list).
I suppose this is slow progress.
After about 10 results, all data centers show https pages, presumably as duplicates, and the site only has 1 genuine https page.
This 1 key https page has been modified with a robots noindex,nofollow meta tag to stop the bots.
Please Google, send a newly-smartened up Bot our way that can solve this problem.
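One sanity check before blaming the bot: confirm the https page really is serving that meta tag. A sketch, using a saved copy of the page as a stand-in for something like `curl -s https://www.example.com/secure-page.html`; the filename and markup below are made up:

```shell
# Saved copy of the https page (placeholder content).
cat > /tmp/secure-page.html <<'EOF'
<html><head>
<meta name="robots" content="noindex,nofollow">
<title>Secure page</title>
</head><body>...</body></html>
EOF

# Count lines carrying the robots noindex directive (case-insensitive).
found=$(grep -ci 'name="robots" content="noindex' /tmp/secure-page.html)
echo "noindex meta tags found: $found"
```

If that comes back zero on the live https URL, the bots were never told to stay out, and the duplicate listings would be expected behavior rather than a crawler bug.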
It might be smart to remove all non-www 301s, ask anyone linking to your site via a 302 redirect to remove the link, and just make better links to the site.
I do not expect a fix if it's been this long.
< continued here: [webmasterworld.com...] >
[edited by: tedster at 5:29 am (utc) on Mar. 25, 2006]