Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 275 message thread spans 10 pages (this is page 3).
Google Datacenters Watch: 2006-01-30
Observations, Analysis and Remarks
johnwards




msg:772938
 3:55 pm on Jan 30, 2006 (gmt 0)

< continued from [webmasterworld.com...] >

This is just odd.

The 64.* DC's return about 300 pages from my site.

The 216.* DC's return about 46,000 pages from my site.

And the 66.* return 69,000 pages from my site.

Currently I have about 65,000 pages.

If I go to google.co.uk I get 46,000 pages. If I go to google.com from my US based server I get the same 46,000 results.

It is all very odd and confusing.

[edited by: tedster at 9:56 pm (utc) on Jan. 30, 2006]

 

Dayo_UK




msg:772998
 11:49 am on Feb 1, 2006 (gmt 0)

>>>>Do the individual BD DC's cache pages with a separate crawl or do they use a common Mozilla Googlebot crawl to cache pages for all Big Daddy index DC's?

Yes, pretty sure on that.

Obviously they can get updated and they may spread to some DCs before others - although that process is normally quick.

As that DC has the exact same cache (even to the nearest minute/second) as the non-BD DC, it is not showing an updated Mozilla Googlebot cache IMO but a Normal Googlebot cache.

Of course as time goes on things change and we may see a merge or something.

>>>We assumed new links and text changes were calculated at this point and used in the new serps?

Well, to a degree this was true in the past - however there have always been underlying indexes etc. - which means you can rank on words that no longer appear on the newly cached page. G1smd has had a frustrating experience in this area.

BD is obviously a bit different - MC is saying no ranking changes for one - so it would not surprise me if the ranking structure of said pages is based on different data to what the Mozilla Googlebot/Big Daddy crawl has gathered as well. E.g. perhaps ranks are based on the Normal Googlebot crawl.

Of course this is speculating a bit now.

[edited by: Dayo_UK at 11:54 am (utc) on Feb. 1, 2006]

Ellio




msg:772999
 11:51 am on Feb 1, 2006 (gmt 0)

I agree with you, it does seem to be the default cache. Very odd.

bluewidgets




msg:773000
 12:00 pm on Feb 1, 2006 (gmt 0)

Mozilla Googlebot
Can anybody give me the IP of Mozilla Googlebot?

Ellio




msg:773001
 12:01 pm on Feb 1, 2006 (gmt 0)

BD is obviously a bit different - MC is saying no ranking changes for one - so it would not surprise me if the ranking structure of said pages is based on different data to what the Mozilla Googlebot/Big Daddy crawl has gathered as well. E.g. perhaps ranks are based on the Normal Googlebot crawl.

If this is the case then sites that rank on Big Daddy but not at all on default must now be appearing for "other" structural reasons if the results are both using the same cache to rank.

Dayo_UK




msg:773002
 12:08 pm on Feb 1, 2006 (gmt 0)

>>>>>If this is the case then sites that rank on Big Daddy but not at all on default must now be appearing for "other" structural reasons if the results are both using the same cache to rank.

Yes, that is what I have been thinking. Especially with MC coming out and saying:-

"the changes on Bigdaddy are relatively subtle (less ranking changes and more infrastructure changes). Most of the changes are under the hood, and this infrastructure prepares the framework for future improvements throughout the year."

IMO the next stage of the BD process is when things might really happen.

Bluewidgets

Mozilla Googlebot has various IP addys.

It can easily be identified in the logs by user-agent though:-

66.249.72.103 - - [31/Jan/2006:HH:MM:SS +0100] "GET /mypage.html HTTP/1.1" 200 9148 www.mysite.com "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
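
A rough sketch of how the user-agent check could be scripted, for anyone who wants to split their logs by crawler. It assumes the Mozilla Googlebot always sends the `Mozilla/5.0 (compatible; Googlebot/...` string shown in the log line above and that the normal Googlebot sends a user-agent containing `Googlebot/` without that prefix. The function names are made up for illustration, and since a user-agent can be spoofed this only tells you what the visitor claims to be.

```python
import re

# Assumption: the Mozilla-style crawler identifies itself with this prefix,
# as in the log line quoted in the post above.
MOZILLA_GOOGLEBOT = re.compile(r"Mozilla/5\.0 \(compatible; Googlebot/")
# Assumption: any other user-agent containing "Googlebot/" is the normal crawler.
ANY_GOOGLEBOT = re.compile(r"Googlebot/")


def classify_googlebot(log_line):
    """Return 'mozilla', 'normal', or None for a single access-log line."""
    if MOZILLA_GOOGLEBOT.search(log_line):
        return "mozilla"
    if ANY_GOOGLEBOT.search(log_line):
        return "normal"
    return None


def client_ip(log_line):
    """Return the first whitespace-separated field, assumed to be the client IP."""
    fields = log_line.split()
    return fields[0] if fields else None


def count_by_crawler(log_lines):
    """Tally how many requests each crawler type made."""
    counts = {"mozilla": 0, "normal": 0}
    for line in log_lines:
        kind = classify_googlebot(line)
        if kind is not None:
            counts[kind] += 1
    return counts
```

Feeding it the lines of an access log (for example `count_by_crawler(open("access.log"))`, where the filename is hypothetical) gives a quick split of Mozilla Googlebot versus normal Googlebot requests.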

bluewidgets




msg:773003
 12:20 pm on Feb 1, 2006 (gmt 0)

66-249-72-33
Is that a Mozilla or a normal one?

Dayo_UK




msg:773004
 12:23 pm on Feb 1, 2006 (gmt 0)

From searching on the web it looks like a Mozilla Googlebot.

johnwards




msg:773005
 12:43 pm on Feb 1, 2006 (gmt 0)

This is starting to make more sense.

I have a Google sitemap, but it doesn't have all my pages as I was experimenting. It has my 300-odd landing pages, but nothing much deeper than that. (Differing counties/towns for my property stuff.)

In the majority of the BigDaddy DCs it only has these pages, plus a couple of others.

In the current live database it has 45,000 pages.

In this big daddy DC [66.249.93.104...] it has 66,000 pages which is roughly the right amount.

Possibly the sitemaps are going to be a lot more important?

DanMoore




msg:773006
 1:04 pm on Feb 1, 2006 (gmt 0)

Hi guys,

I've been lurking in the data centers threads for quite some time and I must say that you guys have taught me a lot and I hope you continue to share this fascinating stuff.
I have many sites, one of which is brand new, about 5 months old. This site seems to have 15,000 pages indexed in BD while on regular Google only 12 pages are indexed. Regular Googlebot only visits my homepage on this newer site while Mozilla Googlebot crushes the site 24 hours per day. That explains the difference in # of pages indexed between the data centers.
I have also noticed that on my older sites (some of which are quite old and rank very well), .htm pages are now crawled by regular googlebot only while php pages are now crawled only by Mozilla Googlebot. Any thoughts?

foolsgold




msg:773007
 3:03 pm on Feb 1, 2006 (gmt 0)

DAYO_UK 'IMO the next stage of the BD process is when things might really happen'.

So any thoughts on what this will be and when?

Dayo_UK




msg:773008
 3:14 pm on Feb 1, 2006 (gmt 0)

>>>>>So any thoughts on what this will be and when?

This is all just IMO - remember.

I think the next stage would be to apply the re-calculated internal PR as a result of changes to 301/2/canonical handling and other infrastructure changes to the serps.

Whether this has to wait until the BD infrastructure hits all DCs is anyone's guess.

If MC is asking for another call for feedback by the end of the week - the optimistic side of me hopes that something more will happen by then that requires this feedback.

LunaC




msg:773009
 4:30 pm on Feb 1, 2006 (gmt 0)

On 66.249.93.104 my main site is back to url only for the index page, and WITH the www, despite a 301.... and Google having corrected itself and listing without the www's last update.

- Many more links are leading to my url without the www.
- Internal links are all full urls without the www.
- 301's been in place for many months
- Google sitemap has been up for some time as well.

Since I did all that I've now fallen out of MSN and Yahoo is showing old urls that haven't existed for almost a year with a current cache.

No idea if any of that is related, bad luck, or something on my server's side. (Tested and the 301's seem ok but with all the problems with the other SE's listings, I'm now wondering if I'm missing something.)

I'm beyond frustrated at this point!

RobinK




msg:773010
 6:08 pm on Feb 1, 2006 (gmt 0)

Reseller,
I was hoping we wouldn't make it the full year in this awful Google limbo but, for what it is worth, Unhappy Anniversary and may you not have many more.

BillyS




msg:773011
 6:31 pm on Feb 1, 2006 (gmt 0)

So what is BigDaddy all about anyway?

Infrastructure / under the hood? Are we talking 64 bit processors in those inexpensive Google servers?

Is Mozillabot gathering more sophisticated information that the 64 bit processors can now horse around with?

Is the slow rollout due to hardware change outs?

Pure speculation. But it seems that BigDaddy is more about a platform than fixing results (at this moment).

Ellio




msg:773012
 6:36 pm on Feb 1, 2006 (gmt 0)

Pure speculation. But it seems that BigDaddy is more about a platform than fixing results (at this moment).

The latter bit has been said many times but the fact is Big Daddy HAS already fixed a lot of sites including ours.

I genuinely feel sorry for those not yet affected by the fix but it seems incorrect not to recognise the large number of Big Daddy fixes already reported on WebmasterWorld alone...

dakman




msg:773013
 7:25 pm on Feb 1, 2006 (gmt 0)

The new results propagating among the DC's are strange...

Both my competition and I had over 100,000 pages indexed, then all of a sudden the counts dropped by more than 85%: 216.* and 66.* are now showing only 14,000 pages.

I have noticed many sites are going down in indexed page counts... I wonder if this is just flux or if it's some new algo...

I don't think you can publish thousands of pages and expect them to be indexed anymore unless you have high PR / IBLs or are an aged site.

I checked older sites with way more IBL's in my industry and the new updates don't seem to have touched them.

Bushmaster




msg:773014
 8:24 pm on Feb 1, 2006 (gmt 0)

I have also noticed a significant indexed page drop as of today. Yesterday I was showing 1,080,000 pages indexed in most DC's. Now today I am showing 299 pages indexed in most DC's! I hope this isn't something permanent.

CainIV




msg:773015
 8:34 pm on Feb 1, 2006 (gmt 0)

The latter bit has been said many times but the fact is Big Daddy HAS already fixed a lot of sites including ours.

I genuinely feel sorry for those not yet affected by the fix but it seems incorrect not to recognise the large number of Big Daddy fixes already reported on WebmasterWorld alone...

I would have to agree by confirming fixes for some of my sites (not all).

What criteria are used to apply the proper fixes to site x is beyond me.

Two sites I applied 301's to right after Jagger show no http:// listings in G and the homepage shows up first in any site search.

The third site is quite the opposite, with the non www urls still at an all time high...

LunaC

When did you apply the 301?

Here's a decent checklist to perform:

- Use a proper header checker to see that the 301 does redirect properly and that the destination (www) page returns '200 OK'.
- Use Xenu link checker to check for non-www pages in the site navigation and change all of them to www.

Feel free to PM if you need a 3rd eye.
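
For anyone who would rather script the header check than use an online tool, here is a minimal sketch in Python. It is only an illustration of the first checklist item, not a replacement for a proper header checker: the function names are hypothetical, the `Location` header is compared as an exact string (a relative `Location` would be flagged even if it resolves correctly), and the direction of the redirect is whatever you pass in, so it works for www to non-www as well.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url):
    """Send a HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def check_canonical_redirect(source_url, target_url, fetch=fetch_status):
    """Check that source_url 301s to target_url and target_url returns 200.

    Returns a list of problems; an empty list means both checks passed.
    The fetch argument exists so the logic can be tested without a network.
    """
    problems = []
    status, location = fetch(source_url)
    if status != 301:
        problems.append(f"{source_url} returned {status}, expected 301")
    elif location != target_url:
        problems.append(
            f"{source_url} redirects to {location}, expected {target_url}"
        )
    status, _ = fetch(target_url)
    if status != 200:
        problems.append(f"{target_url} returned {status}, expected 200")
    return problems
```

Calling `check_canonical_redirect("http://example.com/", "http://www.example.com/")` (placeholder URLs) returns an empty list when the redirect is a real 301 and the destination answers 200; a 302 or a redirect to the wrong host shows up as a line in the returned list.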

nohllywd




msg:773016
 8:55 pm on Feb 1, 2006 (gmt 0)

I read this thread every day and I have a silly question. What is Big Daddy, is this different than the Data Centers or what?

BillyS




msg:773017
 10:05 pm on Feb 1, 2006 (gmt 0)

>>>The latter bit has been said many times but the fact is Big Daddy HAS already fixed a lot of sites including ours.

I'm not saying sites have not been fixed, but based on Matt's comments that seems to be a secondary effect. For some of us, problems continue.

My pages are down 50% in BigDaddy despite being spidered all the time. There is no reason for this low number - even ASK, which is relatively slow to index a site, has more pages.

reseller




msg:773018
 10:14 pm on Feb 1, 2006 (gmt 0)

Good evening Folks

Fun time. Fetch your warm Cappuccino.. sit back and relax :-)

Here you have a BigDaddy

[64.233.179.104...]

Here you have non-BigDaddy

[64.233.187.99...]
[64.233.187.104...]

Run your testing keywords/keyphrases on all the 3 DCs and take a look at top 10 listings.

Enjoy!

Good night and God bless.
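
If you want to compare the three DCs more systematically than by eye, something along these lines could help. It is a small sketch that assumes you have already copied the top 10 URLs from each DC into lists by hand; the function name and the labels are made up for illustration, and nothing here queries Google.

```python
def compare_top_listings(results_by_dc, top_n=10):
    """Compare each DC's top-N list against the first DC given.

    results_by_dc maps a label to an ordered list of result URLs.
    Returns {label: {"missing": [...], "added": [...], "moved": [...]}}
    where each entry is relative to the first (baseline) DC.
    """
    labels = list(results_by_dc)
    if not labels:
        return {}
    baseline = results_by_dc[labels[0]][:top_n]
    report = {}
    for label in labels[1:]:
        current = results_by_dc[label][:top_n]
        # URLs the baseline lists that this DC does not.
        missing = [url for url in baseline if url not in current]
        # URLs this DC lists that the baseline does not.
        added = [url for url in current if url not in baseline]
        # URLs present in both lists but at a different position.
        moved = [
            url
            for url in baseline
            if url in current and baseline.index(url) != current.index(url)
        ]
        report[label] = {"missing": missing, "added": added, "moved": moved}
    return report
```

Pass the BigDaddy list first so the two non-BigDaddy DCs are reported relative to it; empty lists for a DC mean its top 10 matches the baseline exactly.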

Ellio




msg:773019
 10:27 pm on Feb 1, 2006 (gmt 0)

>>>Run your testing keywords/keyphrases on all the 3 DCs and take a look at top 10 listings.<<<

I see no changes from recent results for our keywords in top ten on both BD and non BD.

Reseller - not sure what I am supposed to see or be having fun with?

arbitrary




msg:773020
 10:43 pm on Feb 1, 2006 (gmt 0)

For a site I recently launched, some of my pages are fully indexed on BD but are url only on non-BD, so I am looking forward to having Big Daddy fully implemented.

slade7




msg:773021
 10:53 pm on Feb 1, 2006 (gmt 0)

me like Big Daddy AND non-Big Daddy

I'm #2 in both for my most common kw - which is where I should be and have been off and on forever.

dakman




msg:773022
 11:06 pm on Feb 1, 2006 (gmt 0)

All I know is Big Daddy sliced my pages 6x (i.e. I lost 70,000 pages)... it's not happening only to me but to hundreds of other sites, big time... this means sliced traffic.

I wonder if Big Daddy will reindex all my pages again anytime soon. The results are good, but why can't Google figure out a way of updating the results they have instead of throwing them out and re-indexing again with their new algos?

jilla




msg:773023
 11:11 pm on Feb 1, 2006 (gmt 0)

I have a few basic questions I am confused about. When I check my site on 17 data centers I see basically 2 or 3 different results

<url removed -- no tool sites please, see Forum Charter [webmasterworld.com]>

Which are the centers that are the predominant ones in the US? It seems that the same center will have different results with refreshes, so does that mean each center also changes a few times each day too?

Which one(s) should I look at to understand where our US traffic comes from?

I have more traffic today than yesterday and am trying to understand it in terms of data centers. (I also don't really understand the big daddy stuff so if that does get brought up can it be explained more).

[edited by: tedster at 12:32 am (utc) on Feb. 2, 2006]

arbitrary




msg:773024
 11:22 pm on Feb 1, 2006 (gmt 0)

I get the feeling there has been a bit of a pull back in terms of how much the Big Daddy datacenters are being rolled out.

The reason I feel this is that I have been receiving less traffic to a new site that is better indexed by Big Daddy.

Is anyone seeing traffic levels change as a result of Big Daddy and do you think Big Daddy has been rolled out or scaled back in the last day or two?

BillyS




msg:773025
 12:19 am on Feb 2, 2006 (gmt 0)

Matt posted an update on his blog

g1smd




msg:773026
 12:26 am on Feb 2, 2006 (gmt 0)

>> It seems the serps data and the cache data don't have to match with each other. <<

Indeed. For a search where the words you search for are no longer on the real page, you can sometimes return the page as a match, and have those words show up in the snippet, even though those same words are nowhere to be found in the cache or on the real page. The ranking and snippet are disconnected from what is stored in the cache. I see thousands of examples like that.

Steph_R




msg:773027
 2:12 am on Feb 2, 2006 (gmt 0)

BD now shows on my default google for the first time. (in Texas)

nohllywd




msg:773028
 3:43 am on Feb 2, 2006 (gmt 0)

I just checked BD a few messages back and my main page is listed twice. Both are the same url. Does this make sense to anyone?

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved