Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 126 message thread spans 5 pages.
Mozilla Googlebot and the New Index at 64.233.179.104
Moved on from Jagger
Dayo_UK

10+ Year Member



 
Msg#: 32409 posted 9:58 am on Dec 13, 2005 (gmt 0)

OK - Jagger is over - long live "Big Daddy" - as named by MC for the test DC.

The index growing on 64.233.179.104 does seem to be largely a Mozilla Googlebot generated index - and this new index is being built for the future - so can we say Mozilla Googlebot is now taking over from normal Googlebot?

OK ignore supplementals etc for a moment - as all DCs have this problem - and have a look at the cache dates for pages that are indexed...... some of these pages have only been fetched by Mozilla Googlebot (even on the same day as normal Googlebot visited)

Eg. On the test DC I have a homepage cached 30th November at 5:40 - fetched by Mozilla Googlebot - while on the other DCs it is cached on 30th November at 3:40 - fetched by normal Googlebot.

So in many ways this does look like building a whole new index parallel to the existing index - with largely Mozilla Googlebot crawl data.

Some pages appear very old - eg another page is cached on the test dc on 6th November - but on the other dcs it has cache in December - checking the logs - 6th November was the last time Mozilla Googlebot visited this page.
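For anyone wanting to repeat the cache-date check against their own logs, here is a rough sketch. It assumes you have already pulled your log entries into (timestamp, path, bot) tuples and that the cache time and the log times are in the same timezone. The function name, the bot labels and the 5 minute tolerance are all made up for illustration.

```python
from datetime import datetime, timedelta

def match_cache_to_bot(records, path, cache_time, tolerance=timedelta(minutes=5)):
    """Return the bot whose fetch of `path` is closest to cache_time.

    records: iterable of (timestamp, path, bot) tuples (hypothetical layout).
    Returns None when no fetch of that path falls within `tolerance`.
    """
    best_bot, best_gap = None, None
    for timestamp, rec_path, bot in records:
        if rec_path != path:
            continue
        gap = abs(timestamp - cache_time)
        if gap <= tolerance and (best_gap is None or gap < best_gap):
            best_bot, best_gap = bot, gap
    return best_bot

# Example data modelled on the homepage case above (two fetches on 30th November).
records = [
    (datetime(2005, 11, 30, 3, 40), "/", "classic"),
    (datetime(2005, 11, 30, 5, 40), "/", "mozilla"),
]
test_dc_bot = match_cache_to_bot(records, "/", datetime(2005, 11, 30, 5, 40))
other_dc_bot = match_cache_to_bot(records, "/", datetime(2005, 11, 30, 3, 40))
```

If the cache time on a DC lines up with a Mozilla Googlebot fetch and not a normal Googlebot one, that page was probably taken from the Mozilla crawl.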

OK - there are pages in the test DC only visited by normal Googlebot - however, pages crawled by Mozilla Googlebot do not appear on other DCs.

The newest pages on the DC crawled by Mozilla Googlebot seem to be in November - eg no pages crawled by Mozilla Googlebot in December have made it to the index yet.

Some pages crawled by Mozilla Googlebot in November have not made it to the index - so I don't know if G are working with a sample data size......

For confirmation that this is a whole new build of the index MC said on his blog:-

"the test data center certainly has some different crawling and indexing characteristics."

OK - folks remember also that MC said that this index will roll out in months and is in a test state so I guess no need for early panic stations and slagging of Google in this thread.

Now 301s, 302s, Canonicals - for me, Google has crawled and indexed a lot more 301s correctly. 302s - still lots in the index (mainly supplementals) - not seeing any new 302s that show the url of the linking site but the content of the destination site (the newest I am seeing date from about August 2005) - no doubt others may find some.

What other observations have people made about the new crawling and indexing on this test dc?

 

FromRocky

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 32409 posted 5:18 pm on Dec 13, 2005 (gmt 0)

I've noticed on the mentioned test DC:
1. Some caches dated December 2nd and the rest in November.
2. All of the URL only pages have converted to supplementals. Thus, there is no URL only left on this test DC. Does this indicate you either have a full or supplemental listing from this test DC? There are only two types of listing? Does anyone know what this supplemental listing means?

fatpeter

10+ Year Member



 
Msg#: 32409 posted 5:32 pm on Dec 13, 2005 (gmt 0)

I added over 3000 pages to my site in one go last month. Probably unwise as I only have 400 pages indexed! Mozilla has picked them all up. Googlebot hasn't got any of them. They seem to rank o.k on the test d.c.

Dayo_UK

10+ Year Member



 
Msg#: 32409 posted 5:38 pm on Dec 13, 2005 (gmt 0)

FromRocky

Yes, lots of url onlys have the title and description back - these really need to be recrawled, but the required recrawling of supplementals is pretty much standard across the dcs.

The caches in December - the ones I have seen tend to be normal Googlebot rather than Mozilla Googlebot. Normal Googlebot still adds pages to the test dc - but Mozilla Gbot does not add pages to the other dcs or so it seems.

fatpeter

What sort of cache date are you showing for those pages picked up by Mozilla Googlebot - and are pretty much all the pages crawled by the bot indexed? EG 3000 crawled and it appears 3000 indexed?

fatpeter

10+ Year Member



 
Msg#: 32409 posted 6:37 pm on Dec 13, 2005 (gmt 0)

"What sort of cache date are you showing for those pages picked up by Mozilla Googlebot - and are pretty much all the pages crawled by the bot indexed? EG 3000 crawled and it appears 3000 indexed?"

Can't go past 1000 but it looks like they were all crawled and added. Cache dates around the middle of November. Only odd thing... a site: search always gave an accurate number of about 400. I added 3000 and now the site: search gives 11000 results.

ddogg

10+ Year Member



 
Msg#: 32409 posted 7:15 pm on Dec 13, 2005 (gmt 0)

Wow I am excited! Mozilla bot is the main bot that has been crawling me for months. My pages actually show up in this index, hurray!

Strange how it shows the page count as being 10x what it actually should be.

Please update to this version!

Dayo_UK

10+ Year Member



 
Msg#: 32409 posted 7:26 pm on Dec 13, 2005 (gmt 0)

ddogg

What sort of cache dates are you showing?

Are you seeing a pretty much full crawl to index return - or just a sample?

ddogg

10+ Year Member



 
Msg#: 32409 posted 7:36 pm on Dec 13, 2005 (gmt 0)

My cache dates are earlier than what current Google is showing. End of November it appears.

Seems my whole site is indexed. In current Google 99% of my pages are url's only. In this version they are actually indexed and ranking like normal (no sandbox or any weirdness, 3 1/2 year old site though so shouldn't be sandboxed anyway).

I had been getting deep crawls by Mozilla bot for months but very very few pages would actually end up indexed. This is more like it!

zeus

WebmasterWorld Senior Member zeus is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 32409 posted 7:51 pm on Dec 13, 2005 (gmt 0)

In my case I see the non-www is gone and only www. is indexed + no supplemental results, ok still no ranking, but that's because it's not an update.

I also see a established site has gone from 50.000 indexed pages to 900, but that could have something to do with mozilla bot theory.

stinkfoot

10+ Year Member



 
Msg#: 32409 posted 8:57 pm on Dec 13, 2005 (gmt 0)

Geez .. no sup results ... no 10 year old caches ...

Does this mean my being a troll is at an end?

I think not he he he!

sja65

5+ Year Member



 
Msg#: 32409 posted 9:21 pm on Dec 13, 2005 (gmt 0)

On a site without a 301 redirect.
-Do site:name.com
....returns name.com/index.html
....followed by all of the pages as www.name.com/page.html
-Do site:www.name.com
....returns www.name.com/index.html
....followed by all of the pages as www.name.com/page.html
The only difference is the www on the home page. Number of pages is the same (home page is showing cached at the end of November)

On a site with the 301 redirect, results are identical with and without the www in the site command.

This looks like an improvement for the canonical issues.
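If you want to run the same comparison over a longer list of results, a small sketch like this could help. It assumes you have pasted the URLs from a site: query into a list by hand, and that www is the host you want to be canonical. The function names are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def split_hosts(urls):
    """Map each path to the host variants ('www' or 'bare') it is listed under."""
    by_path = defaultdict(set)
    for url in urls:
        # Listings often omit the scheme; prefix '//' so urlsplit sees a host.
        parts = urlsplit(url if "//" in url else "//" + url)
        host = parts.netloc.lower()
        variant = "www" if host.startswith("www.") else "bare"
        by_path[parts.path or "/"].add(variant)
    return by_path

def non_canonical_paths(urls, canonical="www"):
    """Return paths that are listed under anything other than the canonical host only."""
    return sorted(
        path for path, variants in split_hosts(urls).items()
        if variants != {canonical}
    )

# Example modelled on the site:name.com listing described above.
listing = [
    "name.com/index.html",
    "www.name.com/page.html",
    "http://www.name.com/about.html",
]
problem_paths = non_canonical_paths(listing)
```

On the listing above only the home page comes back as a problem path, which matches what the site: command shows for the site without the 301.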

I've also noticed several of my pages returning from supplemental and now ranking again.

fourchette

5+ Year Member



 
Msg#: 32409 posted 9:51 pm on Dec 13, 2005 (gmt 0)

Hey all,

My experience:

What I'm seeing here is that Google basically reincluded my whole site. The site was completely wiped out from google.com in the early days of Bourbon.

Since May we completely rewrote our site, fixed canonical issue, deleted thousands of pages, etc...

Our site and all of its pages (200) are now back in the index and showing on 4th 5th page for competitive keywords.

SO:

-This index is quite fresh AND for us it's either:

1) penalties or more aggressive filters are not yet applied

2) complete reinclusion and weight off of the multiple sites penalties that were on this domain.

Ok I hope it's option 2.. but...

Could they still add filters and penalties to this index, or are we dealing with soon to be stable results with all site penalties applied?

lee_sufc

10+ Year Member



 
Msg#: 32409 posted 10:12 pm on Dec 13, 2005 (gmt 0)

the results on the test dc now look the same as the others - this keeps happening

Miop

10+ Year Member



 
Msg#: 32409 posted 10:48 pm on Dec 13, 2005 (gmt 0)

OT - but can't access the other thread.
Spamming b's that have plagued my sector have gone from the Serps finally.

To whoever is watching who might have had something to do with it - thanks.

afterburner

10+ Year Member



 
Msg#: 32409 posted 11:27 pm on Dec 13, 2005 (gmt 0)

64.233.179.104

This is google.ru isn't it?

Nikke

10+ Year Member



 
Msg#: 32409 posted 12:49 am on Dec 14, 2005 (gmt 0)

64.233.179.104 has found all but 10 of the pages listed on google.com for my main site. (9,590).

The latest cache is 4 days older than any I can find on google.com, but that cache hasn't been updated for 7 days now.

Googlebot, whatever I said, I didn't mean it. Please come back!

texasville

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 32409 posted 1:02 am on Dec 14, 2005 (gmt 0)

64.233.179.104 has by far the best results for me as far as indexing my site. On site search I have all pages indexed, the ones that had dropped into supplemental are now not - but I have old pages that I long ago let go 404 just to get them out of the index. One has a cache date of September 2, 2004. But otherwise I can live with this.

iblaine

10+ Year Member



 
Msg#: 32409 posted 1:21 am on Dec 14, 2005 (gmt 0)

Results have not changed for my sites. Could be those changes are yet to come. Matt Cutts says it's an update so I will continue to monitor that DC. Supplemental results are gone but I still have tens of thousands of strange extra pages in the index.

reseller

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 32409 posted 7:28 am on Dec 14, 2005 (gmt 0)

Good morning Folks

For the benefit of further discussion, I'm recalling what Matt wrote regarding the test DC.

"Broker Boy, I do expect that data center to eventually go live, but it will take a few months, in all likelihood. That data center (64.233.179.104) recently moved into regular rotation recently, and I wouldn't be surprised if one more data center joined it in the next week or so. After that, I'd expect those two data centers to stay in the rotation (but not spread) until after the holidays. Not sure about that, but that's my best guess."

"Joe, I believe we've instituted some more intuitive results for site: queries within the last few weeks. The test data center will be where most of the progress on 301s/canonicalization takes place."

Wish you all a great day!

Dayo_UK

10+ Year Member



 
Msg#: 32409 posted 7:35 am on Dec 14, 2005 (gmt 0)

Howdy Reseller

Interesting, I missed the comment to Joe on MC blog.

(I kind of think that Big Daddy might really be the real Jagger3 - ie GG talked about Canonical, 301 things and the base for a new index as Jagger3 and this did not happen - but hey whatever - I still think MC/GG get excited sometimes about a change and say it is coming before it actually happens or is ready to happen, we do know that they have pride in working for Google so I guess they get excited too when big things happen - let's hope it comes through).

The test DC is not showing test results for me at the moment.

reseller

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 32409 posted 8:36 am on Dec 14, 2005 (gmt 0)

Good morning Dayo_UK

>>I kind of think that Big Daddy might really be the real Jagger3 - ie GG talked about Canonical, 301 things and the base for a new index as Jagger3 and this did not happen - but hey whatever - I still think MC/GG get excited sometimes about a change and say it is coming before it actually happens or is ready to happen <<

What I like most about GG and Matt is that they are positive and optimistic fellow members. Great.. the two gentlemen get excited sometimes and tell us things before they happen :-)

For example:

GG & Matt! Now I have 50% of my pre-Allegra Google referrals. When do I get the rest of my pre-Allegra traffic back. Thanks a bunch :-)

recar

5+ Year Member



 
Msg#: 32409 posted 9:52 am on Dec 14, 2005 (gmt 0)

Good morning,

can somebody please confirm that the test DC 64.233.179.104 currently uses the same results as www.google.com? I don't see test results anymore.

The results are the same all along. Same cache dates, same number of pages returned. Bad as ever since Jagger for our site.

Am I the only one seeing this?

lee_sufc

10+ Year Member



 
Msg#: 32409 posted 10:07 am on Dec 14, 2005 (gmt 0)

recar - you're not the only one - this has been happening every now and again for the past couple of weeks

oddsod

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 32409 posted 10:11 am on Dec 14, 2005 (gmt 0)

Dayo, if indeed they are building a new index from scratch - to take over from the current mess - that's really big news, and kudos on your observations.

Dayo_UK

10+ Year Member



 
Msg#: 32409 posted 11:34 am on Dec 14, 2005 (gmt 0)

oddsod

The test DC is not showing test data at the moment - when it next goes live I would be interested in your observations on your sites which may have non-www, canonical problems etc.

Cheers

Miop

10+ Year Member



 
Msg#: 32409 posted 2:04 pm on Dec 14, 2005 (gmt 0)

My site is ranking pretty poorly on all DC's but the .uk one where we are doing a lot better.
I can't see if Jagger has got to the .uk site yet or not - I hope so!

arbitrary

5+ Year Member



 
Msg#: 32409 posted 1:25 am on Dec 15, 2005 (gmt 0)

I am not seeing this bot in my logs.

Can we get an idea of how many people have been crawled by this IP?

BillyS

WebmasterWorld Senior Member billys is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 32409 posted 1:40 am on Dec 15, 2005 (gmt 0)

"Can we get an idea of how many people have been crawled by this IP?"

The difference should be:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

versus

Googlebot/2.1 (+http://www.google.com/bot.html)

In the user agent field.

I've got around 1,200 pages crawled by Mozilla/5.0 and another 700 by Googlebot 2.1 so far this month. This is for a website with approximately 1,000 pages.
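To get the same split from your own logs, something like the sketch below would do. It assumes the Apache/Nginx "combined" log format (adjust the regex if your server logs differently), and the function names are made up for illustration. The two user agent strings it distinguishes are the ones quoted above.

```python
import re
from collections import defaultdict

# Assumed "combined" format: ... "GET /path HTTP/1.1" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def classify_googlebot(user_agent):
    """Return 'mozilla', 'classic', or None if the agent is not Googlebot/2.1."""
    if "Googlebot/2.1" not in user_agent:
        return None
    if user_agent.startswith("Mozilla/5.0"):
        return "mozilla"
    return "classic"

def pages_per_bot(lines):
    """Map each Googlebot variant to the set of distinct paths it fetched."""
    pages = defaultdict(set)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # line is not in the assumed format
        bot = classify_googlebot(match.group("ua"))
        if bot:
            pages[bot].add(match.group("path"))
    return pages
```

Calling `pages_per_bot(open("access.log"))` (the file name is only an example) and taking `len()` of each set gives the number of distinct pages crawled per bot. This only looks at the user agent string, so anything faking that string is counted as well.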

Powdork

WebmasterWorld Senior Member powdork is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 32409 posted 5:23 am on Dec 15, 2005 (gmt 0)

I am seeing the test results now matching the rest of the dc's except that the test dc has removed some js redirect doorways from the serps.

edited to make more sense.

ReSiever

5+ Year Member



 
Msg#: 32409 posted 7:17 am on Dec 15, 2005 (gmt 0)

"I am seeing the test results now matching the rest of the dc's"

I'm seeing the same.. I told my colleagues a few days ago, I think it was before the weekend, that the datacenter on this IP shows very old results, with cache dates in early November.

Next to that, I wasn't seeing any URL-only results, but I do see them again now. After doing some testing the 'Big Daddy' now seems to be the same as most of the datacenters again.

Too bad, maybe they know that we know and switched IP's ;)

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved