| 8:39 pm on May 18, 2003 (gmt 0)|
"is simply that they haven't got the time or space to build another Google that stays offline."
They want back two months because the March deepcrawl failed, but the February deepcrawl was also tarnished by dmoz/rdf/google directory incompletion. I suspect they have wanted to make this change since the January Superflux, and finally ran out of patience and used the best of the deepcrawls from this year.
It seems to me that a better idea would be to focus on the deepcrawl and get the first truly updated index since October. Then do the mad scientist bit after that.
| 8:48 pm on May 18, 2003 (gmt 0)|
I think you have a fundamental misunderstanding of DNS. This has been explained in numerous other posts, but I will post again in case any one new comes across this thread.
www.google.com is a logical, not a physical, identity. When a user types www.google.com into the address bar, IE (or whichever browser they use) makes a DNS query. The response to that query will be any one of the nine data centres, including -sj. Currently 5 of the 9 data centres are showing the new index, so effectively the user has a 5/9 chance of seeing one of the new indexes.
This has been happening since -sj first showed the new index, and as more data centres assume the new index, users will see it more frequently.
If you put a sniffer (or similar equipment) on your network connection and go to www.google.com you will see this mechanism in action. If you keep doing repeated searches on the same term you will see a qualitative demonstration of this mechanism. For surfers and webmasters alike these changes are very real and not just 'test' results.
www2 and www3 are currently 'hard wired' to point to -sj so behave differently to www.
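The odds described above can be sketched as a toy simulation. The datacenter labels follow this thread's -sj/-fi shorthand, but the full list, and which five carried the new index at this moment, are my assumptions for illustration, not a record of the actual rollout:

```python
import random

# Hypothetical labels in the thread's shorthand; which five serve the
# "new" index is an assumption for illustration only.
DATACENTERS = ["-sj", "-fi", "-va", "-dc", "-ab", "-in", "-zu", "-cw", "-ex"]
NEW_INDEX = set(DATACENTERS[:5])  # 5 of the 9 show the new index

def resolve_www_google_com():
    """Model the DNS answer: one datacenter chosen per query."""
    return random.choice(DATACENTERS)

# Over many simulated queries, roughly 5/9 of users land on the new index.
trials = 90_000
hits = sum(resolve_www_google_com() in NEW_INDEX for _ in range(trials))
print(f"fraction seeing the new index: {hits / trials:.2f}")  # close to 5/9
```

As more centres flip to the new index, growing `NEW_INDEX` in the sketch raises that fraction toward 1, which is exactly the rollout behaviour being described.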
| 9:02 pm on May 18, 2003 (gmt 0)|
Very nice description merlin30.
| 9:02 pm on May 18, 2003 (gmt 0)|
Actually the 5 data centers that show the "new" index are not all showing the same index. Each of the datacenters showing 617,000 backlinks to Yahoo has a different index.
| 9:09 pm on May 18, 2003 (gmt 0)|
Actually Merlin there are some errors in your Google DNS assumption.
First of all, Google's DNS servers only return *one* IP address to an 'A' record request. This has a TTL (time-to-live) of 15 minutes, so multiple requests for 'www.google.com' will return the *same* IP address for 15 minutes because of DNS caching on the user side. The IP address returned will be a random one representing one of the nine datacenters' public IP addresses. This method is different from, say, Yahoo's, which gives out *multiple* IP addresses for a given query and allows the end user, or a DNS cache on the user's network, to choose one of the returned IPs at random.
When you then access the datacenter associated with that IP address you may get differing results with each request, because each datacenter has load balancing going on (which has nothing to do with DNS). You will probably access a different machine at that datacenter with each request.
P.S. There are now 7 of the 9 datacenters showing the "new" index.
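That single-A-record-plus-TTL behaviour can be sketched as a toy client-side resolver cache. The addresses below are documentation-range placeholders, not Google's real IPs:

```python
import random
import time

# Placeholder addresses (RFC 5737 documentation range), not real Google IPs.
DATACENTER_IPS = [f"198.51.100.{i}" for i in range(1, 10)]
TTL = 15 * 60  # the 15-minute time-to-live on the A record

class DnsCache:
    """Client-side cache: one random datacenter IP, reused until the TTL expires."""

    def __init__(self):
        self._answer = None
        self._expires = 0.0

    def lookup(self, now=None):
        now = time.time() if now is None else now
        if self._answer is None or now >= self._expires:
            # Cache miss or expiry: the server hands back a single random
            # A record with a fresh 15-minute TTL.
            self._answer = random.choice(DATACENTER_IPS)
            self._expires = now + TTL
        return self._answer

cache = DnsCache()
ip = cache.lookup(now=0)
print(cache.lookup(now=600) == ip)  # within the TTL: same IP every time
```

So within any 15-minute window a user is pinned to one datacenter by their own cache; the variation between requests inside that window comes from the datacenter's internal load balancing, not from DNS.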
| 9:31 pm on May 18, 2003 (gmt 0)|
Thanks for your deeper description; I recommend all webmasters learn DNS. Your extra information helps them. I deliberately kept mine simple; all I was trying to highlight was the main point that www.google.com eventually resolves to one of the data centres and not something different.
And as you say, 7 data centres now have the new index - 7/9 means that most of the time we will now see the new index.
Contrary to what many posters argue (even experienced ones) the update has most definitely occurred!
| 9:33 pm on May 18, 2003 (gmt 0)|
Actually the update hasn't occurred ;)
The index you see now has old data (February or so) and freshbot stuff; backlinks and deep crawl information from April still need to be folded in.
Once all the datacenters are (somewhat) synchronized you'll see this new information being applied.
| 9:41 pm on May 18, 2003 (gmt 0)|
Ah semantics, Critter!
Agreed the new index is based on an old link structure and requires new data to be added. However, the index that users will see (now mostly) is different to the one they were seeing about 3 weeks ago.
Perhaps not an update, but certainly a significantly changed index (not merely everflux) giving an altered traffic pattern. For some webmasters, a benign change; for others, disastrous.
For me? Somewhere in between.
| 9:43 pm on May 18, 2003 (gmt 0)|
How are you so sure they are going to integrate the deepcrawl info from April once all the data centers are synchronized? What if the April deepcrawl totally failed and they do not want to use the data from it?
| 9:51 pm on May 18, 2003 (gmt 0)|
Somewhere (around post 115,375,389, I believe) GoogleGuy confirmed that deepcrawl data from April is not (all) in this index and would be included over the next couple of weeks after the datacenters settle.
Same with backlinks and spam filters.
| 9:54 pm on May 18, 2003 (gmt 0)|
Fair point. GG has mentioned that this data will be rolled in over time. But until that happens everyone really should consider the reality of what they see today. For me, that means finding other sources of traffic: become search engine independent.
I know that it is easier said than done. I am trying to do this and it is a slow process. However, events such as this Google update highlight the reality of the situation: become search engine independent or eventually forget about doing business online.
| 10:19 pm on May 18, 2003 (gmt 0)|
>> I think you have a fundamental misunderstanding of DNS. <<
I fully understood it, even before it was repeated 50 times in the last three weeks.
>> The response to that query will be any one of the nine data centres, including -sj. <<
So for the most part, you'll rarely see any of those results.
>> Currently 5 of the 9 data centres are showing the new index. So effectively the user a 5/9 chance of seeing one of the new indexes. <<
Ah, now you're talking. That is the one piece of information that I did not know. I assumed only -sj and -fi had it. I lost that information in the other 3,000 posts made in the last couple of weeks.
However, a point I made a couple of days ago, and which I repeated again today, [webmasterworld.com...] and which no-one has yet commented on or replied to, is that from what I can see the data these two are serving up is now a lot newer than it was a week ago.
[edited by: g1smd at 10:23 pm (utc) on May 18, 2003]
| 10:21 pm on May 18, 2003 (gmt 0)|
It's actually 7 of the 9 data centers right now, though. At least the last time I checked.
| 10:33 pm on May 18, 2003 (gmt 0)|
Truth be told, it's my pride.
I work very hard with one thing in mind: wipe out my competition. My site has come out of nowhere and taken a lot of first-page and #1 rankings for my products.
I'm not going to lie, I enjoy the hell out of beating my competition. I'm angry because my work, which I put in every night for 3 months straight till 2:00 am (getting up at 7:30 to go to work), is sitting in Google's index, bouncing around getting exposure for 2 minutes here, 20 minutes there, etc.
I really get a kick out of this stuff. In addition, I use my ranking to leverage my product manufacturers into better pricing.
However, Google at this point represents 10% of my income, so by no means am I upset about money. But the potential for Google to become 25% to 50% is there. Also the leverage thing with my manufacturers.
For the most part I enjoy turning the screws on my competition. Call me demented, but I like it : )
As far as advertising etc., our company spends a lot on magazine ads, a lot. Google is free, the internet is free. We provide good websites for Google. Google charges for AdWords (not free). If anybody is doing anybody a favor, it's the webmaster, who goes by the rules and provides a RELEVANT website.
Search engines are a dime a dozen; there are tons of them out there. They are simply maps to the internet. They are not entities one should feel blessed to be listed in. They are maps: if you are a good map, then you map it out well.
I feel no obligation to Google, I feel no love for Google. I feel Google is simply mapping my website for GOOGLE'S CUSTOMERS.
There are always two sides to a coin, a flip side to every situation. Not many here realize what I'm saying...
| 10:34 pm on May 18, 2003 (gmt 0)|
Didn't mean to patronise you - apologies if I did. I'm just trying to make the point that for most of the time (a 7/9 chance) people are going to be seeing this new index. The 7 data centres showing the new index do all vary, but they vary less among each other than from the index they are replacing; I'd describe it as different shades of green. And as Critter pointed out, even when you hit a data centre you are likely to hit different machines within it, so you will see different stages of the index change across data centres.
| 10:39 pm on May 18, 2003 (gmt 0)|
When I am looking for a book I go to Amazon .....
| 10:41 pm on May 18, 2003 (gmt 0)|
merlin30 - When I look for a relevant site (currently) I go to AlltheWeb.
| 10:45 pm on May 18, 2003 (gmt 0)|
google is broken. Kiss your tarffic and money good bye :(
No more IPO I guess either.
| 10:48 pm on May 18, 2003 (gmt 0)|
|google is broken. Kiss your tarffic and money good bye :( |
I agree, 100%
| 10:50 pm on May 18, 2003 (gmt 0)|
I never had much tarffic anyway...
(sorry, couldn't resist)
| 10:52 pm on May 18, 2003 (gmt 0)|
Think I know what you're saying.
Alltheweb is kind to my site. So are Teoma, Ink and Altavista. Google has been kind, but is a little harsher now.
But tomorrow they could all be complete BA$T&RD$ - and I still want traffic.
And I will still go to Amazon for my books.......
| 11:04 pm on May 18, 2003 (gmt 0)|
I'm amazed at how much folks obsess about what this or that datacenter is doing hour by hour during this process. Seems like you're confusing the roadmap with the territory.
My main content portal is flipping between PR5 and PR6, which tells me that my 'toolbar PR' is probably going to drop to 5 when this thing is done. My rankings are down on a bunch of 'major' search terms. Is the sky falling, though? I don't think so.
Referrals from Google are way up. The number of search terms in my logs is way up. The rate of conversions from the traffic they send is up too. At the end of the day, the only thing that matters is whether I get more targeted traffic to my sites.
Not that Google is the primary source of traffic for any site I operate - if you're relying on search engines to deliver traffic, you need to get a better marketing strategy.
| 11:04 pm on May 18, 2003 (gmt 0)|
My traffic from Google over the last two months had increased greatly due to some hard work adding content and getting solid themed inbound links. Since the new index started showing more often on www, our traffic from Google is back to what it was in February.
All I can do is wait it out like everybody else, because the results that are spreading across the datacenters do not reflect the last two months' work.
Yes, it shows fresh tags and pages that were in the index prior to March, but pages that came in from the March and April deepcrawls are gone, as well as the links.
Whether Google was broken or this was a planned rollout of something new (which is what it sounds like from GG's posts), there are sure to be more anxious moments ahead.
| 11:07 pm on May 18, 2003 (gmt 0)|
Hi there guys,
Going back to the DNS conversation, I am still kinda unsure of where Google really gets its results from.
So, when I search for a keyword like widgets, Google randomly outputs the results from one of the 9 datacenters?
| 11:16 pm on May 18, 2003 (gmt 0)|
As there are 9 data centres, you have a 1/9 chance of getting results from any specific one. Whichever data centre you do hit, that is the index you will see. When all the data centres have broadly synchronized their data it will look like you are hitting the same data centre every time. Until then, you will observe differing results. Although the process isn't strictly random, to all intents and purposes it appears so.
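The 1/9-per-centre claim is easy to check with a toy tally (datacenter names here are placeholders): each centre ends up serving roughly an equal share of queries, which is why, once their indexes match, the variation disappears from the user's point of view.

```python
import random
from collections import Counter

# Placeholder names for the 9 datacenters.
centers = [f"dc{i}" for i in range(1, 10)]

# 90,000 simulated queries, each landing on one random datacenter.
tally = Counter(random.choice(centers) for _ in range(90_000))

for name, count in sorted(tally.items()):
    print(name, count)  # each count close to 90,000 / 9 = 10,000
```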
| 11:17 pm on May 18, 2003 (gmt 0)|
darkroom, to keep it simple, yes, that is exactly what Google does.
Historically at some point in time the update ends and all data centers usually end up the same, so people don't see different results on every search.
This update is different, and different data centers are showing different indices at various stages of something new, so right now we are playing Russian roulette!
The depth of "panic" around here at the moment is related to the uncertainty of the final result. When the gun is pointed in my direction and the trigger pulled will there be a bullet in the chamber?
[edited by: percentages at 11:19 pm (utc) on May 18, 2003]
| 11:18 pm on May 18, 2003 (gmt 0)|
| 11:21 pm on May 18, 2003 (gmt 0)|
Merlin30 "As there are 9 data centres, you have a 1/9 chance of getting results from any specific one. Whichever data centre you do hit, that is the index you will see. When all the data centres have broadly synchronized their data it will look like you are hitting the same data centre every time. Until then, you will observe differing results. "
This part makes sense.
"Although the process isn't strictly random, to all intents and purposes it appears so."
How do you know this process is not random? Or how is it not random?
| 11:21 pm on May 18, 2003 (gmt 0)|
Is the chance of hitting any data centre equal or does it depend on geographic location?
Also, can Google weight the data centers so some are used more often, or is the only control they have over a data centre on/off (by redirecting it to another)?
| 11:27 pm on May 18, 2003 (gmt 0)|
I believe the data center you get is based upon load balancing. That is a piece of hardware that attempts to distribute the load evenly across numerous servers. In addition, if one server goes down there is a failover device that automatically sends the request to a different machine.
I also believe Google has a feature which keeps you on the same server for 15 minutes once connected....although I have either seen results to the contrary or 15 minutes has passed faster than I thought!
| 11:27 pm on May 18, 2003 (gmt 0)|
When you hit a data centre, you will then hit a particular server in that data centre. There are likely hundreds, if not thousands, of servers in each data centre. To spread the traffic evenly among the servers a load balancing algorithm will be employed. The load balancing algorithm is unlikely to be a random number generator - more likely it will assess the current load of each server and assign work appropriately. That is why I say it isn't strictly random - but it likely appears so.
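That assess-the-load idea can be sketched as a minimal "least connections" balancer. The server names are made up, and Google's actual balancing hardware and policy are unknown here; this just shows how a deterministic policy still spreads requests evenly enough to look random from outside:

```python
class LeastLoadBalancer:
    """Toy 'least connections' policy: each request goes to whichever
    server currently has the fewest active requests."""

    def __init__(self, servers):
        self.load = {name: 0 for name in servers}

    def assign(self):
        # Pick the least-loaded server (ties broken by insertion order).
        server = min(self.load, key=self.load.get)
        self.load[server] += 1
        return server

    def release(self, server):
        # Called when a request completes.
        self.load[server] -= 1

lb = LeastLoadBalancer(["srv-a", "srv-b", "srv-c"])
print([lb.assign() for _ in range(6)])  # requests spread evenly across servers
```

A user watching only which server answers sees a near-uniform spread, indistinguishable in practice from a random pick, which matches the "not strictly random, but appears so" description above.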