Google News Archive Forum

This 211 message thread spans 8 pages; this is page 4.
Is Freshbot now Deepbot?
The line is getting drawn ever thinner
trillianjedi




msg:104752
 4:18 pm on May 22, 2003 (gmt 0)

I've seen several postings about this now in the last few days, although this is my first actual experience of it.

I'm being hit very hard by Google's freshbot at the moment, and it's going deep too. At first glance at what is currently going on with the little guys, I had to check and double-check that the IPs were 64.... (they are).

Its behaviour, in terms of hard hitting and depth of crawl (it's going through the entire site), is more like the character of the old deepbot.

In fact, it's identical behaviour to deepbot the last time it crawled this site back in April.

I'm interested in hearing from others who are seeing the same.

TJ

 

Stefan




msg:104842
 11:36 pm on May 23, 2003 (gmt 0)

[quote]the last deepbot crawl data can be used for cross-verification. It's also there as a worst-case back-up, but I don't think we'll need to use it.[/quote]

Good stuff GG. I'm interpreting that as either 64.68 or 216.239 doing a deepcrawl of some sort soon, then showing up in the next update. The light at the end of the tunnel is just around the bend.

GoogleGuy




msg:104843
 11:43 pm on May 23, 2003 (gmt 0)

Oaf357, some people have plotted this out pretty carefully, but we're also cautious enough to keep a back-up that we can switch to as a safety net. That's just good preparation, not something to worry about.

Dolemite




msg:104844
 11:50 pm on May 23, 2003 (gmt 0)

Ok, so now we have got newer data being "gradually" brought in and, wait till the next update "Let me clarify. The next update should bring in more backlinks, data, etc. Those dang prepositions."
Can it be both of these?

This has me confused as well. In my mind the new backlinks, the lack of which is a big problem with the current index, are either brought in "gradually over time" or in the next update. Unless it happens gradually during the next update...?

GoogleGuy,

I hate to be critical, we all appreciate your presence here, but I can't help thinking we'd all be better off if you stated your responses clearly and specifically once rather than saying the same thing obscurely eight times.

So many people take your word as gospel, and I feel like I'm reading the King James version. ;)

jojojo




msg:104845
 11:57 pm on May 23, 2003 (gmt 0)

"Let me clarify. The next update should bring in more backlinks, data, etc. Those dang prepositions. :) "

Exactly... there is no gradual anything.

Basically we skipped an update while Google played around with some stuff. End of story. Better bust out the cases of scotch cuz you'll be drinking for another 2-3 weeks.

Oaf357




msg:104846
 11:57 pm on May 23, 2003 (gmt 0)

Oaf357, some people have plotted this out pretty carefully, but we're also cautious enough to keep a back-up that we can switch to as a safety net. That's just good preparation, not something to worry about.

I totally agree. Maybe my line of work has something to do with it but whenever we have to go to a back up plan, it's not good, at all.

GoogleGuy




msg:104847
 12:01 am on May 24, 2003 (gmt 0)

jojojo, some data will be brought in, such as spam filters. A larger set of data will be brought in with the next index.

parabola




msg:104848
 12:10 am on May 24, 2003 (gmt 0)

2-3 weeks seems pretty optimistic, jojo, since most updates take at least 4 weeks...

Stefan




msg:104849
 12:11 am on May 24, 2003 (gmt 0)

My take on it

Right now, for the SERPs, what you see is basically what you get. The algo will have other parameters introduced before the next update that will alter things slightly. A true deepcrawl will begin soon and replace the ancient data currently in the Google database during the next update, although some freshbot changes can be expected sooner. Normalcy, such as it is, is anywhere from 3 to 7 weeks away. A sudden appearance of missing SERPs, pages, and PR isn't going to happen in the next few days, or probably for much longer, but hope springs eternal.

dvduval




msg:104850
 12:12 am on May 24, 2003 (gmt 0)

Other than fewer backlinks showing and the data being spread across the 9 datacenters, what makes this latest update different than previous updates?

Also, between now and the next regular indexing, what might be different than the status quo?

steveb




msg:104851
 12:22 am on May 24, 2003 (gmt 0)

"I would expect at least another update of the form where the crawl/index cycle finishes and then data centers are updated in the traditional dance."

Being kneedeep in trash makes me feel like being non-cryptic, so...

The trash results will stay till about July 1st, at which point an update based on ongoing freshcrawl data will occur.

Does this mean that Google is simply incapable of deleting uk.co URLs until July?

Switching to a new system is one thing.
Accepting horribly degraded results for two months is one thing.
Basing results on a deepcrawl from months ago is one thing.

But Google Guy, for heavens sake why can't the largest, most important search engine on the planet delete/ignore a TLD that no longer exists?

((It is one thing for Google to rank gobs of spam ahead of you. It's pretty humiliating for Google to rank a site that can't possibly exist ahead of you.))

steveb




msg:104852
 12:27 am on May 24, 2003 (gmt 0)

"Other than fewer backlinks showing and the data being spread across the 9 datacenters, what makes this latest update different than previous updates?"

LOL, well for starters most data wasn't updated, it was rolled back.

The results just changed, they weren't updated in any sense of that word. It is like we were all watching The Empire Strikes Back when suddenly the theater owner switched us to the middle of Star Wars... while promising Return of the Jedi will appear at some point.

jojojo




msg:104853
 12:29 am on May 24, 2003 (gmt 0)

"Hate to be so persistent but, is this newer data being picked up now or is it going to be picked up sometime before the next update?"

Now it's not. No it won't. This update is over. Next one will start 10th of July or so. This update was not really an update - if you consider updating with the data from the previous month.

Nothing new has changed or happened to the process yet.

GoogleGuy




msg:104854
 12:30 am on May 24, 2003 (gmt 0)

steveb, I'm checking out your dominic report now--I'll investigate the uk.co tonight.

dvduval




msg:104855
 12:33 am on May 24, 2003 (gmt 0)

The results just changed, they weren't updated in any sense of that word. It is like we were all watching The Empire Strikes Back when suddenly the theater owner switched us to the middle of Star Wars... while promising Return of the Jedi will appear at some point.

That's what I meant to say, but in a subtle way. You could also say that Star Wars 7,8 and 9 have been announced, but first we are going to start over with Star Wars!

Let me try again:
What, if anything, is POSITIVE about the latest update?
And, what POSITIVE things might we expect that are different from the status quo between now and the next update?

TheAutarch




msg:104856
 12:40 am on May 24, 2003 (gmt 0)

Now it's not. No it won't. This update is over. Next one will start 10th of July or so. This update was not really an update - if you consider updating with the data from the previous month.

Nothing new has changed or happened to the process yet.

I know there is not going to be much change in the way of updated SERPS.

Maybe I should have been more clear. When I said new data, I meant a crawl and was just trying to find out if there was going to be an official deep crawl of some sort between now and July. GoogleGuy said something about using newer data than the April crawl and that would imply picking up some between now and the next update. What I initially was trying to find out is if it's really already started, which is probably very redundant and annoying since that's what this topic is all about.

Stefan




msg:104857
 12:46 am on May 24, 2003 (gmt 0)

TheAutarch, there was a particularly busy 64.68 freshbot visit on many sites in the last couple of days. GG said something to the effect that "he was glad people had noticed". There has been speculation that the deepcrawl will shift from a monthly cycle to an ongoing process. No one knows except those at Google. Either way, we can expect to have our sites deepcrawled over the next while and hope that it will show up in the serps eventually.

mrguy




msg:104858
 1:02 am on May 24, 2003 (gmt 0)

Ok,

I went away for awhile and came back and the same thing is being rehashed over and over and over.

As I see it based on what GG said.

The May update is over. Accept it and move on.

With the exceptions of some filters and a little data here and there, what you see is what you get until the next update around the second week of June. At that time all the freshest data will be brought in. From the latest crawl or past two, who knows, or does it really matter. A fresh crawl will do just fine.

This has been done to lead up to what I perceive as their rolling update. After maybe one or two more, the dance as we know it will go away and we will in essence be dancing all month long as the freshies act as deepbots.

GG alluded to as much with his statements above.

So, see you at the dance :)

TheAutarch




msg:104859
 1:12 am on May 24, 2003 (gmt 0)

Stefan, yes I remember him saying that he was glad people noticed. I asked if that meant the deepcrawl was happening now. I guess he just can't say but I'm with you, thinking this will just be an ongoing process.

carfac




msg:104860
 4:38 am on May 24, 2003 (gmt 0)

>>> With the exceptions of some filters and a little data here and there, what you see is what you get until the next update around the second week of June. At that time all the freshest data will be brought in. From the latest crawl or past two, who knows, or does it really matter. A fresh crawl will do just fine.

I think Mrguy has it there. The only other thing I see is that GG liked it that we noticed FB was DeepCrawling. I am hoping that all the FB data goes in like FB data over the next couple weeks with all the other "adjustments".

dave

carfac




msg:104861
 6:35 pm on May 24, 2003 (gmt 0)

FWIW...

A new site, I just put up for a friend... Linked to it from my PR 7 site two days ago. My index page is now fresh in cache... and a new page (my friend's) is in the main directory.

Probably the result of FB, obviously, but it does show that new data IS going into the Index!

Happy LONG weekend to all!

dave

carfac




msg:104862
 6:37 pm on May 24, 2003 (gmt 0)

OK, VERY weird.

I just went back and looked again- SAME keywords (and they are VERY obscure, and very site-specific- this guy repairs OLD widgets, and there is a very specific term ("widgetmister conservation") I used)...

anyway, both his site and my fresh site are not showing...

Sorry to use more bandwidth here, just thought it was strange to be in and out of main index....

dave

uber_boy




msg:104863
 6:09 pm on May 25, 2003 (gmt 0)

I'm not sure why you think Google is currently using February data, parabola, but my investigations certainly don't support this hypothesis. Granted, page ranks fluctuated for a time and then reverted to their levels from April. However, the actual pages in the index, at least as far as I can tell, are not the same as they were at any point in the past.

g1smd




msg:104864
 6:37 pm on May 25, 2003 (gmt 0)

For my site, Google was using data from late March for a very long time, but nearly two weeks ago, they suddenly swapped (and -sj was first I think) to data they collected on or about 2003-05-08. This then spread over the next week to all datacentres.

merlin30




msg:104865
 7:29 pm on May 25, 2003 (gmt 0)

It isn't the pages that are indexed/cached that are old. It is the back link structure that is old. As backlinks, and thus PR, have a significant effect on ranking, many webmasters are seeing rankings as they were in Feb/March.

Napoleon




msg:104866
 7:41 pm on May 25, 2003 (gmt 0)

Merlin30 has it. The key 'data' that is actually missing is the data that is used to calculate ranking and PR. That is generally quite old in what we are currently seeing.

It sounds like that is going to be brought in for around mid-June.

mauijaws




msg:104867
 7:53 pm on May 25, 2003 (gmt 0)

In my case Google definitely reversed the caches to the February update. I know because we did a redesign in April which was picked up by deepbot from the logs.

Previously ranked pages are still listed in the SERPs, but do not have any cached info in them despite being deepbotted.
Some pages reflect newer changes (freshbot activity I assume), but it still looks like a mess.

version2




msg:104868
 7:54 pm on May 25, 2003 (gmt 0)

Right now the cache is showing April's data for me.

parabola




msg:104869
 9:29 pm on May 25, 2003 (gmt 0)

Uber boy,

Errrm, no.

The data that is old is the backlink data, which is an integral part of ranking sites. Most believe the backlinks showing are from the Feb crawl. GoogleGuy has even admitted that they have not used recent deepcrawl data for this update. This is NOT a point of debate. Cache means nothing and new pages may appear as a result of freshbot.

We will see new data come in during the next update.

parabola




msg:104870
 9:34 pm on May 25, 2003 (gmt 0)

>>TheAutarch, the last deepbot crawl data can be used for cross-verification. It's also there as a safety-net back-up, but I don't think we'll need to use it. <<

This is what GoogleGuy said when asked if April deepcrawl data will be scrapped.

The backlinks data is old.

WeirdCode




msg:104871
 10:12 pm on May 25, 2003 (gmt 0)

Deepbot checks robots.txt:
crawl27.googlebot.com - - [25/May/2003:14:27:02] "GET /robots.txt

The bots are allowed in, and so it goes:
crawl27.googlebot.com - - [25/May/2003:14:27:02] "GET /first directory/some file.php

But it seems that Deepbot isn't alone:
crawler11.googlebot.com - - [25/May/2003:14:32:22] "GET /second directory/some other file.php

Freshbot didn't even bother to check robots.txt, it just continues where Deepbot left off:
crawler12.googlebot.com - - [25/May/2003:14:53:51] "GET /first directory/some other file.php

Ain't they cute? It's almost as if they were one and the same now.
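
For anyone who wants to run the same check on their own logs, here is a minimal sketch of the comparison above. It assumes the naming convention implied by the excerpts, where "crawlNN" hosts are read as Deepbot and "crawlerNN" hosts as Freshbot. That mapping is only this thread's working assumption, not anything Google has confirmed, and the function names (classify_host, parse_line, summarize) are made up for illustration. The sample lines are the four excerpts quoted above, truncated exactly as posted.

```python
import re
from collections import Counter

# Assumed mapping, taken from how the log excerpts above are read:
# "crawlNN" hosts = Deepbot, "crawlerNN" hosts = Freshbot. Not official.
HOST_PATTERN = re.compile(r"^(crawler|crawl)\d+\.googlebot\.com$")
LABELS = {"crawl": "deepbot", "crawler": "freshbot"}

# Matches: host ident user [timestamp] "METHOD rest-of-line
LINE_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>[A-Z]+) (?P<rest>.*)$'
)

SAMPLE_LINES = [
    'crawl27.googlebot.com - - [25/May/2003:14:27:02] "GET /robots.txt',
    'crawl27.googlebot.com - - [25/May/2003:14:27:02] "GET /first directory/some file.php',
    'crawler11.googlebot.com - - [25/May/2003:14:32:22] "GET /second directory/some other file.php',
    'crawler12.googlebot.com - - [25/May/2003:14:53:51] "GET /first directory/some other file.php',
]


def classify_host(host):
    """Label a hostname as 'deepbot', 'freshbot', or 'other' (assumed mapping)."""
    match = HOST_PATTERN.match(host.lower())
    return LABELS[match.group(1)] if match else "other"


def parse_line(line):
    """Return (host, label, path) for a log line, or None if it does not parse."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    # Full log lines end with ' HTTP/1.x" status bytes'; the excerpts do not.
    path = match.group("rest").split('"')[0]
    path = re.sub(r" HTTP/\d(?:\.\d)?$", "", path)
    host = match.group("host")
    return host, classify_host(host), path


def summarize(lines):
    """Count requests per bot label and collect hosts that fetched /robots.txt."""
    counts = Counter()
    fetched_robots = set()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        host, label, path = parsed
        counts[label] += 1
        if path == "/robots.txt":
            fetched_robots.add(host)
    return counts, fetched_robots


if __name__ == "__main__":
    counts, fetched_robots = summarize(SAMPLE_LINES)
    print(dict(counts))
    print(sorted(fetched_robots))
```

If the two labels really are one bot now, hostname prefixes alone won't tell you much, so treat the counts as a way to spot the pattern rather than proof of which crawler visited.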

BigDave




msg:104872
 2:17 am on May 26, 2003 (gmt 0)

Both crawl and crawler are now coming from freshbot IPs. You will drive yourself crazy trying to figure out what is going on now based on past experience.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved