Has google ever crawled prior to updating?

nadsab

4:58 am on Apr 8, 2003 (gmt 0)

10+ Year Member



Has Google ever deep crawled prior to updating, with data from the previous crawl? Just curious.

Jon_King

3:58 am on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That would be the cart before the horse.

teeceo

4:07 am on Apr 9, 2003 (gmt 0)

10+ Year Member



You mean the "chart" before the horse:).

teeceo.

metablue

3:16 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



I don't see why this wouldn't work. All the new data is out there. Deep Crawl just follows links to get to the whole web, right?

Gibble

3:23 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would be happy if it did, I missed the last deepcrawl due to a foobared robots.txt :(

Brett_Tabke

3:23 pm on Apr 9, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Yes, they did - several times last year, and it was routine in 2001.

Jon_King

1:14 am on Apr 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks again, Brett, for the correct info!

wlswat

2:55 am on Apr 10, 2003 (gmt 0)

10+ Year Member



Speaking of foobared robots.txt files...
Is this OK to let everything through?

User-agent: *
Disallow:
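(Yes - an empty Disallow value blocks nothing, so that file allows every compliant crawler everywhere. One way to sanity-check a robots.txt like this is Python's standard urllib.robotparser; the example.com URL below is purely illustrative.)

```python
# Check that "User-agent: *" with an empty Disallow permits all crawlers.
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow:",
]

parser = RobotFileParser()
parser.parse(rules)

# An empty Disallow directive blocks nothing, so any path is fetchable.
print(parser.can_fetch("Googlebot", "http://example.com/any/page.html"))  # prints True
```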

chiyo

2:59 am on Apr 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I really don't see much relation between deep crawls and monthly updates - they can come before or after, and more than once a month, though you usually see a deep crawl starting just before or just after the start of the monthly update. And I'm increasingly seeing new pages freshbotted in the few days before an update showing up as permanent items for the next month on our sites.

nadsab

4:40 am on Apr 10, 2003 (gmt 0)

10+ Year Member



I just got deep crawled on four of my sites. It's the highest Google bandwidth I've ever seen, but I also made lots of changes.

NotePad

5:18 am on Apr 10, 2003 (gmt 0)



Are you sure it was the deep crawler and not the fresh bot?
From what I understand, the deep crawl happens right after an update.

I wish GoogleGuy would help out more often on these kinds of questions.

Wishful thinking on my part.

chiyo

6:44 am on Apr 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi NotePad. Yes, we don't take note of IPs or whether they have been defined beforehand as "deep" or "fresh" crawls. I haven't looked too deeply into it, as I haven't had time, but there seems to be a lot of confusion over which is which by IP - and Google does change the IPs a bit, yes?

Basically, for our purposes we define a "deep crawl" as when Google visits 90 to 95% of our pages in a day, and a "fresh bot" visit as when it crawls only 10 to 15% - usually our highest-PR and most popular pages. Sorry if I was misleading.
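(chiyo's working definition can be written down as a tiny log-analysis helper. The thresholds are just the rough percentages from the post above - nothing official from Google.)

```python
def classify_crawl(pages_fetched, total_pages):
    """Label a day's Googlebot activity using chiyo's rough rule of thumb:
    ~90%+ of a site's pages in a day looks like a deep crawl, ~15% or less
    looks like the fresh bot. These thresholds are informal, not Google's."""
    fraction = pages_fetched / total_pages
    if fraction >= 0.90:
        return "deep crawl"
    if fraction <= 0.15:
        return "fresh bot"
    return "unclear"

print(classify_crawl(930, 1000))  # deep crawl
print(classify_crawl(120, 1000))  # fresh bot
```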

tombot

9:11 am on Apr 10, 2003 (gmt 0)

10+ Year Member



Boy, how I would love it if the deep crawl now came before the update. Otherwise I'm looking at another month-plus before I get any PR or significant traffic from Google.

Brett_Tabke

9:34 am on Apr 10, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Those deep crawls that came before an update did not produce data for the immediate update. They fed the update after that.

It was very good reasoning on Google's part to spider before an update to acquire data for the update after that one. With the crawl falling just before the immediate update, no webmasters had time to re-engineer their pages after seeing how the algo had changed ;-) So they were always 60 days removed from any algo tweaking based on the previous update. Tricky - follow that?

bcc1234

9:40 am on Apr 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think they need all the computing power they can get to calculate PR, so it would not be reasonable for them to leave too many boxes free for crawling.

I mean, the more iterations you can do, the more accurate the results you'd get, right?
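(That intuition matches how PageRank is computed: power iteration, where every extra pass over the link graph refines the rank vector toward its fixed point. A minimal sketch on a made-up three-page graph - the 0.85 damping factor is the value from the original PageRank paper, everything else here is illustrative.)

```python
# Toy power-iteration PageRank: each pass redistributes rank along links,
# so more iterations means a closer approach to the true rank vector.
def pagerank(links, iterations, d=0.85):
    n = len(links)
    ranks = {page: 1.0 / n for page in links}
    for _ in range(iterations):
        new = {}
        for page in links:
            # Rank flowing in from every page that links here.
            incoming = sum(ranks[src] / len(outs)
                           for src, outs in links.items() if page in outs)
            new[page] = (1 - d) / n + d * incoming
        ranks = new
    return ranks

# Made-up three-page link graph: a links to b and c, b to c, c back to a.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(toy, 1))   # rough estimate after one pass
print(pagerank(toy, 50))  # essentially converged
```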

tombot

11:46 am on Apr 10, 2003 (gmt 0)

10+ Year Member



So it looks like I'm hurting for another month+ any way you look at it. Oh well, patience is a virtue and all that jazz.

nadsab

1:21 pm on Apr 10, 2003 (gmt 0)

10+ Year Member



NotePad,

Yes, I'm pretty sure I was deep crawled, because it hit over 30 of my pages, and the bandwidth is the highest I have ever had from Google for that site - more than twice as high as any other. I did make lots of changes to my pages, though.

nadsab

1:25 pm on Apr 10, 2003 (gmt 0)

10+ Year Member



Brett_Tabke,

Thanks for that info - it's why I originally asked this question. Actually, I made a mistake: I meant to title this thread "Does google ever deep crawl before the upcoming update" - as opposed to the data coming from the previous deep crawl.

I will still hope that my deep-crawled pages get indexed this time, though. Maybe deep crawls take place over several weeks? Intermittently? That's what I was looking for with my question, but I guess Google will not let us in on their secrets (or keeps changing them). :)

Jon_King

4:55 pm on Apr 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Couldn't the update be nearly real-time if Google did something like the SETI@home distributed model, harnessing the wasted processing cycles of all the surfers' computers combined?

MetropolisRobot

12:41 am on Apr 11, 2003 (gmt 0)

10+ Year Member



Oh yeah, I can imagine some people then spending hours working out how to alter the data sent back by the google-seti... now that would be entertaining...

Jon_King

1:15 am on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Haaaaa. It was good for a laugh anyway.

rfgdxm1

2:05 am on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>It was very good reasoning on Googles part to spider before the update to acquire data to produce the update after that. With the crawl just before the immediate update, no webmasters had time to re-engineer their pages after seeing the algo switched update ;-) So they were always 60 days removed from algo tweaking based on the previous update. Tricky - follow that?

There is a major flaw in this logic: it assumes that the sort of webmasters who hang around this forum are somehow typical. The webmasters of 99.9+% of all pages on the Net never tweak their HTML based on a Google update. Given that the percentage of webmasters who do is statistically insignificant, I tend to believe that the timing of Google's deep crawls is based on technical considerations, not on sneaky webmasters.

chiyo

4:58 am on Apr 11, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would tend to agree with Brett, rfg. WebmasterWorld has a large enough readership of SEO types to be significant, but more importantly there are also many, many others who don't read or know of WebmasterWorld who tweak their pages immediately after an update starts, after doing a quick reverse-engineering job, in the hope of those changes being picked up for the next update during the deep crawl immediately after the previous one. It always made good sense to me, and I wondered why Google did not do this earlier. Really, only the most professional SEO guys could make a real impact this way, given the time they had to work out the new algo and change their pages. But that group is probably one of Google's main irritants!