
Google News Archive Forum

Google now starting to include April deep crawl data
Page with error from April

 9:31 pm on May 22, 2003 (gmt 0)

I was just looking at the cache of some of my sites to see if Freshbot was putting things into the main index, when I happened upon one of my pages where the Google Cache is showing an error in my navigation bar that was only there for a short time in April during the deep crawl.

I suppose that this might have been a freshbot hit that included it in -sj, but it is a page that doesn't get freshbotted.

Has anyone else noticed any April data showing up that can be confirmed as coming from the time of the deep crawl?



 9:49 pm on May 22, 2003 (gmt 0)

I've been checking regularly, but I haven't seen any yet. However, now that freshbot is out deep crawling, I would think we might start seeing more April content showing up.


 9:50 pm on May 22, 2003 (gmt 0)

I run 10 sites and I see no April data on mine. That of course means nothing. I would be careful about saying the freshbot does not hit a page, though. The freshbot just slammed one of my sites, indexing about 50 out of 500 pages. It ran into a string of 404s because it can't handle session variables and then went away. It even tried to follow this link: /cfdocs/CFML_Reference/contents.htm. That is part of an error message for ColdFusion (that one had me stumped for a while).
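For anyone who wants to check their own logs for the kind of 404 string described above, here is a rough sketch. It assumes an Apache-style combined access log and that the crawler identifies itself with "Googlebot" in the user-agent string; the function name `googlebot_status_counts` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_status_counts(lines):
    """Count HTTP status codes for Googlebot requests and collect 404 paths.

    Hypothetical helper: lines that do not match the assumed log format,
    or whose user agent does not mention Googlebot, are skipped.
    """
    counts = Counter()
    not_found = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "googlebot" not in match.group("agent").lower():
            continue
        status = int(match.group("status"))
        counts[status] += 1
        if status == 404:
            not_found.append(match.group("path"))
    return counts, not_found
```

Feeding it the lines of an access log returns the per-status totals plus the list of paths that came back 404, which makes stray requests like the ColdFusion docs link easy to spot.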


 10:00 pm on May 22, 2003 (gmt 0)

I have checked the cache on around 300 pages of my site. I found 4 with the error. These are very definitely from the deep crawl time frame.

When I fixed the error there was some distinctive code in there for about another 2 weeks. I didn't check every page, but I found a few pages from that time frame. I cannot say for certain whether these were fresh or deep.

All the rest of my pages are old format which would be correct for the Feb or Mar deep crawl, or new format that would be correct for fresh. The recent fresh content does not always have the fresh date.


 10:01 pm on May 22, 2003 (gmt 0)

Have you seen any backlinks from March or April?


 10:02 pm on May 22, 2003 (gmt 0)

I have backlinks that I never had. They showed up a few weeks ago.


 10:04 pm on May 22, 2003 (gmt 0)

No new backlinks yet. I would guess that they would want to get all the new data in before they recalculate the backlinks.


 10:09 pm on May 22, 2003 (gmt 0)

If they get April data, that is GREAT news.

A lot of us are already losing patience.


 10:12 pm on May 22, 2003 (gmt 0)

:D My site seems to be back to PR8 again this month. Last month it was down to PR7, the month before PR8 and the month before PR7.


 10:24 pm on May 22, 2003 (gmt 0)

I checked 1 page out of 2,000 and it was crawled 03/28/2003


 10:24 pm on May 22, 2003 (gmt 0)

I'm not too sure that freshbot is deepcrawling. I've been studying freshbot very carefully since Dominic and all the urls it's going after seem to be based on old data about what is fresh. I think when there is a new site, freshbot will crawl a page, follow the urls found on that page, then repeat the procedure.

For older sites, it will analyze different versions of a page from one index to another, then decide which pages get refreshed and the freshbot will chase those pages and not follow links.

If it were acting like the deepbot, it would follow links. I have not seen it go after a single new link on tons of pages that I saw it crawl.

It did crawl a lot, I agree, but it looks very different to me than a deep crawl.
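One rough way to test the "is it following new links?" question is to compare the paths the bot requested against the paths that were already known before the crawl (for example, from the previous index or an earlier log). The sketch below assumes you already have both lists; `new_urls_requested` is a hypothetical name, not something from Google or this forum.

```python
def new_urls_requested(crawled_paths, known_paths):
    """Return crawled paths that were not previously known, in first-seen order.

    Hypothetical helper: an empty result suggests the bot only refreshed
    pages it already knew about; a non-empty result suggests it followed
    links to new pages, which is more like deep-crawl behaviour.
    """
    known = set(known_paths)
    seen = set()
    fresh = []
    for path in crawled_paths:
        if path not in known and path not in seen:
            seen.add(path)
            fresh.append(path)
    return fresh
```

An empty list over a few days of requests would match the refresh-only pattern described above; new paths appearing would point the other way.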


 3:39 am on May 23, 2003 (gmt 0)

OK, I take it back. Today I saw it finally behave a bit more like a deepbot.


 5:12 am on May 23, 2003 (gmt 0)

My case is funny:

When I go to the cache for site.com there is an old cache from the start of March. When I go to the cache for www.site.com, I have a cache from mid-March (which includes those 5 cross links I had talked about).

When I search for some keywords I get results showing the page from about 1 month ago, but the cache is the oldest one. For other search terms that match the latest text and title on the main page from about 10 days ago, I get results showing the main index page as updated just 10 days ago. Sometimes it shows the latest cache from 10 days ago and sometimes not (with the same keyword). Weird.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved