Is Freshbot now Deepbot?

Forum Moderators: open

Message Too Old, No Replies

Is Freshbot now Deepbot?

The line is getting drawn ever thinner

trillianjedi

4:18 pm on May 22, 2003 (gmt 0)

I've seen several postings about this now in the last few days, although this is my first actual experience of it.

I'm being hit very hard by google's freshbot at the moment, and going deep too. At first glance at what is currently going on with the little guys, I had to check and double check that the IP's were 64.... (they are).

It's behaviour, in terms of hard hitting and depth of crawl (it's going through the entire site) is more like the character of the old deepbot.

In fact, it's identical behaviour to deepbot the last time it crawled this site back in April.

I'm interested in hearing from others who are seeing the same.

trillianjedi

10:22 pm on May 22, 2003 (gmt 0)

Critter,

Sorry, I'm not quite with you - can you explain a bit more?

I think you're saying that freshie is crawling pages on a site that are not linked from other pages on that site?

I don't quite follow how deepbot would have found them in April?

<EDIT: re-read your post and understand now. It hasn't followed either old-index links or other fresh links to get there, the only place could be from April deep crawl links. OK, that's interesting.....>

Critter

10:26 pm on May 22, 2003 (gmt 0)

Hi TJ:

No, I mean that the pages fresh is getting right now can be found by crawling my site--but fresh has only got 100 or so pages, starting today. I've never seen the freshbot before.

When fresh first started crawling today she was getting pages that are 'deep' in the site, she didn't start with my home page (well she did, but the subsequent pages can't be gotten to from the home page).

So fresh must be crawling pages that deep got in April.

Peter

trillianjedi

10:32 pm on May 22, 2003 (gmt 0)

Critter,

What you said made me revisit the logs with a slightly different view and you're absolutely right.

There are some pages for which my site is in the same boat as yours - freshie simply could not have got there by following existing fresh links.

Well spotted.

So, this raises another "what is going on here then" type question.

Is it possible that google is using the freshbot "method" as a means of getting the April deepcrawl data into the "new" index I wonder?

Critter

10:36 pm on May 22, 2003 (gmt 0)

Let's just hope those results get in the index in 24 hours or so...like fresh usually does.

Peter

WebGuerrilla

10:41 pm on May 22, 2003 (gmt 0)

Yes, it does appear that FB is recrawling April DP.

I'm looking at a log that contains April data. And FB is requesting the pages in identical order sa last month.

But what isn't clear is whether or not it really is FB. It could simply be DP running from IP's that have been FB in the past.

I think things will be much clearer tomorrow. If this crawl is really part of the fresh system, we should all end up pretty happy by this time tomorrow.

trillianjedi

10:49 pm on May 22, 2003 (gmt 0)

It could simply be DP running from IP's that have been FB in the past.

Wouldn't that mean a lot of PC's having software switched from freshbot to deepbot? Would explain the timeframe I guess, but seems like a lot of work and I don't really see why they'd do it that way around.

As you say, we'll perhaps know more tomorrow.

My instinct is google is now using freshie as a means of introducing the April deepcrawl pages into the index rather than having to update all of the indexes to a single new build.

But heck, it's all speculation until there's a fat lady somewhere singing her heart out!

parabola

11:12 pm on May 22, 2003 (gmt 0)

All my pages are already picked up by the freshbot, so my concern is really getting April backlink data factored into the ranking of the index. Anyone think freshie would do that?

BigDave

11:18 pm on May 22, 2003 (gmt 0)

Wouldn't that mean a lot of PC's having software switched from freshbot to deepbot?

These machines would most likely do network boots. change an entry in your DHCP server, send a remote reboot and the bot machine comes up as a totally different beast.

I know for a fact that google uses PXE to network boot many of their systems. I cannot say for certain that they do it with their Googlebot machines, but if they don't, they should.

r3ved

11:31 pm on May 22, 2003 (gmt 0)

The pages that were hit today were new as of 2 weeks ago. It is the first time the bot has ever seen them.

Felina

11:32 pm on May 22, 2003 (gmt 0)

This really makes me wonder if we will ever see Deepbot again.
I'm getting pages pages crawled that were new 3 days ago, and they are linked from internal pages not from main page.

trillianjedi

3:39 pm on May 23, 2003 (gmt 0)

Is anyone seeing their freshbotted pages from the April deepcrawl appearing in the index yet?

Very curious to know whether they go in the index "proper" or as regular freshbot pages.

Pricey

3:45 pm on May 23, 2003 (gmt 0)

I'v still not had any new (week old) pages crawled at all :/ My site is well ranked (pr 4 for my keywords), but Googlebot rarley appears in my logs. Tbh, I don't even think any of my sites have seen a deepcrawl yet.

BigDave

4:02 pm on May 23, 2003 (gmt 0)

TJ,

As I mentioned in another thread, I have seen several of my April deep crawl pages in the index. I had an error in my navigation bar that was only htere for a very short time during the deep crawl. I have found four of those pages "cached" in google. One of these was in before freshbot hit it yesterday.

It has not yet changed over to the freshbotted version from yesterday.

About 1/3 of my pages that I have looked at are versions that were freshbotted since the last deep crawl, and have been move in permanently. I made a change in late April that I have looked for by viewing source of the cache.

Napoleon

4:14 pm on May 23, 2003 (gmt 0)

>> All my pages are already picked up by the freshbot, so my concern is really getting April backlink data factored into the ranking of the index. Anyone think freshie would do that? <<

That's the moot point... the missing data is probably the cause of the diminished quality of the Google dbase.

Just slapping the missing pages back in the index anywhere is not the issue. It's getting the quality pages ranked properly again, with a 'proper' PR and relevancy calculation.

Of course, as speculated, FB could now have the facility to acheive all that. In other words, re-ranking on the fly, without an entire (monthly) re-calcuation for the whole database.

Sadly, that's all it is... speculation. However, something has to give soon with this index in terms of the backlink data, to align with GG's comments (weeks not months: and we are weeks into this thing already)

parabola

4:24 pm on May 23, 2003 (gmt 0)

Napolean, that is what I was saying- thanks for making the sad truth a bit clearer. I have my doubts as to whether thye are going to re-score the index based on new backlinks anytime soon. It's easy to get new pages in...

This 211 message thread spans 15 pages: 211