Very Freshbot

Let's collect what we can in this thread

         

Clark

4:39 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, GG has stated that there is a "Very Freshbot" now, and the IP for the DeepBot will probably no longer be around. So as Esmeralda (good name, btw... updates should be named after females, not males) progresses, let's see what we can figure out about the new behavior of the googlebots and share the knowledge here.

It sounds to me like the "very" freshbot behaves "very" much like the old freshbot: it captures "very" fresh content and bumps the PR very high, but only for a couple of days, and then the page drops from the index.

Meanwhile, the "old" freshbot is like the old deepbot. So the only difference is that it looks like we'll have a continuous deepbot throughout the month.

So now the trick will be to watch our logs closely, see if "fresh" pages get a boost and then disappear, and note the IPs of the bots that fetched that content, so we can tell whether the "very" freshbot has an IP range like the old one did.
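As a rough sketch of that log-watching idea, the snippet below tallies crawler hits per /24 prefix. It assumes Apache-style combined logs and that the crawler's user agent contains "Googlebot"; the regex, the function name `googlebot_hits_by_prefix`, and the `agent_marker` parameter are illustrative choices, not anything taken from this thread.

```python
import re
from collections import Counter

# Assumed layout: ip ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def googlebot_hits_by_prefix(lines, agent_marker="Googlebot"):
    """Count hits per IPv4 /24 prefix for lines whose user agent contains agent_marker."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a log line in the assumed format
        if agent_marker not in (match.group("agent") or ""):
            continue  # some other visitor, or no user agent logged
        prefix = ".".join(match.group("ip").split(".")[:3])
        counts[prefix] += 1
    return counts
```

Running it over each day's log and comparing the prefixes from day to day would show whether a new range starts appearing.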

P.S. One thing I'll never understand: why did the old freshbot, and why does the "very" freshbot, drop pages from the index altogether instead of just setting their PR to 1 until the full update assigns a more permanent PR number? That would at least return *some* results for searches that have no other pages. It would be most likely to help searches for "fresh" content that has little data on the net.

Stefan

5:30 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



By trying to read between the lines of what GG was saying, it seemed to me that veryfreshbot would result in freshtags and not last long, while the otherbot would have sticky results and act as a continuous deepcrawl (probably much less active than veryfresh though).

Good thought on trying to match bots to IP range. Veryfresh should be spotted by watching for pages with freshtags, then going into the logs to see what last crawled them. I have lots of freshtags on pages to look into right now... I should make notes, because the serps will change and the tags will disappear.

ADDED: Just found two Jun 15 tags, picked up 0:00 - 0:30 UTC Jun 15 by 64.68.82.xx. There is also a Jun 14 tag by the same IP range.
Any ideas on how to get the otherbot?

ADDED AGAIN: You know, that's the only IP range, 64.68.82, the regular freshbot, that I've been seeing since the deepcrawl of Apr...
Is the same bot doing everything, and then they handle the results differently afterward? I guess Google will be having a used-bot sale soon.
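The "see what last crawled it" step could be scripted along these lines. This is only a sketch: it assumes Apache-style log lines, and the helper name `last_fetch_before` and its `ip_prefix` parameter are made up for illustration. Given a page path and the time its fresh tag implies, it returns the most recent matching fetch at or before that time.

```python
import re
from datetime import datetime, timezone

# Captures: client IP, timestamp, requested path (GET or HEAD requests only)
ENTRY = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|HEAD) (\S+)[^"]*"')


def last_fetch_before(lines, path, cutoff, ip_prefix=""):
    """Return (ip, timestamp) of the latest fetch of `path` at or before `cutoff`.

    `cutoff` must be a timezone-aware datetime. `ip_prefix` optionally limits
    the search to one range, e.g. "64.68.82.". Returns None if nothing matches.
    """
    best = None
    for line in lines:
        match = ENTRY.match(line)
        if not match:
            continue
        ip, stamp, requested = match.groups()
        if requested != path or not ip.startswith(ip_prefix):
            continue
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        if when <= cutoff and (best is None or when > best[1]):
            best = (ip, when)
    return best
```

For a Jun 15 tag, for example, `cutoff` could be `datetime(2003, 6, 16, tzinfo=timezone.utc)`, and the IP that comes back is the candidate to write down before the tag disappears.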

Clark

5:58 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I never really paid attention when people were saying fresh tag. Does that mean the date next to the listing?

kevinpate

6:35 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, the date in the serp info indicates the page was freshly crawled and added, e.g. there are lots of June 14 and 15 fresh tags right now.

AthlonInside

6:50 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Very Freshbot"? I recall GoogleGuy naming it DeepFreshBot (or FreshDeepBot?).

JasonHamilton

7:08 pm on Jun 16, 2003 (gmt 0)

10+ Year Member



mmmm DeepFriedBot. /me heads off to KFC.

AthlonInside

7:16 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Jason, please get me a drumstick and a large coke! :)

Clark

7:20 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Athlon, check out the update thread. GG said there's a "Very" freshbot out as well now. Or something like that...

Powdork

9:38 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On June 1st I was deep crawled by 64.68.82.*
I know this because it was the only time new pages were crawled prior to a spelling correction in the title uploaded on 6/2.

This little bugger
64.68.82.38 - - [14/Jun/2003:22:24:21 -0400] "GET / HTTP/1.0" 200 8589 "-"
got my index, and it dropped off the map when the fresh tag showed up. It's gone from all datacenters for all practical purposes. During that fresh crawl there were two broken links found (nothing to do with the index page, however). Maybe I just got a fix-it ticket. ;)

As to the original question: I see the same IPs acting as both very fresh and deepfresh. For me so far, the last two digits of the IP address (the final octet) are greater than 10 and less than 80.
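To check whether different final octets behave differently, the crawled paths could be grouped per address, something like the sketch below. The 64.68.82.0/24 network and the 10-to-80 bounds come from the observations in this thread; the function names are hypothetical, and `hits` is assumed to be a list of (IP, path) pairs already pulled out of the log.

```python
import ipaddress
from collections import defaultdict

# Range reported in this thread; treat it as an observation, not an official list
FRESHBOT_NET = ipaddress.ip_network("64.68.82.0/24")


def paths_by_last_octet(hits, network=FRESHBOT_NET):
    """Group requested paths by the final octet of each crawler IP inside `network`."""
    grouped = defaultdict(list)
    for ip_text, path in hits:
        ip = ipaddress.ip_address(ip_text)
        if ip in network:
            grouped[int(ip) & 0xFF].append(path)  # low byte is the final octet
    return dict(grouped)


def outside_observed_range(grouped, low=10, high=80):
    """Return the octets that fall outside the open interval (low, high)."""
    return sorted(octet for octet in grouped if not low < octet < high)
```

If one set of octets only ever fetches pages that later show fresh tags, while another set fetches pages that stick, that would be a hint the bots are split by address after all.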

Powdork

9:40 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I should add, normally freshbot arrives and gets robots, then index, and then others. This time (the 14th) she grabbed index last. Don't know if that's relevant.

DroffatsX3

9:54 pm on Jun 16, 2003 (gmt 0)

10+ Year Member



edit-moved to another thread

Stefan

11:00 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Alright, I have a news page, gets the date changed daily. It's in the -fi cache, with no freshtag, showing May 31. I have to dig through logs back to mid-May and see all its bot ip# visits. It's a project, will take a while, but might give a clue.

Powdork

11:02 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes I should have added that my new page is only showing up on -fi.

Stefan

11:05 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I actually meant a "News" page...

That's why it gets the date changed most days. :-)

bolitto

12:57 am on Jun 17, 2003 (gmt 0)

10+ Year Member



Maybe off-topic, but can anyone explain what the heck Esmeralda is?

On another thread someone said "as Esmeralda progresses". First of all, does it even exist?

Therefore my question - is Google updating? What data do you base it on since backlinks are a mess now?

bolitto

1:01 am on Jun 17, 2003 (gmt 0)

10+ Year Member



Well, don't bother replying. The Esmeralda thread was conveniently bumped up here and now I see it.

Sorry for the offtopic detour.

Stefan

1:16 am on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Esmeralda was the girlfriend of the Hunchback of Notre Dame, I believe.

No problem on the detour, but if you manage any great discoveries on the new googlebot, please post them here. :-)

bolitto

2:01 am on Jun 17, 2003 (gmt 0)

10+ Year Member



Stefan, no new discoveries on newbot yet, but!

Look who's all over my sites!

"MSNBOT/0.1 (http://search.msn.com/msnbot.htm)"

I've checked the IP - it's legitimate, Microsoft is crawling the web!

IP 131.107.137.xxx
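The post doesn't say how the IP was checked. One common way to test whether a crawler IP really belongs to the organisation it claims is a forward-confirmed reverse DNS lookup, sketched below. The function name `forward_confirmed` and the `allowed_suffixes` list are assumptions for illustration; which hostname suffixes to trust is up to whoever runs the check.

```python
import socket


def forward_confirmed(ip, allowed_suffixes,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves to a host under one of
    `allowed_suffixes` AND that host resolves back to the same `ip`.

    `reverse` and `forward` can be swapped out, e.g. for offline testing.
    """
    try:
        hostname = reverse(ip)[0]          # PTR lookup: IP -> hostname
        addresses = forward(hostname)[2]   # A lookup: hostname -> list of IPs
    except OSError:
        return False  # lookup failed, so nothing is confirmed
    host = hostname.lower().rstrip(".")
    suffix_ok = any(host == s or host.endswith("." + s) for s in allowed_suffixes)
    return suffix_ok and ip in addresses
```

The forward step matters because whoever controls an IP block can set its reverse DNS to any name they like, so a PTR record alone proves nothing.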

Stefan

2:20 am on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This should be in a different forum maybe... I've never seen one of those though!

Man, my new host only gives me the logs once a day zipped...
Bill is up to something eh?

Powdork

2:30 am on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



They recently did a mini crawl of sorts with this guy
MicrosoftPrototypeCrawler (How's my crawling? mailto:newbiecrawler@hotmail.com)
I didn't get the ip though

Stefan

2:33 am on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



bolitto, that should make a good new thread in forum 11, spider identification perhaps. If MS has launched a crawler, that's big news... I'm not sure what forum it should go in, none of them seem to fit exactly.
I have 2 odp cats, so if bill is crawling I expect I'll see him soon. What are they doing with Ink then? (Different forum).

bolitto

3:02 am on Jun 17, 2003 (gmt 0)

10+ Year Member



Bill is up to something!

Posted it to the PFI forum as well, I think it's cool news and put it here just cuz we were in the middle of the conversation.

This is BIG news, because my crawl was not just superficial, it went all the way through many many pages.

One year of Google earnings is still WAY less than the R&D budget at Microsoft; they can probably dump Google in 1 or 2 years if they take this seriously.

Interesting times ahead.