Forum Moderators: Robert Charlton & goodroi

Googlebot visits new URL 4 minutes after viewing with Toolbar

         

waynne

4:52 pm on Jan 17, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



I have only just noticed this. I was aware it was happening, but I'm surprised at the sheer speed. Whilst playing around in my server's error log I noticed that after I visited a new page, or even typed a nonexistent URL, Googlebot visited the site just 4 minutes later. Perhaps Google are starting to use toolbar users as spiders for new content. Perhaps it is just me and I got the golden toolbar? The pages do have AdSense, but the reverse IP lookup states Googlebot Crawl 66.etc, so I'm pretty sure it is not the AdSense bot.
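
For anyone wanting to repeat the reverse IP check, here is a minimal sketch in Python. It assumes that genuine Googlebot addresses reverse-resolve to a host under googlebot.com or google.com and that the host resolves back to the same address. The function name and the injectable resolver arguments are my own, added only so the check can be tried without a live lookup.

```python
import socket

# Assumption: genuine Google crawler hosts sit under these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_crawler(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google host that maps back to ip."""
    try:
        hostname = reverse(ip)[0]        # reverse DNS: IP -> hostname
    except OSError:
        return False                     # no PTR record or lookup failure
    if not hostname.lower().endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]  # forward DNS: hostname -> IP list
    except OSError:
        return False
    # The forward lookup must confirm the original address.
    return ip in addresses
```

Checking both directions matters because anyone who controls the reverse zone for their own address block can publish a hostname that merely looks like Google's.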

bwnbwn

10:40 pm on Jan 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Article bot, spyware bot, call it what you want to, but that is part of sending Google back info through the browser: the bot visits every page you visit, kind of like a seeing-eye dog. This has been going on for years, so there are millions of those golden toolbars.

It does have a darker side, though: you had better have a block in robots.txt from the very first time you begin working on a site via FTP, as it will most likely be indexed at a time when you don't want it to be, and then you have all kinds of problems.
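
A quick way to confirm that a robots.txt block really does shut the crawler out is Python's standard urllib.robotparser. This is only a sketch: the robots.txt contents, the helper name is_allowed and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site still under construction: block everything.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/new-page.html"))
```

Bear in mind that robots.txt only asks well-behaved crawlers not to fetch pages. It does not protect the files themselves.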

jomaxx

7:05 am on Jan 18, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Think about the green bar. Every time you visit a page, the toolbar needs to send the URL to Google so that it can display that page's PR value. Maybe it's a privacy issue (in fact for sure it is), but it couldn't really work any other way.

potentialgeek

9:37 am on Jan 18, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google doesn't need the Toolbar data.

I guess it and Yahoo track the root server to find out every time a new domain goes live.

I just started a new site and didn't ask Google to crawl (Add Url). I don't recall going to it via a toolbar (the domain is too long to type). I was curious to see how long it would take for Google and Yahoo to find it "blind."

Answer: a few days.

I wasn't quite finished with the site, but Yahoo sent me traffic on day 1. Google gave me only one long tail (about 5 words), but I made the site for Yahoo users.

There are various ways for companies like search engines to find "hidden" or new sites. Internet traffic data is available. Geeks have their methods.

Perhaps Google gets the data on every reg'd domain. Then it "pings" or tries to crawl the possible sites for those domains every so often. Very easy.

p/g

BradleyT

1:05 pm on Jan 18, 2008 (gmt 0)

10+ Year Member



Can confirm OP - noticed that on a test domain this week and almost made the same post on Wednesday.

ecmedia

3:40 pm on Jan 18, 2008 (gmt 0)

10+ Year Member



I am surprised that you are surprised. Google clearly states at the time of toolbar installation that it will track whatever you do. Obviously, one very strong component in serving SERPs is toolbar data compiled from users.

waynne

9:27 am on Jan 21, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



The key here is not what happens but the sheer speed: a few minutes! Most designers test new CMS areas etc. before they add security features. Test pieces of code are frequently tried out, including server reset scripts and ping tests. If Google visits these pages, then reindexes them and keeps visiting them, it has profound security implications for all of us webmasters.

The rule "never put anything in a public folder you do not want to appear in the SERPs!" is more true than ever before. So if anyone has a password.txt file in a "secret" folder, move it!

driller41

2:13 pm on Jan 21, 2008 (gmt 0)

10+ Year Member



Are you saying that having the PR feature enabled is the issue here? It does say, if you enable this feature, that there are privacy issues.

I assume if you do not enable PR feature then the toolbar does not phone home.

Robert Charlton

7:26 pm on Jan 21, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I assume if you do not enable PR feature then the toolbar does not phone home.

It doesn't have to be the Toolbar. Google will spider publicly available server logs. See discussion on this thread....

Why is Google indexing my entire web server?
[webmasterworld.com...]

In the original post on the thread we're now discussing:

I noticed that after I visited a new page or even typed a nonexistent URL that Googlebot visited the site just 4 minutes later.

What ought to be at issue in this discussion is whether this has happened consistently, and more than once. There have been some discussions about Google indexing news pages on subjects that were in the current news fairly quickly... and this, if it can be observed on a broad range of sites, would be extremely interesting. If Google hits just one new page once right after posting, maybe not so interesting.

If, over a period of time, it hits a lot of new pages right after posting, that starts to get interesting again. Then, you'd start asking other things about these pages, among them questions of topicality.

I don't believe the original poster has mentioned whether this four minute spidering interval has occurred more than once. Also, did Googlebot go right to the page that was just put up, or did it start at another page and find the new page?

BillyS

10:56 pm on Jan 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't think it's the exact same issue, but Matt Cutts discusses this topic. There was a claim that Google somehow found pages that didn't have a link to them. That is simply not true and has been tested independently.

jomaxx

11:47 pm on Jan 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



True (in Dec. 2006), although Matt's claim was that Google doesn't index such pages, not that it doesn't spider them. For one thing, if the page has AdSense on it then it will definitely be spidered by the Mediapartners bot at least.

Note also that both spiders crawl from the same IP blocks, so one can be mistaken for the other if you only rely on the IP address.
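
Since the IP address alone won't separate them, the user-agent string in the log is the simplest thing to look at. A rough sketch, assuming the AdSense crawler identifies itself with "Mediapartners-Google" and the search crawler with "Googlebot" in the user-agent (the function name and return labels are just for illustration):

```python
def classify_google_bot(user_agent):
    """Label a user-agent string as 'adsense', 'googlebot' or 'other'."""
    ua = user_agent.lower()
    # Check the AdSense crawler first so it is never mistaken for Googlebot.
    if "mediapartners-google" in ua:
        return "adsense"
    if "googlebot" in ua:
        return "googlebot"
    return "other"


if __name__ == "__main__":
    print(classify_google_bot("Mediapartners-Google/2.1"))
```

A user-agent can be faked, so this only tells the two Google spiders apart. It does not prove the request came from Google.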

waynne

9:50 am on Jan 22, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



The 4 minutes is a consistent thing. I was in the error log and noticed that when I hit a nonexistent page, Googlebot came by 4 minutes later and triggered the same 404 error for the page I had previously visited. At first I thought it was a coincidence, but after a number of trials I concluded it was deliberate.
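
For anyone who wants to repeat the trials without eyeballing the log, something like the following Python sketch could measure the gap. It assumes an access log in Apache "combined" format rather than the error log, and the function name googlebot_delays is made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: Apache "combined" access log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_delays(lines):
    """Return (path, seconds) pairs: gap from a visitor hit to the next Googlebot hit."""
    last_human = {}   # path -> time of the most recent non-Googlebot request
    delays = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        path = m.group("path")
        if "googlebot" in m.group("agent").lower():
            if path in last_human:
                gap = when - last_human.pop(path)
                delays.append((path, gap.total_seconds()))
        else:
            last_human[path] = when
    return delays
```

Run it over the log after a batch of test URLs. If the gaps cluster around 240 seconds on several unrelated paths, that supports the pattern. A single match would not.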