Forum Moderators: Robert Charlton & goodroi
It does have a darker side, though. Better to block indexing in robots.txt the very first time you begin working on a site via FTP, as it will most likely be indexed at a time when you don't want it to be, and then you have all kinds of problems.
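For anyone who wants to check that rule before uploading, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the helper name is_blocked, and the example.com URLs are illustrative assumptions, not anything from this thread. It only shows what a compliant crawler would conclude from the file. A crawler that ignores robots.txt is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: asks every compliant crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows this agent from this URL.

    is_blocked is a hypothetical helper name used only for this example.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    # example.com is a placeholder domain.
    for agent in ("Googlebot", "Slurp"):
        print(agent, is_blocked(ROBOTS_TXT, agent, "http://example.com/index.html"))
```

Once the site is ready, the Disallow line can be relaxed or the file removed.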
I guess Google and Yahoo track the root servers to find out every time a new domain goes live.
I just started a new site and didn't ask Google to crawl (Add Url). I don't recall going to it via a toolbar (the domain is too long to type). I was curious to see how long it would take for Google and Yahoo to find it "blind."
Answer: a few days.
I wasn't quite finished with the site, but Yahoo sent me traffic on day 1. Google gave me only one long tail (about 5 words), but I made the site for Yahoo users.
There are various ways for companies like search engines to find "hidden" or new sites. Internet traffic data is available. Geeks have their methods.
Perhaps Google gets the data on every reg'd domain. Then it "pings" or tries to crawl the possible sites for those domains every so often. Very easy.
p/g
The rule "never put anything in a public folder you do not want to appear in the SERPs!" is more true than ever before. So if anyone has a password.txt file in a "secret" folder, move it!
I assume that if you do not enable the PR feature, then the toolbar does not phone home.
It doesn't have to be the Toolbar. Google will spider publicly available server logs. See the discussion in this thread:
Why is Google indexing my entire web server?
[webmasterworld.com...]
In the original post on the thread we're now discussing:
I noticed that after I visited a new page or even typed a nonexistent URL, Googlebot visited the site just 4 minutes later.
What ought to be at issue in this discussion is whether this has happened with the same frequency more than once. There have been some discussions about Google indexing news pages on subjects that were in the current news fairly quickly... and this, if it can be observed on a broad range of sites, would be extremely interesting. If Google hits just one new page once right after posting, maybe not so interesting.
If, over a period of time, it hits a lot of new pages right after posting, that starts to get interesting again. Then, you'd start asking other things about these pages, among them questions of topicality.
I don't believe the original poster has mentioned whether this four minute spidering interval has occurred more than once. Also, did Googlebot go right to the page that was just put up, or did it start at another page and find the new page?
Note also that both spiders crawl from the same IP blocks, so one can be mistaken for the other if you only rely on the IP address.
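The questions above (how soon Googlebot arrived after the first request for a page, and which page it asked for) can be checked against an access log rather than from memory. Below is a rough sketch, assuming the server writes the common "combined" log format with entries in chronological order. The sample lines, IP addresses, and function names are made up for illustration. It matches on the user-agent string instead of the IP address, for the reason given above, but a user-agent can be faked, so treat the result as a hint and not as proof.

```python
import re
from datetime import datetime

# Combined log format:
# ip ident user [time] "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return (timestamp, path, user_agent), or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return when, m.group("path"), m.group("agent")

def googlebot_delays(lines):
    """Map each path Googlebot requested to the seconds between the first
    request by anyone and the first request whose user-agent says Googlebot.

    Assumes the lines are already in chronological order.
    """
    first_seen = {}
    first_bot = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, path, agent = parsed
        first_seen.setdefault(path, when)
        if "Googlebot" in agent:
            first_bot.setdefault(path, when)
    return {p: (first_bot[p] - first_seen[p]).total_seconds() for p in first_bot}

# Hypothetical sample data, not taken from any real log.
SAMPLE = [
    '203.0.113.5 - - [12/Mar/2007:10:00:00 +0000] "GET /new-page.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0"',
    '66.249.66.1 - - [12/Mar/2007:10:04:00 +0000] "GET /new-page.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    print(googlebot_delays(SAMPLE))
```

Running this over several weeks of logs would show whether a short interval is a pattern or a one-off, and the paths in the result show whether the bot went straight to the new page.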