| 9:41 am on Mar 31, 2004 (gmt 0)|
>>>>now spidering pages that have no inbound links
Doubt it sincerely. How would Google & co find the page in the first place?
| 9:45 am on Mar 31, 2004 (gmt 0)|
Through the listing of the sub-directory, if it's not denied via .htaccess?
| 9:50 am on Mar 31, 2004 (gmt 0)|
Google can find pages in one (or more) of these ways:
- link on another page
- link found in referer log of a site the page links to (you probably don't know the link is there!)
- someone who has the Googlebar installed viewed the page. The Googlebar then reported the existence and viewing of the page back to Google.
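To make the referer-log route concrete, here is a rough Python sketch of how a private URL ends up in somebody else's log. It assumes the other site writes the common "combined" access log format; the host names, the sample log lines, and the function name external_referers are made up for illustration. It only shows that the URL is sitting in the log for anyone (or any bot) who can read it, not how Google actually processes such logs.

```python
import re
from urllib.parse import urlsplit

# Tail of a combined-format log line:
#   "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def external_referers(log_lines, own_host):
    """Return the set of referring URLs that come from hosts other than own_host."""
    found = set()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a combined-format line, skip it
        referer = match.group("referer")
        if referer in ("", "-"):
            continue  # the browser sent no Referer header
        host = urlsplit(referer).hostname
        if host and host != own_host:
            found.add(referer)
    return found


# Hypothetical log lines, for illustration only.
sample = [
    '203.0.113.5 - - [31/Mar/2004:09:50:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"http://private.example.org/secret/page.html" "Mozilla/4.0"',
    '203.0.113.9 - - [31/Mar/2004:09:51:00 +0000] "GET /about HTTP/1.1" 200 300 '
    '"-" "Mozilla/4.0"',
]

print(external_referers(sample, "www.example.com"))
```

If the site you link to publishes its stats or referer pages, every URL this function returns is effectively a public link to your "unlinked" page.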
| 10:10 am on Mar 31, 2004 (gmt 0)|
Google does spider unlinked pages. Read this article, it'll send cold chills down your spine.
| 10:12 am on Mar 31, 2004 (gmt 0)|
A site like Googledorks (search for the link in Google) is dedicated to revealing searches in Google that will show confidential information. This information is practically never linked to from other pages.
| 12:59 pm on Mar 31, 2004 (gmt 0)|
I have a new site with two main purposes. One is a forum for a group to which I belong; there are no links to this forum. The users know the URL as www.mydomain.com/forumname, whereas I have subsequently created a subdomain to access the forum as forumname.mydomain.com. The other part of my site is a public website. I've been waiting hopefully for Google to come crawl my main site, but so far no luck. But wouldn't you know it, the first referral I got from Google came from a search for the private forum! Google referred it as the subdomain URL. No idea how they found it.
| 1:25 pm on Mar 31, 2004 (gmt 0)|
Do a search for Google's DomainPark; some of the answers probably lie there.
| 1:26 pm on Mar 31, 2004 (gmt 0)|
From my experience, it will keep spidering an orphan page as long as it was linked to at one time, until the page returns a 404/410 etc.
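If you want an orphan page to drop out, the page has to actually answer with one of those codes. Below is a minimal Python sketch of that idea using the standard library's http.server. The paths in LIVE_PAGES and RETIRED_PAGES and the name status_for are hypothetical, and this is not a claim about how quickly Google reacts to either code.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical paths, for illustration only.
LIVE_PAGES = {"/": b"<h1>Home</h1>"}
RETIRED_PAGES = {"/old-orphan.html"}


def status_for(path):
    """Pick the status code a crawler should see for this path."""
    if path in LIVE_PAGES:
        return HTTPStatus.OK
    if path in RETIRED_PAGES:
        return HTTPStatus.GONE  # 410: removed on purpose
    return HTTPStatus.NOT_FOUND  # 404: never existed, or unknown


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = status_for(self.path)
        body = LIVE_PAGES.get(self.path, b"")
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like `HTTPServer(("", 8000), Handler).serve_forever()` (with HTTPServer imported from http.server) should do. On a real Apache setup you would normally do the same thing in the server config rather than in code.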
| 5:10 pm on Mar 31, 2004 (gmt 0)|
So I guess the rumors are true. Well, it's nothing major, but it is nice to know how Google works.
| 5:59 pm on Mar 31, 2004 (gmt 0)|
I have seen Google spidering sites that I used in a form's post address. E.g., on my index page I have an input box, and the form is submitted either to www2.mysite.com/form.php or to www3.mysite.com/form.php. Every time the page loads, I randomly show either www2 or www3. Google has crawled the index pages of both the www2 and www3 sites. There is NO link to these pages from anywhere else on the web.
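For anyone wondering how a form address could be picked up at all: the action URL is sitting in the HTML like any other attribute, so a crawler that wants it can read it. Here is a small Python sketch using the standard html.parser module. The class and function names are my own, the sample markup is only a guess at what such an index page looks like, and whether Googlebot really does this is speculation on my part.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionCollector(HTMLParser):
    """Collects the absolute URL of every <form action="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        action = dict(attrs).get("action")
        if action:
            # Resolve relative actions against the page's own URL.
            self.actions.append(urljoin(self.base_url, action))


def form_targets(html, base_url):
    collector = FormActionCollector(base_url)
    collector.feed(html)
    return collector.actions


# Hypothetical markup, for illustration only.
page = (
    '<form method="post" action="http://www2.mysite.com/form.php">'
    '<input name="q"></form>'
)

print(form_targets(page, "http://www.mysite.com/"))
```

Once a bot has www2.mysite.com/form.php, trying the root of www2.mysite.com is an obvious next step, which would match the index pages showing up.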