Clearly PageRank is the main algorithm used by Google,
and anyone doing any kind of search engineering needs
to fully understand it.
Now, are there any tutorials and clear notes out there?
I don't mean the original papers by the Google founders;
I'm looking for 100% clear tools and visual examples that a non-technical person could understand quickly.
Can anyone out there who knows 100% how PageRank in Google is calculated provide some links/notes?
There just seems to be too much speculation and guesswork right now.
Even a "basic" example would do. Clearly no one except people working at Google will know the complete PageRank algorithm! I've read the FAQs, and there is no simple "dummies" list of steps we can take. Can we all help each other to build one? A clear flow chart would be ideal; see my step 1 below as a starting point.
[edited by: The_Subtle_Knife at 11:20 pm (utc) on Feb. 23, 2003]
The point is that you need at least one incoming link for Google to find you and consider you for the index. It might just be one PR3 link, and the rest of your pages are able to successfully build up to a PR6 somehow. But Google has to be able to find the site by following links from one of its seed directories (Yahoo and DMOZ).
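To make the "following links from a seed" idea concrete, here is a minimal sketch of link-following discovery as a breadth-first search. It is only an illustration of the argument above, not Google's actual crawler; the toy web, the page names, and the function name reachable_from_seeds are all made up for the example.

```python
from collections import deque

def reachable_from_seeds(links, seeds):
    """Return every page a purely link-following crawler could discover."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical toy web: each key links to the pages in its list.
links = {
    "directory": ["site-a"],
    "site-a": ["site-a/page2", "site-b"],
    "site-b": [],
    "orphan": ["orphan/page2"],   # no page outside the orphan site links here
    "orphan/page2": ["orphan"],
}

found = reachable_from_seeds(links, ["directory"])
print(sorted(found))
```

In this toy web, "site-b" is discovered through "site-a", while "orphan" and its second page are never reached, however well they link to each other.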
My best guesses about The_Subtle_Knife's identity:
1) Smart guy playing dumb. He's got everyone throwing their research at him.
2) An alter ego of one of the serious members of the forum having a bit of fun.
3) A Google rep checking what the public knows about PageRank!
4) An Overture search engine engineer who wants to improve their algorithm....
Which is not the same as being indexed. Freshbot does not put you in the main index, though if they were referring to the main crawl, it would.
Many of the answers were along the lines of "there are links, you just haven't seen them".
Many sites have their stats pages available on the web. If, while checking that the links on your new site work, you follow a link to a site that displays its stats, you have an instant inbound link for Google to find. It is actually common for porn sites to try to spam these to get more inbound links.
Hosting services often post the URLs of domains they host, or have some part of their service desk crawlable. Then there is the WHOIS database; who knows if they crawl that somehow. I think they can, and there is probably some PR passed that way.
But all the evidence I have seen is that if you have no incoming links at all and you submit your site, Freshbot will find you, but you will not get deep crawled.
I have no proof of what I am saying here and would be interested to hear if someone has run some actual experiments. I will try it on my site with a small loop of 100 pages or so, with no outgoing or incoming links. I will submit it to Google and wait to see what happens after the next update.
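For anyone who wants to see what the published formula alone would say about such a loop, here is a rough sketch. It uses the formula from the original paper, PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), with d = 0.85. The 100-page loop, the iteration count, and the function name pagerank are assumptions for illustration, and what Google actually runs is not public.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate the published PageRank formula over a small link graph.

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    pr = {p: 1.0 for p in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new = {}
        for p in pages:
            # Each page q that links to p passes on PR(q) / C(q),
            # where C(q) is the number of outgoing links on q.
            incoming = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            new[p] = (1 - d) + d * incoming
        pr = new
    return pr

# Hypothetical closed loop: page i links only to page i + 1,
# and the last page links back to page 0.
n = 100
loop = {i: [(i + 1) % n] for i in range(n)}

ranks = pagerank(loop)
print(min(ranks.values()), max(ranks.values()))
```

Under this formula every page in the loop settles at about 1.0, the average value, since nothing flows in or out. That only describes the arithmetic; it says nothing about whether a crawler would ever find the pages in the first place, which is the open question in this thread.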
I guess you are right. I found the Google explanation [google.com] you are talking about.
My memory may be failing me, but I'm sure I had a site indexed without any links pointing to it. It stayed in the index for a few months and was eventually removed.
Shame, it still is a good site. This just wasn't the sort of site anyone else would link to unless we pestered them.