1. Download the Firefox browser from [mozilla.org...]
2. Install the GoogleBar Mozilla extension from [googlebar.mozdev.org...]
3. Install the PR indicator extension from [toolbar.nickstallman.net...]
The PageRank indicator bar will appear right next to the PageInfo button on the GoogleBar when you restart the browser.
Check [prgooglebar.org...] for details.
If you have any privacy concerns please post them in the forums there so everyone can see them.
At the moment, lookups are not being logged. I will be putting in hostname logging soon, though. The full URL will not be logged.
You're correct about Sony, jgstyle, but times have changed. In the US, you've got the Digital Millennium Copyright Act (DMCA), and Australia undoubtedly either has or is in the process of implementing similar legislation. Reverse-engineering of this kind (if it is indeed the case - I can't tell at the moment) is clearly illegal.
There's an "initval" in the code that seeds the algorithm. Google's initval is 0xe6359a60. This is one of two things that are specific to Google in the code. The other is that they stick the word "info:" in front of the URL before hashing it. That's no secret either, since they also do this in the QUERY_STRING that phones home.
The initval probably sticks out like a sore thumb when you decompile the toolbar (I wouldn't know because I don't have a decompiler). That would mean that Google is making no effort to hide anything. The value is only there because the hash code, as written, needs some arbitrary initval to be defined.
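To make the two Google-specific details concrete, here is a minimal sketch of how a seed and the "info:" prefix would feed into a checksum. The hash itself is a stand-in: the post does not name the algorithm, so `seeded_hash` below is a hypothetical FNV-style mix and will not reproduce the toolbar's real checksum. Only the initval and the prefix come from the post; every function name is an assumption.

```python
# Sketch only: shows where the seed and the "info:" prefix enter.
# seeded_hash is a hypothetical stand-in, NOT the toolbar's algorithm.

GOOGLE_INITVAL = 0xE6359A60  # seed value quoted in the post
PREFIX = "info:"             # stuck in front of the URL before hashing


def hash_input(url: str) -> bytes:
    """Build the byte string that gets hashed: 'info:' plus the URL."""
    return (PREFIX + url).encode("utf-8")


def seeded_hash(data: bytes, initval: int) -> int:
    """Stand-in 32-bit hash (FNV-1a-style mixing) started from initval."""
    h = initval & 0xFFFFFFFF
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # keep the state in 32 bits
    return h


def checksum(url: str, initval: int = GOOGLE_INITVAL) -> int:
    """Hash the prefixed URL using the given seed."""
    return seeded_hash(hash_input(url), initval)
```

Changing the seed means changing one constant, which is why a new initval alone would not slow anyone down for long.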
Google's toolbar is about as "secret" as Alexa's Traffic Rank. If you look at Alexa's page, they've put all sorts of garbage HTML tags between the digits of the ranking number, which is a hilarious attempt to conceal it from any screen-scraper programmers whose careers are less than two weeks old. This too would not fly under the DMCA, because it is simply not a serious effort by Alexa.
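As a rough illustration of why tags between digits don't stop a scraper, the sketch below strips every tag and keeps whatever digits remain. The sample markup is made up for the example and is not Alexa's actual HTML; the function name is likewise hypothetical.

```python
import re


def extract_rank(html_fragment: str) -> int:
    """Remove all tags (and comments), then keep only the digits."""
    text = re.sub(r"<[^>]*>", "", html_fragment)  # drop anything in <...>
    digits = re.sub(r"\D", "", text)              # drop commas, spaces, etc.
    return int(digits)                            # ValueError if no digits left


# Hypothetical obfuscated fragment, invented for this example
sample = '<span>1</span><!--x-->2<b></b>,<i>3</i>45'
rank = extract_rank(sample)
```

This assumes the decoy tags carry no digits in their visible text; digits inside attributes are removed along with the tag.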
The applicable laws, therefore, are copyright laws. "Fair use" comes into play. If you are noncommercial, if your scraping is primarily for monitoring the social role and functioning of search engines, and if the traffic from your scraping doesn't load Google's or Alexa's servers (it couldn't, compared to all the toolbars out there that phone home with every new web page seen by the browser), then Google or Alexa cannot stop you through their legal department. All they can do is block you or your server.
In Google's case, they could write a new hash algorithm. They could also change the initval, but that would get discovered very quickly, and would require only a tiny change to all the scraper programs that are now out there. I've got my own working nicely, and will put it online in a couple of weeks if the PageRank portion keeps working. I'm also scraping Alexa's Traffic Rank and Yahoo's rather thorough external backlink count in the same program, using a query like this:
( link:http://www.example.com/anypage.html -site:www.example.com )
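The query above can be built for any page by pulling the host out of the URL. This is a minimal sketch; the function names are hypothetical, and it makes no assumption about the search engine's endpoint or parameter names, it only produces the query text and a URL-encoded form of it.

```python
from urllib.parse import quote_plus, urlsplit


def external_backlink_query(page_url: str) -> str:
    """Links to page_url, excluding links from the page's own host."""
    host = urlsplit(page_url).netloc
    return f"link:{page_url} -site:{host}"


def encoded_query(page_url: str) -> str:
    """Same query, encoded for use as a query-string value."""
    return quote_plus(external_backlink_query(page_url))


query = external_backlink_query("http://www.example.com/anypage.html")
```

The `-site:` term is what makes the count "external": it drops links coming from the site itself.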
I don't think Google will bother this time around. No one takes PageRank too seriously anymore. The chances are better that the PageRank indicator will disappear entirely with the next toolbar update, and the "phone home" lookup discontinued. That would be a good thing for Google to do anyway.
The only things keeping it going are Google's ambition to profile everyone with their immortal cookie, their use of the toolbar to find new domains for crawling, and maybe some Alexa-style traffic-tracking that we don't know about. Without the PageRank indicator, none of this can be done as effortlessly, because then they cannot justify the "phone home" feature of the toolbar.
If you can't justify it, that means it more clearly gets labeled as "spyware" -- which, of course, it has been since day one, in December 2000. The only difference between bad spyware and acceptable spyware is having a good cover story. That's the real function of the PageRank indicator on the toolbar.