Now all this talk of bad neighbourhoods got me thinking - what's the best way to ensure a site isn't part of one? Well, in the absence of a Badrank toolbar, I think the best way is to target sites with some PageRank.
So here is the question. If, in order to determine the PageRank, we automatically open a browser window with the toolbar installed for each site we are interested in, is that a violation of the TOS? We are not querying Google directly. We are just browsing. Of course we know the toolbar will query Google for the PageRank of the page, but that is Google's doing, not mine.
And how many queries per hour would be deemed automation?
Any views?
Thanks
[edited by: Marcia at 9:28 pm (utc) on Aug. 24, 2003]
If it then requires a human to look at the Googlebar and note down the PageRank, then I would ask: why bother with the automated step of opening a browser window if you think it is going to upset Google?
Just have your crawling process output a list of interesting URLs for a human to assess later, then you're not breaking anyone's Terms of Service.
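A minimal sketch of that idea in Python, assuming the crawler already hands you a plain list of URL strings. The function name, file name and column names are made up for illustration. It only writes a de-duplicated review list to disk for a human to work through and makes no queries of any kind.

```python
import csv
from urllib.parse import urlsplit


def write_review_queue(urls, path):
    """Write unique URLs to a CSV file, leaving a blank notes column for the reviewer.

    Hypothetical helper: `urls` is whatever list the crawl produced.
    Returns the number of URLs written.
    """
    seen = set()
    rows = []
    for url in urls:
        key = url.strip()
        if not key or key in seen:
            continue  # skip blanks and exact duplicates
        seen.add(key)
        rows.append((urlsplit(key).netloc, key, ""))

    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["host", "url", "reviewer_notes"])
        writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    count = write_review_queue(
        ["http://example.com/links.html", "http://example.org/"],
        "review_queue.csv",
    )
    print(count, "URLs queued for manual review")
```

The reviewer opens the CSV, visits each URL by hand and fills in the notes column.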
Having said that, I don't think what you've described could be considered automation any more than your browser opening a default "home page" automatically could be; so I sorta see where your IT guy is coming from. GG mentioned personal use, but every SEO in the land is using the toolbar for "business purposes"...
Any non-Google checks we could run to ID a bad neighbourhood?
It's hard to resist the temptation to automate this part of our business. At the moment we pay 2 people to do the same work and they do it very slowly. Google themselves automate everything and anything and, I might point out, perform automated queries on my website, so it is a case of the pot calling the kettle black :)
I assume the problem is that we would consume too much of Google's bandwidth. As others have pointed out in the past, we would gladly pay for the right to run automated queries, but Google have steadfastly refused such a service, even for the API service.
You don't need to intercept the toolbar or have a human look at the green bar. It's enough to check the temporary internet files, where Google places a file for each PageRank query. This can be done automatically, so there is no need to fiddle with any of Google's software and break their TOS in this regard.
I am just looking for an acceptable way of crawling the web and eliminating bad neighbourhoods without a human check. Any ideas would be welcome.
That is all this hinges on (IMHO). If you are planning on probing the Toolbar communication to extract PR automatically then a definite no no.
However, if all you are doing is firing up a browser window automatically in order to prompt a human to look at it (and visually extract PR as part of their review process) then I don't think you have a problem.
Google will ban you if they want anyway; that has nothing to do with ToS. They stand only as "Exhibit A" should Google decide to try and sue you for breaking them.
[edited by: dmorison at 11:13 pm (utc) on Aug. 24, 2003]
>>That is all this hinges on (IMHO). If you are planning on probing the Toolbar communication to extract PR automatically then a definite no no.
Huh? It's my connection, and I run a sniffer.
Now, if you want to *manually* do this, then
1/ have your program build you a page of links for each search
2/ turn on a sniffer with logging (this might be easier to parse than individual temp files)
3/ bring up the page in a browser
4/ *manually* click on each link
5/ autoproc *your* temp or sniffer dump files on *your* disk
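Step 1 could be sketched like this in Python. The function name and the page layout are assumptions for illustration only. It just builds a static HTML page of links from a list of URLs; the clicking in step 4 stays manual.

```python
from html import escape


def build_links_page(title, urls):
    """Return an HTML page listing each URL as a clickable link.

    Hypothetical helper: `urls` is the list your program produced for one search.
    """
    items = "\n".join(
        '<li><a href="{0}">{0}</a></li>'.format(escape(u, quote=True))
        for u in urls
    )
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>{0}</title></head>\n<body>\n<h1>{0}</h1>\n"
        "<ol>\n{1}\n</ol>\n</body></html>\n"
    ).format(escape(title), items)


if __name__ == "__main__":
    page = build_links_page(
        "Sites to review",
        ["http://example.com/", "http://example.org/links.html"],
    )
    with open("links.html", "w", encoding="utf-8") as fh:
        fh.write(page)
```

Open links.html in the browser for step 3 and work down the list by hand.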
+++
>>Huh? It's my connection, and I run a sniffer.
Sorry - of course you can run a sniffer if you want to - network traffic is "sniffed" all over the place, but that is not the point.
The point is that when running that sniffer is part of a process that has been intentionally designed to automate PageRank lookups then you are in contravention of Google ToS.
If Google stores their files on your computer, then once they hit your disk--Google loses all *ownership* of them (they still have the right to use them via their TOS / EULA for the toolbar, they just don't have ownership of them)--they are now owned by YOU--even if you are trying to use them to sabotage Google and burn Rome with them--they are still yours.
As long as you don't violate any laws by your use of the information, do what you want with it, it's yours after all, Google gave it to you fair and square. :)
Jordan
Page Rank is almost meaningless now.....why the heck make an enemy of someone who can be your best friend for no good reason?
Pagerank isn't meaningless. It's a good means of ensuring the site hasn't been penalized and isn't part of a bad neighbourhood, or at least that that is what Google thinks.
This all came up because of a quote from the Google Search Department where they say linking to 20 bad neighbourhoods could well be a problem. These days linking is an essential part of a web site, so we need to regularly check we don't have bad neighbourhoods in our outgoing links and sites we potentially would link to.
Sometimes I have looked at a site and could find nothing wrong, but SEOs have sworn it was full of dirty tricks. I'd like the Google seal of approval.
Hmm. I'm not so sure about that one. If you've ever read a license agreement from Microsoft, Adobe or any other large software company, they generally say that the software is in fact still owned by them and that they have the right to take it away from you at any time. I'm not a lawyer, but reading the section of the toolbar EULA on Intellectual Property Rights, I'd say that what MonkeySage says is not true.
-snip-
You acknowledge that Google or third parties own all right, title and interest in and to the Google Toolbar, portions thereof, or software provided through or in conjunction with the Google Toolbar, including without limitation all Intellectual Property Rights.
-snip-
-Google Toolbar
-Portions thereof
-Software provided through or in conjunction with the Google Toolbar
Google would have to specify that any *information* and / or that any *file format* is also theirs, I believe. And even if they did specify it, local property law would take precedence over third-party contractual / consensual obligation. I'm no lawyer either, but I've seen a number of cases where people were charged with possession of illegally obtained software, even though someone else had put it on their computer, simply because it was on their computer.
Also, they do not specify (that I could find, but perhaps I've missed it?) that PR information may *only* be accessed through the PR indicator on the toolbar, rather than through the file where the information is stored, which is the real question in this particular issue.
Maybe GoogleGuy can clear up the matter for us. I would personally use the information myself, but I advise everyone to make their own decision in the matter and do the research on what constitutes ownership / possession in their own locale, as well as try to determine if Google officially forbids using the PR information unless it is accessed through the toolbar.
Jordan
The sad fact of the matter is lots of "new age enterprises" post terms and shrink wrap agreements that make the eyes glaze over. Observing a TOS is all about one's own conscience and risk tolerance.
I *have* seen spider software that needed to be tuned for various factors until the spider was indistinguishable from a random browser at the level that the target webmaster was willing to invest in reliably detecting anomalies. At that point it was in the door. But not before multiple ban/tune cycles.
Personally, if it comes over the wire, I'll use it with whatever viewing software is useful to *me*, including automated. If they want to restrict it, they can use a subscription model and quit selling ad space. Follow the money.
+++
>>Pagerank isn't meaningless. It's a good means of ensuring the site hasn't been penalized and isn't part of a bad neighbourhood, or at least that that is what Google thinks.
Maybe I'm wrong on this, but hasn't PR seemed a little behind (weeks, not months ;) where the page probably actually stands (and, more importantly, how it ranks in the SERPs)? Ditto with backlinks.
While I still look at backlinks, more interesting to me of late is the rate at which new pages are added to their index. Check out how many pages are in the various servers over a week or so and you can see variances. Not sure if there's anything there, but it's always nice to have a few new pages of content in the index.
Back to topic-topic, though, I would say you're breaking the TOS having something automatically open browser windows. Google is queried at that point, afaik, to grab the PR. Automating that would, imho, be clearly against the TOS.
My two SERPs,
kpaul
And if I put Internet Explorer in my list of programs to launch when my PC boots? Isn't that automation too?
I don't want to get into an argument about semantics, because in the end, if Google doesn't like what I'm doing, they will "ex-communicate" me anyway and any discussion of what defines automation would be null and void.
I think Plumsauce is right, but I would like to play the game within the rules wherever I can.