If you were ranking pages for something like a search engine, wouldn't you be more interested in what is not found on the pages than in what is?
Maybe this has been discussed to death, and I just missed it, but I have never really seen it talked about before (but I do not study this stuff like I am sure many here do).
Just a small walk down this page, and you find
Terms of service ¦ Privacy Policy ¦ Report Problem ¦ About
All trademarks and copyrights held by respective owners.
Member comments are owned by the poster.
BestBBS v3.15 (c) WebmasterWorld.com 1998-2004 all rights reserved
Now that is meaningless in and of itself, but if you were ranking sites, would it not be easy to look for things like a
contact link
privacy policy
employment
help
tos
site map
etc
I've got no clue, but it seems that any real content page is going to have a certain percentage of those kinds of links, under one name or another.
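To make the idea concrete, here is a rough sketch of how such a check could look. It is only an illustration: the category names come from the list above, but the matching patterns, the function names, and the idea of turning the result into a fraction are all assumptions of mine, not anything a search engine is known to do.

```python
import re
from html.parser import HTMLParser

# Category names are from the list above; the regex patterns are
# hypothetical guesses at how such links tend to be named.
EXPECTED = {
    "contact link": r"contact",
    "privacy policy": r"privacy",
    "employment": r"employment|jobs|careers",
    "help": r"\b(help|faq)\b",
    "tos": r"\b(tos|terms)\b",
    "site map": r"site\s*map",
}


class LinkCollector(HTMLParser):
    """Collects (href, anchor text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def housekeeping_links(html):
    """Return the set of expected link categories found on the page."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    found = set()
    for href, text in parser.links:
        haystack = f"{href} {text}".lower()
        for name, pattern in EXPECTED.items():
            if re.search(pattern, haystack):
                found.add(name)
    return found


def housekeeping_score(html):
    """Fraction of the expected categories present (0.0 to 1.0)."""
    return len(housekeeping_links(html)) / len(EXPECTED)
```

Run against a footer like the one quoted above, this would pick up the terms of service and privacy policy links and nothing else. Note that it only looks at link names, which is exactly the part that is cheap to fake.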
Another thing I would probably look at is whether the images are on another domain name, and whether that domain name is on the same IP. That really lowers the chance that it's an image server, as I don't see much gain from one on the same IP, on the same box, probably running on the same server software, though I guess it would be possible to have a small image server running on the same IP and just route it through a different port or the like.
But really, I am curious whether anyone has looked into what is not on a page that normally is on a page in some form, and if so, what conclusions they reached.
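The same-IP image check could be sketched roughly like this. Everything here is hypothetical: the function name, the injectable `resolve` parameter (there so the logic can be tried without touching DNS), and the simplification of comparing full hostnames rather than registered domain names.

```python
import socket
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> element."""

    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)


def same_ip_image_hosts(page_url, html, resolve=socket.gethostbyname):
    """Return image hosts that differ from the page host but share its IP.

    Simplification: hosts are compared as full hostnames, so a subdomain
    of the same site counts as "another" host here.
    """
    page_host = urlparse(page_url).hostname
    page_ip = resolve(page_host)
    parser = ImgCollector()
    parser.feed(html)
    parser.close()
    flagged = set()
    for src in parser.srcs:
        # Resolve relative image paths against the page URL first.
        host = urlparse(urljoin(page_url, src)).hostname
        if not host or host == page_host:
            continue
        try:
            if resolve(host) == page_ip:
                flagged.add(host)
        except OSError:
            # Host did not resolve; nothing to compare.
            continue
    return flagged
```

For experimenting, `resolve` can be replaced with a lookup in a plain dict of hostname to IP. Shared hosting puts many unrelated sites on one IP, so a match on its own says little.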
Sure about that? In particular, I would think the gang at Google would want to be favoring content that was created just to be informative and valuable to the user, rather than created with doing well in SEs in mind. Let's imagine that Jane is a fan of 1970s bubblegum rock music, and puts up a site about that. This is a labor of love site, and the idea isn't to sell things. Why would Jane have a link about employment, a site map, a TOS, etc? Of those you list, all I'd expect is some sort of contact method listed.
The flaw in your logic is that you are defining a "real" site as one that is commercial but not spam. From the very beginning, Google has favored non-commercial info sites, PageRank being the perfect example. Amateur sites tend to link quite freely to other sites, even just because they happen to personally know and like the other webmaster. Commercial sites tend to be hesitant to link out. They don't want to send people to the competition.
And Google makes money selling AdWords. They have little incentive to want to favor commercial sites on the SERPs. Thus, you may want to be asking the question "What are the characteristics of the typical site Google wants to favor?"
If you are trying to design some feature which can be exploited by commercial spammers but (as has already been pointed out) will seldom be found on true informational sites, I believe you have a winner.
If you are trying to design a feature which is implementable, I think you're depending too much on the backend information processing features of the Mark I human eyeball -- still a unique design, even with all Google's work.
I confess, I've been having similar fantasies: any page that contains all of the words "Las Vegas", "Orlando", and "Hotels" should be delivered to oblivion, with a page rank penalty of -6 or so, and not allowed to be indexed for any of those words. But I run into one of the problems I've pointed out above: such a test, however implemented, would be simple enough in effect to be spotted and defeated by any of several schemes. Such a thing could only be a short-term solution, as would blocking out any site with an explicit link to hotelnow.
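A toy version of that all-three-words test shows how little it takes to get around it. The function name and the sample strings are made up for illustration, and the digit substitution is just one of many possible dodges.

```python
import re

# The three terms come from the post above; matching is case-insensitive.
BLOCKED_PHRASES = ("las vegas", "orlando", "hotels")


def naive_spam_test(text):
    """True if the text contains every blocked phrase."""
    normalized = re.sub(r"\s+", " ", text.lower())
    return all(phrase in normalized for phrase in BLOCKED_PHRASES)


honest = "Cheap hotels in Las Vegas and Orlando"
evasive = "Cheap h0tels in Las Vegas and Orlando"  # one character swapped

print(naive_spam_test(honest))   # True
print(naive_spam_test(evasive))  # False
```

A single swapped character slips past the filter, and every patch to the filter invites the next small change on the other side.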
OK, even affiliate sites can have those. What's so difficult about putting up a TOS and an employment page?
Now what's so difficult about getting DMOZ, Yahoo, a number of chambers of commerce sites, a couple of .edu and a few .gov sites to link to your site?
Do a search on [site:stanford.edu research]. How many employment links do you find? Only a minority of them even have a copyright notice. Yet, in their fields, I would bet that many of these sites are very important.
That said, there might be some actual value in what you suggest, but I do not think that they would be of any overriding importance.
And I can tell you what the spammers do, because they try to do it to us all the time. "What can I add to my page to make it listable in the ODP?" If we tell them, "you have to have a TOS," they add a TOS. If we tell them, "you have to have an address in Ulan Bator," then by gum, within 15 minutes they've made up an address in Ulan Bator. And they tell us "We just moved. That's right. We just closed our office in Sao Paulo, and moved all our employees, files, phone numbers, and legal charters to Outer Mongolia." And they have the web page to prove it (in Portuguese and Babelfished Tocharian).
That's the kind of guns the enemy is training on your fort. How many shots will it take for them to blow away the gate? Do you think it would last a whole month? I don't.