|Google knows all links and backlinks but isn't sharing. Would you pay?|
Just a thought.
Imagine Google offered a service where you could pay for a comprehensive list of backlinks for a site: very much like the link: function, but one that works 100% properly.
Would you use the service?
I know that it would be very unpopular, and hence Google is unlikely to do it, but the question is more about the value of backlinks than about Google making a penny.
Also, if the list was 1,000 links long, how much would you see as reasonable?
I would not pay even a penny, simply not worth it. I think Google may well need to eventually abandon the concept of PR and Hilltop. Links will only be important to drive traffic to your website from the linking site.
it doesn't really matter since there are other sources where you can easily see ALL of your backlinks...
yahoo for example lists every damn backlink regardless how unimportant it is :)
|I would not pay even a penny, simply not worth it. I think Google may well need to eventually abandon the concept of PR and Hilltop. Links will only be important to drive traffic to your website from the linking site. |
I think the PR concept would continue to be valuable if it were combined with positive and negative site weightings from an in-house editorial team. Inbound links from certain domains would transfer more or less than their nominal PR, depending on the weighting factors they'd been assigned. Once this top-secret directory of domains and their positive or negative weightings became large enough, the trickle-down effect of the assigned weightings would help to neutralize the PR influence of link exchanges, purchased text links, networks of affiliate sites, etc. in topic areas where "PR abuse" was considered to be a problem.
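The idea above can be sketched in code. This is a minimal, hypothetical illustration: a textbook damping-factor PageRank where each source domain's transferred rank is scaled by an editorial multiplier. The domain names, weight values, and function names are all invented here; nothing below reflects how Google actually computes anything.

```python
# Sketch: PageRank with hypothetical per-domain editorial weightings.
# All domain names and weight values are illustrative, not real data.

DAMPING = 0.85

def weighted_pagerank(links, editorial_weight, iterations=50):
    """links: {page: [pages it links to]}.
    editorial_weight: {page: multiplier}, defaulting to 1.0 (neutral)."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - DAMPING) / len(pages) for p in pages}
        for source, targets in links.items():
            if not targets:
                continue
            # The editorial weight scales how much PR this source passes on,
            # so a weighted domain transfers more or less than its nominal PR.
            share = rank[source] * editorial_weight.get(source, 1.0) / len(targets)
            for target in targets:
                new_rank[target] += DAMPING * share
        rank = new_rank
    return rank

links = {
    "trusted.example": ["mysite.example"],
    "linkfarm.example": ["mysite.example"],
    "mysite.example": [],
}
# The in-house editorial team's (hypothetical) calls:
weights = {"trusted.example": 1.5, "linkfarm.example": 0.1}
ranks = weighted_pagerank(links, weights)
```

With the weights applied, the link farm's link transfers only a tenth of its nominal share, so the target ends up with less rank than it would under plain PageRank, which is exactly the "neutralizing" effect described above.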
Nope, wouldn't pay.
The only value to me would be to find sites that link to similar sites to mine. There are lots of other ways to do something like that without using the link command.
<< yahoo for example lists every damn backlink regardless how unimportant it is :) >>
No it doesn't. I have a site that has some really obscure links to it. Even on the same day, using various Yahoo link tools, I get different returns; it's very random. The rough numbers are there, but there are sites I know link to the site that simply aren't returned, or are listed once and then vanish for months or even years. As far as I know, no search engine currently has a really reliable link command, although Yahoo's is the best. Part of the problem is that Yahoo seems to be randomly dumping pages out of its index, so if your site is linked to on a dumped page, suddenly that link is gone.
Time for a new search engine to be built I think, one that's built from the ground up to handle at least 40 billion pages to start. I don't see google, yahoo, or msn filling that role today.
|yahoo for example lists every damn backlink regardless how unimportant it is |
Besides what 2by4 said, another problem seems to be that only the first 1,000 results are displayed. I know several sites with more than 10k links, but you can retrieve only the first 1,000 results. You can dig up a few of the remaining backlinks by using a variety of keyword combinations, but that's not the point.
If there were some SE, including Google which offered a paid service for a complete list of backlinks for a site, I would be the first to use it :)
|Also, if the list was 1,000 links long, how much would you see as reasonable? |
I would pay only if they made available every damn backlink, not just 1,000 links. That's almost available from Yahoo as of now.
|I think the PR concept would continue to be valuable if it were combined with positive and negative site weightings from an in-house editorial team. |
I don't think Google will ever do this. If I read their culture right, they are deeply, ideologically committed to algorithmic solutions. Human-based solutions have no economies of scale, and you have to review them regularly. An automated solution has no such limitations. Google is filled with engineers who think the web knows more than any human editor does. And, imho, they are right :)
|If I read their culture right, they are deeply, ideologically committed to algorithmic solutions. Human-based solutions have no economies of scale, and you have to review them regularly. |
What is PageRank if not a "human-based solution"? Without humans providing links, the formula wouldn't work.
For that matter, Google makes use of humans to define, refine, test, and monitor its algorithms. It also uses humans for QC purposes. What I'm suggesting is simply another set of human-set QC standards for the algorithm--one that would reduce the need for manual QC checks by preserving the usefulness of PageRank (which has been weakened considerably since the original PR formula was published).
Oh, I totally agree with you. I thought you meant that Google would supplement its algorithm with human evaluations (i.e., evaluations by Google employees).
> Time for a new search engine to be built I think
I recently read in a newspaper (I forget where) about an open-source P2P project along those lines. Just as with the SETI project, you dedicate some of your hard-drive space and processor capacity to the project. The program uses spare bandwidth, or scans the pages you have visited from your temp files, and contributes the results to the major index. Since crawling and indexing are thus executed in parallel, the organisation would only have to cope with query bandwidth, and even that might be handled by a parallel/redundant architecture. They said the system would produce its first reasonable results with only a few hundred participating computers. Has anyone else heard of this and got a good source at hand?
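The contribute-and-merge idea described above can be sketched very simply: each peer builds a partial inverted index from whatever pages it crawled (or has in its cache), and the central or replicated index only has to merge those partial results. Everything here (names, URLs, structure) is an illustrative guess at the architecture, not code from any real project.

```python
# Sketch of the contribute-and-merge idea behind a P2P crawler/indexer.
# All names and URLs are illustrative.

from collections import defaultdict

def peer_crawl(pages_assigned):
    """Each peer indexes only the pages it was assigned or has cached."""
    partial_index = defaultdict(set)
    for url, text in pages_assigned.items():
        for word in text.lower().split():
            partial_index[word].add(url)
    return partial_index

def merge_into(master, partial):
    """The central index only has to merge partial results, not crawl."""
    for word, urls in partial.items():
        master[word] |= urls

master_index = defaultdict(set)
peer_a = peer_crawl({"http://a.example": "green widgets"})
peer_b = peer_crawl({"http://b.example": "blue widgets"})
for partial in (peer_a, peer_b):
    merge_into(master_index, partial)
```

The appeal is that the expensive parts (fetching and tokenizing pages) run in parallel on the peers, leaving only the cheap merge step for the organisation.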
Apart from the linguistic issues in information retrieval, I would agree with the OP that at present it would indeed be quite interesting to get thorough information about the mere (link) STRUCTURE of the internet. Note the emphasis Brin and Page put on performance issues in their first papers. With the rise of PHP producing thousands of URLs that all serve the same content, on millions of websites, almost every day, I see no way a single organisation like Google or whoever will really cope with this in ten years or so.
Has anyone in here ever calculated how much time and bandwidth it would cost to crawl the whole internet (I mean those sites you can reach starting from DMOZ and Yahoo, like Google does)? My host offers 4.6 Mbit bidirectional bandwidth on his dedicated servers, which means more than 500 kB per second. What's the average length of a webpage? 10 kB? 20 kB? That would amount to only a few thousand days on a single machine...
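The back-of-envelope sum above can be checked directly. All inputs are the post's own assumptions (4.6 Mbit/s, a page size between the 10 kB and 20 kB guesses) plus a rough 2004-era index size of four billion pages:

```python
# Back-of-envelope crawl-time estimate, using the figures from the post.
# PAGES is an assumption: roughly the size of a 2004-era search index.

PAGES = 4_000_000_000          # assumed number of reachable pages
AVG_PAGE_BYTES = 15 * 1024     # between the 10 kB and 20 kB guesses
BANDWIDTH_BYTES_S = 4.6e6 / 8  # 4.6 Mbit/s is roughly 575 kB/s

total_bytes = PAGES * AVG_PAGE_BYTES
seconds = total_bytes / BANDWIDTH_BYTES_S
days = seconds / 86400
print(round(days))  # on the order of 1,200 days for a single machine
```

So even under these generous assumptions, one machine would need over three years of continuous downloading, before counting indexing, recrawling, or the 40-billion-page figure mentioned earlier, which multiplies the total tenfold.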
Alltheweb lists ALL backlinks as far as I know
Maybe I should phrase this a fair bit better:
Forget Google and all the others, imagine a service that allowed you to do the following:
Lists delivered in .csv and other formats.
Lists of backlinks for any site you wish.
Lists of sites that particular sites link TO.
PR or similar ranking already in the results.
Each link also states how many outbound links are on that page. (thus likely benefit of link from page)
The principle keywords associated with each page.
How often the page changes (difficult for dynamic pages, but a boon if it tells you no one has changed the site in 6 months).
These are the things I was hoping to see as ideas (not that they are the best, just thinking as I'm typing)
What would you like to see?
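Putting the wishlist above together, one row of such a .csv report might look like the following. Every column name and value here is invented purely to illustrate the format; no such service exists:

```python
# Hypothetical .csv layout for the backlink report sketched in the wishlist.
# All column names, URLs, and values are invented for illustration.

import csv
import io

FIELDS = ["backlink_url", "target_url", "page_rank",
          "outbound_links_on_page", "principal_keywords",
          "last_changed_days_ago"]

rows = [
    {"backlink_url": "http://example.org/resources.html",
     "target_url": "http://mysite.example/",
     "page_rank": 5,
     "outbound_links_on_page": 42,   # hints at how much the link is worth
     "principal_keywords": "widgets; blue widgets",
     "last_changed_days_ago": 180},  # i.e. untouched for 6 months
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

The point of the `outbound_links_on_page` column is the one made in the wishlist: a link from a page with 42 outbound links passes far less value than one from a page with 3.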
|What would you like to see? |
None of the above.
I would pay a lot for a list of my competitors' links that actually count. For example: for the search phrase “WIDGET A”, the first site that ranks has 100 links, of which the following count, as follows:
www.site1.com – 10 points
www.site2.com – 8 points
But I am sure that there will never be such a list.
A recent interview with a Google engineer, a theory guy, shows that Google knows perfectly well that spam can be detected much better by humans than by any algo. But because they have that stupid ideology where everything has to be run by machine in the large-scale components of their system, their system has almost collapsed this year. It's my opinion that either they wake up and start adding some human judgement somewhere along the process (it could follow a long series of automated spam detections), or they will never beat spammers, who do use such human judgement to create sites that are just close enough to the current favored model of site layout and design to be undetectable to a machine as spam, but easily detectable by a human. I've read this suggested elsewhere, and it struck me as not a bad idea, if you're willing to adopt a creative, flexible, open-minded approach to the problem. Oh, whoops, I guess I won't be seeing that any time in the near future if this year's dismal Google performance is any indication.
look on yahoo
Does Yahoo use a system like that? That would explain certain things. But their search is so weak; it doesn't index much more than a few pages of many sites currently, and it drops pages. It's as if they are trying to run a system that needed a big overhaul last year without spending any money on it. Can't figure that company out, to be honest.
I found an interesting result for backlinks with the widexl dot com link popularity tool. While the Google and Yahoo counts look about right as far as the link command goes, the MSN beta shows multiple times more links than all the others.