[cs.toronto.edu...]
One thing immediately struck me when I looked at the graphs. Hilltop's ranking was, statistically, virtually no better than Google's before Hilltop.
My conclusion:
1. Hilltop was only tested with broad search terms (see Hilltop paper)
2. Mr. Bharat either failed to test or failed to disclose the results for more specific searches (aka money words) such as 'profession city state', 'business city state', etc.
3. Hilltop by its design is indeed a mom-and-pop filter if you look at the desired results (see the Hilltop paper, Figure 1, Hilltop Ranking for the Query: "jobs"; no mom-and-pops show in the results).
4. Hilltop requires result pages for popular searches to be found in directories (expert pages).
5. Hilltop accuracy is not statistically better than the original Google for query accuracy.
6. Hilltop claims a goal of reducing spam (and it does, compared to other search engines, but not compared to Google).
7. But if there was no statistical increase in accuracy over Google (getting rid of spam), as seen in the resulting tests using broad search terms, then the only logical reason to implement Hilltop would seem to be to increase AdWords revenue by filtering out smaller sites (non-name-brand sites that people would not miss), since relevancy does not statistically increase with Hilltop.
Ok...fire away.
It is worth mentioning that Hilltop is an attractive hypothesis for some of what Google seems to be doing; but so far as I know it hasn't been confirmed by any kind of rigorous testing, or by any official statement. Have I missed something?
I've speculated on what Google might do to whack, say, travel-rezzer affiliate spam (they could remove a billion pages from their index without hurting comprehensiveness of the results) WITHOUT using Hilltop. There are all sorts of things that one might try; and by the nature of things most of them wouldn't be mentioned in public.
And it is also worth noting that the definition of "affiliated" could be easily defeated by most serious affiliate spammers: 20 domains on separate ISPs, $500 a month or less, and you could set up your "expert doorways" to your own undetectably associated sites. (In fact, most serious spammers probably already have such a network of doorway domains set up. This may not yet have been true in 1999.) For this reason alone, I doubt if UNMODIFIED Hilltop is in use at Google.
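For anyone who wants to see why that works, here is a rough sketch of the affiliation test as I read it in the Hilltop paper: two hosts count as affiliated if they share the first three octets of their IP address, or if the rightmost non-generic token of their hostnames is the same. The function names, the tiny GENERIC_TOKENS set, and the example hosts and IPs are all my own assumptions for illustration. The paper derives generic suffixes from how often they occur across hosts and also makes the relation transitive, which this sketch leaves out.

```python
# Hypothetical, hand-picked list of generic hostname tokens (an assumption;
# the paper derives generic suffixes from their frequency across hosts).
GENERIC_TOKENS = {"com", "net", "org", "co", "uk", "mx"}


def rightmost_non_generic_token(hostname, generic=GENERIC_TOKENS):
    """Return the first token, scanning from the right, that is not generic."""
    for token in reversed(hostname.lower().split(".")):
        if token and token not in generic:
            return token
    return None


def same_ip_prefix(ip_a, ip_b):
    """True if two dotted-quad IPv4 addresses share their first three octets."""
    return ip_a.split(".")[:3] == ip_b.split(".")[:3]


def affiliated(host_a, ip_a, host_b, ip_b):
    """Pairwise affiliation check: shared IP prefix or shared hostname token."""
    token_a = rightmost_non_generic_token(host_a)
    token_b = rightmost_non_generic_token(host_b)
    same_token = token_a is not None and token_a == token_b
    return same_ip_prefix(ip_a, ip_b) or same_token


# Same company under two country domains: caught by the hostname rule.
print(affiliated("www.ibm.com", "9.1.2.3", "ibm.co.mx", "200.5.6.7"))
# Two doorway domains with unrelated names on separate ISPs: not caught.
print(affiliated("doorway-one.com", "10.0.1.5", "doorway-two.net", "172.16.9.8"))
```

Under these assumptions the second pair slips through both checks, which is exactly the loophole described above: separate hosting and unrelated domain names are enough to look unaffiliated.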
Sorry, but the quality of results has not improved since Florida. I'm actually seeing a much higher level of fakey directory-style spam and so many sites that are just not relevant to the search. Weather and portal sites still dominate the "city real estate" searches. And sites with real content are very difficult to find or completely gone. Google's use of Hilltop acts as a filter that removes smaller sites that lack the directory style or sheer number of links that are now required. Also, the concept of Hilltop has devalued PR and has made LR (local rank or authority status) king. With no real way to determine what the local rank is for a site, Google has made it more difficult for sites to adjust their marketing plans.
Google is now the indexer of directories. I recommend people go directly to any large directory and start their search there, because Google isn't any better.
3. Hilltop by its design is indeed a mom-and-pop filter if you look at the desired results (see the Hilltop paper, Figure 1, Hilltop Ranking for the Query: "jobs"; no mom-and-pops show in the results).
Sorry...can't agree with this. The whole reason you aren't gonna find any "mom and pops" for the term "jobs" should be obvious - 105 million results competing for an overly broad term - with many, many heavyweight sites involved (with higher PR sites mainly listed). Looking over the top 30 in Google, can't really find fault with the listings.
The whole "money word" thing that has been floated around here for months I also don't believe in, either. My main site, a mix of hard earned content and affiliate sales (and very big), has continued to rank just as well if not better than before for "money words". Yet, a secondary "fun site" that is in NO WAY related to any sort of money word (and doesn't even have a single ad of any kind and with a PR of 5) can't be found anymore in Google since January or so.
The difference between the two sites is that my main site has hundreds of links pointing to it - most of them "free links". Yet the PR is only 5 because many links point towards interior pages or because the links are from low PR pages that are still topically related to the pages on my site they point to. By contrast, my "fun site", while having the same PR as my main site, only has a handful of links pointing to it (one from DMOZ), only one of which points to an interior page.
What does seem to have changed is that Google now does indeed judge a site as "authoritative" in some way. I believe it does this by examining the links (not just the number, but the quality) pointing to the site. I also think it likes deep links, too. If your site has the right mix of quality links in sufficient number, your site does seem to get a boost. If your site doesn't have either enough links in total or enough quality links, your site won't be found.
This new filter by G does seem to make it hard to get new sites listed (see previous threads about Sandboxing). I don't think Google is deliberately sandboxing all new sites. Instead, these sites just don't have the right mix of links (number and quality) pointing to them to make them "authoritative".
And since the sites can't be found, it makes it all the harder to get new links to them, since no one can find them.
That said, once your site reaches an "authoritative" status, new pages you put up on your site tend to rank very well right out of the gate. GG once mentioned that Google likes "large", multi-theme sites instead of tiny little ones spread over hundreds of domains. The new filter G is using seems to show this in my opinion.
I personally think the Google SERPs over the past few months are the best they have been in a while. They got significantly better after Austin (when I noticed a ton of crap-quality sites disappear), and refined tweaks to the algo since then have allowed some of the innocent sites that got caught up in the whole mix to come back up in the listings.
My two cents.
Jim
You give as your interpretation of some evidence you've seen, that Google is currently being successfully attacked by that same kind of spam. Since I expressed agnosticism on the issue of whether Hilltop was actually in use, that's not arguing with me either. But I find it interesting.
I do agree that the impression you cite does easily fit the hypothesis that Google is using some form of Hilltop, and that (aside from the sidesniping) it's relevant to this discussion. I could only wish you'd given more details, so those with alternate theories could compare them with the actual evidence you have.
Let us assume Google is indeed applying an improved version of Hilltop in the current algo. I would like to make a few things clear from the standpoint of a confused webmaster caught in the middle of Hilltop/authority site/LSI/CIRCA et al.
A large site can have different subjects dealt with in its sub-pages. Those subjects can be broad enough topics to be subject to Hilltop. Ex: a widget company dealing with Blue Widget, Red Widget and more. Here Blue Widget and Red Widget are broad enough subjects that Hilltop is applied to them. Enlightened by Hilltop, the site has gone after links from good on-theme sites pointing to the Home page. In the absence of deep links from the respective thematic sites, will those sub-pages rank well for their search terms, no matter how many quality links the Home page has achieved? Will the sub-page dealing with Blue Widget ever get a higher rank, no matter how well the Home page is linked? Or should one treat each page as a separate entity and get deep links to stand a chance?
IMHO Google is intelligent enough to associate Blue Widget as a subject under Widget and still rank the page even if that page doesn't have deep links.
Thank You
Mc
The problem right now is just getting your sites to come under this "authoritative" status, for lack of a better word. And the trick to achieving this is known only to the smart people at Google - and I bet they won't be telling us anytime soon. :)
All the top sites for money related keywords were commercial sites, tweaked and optimized for that particular keyword. Now consider the situation of a surfer who's come to look for the information regarding that particular keyword.
What did Google do after Florida?
The effort was basically to provide a set of mixed results. Little wonder that commercial sites still rule the roost, but the results now contain authoritative information sites, gov sites and other related sites.
Now, while LSI ensured that all the results were very homogeneous, Hilltop ensured that groups of affiliate sites spamming the results were removed. Moreover:
1) PR hasn't lost its importance. It's now a part of the ranking algorithm. The number of votes for your site and the maturity level of your site are important.
2) Small sites with few pages are gone.
3) Hilltop has made sure that official sites retain the top position.
Optimization till now was just child's play. The real game starts now. I hope we all are enjoying this.
I'm sure your love is still intact, ILUVSRCHENGINES.
;>)
Ya think? I have received several emails this week from merchants thanking affiliates for their efforts and resulting in record earnings for the merchants (and the affiliates of course).
If anything, a Hilltop-type algo will do more for affiliates and ad-based directories than anyone. A number of the "Hilltop" sites may be Google AdSense publishers, but a greater number will be affiliate sites.
"Hilltop" may kill the proverbial "Mom and Pop" sites, but it doesn't damage the big players, which include many affiliates.
My 2 cents
Mc