| This 117 message thread spans 4 pages: < < 117 ( 1  3 4 ) > > || |
|Google SEO Mythbusters - what is proven and what is guesswork?|
There are lots of ideas that float around the SEO world. Some are true - either tested (best) or established by authoritative communication from Google spokespersons (second best). For some there is good suggestive evidence, but not 100% proof. Other SEO ideas may be opinion, or even just plain wrong.
What SEO "facts" fall into what category? What kind of testing would you do to move a given idea up a notch or two on an SEO "trustability" scale of:
1. Google evaluates the content section of a page differently from the rest of the template. I say "True".
I've seen this effect several times in the power of a link in the body of an article compared to a link somewhere in a "related links" section. Links in the body content rock the house.
2. Google is using human editorial input to affect the SERP. I say that's "Probable".
I sure can't think of a test to make this one proven. The big fuss over eval.google.com is several years old, and Google has now filed a patent on how to integrate editorial input into the algorithm.
3. Using a dedicated IP address helps in ranking. I say that's "Opinion".
In fact, I have moved domains from shared IP to a dedicated IP and seen no obvious change. I think it depends on the company you keep when you share an IP. If you do your own hosting, this could be tested by starting a new domain on an IP address that already has a few banned sites -- assuming you have managed to get a few domains banned somewhere along the line.
4. Seeing any urls tagged as Supplemental Result means there is a problem. I say that's a "Myth".
In fact, g1smd has shared a lot of work in this area - and while seeing urls tagged as Supplemental MAY BE a problem, there are many reasons for Supplementals. In fact, a given url can appear as both Supplemental (with an older cache date) and as a regular result (with a more recent cache date). This kind of thing is NOT a problem.
WHAT IS PROOF?
One true counter-example is enough, logically, to disprove any proposition. However, being sure you actually have a counter-example can be a challenge with something as complex as today's Google Search. It's a good idea to get a handle on formal logic if you want to untangle the mass of information available about Google. Here's a good resource on that: [nizkor.org...]
So what do others think? Any SEO myths you want to debunk? Tested "truths" you want to share? Testing ideas you want to propose?
How about common fallacies in logic when thinking about SEO?
[edited by: tedster at 5:29 pm (utc) on Oct. 31, 2006]
Agree with M_Bison. If we consider the ranking as a function of multiple variables, then at different points, changing the value of the same variable will have a different effect. For F(x1,y1,z1), increasing z1 by a bit might raise the value of F, but that is not necessarily true for F(x2,y2,z2).
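To make that concrete, here is a minimal sketch with a made-up scoring function. Nothing in it reflects Google's actual algorithm; `score`, its factors, and the cap on z are all assumptions chosen only to show that the same change to z can help at one point and do nothing at another.

```python
def score(x: float, y: float, z: float) -> float:
    """Hypothetical ranking score with interacting factors (illustration only)."""
    # z only counts in proportion to x, and its contribution is capped at 1.0
    return y + x * min(z, 1.0)


def effect_of_raising_z(x: float, y: float, z: float, step: float = 0.1) -> float:
    """Change in score when z is increased by `step`, everything else equal."""
    return score(x, y, z + step) - score(x, y, z)


# Point 1: z is below the cap, so raising z raises F
gain_1 = effect_of_raising_z(x=2.0, y=1.0, z=0.5)
# Point 2: z is already past the cap, so the same change has no effect
gain_2 = effect_of_raising_z(x=2.0, y=1.0, z=1.5)
print(gain_1, gain_2)
```

A test that only ever samples sites near the second point would wrongly conclude that z "does nothing".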
Google Page Rank affects Traffic levels Directly - Myth
Agree - I managed to exclude all bots from all files with an errant wildcard char in robots.txt (doh!) - result: all PageRank was then n/a (from homepage PR of 5) - SERPs were pretty much the same!
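For anyone wanting to check a robots.txt before uploading it, a rough sketch using Python's standard `urllib.robotparser` follows. The post does not show the offending rule, so `Disallow: /` is used here as a stand-in for any rule that ends up matching every URL; both sample files and the example.com URL are hypothetical. Note that this parser does not expand `*` inside paths, so it is only a sanity check, not a copy of Googlebot's behaviour.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical "errant" file: the rule matches every URL for every bot
BROKEN = "User-agent: *\nDisallow: /\n"
# Hypothetical intended file: only one directory is blocked
INTENDED = "User-agent: *\nDisallow: /private/\n"

URL = "http://www.example.com/index.html"
blocked = can_crawl(BROKEN, URL)
allowed = can_crawl(INTENDED, URL)
print(blocked, allowed)
```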
SEO is alchemy - TRUE
Primary index for competitive keywords will only list you after earning enough trust from sources that already have it ( from co-op listed sites, long timer authorities, so old that "who knows when they became important" sites, .gov and whatnot... or passed on from major players, and all this has nothing to do with PR ) - True
On-page relevance has less to do with ranking than off-site relevance passed over to you with IBLs - i say True...
META descriptions only affect on-page relevance, thus don't really affect ranking much ( unless you sc*ewed them up, for then they might raise a flag ) - opinion ...or rather experience.
Being around for a long time will get your site away with mistakes for a while - True
...but if you don't correct them, they'll get to you - True
The higher your PR is, the faster and more often you're indexed, even though this has no direct connection to ranking - True
The higher PR the more links are followed on a page - opinion
No PR and no deep IBLs to a page may mean "soon to become supplemental no matter how legit, unique and on topic it is, sorry we don't need this many results" - If this is to be true... ain't it just plain funny? Yeah, i'm laughing too >:D
No i'm not.
tedster, thank you very much for the link to that fallacies-page. Let me add this:
According to Sir Karl Popper, apart from pure mathematics, with its proof by induction, we cannot "prove" anything in the "real world" with 100% accuracy. Even if a theorem has worked ten thousand times, this is no proof it will work the 10,001st time. And although computer programs are based on pure mathematics, observing such a black box (like Google's algos) from outside has the same epistemological status as observing the real world. According to finite automata theory, you cannot infer from the output of a computer program the algorithm producing that output.
This means that all you can expect is a status of "very high probability", but not "truth", and I'd dare to correct you insofar as authoritative communication from Google spokespersons probably is not second best, but the best we can expect.
Let me also add a theorem and a proposal to test its probability:
Theorem: User behaviour is an important ranking factor.
Test: Imagine you had written two small tutorials of, let's say, ten chapters each on two different or similar topics with almost equal competitiveness for the relevant keywords. The normal way to publish these would probably be to write and submit a separate static HTML page for each chapter. An alternative would be to store the chapters in a database and provide a list of chapter links with GET variables for the relevant chapters. With these "dynamic" links being idempotent URLs, Google will also be quite likely to index all ten chapters, good for the long tail.
But let me propose a third alternative: Instead of such a link list, use a <select><option ....></select> construction with ten such options for the ten chapters, with an onChange="submit();" event attached to the select and the "action" of that form being this very same (dynamic) page. Google will only index this page with its first chapter, but it will observe that this page is viewed up to ten times in each visitor session. I believe that such a tutorial page will receive a fairly high and STABLE ranking (compared to its traditionally written counterpart), provided the intro page has sufficient content to get to one of the first ten spots for a short testing period.
And the advantage is: Even if this theorem is disproven, you are left with the alternative to rewrite this text the traditional way and get the other nine chapters indexed a few days later.
|BeeDeeDubbleU, maybe there simply is a lot of competition for your commercial site? Established, authority sites? |
There is competition but that does not explain the fact that my site cannot be found for the term mauve widgets, i.e. the domain name, while it gets found as I would have expected for other terms.
I believe that a filter is preventing it from being ranked. Perhaps it is too obvious to Google that I am targeting "mauve widget" traffic. I can think of no other explanation as to why it does not feature in the top 600 results. I am not expecting to be no.1 but it is a white hat site with pertinent, useful content so it should be much further up the list.
|Hyphenated domain names harm ranking - False |
Confirmed. In fact, our hyphenated domain name ranks better than our non-hyphenated version. It's as though the search engines recognise the two words as being separate because of the hyphen.
|The higher your PR is, the faster and more often you're indexed, even though this has no direct connection to ranking - True |
We had to email Google and ask them to slow down their spidering of our site (which has over 200,000 pages indexed in Google) and all the pages have what Google describes as a "low" page rank.
This is how myths are started.
I visit a site to check name days. I find the names on the page, see what I want and leave. I am happy.
I visit a site to check Polish culture. I find a section in a site that has 20 pages on Polish culture. I click around and get what I want. I am happy.
I visit another site to find information about RSS and get an MFA site and click out. I am not happy.
I then visit a site that has information about RSS but it is a mess and it runs me around the site and I do not get what I want. I am not happy.
Any measurement of user data here tells you nothing.
A single click on a page and out can mean satisfaction or dissatisfaction.
Multiple clicks in a site can mean satisfaction or dissatisfaction.
Myths are started by people promoting their own theories without doing their homework.
|The higher your PR is, the faster and more often you're indexed, even though this has no direct connection to ranking - True |
Actually nothing is "TRUE" or "FALSE" in SEO.
All i can say is what i experienced so far.
I'd assume that your experience is less common than mine though.
Which doesn't make it any less important, quite the opposite, for it shows irregular behaviour from G bot and that's always a lead on something big.
Google wanting to keep itself updated so badly that you have to ask them to stop... i know this happens sometimes, but USUALLY low PR pages get updated in the index less often. Usually that is. So that Googlebot doesn't crawl hundreds of billions of pages when it could crawl those of some significance and importance ( only according to G algos of course ), which would be just common sense on their end. People are whining - me included - that their pages aren't refreshed in the database fast enough... or people notifying G of a hyperactive G bot. Both happen, but the first one is more common.
There are other parameters that affect crawl rate too...
Perhaps one of these set G bot into a loop on your site.
Perhaps it was caught up in one of the recent updates, when they more or less reindexed half of the net every other month to build up the data for the new infrastructure. You know, supplementals, co-op stuff, and all this.
But if you say low PR pages are being crawled at the same rate as high PR pages... that sounds just... off. To me. ( there are exceptions, but in general, low PR pages are being ignored more often than not )
But then again i might be wrong, and low-PR pages with no(t much) inbound links, hence no(t much) reason for G algos to mark them for indexing / reindexing ( more often than not ) may have NOTHING to do with crawl rates. (?)
Yes, this may be true.
It's just i've always experienced the other way around.
So maybe there are other triggers, factors as well, and PR - when observed just by itself - indicates lower crawl rates... than on a similar site, on a similar page with higher PR ( ...more "higher end" IBLs ).
URL with keyphrase ranks better - TRUE, for recent updates. From what I see in my industry, having the keyphrase in your URL helps a lot in ranking.
|One true counter-example is enough, logically, to disprove any proposition. |
Unlikely in most cases if weighting is used in a complex algorithm.
What is more likely true is that some things that work on one site might not work on all.
|How about common fallacies in logic when thinking about SEO? |
Binary logic, big fallacy.
Anyone have any data on linking style... example
|Anyone have any data on linking style... example |
I have an example....
951,000 Serps for "keyword1 keyword2 in Kentucky" (without quotation marks) shows the #1 and #2 serps with identical titles;
#1 - keyword1 keyword2 in Kentucky
Showing 1 to 10 of 594 keyword1 keyword2 In Kentucky ... Did you find what you were looking for in keyword1 keyword2 In Kentucky? If not, would you like to try a ..
#2 - keyword1 keyword2 in Kentucky
keyword1 keyword2 in Kentucky - Trucking Companies Hiring Truck Drivers From Kentucky.
Both pages have a PR3.
The third serp has a url of;
The title and description are very different from the first 2, and it is a PR2.
I own the 2nd serp.
[edited by: classa at 4:16 pm (utc) on Oct. 30, 2006]
|links to other internal pages from your own page content is a good way to build links... (ie don't just rely on the navigation menu) - TRUE |
I've found this works when I want to give a page a little nudge in the serps. I make sure the linking is from pages that at least touch on the topic of the page they link to. So the key word or phrase is in the content of the linking page.
Actually it makes sense for the visitor as they can go read more about something mentioned in the linking page.
This isn't a huge factor and probably wouldn't work in a highly competitive topic.
|Google Page Rank affects Traffic levels Directly - Myth |
Over the last 6 months I made a site and carried out a linking programme specifically designed to inflate page rank. The results when i got to PR 5?
No increase in traffic. Full stop.
Sounds odd, doesn't it?
But sanpetra's experience is that of many webmasters. The AOL data leak confirmed this experience. If you are not in the top 3 to 5 results, you almost do not exist to the users.
This was one of the saddest and most important conclusions from that AOL data which was released by accident this year. And, it explains why so many serious marketers are buying ads on Google, Yahoo and such.
Sorry I don't have a link to anyone who did a worthwhile analysis, but as I recall the traffic starts coming strong when you hit the #3 spot, but nothing compares to being #1.
|Google evaluates the content section of a page differently from the rest of the template. |
Your use of the word "template" may be a little misleading in that it anticipates a study of a site to see if there is a template in use, and the development of a heuristic to apply to a site. Regardless of whether there is or not, the analysis may be on a page-by-page basis.
They lay out the framework for such an approach in a patent application on segmentation of a page based upon a visual model:
Document segmentation based on visual gaps [appft1.uspto.gov]
While the document discusses the context of looking at different parts of a page while extracting information from that page for purposes of informing their local search directory, it also explains how this visual gap segmentation can be used to understand more fully what images are about, and to differentiate between different parts of a site. A snippet:
|Although the segmentation process described with reference to FIGS. 4-7 was described as segmenting a document based on geographic signals that correspond to business listings, the general hierarchical segmentation technique could more generally be applied to any type of signal in a document. For example, instead of using geographic signals that correspond to business listings, images in a document may be used (image signals). The segmentation process may then be applied to help determine what text is relevant to what image. Alternatively, the segmentation process described with reference to acts 403 and 404 may be performed on a document without partitioning the document based on a signal. The identified hierarchical segments may then be used to guide classifiers that identify portions of documents which are more or less relevant to the document (e.g., navigational boilerplate is usually less relevant than the central content of a page). |
Similar in some ways to Microsoft's VIPS: a Vision-based Page Segmentation Algorithm [research.microsoft.com] which is expanded upon more fully in Block-level Link Analysis [research.microsoft.com].
IBM suggested an alternative approach in one of their recent patent applications:
Detecting content-rich text [appft1.uspto.gov]
Regardless of how Google might be doing this, chances are that they are capable of it, and may be incorporating something akin to one of those approaches.
Gimp, sorry, but you did not really read or understand what I wrote (experimentally voting for post actions in contrast to href-links). I do have such an example on my website and it is very successful in ranking; what I do NOT have is an equivalent counter-example.
Secondly, your post is a good example of logical fallacies: You mention four "types" of sites: two where people leave after one click and two where people leave after many clicks, each pair covering one good and one bad site. You say this would sufficiently prove that tracking user click-behaviour would be useless.
The key of your fallacy is
|...it is a mess and it runs me around the site.. |
If an algorithm is able to decide between such arbitrary clicking around and a highly informative site carefully guiding you through the topic, this algo might well be able to decide between the good and the bad types of that massive-click-around category.
subdomains rule the day - opinion
I am aware of a couple of sectors in which the majority of city/state/widget searches are filled with single sites using subdomains for every town, county and state in the US.
Makes those particular searches totally useless. It has been a problem for several months.
>Seeing any urls tagged as Supplemental Result means there is a problem
I think this is true for two reasons. One being supplemental results do not rank favorably. It's an indication that your SEO has room for improvement. Two, it's my understanding that supplemental pages are not part of Google's main index. Cutts says they're an experiment to deliver more results for obscure queries. This sounds like sugar coating what it is, which is an index for pages that are not good enough to make it to the main index.
Gimp I too have extreme difficulty envisioning Google using user tracking as a ranking basis. However my problem with it stems from:
1: The size of data sets that would have to be processed.
2: The incompleteness of the tracking data.
But if they wish to make assumptions they can play any game they wish.
Now back to the topic at hand.
The dedicated ip helping your serp position (now pay attention to the words)
Indirectly: True (by preventing same server site crossfire)
Reason for statement: Observed status of multiple sites affected on single shared servers.
If you wish to be safe, the dedicated IP is the way to go, if you think otherwise be my guest and do it your way.
Like others have said it is hard to mistake a fully qualified url for a different one, likewise it is hard to mix a single site on a dedicated ip with another site. Remember, there is more involved in running a site than what goes on at the server.
I'm glad everyone has the black box fully figured out ;-).
What about the use of Flash and its effect on the SERPS? I don't do Flash if I can help it, and thus don't know what's myth and what's true (although I'd like to)
- If your site is entirely in Flash, it will not be seen by Google or other SEs and thus will tank or never rank well at all
Here is my opinion on few more that came up after my first post.
1. Using a single, relevant <H1> tag prominently assists SERPS - 100% True
2. Using double <h>, <title> tags would hurt the rankings - Opinion
I think Google would just ignore it.
3. Page title is the one most important factor for one page ranking for specific keyword - True
There are exceptions to this - I know of a site ranking for a very competitive keyword phrase and they don't even have it on the page.
4. Inbound links from authority and related sites with key phrase in the anchor text are very important for ranking for that key phrase - True
5. Locating your server in the country of your target audience helps SERPS - True
6. Pagerank is useless - Myth
I think of Pagerank as the torque of the website - but it's the webmaster's driving (SEO) that would win the "race"
7. Links inside a <noscript> or <noframes> element get followed. - True
I even have proof of this
9. Links inside a <noscript> or <noframes> element pass PR and other backlink influences. - Opinion
Those don't show in the link: command...
10. Page load time affects rankings - Opinion
Maybe somehow indirectly? Would like to hear more opinions on this one...
11. Page timing out during loading affects rankings - 100% true
If a page doesn't load completely it'll be booted from the SERPs (eventually)
12. Backlinks have less influence when they first appear - Probable
Google has indicated many times that the age of the links is a factor. The real question is how much of a factor it is...
13. Google uses whois records to detect sites and networks of sites. - Opinion
I wouldn't be surprised if that is true. I know they use whois to discover new sites.
14. Google uses toolbar data to gauge quality and affect rank - Probable
I'm almost positive they do this (not like they'd ever admit it...). Also, on the same note, they use personal search to gauge quality and that does affect rank.
15. Google has trouble judging which of three keywords in a phrase makes the key distinction - Probable
I think they use some algo for this, and it isn't perfect...
16. Google penalises sites that it sees as being over optimised - Probable
For me this is almost true. I know of a few sites that were "penalized" for no apparent reason, other than being "over optimised"...
17. Keyword in the domain name helps ranking (for that keyword) - 110% True
I have countless examples to support this. Furthermore, keywords in the URL (directory name, page name) also help.
18. Adding outbound links to relevant sites makes a BIG difference in SERP results - Opinion
Sorry, austtr, but I don't think this is true. It is a factor for non-authoritative sites, that's for sure.
19. Links to other internal pages from your own page content is a good way to build links... (ie don't just rely on the navigation menu) - True
Yes, I have many examples to support this. If one wants to rank high, step numero uno: design the interlinks correctly. The nav menu by itself is not enough.
20. Having Adsense on your site (oldie) helps you rank better - Opinion
Yeah, I know I'll get half of this forum screaming at me for this. There is really no evidence for this. There is no way Google would admit that. But I have a few cases where big Adsense spenders also do well in free search, and their sites shouldn't have been doing THAT well...
21. Having lots of affiliate links on your site lowers your ranking - Opinion
Again, really no evidence for this. But it also falls under "having too many links" on one page - will lower rankings...
22. Google Page Rank affects Traffic levels Directly - Myth
But it sure does affect it indirectly.
23. The higher PR the more links are followed on a page - Probable
For me this is almost true. But also somewhat irrelevant because with proper sitemaps everything gets followed anyway.
24. Hyphenated domain names harm ranking - False
Hyphenated domains do just as well in SERPs.
25. Too long a domain name hurts ranking - Probable
I believe Google's algo compares the "density" of the domain name. For example: www.bluewigets.com and www.blueshinywigets.com - for "wigets" the first domain has a higher density and will rank better. So the length influences ranking indirectly.
26. Keyword in subdomain doesn't rank - False
Ha! You gotta see the SERPs for my industry - pure subdomain spam.
27. Flash affects SERPS - Opinion
If it's just a banner, probably makes no difference. If the whole thing is flash, would probably hurt the rankings - google is still very much text oriented...
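To put numbers on the "density" idea from item 25: the sketch below assumes density simply means the keyword's length divided by the length of the domain's main label. That definition and the `keyword_density` helper are my own illustration, not anything Google has documented.

```python
def keyword_density(domain: str, keyword: str) -> float:
    """Share of the domain's main label taken up by the keyword (hypothetical metric)."""
    host = domain.lower()
    if host.startswith("www."):
        host = host[4:]
    label = host.split(".")[0]  # e.g. "bluewigets"
    if keyword.lower() not in label:
        return 0.0
    return len(keyword) / len(label)


# The two example domains from item 25
short_domain = keyword_density("www.bluewigets.com", "wigets")       # 6 of 10 characters
long_domain = keyword_density("www.blueshinywigets.com", "wigets")   # 6 of 15 characters
print(short_domain, long_domain)
```

Under that assumption the shorter domain scores 6/10 against 6/15 for the longer one, which is the ordering item 25 predicts.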
Using a dedicated IP address helps in ranking: Sometimes.
I had a client on shared hosting where Google kept putting another site's title in place of that site's title. I would complain to Google, they would fix it, and a month later someone else's title was on there, and one was an adult site's title. I moved the site to a dedicated IP and it kept happening. Then I realized I forgot to switch the DNS settings on the domain when it was moved to a dedicated IP. The same site, while still on shared hosting, was also affected by a 302 redirect from another site on the same IP. Moving to the dedicated IP address fixed that problem.
Re believing what Google or Matt Cutts says or that it's the best info we've got : No
They used to claim that there is nothing a competitor can do to harm your site; however, recently they changed that to read there is "practically" nothing a competitor can do..... So you have to take what they say with a grain of salt and do your own research. Google doesn't feel that hijackers are their fault and thus we are left to our own devices to take care of it.
Google is using human editorial input to affect the SERPs: True
I have probably broken the record for informing Google Spam and Google AdSense when I found a scraper or hijacker breaking Google's rules (hidden text, sneaky redirects, etc.). I even got a letter back from Google one day thanking me for my frequent input on spammers. So they not only make note of the letters we send in, they keep track of who's sending in the reports. I have seen them take quick action more than once when a 302 was involved, as the culprit site often disappeared from the SERPs and the affected site recuperated shortly thereafter. However, since Google disabled the inurl command, so it no longer shows hijackings, they haven't heard from me as much.
Fact- Adding adsense to your pages will get you indexed faster as long as people are looking at the pages.
We recently did an experiment, added two pages with totally unique content to one of our sites. Both were about the same size, two paragraphs. One had adsense ads and one did not. Neither page had any links to it.
We then viewed each page once a day. The adsense page was crawled once a day when we visited it, and was cached and indexed within two weeks. The other page has yet to be crawled.
In the previous page of this thread, several people were talking about "speed of indexing".
Some were talking about how many pages per minute Google would spider on a site.
Others were talking about whether a page would be re-indexed several times per day or week, or whether it was only looked at once per fortnight or month.
>> Google evaluates the content section of a page differently from the rest of the template. <<
I have certainly seen it said that Google can evaluate a section of a page with a lot of internal links as being the site navigation, especially if the same links appear on multiple pages, and a large chunk of text as being the real page content.
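A rough sketch of that idea, under my own assumptions: links that repeat on most pages of a site get labelled as template navigation, and everything else as content links. The `classify_links` helper, the 80% threshold, and the sample site are all hypothetical, and this is not the visual-gap method from the patent application or anything Google has confirmed.

```python
from collections import Counter


def classify_links(pages, min_share=0.8):
    """Label each link on each page as 'navigation' or 'content'.

    pages: dict mapping page URL -> list of link targets found on that page.
    A link seen on at least `min_share` of all pages counts as navigation.
    """
    counts = Counter()
    for links in pages.values():
        counts.update(set(links))  # count each link once per page
    threshold = min_share * len(pages)
    return {
        page: {
            link: ("navigation" if counts[link] >= threshold else "content")
            for link in links
        }
        for page, links in pages.items()
    }


# Hypothetical three-page site sharing the same menu
site = {
    "/a.html": ["/", "/about.html", "/contact.html", "/b.html"],
    "/b.html": ["/", "/about.html", "/contact.html", "http://example.org/study"],
    "/c.html": ["/", "/about.html", "/contact.html"],
}
labels = classify_links(site)
print(labels["/a.html"])
```

Here the three menu links come out as navigation, while the in-article link to /b.html and the outbound link are treated as content.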
>> If an algorithm is able to decide between such arbitrary clicking around and a highly informative site carefully guiding you through the topic <<
But once you leave the Google SERP for the real site, how will Google track your path around that site?
Ah yes, from the Toolbar data, and from all the Adsense stuff that is served to you...
Also analytics if you have it installed....
|But once you leave the Google SERP for the real site, how will Google track your path around that site? |
in addition to toolbar, desktop and analytics, don't forget the browser sync firefox extension..
gives google access to your entire browsing history, cookies etc. (if you leave those features enabled)
|OTOH, if you're saying that Google is using humans to individually tweak rank order for specific queries, then I seriously doubt that happens enough to notice, possibly not at all. |
I just posted an interview on my blog with Jon Glick, former Yahoo search manager, and one of the questions was about hand-editing of SERPs. I asked if it happens more than we're led to believe. Here's a portion of his reply:
|Search engine algos do a good job most of the time and are unmatched for scalability over billions of crawled pages and billions of unique queries. However, there are cases where the engines need to have better results for an important query. In these cases, the easiest thing for the search engine team to do is make human edits. |
[edited by: tedster at 1:53 am (utc) on Oct. 31, 2006]
|We then viewed each page once a day. The adsense page was crawled once a day when we visited it, and was cached and indexed within two weeks. The other page has yet to be crawled. |
Not too long ago, Google announced that crawl data from the AdSense bot would be cached for use by Googlebot, and vice versa (the idea being to save bandwidth for both Google and site owners). For logistical reasons alone, it wouldn't be unreasonable for Googlebot to be nudged into crawling data that the AdSense bot has cached on Google's own hard drives.
It's not about bots saving bandwidth, it's about the page getting crawled and indexed. Adsense pages get crawled and indexed a lot quicker!
Google will penalize a domain for too many 404s. Opinion-Myth
While it might seem to make some sense on the surface, 404s alone couldn't be used to penalize an entire domain. Anyone -- your competition, for instance -- can create an unlimited number of 404 links to your domain. The only reason I didn't declare this idea a pure, unambiguous myth is that I'm not sure about the case where a mountain of previously resolving urls now all go 404 or 410.
However, I have a redevelopment case, now 3 weeks live, where thousands of urls have gone 404, and we only used 301 for maybe 60 urls that got the heaviest search engine traffic. The site as a whole is doing BETTER than before the redevelopment, so far at least. So if there is any truth hiding in this idea at all, I still see no evidence, even though some feel they do.
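For anyone planning a similar redevelopment, the approach described above (301 only the busiest old urls, let the rest go 404) can be sketched roughly as follows. The function name, the sample urls, and the visit counts are all made up for illustration; the real decision in that case was made by hand from traffic logs.

```python
def plan_status_codes(search_visits, new_locations, top_n):
    """Return {old_url: (status, target)} for retired urls.

    search_visits: dict old_url -> search-engine visits (hypothetical log data)
    new_locations: dict old_url -> replacement url, where one exists
    top_n: how many of the busiest urls get a 301
    """
    busiest = sorted(search_visits, key=search_visits.get, reverse=True)[:top_n]
    plan = {}
    for url in search_visits:
        if url in busiest and url in new_locations:
            plan[url] = (301, new_locations[url])  # permanent redirect
        else:
            plan[url] = (404, None)  # simply gone
    return plan


# Made-up sample data
visits = {"/old/widgets.html": 900, "/old/about.html": 40, "/old/tmp1.html": 2}
moved = {"/old/widgets.html": "/widgets/", "/old/about.html": "/about/"}
plan = plan_status_codes(visits, moved, top_n=1)
print(plan)
```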