
Google SEO News and Discussion Forum

Google SEO Mythbusters - what is proven and what is guesswork?
tedster




msg:3139078
 8:03 pm on Oct 29, 2006 (gmt 0)

There are lots of ideas that float around the SEO world. Some are true - either tested (best) or established by authoritative communication from Google spokespersons (second best). For some there is good suggestive evidence, but not 100% proof. Other SEO ideas may be opinion, or even just plain wrong.

What SEO "facts" fall into what category? What kind of testing would you do to move a given idea up a notch or two on an SEO "trustability" scale of:

True
Probable
Opinion
Myth

1. Google evaluates the content section of a page differently from the rest of the template. I say "True".
I've seen this effect several times in the power of a link in the body of an article compared to a link somewhere in a "related links" section. Links in the body content rock the house.

2. Google is using human editorial input to affect the SERP. I say that's "Probable".
I sure can't think of a test to make this one proven. The big fuss over eval.google.com is several years old, and Google has now filed a patent on how to integrate editorial input into the algorithm.

3. Using a dedicated IP address helps in ranking. I say that's "Opinion".
In fact, I have moved domains from a shared IP to a dedicated IP and seen no obvious change. I think it depends on the company you keep when you share an IP. If you do your own hosting, this could be tested by starting a new domain on an IP address that already has a few banned sites -- assuming you have managed to get a few domains banned somewhere along the line.

4. Seeing any urls tagged as Supplemental Result means there is a problem. I say that's a "Myth".
In fact, g1smd has shared a lot of work in this area - and while seeing urls tagged as Supplemental MAY BE a problem, there are many reasons for Supplementals. In fact, a given url can appear as both Supplemental (with an older cache date) and as a regular result (with a more recent cache date). This kind of thing is NOT a problem.

WHAT IS PROOF?
One true counter-example is enough, logically, to disprove any proposition. However, being sure you actually have a counter-example can be a challenge with something as complex as today's Google Search. It's a good idea to get a handle on formal logic if you want to untangle the mass of information available about Google. Here's a good resource on that: [nizkor.org...]

So what do others think? Any SEO myths you want to debunk? Tested "truths" you want to share? Testing ideas you want to propose?

How about common fallacies in logic when thinking about SEO?

[edited by: tedster at 5:29 pm (utc) on Oct. 31, 2006]

 

JudgeJeffries




msg:3141444
 7:55 pm on Oct 31, 2006 (gmt 0)

I am experimenting with outbound links this week.

Me too - this part of the discussion has definitely got me thinking, as one of my sites (old site, changed content) does very well with just one (non-reciprocal) outbound link to 'authority' sites on each page, compared to similar sites where I'm hoarding PR.

g1smd




msg:3141564
 9:54 pm on Oct 31, 2006 (gmt 0)

>> However, I have a redevelopment case, now 3 weeks live, where thousands of urls have gone 404, and we only used 301 for maybe 60 urls that got the heaviest search engine traffic. <<

I can offer a site with very bad duplicate content problems (multiple URLs leading to the same content) that was slowly being fixed, when they decided to completely re-organise the site. Several thousand pages were moved to completely new URLs. Just a few dozen pages stayed at the same URL. So far Google seems very slow at picking up the new URLs, but it has only been a month or two.

I think the discussion of "number of 404s to trigger something" probably revolves around the question: "does this look like a completely new site, after a change of ownership?"

In fact, I guess that most bits of the algorithm basically ask of something: "what type of spam is this trying hard not to look like?"

.

>> Anyone -- your competition, for instance -- can create an unlimited number of 404 links to your domain. <<

I would hope that Google ignores a URL that returns a 404 the very first time they spider it. I would hope they check it again a few times, then forget they saw it.

However, URLs that return 200 and content for a while, and then later go 404 are a completely different matter.

Again, one important property of a URL would be whether only internal links contain that URL, only external links contain that URL, or a mixture. I would hope that URLs synthesised only from outside a site would not carry as much weight as those found internally.
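
A minimal sketch in Python of the policy described above -- all names are hypothetical, and this is only the behaviour the poster hopes for, not anything Google has confirmed -- treating 404s differently depending on a URL's history and on whether it was ever linked internally:

from dataclasses import dataclass

@dataclass
class UrlRecord:
    url: str
    ever_returned_200: bool = False   # did this URL ever serve real content?
    consecutive_404s: int = 0
    found_internally: bool = False    # linked from within the site itself
    # (a URL only ever "synthesised" by outside links would leave this False)

DROP_AFTER = 3   # retry a never-seen URL a few times, then forget it

def handle_fetch(record: UrlRecord, status: int) -> str:
    if status == 200:
        record.ever_returned_200 = True
        record.consecutive_404s = 0
        return "index"
    if status == 404:
        record.consecutive_404s += 1
        if not record.ever_returned_200:
            # 404 from the very first crawl: low trust; drop externally-found
            # URLs even sooner than internally linked ones
            limit = DROP_AFTER if record.found_internally else 1
            return "forget" if record.consecutive_404s >= limit else "retry_later"
        # used to return 200 and content, now 404: a real signal about the site
        return "flag_as_gone"
    return "retry_later"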

annej




msg:3141639
 11:20 pm on Oct 31, 2006 (gmt 0)

does very well with just one (non-reciprocal) outbound link to 'authority' sites on each page

I link liberally to pages on other sites that offer more information and/or are a source for each of my articles. I also link to pages like museum pages that have a picture that illustrates something in the article.

You would think I would be leaking PR like crazy but these articles almost always rank on the first page in Google for the topic and often are number one.

I'm not sure if these outbound links are helping me but it is a possibility. I also think it has helped me in getting links from academic and government sites because I list all my resources for an article and link to them when possible.

buckworks




msg:3141641
 11:22 pm on Oct 31, 2006 (gmt 0)

think it has helped me in getting links from academic and government sites

BINGO!

davidof




msg:3142023
 9:45 am on Nov 1, 2006 (gmt 0)

> Maybe I'm not reading this right but if there really were no links to the new pages then without AdSense there is no way Google would have found the page without finding it through AdSense.

Orphan pages. They can be found, for example, if someone with the Google Toolbar or running the Opera browser looks at the page, as this sends the URL of the page to Google. Normally orphan pages don't stay in the index. This has caused some surprises where people have set up "private webs" which then get discovered and indexed by Google.

davidof




msg:3142032
 9:51 am on Nov 1, 2006 (gmt 0)

> Google will penalize a domain for too many 404s. Opinion-Myth

I'm not saying Google does penalize a domain, but it would make some good sense to do so. Google doesn't have infinite resources to spider the web (it just seems that way). If they start getting a large number of 404s on a single domain, the robot may just decide to give up, assuming the site is very broken - at the very least I would expect much slower indexing in the future until the problem is fixed.

Making a connection just to get a 404 requires quite a lot of TCP/IP overhead; actually getting the data afterwards is almost a breeze.

It is like that experiment someone did a while ago with duplicate content, where the spider checked out a few pages in the directory and then decided not to revisit because it had triggered some kind of duplicate threshold.
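
To make the "give up if the site looks broken" idea concrete, here is a toy Python sketch of a host-level crawl-rate adjustment. The thresholds are invented purely for illustration; no such numbers have been published by Google:

def adjust_crawl_rate(pages_fetched: int, errors_404: int,
                      base_rate: float = 1.0) -> float:
    """Return a crawl-rate multiplier for a host, scaled down as the
    share of 404 responses grows (hypothetical thresholds)."""
    if pages_fetched == 0:
        return base_rate
    error_ratio = errors_404 / pages_fetched
    if error_ratio > 0.8:
        return 0.0               # site looks broken or abandoned: stop for now
    if error_ratio > 0.3:
        return base_rate * 0.25  # slow right down until things are fixed
    return base_rate

# e.g. 500 of the last 1000 fetches were 404s -> crawl at a quarter speed
print(adjust_crawl_rate(1000, 500))   # 0.25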

davidof




msg:3142039
 10:10 am on Nov 1, 2006 (gmt 0)

> Confirmed. Infact our hyphenated domain name ranks better than our non-hyphenated version. It's as though the search engines recognise the two words as being separate because of the hyphen.

If search results are an indicator of what Google (along with other search engines) can index, then search engines cannot recognise words that are run together, so:

bluewidgets.com looks like bluewidgets dot com

however Google does recognise '-' as a word separator (but not '_'), just like a space, so

blue-widgets.com looks like blue widgets dot com

Google presumably doesn't do anything with the TLD, although certain TLDs may confer more trust in the site (.org, .edu).

However it is important not to confuse an effect with its cause. Why should a keyword in the URL have a positive effect on the SERPs? Is it that Google puts a lot of weight on the domain and some weight on the rest of the URL? Or is it that most people are lazy when linking to websites and just use the URL as the anchor, so the hyphenated version as anchor text gives an IBL boost?

Almost certainly a combination of the two.

Okay and now some science:

Taken from the Expression Engine forums:

> Derek Jones - 23 August 2006 04:03 PM

> the "Favour dashes rather than underscores as some search engines don't recognise these as word separators" is a myth.

Actually, Derek, it is a myth that it is a myth, and I suggest you try it.

Google, as an example of one search engine, sees underscores as underscores. Try the search with spaces, dashes and underscores and you will see markedly different results. Then find an Expression Engine site that uses "_" as the separator for URLs. Use Google's allinurl: directive to search on just the part of the URL Google has indexed; that way you avoid confusion from obscure search terms where the H1 or even the body text has enough weight to figure in the results. For example:

allinurl:<space separated keywords> - returns nothing

allinurl:<underscore separated keywords> - returns an EE site

That is because Google matches underscores to underscores. Try the same thing with a site using dashes as separators:

allinurl:<space separated keywords>
allinurl:<dash separated keywords>

both searches match the page: <edited>

In fact with Google, putting dashes in the search term is the same as putting the phrase in quotes, at least as far as the URL is concerned. That means that the exact phrase must occur somewhere in the URL. So:

big-baby

would match

he-is-a-big-baby-boy

but would not match

the-baby-big-pram

This is actually not a bad thing from an SEO viewpoint, as it makes our pages more specific to what is being searched for.

Obviously when you do a general search there are all sorts of things like stemming going on which can change the results depending on dashes, spaces, word order etc. Other search engines operate differently. Last time I checked, MSN search treated underscores as word separators. But then virtually no-one uses MSN search, so who cares?
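
For what it's worth, here is a rough Python sketch of the behaviour described above (the posters' observation, not a documented Google spec): '-' splits a URL slug into words, '_' does not, and a dashed query behaves like a phrase match against the URL:

import re

def slug_tokens(slug: str) -> list[str]:
    # treat '-', '/', and '.' as separators; leave '_' glued together
    return [t for t in re.split(r"[-/.]", slug.lower()) if t]

print(slug_tokens("blue-widgets.com"))   # ['blue', 'widgets', 'com']
print(slug_tokens("blue_widgets.com"))   # ['blue_widgets', 'com']

def url_phrase_match(query: str, slug: str) -> bool:
    """Does the dashed query appear as a consecutive word sequence in the slug?"""
    q = slug_tokens(query)
    s = slug_tokens(slug)
    return any(s[i:i + len(q)] == q for i in range(len(s) - len(q) + 1))

print(url_phrase_match("big-baby", "he-is-a-big-baby-boy"))   # True
print(url_phrase_match("big-baby", "the-baby-big-pram"))      # False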

<Sorry, no specifics.
See Forum Charter [webmasterworld.com]>

[edited by: tedster at 3:53 pm (utc) on Nov. 1, 2006]

trinorthlighting




msg:3142121
 12:47 pm on Nov 1, 2006 (gmt 0)

I forgot to mention, we submitted both of those pages to Google via the URL submit.

It shows that Google has crawling priorities; AdSense pages are always first to get crawled and indexed if they meet the guidelines.

plasma




msg:3142194
 2:25 pm on Nov 1, 2006 (gmt 0)

"adding outbound links to relevant sites makes a BIG difference in SERP results"

[x] TRUE

Proof: About 3 times I posted to a blog and edited Wikipedia pages (on-topic of course) that had always been below my site in the SERPs for certain double keywords, and included a link with the keywords.
Now they're on top of me %-)

texasville




msg:3142198
 2:31 pm on Nov 1, 2006 (gmt 0)

About the underscore versus dash, this is a direct quote from Matt Cutts:

"So if you have a url like word1_word2, Google will only return that page if the user searches for word1_word2 (which almost never happens). If you have a url like word1-word2, that page can be returned for the searches word1, word2, and even gword1 word2.".
Now that's science.

photopassjapan




msg:3142217
 2:47 pm on Nov 1, 2006 (gmt 0)

Slash, hyphen underscore thingie...
That's just on-page info, meaning text, anchor text, alt tags, titles, descriptions and the like... right?

Not URLs. (?)

G recognises _ as word separator in URLs, but not in content...
At least that's my experience.

decaff




msg:3142281
 3:43 pm on Nov 1, 2006 (gmt 0)

hoarding PR is so "three years ago"

connect to authority resources in your sector and add value to your site visitor's (human) experience...the engines tend to view this as a positive

tedster




msg:3142300
 3:58 pm on Nov 1, 2006 (gmt 0)

G recognises _ as word separator in URLs, but not in content

No, the Expression Engine comments above are specifically about the inurl: operator results. The reason behind Google's unexpected treatment of the underscore is that there are many technical keyword searches that perform better when the underscore character is treated as a true character rather than as a word separator. Think of the way FrontPage uses the underbar to begin its dedicated extension folders. There are a multitude of such technical examples.

tedster




msg:3142324
 4:10 pm on Nov 1, 2006 (gmt 0)

File size is a directly measured factor in the ranking algorithm
Opinion-Myth

I recently have been forced to downgrade my own take on the file size factor. Years back, it seemed like there was definitely a sweet spot for file size, and that big html files were taking a negative hit. But now I see many counter-examples. This includes a recently re-developed site whose new html files are WAY too big for traditional SEO wisdom (120 kb and more, on average), and yet the urls moved UP the SERP immediately after launch.

In fact, part of the code bloat comes from a big chunk of javascript. And so I think I must downgrade my thinking about the value of external javascript for SEO to Opinion-Myth -- even though this is still a best practice for the important reasons of download speed and caching.

What I now feel (Opinion) is that file size was never really an algo factor, and Google has greatly improved at ignoring this irrelevant signal and isolating the content for ranking purposes, even on very bloated pages. In other words, the file size phenomenon was only peripherally related to ranking, but I was making a "post hoc ergo propter hoc" error in my logic.

photopassjapan




msg:3142374
 4:47 pm on Nov 1, 2006 (gmt 0)

G recognises _ as word separator in URLs, but not in content

No, the Expression Engine comments above are specifically about the inurl: operator results.

Ah. You're right, i just checked O.o

But...
I still don't get it.

When i do a search on a certain keyword combination (city and district names) i get results from our site that highlight the keywords in the URL, even though it's actually a folder, /cityname_-_districtname/. It does get recognized when doing just a simple search.

Now if i do what you said to do, the inurl: thing...
It's nowhere.

Um...
I could live with it not being visible for the inurl operator search, only normal searches, but this got me thinking for a moment...

What's the deal with this _ then?

tedster




msg:3142400
 4:56 pm on Nov 1, 2006 (gmt 0)

Highlighting is just a character string match done over the calculated SERP as a very last step. So it doesn't indicate that each highlighted term was actually used in the algorithm's calculation -- it's just a way of showing the end user that their search terms can be seen here, and here, and here in the search results. Nothing more than that.
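
A minimal sketch of that last display step, assuming nothing beyond simple string matching over an already-ranked result (the highlight() helper is purely illustrative, not any real Google code):

import re

def highlight(snippet: str, query: str) -> str:
    """Wrap every literal occurrence of each query term in <b> tags."""
    for term in query.split():
        snippet = re.sub(re.escape(term), lambda m: f"<b>{m.group(0)}</b>",
                         snippet, flags=re.IGNORECASE)
    return snippet

# the URL is just another string to the display layer
print(highlight("example.com/cityname_-_districtname/", "cityname districtname"))
# example.com/<b>cityname</b>_-_<b>districtname</b>/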

[edited by: tedster at 5:07 pm (utc) on Nov. 1, 2006]

g1smd




msg:3142410
 5:06 pm on Nov 1, 2006 (gmt 0)

Ah, but the highlighting of URLs in the SERPs is simply a display process that highlights occurrences of whatever full or partial character strings you searched for. It is not related to ranking at all. I often see logic failures with it, with some words highlighted and others not, or even just part of a word highlighted.

[Heh, Tedster got there while I was on the phone.]

Lorel




msg:3142463
 5:40 pm on Nov 1, 2006 (gmt 0)


BTW, the sites that sit on top of this serps are

1. old and

2. are .gov sites or .edu

I would think the trust factor for them is high.

Hmmmm. This sounds like multilevel marketing:

First ones in get all the benefits.

Everyone else has to purchase AdWords.

davidof




msg:3142465
 5:42 pm on Nov 1, 2006 (gmt 0)

> Ah, but the highlighting of URLs in the SERPs is simply a display process that highlights occurrences of whatever full or partial character strings you searched for. It is not related to ranking at all.

but it is one of the most pervasive Google myths.

davidof




msg:3142469
 5:48 pm on Nov 1, 2006 (gmt 0)

> I recently have been forced to downgrade my own take on the file size factor.

If files are too long or too slow at downloading, googlebot will not always download the whole of the file. I'm sure Googlebot has both a file-size limit and a time limit somewhere, otherwise the process could be tied up indefinitely downloading one file.

In addition I haven't checked recently but there was a limit on how much of a page is indexed. Last time I looked it was around 500kb. Okay not many pages are that big but I suspect that this is actually a googlebot download limit.
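
As an illustration only -- the 500 kb figure is just the recollection above, and this sketch uses the third-party requests library -- a fetcher with a byte cap and a timeout might look like this:

import requests

MAX_BYTES = 500 * 1024     # illustrative cap taken from the post above
FETCH_TIMEOUT = 10         # seconds before giving up on a slow host

def capped_fetch(url: str) -> bytes:
    body = b""
    with requests.get(url, stream=True, timeout=FETCH_TIMEOUT) as resp:
        resp.raise_for_status()
        for chunk in resp.iter_content(chunk_size=8192):
            body += chunk
            if len(body) >= MAX_BYTES:
                break          # keep only what was fetched so far
    return body[:MAX_BYTES]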

photopassjapan




msg:3142525
 6:32 pm on Nov 1, 2006 (gmt 0)

> Ah, but the highlighting of URLs in the SERPs is simply a display process that highlights occurrences of whatever full or partial character strings you searched for. It is not related to ranking at all.

but it is one of the most pervasive Google myths.

I think i learned something new today.
Yaay...!
:)

Now all i want to know is whether the co-ops labeling websites will feed back into general (i.e. non-refined) queries.

RonnieG




msg:3142653
 7:54 pm on Nov 1, 2006 (gmt 0)

A debate I am having with other webmaster friends:

Google still penalizes the target page of GoDaddy 302 redirects (and also possibly the target pages of other "blackhat" 302 redirects).

My opinion: Myth, based on prior Googlebug 302 fiasco, since supposedly fixed.

Proofs:
1) Several 302 redirects point from my GoDaddy vanity domains to my primary home page, and it is still indexed, and not supplemental.
2) No 302 redirects to my site are listed with the allinurl: or inurl: commands, including the GoDaddy redirects I know are in place.

Any other indications I should be looking for in this case?
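
One further check worth adding to that list -- a hedged sketch using the Python requests library, with a hypothetical vanity domain standing in for the real one -- is to confirm what status code and Location header the vanity domains actually return, since the whole 302-hijack worry hinges on 302 versus 301:

import requests

def redirect_status(url: str):
    """Return the raw status code and Location header without following the redirect."""
    resp = requests.get(url, allow_redirects=False, timeout=10)
    return resp.status_code, resp.headers.get("Location", "")

# hypothetical vanity domain -- substitute your own
print(redirect_status("http://www.example-vanity-domain.com/"))
# e.g. (302, 'http://www.example.com/')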

[edited by: RonnieG at 7:56 pm (utc) on Nov. 1, 2006]

techstyled




msg:3145462
 10:12 pm on Nov 3, 2006 (gmt 0)

In the opening post, I mentioned logical fallacies that can corrupt your SEO process, and here's a big one that might be in play on the Sitemaps issue. The fallacy is called post hoc ergo propter hoc, or translated from the Latin, "after this, therefore because of this".

Sorry I'm not adding more to the discussion other than to say thanks, but I just finished re-reading this entire thread and I wanted to specifically thank tedster for starting this discussion and also for pointing out the logical fallacies.

This thread has helped me firm up some of my current thoughts on SEO and allowed me to eliminate some of my "I think it is this way but don't really know" thoughts on the matter.

The discussion of logical fallacies has helped me re-think my position on several current real-life issues, even some as mundane as what was causing my newborn's apparent late-night feeding discomfort.

This thread is the exact example of why anyone serious about SEO/SEM should be on this board soaking up the knowledge.

Thanks all (and specifically you tedster)

Oliver Henniges




msg:3145500
 11:08 pm on Nov 3, 2006 (gmt 0)

In addition I haven't checked recently but there was a limit on how much of a page is indexed. Last time I looked it was around 500kb. Okay not many pages are that big but I suspect that this is actually a googlebot download limit.

I have a 70 MByte pdf version of one of my supplier's catalogues on the web, and googlebot - after some initial hiccups - has been regularly crawling it for the past six months or so. At least Webmaster Central didn't report any errors.

However, it is not really indexed and has no PageRank. Maybe there are too many images in it and not enough text. Or maybe there is indeed a limit for the indexing process.

alfawolf7




msg:3147210
 10:46 am on Nov 6, 2006 (gmt 0)

<This message was spliced on to this thread from another location.>

Black hat SEO sites promote the idea that one can boost one's ranking by placing one's link at various locations on the page. They sell links outside of the footer area for more. The upper left corner of the page and the middle of the content are the best areas, they claim.

What is your view?
I think it is myth.

[edited by: tedster at 7:10 pm (utc) on Nov. 6, 2006]

tedster




msg:3147843
 8:33 pm on Nov 6, 2006 (gmt 0)

I've seen evidence (and read articles) that Google assesses different "blocks" on the page in different ways. Rather than Myth, I would rate this idea as Probable - True
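
As a toy illustration of the "blocks" idea -- a rough heuristic of my own, not anything Google has confirmed -- one could segment a page into blocks and compare link text against plain text in each block, so a link inside a text-heavy article block stands out from one sitting in a "related links" list. A sketch using the third-party BeautifulSoup library:

from bs4 import BeautifulSoup  # pip install beautifulsoup4

def block_link_density(html: str):
    """Return (tag name, link-text / all-text ratio) for each candidate block."""
    soup = BeautifulSoup(html, "html.parser")
    scores = []
    for block in soup.find_all(["div", "p", "ul", "td"]):
        text_len = len(block.get_text(" ", strip=True))
        link_len = sum(len(a.get_text(" ", strip=True)) for a in block.find_all("a"))
        if text_len:
            scores.append((block.name, link_len / text_len))
    # low ratio ~ body content, high ratio ~ navigation or "related links"
    return scores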

claus




msg:3147861
 8:53 pm on Nov 6, 2006 (gmt 0)

Regarding Google I think it's finally time to bury the 302 page hijack - it was a very real thing once, but that was a long time ago.

I have not seen any valid examples of it for several months, although people keep writing me. In my experience it's always something else that is causing people's sites to go AWOL in the SERPs these days.

--
And no, I'm not available for consulting at all; in fact I'm not very available at all.
