Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 246 message thread spans 9 pages; this is page 8.
Why Does Google Treat "www" & "no-www" As Different?
Canonical Question
Simsi




msg:3094365
 7:58 pm on Sep 23, 2006 (gmt 0)

...why does Google want to treat the "www" and non-"www" versions of a website as different sites? Isn't it pretty obvious that they are one site?

Or am I missing something?

 

g1smd




msg:3100350
 12:26 pm on Sep 28, 2006 (gmt 0)

>> domain.com/this.page.html
>> www.domain.com/this.page.html

>> This is the same page. Yes?

Yes, it is the same page. If both respond "200 OK" then that is the duplicate content problem.

If one responds "301" and the other "200" then the problem is fixed.
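
The rule above can be written down as a small check. This is only a sketch: the function name `diagnose` and its wording are made up for illustration, and the status codes are assumed to have been collected already with a header checker.

```python
def diagnose(statuses):
    """statuses maps hostname -> HTTP status code for the same path."""
    serving = sorted(h for h, s in statuses.items() if s == 200)
    redirecting = sorted(h for h, s in statuses.items() if s == 301)
    if len(serving) > 1:
        # More than one hostname serves the content directly.
        return "duplicate content: " + ", ".join(serving) + " all answer 200"
    if len(serving) == 1 and len(serving) + len(redirecting) == len(statuses):
        # Exactly one hostname serves content; every other one answers 301.
        return "fixed: only " + serving[0] + " answers 200"
    # Anything else (302, 404, ...) needs a closer look by hand.
    return "check manually"
```

For example, `diagnose({"domain.com": 200, "www.domain.com": 200})` reports duplicate content, while `{"domain.com": 301, "www.domain.com": 200}` reports the fixed state.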

Does your method "correct the URL to www" when you ask for non-www or does it just allow the server to serve the content, as is?

Simsi




msg:3100428
 1:39 pm on Sep 28, 2006 (gmt 0)

At the end of the day, web hosting companies should be setting up their servers correctly. Many do not. That is NOT Google's fault.

I don't deny hosting companies could supply the workaround. But the canonical issue wasn't a problem for a webmaster until Google made it a problem for the webmaster.

g1smd




msg:3100435
 1:43 pm on Sep 28, 2006 (gmt 0)

I see plenty of canonical problems in Yahoo's SERPs but they don't cause such a big indexing effect because they don't have the same link-based (counting as part of the) scoring system that Google has. But I do see problems with many sites.

If Yahoo suddenly tagged some results with "alternative index", or something, you would soon see the big kick-off here in the forum. But some URLs are already relegated in Yahoo; there just aren't as many clues.

In other search engines they would throw additional and duplicate data away. Google adds it to a backup index and pulls that data when they run out of stuff in the normal index. That's a win-win situation because the site will appear for more searches than it otherwise would.

lmo4103




msg:3100451
 2:12 pm on Sep 28, 2006 (gmt 0)

ANAME RECORDS - what a cool solution.

Could algorithmguy or someone who understands how to mess with ANAME RECORDS start a thread discussing how to do this, or recommend a thread that already does?

I'm afraid I wouldn't know where to start.

theBear




msg:3100453
 2:17 pm on Sep 28, 2006 (gmt 0)

"I see plenty of canonical problems in Yahoo's SERPs but they don't cause such a big indexing effect because they don't have the same link-based (counting as part of the) scoring system that Google has. But I do see problems with many sites."

You have that correct, MSN as well, Yahoo even after having done 301's.

Simsi




msg:3100506
 2:57 pm on Sep 28, 2006 (gmt 0)

So do Yahoo and MSN impose dupe content penalties for this as well, even though it's less of an impact theoretically?

g1smd




msg:3100515
 3:04 pm on Sep 28, 2006 (gmt 0)

Google doesn't apply a penalty, as such. It's just that, because their algorithm depends on analysing links, having both www and non-www active splits your PR and causes them both to perform less well than a single URL for the same content would have done.

Same thing happens if you have multiple domains pointing at the same content, all returning "200 OK".

WolfLover




msg:3100717
 5:15 pm on Sep 28, 2006 (gmt 0)

Same thing happens if you have multiple domains pointing at the same content, all returning "200 OK".

How does one tell if a page is returning a 200 OK or a 301 or some other code? I mean it's obvious if your page returns a 404 but what about these others?

Now that I've a little better understanding of this situation than before this thread started, I want to ask some more advice.

Now that I have my 301 redirects in place, and most pages are in supplemental, which of the following would be the best tactic to getting my traffic back?

1. Delete the pages that are in supplemental and build new pages for those product pages? Of course, NOT using the same exact title, meta tags, or description.

2. Should I leave the supplemental pages as they are, hoping for their return and in the meantime just build new pages?

3. OR is there a better way than the above two ways?

If these supplemental pages stay in supplemental for a year, and being that there are many main index results so that they will likely never be seen by anyone searching for those particular products, I feel as though I should build new pages for the products that are in supplemental.

My question is would this be the wrong way to handle it? Meanwhile, my earnings have gone way south and I need to figure a way to get it back up and at this point cannot afford a huge AdWords account, so to get my free organic traffic back what is the best move?

theBear




msg:3100733
 5:23 pm on Sep 28, 2006 (gmt 0)

Are you certain that your 301 redirects are in fact 301 and not 302s or other things?

You can use a header checker, or TCP/IP logger, or some other tool to verify that things are as they should be.

Even those 404s should be checked, they may turn out to be 200 after a 302.

<added> I need a bigger key board or smaller paws. </added>

[edited by: theBear at 5:26 pm (utc) on Sep. 28, 2006]

g1smd




msg:3100735
 5:25 pm on Sep 28, 2006 (gmt 0)

To check the HTTP response code in the HTTP header, get the Mozilla, Firefox, or SeaMonkey web browser and add the Live HTTP Headers extension.

Alternatively get a stand-alone program like WebBug but do make sure that you click the HTTP/1.1 option when you use it. It defaults to HTTP/1.0 which often gives the wrong answer.

These checkers show you the server details that are sent out before the visible web page is sent to the browser. You don't normally get to see this information, but as a webmaster it is essential that you check it from time to time.
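
For anyone who would rather script this check than install a browser extension, here is a minimal sketch in Python. It is not one of the tools named above; the names `describe` and `head` are made up for illustration. Python's standard `http.client` speaks HTTP/1.1, sends the Host header, and does not follow redirects, so it reports the first response code as sent.

```python
import http.client

# Short meanings for the status codes discussed in this thread.
MEANINGS = {
    200: "OK: content served at this URL",
    301: "Moved Permanently: permanent redirect",
    302: "Found: temporary redirect",
    404: "Not Found",
}

def describe(status, location=None):
    """Turn a status code (and optional Location header) into one line."""
    text = MEANINGS.get(status, "other")
    if location:
        text += " -> " + location
    return f"{status} {text}"

def head(host, path="/"):
    """Send one HEAD request over HTTP/1.1; return (status, Location)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `describe(*head("example.com"))` and `describe(*head("www.example.com"))` for your own hostnames shows at a glance which one serves content and which one redirects.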

[edited by: g1smd at 5:28 pm (utc) on Sep. 28, 2006]

WolfLover




msg:3100736
 5:26 pm on Sep 28, 2006 (gmt 0)

Google doesn't apply a penalty, as such. It's just that, because their algorithm depends on analysing links, having both www and non-www active splits your PR and causes them both to perform less well than a single URL for the same content would have done.

g1smd, for the entire 3+ years my site has been up, until two days ago, there was never a 301 redirect in place, so I have tons of both www and non-www pages out there. Now that this part is corrected and I am just waiting on Google to update and eventually get rid of the supplementals, does this mean I should get a boost in PR as well? Yes, I know PR is not everything. My site has a home page PR of 5 and most inner pages have PR4 and PR3.

Now that the canonical issue is fixed, should I get a boost in PR? Hey, gotta get something out of this whole mess.

g1smd




msg:3100741
 5:33 pm on Sep 28, 2006 (gmt 0)

Yes. It is possible that some of your pages will go up one PR point.

It will take a few months for anything like that to kick in.

It will depend entirely on exactly what external links you already have pointing to your www pages and to your non-www URLs.

In the meantime, if anyone is currently linking to your non-www pages try and get some of them amended to point to the www instead. There is no big rush to get that done, but try to get a few changed each week.

WolfLover




msg:3100789
 6:07 pm on Sep 28, 2006 (gmt 0)

Ok, I already had Firefox but now have downloaded the Live HTTP Headers on the sidebar.

This is what it showed me, (I changed my domain name of course per TOS).

[mysite.com...]

GET / HTTP/1.1
Host: mysite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: __utmz=132089362.1150080267.5.2.utmccn=(organic)¦utmcsr=google¦utmctr=mysite+¦utmcmd=organic; __utma=132089362.465329161.1143099345.1159465459.1159466137.12; __utmb=132089362

HTTP/1.x 301 Moved Permanently
Date: Thu, 28 Sep 2006 18:01:05 GMT
Server: Apache/2.0.52 (Red Hat)
Location: [mysite.com...]
Content-Length: 325
Connection: Keep-Alive
Keep-Alive: 3
Content-Type: text/html; charset=iso-8859-1
---------------------------------------------

There was much more, but I cut it to keep this shorter. If I am reading this correctly, it does indeed show that my site has the 301 Permanent Redirect rather than the 302 Temporary Redirect.

Am I correct? Does this look good?

theBear




msg:3100798
 6:13 pm on Sep 28, 2006 (gmt 0)

That looks good.

g1smd




msg:3100799
 6:14 pm on Sep 28, 2006 (gmt 0)

That looks good. Now try it for some internal pages.

Make sure that you haven't just got this for the root.

It needs to redirect all URLs, site-wide, and the redirect needs to preserve the full folder and file path in the redirect.
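
A quick way to test that last point for many URLs is to compare each requested URL with the Location header that came back. The sketch below assumes you have already collected those pairs with a header checker; the function name `preserves_path` is made up for illustration.

```python
from urllib.parse import urlsplit

def preserves_path(requested_url, location, canonical_host):
    """True if the redirect lands on canonical_host with the same path and query."""
    req = urlsplit(requested_url)
    loc = urlsplit(location)
    return (
        loc.hostname == canonical_host
        # Treat an empty path and "/" as the same thing.
        and (loc.path or "/") == (req.path or "/")
        and loc.query == req.query
    )
```

A redirect from `http://example.com/folder/page.html` to `http://www.example.com/folder/page.html` passes; one that dumps every request on `http://www.example.com/` does not.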

BigDave




msg:3100801
 6:16 pm on Sep 28, 2006 (gmt 0)

In the meantime, if anyone is currently linking to your non-www pages try and get some of them amended to point to the www instead. There is no big rush to get that done, but try to get a few changed each week.

This advice applies to your own site as well!

If you use relative linking, meaning any link without the domain name, then switch to the full, absolute, preferred URL.

While it will not eliminate the problem, it will greatly reduce it.

By using relative URLs, any incoming links on the "wrong" domain will continue to support that wrong domain with PR, link count and anchor text.

If you change to absolute URLs, that one page still gets the external vote, but all your internal votes go to the correct page.

I suspect that this one change would solve many of the canonical issues.

WolfLover




msg:3100804
 6:18 pm on Sep 28, 2006 (gmt 0)

Now, when I put in the search for www.mysite.com/index.cfm I get the below information. Note that it does change in the search bar to withOUT the index.cfm; however, it does not note that it was a 301 redirect, but also does not say 302 redirect. Does this mean it was a 301 redirect or not?

[mysite.com...]

HTTP/1.x 200 OK
Date: Thu, 28 Sep 2006 18:15:02 GMT
Server: Apache/2.0.52 (Red Hat)
Pragma: no-cache
Expires: {ts '2006-09-28 14:15:02'}
Content-Language: en-US
HppSecure: 0
Content-Type: text/html; charset=iso-8859-1
Connection: Keep-Alive
Keep-Alive: 3
Content-Length: 00041959

g1smd




msg:3100811
 6:19 pm on Sep 28, 2006 (gmt 0)

BigDave:

If the non-www to www redirect is in place there is no way for your site to serve a page at the non-www URL.

For sites that cannot add the redirect for whatever reason, your method can be a great help. It doesn't fully solve the problem, but can go a long way towards helping.

However, I do also like to confirm the domain in the internal links. I use the <base> tag to do that, and start all internal links with a / so that the URL counts from the root specified in the <base> tag.

WolfLover




msg:3100845
 6:46 pm on Sep 28, 2006 (gmt 0)

Great! I checked some internal pages and the 301 Redirect is on all pages that I checked! ;-)

However, I do also like to confirm the domain in the internal links. I use the <base> tag to do that, and start all internal links with a / so that the URL counts from the root specified in the <base> tag.

So, if you have the base tag reference in your header, then you do not need to have the FULL url for all page links, image links, etc. then? As long as the base tag is in place and the relative links begin with the / as in /images/123.jpg then that will be ok?

Currently, my site has the relative href without the / it has images/123.jpg

Looking at the view source, I do not see any base tag. Perhaps this is located in either a script or css page? Not sure how ColdFusion works but this is what my host has set up.

Does the lack of a base tag in the view source, together with the lack of a / before the relative hrefs, pose yet another issue with my site as far as duplicate content or some other penalty?

g1smd




msg:3100860
 6:55 pm on Sep 28, 2006 (gmt 0)

No. It's just another way of doing things.

BigDave was suggesting that internal links written like this can help a site:

<a href="http://www.domain.com/folder/page.html">link</a>

I merely added that I prefer to use this instead:

<base href="http://www.domain.com/"> - once in the header.

<a href="/folder/page.html">link</a>

They both do the same job in clarifying which domain is being talked about.
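
To see what a spider or browser makes of the two styles, the resolution can be mimicked with Python's standard library. This is a sketch only: the class name `LinkResolver` and the helper `resolve_links` are made up for illustration, and it handles just `<base>` and `<a>` tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkResolver(HTMLParser):
    """Collect <a href> targets, resolved the way a browser would resolve them."""

    def __init__(self, page_url):
        super().__init__()
        self.base = page_url   # default base is the URL the page was fetched from
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and attrs.get("href"):
            # A <base> tag overrides the base for the links that follow it.
            self.base = urljoin(self.base, attrs["href"])
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base, attrs["href"]))

def resolve_links(page_url, html):
    parser = LinkResolver(page_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

With the `<base>` line present, a page fetched from `http://domain.com/index.html` resolves `/folder/page.html` to `http://www.domain.com/folder/page.html`. Without it, the same link resolves to `http://domain.com/folder/page.html`.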

[edited by: g1smd at 7:06 pm (utc) on Sep. 28, 2006]

BigDave




msg:3100873
 7:05 pm on Sep 28, 2006 (gmt 0)

If the non-www to www redirect is in place there is no way for your site to serve a page at the non-www URL.

True, I was just trying to suggest a method for those that are scared of all that technical stuff. There are actually a whole pile of other reasons to go with absolute links only, as well as a pile of reasons not to.

However, I do also like to confirm the domain in the internal links. I use the <base> tag to do that, and start all internal links with a / so that the URL counts from the root specified in the <base> tag.

That works with the spiders of the big 3 search engines, but I actually like to have my sites in the wayback machine, which has serious problems with <base>. Early IE had problems with it as well, but I assume that is one bug that they just about had to fix.

When bandwidth was expensive, I used it all the time. Now that all my pages are PHP generated and bandwidth is cheap, I stick with absolute.

WolfLover




msg:3100976
 8:42 pm on Sep 28, 2006 (gmt 0)

When bandwidth was expensive, I used it all the time. Now that all my pages are PHP generated and bandwidth is cheap, I stick with absolute.

So, BigDave, what it appears to be then is that whether you use the base or the absolute urls, it is not an issue with duplicate content, just how it is viewed by the search engines.

This is if I am understanding you correctly. I value all of your opinions and like with most things everyone has a little bit different way they do it.

For me, I want to do it so that my site gets the best ranking, and is able to be found by search engines and users without any or at least not too many problems.

Thank you all again! Hats off to all those who are so willing to help others when they really do not have to, this makes you all very decent people in an age when there are so many that are only out for themselves and will not ever help anyone unless there is something in it for them.

AlgorithmGuy




msg:3100981
 8:49 pm on Sep 28, 2006 (gmt 0)

And congratulations to AlgorithmGuy who has the technical knowledge to back these claims and to all the guys who are helping us poor w****rs to fix our sites which will take a year and we'll be out of business.

mcskoufis,

Glad to be of service.

Note also that the google representative steered clear from this debate and did not challenge a single issue I raised.

This forum should be about helping webmasters with the right answers.

I have no hesitation in challenging Google and its wrongdoings.
.

AlgorithmGuy




msg:3100992
 8:56 pm on Sep 28, 2006 (gmt 0)

ANAME RECORDS - what a cool solution.
Could algorithmguy or someone who understands how to mess with ANAME RECORDS start a thread discussing how to do this, or recommend a thread that already does?

I'm afraid I wouldn't know where to start.

Sure,

I plan to make a page about how to purchase a domain and from whom with the best possible options to a webmaster.

Kill the canonical issue before it becomes a problem. I've also described how to make an unethical link pointing to your website become a good link.
.

[edited by: AlgorithmGuy at 8:57 pm (utc) on Sep. 28, 2006]

Simsi




msg:3101009
 9:04 pm on Sep 28, 2006 (gmt 0)

If the non-www to www redirect is in place there is no way for your site to serve a page at the non-www URL

Er....I must be doing something really stupid. I have that in place, and all URLs redirect to www as they now should. BUT...I have a phpBB directory with an .htaccess to mod-rewrite where the rule doesn't conform in a particular scenario...the .htaccess reads:


RewriteEngine on
Options +FollowSymLinks
RewriteRule ^post-([0-9]*).html&highlight=([a-zA-Z0-9]*) viewt.php?p=$1&highlight=$2 [L,NC]
RewriteRule ^post-([0-9]*).* viewt.php?p=$1 [L,NC]
RewriteRule ^view-poll([0-9]*)-([0-9]*)-([a-zA-Z]*).* viewt.php?t=$1&postdays=$2&postorder=$3&vote=viewresult [L,NC]
RewriteRule ^about([0-9]*).html&highlight=([a-zA-Z0-9]*) viewt.php?t=$1&highlight=$2 [L,NC]
RewriteRule ^about([0-9]*).html&view=newest viewt.php?t=$1&view=newest [L,NC]
RewriteRule ^about([0-9]*)-([0-9]*)-([a-zA-Z]*)-([0-9]*).* viewt.php?t=$1&postdays=$2&postorder=$3&start=$4 [L,NC]
RewriteRule ^about([0-9]*)-([0-9]*).* viewt.php?t=$1&start=$2 [L,NC]
RewriteRule ^about([0-9]*).* viewt.php?t=$1 [L,NC]
RewriteRule ^about([0-9]*).html viewt.php?t=$1&start=$2&postdays=$3&postorder=$4&highlight=$5 [L,NC]
RewriteRule ^mark-forum([0-9]*).html* viewforum.php?f=$1&mark=topics [L,NC]
RewriteRule ^updates-topic([0-9]*).html* viewt.php?t=$1&watch=topic [L,NC]
RewriteRule ^stop-updates-topic([0-9]*).html* viewt.php?t=$1&unwatch=topic [L,NC]
RewriteRule ^forum-([0-9]*).html viewforum.php?f=$1 [L,NC]
RewriteRule ^forum-([0-9]*).* viewforum.php?f=$1 [L,NC]
RewriteRule ^topic-([0-9]*)-([0-9]*)-([0-9]*).* viewforum.php?f=$1&topicdays=$2&start=$3 [L,NC]
RewriteRule ^ptopic([0-9]*).* viewt.php?t=$1&view=previous [L,NC]
RewriteRule ^ntopic([0-9]*).* viewt.php?t=$1&view=next [L,NC]

Now it does redirect fine - until you login to the phpbb. Then the re-direct disappears and you go back to widgets.com/etc - Can you see any obvious reason why this would not redirect widgets.com/phpbb to www.widgets.com/phpbb if all other directories etc work ok?

[edited by: Simsi at 9:06 pm (utc) on Sep. 28, 2006]

AlgorithmGuy




msg:3101013
 9:06 pm on Sep 28, 2006 (gmt 0)

These checkers show you the server details that are sent out before the visible web page is sent to the browser. You don't normally get to see this information, but as a webmaster it is essential that you check it from time to time.

g1smd,

The ultimate reliable header checker is the server's raw logs.

Look for the status codes. Assuming the server is at optimum efficiency and configured.

If you see anything other than 200 GET or 304 UNCHANGED, then ask yourself why that is a 302, or 404, etc.

A 404 may be a missed link pointing to your website. Check out that link. It may be coming in from a high pagerank website. MAKE a page for it, consume it and thank the webmaster that pointed it to you. Don't see 404 as all being no good.

If you see the raw logs give a 302, be alarmed. Note the referer, what page was asked. Sometimes you will see that a slash is missing and the same page called twice. First as a request, then given a 302 to the proper page. This is dangerous to your website.
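
Reading raw logs by eye gets tedious, so here is a rough sketch of the same review done in Python. It assumes log lines in the common Apache "combined" layout; the sample lines are invented for illustration and the names `review` and `SAMPLE_LINES` are hypothetical.

```python
import re
from collections import Counter

# Matches: "METHOD /path PROTOCOL" status bytes "referer" (referer is optional)
LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)")?'
)

def review(lines):
    """Count status codes and list every request that was not a 200 or 304."""
    counts = Counter()
    flagged = []
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        status = int(m.group("status"))
        counts[status] += 1
        if status not in (200, 304):
            flagged.append((status, m.group("path"), m.group("referer") or "-"))
    return counts, flagged

# Invented sample entries, not taken from any real log.
SAMPLE_LINES = [
    '192.0.2.1 - - [28/Sep/2006:21:06:00 +0000] "GET /folder HTTP/1.1" 302 210 "http://other.example/links.html" "Mozilla/5.0"',
    '192.0.2.1 - - [28/Sep/2006:21:06:01 +0000] "GET /folder/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [28/Sep/2006:21:07:00 +0000] "GET /old.html HTTP/1.1" 404 300 "http://other.example/" "Mozilla/5.0"',
]
```

Running `review(SAMPLE_LINES)` flags the 302 for the slash-less `/folder` request and the 404 for `/old.html`, each with the referer that sent the visitor.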
.

AlgorithmGuy




msg:3101020
 9:12 pm on Sep 28, 2006 (gmt 0)

Now, when I put in the search for www.mysite.com/index.cfm I get the below information. Note that it does change in the search bar to withOUT the index.cfm; however, it does not note that it was a 301 redirect, but also does not say 302 redirect. Does this mean it was a 301 redirect or not?

Wolflover,

Sorry, you lost me. Please explain what you did to get that result.
.

AlgorithmGuy




msg:3101045
 9:29 pm on Sep 28, 2006 (gmt 0)

One of the most preposterous gimmicks I have ever seen.

If google's button to treat a website as one is not the silliest and most misleading thing I have ever seen, then I will be a monkey's uncle.

Can you believe it?

It says make a choice as to how you want google to see your website.

With the two versions or as one.

PREPOSTEROUS. Misleading and a deceptive tactic against webmasters.

How on earth can a novice decipher that?

Is google going to do a 301? NO it is not, how on earth can it? It is so misleading. How is a novice not going to fall for that trick and get even more confused with this canonical issue?

The only ways to resolve a site are the methods disclosed in this thread and not the google button.

This kind of deception really has to stop.
.
.

[edited by: AlgorithmGuy at 9:48 pm (utc) on Sep. 28, 2006]

g1smd




msg:3101054
 9:31 pm on Sep 28, 2006 (gmt 0)

>> Now it does redirect fine - until you login to the phpbb. Then the re-direct disappears... <<

If this only occurs when you are logged in, then Google, etc, will never directly see those URLs. They can't log in.

However it would be nice to fix it. People often copy URLs seen in their browser over to content pages on the site or on other sites. In this case they will be copying the wrong URLs. Once pasted on to content pages, those will then be spidered by search engines, and that would be a problem.

BigDave




msg:3101081
 9:49 pm on Sep 28, 2006 (gmt 0)

So, BigDave, what it appears to be then is that whether you use the base or the absolute urls, it is not an issue with duplicate content, just how it is viewed by the search engines.

It is an issue with duplicate content. It does not stop there from being duplicate content, it stops *you* from voting for the duplicates and applies all your votes to the original.

Let us suppose that you have a home page with 3 internal links on it. Two different people link to that page differently, www.example.com and example.com.

www.example.com links would go to
www.example.com/page1.html
www.example.com/page2.html
www.example.com/page3.html
and all those pages will be linking back to www.example.com with their "home" link.

example.com links would go to
example.com/page1.html
example.com/page2.html
example.com/page3.html
and all those pages will be linking back to example.com with their "home" link.

Even if there is only that one link to example.com from outside, you just created 3 more links to example.com from *inside your own site*! Not only that, you are the one that created duplicates of 3 out of 4 of your pages!

On the other hand, if you use absolute, you get this on example.com:

example.com links would go to
www.example.com/page1.html
www.example.com/page2.html
www.example.com/page3.html
and all those pages will be linking back to www.example.com with their "home" link.

An external vote only creates one duplicate, it does not cause you to create an entire duplicate site.

It also only gives you that one vote of PR before passing it to the correct site, whereas with relative URLs, you pass it around to all those incorrect pages, and back to the incorrect home page.
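
The two cases above can be reproduced with Python's `urljoin`, which resolves an href against the URL of the page it appears on. This is just an illustration of the example.com scenario; the helper name `targets` is made up.

```python
from urllib.parse import urljoin

def targets(page_url, hrefs):
    """Resolve each href the way a spider would when crawling page_url."""
    return [urljoin(page_url, href) for href in hrefs]

# The three internal links from the example, written both ways.
relative = ["page1.html", "page2.html", "page3.html"]
absolute = ["http://www.example.com/" + name for name in relative]

# Crawled via the "wrong" hostname, relative links stay on that hostname...
wrong_host_relative = targets("http://example.com/", relative)
# ...while absolute links lead straight back to the preferred hostname.
wrong_host_absolute = targets("http://example.com/", absolute)
```

Here `wrong_host_relative` holds three more example.com URLs, and `wrong_host_absolute` holds only www.example.com URLs.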

The same goes for things like / vs. /index.html. Don't be guilty of splitting the votes.

Duplicate content IS a ranking issue. There are always going to be ways that someone can mess with you, but if you build your site robustly enough, it will make it very difficult. Getting lots of links to the right URLs, and always linking to the right URLs yourself, will get you a long way towards duplicate content proofing your site.

This isn't just about what google is doing now, it is about avoiding other problems in the future, and doing things the right way. The same goes for having your servers set up right.

If you do things right, you aren't guaranteed that you will never hit a slump in rankings, but you will float on past most of the times when people are gnashing their teeth about the latest update.

Simsi




msg:3101089
 9:53 pm on Sep 28, 2006 (gmt 0)

However it would be nice to fix it.

Yes indeed. I've also just found using "site:domain.com -inurl www" that there is one entry in Google for a non-www page in this directory that still for some reason will not redirect to www, even if you are logged out! Odd. Yet a direct call into other pages of the phpbb directory redirects fine.

[edited by: Simsi at 9:54 pm (utc) on Sep. 28, 2006]

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved