Is example.com Different From www.example.com?

Different PR for it.

         

alexandra

4:39 am on Feb 16, 2004 (gmt 0)

10+ Year Member



Hi

example.com has PR4, but www.example.com has PR6. How does this come about? What is the difference?

thanks for your input
alexandra

[edited by: ciml at 2:05 pm (utc) on Feb. 16, 2004]
[edit reason] Examplified URL. [/edit]

SyntheticUpper

11:20 pm on Feb 18, 2004 (gmt 0)

10+ Year Member



Two sides to every coin - but it isn't beyond Google to help sort out a very common problem. Problems with 301s are well documented - a live business just can't afford to watch its pages drop while Google figures it out (and "did you know that www is a sub-domain of.." is also very well documented ;)

It wouldn't be beyond Google's boffs, if they really wanted to sort it out, as well as free up some of their index space, to design a simple text document to be placed in the root directory:

e.g.

googlewww.txt

www true
notwww false

or some similar syntax. It is not a 301, just an instruction to Google to delete a dupe / amend an incorrect listing on the fly. It could even be added to the existing robots.txt within a commented out line.

Think it's worth stickying GoogleGuy with this?

BeeDeeDubbleU

11:29 pm on Feb 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes Brakkar. It has been said before and I will say it again. It should not be beyond G to look at "two" sites with the same domain name and, if they have the same content, then de-index the non-www version. Even if they only compared the home pages in making this decision I think this would make far more sense than their current practice. C'mon Google - this is a no brainer! You can handle Latent Semantic Indexing so surely you can handle this?

GoogleGuy many people have asked you to comment on this. Any chance of doing so - pleeease :o)

It is obviously a SERIOUS issue and one which should be addressed ASAP.

Added (Sorry SyntheticUpper! I was typing this when yours went in but at least we are of the same opinion, i.e. that G should fix this.)

bignet

12:55 am on Feb 19, 2004 (gmt 0)

10+ Year Member



It should not be beyond G to look at "two" sites with the same domain name and, if they have the same content, then de-index the non-www version.

But what about folks who need their non-www site to be indexed, not their www site?

[edited by: bignet at 1:06 am (utc) on Feb. 19, 2004]

SyntheticUpper

12:59 am on Feb 19, 2004 (gmt 0)

10+ Year Member



My small text file solution would allow this option.

bignet

1:10 am on Feb 19, 2004 (gmt 0)

10+ Year Member



robots.txt can do just that if you think about it.

Assuming both sites are on the same IP address, they can still be set up to serve different content, so you can put a robots.txt on whichever one you please.

SyntheticUpper

1:23 am on Feb 19, 2004 (gmt 0)

10+ Year Member



Sorry BigNet - I don't understand that. My not-www and www pages are exactly the same pages - how can I apply robots.txt to one and not the other? There's only one root directory to put it in.

Stefan

1:28 am on Feb 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The redirect is the best approach, but something you can also do is change all your internal links from relative to absolute. When the bot comes in on site.org, any page, the only internal links it sees are [site.org...] It prevents the bot from merrily crawling through every page as site.org/whatever.htm and forces it to see the rest of the site with a www on the front of the URL.
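
A rough sketch of automating that relative-to-absolute conversion in Python, rather than pasting links in by hand. The host names (site.org, www.site.org), the http scheme, and the function names are assumptions for illustration only, and the regex only copes with simple quoted href attributes:

```python
import re
from urllib.parse import urljoin, urlsplit, urlunsplit

BARE_HOST = "site.org"             # assumed non-www host
WWW_HOST = "www.site.org"          # assumed preferred host
BASE_URL = f"http://{WWW_HOST}/"   # assumed scheme and site root

# Matches href="..." or href='...' (simple cases only).
HREF_RE = re.compile(r"""(href\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE)


def canonical_url(href: str, page_url: str) -> str:
    """Resolve href against the page URL and force the www host for our own site."""
    absolute = urljoin(page_url, href)
    parts = urlsplit(absolute)
    if parts.netloc.lower() == BARE_HOST:
        parts = parts._replace(netloc=WWW_HOST)
    return urlunsplit(parts)


def absolutize_links(html: str, page_path: str = "") -> str:
    """Rewrite every href in html to an absolute URL on the www host."""
    page_url = urljoin(BASE_URL, page_path)

    def fix(match: re.Match) -> str:
        prefix, quote, href = match.groups()
        # Leave in-page anchors and non-HTTP schemes untouched.
        if href.startswith(("#", "mailto:", "javascript:")):
            return match.group(0)
        return f"{prefix}{quote}{canonical_url(href, page_url)}{quote}"

    return HREF_RE.sub(fix, html)
```

Links to other sites pass through unchanged; only relative links and links to the bare host are rewritten.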

jdMorgan

3:37 am on Feb 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



An excellent point, Stefan -- and especially useful if you need to "repair" existing damage.

Jim

Stefan

4:18 am on Feb 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hey, thanks Jim.

I had problems in Dom/Esm with that www stuff because of one incoming link without the www. I got it changed, then everything was fine until about three months ago. Google suddenly decided to start crawling site.org. As far as I can tell, there are no incoming links in that form... I think I might have caused the problem myself via the toolbar, checking to see if there was anything in that form. Ironic, eh?

I have shared hosting, the company claims I can't do a server-side redirect, so I figured, I'll force the bloody bot to find nothing but www.site.org. It took several days pasting absolute links into all 200 pages. It worked though. I now have googlebot occasionally check for the index page on site.org and that's it. I get about 25% of the site crawled daily, all with a www on the front of the URL.

I kept meaning to mention this rather crude but effective approach before... finally got around to it.

kiril

4:50 am on Feb 19, 2004 (gmt 0)

10+ Year Member



Wow, this thread is pretty intimidating to a casual webmaster like myself.

I need to understand one thing a little better, because I also see example.com for my site as well as the usual www.example.com, and I don't want to get hit by the disasters discussed in this thread:

What does it take for Google to view example.com as the correct site and www.example.com as a duplicate? Does it simply take somebody mistakenly linking to example.com, which would cause Google to spider it? (Right now I have a white PR bar for my example.com)

I use only absolute URLs in my own internal links, so maybe I'm safe. I don't know what 301's and 404's and such things are, but I guess I'll have to learn.

Stefan

5:11 am on Feb 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Does it simply take somebody mistakenly linking to example.com, which would cause Google to spider it?

Yep.

In my case, I don't seem to have any incoming links to site.org. Jah only knows where Google got it from (me and the toolbar maybe), so the change from relative to absolute links cleaned things up. I don't know how much help this would be if you have incoming links from high PR pages to the wrong URL.

GoogleGuy

5:20 am on Feb 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Most of the time we can detect if example.com and www.example.com are the same. Message #7 by g1smd said pretty much exactly what I would say.

BUT: if people are seeing extreme problems, please drop me a report (either email to webmaster [at] google.com or via a spam report) with the keyword urlcanonicalization as one word. I'm happy to pass those reports on to the crawl group to make sure that they group them together, and check out if our canonicalization has developed any problems.

plumsauce

6:35 am on Feb 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



bignet


better yet, they should actually read about DNS

To realise that example.org is a subdomain of org, and www.example.org is as well a subdomain of subdomain example.org

Now who thinks all .com sub-domains are the same, thankfully G et al do not

A very selective interpretation on your part.
Did I say that all .com sub-domains are the
same? No, I did not.

Clearly, in the present case where the alias and
canonical name resolve to the same ip, and
the right hand hierarchy matches to at least two
levels, and the content is the same,
then it is a strong indication that these
are one and the same site. For this situation
to result in a duplicate penalty is clearly not
the fault of the site.

DNS&BIND, ch. 15, demonstrates such a lookup
by setting the query type to "cname" in nslookup.
The returned data, should the name be an alias,
gives the canonical name, properly noted. The code
to do this is widely published in source form,
as it is the code for nslookup.

I am suggesting that if and only if:


alias ip == canonical ip
&&
alias second level == canonical second level
&&
alias content == canonical content

then Google should realise that this is not
a duplicate
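
As a rough illustration, the three-part test above could be written as a small Python function. This is only a sketch of the condition as stated in this post, not anything Google is known to run. The function names and the `levels` parameter are made up for the example, and the IPs and page content are assumed to have been fetched already:

```python
def right_hand_labels(host: str, levels: int = 2) -> str:
    """Return the last `levels` labels of a host name, e.g. 'example.org'."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-levels:])


def looks_like_same_site(
    alias_host: str,
    alias_ip: str,
    alias_content: str,
    canonical_host: str,
    canonical_ip: str,
    canonical_content: str,
    levels: int = 2,
) -> bool:
    """True only if IP, right-hand hierarchy and content all match."""
    return (
        alias_ip == canonical_ip
        and right_hand_labels(alias_host, levels)
        == right_hand_labels(canonical_host, levels)
        and alias_content == canonical_content
    )
```

For names under suffixes such as .co.uk, `levels` would need to be 3 for the comparison to mean anything.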

I am not suggesting that all sites on a single
ip should be treated as the same. Although,
as dns is, generally, a means of mapping a
name to a numeric address, and that numeric
address is the smallest unit of resolution, an
argument could be made for that case, name based
hosting of RFC 2616 notwithstanding.

Once again, for the umpteenth time,
this is not rocket science.

++

alexandra

8:32 am on Feb 19, 2004 (gmt 0)

10+ Year Member



thanks to all

BallochBD

9:06 am on Feb 19, 2004 (gmt 0)

10+ Year Member



Alexandra,

This .htaccess entry worked for me ...

Options +FollowSymLinks
RewriteEngine on
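# Apply only when the requested host is not already the www form
RewriteCond %{HTTP_HOST} !^www\.mydomain\.co\.uk
# Send a permanent (301) redirect and stop processing further rules
RewriteRule (.*) [mydomain.co.uk...] [R=permanent,L]

Note that there must be a space between %{HTTP_HOST} and !^www\.mydomain, which may not appear in this post.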
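
For anyone who doesn't read mod_rewrite syntax, the decision a host-canonicalizing rule like this makes can be sketched in Python. This is a model of the logic only, not something to deploy. The redirect target is elided in the post above, so the `http://www.mydomain.co.uk/` target and the function name are assumptions:

```python
import re

CANONICAL_HOST = "www.mydomain.co.uk"  # assumed redirect target host

# Same test as the RewriteCond pattern: host begins with the www form.
HOST_OK = re.compile(r"^www\.mydomain\.co\.uk", re.IGNORECASE)


def redirect_for(host: str, path: str):
    """Return None if the host is already canonical, else (301, new_url)."""
    if HOST_OK.match(host):
        return None
    return 301, f"http://{CANONICAL_HOST}/{path.lstrip('/')}"
```

A request for mydomain.co.uk/page.htm gets a 301 to the www URL with the same path, while a request that already uses the www host is left alone.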

BOL!

alexandra

9:18 am on Feb 19, 2004 (gmt 0)

10+ Year Member



thanks, I just used ThatAdamGuy's script from message 15:

RewriteEngine On
Options +FollowSymlinks
RewriteBase /
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule ^(.*)$ [example.com...] [R=permanent]

it works! thanks to all!

vrtlw

11:22 am on Feb 19, 2004 (gmt 0)

10+ Year Member



GoogleGuy:
canonicalization

Does this imply that our DNS should be set up with the www. as a cname (canonical name)? Perhaps people should check with their hosting company about how their DNS is set up; this seems like a firm indication not to set up sub domains as A records (as per DNS & BIND by O'Reilly).

SyntheticUpper

11:54 am on Feb 19, 2004 (gmt 0)

10+ Year Member



Good advice from Stefan - but 98% of my site is listed as not-www; despite absolute links throughout, and 500+ links (including a DMOZ and Yahoo one) *all* pointing to www. (It seems a single not-www incoming, now corrected, caused this problem well over a year ago.) Updates come and go, crawled every 1/2 days, but there's no sign of the bot getting wise. Only the index is listed as www, but it is duplicated by a not-www listing. The not-www is filtered out by Google, but unfortunately is the one Google associates with the ODP listing, so the DMOZ description never shows. I wrote to webmaster @ google about this, and I got a standard response to wait for a directory update etc. I explained that I'd been waiting for over a year for it to resolve and it clearly wasn't going to, but they replied that they would not manually alter anything and that was that :(

I tried a 301, but all my pages dropped and I chickened out. Do you think I could ask Google to manually remove some not-www pages - will this result in the www pages disappearing too?

bignet

11:55 am on Feb 19, 2004 (gmt 0)

10+ Year Member



My not-www and www pages are exactly the same pages - how can I apply robots.txt to one and not the other?

Most websites are hosted virtually, i.e. they share an IP address with other sites. Web servers keep the content of sites installed on the same server apart by serving a specific directory for one or more Host header values.
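
A minimal Python sketch of that idea, assuming the server can see the Host header of each request: it picks a different robots.txt body depending on which host name was asked for. The function name, the default preferred host, and the choice to block every other host are illustrative assumptions, not a description of any particular server's configuration:

```python
ALLOW_ALL = "User-agent: *\nDisallow:\n"    # lets crawlers fetch everything
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # asks crawlers to stay out


def robots_txt_for(host_header: str, preferred_host: str = "www.example.org") -> str:
    """Choose the robots.txt body from the request's Host header."""
    # Drop an optional ":port" suffix and normalise case before comparing.
    host = host_header.split(":", 1)[0].strip().lower()
    return ALLOW_ALL if host == preferred_host else BLOCK_ALL
```

The pages can stay identical under both names; only the robots.txt response differs by host.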

Though I am not familiar with Apache, I am almost sure you can set it up to serve different content for your www and non-www sites, if both have been created in DNS. My earlier post would give you an idea how to do that on IIS. Though that is not my suggestion. Mine:

I would not offer a site at example.org at all, to force everyone to use/link to www.example.org

Even if no one links to example.org, bots may crawl it because of toolbar-enabled visits, or even a mention of the non-www name in an email address. The latter sounds strange but Google did that to one site imo

very useful posts by Jim and Pageoneresults, please read them

vrtlw

12:06 pm on Feb 19, 2004 (gmt 0)

10+ Year Member



very useful posts by Jim and Pageoneresults, please read them

Agreed but there is a distinction between Jim's post and GoogleGuy's that is significant.

Jim:

They just "dumb down" the process, set up both domain variants with A records, and keep mum on the subject.

GoogleGuy:

BUT: if people are seeing extreme problems, please drop me a report (either email to webmaster [at] google.com or via a spam report) with the keyword urlcanonicalization as one word.

I have never had a problem with www versus non-www listings and do not employ a 301 redirect as a solution to the mentioned problem. However, my DNS is set up with canonical (cname) records rather than host (A) records.

Hissingsid

2:18 pm on Feb 19, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

I just did a DNS lookup on www and non-www versions of the site that I have a problem with and these are the results.

www.domainname.co.uk
Type: A
Class: IN
TTL: 43200
Answer: *.171.193.8

domainname.co.uk
Type: A
Class: IN
TTL: 43200
Answer: *.171.193.8

Exactly the same!
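
The same comparison can be scripted. A small Python sketch using the standard library's `socket.gethostbyname_ex`; the function name and the injectable `resolve` parameter are assumptions made so the check can be tried without a live lookup. It only compares addresses and does not tell you whether a name is an A record or a CNAME:

```python
import socket


def same_addresses(host_a: str, host_b: str, resolve=socket.gethostbyname_ex) -> bool:
    """True if both host names resolve to the same set of IPv4 addresses."""
    # gethostbyname_ex returns (canonical_name, alias_list, ip_address_list).
    _, _, ips_a = resolve(host_a)
    _, _, ips_b = resolve(host_b)
    return set(ips_a) == set(ips_b)
```

Calling `same_addresses("www.domainname.co.uk", "domainname.co.uk")` with the default resolver performs real DNS lookups.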

Does this say anything to anyone here or is there something else I should be looking for?

Best wishes

Sid

PS It's an IIS server and I have very few options other than moving to another server.

rogerd

2:29 pm on Feb 19, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Here is a discussion of how to redirect a domain such as example.com to www.example.com [xoc.net] on IIS.

If you want to get into the details, I'd suggest posting in the Microsoft forum here.

vrtlw

11:22 pm on Feb 19, 2004 (gmt 0)

10+ Year Member



Does this say anything to anyone here or is there something else I should be looking for?

Sid,

You seem to have answered the question yourself with those DNS lookups. Try speaking with the people who host your DNS or look at a 3rd party DNS provider rather than moving your entire hosting.

kaled

2:00 am on Feb 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If Google can understand that example.com is the same as example.com/index.html then it should be straightforward to extend the algo so that example.com is treated as www.example.com

Although the causes of the two problems are different, as I see it, the solutions should be essentially the same.

Kaled.

Stefan

2:10 am on Feb 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's not just Google that stumbles on this... check around in the new Yahoo serps.

millie

9:09 am on Feb 20, 2004 (gmt 0)

10+ Year Member



<It's not just Google that stumbles on this... check around in the new Yahoo serps.>

Exactly! And then go and check out ATW.

bignet

10:45 am on Feb 20, 2004 (gmt 0)

10+ Year Member



I think Google is pretty good at dealing with

  • www.example.org
  • example.org and
  • www.example.org/~index.html

Others, notably Ink, pollute their SERPs with such dup listings.

Hissingsid

12:14 pm on Feb 20, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here is a discussion of how to redirect a domain such as example.com to www.example.com on IIS.

Thanks for that Roger.

The problem is that I have absolutely no way of gaining access at that sort of level. I have a couple of sites on this server which date back to when I had someone working on an asp solution for a couple of projects I had on the go. He was a reseller and I just got him to add a couple of sites really cheaply for me to use in experiments. One of these sites did well in SERPs and started sending my main site significant amounts of referrals, enough to make it well worthwhile moving server to one running Apache which I at least understand to a limited extent.

Those referrals will no doubt now disappear as the site has dropped from Google so I may as well bite the bullet.

Best wishes

Sid

Roel

9:46 pm on Feb 23, 2004 (gmt 0)

10+ Year Member



Other thread relating to this: [webmasterworld.com...]

pgudge

4:41 pm on Feb 26, 2004 (gmt 0)

10+ Year Member



I know this is different from the non-www address issue, but I think it's the same problem.

I have
www.example.com
www.example.co.uk

The DNS for both domains points to my server, and there I have an Apache host setting for .co.uk and the same for .com, with ".com" in place of ".co.uk".

Now obviously both pages are the same,

www.example.co.uk has a PR of 6, and www.example.com has a PR of 5. Am I best off doing the 301 in the .htaccess file for the site, so that Google is told to only go to one address?

Thanks.

This 72 message thread spans 3 pages: 72