Google and URLs

born2drv

8:43 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I know Google views the following URLs as different, since their PageRank values are different:

[mysite.com...]
[mysite.com...]

... and from reading the other thread about the PR11 thing, I now realize that it sees these URLs as different too:

[mysite.com...]
[mysite.com...]

and maybe it views these URLs as different too?

[mysite.com...]
[MySiTe.CoM...]

So the point of all this is: if I have inbound links to different URLs that are basically the same page or directory (with/without "www", with/without a trailing "/", or with different letter case), will Google allocate PR to each of them individually, or give the one with the most inbound links all the PR from its related URLs?

If it gives them all PageRank individually, how do you guys avoid this? Just by making sure the URL is perfect everywhere? Or by making all the URLs work, but having them resolve to one set only via .htaccess rules?

Thanks.

rfgdxm1

8:56 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>I know Google views the following URL's different since the page ranks are different....
[mysite.com...]
[mysite.com...]

Of course Google does, as those are 2 completely different URLs. It would be trivially easy to put a children's site at [mysite.com...] and a hard-core porn one at [mysite.com...]. I've actually seen cases where people have pointed these 2 at different data. In practice, though, Google normally just combines the 2. You could redirect one of the above to the other if you wanted.
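To make the redirect idea concrete, here is a minimal sketch in Python of the decision a redirect rule would make. It assumes the "www" form is the preferred host; the function name `redirect_target`, the `CANONICAL_HOST` constant, and the example paths are hypothetical, and the sketch ignores ports. It is not an .htaccess rule, only an illustration of the logic.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the "www" form is the one you want every variant to resolve to.
CANONICAL_HOST = "www.mysite.com"


def redirect_target(url):
    """Return the URL a permanent redirect should point to,
    or None if the URL is already on the canonical host
    (or belongs to some unrelated host)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # host names are case-insensitive

    if host == CANONICAL_HOST:
        return None  # already canonical, nothing to do

    if "www." + host == CANONICAL_HOST:
        # Same site without "www": keep scheme, path, and query, swap the host.
        return urlunsplit(
            (parts.scheme, CANONICAL_HOST, parts.path or "/",
             parts.query, parts.fragment)
        )

    return None  # not one of our hosts


if __name__ == "__main__":
    for link in ("http://mysite.com/page.html",
                 "http://www.mysite.com/page.html"):
        print(link, "->", redirect_target(link))
```

With a rule like this in place, every inbound link ends up on one host, whichever form the linking site used.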

born2drv

9:01 am on Jan 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the quick reply ;)

OK, so I knew about the "www" thing, which is why almost all of my inbound links use "www". But what about the "/" at the end of directory URLs? And what about case-sensitivity? Do I need to worry about those as well?

NickCoons

9:23 am on Jan 23, 2003 (gmt 0)

10+ Year Member



born2drv,

As mentioned already, Google does see:

[mysite.com...]
[mysite.com...]

...as two different URLs. Generally, the two different URLs above would point to the same content. So from Google's point of view, these are two different sites with duplicate content.

You suggested that it may apply a PR value to each site. I think it's even worse than this: Google will drop one of the sites from its index because they provide duplicate content. That means that all of the inbound links pointing to the dropped site would be useless from a PR standpoint.

When I post my URL on someone's site, or contact them to have it posted, I make sure they include the "www", and then I verify that they've posted it correctly.

You can also use Google to search for link:www.mysite.com and link:mysite.com to see if anyone is linking to your site "incorrectly", and then contact them to ask them to change it.

As for case-sensitivity: domain names are not case-sensitive, but whether the rest of the URL is case-sensitive depends on the server and the operating system it is running. If I'm running a Linux server with Apache, the following:

[mysite.com...]
[mysite.com...]

...will yield different content (generally, one of them would give a 404). So there are a few ways Google could respond:

1) Google could see them as different URLs, crawl both of them, see different content, and index both of them.

2) Google could see them as different URLs, crawl both of them, see the same content, and drop one because of duplicate content (degrading your PR).

3) Google could see them as the same URL, and crawl the wrong one (the one that gives the 404 error), and index it.

If you are having people link right to your main page, that is, just the "www" and the domain, it wouldn't matter, because domain names are not case-sensitive (and it's probable that Google knows this). But if people are linking to sub-pages, then I'd verify that they are linking correctly.
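The case rule above can be sketched in a few lines of Python, which may help when checking a list of inbound links by hand. This is only an illustration: the helper name `normalize_for_comparison` and the example paths are made up, and nothing here is meant to describe how Google itself compares URLs. The host is lowercased because domain names are not case-sensitive; the path is left exactly as written because a Linux/Apache server may treat differently cased paths as different files.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_for_comparison(url):
    """Lowercase the scheme and host, leave the path untouched.

    Hypothetical helper: two links that normalize to the same string
    differ only in parts of the URL that are not case-sensitive.
    """
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(),
         parts.netloc.lower(),   # host part: case does not matter
         parts.path or "/",      # path: case is preserved on purpose
         parts.query,
         parts.fragment)
    )


if __name__ == "__main__":
    a = normalize_for_comparison("http://WWW.MySite.CoM/Page.html")
    b = normalize_for_comparison("http://www.mysite.com/page.html")
    print(a)
    print(b)
    print("same URL after normalization:", a == b)
```

Links that differ only in the case of the domain collapse to the same string, while links that differ in the case of the path stay distinct and are the ones worth verifying.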

As for the slash at the end of sub-directories, a quick search on Google showed every result for a directory (that is, a URL without a filename) ending in a slash, so I don't think this is an issue.