Links to /index.html

         

malasorte

10:12 pm on Jan 5, 2006 (gmt 0)

10+ Year Member




Most of the links to my page from other sites point to www.mysite.com, but several point to www.mysite.com/index.html.
Is this a problem for Google? Could it cause the search engine to see the page as two separate pages and split PageRank between them?

Thank you!

steveb

2:05 am on Jan 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes. Yes.

g1smd

2:11 am on Jan 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Duplicate Content. Big Problem.

Link to http://www.domain.com/ or http://www.domain.com/folder/ or /folder/ when linking to an index page - omit the index file filename.

DO include a trailing / on the end of the domain or folder URL.

DO use the <base> tag to signal either the "real" URL for the page (when using relative linking), or the URL for the root domain the page is associated with (when using full absolute URLs on internal links).

Rainie

5:30 am on Jan 6, 2006 (gmt 0)

10+ Year Member



I agree that it will be a problem. I did encounter this situation. Try to contact the webmasters and politely ask them to change the link to your preferred URL. Many of them will comply. Site scrapers, well... you can always ask for the link to be removed but for the most part, you'll probably be ignored.

My site is on an Apache server, so I set up a 301 redirect from www.mysite.com/index.html to www.mysite.com/ using my .htaccess file. I found the answers in the Apache web server forum.
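
For illustration only, the mapping such a redirect performs can be sketched in Python. This is not the .htaccess rule itself, just the logic behind it: the function name `redirect_target` and the list of index filenames are assumptions made for the example, and query strings are not handled.

```python
from typing import Optional

# Hypothetical list of index filenames; adjust to whatever the server serves.
INDEX_NAMES = ("index.html", "index.htm")


def redirect_target(path: str) -> Optional[str]:
    """Return the path a 301 should point to, or None if no redirect is needed."""
    # Split off the last path segment, e.g. "/folder/index.html" -> "/folder", "index.html".
    head, sep, last = path.rpartition("/")
    if sep and last in INDEX_NAMES:
        # Drop the index filename but keep the trailing slash.
        return head + "/"
    return None


for requested in ("/index.html", "/folder/index.html", "/folder/", "/page.html"):
    print(requested, "->", redirect_target(requested))
```

Only requests that end in an index filename are redirected; "/folder/" and ordinary pages are left alone, so the redirect cannot loop.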

doc_z

11:05 am on Jan 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Normally, this isn't a problem for Google because these pages are merged, i.e. they have the same PR, backlinks etc. You can check this by using info:www.mysite.com/index.html - if the pages are merged, 'www.mysite.com' will be displayed as the URL.

When these pages are already merged, I wouldn't change anything. However, when creating a new site I would just link to '/' as suggested by g1smd.

g1smd

7:57 pm on Jan 6, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Several entries on Matt Cutts' blog for 2006-Jan-04 are worth reading in parallel with this thread.

jake66

5:24 pm on Jan 8, 2006 (gmt 0)

10+ Year Member



Is [somesite.com...] considered the same as, or a duplicate of, [somesite.com?...]?

g1smd

7:13 pm on Jan 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Most servers automatically redirect [domain.com...] to [domain.com...] anyway, so that is not usually a problem at all.

However [domain.com...] is a duplicate of [domain.com...] and you should set up a 301 redirect for that and all the internal pages too.

Ellio

7:20 pm on Jan 8, 2006 (gmt 0)

10+ Year Member



Programs like Dreamweaver MX automatically set up internal links to the short address, such as /index.htm for the homepage, but external links are inevitably to the full URL, www.mysite.co.uk. Should we change the internal links to absolute links or leave them be?

g1smd

7:38 pm on Jan 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Internal links to index pages should be to http://www.domain.com/ or to http://www.domain.com/folder/ if you want to use the full URL.

You could instead link to / or to /folder/ and these latter options, when paired with a <base href="http://www.domain.com/"> tag, do exactly the same job.

Don't include the index file filename in the link. Let the server find it and serve it without revealing its real name.
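
As a rough check that the short and full forms above name the same URL, here is a small Python sketch that resolves each link against the base URL the way a browser would. It uses the standard library's `urljoin`; the variable names `BASE`, `links` and `resolved` are just for the example.

```python
from urllib.parse import urljoin

# The value of the <base href> tag from the post above.
BASE = "http://www.domain.com/"

# Root-relative links and the full absolute form.
links = ["/", "/folder/", "http://www.domain.com/folder/"]

# Resolve every link against the base URL.
resolved = {link: urljoin(BASE, link) for link in links}

for link, url in resolved.items():
    print(link, "->", url)
```

"/folder/" and "http://www.domain.com/folder/" resolve to the same address, so either form works for internal links as long as the index filename is left off.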

Ellio

10:58 pm on Jan 8, 2006 (gmt 0)

10+ Year Member



Do you think changing the internal linking on a large site from /index.html to the full URL could trigger a new-link flag or penalty?

Just a thought.

g1smd

11:33 pm on Jan 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Doubtful, but use the <base> tag on every page if you can - it will save a lot of bandwidth, as the links (and image, CSS and JS file URLs) will only need to start with a / and then count from the root of the domain.

Ellio

11:50 pm on Jan 8, 2006 (gmt 0)

10+ Year Member



thanks

Key_Master

11:58 pm on Jan 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



g1smd, I'd like your input on the base element and why you believe Googlebot uses it.

[webmasterworld.com...]

g1smd

12:32 am on Jan 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




Done (in that thread).

.

Worms.

Can of.

Opened

tedster

12:44 am on Jan 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"why you believe Googlebot uses it"

One good indicator is the fact that the Google cache will ADD a base href element at the top of the mark-up.

Key_Master

12:54 am on Jan 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Poor example, tedster. More likely, it's added dynamically when someone visits the cached copy.