Relative Links vs. Absolute links
Does it make a diff. to the search engines?
F_Rose




msg:3205733
 2:16 pm on Jan 2, 2007 (gmt 0)

I was wondering if having relevant links would be a problem for search engines?

 

skweb




msg:3205895
 4:27 pm on Jan 2, 2007 (gmt 0)

No, it doesn't. However, Google says to stick to one format rather than using both. Having said that, one of my old websites now has both and there is no penalty.

europeforvisitors




msg:3205898
 4:31 pm on Jan 2, 2007 (gmt 0)

Relevant links are good. Relative links (which is what I think you mean) are also fine, if the search engine isn't having problems crawling them.

I base my comment on what GoogleGuy once said in a discussion of this topic. (He did suggest that absolute links were less likely to have problems with search-engine crawlers.)

theBear




msg:3205917
 4:46 pm on Jan 2, 2007 (gmt 0)

There are several reasons not to use relative links; they all have more to do with your server setup than with the links being relative.

Absolute (fully specified) links on a site prevent an inbound link from splitting the site into duplicate versions, even if the server would otherwise allow that to happen.

This applies to the current round of "why oh why did Google index my site under https as well and cause my site to tank in the SERPs" threads.
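To illustrate with a made-up example.com: suppose a stray inbound link gets a crawler onto https://www.example.com/widgets/. Then:

<a href="blue.html">Blue widgets</a>
<!-- resolves to https://www.example.com/widgets/blue.html, so the crawler stays on https -->

<a href="http://www.example.com/widgets/blue.html">Blue widgets</a>
<!-- always resolves to the http version, no matter how the current page was fetched -->

Relative links keep the bot on whichever protocol and hostname it arrived on; fully specified links pull it back to the one canonical form.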

Just my slant on the situation.

[edited by: theBear at 4:47 pm (utc) on Jan. 2, 2007]

F_Rose




msg:3205975
 5:25 pm on Jan 2, 2007 (gmt 0)

Thanks for your response.

We do have supplemental issues with our site.

Our site does not have absolute links; it's set up using relative links only.

I did notice Google has some of our pages as https and some not.

Could this possibly be the cause of having our pages dropped into supplemental?

theBear




msg:3205990
 5:44 pm on Jan 2, 2007 (gmt 0)

"I did notice,Google has some of our pages as https and some not.

Could this possibly be the cause of having our pages dropped into supplemental?"

Yes, it could be. I would expect several possible outcomes from the http/https form of the duplicate content situation. Pick some or all of the following:

1: One of the pages gets filtered out for a specific search (I would expect the low-PR one to be filtered). You can see which ones get filtered by doing the searches.

2: The site gets treated as a spam site: too many new pages all looking like other existing pages on the same domain.

3: If the situation isn't fixed in a timely manner, lots of pages enter the index, then go supplemental, and finally exit the index. Along with all of this page blinking comes link blinking, which can affect the pages that are linked to from the blinking pages. In short, not exactly the signal you want to send.

F_Rose




msg:3205995
 5:53 pm on Jan 2, 2007 (gmt 0)

So what would be the solution?

Should we start hard-coding all of our links?

It's a lot of work; we just want to make sure it's worth it.

tedster




msg:3206003
 6:02 pm on Jan 2, 2007 (gmt 0)

Consider using the <base href=""> element in the head section of each page. Make the value of the href attribute the fully qualified absolute URL of the page. Then you can leave the rest of the anchor tags and src attributes on the page as relative. This is what the base element was created for -- as an aid to user agents trying to cope with relative paths.
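A minimal sketch of what that looks like, with example.com and the page path standing in for your real URLs:

<!-- in the head of the page served at http://www.example.com/widgets/blue.html -->
<head>
<base href="http://www.example.com/widgets/blue.html">
<title>Blue Widgets</title>
</head>

<!-- relative hrefs and src attributes now resolve against the base URL above,
     not against whatever protocol or host the page was actually fetched under -->
<a href="green.html">Green Widgets</a>
<img src="images/blue-widget.jpg" alt="blue widget">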

F_Rose




msg:3206005
 6:04 pm on Jan 2, 2007 (gmt 0)

Tedster,

This is how we have it set up; however, Google is still indexing the https version.

theBear




msg:3206009
 6:10 pm on Jan 2, 2007 (gmt 0)

Well, I would certainly correct the situation.

It isn't helping your site any.

There is more than one way to correct the situation.

You may need to ask for reinclusion if things are really messed up.

I think if you look into what can be done with rewrite rules, you can in effect (over time) clean up your site.

I've never had to deal with such a situation; however, I'd expect it to play out similarly to the www/non-www form of duplicate content cleanup.

This last form I have dealt with, with a few parked domains thrown in for good measure complete with query string garnish as the whipped cream and cherries on top.

Good luck and let us know how things turn out.

F_Rose




msg:3206017
 6:16 pm on Jan 2, 2007 (gmt 0)

theBear,

As per Tedster: "Consider using the <base href=""> element in the head section of each page. Make the value of the href attribute the fully qualified absolute URL of the page. Then you can leave the rest of the anchor tags and src attributes on the page as relative. This is what the base element was created for -- as an aid to user agents trying to cope with relative paths."

Our site is set up this way, so why is Google still indexing our https version?

AndyA




msg:3206037
 6:44 pm on Jan 2, 2007 (gmt 0)

I use relative linking on my sites, except when linking back to the home page or a directory. In those cases, I use absolute links to avoid the "directory/index.html" vs. "/directory/" problem, which creates duplicate content for Google.

This seems to make sense to me.
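For example, with example.com and a /widgets/ directory as made-up names, that works out to something like:

<!-- inside /widgets/blue.html -->
<a href="green.html">Green widget</a>  <!-- relative link between ordinary pages -->
<a href="http://www.example.com/widgets/">Widgets</a>  <!-- absolute, always the trailing-slash form -->
<a href="http://www.example.com/">Home</a>  <!-- absolute, never index.html -->

That way the directory and the home page are only ever referenced in one form.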

tedster




msg:3206049
 6:55 pm on Jan 2, 2007 (gmt 0)

F_Rose, the main reason is that your server is still returning pages at those https: URLs, and since Google already has those URLs, it keeps spidering them. There's also a good chance that your secure pages are "leaking" from relative links back into the site. The base href is a good preventative measure, but after the problem has started, it's just a band-aid, not a fix.

The best practices are:
1) only serve https pages from a dedicated subdomain, like secure.example.com
2) place a second robots.txt file on the secure port

These are true fixes.
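A rough sketch of (2), assuming Apache with mod_rewrite and SSL on the standard port 443 (robots_ssl.txt is just a name I made up):

# in the .htaccess at the document root
RewriteEngine On
RewriteCond %{SERVER_PORT} ^443$
RewriteRule ^robots\.txt$ robots_ssl.txt [L]

where robots_ssl.txt simply disallows everything for all user agents, so the spiders never crawl the secure copies.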

theBear




msg:3206056
 6:56 pm on Jan 2, 2007 (gmt 0)

Base hrefs have been known to be set up incorrectly.

Without seeing through the eyes of the bot I can't really say.

You could run a link walker on your site and see what it says about things.

I just looked at a site that had relative hrefs in its menu and on its cart pages. That allowed the https pages to get indexed.

Edit: In fact I've looked at several such sites lately.

[edited by: theBear at 6:58 pm (utc) on Jan. 2, 2007]

F_Rose




msg:3211021
 6:04 am on Jan 7, 2007 (gmt 0)

Tedster,

"The best practices are:
1) only serve https pages from a dedicated subdomain, like secure.example.com
2) place a second robots.txt file on the secure port

These are true fixes."

What about doing a 301 redirect for the pages that do not need an https version (those pages are fine as http only)? We would want the http versions to get out of supplemental and get listed. Will this help with Google duplicate issues?

tedster




msg:3211073
 8:36 am on Jan 7, 2007 (gmt 0)

I don't know how to set up a 301 just for the https: protocol and I've never seen it done, so I have no experience to draw from. If you can pull it off somehow, without establishing any looping, please let us know how you did it and how it works out.

piatkow




msg:3211218
 1:29 pm on Jan 7, 2007 (gmt 0)

I do the same as AndyA and my site seems to be getting good positions on relevant search terms with internal pages being found as well as the home page.

Of course that might have nothing to do with it, but I am not risking my SERPs to find out.

theBear




msg:3211256
 2:28 pm on Jan 7, 2007 (gmt 0)

F_Rose,

The https: situation gets complicated by the actual setup of the cart software, which still has to use https:, so a blanket 301 won't do the trick; it has to be crafted to the site setup.

I believe that member johnhh has one that works for IIS.

tedster,

I believe you have access to the protocol via the rewrite engine, and the protocol changes when you do the rewrite, so looping should not be an issue. Remember though, I haven't tried this; however, IIRC, folks who tried blanket redirects to correct www/non-www issues and had carts on the same domain had to adjust their rewrite rule sets so the cart was still usable ;-).
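To give the general shape only (an untested Apache mod_rewrite sketch, with example.com and a made-up /cart/ path standing in for the real site, and SSL assumed on port 443):

# in the .htaccess at the document root
RewriteEngine On
# send everything except the cart back to plain http with a 301
RewriteCond %{SERVER_PORT} ^443$
RewriteCond %{REQUEST_URI} !^/cart/
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

Requests arriving on port 80 never match the first condition, so the redirect can't loop.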

Edit: This however should not be interpreted as the only thing you should do about this situation. The relative hrefs that helped cause the issue should still be corrected, to reduce the number of 301s the search engines need to deal with. The timing of the changes to the relative hrefs is after the 301 has been acted upon for each such link.

[edited by: theBear at 2:34 pm (utc) on Jan. 7, 2007]

F_Rose




msg:3212200
 4:27 pm on Jan 8, 2007 (gmt 0)

As mentioned, we have most of our pages in supplemental.

However, 41 of our pages are not in supplemental.

If we have relative linking issues or an https issue with Google, why are those pages not affected?

F_Rose




msg:3212209
 4:49 pm on Jan 8, 2007 (gmt 0)

piatkow,

What is it that you are doing?

Please explain.

theBear




msg:3212243
 5:16 pm on Jan 8, 2007 (gmt 0)

F_Rose,

The state of each page at any given point is related to how far along the bots have gotten in retrieving the pages under each form of the URL, and where the indexing is in the process. Then the ranking system takes over.

It is all a matter of timing. This is unique for each URL.

Sites with on-domain duplicate issues can remain productive for some time before the hammer seems to drop.

F_Rose




msg:3213988
 10:54 pm on Jan 9, 2007 (gmt 0)

1. What if we set up our shopping cart to use a subdomain, as recommended?

2. From our shopping cart, we set up all of the URLs that go back to our main domain as absolute links.

3. We block our shopping cart through robots.txt.
4. We block all of our https pages through robots.txt.

Will the above work to solve our https duplicate issue?

tedster




msg:3214058
 12:15 am on Jan 10, 2007 (gmt 0)

Sounds like a perfect plan to me.
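For steps 3 and 4, the robots.txt served from the cart subdomain (secure.example.com is just a stand-in for whatever name you pick) only needs to disallow everything:

# robots.txt at secure.example.com
User-agent: *
Disallow: /

Combined with the absolute links back to the main domain in step 2, that keeps the spiders out of the cart and its https pages entirely.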

F_Rose




msg:3214814
 4:00 pm on Jan 10, 2007 (gmt 0)

Our IT guy is strongly against a subdomain; he claims it will involve a lot of work and aggravation.

He recommends blocking the shopping cart through robots.txt.

All of our https pages are in the shopping cart directory (which we want to block using robots.txt).

If we block our shopping cart directory will all the https pages be blocked from the bots as well?

tedster




msg:3214831
 4:07 pm on Jan 10, 2007 (gmt 0)

No, they will not.

F_Rose




msg:3214834
 4:09 pm on Jan 10, 2007 (gmt 0)

Could you please explain why they won't be blocked?

WW_Watcher




msg:3214864
 4:33 pm on Jan 10, 2007 (gmt 0)

IMHO
It is not enough just to block the search engine spiders from a directory that contains the pages you use for SSL.

Putting the SSL pages on their own subdomain is the proper way to do it, and then block the spiders on that subdomain.

For many of us who were clueless (the more I learn, the more I find out I do not know) when we built our websites and established our HTTPS pages on the same domain as our HTTP pages, in the long run we need to correct these issues because of the possibility of duplicate content. Even if you block the spiders from your HTTPS pages and from the pages that link to them with HTTPS URLs, you are still leaving your site open for others to link to any page on your site with HTTPS in the URL and create the same duplicate content issue. (Think about that for a moment.)

Remember, I am one of the clueless who did this. (theBear, have you been looking at my site? ;-) )

Back to Watching,
WW_Watcher

theBear




msg:3214935
 5:15 pm on Jan 10, 2007 (gmt 0)

F_Rose,

Once the https: bug bites (like the www vs. non-www one), you have a case where pages on the domain link to others using the https: form, so it will probably (I haven't followed this exact situation long enough to say with certainty) confirm the URL in Google's index as valid, and thus it will be recrawled. Thus the problem will continue.

There are other issues besides just the duplicated content at work.

WW_Watcher,

I've been working with 'puters for a living since 1969; unexpected results and errors are directly related, in mathematical terms. Committing an error is independent of subject-area time in grade.

In other words, stuff happens.

I also haven't been looking in on your site; has your situation improved or not?

F_Rose




msg:3214936
 5:16 pm on Jan 10, 2007 (gmt 0)

"Putting the SSL on it's own subdomain is the proper way to do it& then block the spiders on that subdomain. "

I was wondering how much weight Google puts on https pages and if it is a real major issue for duplicate contents.

Google should really know that https pages are part of the SSL and should really not consider it as duplicate.

My question is, is it worth the hassle of changing into a sub domain?

theBear




msg:3214942
 5:22 pm on Jan 10, 2007 (gmt 0)

I think Google calls it a URL and treats it exactly like any other URL. In other words, a URL is a URL.
