Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Meta tag Rel="canonical" & double content
How to use this?

 9:44 pm on May 18, 2012 (gmt 0)

I've read that this tag should go on duplicate pages you don't want indexed. Bing is indexing the https and non-www versions of certain pages. Should I add the tag to the http://www.example.com/ page, since the others don't exist? (I don't know how or why they are being picked up.)



 12:11 am on May 19, 2012 (gmt 0)

No. Redirect the other versions to the canonical form. Make sure that for any unwanted request, the user arrives at the correct URL for that content after exactly one redirect, not a chain of redirects.


 3:14 am on May 19, 2012 (gmt 0)

How do I redirect them if they do not exist?


 9:10 am on May 19, 2012 (gmt 0)

Exactly the same as if they did exist. The server neither knows nor cares, unless you have explicitly asked it to check whether a file exists. (The notorious !-f and !-d found in every boilerplate htaccess ever distributed.)
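For reference, the existence checks mentioned above usually look something like the following. This is only a sketch of the common front-controller boilerplate, not anything taken from the site under discussion; the /index.php target is an assumption for illustration.

```apache
RewriteEngine On

# -f tests "is an existing file", -d tests "is an existing directory".
# The leading ! negates the test, so the rule below fires only when the
# request does NOT map to a real file or directory on disk.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Hypothetical target: hand everything else to a single script.
RewriteRule . /index.php [L]
```

A redirect rule without these conditions never consults the filesystem, which is why it works the same whether or not the requested URL corresponds to a real file.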

Now, we're really supposed to make you either figure it out for yourself or find one of the 62,000 earlier threads, but...

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

:: fingers crossed ::

This goes after all other redirects in your htaccess. Why? Because all those others will already say http://www.example.com as part of their targets. This final generic redirect is only to pick up the leftovers.


The pattern ^(www\.example\.com)?$ means "exactly www.example.com or exactly nothing". The "nothing" is to allow for HTTP/1.0 requests, which may arrive without a Host header. The leading ! means "if the requested domain is not exactly" et cetera.
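To illustrate the ordering point, here is a minimal sketch of how the pieces might sit together in one htaccess. The file names old-page.html and new-page.html are made up for the example; only the final two lines come from the post above.

```apache
RewriteEngine On

# 1. Specific redirects first. The target already names the canonical
#    host, so a request for example.com/old-page.html reaches
#    www.example.com/new-page.html in a single hop.
#    (old-page.html and new-page.html are hypothetical.)
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]

# 2. Generic hostname redirect last, to catch the leftovers:
#    anything requested on a host other than www.example.com
#    (or with no Host header at all, which is left alone).
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

If the generic rule came first, a request for example.com/old-page.html would be redirected to www.example.com/old-page.html and then redirected again, which is the chain the earlier reply warns against.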


 5:26 pm on May 19, 2012 (gmt 0)

Redirects don't look at files on the server. A RewriteRule configured as a redirect merely looks at what URL was requested by the user or bot and sends a response back to that user to suggest they make a new request for a different URL.


 6:12 pm on May 19, 2012 (gmt 0)

I have the code Lucy24 provides in my htaccess. But BING is not recognizing it. That's the problem...


 7:28 pm on May 19, 2012 (gmt 0)

How do you know that Bing don't recognise it?

What do your server logs say?


 7:42 pm on May 19, 2012 (gmt 0)

Because Bing shows the https and non-www versions as indexed in its Webmaster Tools.


 8:15 pm on May 19, 2012 (gmt 0)

If they are anything like Google, it can take three months or more for the data to catch up with reality.

[edited by: g1smd at 8:30 pm (utc) on May 19, 2012]


 8:28 pm on May 19, 2012 (gmt 0)

Frustrating b/c these htaccess files have been in place for years.


 8:31 pm on May 19, 2012 (gmt 0)

You should check the headers very carefully for errors for a variety of URL requests.


 8:40 pm on May 19, 2012 (gmt 0)

Is there a way to run through the entire site and check headers of all pages instead of checking one at a time?


 9:11 pm on May 19, 2012 (gmt 0)

Xenu Linksleuth might be useful (Windows).


 9:21 pm on May 19, 2012 (gmt 0)

I just ran my homepages through http header checker and my htaccess is not working properly. I am getting a 200 response for https://www.mysite.com. I tried stickying you g1smd to see if I can contract you to help me fix this. But your box is full.


 1:59 am on May 20, 2012 (gmt 0)

Oh, wait. https versus http is a different issue. It isn't covered by %{HTTP_HOST}. For this you need still another line: one looking at the protocol.
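One way that extra line might look is sketched below. It assumes the canonical form is still http://www.example.com/ as in the earlier rule, that the https site reads the same htaccess as the http one, and that SSL is handled by Apache itself rather than by a proxy or load balancer in front of it. If any of those don't hold for your setup, the condition will need adjusting.

```apache
RewriteEngine On

# %{HTTPS} is "on" when the request arrived over SSL/TLS, "off" otherwise.
# [OR] joins the two conditions, so a request that is wrong in protocol,
# in hostname, or in both is still fixed with a single 301.
RewriteCond %{HTTPS} on [OR]
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

After adding it, a header check on the https URL should report a 301 pointing at the http://www.example.com/ version instead of a 200. If it still shows 200, the likely suspect is that the secure site is served from a different directory or configuration and never sees this htaccess.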

Please be assured that nobody, not even Bing, no, not even Google, can ignore htaccess. You can look in your raw logs and see them getting 301 or 403. But they don't instantly remove something from their index just because they can't get to it.

g1's mailbox is always full. Been full for years ;)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved