Forum Moderators: open
I read with interest an old post about URL forwarding
[webmasterworld.com...]
It comes close to answering my question... but not quite...
I have four similarly named domains registered with three different companies that I want to point to one place. The reason for having four is (at the moment anyway) simply so that people who go looking for me and type in the wrong address will still find me.
For obvious reasons (I hope) I only want the one that has the real content crawled: [example.com.au...] and I'm happy that bots and humans see the same thing (other than excluded directories).
The following may seem like a tangent, but it seems to me to be relevant...
I used the following robot checker a few moments ago [searchengineworld.com...] to check the validity of my robots.txt on:
1) [example.com.au...] - no probs
....and for the hell of it I decided to check the other domains, with interesting results....
2) [example.com...] (full forwarding including emails held with the same company as main site) - no probs - same file (although I don't think I want this - see below)
3) [example.net...] - hmmmmm - it read my index.htm and gave a lot of errors (of course)
4) [example.net.au...] - hmmmmm2 - found and happily validated a robots.txt file I have never seen before, one that excludes f.php and f2.php - files I have never heard of.
Back to post
[webmasterworld.com...]
In that post WebGuerilla states "Yes, the 301 permanent redirect is seamless. Visitors are automatically forwarded to the new URL. At the same time, spiders are notified that the URL has been permanently moved to a new location."
My questions:
1) What do I ask for when I approach my domain companies? "Can I have a permanent 301 on that one please mate?"
2) How do I know they have done it? (as far as I can tell I cannot see any weblogs for the .net domains).
3) Perhaps the answer to (2) is to run the robots.txt CGI again, although that does not appear to detect a 301, since it happily read my real robots.txt? If not, what is the explanation for the weird robots.txt I see on the last of the domains above?
Many thanks, Sam Calder.
[edited by: WebGuerrilla at 6:14 am (utc) on June 13, 2003]
[edit reason] wigetized [/edit]
Welcome to WebmasterWorld
If the domains are set up through a registrar as pointers that forward to your actual site, you probably won't be able to get 301s set up on those domains.
However, if your main site is running on Apache, you can use a little mod_rewrite to serve the 301s on the receiving end.
With this kind of setup, the domains point to the IP of your main site. When a request for one of the type-in domains hits your server, the requesting agent is given a 301 and the proper domain is loaded. Visitors using one of these domains will get to your site, but they won't ever see the url they typed.
It's a good setup because it prevents a bot from indexing the same site under different urls.
Here is what you would put in your .htaccess file if you have mod_rewrite on your server:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} domain1.com$ [OR]
RewriteCond %{HTTP_HOST} domain2.net$ [OR]
RewriteCond %{HTTP_HOST} domain3.com$ [OR]
RewriteCond %{HTTP_HOST} domain4.org$
RewriteRule ^(.+) http://www.domain.com/$1 [L,R=301]
Welcome to WebmasterWorld [webmasterworld.com]!
First, find out why the robots.txt validator is getting confused by requesting robots.txt from that domain using the WebmasterWorld server headers checker [webmasterworld.com]. See if there is a misdirected redirect going on by comparing to your domains where robots.txt works normally.
Also, note what kind of server your hosted site is on if you don't know; the solutions available to you depend on your server type.
If there is a bad redirect, and it is being done by your domain registrar/parking service, then contact them with the details (which you'll get from the headers checker).
Once that is fixed, I suggest you use one of several options (depending on your server type) to intercept and check the requested {HTTP_HOST} on your server. For your "main" domain, do nothing; for the others, generate a 301 Moved Permanently redirect to your main domain name. This will "fix" all your potential problems with duplicate content and duplicate search engine listing penalties, cause the search engines to focus all your domains' link popularity and PageRank on the pages in your main domain, and let visitors with old bookmarks find you seamlessly.
Do a WebmasterWorld site search for "domain 301 redirect" for lots of useful threads.
HTH,
Jim
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} example.com$ [OR]
RewriteCond %{HTTP_HOST} example.net$ [OR]
RewriteCond %{HTTP_HOST} example.net.au$
RewriteRule ^(.+) http://www.example.com.au/$1 [L,R=301]
?
I was actually going to post something on a very similar note to this.
I run a site which has many domains that end up in the same place. The reason: I work for an ISP that has taken on some of these domains from ISPs it has bought out. Anyway, I'd like to, in the nicest way possible, make sure everything goes to our main site, so what I thought about doing was this:
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.preferreddomain\.sld\.tld [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^/(.*) http://www.preferreddomain.sld.tld/$1 [L,R=permanent]
Does that look right to those of you who are Apache gurus?
I.e. it's supposed to do the following: if the domain name doesn't match the preferred one, do a permanent redirect to the preferred one.
Welcome to WebmasterWorld [webmasterworld.com]!
Your code looks correct, with two notes:
First, if you omit the [NC] flag from your first RewriteCond, you can correct "uppercase-URL-linking problems" before they start.
And second, the form of your RewriteRule is correct for use in httpd.conf. For use in per-directory .htaccess files, omit the leading slash from the pattern.
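To make the difference concrete, here is a sketch of the same rule in both contexts, reusing the placeholder domain from the post above:

```apache
# In httpd.conf (server or virtual-host context): the matched path
# begins with a slash, so the pattern keeps its leading "/"
RewriteRule ^/(.*) http://www.preferreddomain.sld.tld/$1 [L,R=permanent]

# In a per-directory .htaccess file: Apache strips the leading slash
# before matching, so the pattern must omit it
RewriteRule ^(.*) http://www.preferreddomain.sld.tld/$1 [L,R=permanent]
```

A pattern with a leading slash in .htaccess will simply never match, so the redirect silently fails to fire.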
Ref: Introduction to mod_rewrite [webmasterworld.com]
Jim
You may need to precede the entire code block with an Options directive as shown below. In rare cases, adding this option causes an error - try it both ways.
I advise using the (www\.)? start-anchored pattern shown; anchoring helps speed up processing and also helps simplify trapping a few common exploits. I also advise not end-anchoring the domains - I've had that cause problems, possibly due to port numbers being appended, e.g. www.example.com:80
If you wish to match literal periods in regular-expression patterns, they must be escaped -- that is, preceded by a backslash as shown in the domain names below. Only special characters in patterns need escaping to be treated as literals; this does not apply to the destination URL of the redirect.
A final note: using ^(.+) as posted above may cause a problem: users who request only "/" from any domain will not be redirected, so requests for your home page slip through. I suggest using ^(.*) instead.
Options +FollowSymLinks
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com [OR]
RewriteCond %{HTTP_HOST} ^(www\.)?example\.net [OR]
RewriteCond %{HTTP_HOST} ^(www\.)?example\.net\.au
RewriteRule ^(.*) http://www.example.com.au/$1 [L,R=301]
Ref: Introduction to mod_rewrite [webmasterworld.com]
Jim
I'll do a bit more reading (it's Saturday AM here in Oz) but the suggested redirects caused looping on all 4 domain names - at least I know mod_rewrite is enabled I guess :)
Perhaps this is one of the rare cases.
By the way - how do I know it's worked? Should all 3 in the header test above show 301? I ask because the example.com URL (delegated on the site as the main site) shows 200 in the WW header check but 301 on Sam Spade?
Thanks all, Sam.
> but the suggested redirects caused looping on all 4 domain names - at least I know mod_rewrite is enabled I guess.
Doh! I'm sorry - I am not at all used to working with ".au"-format URLs! Without an end anchor, the pattern ^(www\.)?example\.com also matches www.example.com.au, so the rule redirected your main domain to itself - hence the loop.
Oh well - In for a penny, in for a pound...
Options +FollowSymLinks
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_HOST} !^www\.example\.com\.au
RewriteRule ^(.*) http://www.example.com.au/$1 [L,R=301]
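One possible refinement, borrowed from the earlier post in this thread: very old HTTP/1.0 clients may send no Host header at all, in which case %{HTTP_HOST} is empty and the negated condition above would match, redirecting them too. Adding a second condition skips those requests (a sketch only - test it on your own setup):

```apache
Options +FollowSymLinks
RewriteEngine on
RewriteBase /
# Redirect only when a Host header is present and it is not the main domain
RewriteCond %{HTTP_HOST} !^www\.example\.com\.au
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*) http://www.example.com.au/$1 [L,R=301]
```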
After installing this, re-check with Sam Spade. I really don't know what it is reporting; Sam Spade may be following a subsequent meta-refresh redirect on your main domain, or seeing action from one of your scripts - it depends on what's going on on your site. I would trust the WebmasterWorld server headers checker, simply because it is a straightforward, simple script.
Jim