---- Rewrite rule troubles


phpmaven - 1:34 pm on Jul 30, 2008 (gmt 0)


Thanks Jim for the clarification. You are a great help as always.

Perhaps I should explain why I'm doing this.

I'm using my mydm.com domain in my AdWords ads. I have DNS set up on that domain so that I can create ads that use "keyword-keyword.mydm.com" as the display URL and landing page. I'm just playing around to see how that affects the click-through rate on some ads I'm testing. Obviously I don't want Google, Yahoo, or MSN crawling those URLs.

The full rules that I have are as follows:

RewriteCond %{HTTP_USER_AGENT} googlebot|Msnbot|Slurp [NC]
RewriteCond %{HTTP_USER_AGENT} !AdsBot-Google [NC]
RewriteCond %{HTTP_HOST} !^(www\.)?mydomain\.com [NC]
RewriteCond %{HTTP_HOST} !^(www\.)?mydm\.com [NC]
RewriteRule !bad_domain\.html$ bad_domain.html [L]

RewriteCond %{QUERY_STRING} !AW=
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com [NC]
RewriteRule ^(.*)$ [mydomain.com...] [R=301,L]

The ads all have "AW=" as part of the query string so that they are allowed through. I also need to allow Google's AdsBot through. Once the user hits the landing page, any further requests get redirected.

I already have "Disallow: /*?" in my robots.txt file so that none of these URLs get crawled, but if the crawlers come looking for those *.mydm.com domains, I want to return a 404 so that they don't think the domains exist. The file "bad_domain.html" doesn't exist, so the rewrite returns a 404.
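
To make the intended behaviour easier to check, here is a rough Python sketch of the decisions the two rule sets above are meant to produce. It is only a model, not how Apache evaluates anything: the function name `decide` and the labels "404", "redirect", and "pass" are made up for illustration, the redirect target is left out because it is truncated above, and it models a single pass through the rules, ignoring any re-processing that per-directory (.htaccess) rules may trigger after a rewrite.

```python
import re

# Patterns copied from the RewriteCond lines; [NC] becomes re.IGNORECASE.
CRAWLERS = re.compile(r"googlebot|Msnbot|Slurp", re.IGNORECASE)
ADSBOT = re.compile(r"AdsBot-Google", re.IGNORECASE)
MAIN_DOMAIN = re.compile(r"^(www\.)?mydomain\.com", re.IGNORECASE)
ADS_DOMAIN = re.compile(r"^(www\.)?mydm\.com", re.IGNORECASE)
CANONICAL = re.compile(r"^www\.mydomain\.com", re.IGNORECASE)


def decide(host, path, query, user_agent):
    """Return a hypothetical label for what the rules are meant to do."""
    # First rule set: crawlers (but not AdsBot) on any host other than the
    # two bare/www domains are rewritten to the missing bad_domain.html.
    if (CRAWLERS.search(user_agent)
            and not ADSBOT.search(user_agent)
            and not MAIN_DOMAIN.search(host)
            and not ADS_DOMAIN.search(host)
            and not path.endswith("bad_domain.html")):
        return "404"

    # Second rule set: anything without "AW=" in the query string that is
    # not already on www.mydomain.com gets the 301 redirect.
    if "AW=" not in query and host and not CANONICAL.search(host):
        return "redirect"

    # Everything else is served as requested.
    return "pass"


if __name__ == "__main__":
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(decide("keyword-keyword.mydm.com", "/", "", googlebot))
    print(decide("keyword-keyword.mydm.com", "/", "AW=123", "Mozilla/5.0 Firefox"))
```

In this model a crawler asking for a keyword subdomain gets the 404, an ad click carrying "AW=" is served, and a visitor's follow-up request without "AW=" is redirected.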

If there is a better way to handle this, or if you think I'm going to create problems for myself, please let me know. I highly value your opinion. Like I said, I'm just experimenting and could easily turn this all off.

Thanks,

Mark


Thread source: http://www.webmasterworld.com/apache/3710219.htm