Apache Web Server Forum

    
.htaccess confusion
Battling to sort out my Sitemap...
Mikroz



 
Msg#: 4588565 posted 7:06 pm on Jun 28, 2013 (gmt 0)

Hi everyone,

Although I enjoy dabbling with .htaccess files, I'll readily and quickly admit to them defeating me more often than not. Now is one of those times...

When you visit my website, the .htaccess in the root redirects you to my blog, located at blog.example.com.

I have an .htaccess in the root and a robots.txt in the root. There is another one of each in the 'blog' folder.

ROOT .htaccess (applicable lines)

RewriteCond $1 !^(sitemap\.xml|sitemap\.xml\.gz|robots\.txt)$
# All other requests to...
RewriteRule ^(.*)$ http://blog.example.com/$1 [R=301,L]

# Sitemap redirection
Redirect 301 /sitemap.xml http://blog.example.com/sitemap.xml
Redirect 301 /sitemap.xml.gz http://blog.example.com/sitemap.xml.gz

ROOT robots.txt (applicable line)

#Sitemap
Sitemap: http://blog.example.com/sitemap.xml.gz

==============================

BLOG .htaccess (applicable lines)

# Sitemap redirection
Redirect 301 /sitemap.xml http://blog.example.com/sitemap.xml
Redirect 301 /sitemap.xml.gz http://blog.example.com/sitemap.xml.gz

BLOG robots.txt (applicable line)

# Sitemap
Sitemap: http://blog.example.com/sitemap.xml.gz


Can one of you brainiacs kindly help me decipher why I [or Google] cannot access my Sitemap, please?

[edited by: phranque at 7:12 pm (utc) on Jun 28, 2013]
[edit reason] unlinked & exemplified urls [/edit]

 

phranque




 
Msg#: 4588565 posted 7:31 pm on Jun 28, 2013 (gmt 0)

what response are you getting?

you shouldn't mix mod_alias and mod_rewrite directives within the same configuration.

From http://httpd.apache.org/docs/2.2/rewrite/avoid.html:
when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file.
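
For reference, `Redirect` is a mod_alias directive, while `RewriteCond` and `RewriteRule` belong to mod_rewrite. A minimal sketch of a root .htaccess that does the whole job with mod_rewrite alone might look like the following. It assumes the main site answers on example.com and www.example.com, and that only robots.txt should still be served from the root; adjust both to the real setup.

```apache
RewriteEngine On

# Assumption: the main site is example.com / www.example.com.
# Only redirect requests for the main host, never for blog.example.com itself.
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
# Leave the root robots.txt alone so it is served locally.
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# Everything else, sitemap.xml included, goes to the same path on the blog.
RewriteRule ^(.*)$ http://blog.example.com/$1 [R=301,L]
```

With a rule like this, no separate `Redirect 301 /sitemap.xml ...` lines are needed in either .htaccess. In particular, a `Redirect` from /sitemap.xml to http://blog.example.com/sitemap.xml that is applied to requests for blog.example.com sends the browser back to the URL it just asked for, which produces a redirect loop.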

Mikroz



 
Msg#: 4588565 posted 7:47 pm on Jun 28, 2013 (gmt 0)

Hi phranque,

Firefox gives me this:

Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

You've lost me on the alias vs. rewrite bit... Isn't what I have a rewrite directive?

EDIT: I did read the link! Promise. :D

Thanks for the reply.

lucy24




 
Msg#: 4588565 posted 8:01 pm on Jun 28, 2013 (gmt 0)

Why are you redirecting the sitemap? The chances that someone will have an old bookmark for the sitemap are vanishingly small. Just put its current actual URL in robots.txt. Or put the sitemap in the root; search engines will look for it there even if robots.txt doesn't say anything about it.

No reason to muck about with explicit .gz either. Let the server take care of compression if it feels so inclined.
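
As a rough sketch of letting the server handle compression, something like this could go in the blog's .htaccess. It assumes mod_deflate is loaded and that the host allows the directive in .htaccess; on Apache 2.4 `AddOutputFilterByType` also needs mod_filter.

```apache
<IfModule mod_deflate.c>
    # Compress XML responses (sitemap.xml included) on the fly
    # for clients that advertise gzip support.
    AddOutputFilterByType DEFLATE application/xml text/xml
</IfModule>
```

The sitemap is then requested as plain /sitemap.xml, and the robots.txt line can point at that URL instead of the .gz variant.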

The main problem is that your question mixes up two different things. One is the URL; the other is physical location. Your blog may live in a directory within a directory within a directory, but the only thing visitors-- including the googlebot-- need to know is that its URL is blog.example.com.

If you have a subdomain, you need one robots.txt at
www.example.com/robots.txt
and another at
blog.example.com/robots.txt
regardless of where they physically live. If the two robots.txt happen to be identical, you can rewrite -- NOT redirect -- requests for one so they point to the other. Not even the googlebot knows when it has been rewritten.
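
A minimal sketch of the rewrite-not-redirect idea, assuming the blog's files sit in a 'blog' folder inside the main site's document root and that the main site should serve the blog's robots.txt as its own:

```apache
RewriteEngine On

# Root .htaccess. No R flag, so this is an internal rewrite:
# the client asks for /robots.txt and gets a normal 200 response,
# with the content read from /blog/robots.txt (assumed location).
RewriteRule ^robots\.txt$ /blog/robots.txt [L]
```

Adding `R=301` to the flags would turn this into an external redirect, which is exactly what you don't want here.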

Same goes for sitemaps. As far as a visitor is concerned,
blog.example.com
is the root.

Mikroz



 
Msg#: 4588565 posted 8:09 pm on Jun 28, 2013 (gmt 0)

Thanks lucy. That makes a lot of sense to me.

My worry was that when Google came looking for the sitemap, at website.com, it wouldn't find it, so I was trying to tell it [Google] to look at blog.website.com for it. :)

The blog is powered by WordPress, which uses a plugin to generate, optionally compress and list the sitemap, so it sits at the blog.website.com URL, not in the root.

The root and the blog subdomain run separate robots files.

I will remove the .htaccess entries and see what happens.

Mikroz



 
Msg#: 4588565 posted 8:17 pm on Jun 28, 2013 (gmt 0)

It's working! :D

Thank you. Consider me well schooled.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved