http://www.example.com/architectural-photography/page.html
to:
http://example.com/photos/architecture/page/
So there are three changes going on in the URL. I already have rewrite code to drop the www. from any request hitting the site, so that's taken care of:
# For use in document-root .htaccess
# Enable mod_rewrite
Options +FollowSymLinks
# Turn on the rewriting engine
RewriteEngine on
# If request Host header is non-blank (HTTP/1.0 requests don't send this header
# and can't be redirected based on it)
RewriteCond %{HTTP_HOST} .
# And if requested domain is NOT the canonical domain
RewriteCond %{HTTP_HOST} !^example\.com
# redirect to requested page in canonical domain
RewriteRule (.*) http://example.com/$1 [R=301,L]
And then there's the url rewrite code that my website adds to the htaccess page on the fly:
# BEGIN Url Rewrite section
# (Automatically generated. Do not edit this section)
<IfModule mod_rewrite.c>
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d [OR]
RewriteCond %{REQUEST_FILENAME} gallery\_remote2\.php
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . - [L]
RewriteCond %{THE_REQUEST} \ /sitemap(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=sitemap.Sitemap [QSA,L]
RewriteCond %{THE_REQUEST} \ /admin/(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.SiteAdmin [QSA,L]
RewriteCond %{THE_REQUEST} \ /photos/([^?]+)/([0-9]*)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.ShowItem&g2_path=%1&g2_page=%2 [QSA,L]
RewriteCond %{THE_REQUEST} \ /d/([0-9]+)-([0-9]+)/([^\/\?]+)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.DownloadItem&g2_itemId=%1&g2_serialNumber=%2&g2_fileName=%3 [QSA,L]
RewriteCond %{THE_REQUEST} \ /([^?]+)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=rewrite.FileNotFound [QSA,L]
RewriteCond %{THE_REQUEST} \ /([^?]+)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_controller=permalinks.Redirect&g2_filename=%1 [QSA,L]
</IfModule>
# END Url Rewrite section
When I add the 301 redirect code it works, but then I get a page not found error anyway. Perhaps this URL rewrite code is conflicting with the 301s?
RedirectPermanent /architectural-photography/nyc-rockefeller-atlas.html http://example.com/photos/architecture/nyc-rockefeller-atlas/
Keeping that in mind, is it possible to construct a rule which matches the pattern rather than line by line entry?
Also note that all Redirect and RedirectMatch directives will be executed before any mod_rewrite directives (on most servers). This is because the modules each read .htaccess in turn, and execute only the directives they recognize. So the execution order of directives belonging to different modules is set by the modules' load order, and not by the directives' relative position in your .htaccess file.
Jim
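As a sketch of what a pattern-based rule could look like when written for mod_rewrite (so that it runs in the same module as the canonical-domain redirect), the single rule below maps every old architectural-photography page to its new location with one 301. The old and new paths are taken from the example URLs at the top of the thread; the assumption that page names contain no dots or slashes is mine, not the original poster's.

```apache
# Sketch only: place above the automatically generated section.
RewriteEngine on

# Old: /architectural-photography/<name>.html
# New: http://example.com/photos/architecture/<name>/
# In .htaccess the leading slash is stripped before matching, so the pattern starts without one.
# Assumption: <name> contains no "." or "/".
RewriteRule ^architectural-photography/([^/.]+)\.html$ http://example.com/photos/architecture/$1/ [R=301,L]
```

Because the target already names the canonical host, a request for the www version of an old URL should be answered by this rule directly, without first passing through the separate canonical-domain redirect.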
#1 Server Response: http://www.example.com/photos/cityscapes-skylines/woolworth-building.html
HTTP Status Code: HTTP/1.1 301 Moved Permanently
Date: Sat, 30 Sep 2006 03:49:44 GMT
Server: Apache/2.0.52 (CentOS)
Location: http://example.com/photos/cityscapes-skylines/woolworth-building.html
Content-Length: 368
Connection: close
Content-Type: text/html; charset=iso-8859-1
Redirect Target: http://example.com/photos/cityscapes-skylines/woolworth-building.html
#2 Server Response: http://example.com/photos/cityscapes-skylines/woolworth-building.html
HTTP Status Code: HTTP/1.1 301 Moved Permanently
Date: Sat, 30 Sep 2006 03:49:44 GMT
Server: Apache/2.0.52 (CentOS)
Location: http://example.com/photos/cityscapes-skylines/woolworth-building/
Content-Length: 360
Connection: close
Content-Type: text/html; charset=iso-8859-1
Redirect Target: http://example.com/photos/cityscapes-skylines/woolworth-building/
#3 Server Response: http://example.com/photos/cityscapes-skylines/woolworth-building/
HTTP Status Code: HTTP/1.1 200 OK
Date: Sat, 30 Sep 2006 03:49:45 GMT
Server: Apache/2.0.52 (CentOS)
X-Powered-By: PHP/4.3.9
Last-Modified: Sat, 30 Sep 2006 03:49:45 GMT
Connection: close
Content-Type: text/html; charset=UTF-8
The first rule redirects to the non-www version, then that version redirects again to the proper URL. My hunch is that's probably not a good thing. Do you have any experience with that?
I found this post you wrote regarding another double 301 redirect:
"You are using directives from two different Apache modules, mod_alias and mod_rewrite. Each of those modules will parse your .htaccess file in turn, executing the directives that it understands, with the order determined by the server configuration. Therefore, it makes no difference what order you put your two directives in -- re-arranging them does not change their execution order.
This applies to all other directives in config files -- You cannot control order of execution on a per-directive basis unless those directives are all implemented in the same module. The module execution order is set by the reverse LoadModule order in Apache 1.x, and by an internal priority scheme in Apache 2.x
The solution is to use either mod_alias or mod_rewrite, but not both, for sequence-critical rewrites and redirects. Since mod_alias cannot do conditional redirects and cannot do internal rewrites, I'd suggest using mod_rewrite for your folder redirect."
I suppose that this is the case here as well. Can you suggest a way to merge the two functions of the first rewrite rule:
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^example\.com
RewriteRule (.*) http://example.com/$1 [R=301,L]
And the second set:
RedirectMatch permanent /cityscapes-skylines/(.*)\.html http://example.com/photos/cityscapes-skylines/$1/
RedirectMatch permanent /photos/cityscapes-skylines/(.*)\.html http://example.com/photos/cityscapes-skylines/$1/
Thanks
http://example.com/photos/cityscapes-skylines/photos//
Here is a snapshot of what my .htaccess looked like when it returned the problem. I also tried removing the canonicalizing (a new word!) rewrite rules completely but got the same error:
# For use in document-root .htaccess
# Enable mod_rewrite
Options +FollowSymLinks
# Turn on the rewriting engine
RewriteEngine on
RewriteRule ^(photos/)?cityscapes-skylines/([^.]+)\.html$ http://example.com/photos/cityscapes-skylines/$1/ [R=301,L]
# If request Host header is non-blank (HTTP/1.0 requests don't send this header
# and can't be redirected based on it)
RewriteCond %{HTTP_HOST} .
# And if requested domain is NOT the canonical domain
RewriteCond %{HTTP_HOST} !^example\.com
# redirect to requested page in canonical domain
RewriteRule (.*) http://example.com/$1 [R=301,L]
# BEGIN Url Rewrite section
# (Automatically generated. Do not edit this section)
<IfModule mod_rewrite.c>
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d [OR]
RewriteCond %{REQUEST_FILENAME} gallery\_remote2\.php
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . - [L]
RewriteCond %{THE_REQUEST} \ /sitemap(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=sitemap.Sitemap [QSA,L]
RewriteCond %{THE_REQUEST} \ /admin/(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.SiteAdmin [QSA,L]
RewriteCond %{THE_REQUEST} \ /photos/([^?]+)/([0-9]*)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.ShowItem&g2_path=%1&g2_page=%2 [QSA,L]
RewriteCond %{THE_REQUEST} \ /d/([0-9]+)-([0-9]+)/([^\/\?]+)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.DownloadItem&g2_itemId=%1&g2_serialNumber=%2&g2_fileName=%3 [QSA,L]
</IfModule>
# END Url Rewrite section
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
<Files ~ "\.(inc|class)$">
Deny from all
</Files>
#<Limit GET POST>
#order deny,allow
#deny from all
#allow from all
#</Limit>
#<Limit PUT DELETE>
#order deny,allow
#deny from all
#</Limit>
AuthName example.com
AddHandler server-parsed .html
AddHandler server-parsed .htm
DirectoryIndex index.php index.htm index.html
I suggest you experiment with the RewriteRule to correct it, since it's an easy fix.
For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].
Jim
Was the fix to change $1 to $2?
RewriteRule ^(photos/)?cityscapes-skylines/([^.]+)\.html$ http://example.com/photos/cityscapes-skylines/$2/ [R=301,L]
That seems to work, but there's another issue now. The last set of mod_alias rules produced two 301 redirects and a 200 OK response. This rule generates one 301 redirect and no 200 OK, even though the page is loading fine. Is it necessary to get a 200 OK response every time? I am using the seoconsultants server header check tool. Thanks again for the help.
Use the Live HTTP Headers [livehttpheaders.mozdev.org] extension for Firefox/Mozilla/SeaMonkey/Netscape... I *know* it reports all transactions, and it will likely show you the 'missing' 200-OK response -- The browser can't show a page without a 200-OK, 304-Not Modified, or *some* server response, so the fact that this second response is not shown just indicates a shortcoming in the headers tool.
Jim
[mysite.com...] OK
[mysite.com...] BAD
When I try a RedirectMatch it also doesn't work. It seems like the URL is being translated fine, but the site grabs anything in that directory and adds:
?g2_view=core.ShowItem&g2_path=new-york-city&g2_page=
Those directories are the most important ones of all to redirect, since Google is showing that directory as having the highest PageRank on the site now. Also, for some reason, today all the pages in Google's index for my site went supplemental. The first listing is [mysite.com...] and the second is [mysite.com....] Even though I have the rewrite rule to merge the two, Google is seeing two sites and penalizing me for dupe content! Here's what my .htaccess looks like now:
# For use in document-root .htaccess
# Enable mod_rewrite
Options +FollowSymLinks
# Turn on the rewriting engine
RewriteEngine on
RewriteRule ^(photos/)?cityscapes-skylines/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?skylines-cityscapes/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?architectural-photography/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?architecture/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?landscape-photography/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?landscapes/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?new-york-photography/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?new-york-city/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?new-york/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?washington-dc-photography/([^.]+)\.html$ [mysite.com...] [R=301,L]
RewriteRule ^(photos/)?washington-dc/([^.]+)\.html$ [mysite.com...] [R=301,L]
# If request Host header is non-blank (HTTP/1.0 requests don't send this header
# and can't be redirected based on it)
RewriteCond %{HTTP_HOST} .
# And if requested domain is NOT the canonical domain
RewriteCond %{HTTP_HOST} !^mysite.com\.com
# redirect to requested page in canonical domain
RewriteRule (.*) [mysite.com.com...] [R=301,L]
# BEGIN Url Rewrite section
# (Automatically generated. Do not edit this section)
<IfModule mod_rewrite.c>
Options +FollowSymlinks
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d [OR]
RewriteCond %{REQUEST_FILENAME} gallery\_remote2\.php
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . - [L]
RewriteCond %{THE_REQUEST} \ /sitemap(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=sitemap.Sitemap [QSA,L]
RewriteCond %{THE_REQUEST} \ /admin/(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.SiteAdmin [QSA,L]
RewriteCond %{THE_REQUEST} \ /photos/([^?]+)/([0-9]*)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.ShowItem&g2_path=%1&g2_page=%2 [QSA,L]
RewriteCond %{THE_REQUEST} \ /d/([0-9]+)-([0-9]+)/([^\/\?]+)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.DownloadItem&g2_itemId=%1&g2_serialNumber=%2&g2_fileName=%3 [QSA,L]
</IfModule>
# END Url Rewrite section
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
<Files ~ "\.(inc|class)$">
Deny from all
</Files>
#<Limit GET POST>
#order deny,allow
#deny from all
#allow from all
#</Limit>
#<Limit PUT DELETE>
#order deny,allow
#deny from all
#</Limit>
AuthName mysite.com
AddHandler server-parsed .html
AddHandler server-parsed .htm
DirectoryIndex index.php index.htm index.html
RedirectMatch permanent ^/architectural-photography/$ [mysite.com...]
RedirectMatch permanent ^/landscape-photography/$ [mysite.com...]
RedirectMatch permanent ^/cityscapes-skylines/$ [mysite.com...]
RedirectMatch permanent ^/skylines-cityscapes/$ [mysite.com...]
RedirectMatch permanent ^/new-york-photography/$ [mysite.com...]
RedirectMatch permanent ^/new-york-photography/skylines/$ [mysite.com...]
RedirectMatch permanent ^/washington-dc-photography/$ [mysite.com...]
RedirectMatch permanent ^/photos/new-york-city/skylines/$ [mysite.com...]
RedirectMatch permanent ^/photos/new-york-city/$ [mysite.com...]
And because your RedirectMatch directives do not contain a "photos" or ".html" pattern, they won't apply to the URLs with "photos" in them, or to URLs with ".html" on them.
Jim
It appears that that is because the URLs with no ".html" on the end are redirected by the (redundant) RedirectMatch directives at the end of your file, and so neither of your example URLs will be handled by mod_rewrite. And because your RedirectMatch directives do not contain a "photos" or ".html" pattern, they won't apply to the URLs with "photos" in them, or to URLs with ".html" on them.
The RedirectMatch directives are not redundant: when I remove them, the URLs that they handle stop being redirected. The redirects at the top don't seem to handle them. Also, if you look at the last RedirectMatch, it does have the "photos/" in the directory and it still isn't working. Or at least, the URL is matched properly, but for some reason it triggers the appended dynamic URL at the end. I am wondering if this is more an issue with the app than with URL rewrite. Should the last RedirectMatch directive work, or does the "/photos/" need to be isolated somehow? I looked for that in the tutorial but didn't see it.
The scope of the posted code is too big for a forum venue. Please narrow down your code to only those lines related to problems with one or two URL-types, then test the reduced code and ask about problems with that -- Divide and conquer.
Jim
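One way to apply "divide and conquer" here is to test temporarily with a stripped-down .htaccess that holds only the rule for a single URL type, then check the headers for that one request. The fragment below is a sketch, not a drop-in fix: example.com and the target path are hypothetical placeholders, because the real targets are obscured in the posted code.

```apache
# Reduced test file: one URL type only (a directory-level request with no ".html").
Options +FollowSymLinks
RewriteEngine on

# Hypothetical target path; the real one is elided in the thread.
RewriteRule ^new-york-photography/$ http://example.com/photos/new-york-city/ [R=301,L]
```

With the application's own rewrite section removed, the target itself will not resolve, so judge the test by the first response only: one 301 whose Location header has no query string appended. The other sections can then be restored one at a time until the extra redirect reappears.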
#1 Server Response: [mysite.com...]
HTTP Status Code: HTTP/1.1 301 Moved Permanently
Date: Thu, 05 Oct 2006 22:14:18 GMT
Server: Apache/2.0.52 (CentOS)
Location: [mysite.com...]
Content-Length: 339
Connection: close
Content-Type: text/html; charset=iso-8859-1
Redirect Target: [mysite.com...]
#2 Server Response: [mysite.com...]
HTTP Status Code: HTTP/1.1 301 Moved Permanently
Date: Thu, 05 Oct 2006 22:14:18 GMT
Server: Apache/2.0.52 (CentOS)
Location: [mysite.com...]
Content-Length: 391
Connection: close
Content-Type: text/html; charset=iso-8859-1
Redirect Target: [mysite.com...]
#3 Server Response: [mysite.com...]
HTTP Status Code: HTTP/1.1 404 Not Found
Date: Thu, 05 Oct 2006 22:14:18 GMT
Server: Apache/2.0.52 (CentOS)
X-Powered-By: PHP/4.3.9
Content-Length: 972
Connection: close
Content-Type: text/html; charset=ISO-8859-1
I have turned off any module handling permalinks, etc. The only thing I can think of is that I have the directory "photos" set in the URL rewrite module for every page view. The pages don't actually reside in a directory called photos; the rewrite module just adds the directory to the URL. It's probably the app's code that is conflicting, not our rewrites. I suspect it's this:
RewriteCond %{THE_REQUEST} \ /photos/([^?]+)/([0-9]*)(\?.|\ .)
RewriteCond %{REQUEST_URI} !/index\.php$
RewriteRule . /index.php?g2_view=core.ShowItem&g2_path=%1&g2_page=%2 [QSA,L]
Since what is being appended is this:
?g2_view=core.ShowItem&g2_path=new-york-city&g2_page=
So, the question is, what is then doing another external 301 redirect after that internal rewrite, thus 'exposing' the query string to the client?
I may have missed it, but I see no redirects in your code that would affect requests for index.php.
Be certain that the domain name in your headers report is always correct -- If you have UseCanonicalName on, and the 'canonical name' it 'uses' is not the one your canonical domain redirect is set up to redirect to, then that is one thing that could be exposing the query; UseCanonicalName could force a redirect to the 'wrong' canonical domain, in other words, and then your code would redirect it back... Just a possibility, and it would be obvious in the header checker report if we didn't obscure domain names here.
Jim
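For reference, a sketch of where the UseCanonicalName setting lives. The directive is not permitted in .htaccess, so this assumes access to the main server or virtual-host configuration, which the thread does not confirm. The host names are the example ones used earlier in the thread.

```apache
# Assumption: this goes in httpd.conf or a virtual-host file, not in .htaccess.
<VirtualHost *:80>
    ServerName example.com
    ServerAlias www.example.com
    # Off: self-referential redirects are built from the Host header the client sent,
    # rather than being forced to the ServerName value.
    UseCanonicalName Off
</VirtualHost>
```

If the setting is left On instead, Apache builds self-referential URLs from ServerName, so ServerName would need to match the domain that the canonical-domain redirect targets.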