Need help fine-tuning .htaccess file

         

uk_martin

1:39 pm on Jan 20, 2011 (gmt 0)

10+ Year Member



Hi

I have the Error Pages MOD installed on the forum of my site, but I am having problems making it work properly with "short URLs". The problem seems to stem from the .htaccess file in the root of my site.

My forum is in the "/main" directory.

If I currently type in this correct short URL - [brummiesfans.com...] it will correctly redirect to [brummiesfans.com...]

If I type in [brummiesfans.com...] which is a "nonsense" (short) URL, I correctly get the "nice" 404 page, which is what the MOD creates.

HOWEVER, if I type in [brummiesfans.com...] I get a totally messed up page (see for yourself)

In the root of my site, the htaccess file is:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^brummiesfans.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.brummiesfans.com$
RewriteRule ^blogs$ "http\:\/\/www\.brummiesfans\.com\/main\/blog\.php" [R=301,L]

<Files "main/config.php">
Order Allow,Deny
Deny from All
</Files>

<Files "main/common.php">
Order Allow,Deny
Deny from All
</Files>

# th23 start - error pages
ErrorDocument 400 /main/error.php?e=400
ErrorDocument 401 /main/error.php?e=401
ErrorDocument 403 /main/error.php?e=403
ErrorDocument 404 /main/error.php?e=404
ErrorDocument 500 /main/error.php?e=500
# th23 end - error pages


A while ago, when experimenting with the SEO MOD, I had a completely different .htaccess file. If I were to use the following file, something in it would stop the nonsense page from appearing, and the "nice" 404 page would appear instead.
<Files "config.php">
Order Allow,Deny
Deny from All
</Files>

<Files "common.php">
Order Allow,Deny
Deny from All
</Files>

<IfModule mod_rewrite.c>
RewriteEngine on

# th23 start - error pages
ErrorDocument 400 /main/error.php?e=400
ErrorDocument 401 /main/error.php?e=401
ErrorDocument 403 /main/error.php?e=403
ErrorDocument 404 /main/error.php?e=404
ErrorDocument 500 /main/error.php?e=500
# th23 end - error pages

Rewriterule ^blog/(.+)/(.+).html$ ./blog/view/blog.php?page=$1&mode=$2 [NC]
Rewriterule ^blog/(.+).html$ ./blog/blog.php?page=$1 [NC]
Rewriterule ^blog/(.+)/$ ./blog/view/blog.php?page=$1 [NC]
Rewriterule ^blog/$ ./blog/blog.php [NC]

RewriteCond %{REQUEST_FILENAME} !-f
Rewriterule ^blog/(.+)/(.+)$ ./blog/view/blog.php?page=$1&mode=$2 [NC]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^blog/(.+)$ ./blog/blog.php?page=$1 [NC]
</IfModule>


DirectoryIndex portal.php index.php index.html index.htm

# You may need to un-comment the following lines
# Options +FollowSymlinks
# To make sure that rewritten dir or file (/|.html) will not load dir.php in case it exist
# Options -MultiViews
# REMEMBER YOU ONLY NEED TO START MOD REWRITE ONCE
RewriteEngine On
# REWRITE BASE
RewriteBase /
# HERE IS A GOOD PLACE TO FORCE CANONICAL DOMAIN
# RewriteCond %{HTTP_HOST} !^www\.brummiesfans\.com$ [NC]
# RewriteRule ^(.*)$ http://www.brummiesfans.com/$1 [QSA,L,R=301]

# DO NOT GO FURTHER IF THE REQUESTED FILE / DIR DOES EXISTS
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule . - [L]
#####################################################
# PHPBB SEO REWRITE RULES ALL MODES
#####################################################
# AUTHOR : dcz www.phpbb-seo.com
# STARTED : 01/2006
#################################
# FORUMS PAGES
###############
# FORUM INDEX
RewriteRule ^forum\.html$ /index.php [QSA,L,NC]
# FORUM ALL MODES
RewriteRule ^(forum|[a-z0-9_-]*-f)([0-9]+)/?(page([0-9]+)\.html)?$ /viewforum.php?f=$2&start=$4 [QSA,L,NC]
# TOPIC WITH VIRTUAL FOLDER ALL MODES
RewriteRule ^(forum|[a-z0-9_-]*-f)([0-9]+)/(topic|[a-z0-9_-]*-t)([0-9]+)(-([0-9]+))?\.html$ /viewtopic.php?f=$2&t=$4&start=$6 [QSA,L,NC]
# GLOBAL ANNOUNCES WITH VIRTUAL FOLDER ALL MODES
RewriteRule ^announces/(topic|[a-z0-9_-]*-t)([0-9]+)(-([0-9]+))?\.html$ /viewtopic.php?t=$2&start=$4 [QSA,L,NC]
# TOPIC WITHOUT FORUM ID & DELIM ALL MODES
RewriteRule ^([a-z0-9_-]*)/?(topic|[a-z0-9_-]*-t)([0-9]+)(-([0-9]+))?\.html$ /viewtopic.php?forum_uri=$1&t=$3&start=$5 [QSA,L,NC]
# PHPBB FILES ALL MODES
RewriteRule ^resources/[a-z0-9_-]+/(thumb/)?([0-9]+)$ /download/file.php?id=$2&t=$1 [QSA,L,NC]
# PROFILES THROUGH USERNAME
RewriteRule ^member/([^/]+)/?$ /memberlist.php?mode=viewprofile&un=$1 [QSA,L,NC]
# USER MESSAGES THROUGH USERNAME
RewriteRule ^member/([^/]+)/(topics|posts)/?(page([0-9]+)\.html)?$ /search.php?author=$1&sr=$2&start=$4 [QSA,L,NC]
# GROUPS ALL MODES
RewriteRule ^(group|[a-z0-9_-]*-g)([0-9]+)(-([0-9]+))?\.html$ /memberlist.php?mode=group&g=$2&start=$4 [QSA,L,NC]
# POST
RewriteRule ^post([0-9]+)\.html$ /viewtopic.php?p=$1 [QSA,L,NC]
# ACTIVE TOPICS
RewriteRule ^active-topics(-([0-9]+))?\.html$ /search.php?search_id=active_topics&start=$2&sr=topics [QSA,L,NC]
# UNANSWERED TOPICS
RewriteRule ^unanswered(-([0-9]+))?\.html$ /search.php?search_id=unanswered&start=$2&sr=topics [QSA,L,NC]
# NEW POSTS
RewriteRule ^newposts(-([0-9]+))?\.html$ /search.php?search_id=newposts&start=$2&sr=topics [QSA,L,NC]
# THE TEAM
RewriteRule ^the-team\.html$ /memberlist.php?mode=leaders [QSA,L,NC]
# HERE IS A GOOD PLACE TO ADD OTHER PHPBB RELATED REWRITERULES

#####################################################
# GYM Sitemaps & RSS
# Global channels
RewriteRule ^main/rss(/(news)+)?(/(digest)+)?(/(short|long)+)?/?$ /gymrss.php?channels&$2&$4&$6 [QSA,L,NC]
# HTML Global news & maps
RewriteRule ^main/(news|maps)/?(page([0-9]+)\.html)?$ /map.php?$1&start=$3 [QSA,L,NC]
# END GYM Sitemaps & RSS
#####################################################

# FORUM WITHOUT ID & DELIM ALL MODES (SAME DELIM)
# THESE THREE LINES MUST BE LOCATED AT THE END OF YOUR HTACCESS TO WORK PROPERLY
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z0-9_-]+)/?(page([0-9]+)\.html)?$ /viewforum.php?forum_uri=$1&start=$3 [QSA,L,NC]
# FIX RELATIVE PATHS : FILES
RewriteRule ^.+/(style\.php|ucp\.php|mcp\.php|faq\.php|download/file.php)$ /$1 [QSA,L,NC,R=301]
# FIX RELATIVE PATHS : IMAGES
RewriteRule ^.+/(styles/.*|images/.*)/$ /$1 [QSA,L,NC,R=301]
# END PHPBB PAGES
#####################################################

#####################################################
# GYM Sitemaps & RSS
# HTML Module additional modes
RewriteRule ^main/(news|maps)/([a-z0-9_-]+)(/([a-z0-9_-]+))?/?(page([0-9]+)\.html)?$ /map.php?$2=$4&$1&start=$6 [QSA,L,NC]
# Main feeds & channels
RewriteRule ^main/rss(/(news)+)?(/(digest)+)?(/(short|long)+)?(/([a-z0-9_-]+))?/([a-z0-9_]+)\.xml(\.gz)?$ /gymrss.php?$9=$8&$2&$4&$6&gzip=$10 [QSA,L,NC]
# Module feeds
RewriteRule ^main/[a-z0-9_-]*-[a-z]{1,2}([0-9]+)(/(news)+)?(/(digest)+)?(/(short|long)+)?/([a-z0-9_]+)\.xml(\.gz)?$ /gymrss.php?$8=$1&$3&$5&$7&gzip=$9 [QSA,L,NC]
# Module feeds without ids
RewriteRule ^main/([a-z0-9_-]+)(/(news)+)?(/(digest)+)?(/(short|long)+)?/([a-z0-9_]+)\.xml(\.gz)?$ /gymrss.php?nametoid=$1&$3&$5&$7&modulename=$8&gzip=$9 [QSA,L,NC]
# Google SitemapIndex
RewriteRule ^main/sitemapindex\.xml(\.gz)?$ /sitemap.php?gzip=$1 [QSA,L,NC]
# Module cat sitemaps
RewriteRule ^main/[a-z0-9_-]+-([a-z]{1,2})([0-9]+)\.xml(\.gz)?$ /sitemap.php?module_sep=$1&module_sub=$2&gzip=$3 [QSA,L,NC]
# Module sitemaps
RewriteRule ^main/([a-z0-9_]+)-([a-z0-9_-]+)\.xml(\.gz)?$ /sitemap.php?$1=$2&gzip=$3 [QSA,L,NC]
# END GYM Sitemaps & RSS
#####################################################

#########################################################
# ALBUM REWRITE RULES #
#########################################################
# AUTHOR : dcz http://www.phpbb-seo.com/
# STARTED : 2009/01/15
########################
# ALBUM INDEX
RewriteRule ^main/gallery/album.html$ /gallery/index.php [QSA,L,NC]
# ALBUM PERSONAL INDEX
RewriteRule ^main/gallery/user-albums/?(page([0-9]+)\.html)?$ /gallery/index.php?mode=personal&start=$2 [QSA,L,NC]
# ALBUM
RewriteRule ^main/gallery/[a-z0-9_-]*-a([0-9]+)/?(page([0-9]+)\.html)?$ /gallery/album.php?album_id=$1&start=$3 [QSA,L,NC]
# PIC PAGE
RewriteRule ^main/gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-p([0-9]+)(-([0-9]+))?\.html$ /gallery/image_page.php?album_id=$2&image_id=$3&start=$5 [QSA,L,NC]
# JGP
RewriteRule ^main/gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-i([0-9]+)\.jpg$ /gallery/image.php?album_id=$2&image_id=$3 [QSA,L,NC]
# JPG THUMBNAILS
RewriteRule ^main/gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-t([0-9]+)\.jpg$ /gallery/image.php?mode=thumbnail&album_id=$2&image_id=$3 [QSA,L,NC]
# JPG MEDIUM
RewriteRule ^main/gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-m([0-9]+)\.jpg$ /gallery/image.php?mode=medium&album_id=$2&image_id=$3 [QSA,L,NC]
#########################################################

#########################################################
# ALBUM REWRITE RULES#
#########################################################
# AUTHOR : dcz http://www.phpbb-seo.com/
# STARTED : 2009/01/15
########################
# ALBUM INDEX
RewriteRule ^gallery/album.html$ /gallery/index.php [QSA,L,NC]
# ALBUM PERSONAL INDEX
RewriteRule ^gallery/user-albums/?(page([0-9]+)\.html)?$ /gallery/index.php?mode=personal&start=$2 [QSA,L,NC]
# ALBUM
RewriteRule ^gallery/[a-z0-9_-]*-a([0-9]+)/?(page([0-9]+)\.html)?$ /gallery/album.php?album_id=$1&start=$3 [QSA,L,NC]
# PIC PAGE
RewriteRule ^gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-p([0-9]+)(-([0-9]+))?\.html$ /gallery/image_page.php?album_id=$2&image_id=$3&start=$5 [QSA,L,NC]
# JGP
RewriteRule ^/gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-i([0-9]+)\.jpg$ /gallery/image.php?album_id=$2&image_id=$3 [QSA,L,NC]
# JPG THUMBNAILS
RewriteRule ^gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-t([0-9]+)\.jpg$ /gallery/image.php?mode=thumbnail&album_id=$2&image_id=$3 [QSA,L,NC]
# JPG MEDIUM
RewriteRule ^gallery/[a-z0-9_-]*(-a([0-9]+)/)?[a-z0-9_-]*-m([0-9]+)\.jpg$ /gallery/image.php?mode=medium&album_id=$2&image_id=$3 [QSA,L,NC]
#########################################################
#
# Uncomment the statement below if you want to make use of
# HTTP authentication and it does not already work.
# This could be required if you are for example using PHP via Apache CGI.
#
#<IfModule mod_rewrite.c>
#RewriteEngine on
#RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
#</IfModule>

<IfModule mod_rewrite.c>
RewriteEngine on


# For redirecting www.yourdomain.com to yourdomain.com,
# uncomment the following 2 lines and edit domain
#RewriteCond %{HTTP_HOST} ^www.yourdomain.com$
#RewriteRule ^(.*)/?$ http://yourdomain.com/$1 [QSA,R=301]

# For redirecting yourdomain.com to www.yourdomain.com,
# uncomment the following 2 lines and edit domain
#RewriteCond %{HTTP_HOST} ^yourdomain.com$
#RewriteRule ^(.*)/?$ http://www.yourdomain.com/$1 [QSA,R=301]
</IfModule>

<Files "config.php">
Order Allow,Deny
Deny from All
</Files>

<Files "common.php">
Order Allow,Deny
Deny from All
</Files>


The problem with this .htaccess file, though, is that it would not allow any other short URLs to work at all. With this file in place, typing in a valid short URL would also produce the "nice" 404 page.

What I would like is for someone who knows .htaccess to identify whatever it is in that file that sends a nonsense "root" (short) URL to the "nice" 404 page (we know that the "/main/nonsense" URLs are OK), and to carry that over whilst still allowing the real short URLs to continue to work.

Can it be done?

I hope so, and look forward to hearing from anyone who can help.

Thanks

Martin

g1smd

7:08 pm on Jan 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There are a number of serious issues with the file above, which should be cleared up before thinking about anything else:

- RewriteEngine On should appear just once, right at the beginning,
- there should be no "escaping" in target URLs, only in patterns,
- all of the external redirects must be listed before any of the internal rewrites,
- every rule should have the [L] flag added,
- the original query string is passed through by default; you only need the [QSA] flag when the rule's target specifies a query string of its own and you also want the original query string appended to it,
- the <IfModule> wrappers should be removed, while keeping the rules they enclose.
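As a rough sketch only, a file laid out along the lines of the list above might look like the following. The domain www.example.com and the two sample rewrites are placeholders chosen for illustration, not a tested configuration for this site.

```apache
# RewriteEngine appears once, at the top
RewriteEngine On

# 1. External redirects first: unescaped target URLs, [L] on every rule
RewriteRule ^blogs$ http://www.example.com/main/blog.php [R=301,L]

# 2. Internal rewrites after all redirects
#    No QSA needed here: the target has no query string of its own,
#    so the original query string is passed through unchanged
RewriteRule ^forum\.html$ /index.php [L]

#    QSA needed here: the target adds its own query string (p=...)
#    and the original query string should be appended to it
RewriteRule ^post([0-9]+)\.html$ /viewtopic.php?p=$1 [QSA,L]
```

The ordering matters because a request that is internally rewritten before a redirect rule sees it could expose the internal file path to the client in the redirect.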

uk_martin

10:42 pm on Jan 20, 2011 (gmt 0)

10+ Year Member



Thanks very much for the reply. I start from a knowledge base of zero on matters to do with .htaccess. Various scripts add things to the .htaccess file, and as long as the site has been visible, I have never touched it.

That said, the "big" htaccess file has been dumped. Now I have only the following:

RewriteEngine on

# th23 start - error pages
ErrorDocument 400 /main/error.php?e=400
ErrorDocument 401 /main/error.php?e=401
ErrorDocument 403 /main/error.php?e=403
ErrorDocument 404 /main/error.php?e=404
ErrorDocument 500 /main/error.php?e=500
# th23 end - error pages

<Files "main/config.php">
Order Allow,Deny
Deny from All
</Files>

<Files "main/common.php">
Order Allow,Deny
Deny from All
</Files>

RewriteCond %{HTTP_HOST} ^brummiesfans.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.brummiesfans.com$
RewriteRule ^blogs$ "http\:\/\/www\.brummiesfans\.com\/main\/blog\.php" [R=301,L]
RewriteCond %{HTTP_HOST} ^brummiesfans.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.brummiesfans.com$
RewriteRule ^shop$ "http\:\/\/www\.brummiesfans\.com\/main\/ppshop\.php" [R=301,L]


So addressing your advice:

- RewriteEngine is on, only once now.
- I don't know what you mean by "escaping" in target URL's in this context.
- Are the ErrorDocument lines classified as "external redirects"? In which case they are there before the short URL's which I take it are the "internal rewrites" (I'm learning the terminology as I go with this lol)
- By "[L] Flag" I take it that you mean something like the [R=301,L]? In which case every line that begins with "RewriteRule" does now end with [R=301,L], so that's OK
- There are no more <ifmodule>'s

HOWEVER, I'm still in the position that the /main/error.php produces a very pleasant "404" page if you type in - www.brummiesfans.com/main/nonsense - whereas typing in just www.brummiesfans.com/nonsense produces a broken page.

Having cleaned up the htaccess file to something which I hope is (or is nearing) a better starting point, what can be done to it to fix the broken "404" page?

Thanks again in anticipation.

Martin

g1smd

11:54 pm on Jan 20, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Escaping. Delete all the \ from target URLs; i.e. delete from here:
http\:\/\/www\.brummiesfans\.com\/main\/blog\.php

Additionally, this code:

RewriteCond %{HTTP_HOST} ^brummiesfans.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.brummiesfans.com$
RewriteRule ^blogs$ "http\:\/\/www\.brummiesfans\.com\/main\/blog\.php" [R=301,L]
RewriteCond %{HTTP_HOST} ^brummiesfans.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.brummiesfans.com$
RewriteRule ^shop$ "http\:\/\/www\.brummiesfans\.com\/main\/ppshop\.php" [R=301,L]


simplifies to:

RewriteRule ^blogs$ http://www.example.com/main/blog.php [R=301,L]
RewriteRule ^shop$ http://www.example.com/main/ppshop.php [R=301,L]


You should add the standard non-www to www redirect code after the above redirects.

ErrorDocument directives can go anywhere. Just keep them all together in one block.

Yes to the [L] flag. Make sure there is one, in, or as part of, every RewriteRule.

[edited by: g1smd at 12:31 am (utc) on Jan 21, 2011]

wilderness

12:08 am on Jan 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



don't know what you mean by "escaping" in target URL's


From the Forum Library [webmasterworld.com] Mod Rewrite & Regular Expressions [webmasterworld.com]
Quote:
\ is called an escaping character, this removes the function from a 'special character' (EG if you needed to match index.php?, which has both a . (dot) and a ?, you would have to 'escape' the special characters . (dot) and ? with a \ to remove their 'special' value it looks like this: index\.php\?)
end of quote

Escaping is required in regular-expression patterns, such as those in lines that begin with "RewriteCond" (and the pattern part of a RewriteRule).

The target URLs defined in lines which begin with "RewriteRule" do NOT require escaping.

Thus your RewriteRule lines are in error and need correcting.
EX (please note; domain name changed within forum policy):
RewriteRule ^shop$ "http://www.example.com/main/ppshop.php" [R=301,L]
end of quote

You'll need to correct the other RewriteRule URL.
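To make the distinction concrete, here is a small sketch that puts a pattern and a target side by side. The domain www.example.com is a placeholder, and the RewriteCond is included only to show where escaping belongs; it is not needed for this particular redirect.

```apache
# Pattern (a regular expression): dots are escaped so they match literal dots
RewriteCond %{HTTP_HOST} ^www\.example\.com$

# RewriteRule has a pattern (^shop$) and a target URL.
# The target is plain text, so it takes no backslashes at all
RewriteRule ^shop$ http://www.example.com/main/ppshop.php [R=301,L]
```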

uk_martin

11:26 am on Jan 21, 2011 (gmt 0)

10+ Year Member



Thanks for the latest info.

The code was actually created by the site host's "Redirect" facility in CPanel. It can delete as well as create redirects, so I'll have to remember to delete any of these redirects manually in future if it fails to recognise what is there, now that it's getting changed from how it was written into the htaccess file...

...anyway, on with the changes...

uk_martin

12:00 pm on Jan 21, 2011 (gmt 0)

10+ Year Member



OK, so how's this looking, apart from a lot tidier?

  RewriteEngine on

# th23 start - error pages
ErrorDocument 400 /main/error.php?e=400
ErrorDocument 401 /main/error.php?e=401
ErrorDocument 403 /main/error.php?e=403
ErrorDocument 404 /main/error.php?e=404
ErrorDocument 500 /main/error.php?e=500
# th23 end - error pages

<Files "main/config.php">
Order Allow,Deny
Deny from All
</Files>

<Files "main/common.php">
Order Allow,Deny
Deny from All
</Files>

RewriteRule ^announcement$ http://www.brummiesfans.com/main/viewforum.php?f=2 [R=301,L]
RewriteRule ^arcade$ http://www.brummiesfans.com/main/arcade.php [R=301,L]
RewriteRule ^announcements$ http://www.brummiesfans.com/main/viewforum.php?f=2 [R=301,L]

RewriteCond %{HTTP_HOST} ^brummiesfans.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.brummiesfans.com$


I hope I've interpreted correctly what you meant by the standard www or non-www code. It seems to be working anyway.

Thanks again.

g1smd

7:42 pm on Jan 21, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Replace last two lines with:

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]



The two "announcement(s)" lines can be replaced with one.

RewriteRule ^announcements?$ http://www.example.com/main/viewforum.php?f=2 [R=301,L]


The question mark makes the "s" optional.

uk_martin

12:11 am on Jan 22, 2011 (gmt 0)

10+ Year Member



Ahh-haa! Automatic plural handling. Excellent. What you didn't know is that the three short URL redirects listed are just a sample of about 40 in total (no point in boring you with all of them on here), many of which are duplicated to deal with plurals, so this will be a great time and effort saver for the future.

So just to clarify then, the rewrite rule should state the plural of a word, with the "?" after the "s", thereby making the "s" that precedes the "?" an optional letter?

Any word on what it is that can make the "nice" 404 page appear if a nonsense URL pointing at the root of the domain is typed into the browser, as described in the first post?

Thanks again

Martin

g1smd

12:57 am on Jan 22, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Automatic "anything-you-like" handling:

jpe?g matches jpg and jpeg.

s?html? matches htm, html, shtm and shtml.

t(his|hat) matches this or that.

[cpt]hat matches chat, phat and that.

Always look for ways to simplify multiple rules.
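Applied to short-URL redirects, the same ideas might look like the sketch below. The short URLs "game", "games" and "store" are made up for illustration and www.example.com is a placeholder domain; only "shop", "arcade" and "announcement(s)" come from this thread.

```apache
# Optional letter: matches "announcement" and "announcements"
RewriteRule ^announcements?$ http://www.example.com/main/viewforum.php?f=2 [R=301,L]

# Alternation plus optional letter: matches "game", "games" and "arcade"
RewriteRule ^(games?|arcade)$ http://www.example.com/main/arcade.php [R=301,L]

# Alternation: matches "shop" and "store"
RewriteRule ^(shop|store)$ http://www.example.com/main/ppshop.php [R=301,L]
```

Each combined rule replaces two or three separate ones, which adds up quickly with a list of about 40 short URLs.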

uk_martin

6:07 pm on Jan 23, 2011 (gmt 0)

10+ Year Member



Good to know. Thanks for the tips.

Now for the big one... finding out what it takes to redirect all the 404 pages to the correct "friendly 404" page that was described in the original post.

I could really do with that, as the site software has been upgraded with changes needed to the URL structure. Anyone using old bookmarks may inadvertently head to the wrong page, so I'd rather they got the friendly 404 page with an active menu structure than the nonsense page that 404 calls currently come up with.

Thanks

Martin

g1smd

8:56 pm on Jan 23, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What is the real location of the friendly error page inside the server?

Use this code to serve it for a 404 error:

ErrorDocument 404 /that-location


You do NOT ever "redirect" to an error page. That would serve a 301 status followed by a 200 OK status.

You must directly serve a 404 status for pages not found.
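A short sketch of the difference, using the error script path mentioned earlier in this thread (www.example.com is a placeholder domain):

```apache
# Local URL-path: Apache serves the page internally and the client
# still receives the 404 status
ErrorDocument 404 /main/error.php?e=404

# Avoid: a full URL makes Apache send the client a redirect instead,
# so the client never sees the original 404 status
# ErrorDocument 404 http://www.example.com/main/error.php?e=404
```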

uk_martin

10:49 pm on Jan 23, 2011 (gmt 0)

10+ Year Member



/main/error.php is a file that creates the nice looking "404" pages etc. You can see one of these by typing in www.brummiesfans.com/main/nonsense (or any other incorrect URL in the /main/ directory).

The lines in the htaccess file that look like /main/error.php?e=404 (etc.) are the ones that redirect 404s, 403s etc. towards error.php, which in turn provides a nice politely worded page that explains to the viewer what the problem is. The page has a menu with correct links, so that they stay on site and find what they are looking for.

The problem is that incorrect URLs in the root of the domain do not work this way, e.g. a misspelt short URL. So if you tried typing in www.brummiesfans.com/nonsense you get totally the wrong sort of 404 page.

As I said in the first post, there was an old htaccess page, that DID redirect incorrect URL's in the root of the domain to /main/error.php BUT that was at a cost of any other short URL functioning.

Having slowly learned a bit more about this, I suspect that there may be a wording in that htaccess file that redirects every URL, that points to the root, (including "proper" short URL's) as if it was nonsense, and connects it to /main/error.php instead

If my suspicion is correct, then maybe it can allow for exceptions, i.e. those short URL's that we have listed. Would this be a possibility?

g1smd

11:41 pm on Jan 23, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If the requested URL resolves to your main script, the script must internally "make" the HTML and content for the error page and send it with a 404 header. There is no way for .htaccess to serve an error page for this, because as far as .htaccess is concerned, there is no error - the URL HAS resolved to an internal file. That script must send the error messages and 404 status when there is no content matching the requested URL.

If a URL request does not resolve to any internal file, then, and only then, will .htaccess send the ErrorDocument document and the HTTP 404 header.

jdMorgan

12:18 am on Jan 25, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I didn't have time for anything but a very quick review of this thread, but the basic problem is that (as g1smd points out), you are externally redirecting to your script instead of simply rewriting requests for the /blogs URL-paths to the /main/blog.php filepath.

This is in fact the difference between your 'broken' code and the code that you said works; Your broken code specifies external client redirect syntax, while the working code simply takes an incoming client request for a URL and internally rewrites it to a script.

A redirect is invoked in response to a URL request, and sends a response to the HTTP client (e.g. browser or search engine robot) saying, "The resource you requested has moved to a different URL. Please ask for it again at this new URL: http://example.com/new-url-here.xyz"

This ends the current HTTP transaction, and the client must make a new HTTP request using that provided URL in order to see the content it was originally asking for. This means that the HTTP client must make two requests in a row for the page that it wants: highly inefficient, and "very not good" for search ranking...

In contrast, an internal rewrite simply modifies the filepath associated with a particular requested URL, changing it from the default associated filepath to something different. This occurs entirely within the context of the original client-initiated HTTP transaction and, correctly-implemented, is completely invisible to the client.

Understanding the vast differences between a URL-path and a file-path and between an external URL-to-URL redirect and an internal URL-to-filepath rewrite will be critical to your success in comprehending the responses above, and in fixing this problem.
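For illustration, the two forms differ only in the target and the flags. This is a sketch with a placeholder domain (www.example.com), and the two rules are alternatives, so only one of them would be used for a given short URL.

```apache
# External redirect: the client is told to request a new URL,
# and the address bar changes to /main/blog.php
RewriteRule ^blogs$ http://www.example.com/main/blog.php [R=301,L]

# Internal rewrite: the request for /blogs is mapped to the script
# inside the server; the client still sees /blogs
# RewriteRule ^blogs$ /main/blog.php [L]
```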

Jim

uk_martin

12:49 am on Jan 25, 2011 (gmt 0)

10+ Year Member



The current htaccess file is a tidied up version of that one that was described in the post at 12:00 pm on Jan 21, 2011

Correct me if I am wrong, but as I see it, are the "external" redirects not those which would include badly formed URLs that would ordinarily produce "404" pages, and which are being handled by the following lines of the htaccess file?

# th23 start - error pages
ErrorDocument 400 /main/error.php?e=400
ErrorDocument 401 /main/error.php?e=401
ErrorDocument 403 /main/error.php?e=403
ErrorDocument 404 /main/error.php?e=404
ErrorDocument 500 /main/error.php?e=500
# th23 end - error pages


I know that taking this block of instruction out, the 404's revert to plain vanilla black text on white background, 404 pages.

I am beginning to wonder if there is something in the PHP script within error.php... but I keep on coming back to the fact that with the old htaccess file, irrespective of the "blogs" reference, any misspelt URL was getting the custom 404 page, correctly constructed... and now that's no longer happening.