404 - possible to use a generic redirect

and tell the bots the file's moved permanently?

     
7:00 pm on Dec 17, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


I have a new version of a site that had about 50 pages in the old version. When I publish the new site I want to catch all the existing links to these outdated pages and redirect them to the new home page. So I'm thinking of using something like:

ErrorDocument 404 / [R=301]

in hopes that this will tell the bots that the link they're checking (and which doesn't exist) has been moved to (the top page for the site).

What problems will this cause if any?

7:15 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 22, 2002
posts:2546
votes: 0


For fifty pages, I would think you could redirect them individually.

Search engine wise, I think it would be wiser to 301 each old page to its counterpart, rather than to the home page. Just my 2c, however.

BTW, I was just reading Shopping Carts 101 [webmasterworld.com]...Nice!

7:28 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 22, 2002
posts:2546
votes: 0


Also, after reading:

httpd.apache.org/docs/mod/core.html#errordocument

I don't think the ErrorDocument directive allows for optional flags (R, L, etc.).
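
To illustrate the point: ErrorDocument takes only a status code and a document, while a permanent redirect for a known old URL is usually issued by a separate directive. A minimal sketch, assuming mod_alias is available; the page names are hypothetical and not from this thread:

```apache
# ErrorDocument accepts a status code and a document only.
# RewriteRule-style flags such as [R=301] are not part of its syntax.
ErrorDocument 404 /404.php

# A permanent (301) redirect for one known old page, using mod_alias.
# /old-page.html and new-page.html are hypothetical names.
Redirect permanent /old-page.html http://www.example.com/new-page.html
```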

Birdman

7:42 pm on Dec 17, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


Re: individual pages - yeah I suppose so but I was hoping to avoid it. Boring at best and it just fattens up the htaccess file.

Just tested it and you're right - the ErrorDocument line does not like the flag.

7:54 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 22, 2002
posts:2546
votes: 0


If you prefer to just let the spiders find the new pages via your new homepage, then a standard ErrorDocument 404 / should do the job.

Are the old pages in folders or root?

8:00 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


301-redirecting that many files to your home page may look like a scam to our friends at the 'plex, too.

I'd suggest adding some explanatory (and apologetic) text, plus a text link to your home page, on your custom 404 page, and also a 5-to-15 second meta-refresh redirect to your home page. A delayed meta-refresh is treated much like a 302 (temporary) redirect, so it makes no implicit claim that the home page is a replacement for the custom error page or for the missing pages.

I've used that technique for years without ill effect in the search engines.

As Birdman says, it would be better to redirect each page to a close counterpart if one is available, and then use the method above for those that have no reasonable replacement.

You can minimize the number of code lines needed for the redirect by using the regex inline 'OR' function:


RewriteRule ^(file1|file2|file3|file4)\.html$ http://www.example.com/newwidgets.html [R=301,L]
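
The two-part approach described in this post (301 the pages that have a close counterpart, let everything else fall through to the custom 404 page) might be laid out roughly as follows. This is only a sketch: the old and new file names are hypothetical, and it assumes the .htaccess file sits in the web root with mod_rewrite enabled.

```apache
RewriteEngine on

# Hypothetical group of old pages that share one new counterpart.
RewriteRule ^(widgets|widget-list|widget-prices)\.html$ http://www.example.com/newwidgets.html [R=301,L]

# Hypothetical second group with a different counterpart.
RewriteRule ^(about|history)\.html$ http://www.example.com/company.html [R=301,L]

# Old pages with no reasonable replacement are not redirected;
# they end up at the custom 404 page and keep their 404 status.
ErrorDocument 404 /404.php
```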

Jim

8:39 pm on Dec 17, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


Thanks for the help folks.

One last follow up. I mapped the missing pages and then added my 404 line:

ErrorDocument 404 /404.php

But the server chokes when I add this line. I've tried it immediately after the 'RewriteEngine on' and after my last mask (hiding dynamic pages) 'RewriteRule ^(.*)\.html$ $1.php [L] [T=application/x-httpd-php]'. When I first tested this without the additional mappings it worked. Any idea as to why it's not working now?

9:05 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 22, 2002
posts:2546
votes: 0


Have you tried moving the line above "RewriteEngine on"?

9:30 pm on Dec 17, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


Doesn't it need to be within the Rewrite directive?

9:45 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


No,

ErrorDocument is an Apache 'core' directive, while the Rewrite* directives (RewriteEngine, RewriteCond, RewriteRule) belong to mod_rewrite. They are handled separately.

Also, look at your server error log to see what the problem was. You may need to add an AddHandler directive in order to parse PHP error pages if you *only* use that RewriteRule to redirect to PHP files, and do not reference them in any other way. In that case, you may not have a handler defined for PHP files, and any PHP file access (including ErrorDocument) not done using the RewriteRule will not work.

This directive would take the form:


AddHandler server-parsed php

You could also add

AddType application/x-httpd-php php
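
Putting the pieces from this post together, the .htaccess file might be laid out as in the sketch below. It assumes PHP runs as an Apache module and that the application/x-httpd-php type is what the host's PHP install expects; both are assumptions, so check with the host. The point is that the core directives sit outside the mod_rewrite block and their position relative to it does not matter.

```apache
# Core directives: handled independently of mod_rewrite.
ErrorDocument 404 /404.php

# Assumption: PHP is installed as an Apache module and responds to this type.
# This lets /404.php be parsed even though no RewriteRule touches it.
AddType application/x-httpd-php .php

# mod_rewrite directives.
RewriteEngine on
RewriteRule ^(.*)\.html?$ $1.php [T=application/x-httpd-php,L]
```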

Jim

9:45 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 22, 2002
posts:2546
votes: 0


I don't think so. It is an Apache core feature.

too quick for me jd

10:02 pm on Dec 17, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


Understood.

I think the issue is the wildcarded rules for *.htm and *.html, which were designed to serve up PHP files. These rules seem to be applied before a 404 is detected (which makes sense). So the test file foo.html initiates a rewrite to foo.php, but foo.php does not exist, and the server delivers a generic 404 error. Seems like I may have to nail down the specific files I'm depending on the rewrite rule to handle in order to use the ErrorDocument directive.
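
One way to avoid listing every file by hand would be to rewrite only when the target PHP file actually exists. A minimal sketch, assuming the .htaccess file lives in the web root so that the captured path can be checked against DOCUMENT_ROOT:

```apache
RewriteEngine on

# Only rewrite foo.html (or foo.htm) when foo.php exists on disk.
# $1 here refers to the group captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*)\.html?$ $1.php [T=application/x-httpd-php,L]

# Requests for missing pages are left untouched by the rule above,
# so the 404 is raised for the original URL and handled here.
ErrorDocument 404 /404.php
```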

10:47 pm on Dec 17, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Hmmm...

The process does indeed go as you have surmised, but the server should not deliver a 'generic' 404 error page, it should deliver your custom 404.php page. You are not redirecting 404.php, you are redirecting <anything>.html.

Is it possible that you have some other Redirect, RedirectMatch, or RewriteRule directives that are interfering?

Jim

11:31 pm on Dec 17, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


Other than the long regex to capture the old pages the 2 lines that follow it are:

RewriteRule ^(.*)\.htm$ $1.php [T=application/x-httpd-php]
RewriteRule ^(.*)\.html$ $1.php [L] [T=application/x-httpd-php]

I verified the 404.php file is there and have tried both a relative and absolute path.

12:40 am on Dec 18, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


That last line has a problem - you should combine both flags within one set of square brackets, separated by a comma. As a matter of fact, you can combine both RewriteRules into one by making the "l" in "html" optional (follow it with "?") :

RewriteRule ^(.*)\.html?$ $1.php [T=application/x-httpd-php,L]

I don't see why that would cause your problem, though... :(

What kind of error are you getting?

Jim

12:56 am on Dec 18, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


re: flags - got it

re: errors. The bogus file I'm asking for - which I know doesn't exist - is foo.html.

The requested URL /foo.php was not found on this server.

Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

1:08 am on Dec 18, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


This says that /404.php does not exist in your web root directory. (?)

Jim

1:09 am on Dec 18, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


Well that's what I thought but I can see it right there in the root. Should I use a file system path or a webserver path?

4:31 am on Dec 18, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


No, ErrorDocument requires a local URL-path, such as /404.php, in order to return the proper status. If you put a full URL in there, you'll get a 302 Moved Temporarily status, which can be a search engine nightmare (it's also the single most common mistake made with ErrorDocument).

Your ErrorDocument code is correct!
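
For comparison, the two forms side by side. The full-URL line is shown commented out, and www.example.com is a placeholder rather than the real site:

```apache
# Local URL-path: Apache serves /404.php internally and the
# response for the missing page keeps its 404 status.
ErrorDocument 404 /404.php

# Full URL: Apache answers with a redirect to that URL instead,
# so the client sees a redirect rather than a 404 for the missing page.
# ErrorDocument 404 http://www.example.com/404.php
```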

What happens if you request /404.php directly (from your browser address bar)?

If you're going crazy, be assured I'm going with you! :o

Jim

2:34 pm on Dec 18, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


HA! You're not going to believe this but here goes the sordid truth.

I entered the URI for the 404 page and got it - but with errors. Seems I forgot to remove the references to my development directory for things like the CSS, includes, and images.

Changed the paths on these and the file was delivered fine. Tested for foo.html and voila - the 404 page appears.

What do you make of that?

5:33 pm on Dec 18, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


> What do you make of that?

Success!

6:02 pm on Dec 18, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 31, 2002
posts:7577
votes: 4


Ayup - I owe you two beers now.