n00b rewrite question

just want to check I'm doing it right!

         

HelenDev

3:45 pm on Jan 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



# mod_rewrite in use
RewriteEngine On
RewriteRule (.*)$ http://www.example.com/

I am taking my first baby steps with this and am using the above code to redirect any links that point to a particular domain to another place.

I just wanted to make sure this is the correct way to do this, and that I'm not in danger of creating any infinite loops or other chaos which might incur the wrath of the server guy.

HelenDev

4:46 pm on Jan 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



# mod_rewrite in use
RewriteEngine On
RewriteRule .*$ http://www.example.com/

I've done a bit more reading... I guess I don't need the brackets in there? But I still do need the $, right?

Finally found a tutorial on regular expressions which isn't complete Greek to me, so I'm getting there!

jdMorgan

8:08 pm on Jan 15, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I just wanted to make sure this is the correct way to do this

To do what?

That code will certainly loop if it is placed and executed on the www.example.com server. It will happily redirect requests for all pages and objects on www.example.com to the home page of www.example.com and then redirect that home page request to itself repeatedly.

If www.example.com is on another server, then it won't loop. It will simply redirect all page and object requests made to the server on which it resides to the home page on the www.example.com server.
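
As a rough sketch of how the loop could be avoided if the rule did have to run on the same server as www.example.com, a RewriteCond can limit the redirect to requests that arrived for some other hostname. The hostname test and the [R=301,L] flags are assumptions added for illustration; they are not part of the rule posted above:

```apache
# mod_rewrite in use
RewriteEngine On

# Assumption: only redirect requests whose Host header is NOT www.example.com,
# so the redirected request for www.example.com/ is left alone
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]

# Send everything else to the home page of www.example.com
RewriteRule .* http://www.example.com/ [R=301,L]
```

If the server only ever answers for www.example.com, the condition never matches and nothing is redirected, which is why the location of the code matters.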

So the answer depends on what you are trying to accomplish and where this code is located.

Jim

[edited by: jdMorgan at 8:10 pm (utc) on Jan. 15, 2008]

HelenDev

2:47 pm on Jan 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks for the reply jd.

Sorry if I didn't give enough info...

I generally want to use it for places where documents have been moved or removed in the past. I'm getting a few errors from external sites in the Google webmaster tools, which say people are trying to go to pages like

www.example.com/mydirectory/someoldfile.pdf
www.example.com/mydirectory/someotherfile.pdf

So I'm putting my code in an .htaccess file within www.example.com/mydirectory/ to send all such requests straight to the home page or another main page as appropriate.

Just wanted to make sure I'm not doing anything st00pid :)

jdMorgan

3:40 pm on Jan 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The proper way to handle that is to return a 410-Gone response, not a 301 redirect.

Then use a custom 410 error document to explain the situation, and provide alternatives to help the visitor find what they were looking for.

Be aware that a client requesting a .pdf document may not be able to display an HTML error page. Nonetheless, the server should still signal that the requested document is gone, since that is the factually-correct response.

301-redirecting *all* requests for *any* document in /mydirectory creates a massive duplicate-content problem for your home page, and opens you up to willful abuse by competitors or malicious ne'er-do-wells. A 410 is the proper response to avoid this.

404 and 410 error documents should contain an explanation --apologetic in tone-- that the document is missing for unknown reasons (404) or has been intentionally removed (410), and that no direct replacement is available. They should also provide text links to as many of the following resources as possible:

  • Home page
  • Related-category page (requires scripting)
  • Site map
  • Site search facility

Optionally, an on-page HTML meta-refresh (with a timer long enough for all of the above text and links to be read, understood, and considered) can be used to forward the visitor to the home page if no choice is made. If a meta-refresh is to be used, a notice of this should also be included ("Please choose one of the links above. If you do not choose a link, you will be automatically forwarded to our home page in 30 seconds").

The syntax for a 410-Gone RewriteRule and ErrorDocument looks like this:


    ErrorDocument 410 /path-to-410-error-document.html
    RewriteRule .* - [G]

Note that ErrorDocument requires a local URL-path, not a canonical URL. If a complete URL is specified, a 302-Found response will be issued, which is incorrect. See the ErrorDocument documentation for more details on this behaviour.
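
To make the difference concrete, here is a sketch of the two forms side by side. The full URL in the commented-out line is hypothetical and is shown only as the form to avoid:

```apache
# Local URL-path: the server sends the 410 status together with this page
ErrorDocument 410 /path-to-410-error-document.html

# Complete URL (avoid): the server redirects the client to the page instead,
# so the client receives a redirect rather than the 410 status
# ErrorDocument 410 http://www.example.com/path-to-410-error-document.html
```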

The above code (yours and mine) assumes that all documents in /mydirectory have been removed; all requests to that directory will result in a 410-Gone response.

If possible, it would be better to specify all of the documents which have been removed, returning a 410-Gone only for those; any other requests should get a 404-Not Found. In that case, you'll likely need several RewriteRules, using specific old URL-paths or patterns which only match valid-but-removed URLs.


    RewriteRule ^old-doc1\.pdf$ - [G]
    RewriteRule ^older-doc3\.pdf$ - [G]

etc.
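
If the list of removed files stays short, the individual rules might also be folded into one pattern using alternation. This is only a sketch: the two file names are the placeholders from the rules above, and it assumes the .htaccess file sits in /mydirectory so that the pattern is matched against the path relative to that directory:

```apache
RewriteEngine On

# Custom page served with the 410 status (local URL-path, as noted above)
ErrorDocument 410 /path-to-410-error-document.html

# One rule covering both removed files; anything else in the directory
# that does not exist falls through to the normal 404 handling
RewriteRule ^(old-doc1|older-doc3)\.pdf$ - [G]
```

Separate rules are easier to read and to remove one at a time; the combined pattern just keeps the file shorter.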

Jim

HelenDev

5:03 pm on Jan 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Excellent, that was just the sort of advice I needed, thanks Jim :)

I'll get onto it...