The Ultimate 404 File

How can I parse the URL as variables?

         

WannaKnowSEO

9:34 pm on Oct 31, 2008 (gmt 0)

Let's assume that we can treat / - | _ all as "spaces". Then a URL like:

www.example.com/records/jimi-hendrix.html would translate to:

records jimi hendrix

Now let's suppose that the URL doesn't exist, so the request returns a 404, but the site actually has relevant content that includes that text. What .htaccess code would redirect that URL (only when it is a 404) to:

http://www.example.com/?cx=013924456556940616359:i4bhjtblwt8&cof=FORID:9&q=records+jimi+hendrix&sa.x=0&sa.y=0&sa=Search

This latter URL uses Google's Custom Search to show results for what the user was presumably looking for.

[edited by: tedster at 9:44 pm (utc) on Oct. 31, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]

jdMorgan

10:01 pm on Oct 31, 2008 (gmt 0)

What code have you tried? Please be aware that we can discuss your code here, but not write it for you.

The key would be to take the requested filename and test it for "does not resolve to an existing file" before redirecting it. You would do this with mod_rewrite's "RewriteCond %{REQUEST_FILENAME} !-f" directive.

To avoid doing a slow, inefficient filesystem check on every single request to your server, you'd also want to make the URL-path pattern in the RewriteRule as specific as possible. Note that the RewriteCond is not processed if the RewriteRule pattern does not match, so a specific pattern can save your server a lot of unnecessary work.
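
As a rough illustration of the approach described above, here is a minimal sketch for a per-directory .htaccess file. It is not a tested solution. It assumes the missing pages live under /records/, that each slug has exactly two hyphen-separated words, and that a temporary (302) redirect is acceptable; the search URL is the one from the original question.

```apache
# Sketch only: assumes mod_rewrite is enabled and this file sits in the document root.
RewriteEngine On

# The filesystem check runs only for requests that already match the
# narrow pattern below (conditions are evaluated after the pattern matches).
RewriteCond %{REQUEST_FILENAME} !-f

# Assumed URL shape: /records/word-word.html (two words, for illustration).
# $1 and $2 are the two words; "+" joins them in the search query.
RewriteRule ^records/([^/-]+)-([^/-]+)\.html$ /?cx=013924456556940616359:i4bhjtblwt8&cof=FORID:9&q=records+$1+$2&sa.x=0&sa.y=0&sa=Search [R=302,L]
```

Slugs with a different number of words would need additional patterns. With this approach the visitor receives a redirect status rather than a 404 response.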

If you're not familiar with mod_rewrite, see the resources cited in our Apache Forum Charter.

Jim

mayest

8:26 pm on Nov 1, 2008 (gmt 0)

Google's Webmaster Tools can generate some JavaScript for you to add to your 404 page that does exactly what you are asking (unless I misunderstood). Take a look at this post [googlewebmastercentral.blogspot.com].