Dealing with improper-case requests

RobBroekhuis

3:34 pm on Oct 21, 2004 (gmt 0)

Can I set things up in .htaccess so that a request for a URL that differs from the proper one only in capitalization goes through without generating a 301?
Rob

jdMorgan

3:44 pm on Oct 21, 2004 (gmt 0)

Yes and no.

You can use mod_speling to correct *one* incorrect character. Using this module is rather slow.
You can use a recursive mod_rewrite routine to correct case errors, but this is even slower (horribly slow).
Neither of these is a good solution if you get a lot of incorrect-case requests.

If you have access to httpd.conf, you can define a RewriteMap that calls mod_rewrite's internal toupper or tolower function (int:toupper, int:tolower). This is much better, efficiency-wise.
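
A minimal sketch of that RewriteMap setup, assuming every file on disk has an all-lowercase name and that the directives go in httpd.conf at server or virtual-host level (RewriteMap cannot be declared in .htaccess). The map name "lc" is arbitrary and chosen only for this illustration:

```apache
# httpd.conf, server or <VirtualHost> context.
RewriteEngine On

# "lc" is a made-up map name; int:tolower is mod_rewrite's
# built-in lowercase conversion function.
RewriteMap lc int:tolower

# Only act on requests whose path contains an uppercase letter.
RewriteCond %{REQUEST_URI} [A-Z]
# Internal rewrite (no R flag), so the client is not redirected.
# PT hands the lowercased URL-path back to normal URL mapping.
RewriteRule ^/(.*)$ /${lc:$1} [PT]
```

Because there is no R flag, the client never sees a 301; the server simply serves the lowercase file. This does not help if the real filenames are mixed-case.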

Jim

Nutter

4:14 pm on Oct 21, 2004 (gmt 0)

What about having a 404.php (or .pl, or .cgi, or whatever) that...
- Gets the page the user typed in
- Converts that page name to lower case (or upper if that's how your files are stored)
- Looks for the page name in the correct case
- If it finds it, shows it (maybe via a 301); if not, shows a 404 error

Never done it, but it seems like it should work. It would also only have to run when a page isn't found instead of for every page.

- Ryan

jdMorgan

4:31 pm on Oct 21, 2004 (gmt 0)

Yes, a script-based approach would probably be more efficient if you don't have httpd.conf access.

However, you should never use a 404 error-handler for this purpose, because the 404 response must not be made visible to the client (you can damage your search engine rankings badly if you allow the client to see a 404).

Instead, it would be better to either detect a missing file using mod_rewrite (see the -f test in RewriteCond) or detect the uppercase characters, and then do an internal rewrite to the script.
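
As a rough illustration of the missing-file variant, something like the following could go in the .htaccess file in the document root. It assumes mod_rewrite is enabled and allowed in .htaccess; the script name "/fixcase.php" and its "path" parameter are hypothetical placeholders, not anything defined in this thread:

```apache
# .htaccess in the document root.
RewriteEngine On

# The request does not map to an existing file or directory...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# ...so pass it to a lookup script with an internal rewrite.
# "/fixcase.php" and "path" are hypothetical names for this sketch.
RewriteRule ^(.*)$ /fixcase.php?path=$1 [QSA,L]
```

The rule does not loop as long as the script itself exists, since the !-f condition then fails on the second pass. To trigger on uppercase characters instead, the two file tests could be replaced by a single RewriteCond %{REQUEST_URI} [A-Z]. Either way, the script is responsible for returning the found page or a proper 404 of its own.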

Jim

RobBroekhuis

5:07 pm on Oct 21, 2004 (gmt 0)

Thanks for the suggestions. I doubt I have access to httpd.conf (I'm on a shared server), and I don't even know if I can use mod_rewrite (never tried; I guess I should, just to know for sure). Nutter's approach is in essence what the built-in functionality does now: the server figures out that my visitor really wanted the properly capitalized version, and sends back a 301. Most browsers handle this nicely and fetch the correct version in response. Some idiotic bots don't, and even get stuck in (near-)infinite loops. Unfortunately, my filenames are mixed-case, so I can't apply a simple rule to generate the correct name. I guess I'll live with the current minor annoyances.
Rob

Nutter

7:13 pm on Oct 21, 2004 (gmt 0)

I think I may have misunderstood. I thought these were cases where people typed in your address, not where they clicked on a link. If the URLs are just mistyped, search engines shouldn't matter; they'd presumably never see those requests.

On mixed case: if you don't have too many files, it may be possible to loop through the files and compare names case-insensitively. Of course, with too many files this may be too intensive to be practical. Maybe a 404 with just a site map would be best for that.

- Ryan