I need some help with this rewrite rule

         

darthmalis

5:14 am on Jul 17, 2007 (gmt 0)

10+ Year Member



I am trying to create a rewrite rule in my .htaccess to allow me to change image names on the fly. I am useless when it comes to RegEx and even more so when it comes to rewrite rules. Here is the rule I have written:

RewriteRule ^([0-9]+)/.*--(.*)\.(jpg|png|gif)$ /images/$1-$2.$3

Basically I am trying to change the actual image name, like 12345-name_of_the_file.gif in my images directory, into a neat URL like 12345/some_dynamic_keyword_related_text--name_of_the_file.gif

I think the rule looks good but I am getting a 500 Internal Server Error message with it in my .htaccess. I am probably missing something stupid here (like maybe it's impossible to do what I'm trying to do.)

vincevincevince

5:49 am on Jul 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Basically, you have it back-to-front - you need to provide the 'false' URL first and the 'real' URL second.

darthmalis

5:11 pm on Jul 17, 2007 (gmt 0)

10+ Year Member



The false URL would be ^([0-9]+)/.*--(.*)\.(jpg|png|gif)$
ie:
58289/keywords--filename.png
91046/more_keywords--another_filename.gif
00376/cool_description--not_so_cool_filename.jpg

and the "real" URL to each of these files would be /images/$1-$2.$3
images/58289--filename.png
images/91046--another_filename.gif
images/00376--not_so_cool_filename.jpg

I think I got that part right. Am I even more confused than I thought? :?

darthmalis

5:18 pm on Jul 17, 2007 (gmt 0)

10+ Year Member



EDIT:

Ok, as I was reading my last post I realized that I had /images/$1-$2.$3 instead of /images/$1--$2.$3

That fixed the 500 error but the rewrite doesn't work. When I go to 12345/text--the_file_name.gif I get a 404 even though the file images/12345--the_file_name.gif exists.
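
For readers following along, the pattern-to-target mapping can be checked outside Apache. This is a minimal sketch in Python, assuming its `re` module handles this simple pattern the same way as the regex engine behind mod_rewrite. The `rewrite` helper is hypothetical: it only mimics the substitution and knows nothing about how Apache resolves the resulting path. The target is written without a leading slash, as in the poster's examples.

```python
import re

# Pattern from the thread, written with solid pipes. In .htaccess the
# requested path is matched without its leading slash, so none is used here.
PATTERN = re.compile(r'^([0-9]+)/.*--(.*)\.(jpg|png|gif)$')

def rewrite(path):
    """Hypothetical helper: return the rewritten path, or None if no match."""
    m = PATTERN.match(path)
    if m is None:
        return None
    # Same shape as the substitution images/$1--$2.$3
    return 'images/{}--{}.{}'.format(*m.groups())

for url in ('58289/keywords--filename.png',
            '91046/more_keywords--another_filename.gif',
            '00376/cool_description--not_so_cool_filename.jpg'):
    print(url, '->', rewrite(url))
```

If the mapping comes out as intended here, the 404 is more likely caused by something other than the pattern itself, such as where the server looks for the target file.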

jdMorgan

11:48 pm on Jul 17, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Do you have other working RewriteRules, or is this the first?

If this is the first, you'll need to set an Option to enable mod_rewrite and also start the Rewrite Engine


Options +FollowSymLinks
RewriteEngine on

before you can use RewriteRule.

Note that the Options directive above is either
1) Required and allowed by your host
2) Not required, but allowed by your host
3) Not required, and not allowed by your host
4) Required, but not allowed by your host

Only testing will determine which of these is the case. For case (4), you cannot use mod_rewrite.

Take a look at your server error log; the message associated with the 404 will likely be quite useful in diagnosing the problem, since it will tell you where the server is actually trying to find the file.

Jim

darthmalis

6:40 am on Jul 18, 2007 (gmt 0)

10+ Year Member



Thank you guys so much for your help. But EUREKA I fixed it. It is now

^([0-9]{5}).*/--(.*)\.(.*)$ images/$1--$2.$3

I don't exactly know what was wrong before. I just think it wanted to be difficult. It is my first custom rewrite rule after all.

Thanks again for your help guys. :)

g1smd

1:43 pm on Jul 18, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The (.*) entries are "greedy" matching the whole string and then backing off one character at a time. You are often advised to find another alternative or have only one of those per rule. Some extra tweaking may therefore be in order, later, to improve performance.
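
A small sketch of the greedy behaviour described above, in Python and assuming its `re` module backtracks the same way for this pattern. The sample URL with two "--" separators is made up for illustration.

```python
import re

greedy = re.compile(r'^([0-9]+)/.*--(.*)\.(jpg|png|gif)$')

# The first .* initially swallows the rest of the string, then gives back
# one character at a time until "--" and the remainder can match. The
# result is a split at the LAST "--", not the first.
m = greedy.match('12345/some--text--the_file_name.gif')
print(m.groups())  # ('12345', 'the_file_name', 'gif')
```

Each character given back is an extra matching attempt, which is why a rule with several (.*) groups can do more work than one built from narrower subpatterns.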

darthmalis

12:26 am on Jul 19, 2007 (gmt 0)

10+ Year Member



Ok, I'll go back to
^([0-9]{5})/.+--(.+)\.(jpg|gif|png|bmp|jpeg)$
That eliminates one (.*) and changes the * to + to make sure there is something in there. I think they need to be greedy, though. It might help to allow only characters that could exist after urlencode(), but I think in the end it wouldn't change much. What do you think?

Thanks for the advice. :)
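
To illustrate what changing * to + buys, here is a minimal Python sketch, assuming the `re` module behaves like mod_rewrite's engine for these patterns. The test URLs with an empty keyword part or an empty filename are invented for the comparison.

```python
import re

star = re.compile(r'^([0-9]{5})/.*--(.*)\.(jpg|gif|png|bmp|jpeg)$')
plus = re.compile(r'^([0-9]{5})/.+--(.+)\.(jpg|gif|png|bmp|jpeg)$')

samples = (
    '12345/keywords--file.gif',  # both parts present
    '12345/--file.gif',          # empty keyword part
    '12345/keywords--.gif',      # empty filename part
)

# Map each sample URL to (matches with *, matches with +)
results = {url: (bool(star.match(url)), bool(plus.match(url)))
           for url in samples}

for url, (with_star, with_plus) in results.items():
    print(url, 'star:', with_star, 'plus:', with_plus)
```

Only the first URL matches both patterns; the two with an empty part are accepted by the * version and rejected by the + version.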

jdMorgan

1:04 am on Jul 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Using negative-matching and making the "e" in "jpeg" optional would be even more efficient:

^([0-9]{5})/[^-]+--([^.]+)\.(jpe?g|gif|png|bmp)$

You won't be able to use the "[^-]+" subpattern if you have any hyphens in that portion of the URL, though.

Jim
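
The effect of the negated character classes can be tried out with a short Python sketch, assuming Python's `re` treats this pattern like the engine used by mod_rewrite. The sample URLs are illustrative only.

```python
import re

rule = re.compile(r'^([0-9]{5})/[^-]+--([^.]+)\.(jpe?g|gif|png|bmp)$')

samples = (
    '58289/keywords--filename.png',         # plain case
    '00376/cool_description--picture.jpeg', # jpe?g accepts jpg and jpeg
    '12345/this-word--file.gif',            # hyphen in the keyword part
    '12345/this_word--file.name.gif',       # extra dot in the filename
)

# None means the rule would not fire for that URL.
matches = {}
for url in samples:
    m = rule.match(url)
    matches[url] = m.groups() if m else None

for url, groups in matches.items():
    print(url, '->', groups)
```

The first two URLs match; the last two do not, because "[^-]+" stops at the first hyphen and "[^.]+" stops at the first period.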

darthmalis

2:29 am on Jul 19, 2007 (gmt 0)

10+ Year Member



About the jpe?g part I slapped myself on the forehead. With the rest, I am completely lost. If I am reading this correctly (I'd be surprised), this would not allow a - in the first part or a . in the second part. So 12345/this-word--file.name.gif would fail to match twice, right?

I think that I would need to allow -, but I think I am starting to see why the . in there could cause a problem (which also leads me to see why the - could be a problem). The dot shouldn't be necessary, so I can str_replace() it with - and maybe replace -- with - when saving the image, and use ^([0-9]{5})/[^.]+--([^.]+)\.(jpe?g|gif|png|bmp)$

Am I getting this yet?

darthmalis

2:34 am on Jul 19, 2007 (gmt 0)

10+ Year Member



Woah! Now I think I'm really confused. Wouldn't [^.]+ not allow anything? I would think [^\.]+ would disallow a stray . in there, which is how I read it before. I could be totally lost now. lol

jdMorgan

2:59 am on Jul 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The meaning of some regex tokens within [grouped alternates] changes from the usual.

"[^.]+" means, "Accept one or more characters not equal to a literal period." Or alternately in this project's context, "match all characters up to the period before the filetype."

The only characters that need to be escaped within [groups] are "]", "\", <space>, and in some positions, "^" and "-". At the beginning of a [group], "^" means "NOT".

Jim
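
A quick Python check of the point about the period inside brackets, assuming Python's `re` follows the same character-class rules here as the engine mod_rewrite uses:

```python
import re

unescaped = re.compile(r'[^.]+')   # period inside brackets is literal
escaped = re.compile(r'[^\.]+')    # escaping it there changes nothing

# For each text: (full match with [^.]+, full match with [^\.]+)
class_results = {
    text: (bool(unescaped.fullmatch(text)), bool(escaped.fullmatch(text)))
    for text in ('filename', 'file.name')
}
print(class_results)

# Outside brackets, an unescaped period matches any character,
# so the same text with a dot in it is accepted.
dot_anywhere = bool(re.fullmatch(r'.+', 'file.name'))
print(dot_anywhere)
```

Both classes accept "filename" and reject "file.name", so the backslash is optional inside the brackets.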

darthmalis

3:14 am on Jul 19, 2007 (gmt 0)

10+ Year Member



That makes sense. I thought it looked right but after reading it again I suddenly got real confused. I've found that the biggest problem with understanding RegEx is not all the separate pieces. That part is (almost) easy. But when you put all those pieces together in a string, it starts looking ridiculous and very confusing.

Anyway, Thank you so much jd. I read in another thread that somebody says you don't know what you're talking about. For the record, that guy's an idiot.

P.S. thanks to this thread, I finally [understand] this: /(bb|[^b]{2})/
[from a T-shirt at thinkgeek] lol

[edited by: jdMorgan at 3:37 am (utc) on July 19, 2007]
[edit reason] No URLs, please. [/edit]

jdMorgan

3:42 am on Jul 19, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, too bad that the logical expression of ( bb OR NOT(bb) ) will always evaluate as true, and is therefore not useful.

In language, we use OR differently than in formal logic. In logic, the result is true if either or both is true. But in speech we assume that both will not be true, in a way more like the logical XOR (exclusive OR) operator.

Jim
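
The two readings of OR can be laid out as truth tables. This is only an illustrative Python sketch of the logic described above; the function names are made up.

```python
def inclusive_or(a, b):
    """Formal-logic OR: true if either or both operands are true."""
    return a or b

def exclusive_or(a, b):
    """XOR: true only if exactly one operand is true."""
    return a != b

# "bb OR NOT(bb)" is true whatever bb is.
tautology = [inclusive_or(bb, not bb) for bb in (True, False)]
print(tautology)

# The two operators differ only when both operands are true.
for a in (True, False):
    for b in (True, False):
        print(a, b, inclusive_or(a, b), exclusive_or(a, b))
```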