Forum Moderators: phranque


Maximum RewriteRules in httpd.conf?


helpnow

3:57 am on Jul 9, 2008 (gmt 0)

10+ Year Member



Greetings! My head is bloodied from banging it against the wall on the following issue:

I have thousands and thousands of URLs. Let's say 100,000. They all include an ID#, like xyz.com/page1, xyz.com/page2, etc.

Most of them are fine. But some are duplicate content, so I need to collapse those into their primaries. Say page2 is a duplicate of page4; then I need to 301 page2 to page4. And so on...

I have, say, 5,500 duplicates that I need to 301 into other pages.

I cannot use a generic rule, because most URLs will stay as is.

It is all database driven, so it is easy for me to create a long list of the rewrites I need, and then cut-and-paste them into the httpd.conf, and then take it live, like this:

RewriteRule (.*)/xyz/p100231(.*) $1/xyz/p140828$2 [R=301]

The above works perfectly, for 1 rule, or even 100 or more...

The problem is, when I put all 5,500 into my httpd.conf file and restart Apache, I get errors. The line number it reports makes no sense - that line is identical in form to all the others, except for the ID#s.

I also have concerns that even if I get this to work, I will run into performance issues. I've never had an httpd.conf file this big - it is now 665 K in size.

-> Is there a limit to the number of RewriteRules I can put in an httpd.conf file - is that why I am getting errors on restart? Everything I try leads me to believe the problem is the size of the file... Is 665 K really all that big for an httpd.conf file?

-> Is there another approach I can use that might be smarter? Am I going to have performance issues even if I get this to work?

Thank you in advance for your insight! I have been unable to solve this on my own all day... Driving me insane... ; )

jdMorgan

3:19 pm on Jul 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would recommend that you look into using RewriteMap to call a script that looks at your database to find out if the URL is a duplicate, and then looks up the 'replacement' URL if so.
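A minimal sketch of such a lookup program, written for mod_rewrite's `prg:` map type. Apache starts the program once at server startup, writes one lookup key per line to its stdin, and expects exactly one line of output per key, with the literal string `NULL` meaning "no mapping". The table contents, script name, and the idea of preloading the duplicate list from the database into memory are all assumptions for illustration:

```python
#!/usr/bin/env python3
# Hypothetical RewriteMap "prg:" lookup program (dupe_lookup.py).
# Apache launches this once and keeps it running; it must answer
# one line of output for every line of input, unbuffered.
import sys

def load_duplicates():
    # In production this would query the site database, e.g.
    # SELECT dup_id, primary_id FROM duplicates -- stubbed here.
    return {
        "p100231": "p140828",  # example pair from the thread
        "p100542": "p140901",  # hypothetical pair
    }

def lookup(dupes, key):
    """Return the primary page ID for a duplicate, or "NULL" if none."""
    return dupes.get(key, "NULL")

def main():
    dupes = load_duplicates()
    for line in sys.stdin:
        # One key in, one line out; flush so Apache never blocks.
        sys.stdout.write(lookup(dupes, line.strip()) + "\n")
        sys.stdout.flush()

if __name__ == "__main__":
    main()
```

If the 5,500 pairs change rarely, a `txt:` or `dbm:` map file regenerated from the database on each change avoids keeping a process running at all; mod_rewrite caches those lookups.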

This allows you to use a single rule, and to administer the duplication issue from within your main database -- a big plus from a maintenance standpoint.
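The httpd.conf side of this approach might look like the following sketch (map name, script path, and the URL pattern are assumptions based on the example IDs earlier in the thread; note that RewriteMap is only allowed at server or virtual-host level, not in .htaccess):

```apache
RewriteEngine On

# Define the map once; the program is asked for each candidate ID.
RewriteMap dupes prg:/usr/local/bin/dupe_lookup.py

# If the map returns a page ID (i.e. not NULL/empty), the cond matches
# and captures it as %1; one rule then replaces all 5,500.
RewriteCond ${dupes:$2} ^(p[0-9]+)$
RewriteRule ^(.*)/xyz/(p[0-9]+)$ $1/xyz/%1 [R=301,L]
```

Doing the lookup in the RewriteCond and reusing it as %1 in the substitution means the map is consulted only once per request, and non-duplicate pages fall through untouched.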

Jim