Questions about rewriterules for 500+ pages

     
5:25 pm on Jan 12, 2017 (gmt 0)

New User

joined:Jan 12, 2017
posts: 7
votes: 0


I'm working on a site that has 500+ product pages where the product ID is the last part of the url. The client wants to change the last part to the product name so they can more easily identify items in Google Analytics. An example old url is:
www.example.com/bn/products/product-details/21

The new url is:
www.example.com/bn/products/product-details/MC622

I have created all of the redirects manually like this:
RewriteRule ^bn/products/product-detail/21$ http://www.example.com/bn/products/product-detail/MC622 [R=301]

Since Google has indexed all 500+ product urls, the client doesn't want to lose any SEO, and they definitely don't want any 404 errors.

My question is this. Is having 500+ rewriterules in the htaccess file OK? Is there some other way this can be accomplished?
Also, I have found that in order for these rewriterules to work, they must be placed above all other rewriterules. This makes sense as the new urls are redirected by a later rewriterule. So back to my question. Is having the 500+ product page rewriterules above all other rewriterules bad? Will it affect the page response? Is the htaccess file normally cached?

Thanks for any help.
9:37 pm on Jan 12, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13620
votes: 421


[R=301]

Yikes. The [R] flag doesn't carry an implied [L], so always always use them as a package: [R=301,L]
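Applied to the rule from the question, that would look like this (same pattern and target as posted; only the flags change):

```apache
# External 301 redirect; [L] stops further rule processing for this pass
RewriteRule ^bn/products/product-detail/21$ http://www.example.com/bn/products/product-detail/MC622 [R=301,L]
```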

Is having 500+ rewriterules in the htaccess file OK?

Do you mean, from the POV of server performance? Sure. With htaccess, the main issue is simply having htaccess in the first place--or, more exactly, allowing htaccess (config file setting), because on every single request the server has to go all the way up the chain, checking to see whether there's an htaccess and, if it finds one, seeing what it says for each separate module.

An htaccess, as such, is not cached. Browsers do cache 301 responses to specific requests, but that's generally to the good; it means if the human user requests the same page two days in a row, the browser will "remember" the redirect and make the request directly.

So the mere existence of 500 separate rules is pretty insignificant, assuming you've arranged them in the most optimal order. You might incorporate something with an [S] flag for non-page requests, or requests involving non-product pages, so the server doesn't have to go through the whole list every time. (Disclaimer: I have never personally used an [S] flag. Just sayin.)
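A minimal sketch of the [S] idea, untested against this site. It assumes the product redirects sit together in one block; the IDs 22 and 23 and the names MC623 and MC624 are made up for illustration. The number after S= must equal exactly the count of RewriteRule lines to skip (three here, 500+ in practice), so it has to be updated whenever a rule is added or removed:

```apache
# If the request is NOT under the product-detail path, skip the next 3 rules
RewriteRule !^bn/products/product-detail/ - [S=3]
RewriteRule ^bn/products/product-detail/21$ http://www.example.com/bn/products/product-detail/MC622 [R=301,L]
RewriteRule ^bn/products/product-detail/22$ http://www.example.com/bn/products/product-detail/MC623 [R=301,L]
RewriteRule ^bn/products/product-detail/23$ http://www.example.com/bn/products/product-detail/MC624 [R=301,L]
# Processing resumes here for non-product requests
```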

There is an alternative approach that goes like this:
RewriteRule ^bn/products/product-detail/ /fixup.php [L]
where "fixup.php" is a little script that contains the names of all your old URLs and their new targets. The script performs the lookup and issues the redirect--or issues a 404 if it can't find a rule for the requested URL. (You could optionally capture the request and attach it to the php as a query, but it shouldn't make any difference, since php can find the Request-URI unaided.) Note that even though the RewriteRule itself has only an [L] flag, it needs to be located among other RewriteRules that create redirects.
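One thing to watch with a prefix-only pattern: the new URLs live under the same bn/products/product-detail/ path, so they would be handed to the script as well. A minimal sketch of a narrower rule, assuming every old ID is purely numeric, no new product name is purely numeric, and the script sits in the document root (all three are assumptions, not something stated in the question):

```apache
# Send only old-style numeric IDs to the lookup script;
# new-style names such as MC622 do not match and fall through to later rules
RewriteRule ^bn/products/product-detail/\d+$ /fixup.php [L]
```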

I have found that in order for these rewriterules to work, they must be placed above all other rewriterules.

Just the other day we were talking in another thread about rule ordering. The rough-and-ready version--there are exceptions, like the "fixup.php" I just mentioned--is: first group your rules in order of severity. Access-control rules ([F] flag) go before [G] rules which go before [R] rules which go before [L] rules which go before rules that have no flag at all. (Your site may not have all these groups.) And then, within each category, arrange rules from most specific to most general.
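A rough sketch of that ordering in one htaccess. Only the product redirect comes from the question; the user-agent string, the old-catalog and old-section paths, and the index.php target are hypothetical placeholders:

```apache
RewriteEngine On

# 1. Access control ([F]) first
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule ^ - [F]

# 2. Gone ([G]) next
RewriteRule ^old-catalog/ - [G]

# 3. External redirects ([R=301,L]), most specific before most general
RewriteRule ^bn/products/product-detail/21$ http://www.example.com/bn/products/product-detail/MC622 [R=301,L]
RewriteRule ^bn/old-section/(.*)$ http://www.example.com/bn/new-section/$1 [R=301,L]

# 4. Internal rewrites ([L] only) last
RewriteRule ^bn/products/product-detail/([A-Za-z0-9]+)$ /index.php?product=$1 [L]
```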
11:03 pm on Jan 12, 2017 (gmt 0)

New User

joined:Jan 12, 2017
posts: 7
votes: 0


Thanks for your reply lucy24. I saw your replies on many questions, and I was hoping you would answer mine too.

I know just enough about redirects to be dangerous<grin>. I thought the L flag indicated the Last Rule, and therefore everything below it was ignored. I guess that's what I really want. If the L flag terminates the search when an exact match is found, that would be desirable because there is always an exact match for each product item. Thanks for mentioning the importance of pairing the R flag with the L flag. I didn't know that.

If having 500 rules is insignificant, would there be any advantage to using your alternate method? I'm inclined to stay with what I have plus adding the L flag.

first group your rules in order of severity

You mean like [F], [G], and on down the list you provided? If so, I get it.

Thanks again for your help.
2:06 am on Jan 13, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13620
votes: 421


You mean like

Yup. The idea is that there's no point in, for example, issuing a redirect [R] if the visitor is going to end up getting blocked outright.

would there be any advantage to using your alternate method?

Mneh. Six of one, half a dozen of the other. It reduces clutter in your htaccess; and if you've made a mistake, it only affects selected files, rather than bringing the whole site crashing to the ground. On the other hand you have to learn 2 or 3 words of php if you happen not to know them already, and you have to keep track of two different documents.

All of that is for you, the human user. The server does pretty much the same work either way.
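A third option, not one of the two discussed above, is a RewriteMap lookup. It only applies if you can edit the server or virtual-host configuration, because RewriteMap is not allowed in htaccess. A minimal sketch, assuming numeric old IDs; the map name and file path are hypothetical, and the map file is plain text with one "old-id new-name" pair per line (for example "21 MC622"):

```apache
# Server or virtual-host config only, NOT .htaccess
RewriteEngine On
RewriteMap productmap "txt:/etc/apache2/product-map.txt"

# Redirect only when the numeric ID has an entry in the map;
# in this context the pattern is matched with the leading slash
RewriteCond ${productmap:$1} !=""
RewriteRule ^/bn/products/product-detail/(\d+)$ http://www.example.com/bn/products/product-detail/${productmap:$1} [R=301,L]
```

IDs with no entry in the map fall through untouched, so whatever normally handles an unknown product still gets to answer.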
2:16 pm on Jan 13, 2017 (gmt 0)

New User

joined:Jan 12, 2017
posts: 7
votes: 0


Thanks again lucy24.

I have another slightly related question. How long should these redirects be kept in the htaccess file? At some point, I would imagine, the search engines will update all links. Any idea how long that might take? A month, a year, a decade? Do spiders look at an htaccess file, or is it invisible to them until a redirect sends the 301? If spiders do not "see" the htaccess file, I would think that the redirects would need to stay in place for a long time, as some products may be much less popular than others.

Thanks for sharing your considerable knowledge.
3:20 am on Jan 14, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13620
votes: 421


How long should these redirects be kept in the htaccess file?
Forever. The rate of requests will soon drop off, but they never stop entirely. This applies both to 301 and to 410 responses.

At some point, I would imagine, the search engines will update all links. Any idea how long that might take? A month, a year, a decade?
It depends on how popular the page is--anywhere from hours to months.

Do spiders look at an htaccess file, or is it invisible to them until a redirect sends the 301?
Nobody sees the actual htaccess file. (Try it. Request example.com/.htaccess at, ahem, your own site. If you do not get a 403 response you have the world's worst host and should change right away.) Each request passes through the htaccess file, which analyzes the request and makes any needed changes, up to and including issuing redirects.
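That 403 normally comes from a block in the main server configuration along these lines. This is a sketch assuming Apache 2.4; a stock httpd.conf ships with something similar, but your host's setup may differ:

```apache
# Server-level config: refuse direct requests for .htaccess, .htpasswd and the like
<Files ".ht*">
    Require all denied
</Files>
```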

Further detail will have to wait, as my computer is in the shop and typing on an iPad is Not Fun.
2:49 pm on Jan 16, 2017 (gmt 0)

New User

joined:Jan 12, 2017
posts: 7
votes: 0


Thanks again lucy24. You have cleared up several concerns.
8:38 pm on Jan 16, 2017 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:7325
votes: 477


Related, but not related to apache. Had a customer who wanted the same thing (couldn't match urls to product). Instead of messing with the server, OR google and 301, I wrote a report conversion that took the raw report, labeled the products and produced a final tally per period selected, etc.

I mention this only because sometimes you don't need to mess with things that are already working fine, and leaving them alone avoids any chance of error for the user, the search engines, or you.
10:21 pm on Jan 16, 2017 (gmt 0)

New User

joined:Jan 12, 2017
posts: 7
votes: 0


Agreed.
12:54 am on Jan 17, 2017 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10815
votes: 57


Do spiders look at an htaccess file, or is it invisible to them until a redirect sends the 301? If spiders do not "see" the htaccess file, I would think that the redirects would need to stay in place for a long time, as some products may be much less popular than others.

spiders cannot crawl the rewrite directives on a properly configured server.
spiders can only request urls from the server, which processes those directives before providing an appropriate response.

Nobody sees the actual htaccess file.

even if poor server security practices allowed a spider to crawl the .htaccess file or a copy thereof, it wouldn't know to process that file as server directives nor would the spider necessarily have enough of the server configuration available to determine the appropriate response for any given url.
2:31 pm on Jan 17, 2017 (gmt 0)

New User

joined:Jan 12, 2017
posts: 7
votes: 0


Thanks phranque. Your comment reconfirms in my mind the need to keep the redirects in place for a long time. I'm glad I asked.
 
