Using .htaccess to rewrite dynamic to static

Help with session IDs

RoseMarie

2:58 pm on Dec 9, 2003 (gmt 0)

10+ Year Member



Hello,

I have a shopping cart that uses session IDs to keep track of the shopper. I have been experimenting with mod_rewrite to change the look of the URL from dynamic to static. I started out learning the long way of doing things. Now I would like to learn how to shorten things up and make it work harder for me.

What I want it to be:
www.mycompany.com/advanced_plan.shtml

my real link is:
www.mycompany.com/web/index.cgi?page=advanced_plan.htm&cart_id=123456.1234

This is what I have so far:

RewriteEngine on
RewriteRule ^(.*)\.shtml$ /web/index.cgi?page=$1.htm [L]

How do I get the cart_id number into the rewrite? I've read about RewriteCond, but I just can't seem to get a grasp on it. Any advice would be appreciated.

Thank You,
RoseMarie

jdMorgan

8:54 pm on Dec 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RoseMarie,

The first issue to address is where do you *want* the session ID to appear in the URL?

You could do something like:

www.example.com/advanced/<cart_id_number>/cart.shtml
-or-
www.example.com/<cart_id_number>/advancedcart.shtml
-or even-
www.example.com/cart/<cart_id_number>/shop.shtml

where <cart_id_number> is the numeric value as you showed above, maybe substituting "_" for ".".

Personally, I'd choose something like the latter, as it would allow you to use robots.txt to disallow some spiders from crawling the "cart directory" if they proved troublesome. It also avoids the use of any contraction of "advanced" that might look like "adv" - this might trigger an "advertising" block in some proxy servers.

There are any number of variants that would work technically. There need be no relationship between the requested URL and the actual script path and query string, except that the requested URL must be unique enough to know you want "advanced" versus (I presume) "regular", "cart" versus "welcome_page" or something else, and it must pass the cart number. You'll also need to avoid URLs that might look like actual (existing or planned) directories and files. But as long as you can come up with a consistent naming approach, it'll work.
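As a rough sketch only, here is how the third URL format above might be translated. It assumes the .htaccess file sits in the document root, that the cart number is always two groups of digits, that "_" stands in for "." in the friendly URL, and that the file name before ".shtml" is the page name the script expects. None of that is confirmed for your cart, so adjust to suit:

```apache
RewriteEngine on

# Requested: /cart/123456_1234/advanced_plan.shtml
# Served by: /web/index.cgi?page=advanced_plan.htm&cart_id=123456.1234
#   $1 and $2 = the two halves of the cart number (the "_" becomes ".")
#   $3        = the page name without its extension
RewriteRule ^cart/([0-9]+)_([0-9]+)/([^/]+)\.shtml$ /web/index.cgi?page=$3.htm&cart_id=$1.$2 [L]
```

Because the cart number is part of the URL path, the rule can capture it directly and no RewriteCond is needed for this format.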

You will also need a method to support search engine spiders if you want those pages indexed, but you must not give them a cart number. Or, if you do, it must be a fixed number always; it must not ever change. Otherwise the spider will try to index every "page" with every cart number it can find -- a huge load on your server, and a good way to get your site dropped by the spider, because they detect these 'infinite' URL-spaces and quit indexing. Technically, this is user-agent- and remote-IP-address-dependent behaviour, called 'cloaking' by some. It is not cloaking with the intent of fooling anyone, but you must be careful to get it right, otherwise it might look like it is.
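One way the fixed-number idea could look on the mod_rewrite side is sketched below. The user-agent substrings, the "/cart/<cart_id_number>/" URL format, and the placeholder cart number 0.0 are all assumptions for illustration. Your script would have to accept that placeholder, and user-agent strings can be faked, so a real setup would likely check remote IP addresses as well:

```apache
RewriteEngine on

# Assumption: these substrings identify the spiders you want to handle.
RewriteCond %{HTTP_USER_AGENT} (googlebot|slurp) [NC]
# Whatever cart number a spider requests, always pass the script the
# same fixed placeholder (0.0 here) so the spider sees one URL-space.
RewriteRule ^cart/[0-9]+_[0-9]+/([^/]+)\.shtml$ /web/index.cgi?page=$1.htm&cart_id=0.0 [L]
```

A rule like this would need to come before any general cart rule, since [L] stops rule processing for the request once it matches.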

On the other end, your script will need to be modified to output these "friendly" URLs to browsers and search engine spiders, and your choice of URL format might be influenced by those script-modification factors as well.

The main challenge here is to define how you want to format the URLs and to figure out how to modify the script on the back-end. Implementing the front-end URL translation in mod_rewrite will usually be pretty easy.

Jim

theblade24

1:24 am on Dec 17, 2003 (gmt 0)

10+ Year Member



Jim, I am in a similar situation with a shopping cart that has dynamic pages that I'd like to somehow show as html in the URL. I have a static html index page, but from there you enter the cart and everything is dynamically created.

It is my understanding that this url may be too long and contain items that would cause the spiders to not index the page.

Here is an example of one of my product pages. I'd like to have the URL reworked to make it friendly enough that every product we offer and every information page (i.e. about us, FAQ, etc.) will be spidered well, even though it is dynamically produced by the cart.

This is the store front page, I guess it has a user id in it that must stay somehow?

http://www.example.com/shop/cgi-bin/cp-app.cgi?usr=51F5854380&rnd=2107776&rrc=N&affl=&cip=&act=&aff=&pg=store

And this is a specific product page:

http://www.example.com/shop/cgi-bin/cp-app.cgi?usr=51F5854380&rnd=9807449&rrc=N&affl=&cip=&act=&aff=&pg=prod&ref=SS-JETEPHS008029&cat=&catstr=

I would think ideally it would be good for the product page to show as SS-JETPHS008029.html....the productid.html?

I'm very, very new to this and know nothing of how this works. I don't know if I need some file with directions on the server, in what directories, etc., but I have been told it could be done via this route and still have the usr, cookie, etc. work properly while the URL is reduced to something the spiders like. I have an awesome cart but really want to somehow make it search engine friendly....

Help?

Marty

PS. If you visit the site via the above URLs, it is sort of under construction. I am moving from another cart to this one, and this is the new cart. If you stay in the product area and don't click on the browse store menu or static page links, you'll stay in this cart.

[edited by: jdMorgan at 6:23 am (utc) on Dec. 17, 2003]
[edit reason] No personal URLs, please. Sidescroll fix. [/edit]

jdMorgan

6:41 am on Dec 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Marty,

Welcome to WebmasterWorld [webmasterworld.com]!

Your best bet is to eliminate the user ID for spiders, or to assign each spider a permanent unique spider-ID. You will then have to detect these spider IDs when real visitors follow links from search engines into your site, and assign them real user-IDs.

Since you are in the process of changing carts, I suggest you take the opportunity to discuss this issue with customer support for the new cart. If they cannot provide assistance, you might consider selecting a cart that already provides search-engine-spider-friendly URLs.

The intricacies of the various shopping carts are well beyond my range of experience, so I cannot comment on them further. However, the general requirements are these: A cart script must output friendly URLs. It must not require session IDs or cookies to function on the pages you want spidered. It must allow spiders to crawl those pages using the same URL and query string (if any) for each page every time. It is a bonus if the script will accept search-engine-friendly URLs when called. If it doesn't, then mod_rewrite on Apache or ISAPI filters on MS-based servers can be used to convert friendly URLs to script URLs plus query strings after each HTTP request is received but before the script is called. The script then outputs more friendly URLs for use in subsequent browser/spider requests.
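For the mod_rewrite half of that, a minimal sketch for a "productid.html" style URL might look like the following. It assumes the .htaccess file is in the document root and, importantly, that cp-app.cgi can produce a product page from "pg" and "ref" alone, without "usr" or "rnd". I have not verified that for this cart, so treat it as an illustration of the technique only:

```apache
RewriteEngine on

# Leave requests for real files alone, such as the static index.html.
RewriteCond %{REQUEST_FILENAME} !-f
# Requested: /SS-JETEPHS008029.html
# Served by: /shop/cgi-bin/cp-app.cgi?pg=prod&ref=SS-JETEPHS008029
RewriteRule ^([A-Za-z0-9-]+)\.html$ /shop/cgi-bin/cp-app.cgi?pg=prod&ref=$1 [L]
```

The RewriteCond is there because a pattern this broad would otherwise also catch existing static pages. A dedicated prefix such as a "products" directory in the friendly URL would be another way to keep the two apart.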

Doing a site search here on WebmasterWorld for e-commerce, shopping cart, friendly URL, and mod_rewrite will turn up a wealth of information. We also have a Library available through the link at the top of each page. Each forum has a charter to describe what subjects are posted in it, and the Terms of Service (link below) defines most of our rules. Have a look around and see if this will help you understand/define your cart/URL problems better and to make the best use of WebmasterWorld.

Jim